CN106341722A - Video editing method and device - Google Patents
- Publication number
- CN106341722A (application CN201610836707.5A)
- Authority
- CN
- China
- Prior art keywords
- voice input
- video
- content
- word
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
- H04N21/4334—Recording operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4312—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
- H04N21/4314—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for fitting data in a restricted space on the screen, e.g. EPG data in a rectangular grid
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440236—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
Abstract
The invention discloses a video editing method and device, belonging to the technical field of mobile terminals. The method comprises: receiving a voice input; judging whether the content of the voice input contains a word belonging to a lexicon; and, if so, displaying the content of the voice input in text form in the video. With the video editing method and device, post-shoot editing of the video image is no longer needed: the text is composited with the video image as it is recorded, which makes the recording process more engaging, improves the efficiency of video recording, and gives the user a good experience.
Description
Technical field
The present invention relates to the technical field of mobile terminals, and more particularly, to a video editing method and device.
Background technology
With the continuous development of mobile terminal technologies such as the mobile phone, competition between handset manufacturers has become increasingly fierce, shifting from a contest of pure hardware specifications to a contest of ecosystem content. The handset system is therefore no longer a mere carrier; what matters more is the content derived from it.
When watching video on a mobile terminal, one frequently encounters "bullet screens" (danmaku): the phenomenon of a large number of comments appearing on screen simultaneously in caption form. When masses of snarky comments sweep across the screen, the effect resembles the bullet curtains of a shooting game, which is why netizens took to calling the effect of many simultaneous comments a bullet screen. In China, originally only a large number of comments appearing at the same time counted as a bullet screen, but through loose usage even a single comment is now called one. Current mobile video editing is mostly post-shoot editing: the user must finish shooting the video, enter an edit mode, and add text and other treatments, and before adding the text must also drag it manually to the desired position, which lowers efficiency and makes for a poor user experience.
It is therefore necessary to propose a video editing method and device that avoid the situation above and improve the user experience.
Summary of the invention
The main objective of the present invention is to propose a video editing method and device, so as to solve the problem that existing video editing is mainly performed after shooting, which lowers efficiency and harms the user experience.
To achieve the above objective, the present invention provides a video editing method applied to a mobile terminal, the method comprising the steps of: receiving a voice input; judging whether the content of the voice input contains a word belonging to a lexicon; and, if so, displaying the content of the voice input in text form in the video.
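As an illustration, the core step above — receive a voice input, check its content against a lexicon, and display it as text only on a match — can be sketched as follows. This is a minimal sketch under stated assumptions: the lexicon contents and the already-transcribed speech string are hypothetical, since the patent does not specify a particular recognizer or lexicon.

```python
# Hypothetical sketch of the claimed flow: given a voice input already
# transcribed to text, check it against a lexicon and decide whether to
# overlay it on the video in text form.

LEXICON = {"awesome", "lol", "nice", "wow"}  # illustrative entries only

def handle_voice_input(transcript: str, lexicon=LEXICON):
    """Return the text to overlay on the video, or None to drop the input."""
    words = transcript.lower().split()
    if any(w in lexicon for w in words):
        return transcript          # shown in the video in text form
    return None                    # not shown (may be filtered automatically)

assert handle_voice_input("that was awesome") == "that was awesome"
assert handle_voice_input("plain speech") is None
```

A real implementation would run this per utterance while recording, so the matched text appears in the frame at the moment it was spoken.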
Optionally, after receiving the voice input, the method further includes: obtaining a keyword from the voice input. Correspondingly, judging whether the content of the voice input contains a word belonging to the lexicon comprises: judging whether the keyword belongs to the words in the lexicon.
Optionally, displaying the content of the voice input in text form in the video comprises: extracting the recording time at which the voice input was received; converting the content of the voice input into text information; and displaying the text information in the video image at that recording time.
Optionally, the method further includes: detecting the face location in the video image; the text information then avoids the face location, and a preset special effect is displayed.
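One way the face-avoidance step could be sketched is below, assuming a face bounding box has already been obtained from some detector (the patent does not name one): the overlay text is simply placed below the face region, or above it if there is no room in the frame. The coordinate convention and margins are assumptions for illustration.

```python
# Hypothetical placement of overlay text so that it avoids a detected face.
# face_box is (x, y, w, h) in pixels; (0, 0) is the top-left of the frame.

def place_text(face_box, frame_h, text_h=40, margin=10):
    """Pick a y-coordinate for the text that does not cover the face."""
    x, y, w, h = face_box
    below = y + h + margin
    if below + text_h <= frame_h:          # room below the face
        return below
    return max(0, y - margin - text_h)     # otherwise place above it

# Face near the top of a 480-px-tall frame: text goes below it.
assert place_text((100, 20, 80, 80), frame_h=480) == 110
# Face near the bottom: text goes above it.
assert place_text((100, 400, 80, 80), frame_h=480) == 350
```

A fuller version would also avoid the horizontal extent of the face and clamp the text to the frame width, but the vertical choice above captures the claimed behavior.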
Optionally, the method further includes: if not, automatically filtering out the content of the voice input.
In addition, to achieve the above objective, the present invention also proposes a video editing apparatus applied to a mobile terminal, the apparatus comprising: a receiving module, for receiving a voice input; a judging module, for judging whether the content of the voice input contains a word belonging to a lexicon; and an editing module, for displaying the content of the voice input in text form in the video when the judging module judges that the content of the voice input contains a word belonging to the lexicon.
Optionally, the apparatus further includes: an obtaining module, for obtaining a keyword from the voice input. Correspondingly, the judging module is specifically for judging whether the keyword belongs to the words in the lexicon.
Optionally, the editing module comprises: an extraction unit, for extracting the recording time at which the voice input was received; a conversion unit, for converting the content of the voice input into text information; and an editing unit, for displaying the text information in the video image at that recording time.
Optionally, the editing module further includes: a detection unit, for detecting the face location in the video image; and a display unit, for making the text information avoid the face location and displaying a preset special effect.
Optionally, the apparatus further includes: a filtering module, for automatically filtering out the content of the voice input when the judging module judges that the content of the voice input does not contain a word belonging to the lexicon.
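The module composition above could be sketched as plain classes. The class names mirror the claimed modules, but everything else (constructor arguments, the string returned by the editor) is a hypothetical illustration, not the patent's implementation.

```python
# Hypothetical sketch of the claimed apparatus: judging module, editing
# module, and the filtering behavior, wired together in one device class.

class JudgeModule:
    def __init__(self, lexicon):
        self.lexicon = set(lexicon)

    def contains_lexicon_word(self, content: str) -> bool:
        return any(w in self.lexicon for w in content.lower().split())

class EditorModule:
    def show(self, content: str, timestamp: float) -> str:
        # On a real device this would composite text into the video frame
        # at the recording time; here we only describe the action.
        return f"overlay '{content}' at t={timestamp:.1f}s"

class VideoEditingDevice:
    def __init__(self, lexicon):
        self.judge = JudgeModule(lexicon)
        self.editor = EditorModule()

    def on_voice_input(self, content: str, timestamp: float):
        if self.judge.contains_lexicon_word(content):
            return self.editor.show(content, timestamp)
        return None  # filtering module: drop the input automatically

dev = VideoEditingDevice({"nice"})
assert dev.on_voice_input("nice shot", 3.2) == "overlay 'nice shot' at t=3.2s"
assert dev.on_voice_input("hello", 4.0) is None
```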
With the video editing method and device proposed by the present invention, when a voice input is received and its content is judged to contain a word belonging to the lexicon, the content of the voice input is displayed in the video in text form. Compared with the prior art, the video editing method and device of the present invention eliminate post-shoot editing operations on the video image, compositing the text together with the video image as it is recorded; this not only makes the video recording process more engaging but also improves the efficiency of video recording and brings the user a better experience.
Brief description of the drawings
Fig. 1 is a schematic hardware architecture diagram of an optional mobile terminal for realizing the embodiments of the present invention;
Fig. 2 is a schematic diagram of the wireless communication system of the mobile terminal shown in Fig. 1;
Fig. 3 is a schematic flowchart of the video editing method provided by the first embodiment of the present invention;
Fig. 4 is a schematic sub-flowchart of the video editing method provided by the first embodiment of the present invention;
Fig. 5 is a schematic flowchart of the video editing method provided by the second embodiment of the present invention;
Fig. 6 is a schematic flowchart of the video editing method provided by the third embodiment of the present invention;
Fig. 7 is a further schematic flowchart of the video editing method provided by the third embodiment of the present invention;
Fig. 8 is a module diagram of the video editing apparatus provided by the fourth embodiment of the present invention;
Fig. 9 is a module diagram of the judging unit in Fig. 8;
Fig. 10 is a module diagram of the video editing apparatus provided by the sixth embodiment of the present invention.
The realization of the objectives, functional features and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Specific embodiments
It should be understood that the specific embodiments described herein are intended only to explain the present invention and are not intended to limit it.
A mobile terminal implementing the embodiments of the present invention will now be described with reference to the drawings. In the following description, suffixes such as "module", "part" or "unit" used to denote elements are adopted merely to facilitate the explanation of the present invention and have no specific meaning in themselves; therefore, "module" and "part" may be used interchangeably.
Mobile terminals may be implemented in various forms. For example, the terminals described in the present invention may include mobile terminals such as mobile phones, smartphones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players) and navigation devices, as well as fixed terminals such as digital TVs and desktop computers. In the following it is assumed that the terminal is a mobile terminal. However, those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the construction according to the embodiments of the present invention can also be applied to terminals of the fixed type.
Fig. 1 is a schematic diagram of the hardware configuration of an optional mobile terminal for realizing the embodiments of the present invention.
The mobile terminal 100 may include a wireless communication unit 110, an A/V (audio/video) input unit 120, a user input unit 130, a sensing unit 140, an output unit 150, a memory 160, an interface unit 170, a controller 180, a power supply unit 190, and so on. Fig. 1 shows a mobile terminal with various components, but it should be understood that not all of the illustrated components are required; more or fewer components may alternatively be implemented. The elements of the mobile terminal will be described in detail below.
The wireless communication unit 110 typically includes one or more components that allow radio communication between the mobile terminal 100 and a wireless communication system or network. For example, the wireless communication unit may include at least one of a broadcast receiving module 111, a mobile communication module 112, a wireless internet module 113, a short-range communication module 114, and a location information module 115.
The broadcast receiving module 111 receives broadcast signals and/or broadcast-related information from an external broadcast management server via a broadcast channel. The broadcast channel may include a satellite channel and/or a terrestrial channel. The broadcast management server may be a server that generates and sends broadcast signals and/or broadcast-related information, or a server that receives previously generated broadcast signals and/or broadcast-related information and sends them to a terminal. The broadcast signal may include a TV broadcast signal, a radio broadcast signal, a data broadcast signal, and the like, and may further include a broadcast signal combined with a TV or radio broadcast signal. The broadcast-related information may also be provided via a mobile communication network, in which case it may be received by the mobile communication module 112. The broadcast signal may exist in various forms; for example, it may exist in the form of an electronic program guide (EPG) of digital multimedia broadcasting (DMB), an electronic service guide (ESG) of digital video broadcasting-handheld (DVB-H), and so on. The broadcast receiving module 111 can receive signals using various types of broadcast systems. In particular, it can receive digital broadcasts using digital broadcasting systems such as digital multimedia broadcasting-terrestrial (DMB-T), digital multimedia broadcasting-satellite (DMB-S), digital video broadcasting-handheld (DVB-H), the MediaFLO forward link media data broadcasting system, and integrated services digital broadcasting-terrestrial (ISDB-T). The broadcast receiving module 111 may be constructed to suit the various broadcast systems providing broadcast signals as well as the above-mentioned digital broadcasting systems. The broadcast signals and/or broadcast-related information received via the broadcast receiving module 111 may be stored in the memory 160 (or another type of storage medium).
The mobile communication module 112 sends radio signals to and/or receives radio signals from at least one of a base station (e.g., an access point, a Node B, etc.), an external terminal, and a server. Such radio signals may include voice call signals, video call signals, or various types of data sent and/or received according to text and/or multimedia messages.
The wireless internet module 113 supports wireless internet access for the mobile terminal. This module may be internally or externally coupled to the terminal. The wireless internet access technologies involved may include WLAN (wireless LAN, Wi-Fi), WiBro (wireless broadband), WiMAX (worldwide interoperability for microwave access), HSDPA (high-speed downlink packet access), and so on.
The short-range communication module 114 is a module for supporting short-range communication. Some examples of short-range communication technology include Bluetooth™, radio frequency identification (RFID), the Infrared Data Association (IrDA), ultra-wideband (UWB), ZigBee™, and so on.
The location information module 115 is a module for checking or obtaining the location information of the mobile terminal. A typical example of the location information module is a GPS (global positioning system) module. According to current technology, the GPS module 115 calculates distance information from three or more satellites together with accurate time information, and applies triangulation to the calculated information, thereby accurately calculating three-dimensional current location information according to longitude, latitude and altitude. Currently, the method for calculating position and time information uses three satellites and corrects the error of the calculated position and time information by using a further satellite. In addition, the GPS module 115 can calculate speed information by continuously calculating the current location in real time.
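As a simple illustration of the last point, speed can be derived from two consecutive position fixes and their timestamps. This is a generic sketch, not the patent's method; the haversine formula, Earth radius constant, and sample coordinates below are assumptions for illustration.

```python
# Hypothetical speed estimate from two consecutive GPS fixes, following the
# idea of continuous real-time position calculation described above.
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (assumed constant)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def speed_mps(fix_a, fix_b):
    """Each fix is (lat, lon, unix_time); returns speed in metres per second."""
    (lat1, lon1, t1), (lat2, lon2, t2) = fix_a, fix_b
    return haversine_m(lat1, lon1, lat2, lon2) / (t2 - t1)

# Two fixes one second apart, roughly 15.7 m of northward movement.
v = speed_mps((31.2304, 121.4737, 0.0), (31.2305408, 121.4737, 1.0))
assert 14 < v < 18
```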
The A/V input unit 120 is used to receive audio or video signals. The A/V input unit 120 may include a camera 121 and a microphone 122. The camera 121 processes image data of still pictures or video obtained by an image capture device in a video capture mode or an image capture mode, and the processed image frames may be displayed on a display unit 151. The image frames processed by the camera 121 may be stored in the memory 160 (or other storage medium) or sent via the wireless communication unit 110; two or more cameras 121 may be provided according to the construction of the mobile terminal. The microphone 122 can receive sound (audio data) in operating modes such as a phone call mode, a recording mode and a voice recognition mode, and can process such sound into audio data. In the case of the phone call mode, the processed audio (voice) data can be converted into a form that can be sent to a mobile communication base station via the mobile communication module 112 and output. The microphone 122 can implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) the noise or interference produced while receiving and sending audio signals.
The user input unit 130 can generate key input data according to commands input by the user to control various operations of the mobile terminal. The user input unit 130 allows the user to input various types of information and may include a keyboard, a dome switch, a touch pad (for example, a touch-sensitive component that detects changes in resistance, pressure, capacitance and the like caused by being touched), a jog wheel, a jog stick, and so on. In particular, when the touch pad is superimposed on the display unit 151 as a layer, a touch screen can be formed.
The sensing unit 140 detects the current state of the mobile terminal 100 (for example, whether the mobile terminal 100 is open or closed), the position of the mobile terminal 100, the presence or absence of user contact with the mobile terminal 100 (i.e., touch input), the orientation of the mobile terminal 100, the acceleration or deceleration and direction of movement of the mobile terminal 100, and so on, and generates commands or signals for controlling the operation of the mobile terminal 100. For example, when the mobile terminal 100 is implemented as a slide-type mobile phone, the sensing unit 140 can sense whether the slide-type phone is open or closed. In addition, the sensing unit 140 can detect whether the power supply unit 190 is providing power and whether the interface unit 170 is coupled with an external device.
The interface unit 170 serves as an interface through which at least one external device can connect with the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on. The identification module may store various information for verifying the user's use of the mobile terminal 100 and may include a user identity module (UIM), a subscriber identity module (SIM), a universal subscriber identity module (USIM), and so on. In addition, the device having the identification module (hereinafter referred to as the "identification device") may take the form of a smart card; the identification device can therefore be connected with the mobile terminal 100 via a port or other connection means. The interface unit 170 can be used to receive input (e.g., data information, power, etc.) from an external device and transfer the received input to one or more elements within the mobile terminal 100, or can be used to transfer data between the mobile terminal and an external device.
In addition, when the mobile terminal 100 is connected with an external cradle, the interface unit 170 can serve as a path through which power is provided from the cradle to the mobile terminal 100, or as a path through which various command signals input from the cradle are transferred to the mobile terminal. Various command signals or power input from the cradle may serve as signals for recognizing whether the mobile terminal is accurately mounted on the cradle. The output unit 150 is configured to provide output signals in a visual, audible and/or tactile manner (e.g., audio signals, video signals, alarm signals, vibration signals, etc.). The output unit 150 may include a display unit 151, an audio output module 152, an alarm unit 153, and so on.
The display unit 151 may display information processed in the mobile terminal 100. For example, when the mobile terminal 100 is in the phone call mode, the display unit 151 can display a user interface (UI) or graphical user interface (GUI) related to the call or other communication (e.g., text messaging, multimedia file downloading, etc.). When the mobile terminal 100 is in the video call mode or the image capture mode, the display unit 151 can display captured and/or received images, a UI or GUI showing video or images and related functions, and so on.
Meanwhile, when the display unit 151 and the touch pad are superimposed on each other as a layer to form a touch screen, the display unit 151 can serve as both an input device and an output device. The display unit 151 may include at least one of a liquid crystal display (LCD), a thin film transistor LCD (TFT-LCD), an organic light-emitting diode (OLED) display, a flexible display, a three-dimensional (3D) display, and the like. Some of these displays may be constructed to be transparent to allow the user to view from the outside; these may be termed transparent displays, and a typical transparent display may be, for example, a TOLED (transparent organic light-emitting diode) display. Depending on the particular desired embodiment, the mobile terminal 100 may include two or more display units (or other display devices); for example, the mobile terminal may include an external display unit (not shown) and an internal display unit (not shown). The touch screen can be used to detect touch input pressure as well as touch input position and touch input area.
The audio output module 152 can, when the mobile terminal is in modes such as a call signal reception mode, a call mode, a recording mode, a voice recognition mode or a broadcast reception mode, convert audio data received by the wireless communication unit 110 or stored in the memory 160 into an audio signal and output it as sound. Moreover, the audio output module 152 can provide audio output related to a specific function executed by the mobile terminal 100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output module 152 may include a speaker, a buzzer, and so on.
The alarm unit 153 can provide output to notify the mobile terminal 100 of the occurrence of an event. Typical events may include call reception, message reception, key signal input, touch input, and so on. In addition to audio or video output, the alarm unit 153 can provide output in different ways to notify of the occurrence of an event. For example, the alarm unit 153 can provide output in the form of vibration; when a call, a message or some other incoming communication is received, the alarm unit 153 can provide a tactile output (i.e., vibration) to notify the user. By providing such tactile output, the user can recognize the occurrence of various events even when the user's mobile phone is in the user's pocket. The alarm unit 153 can also provide output notifying the occurrence of an event via the display unit 151 or the audio output module 152.
The memory 160 may store software programs of the processing and control operations executed by the controller 180, or may temporarily store data that has been output or is to be output (for example, a phone book, messages, still images, video, and the like). Moreover, the memory 160 may store data regarding the vibration and audio signals of various modes that are output when a touch is applied to the touch screen.
The memory 160 may include at least one type of storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disc, and the like. Moreover, the mobile terminal 100 may cooperate over a network connection with a network storage device that performs the storage function of the memory 160.
The controller 180 generally controls the overall operation of the mobile terminal. For example, the controller 180 performs control and processing related to voice calls, data communication, video calls, and the like. In addition, the controller 180 may include a multimedia module 1810 for reproducing (or playing back) multimedia data; the multimedia module 1810 may be constructed within the controller 180, or may be constructed separately from the controller 180. The controller 180 may perform pattern recognition processing to recognize handwriting input or picture drawing input performed on the touch screen as characters or images.
The power supply unit 190 receives external power or internal power under the control of the controller 180 and provides the appropriate power required to operate each element and component.
The various embodiments described herein may be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof. For a hardware implementation, the embodiments described herein may be implemented using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, or an electronic unit designed to perform the functions described herein; in some cases, such embodiments may be implemented in the controller 180. For a software implementation, an embodiment such as a procedure or a function may be implemented with a separate software module that allows at least one function or operation to be performed. Software code may be implemented by a software application (or program) written in any suitable programming language, and may be stored in the memory 160 and executed by the controller 180.
So far, the mobile terminal has been described in terms of its functions. Hereinafter, for the sake of brevity, a slide-type mobile terminal among the various types of mobile terminals such as folder-type, bar-type, swing-type, and slide-type mobile terminals will be described as an example. However, the present invention can be applied to any type of mobile terminal, and is not limited to the slide-type mobile terminal.
The mobile terminal 100 as shown in Fig. 1 may be constructed to operate with communication systems that transmit data via frames or packets, such as wired and wireless communication systems and satellite-based communication systems.
The communication systems in which the mobile terminal according to the present invention is operable will now be described with reference to Fig. 2. Such communication systems may use different air interfaces and/or physical layers. For example, air interfaces used by communication systems include frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), the universal mobile telecommunications system (UMTS) (in particular, long term evolution (LTE)), the global system for mobile communications (GSM), and the like. As a non-limiting example, the following description relates to a CDMA communication system, but such teaching applies equally to other types of systems.
Referring to Fig. 2, the CDMA wireless communication system may include a plurality of mobile terminals 100, a plurality of base stations (BS) 270, base station controllers (BSC) 275, and a mobile switching center (MSC) 280. The MSC 280 is configured to form an interface with a public switched telephone network (PSTN) 290. The MSC 280 is also configured to form an interface with the BSCs 275, which may be coupled to the base stations 270 via backhaul lines. The backhaul lines may be constructed according to any one of several known interfaces, including, for example, E1/T1, ATM, IP, PPP, frame relay, HDSL, ADSL, or xDSL. It will be understood that the system as shown in Fig. 2 may include a plurality of BSCs 275.
Each BS 270 may serve one or more sectors (or regions), each sector covered by an omnidirectional antenna or an antenna pointing in a specific direction radially away from the BS 270. Alternatively, each sector may be covered by two or more antennas for diversity reception. Each BS 270 may be constructed to support a plurality of frequency assignments, each frequency assignment having a specific spectrum (for example, 1.25 MHz, 5 MHz, and the like).
The intersection of a sector and a frequency assignment may be referred to as a CDMA channel. The BS 270 may also be referred to as a base transceiver station (BTS) or by other equivalent terms. In such a case, the term "base station" may be used to broadly denote a single BSC 275 and at least one BS 270. A base station may also be referred to as a "cell site". Alternatively, each sector of a particular BS 270 may be referred to as a plurality of cell sites.
As shown in Fig. 2, a broadcast transmitter (BT) 295 transmits a broadcast signal to the mobile terminals 100 operating within the system. The broadcast receiving module 111 as shown in Fig. 1 is provided at the mobile terminal 100 to receive the broadcast signal transmitted by the BT 295. In Fig. 2, several global positioning system (GPS) satellites 300 are shown. The satellites 300 help locate at least one of the plurality of mobile terminals 100. Although a plurality of satellites 300 are depicted in Fig. 2, it is understood that useful positioning information may be obtained with any number of satellites. The GPS module 115 as shown in Fig. 1 is generally configured to cooperate with the satellites 300 to obtain the desired positioning information. In place of or in addition to GPS tracking techniques, other techniques that can track the position of the mobile terminal may be used. In addition, at least one GPS satellite 300 may alternatively or additionally handle satellite DMB transmission.
As one typical operation of the wireless communication system, the BS 270 receives reverse link signals from various mobile terminals 100. The mobile terminals 100 typically engage in calls, messaging, and other types of communications. Each reverse link signal received by a particular base station 270 is processed within that particular BS 270, and the resulting data is forwarded to the associated BSC 275. The BSC provides call resource allocation and mobility management functions, including the coordination of soft handoff procedures between the BSs 270. The BSC 275 also routes the received data to the MSC 280, which provides additional routing services for forming an interface with the PSTN 290. Similarly, the PSTN 290 forms an interface with the MSC 280, the MSC forms an interface with the BSCs 275, and the BSCs 275 in turn control the BSs 270 to transmit forward link signals to the mobile terminals 100.
Embodiment one
As shown in Fig. 3, the first embodiment of the present invention proposes a video editing method applied to a mobile terminal, the mobile terminal being based on the above-described mobile terminal hardware structure and communication system. The video editing method in this embodiment includes:
Step 310, enabling the video recording function.
Specifically, by receiving an input instruction from the user, a specific application of the mobile terminal is selected for video recording; alternatively, the camera program of the mobile terminal is selected and a video recording mode is entered.
Preferably, before step 310, the method further includes: presetting a video dictionary, the dictionary including a plurality of words.
Specifically, the plurality of words in the dictionary can be obtained in the following ways:
Mode one: descriptive words are imported into the system in advance by crawling hot words on the Internet and the like, forming a preset dictionary for editing video. Preferably, after it is detected that the mobile terminal is connected to a network (preferably a wireless network, and secondarily a carrier network), real-time updating of the words in the preset dictionary can be achieved; network vocabulary can be crawled and updated according to preset rules, including mood words, network buzzwords, and the like.
Mode two: receiving words entered by the user by voice.
Mode three: receiving words input by the user as text, and converting the text input into speech words and phrases.
Further, the dictionary may be classified by language to form subdirectories; that is, the dictionary is divided according to the language of the words into Chinese, English, Korean, and the like.
Further, if the dictionary is preset with a user's voice entry in one language, it is automatically translated into other languages and listed in the corresponding dictionary subdirectories. For example, if the dictionary is preset with the Chinese speech entry for "excellent", it is automatically translated into English as "wonderful" and into the corresponding Korean word, and "wonderful" and the Korean word are put into the English subdirectory and the Korean subdirectory, respectively. If multiple synonyms exist in the translation, they are all put into the dictionary subdirectory together.
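The dictionary organization described above, with per-language subdirectories that hold a word together with its translations and synonyms, can be sketched as follows. This is a minimal illustration only; the class, the method names, and the placeholder entries (e.g. "brilliant" as a synonym) are assumptions, not part of the patent:

```python
from collections import defaultdict

class PresetDictionary:
    """Toy model of the preset video dictionary with per-language subdirectories."""

    def __init__(self):
        # language code -> set of words in that language's subdirectory
        self.subdirectories = defaultdict(set)

    def add_entry(self, word, language, translations=None):
        # File the word under its own language, and each translation
        # (synonyms included) under the translation's language.
        self.subdirectories[language].add(word)
        for lang, synonyms in (translations or {}).items():
            self.subdirectories[lang].update(synonyms)

    def contains(self, word, language):
        return word in self.subdirectories[language]

d = PresetDictionary()
# "excellent" stands in for the Chinese voice entry in the example above.
d.add_entry("excellent", "zh", translations={"en": ["wonderful", "brilliant"]})
print(d.contains("wonderful", "en"))  # True
```

Keeping one subdirectory per language makes the later "display in the configured language" step a simple lookup in the matching subdirectory.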
Step 320, receiving a speech input.
Specifically, after the video recording function is enabled, the recorded audio content data is passed to the upper-layer software; the data content includes the semantic content and the corresponding recording time point.
Further, speech recognition technology is an interdisciplinary field. Over the past two decades, speech recognition technology has made remarkable progress and has begun to move from the laboratory to the market. It is expected that within the next ten years, speech recognition technology will enter fields such as industry, home appliances, communications, automotive electronics, medical care, home services, and consumer electronics. The application of speech recognition dictation machines in certain fields was selected by the U.S. news media as one of the ten major events in computer development in 1997. Many experts regard speech recognition as one of the ten important technology developments in the information technology field between 2000 and 2010. The fields involved in speech recognition technology include signal processing, pattern recognition, probability theory and information theory, sound production and hearing mechanisms, artificial intelligence, and so on. Communicating with a machine by speech and having the machine understand what you say is something people have long dreamed of. The China Internet of Things school-enterprise alliance vividly compares speech recognition to "the auditory system of a machine". Speech recognition technology allows a machine, through a process of recognition and understanding, to convert a speech signal into corresponding text or commands. Speech recognition technology mainly covers three aspects: feature extraction, pattern matching criteria, and model training.
Further, the language of the speech input may be Chinese, English, Korean, and the like.
Step 330, judging whether the content of the speech input contains a word belonging to the dictionary. If so, proceed to step 340; if not, proceed to step 350.
Specifically, the feature vector of the input speech is compared for similarity with each word in the dictionary in turn, and the word with the highest similarity is output as the recognition result.
More specifically, step 330 can be implemented in the following two modes:
Mode one: the upper-layer software matches the collected keyword against the words of the preset dictionary. If the matching result is that the collected word belongs to the dictionary, proceed to step 340 for further processing; if the matching result is that it does not belong to the dictionary, proceed to step 350.
Mode two: according to the content of the speech input, the preset dictionary is searched for a matching sentence. If words and phrases matching the content of the speech input are found, proceed to step 340 for further processing; if no words and phrases matching the content of the speech input are found, proceed to step 350.
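The matching in step 330 can be illustrated with a small sketch. Here a string-similarity ratio stands in for the feature-vector comparison described above; the similarity measure, the threshold, and the word list are assumptions for illustration only:

```python
import difflib

# Assumed sample contents of the preset dictionary.
PRESET_DICTIONARY = {"wonderful", "awesome", "amazing"}

def best_match(recognized_word, dictionary, threshold=0.8):
    """Return the dictionary word most similar to the input, or None.

    Mirrors step 330: compare against each dictionary word in turn and
    keep the highest-scoring one as the recognition result.
    """
    best, best_score = None, 0.0
    for word in dictionary:
        score = difflib.SequenceMatcher(None, recognized_word, word).ratio()
        if score > best_score:
            best, best_score = word, score
    return best if best_score >= threshold else None

print(best_match("wonderfull", PRESET_DICTIONARY))  # -> wonderful
print(best_match("boring", PRESET_DICTIONARY))      # -> None (filtered in step 350)
```

A match routes the content to step 340 (display); a None result routes it to step 350 (automatic filtering).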
Further, if in the content of the speech input only one word or a few words match words of the preset dictionary, then, referring to Fig. 4, step 330 further includes:
Step 410, dividing the content of the speech input into multiple segments according to a preset rule, and extracting the sentence of each segment separately.
Specifically, the preset rule may be: first, dividing the content of the speech input into one of, or any combination of, words, phrases, and common expressions; or second, dividing the content of the speech input according to subject-predicate-object grammar.
Step 420, judging whether each sentence contains a word belonging to the dictionary; if so, proceed to step 340, and if not, proceed to step 350.
Specifically, it is judged whether each word, phrase, common expression, or any combination thereof contains a word belonging to the dictionary; alternatively, it is judged whether each subject, predicate, and object contains a word belonging to the dictionary.
Step 340, displaying the content of the speech input in the video in text form.
Specifically, when the content of the speech input contains a word in the dictionary, the content of the speech input is retained; the upper-layer software locates the recording time point corresponding to the retained content and presents the content of the speech input, in text form, at a preset position of the video image.
Further, the language category for video display may be set, and the content of the speech input is displayed in the video image in text of that language category. For example: if the language of the received speech input is English and the language category set for video display is Chinese, then the Chinese word corresponding to the received speech content is extracted from the Chinese lexicon subdirectory and displayed in the video image.
Further, in addition to being displayed in text form, specific punctuation marks, emoticons, and the like may also be displayed according to the semantics and context of the content of the speech input, forming a presentation similar to a bullet screen (barrage).
Further, the display style of the text may be set, for example: stylized and decorative fonts of various colors and sizes, and the like.
Further, as a further improvement of this embodiment, the method further includes: receiving an edit-video instruction input by the user, so that the displayed content can be edited according to the needs of the user, which increases the interest of video recording.
Step 350, automatically filtering out the content of the speech input.
Specifically, when the content of the speech input does not contain a word in the dictionary, the content of the speech input is automatically filtered out.
In the video editing method of this embodiment, the video recording function is enabled and a speech input is received; when it is judged that the content of the speech input contains a word belonging to the dictionary, the content of the speech input is displayed in the video in text form. Compared with the prior art, the video editing method of the present invention eliminates operations such as post-production editing of the video image, since the text can be formed together with the video image. This not only increases the interest of the video recording process but also improves the efficiency of video recording, bringing a better experience to the user.
Embodiment two
Referring to Fig. 5, the second embodiment of the present invention further provides a video editing method, the method including the steps of:
Step 510, enabling the video recording function.
Step 520, receiving a speech input.
The content of the above steps 510-520 is identical to that of steps 310-320 in the first embodiment; for identical content, details are not repeated here in this embodiment.
Step 530, obtaining a keyword in the speech input.
Specifically, a keyword is obtained from the content of the speech input; the keyword may be a word, a phrase, a common expression, a network buzzword, a mood word, or the like.
Further, the obtained keyword may be in speech form, in text form, or in speech-to-text form.
Further, from the received speech, only one keyword may be obtained, or multiple keywords may be obtained.
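Step 530 might be sketched as below, assuming the keywords are picked out by matching against a buzzword list; the patent does not prescribe a particular extraction algorithm, so the word list and the cap on the number of keywords are illustrative assumptions:

```python
# Assumed sample buzzword list (mood words, network buzzwords, etc.).
BUZZWORDS = {"epic", "awesome", "wonderful"}

def extract_keywords(recognized_text, max_keywords=3):
    """Return the buzzwords found in the speech-to-text result, in order.

    One keyword or several may be returned, matching the text above.
    """
    words = [w.strip(".,!?").lower() for w in recognized_text.split()]
    hits = [w for w in words if w in BUZZWORDS]
    return hits[:max_keywords]

print(extract_keywords("That save was awesome, truly epic!"))
# -> ['awesome', 'epic']
```

Each extracted keyword is then checked against the dictionary in step 540 before being displayed or filtered.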
Step 540, judging whether the keyword belongs to a word in the dictionary. If so, proceed to step 550; if not, proceed to step 560.
Step 550, displaying the keyword in the video in text form.
Step 560, automatically filtering out the keyword.
The content of the above steps 540-560 is similar to that of steps 330-350 in the first embodiment, and is not repeated here in this embodiment.
In the video editing method of this embodiment, a keyword is extracted from the received speech input, and when it is judged that the keyword contains a word belonging to the dictionary, the keyword is displayed in the video in text form, which increases the interest of the video recording process and improves video editing efficiency.
Embodiment three
Referring to Fig. 6, the third embodiment of the present invention further provides a video editing method. In the third embodiment, the video editing method is a further improvement on the basis of the first embodiment, differing only in that the step of displaying the content of the speech input in the video in text form in the first embodiment includes:
Step 610, extracting the recording time at which the speech input is received.
Specifically, the upper-layer software locates the recording time corresponding to the content of the retained speech input, and extracts this recording time.
Step 620, converting the content of the speech input into text information.
Specifically, the content of the received speech input is converted into text information.
Step 630, displaying the text information in the video image at the recording time.
Specifically, at the recording time, the text information is displayed at a specified position of the video image.
Further, the specified position may be the top, bottom, middle, or another position of the screen displaying the video image, and may be configured according to the needs of the user.
For example: "excellent" of a first speech input is received at the 1st minute of recording, and "Gao Fushuai" (a buzzword meaning "tall, rich and handsome") of a second speech input is received at the 2nd minute of recording; "excellent" and "Gao Fushuai" of the speech inputs are respectively converted into text information, and "excellent" is displayed in the video image at the 1st minute while "Gao Fushuai" is displayed in the video image at the 2nd minute.
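Steps 610-630 and the example above can be sketched as a simple time-indexed overlay table: each speech-to-text result is stored with its recording time, then looked up at playback. The data shapes and the five-second display window are assumptions:

```python
overlays = []  # list of (recording_time_in_seconds, text)

def on_speech_recognized(recording_time, text):
    """Steps 610 + 620: record the time and the speech-to-text result."""
    overlays.append((recording_time, text))

def captions_at(playback_time, duration=5):
    """Step 630: captions whose recording time falls in the display window."""
    return [text for t, text in overlays
            if t <= playback_time < t + duration]

on_speech_recognized(60, "excellent")     # received at the 1st minute
on_speech_recognized(120, "Gao Fushuai")  # received at the 2nd minute
print(captions_at(62))   # -> ['excellent']
print(captions_at(121))  # -> ['Gao Fushuai']
```

Because each caption carries its own recording time, the text appears in the video image exactly at the moment the corresponding speech input was received.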
Referring to Fig. 7, as a further improvement of this embodiment, after step 630 the method further includes:
Step 710, detecting a face position in the video image.
Specifically, if there is one face in the video image, the position coordinates of this face in the video image are detected. If there are multiple faces in the video image, the position coordinates of each face in the video image are detected.
Step 720, displaying the text information so that it avoids the face position, with a preset special effect.
Specifically, the text information is displayed at a non-face position and is provided with a preset effect, for example: corresponding punctuation marks, emoticons, or stylized fonts, forming a bullet-screen (barrage) effect.
If no face is present in the video image, the text information need not avoid a face position, and is displayed at the preset specified position.
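Step 720 can be sketched as choosing an overlay slot that does not intersect any detected face rectangle, falling back to the default position when no face is present. The candidate positions, frame size, and (x, y, width, height) rectangle format are assumptions:

```python
def intersects(a, b):
    # Axis-aligned rectangle overlap test; rectangles are (x, y, w, h).
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def place_caption(faces, frame_w=1280, frame_h=720, cap_w=400, cap_h=60):
    """Return a caption rectangle that avoids every detected face."""
    # Try bottom, top, then middle of the frame, as the text suggests.
    candidates = [
        (frame_w // 2 - cap_w // 2, frame_h - cap_h, cap_w, cap_h),  # bottom
        (frame_w // 2 - cap_w // 2, 0, cap_w, cap_h),                # top
        (frame_w // 2 - cap_w // 2, frame_h // 2, cap_w, cap_h),     # middle
    ]
    for slot in candidates:
        if not any(intersects(slot, face) for face in faces):
            return slot
    return candidates[0]  # fall back to the default position

# A face near the bottom center pushes the caption to the top of the frame.
print(place_caption([(500, 600, 200, 120)]))  # -> (440, 0, 400, 60)
```

With an empty face list the default (bottom) slot is used, matching the no-face case described above.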
In the video editing method of this embodiment, the recording time at which the speech input is received is extracted, the content of the speech input is converted into text information, and the text information is displayed in the video image at the recording time, which improves the interest and playability of video recording and brings a better user experience.
Embodiment four
The present invention further provides a video editing apparatus.
Referring to Fig. 8, Fig. 8 shows the video editing apparatus provided by the fourth embodiment of the present invention.
The video editing apparatus of this embodiment is applied to a mobile terminal, and the apparatus includes:
An enabling module 810, for enabling the video recording function.
Specifically, the enabling module 810 receives an input instruction from the user and selects a specific application of the mobile terminal for video recording; alternatively, it selects the camera program of the mobile terminal and enters a video recording mode.
Preferably, the apparatus may also include: a dictionary setup module, for presetting a video dictionary, the dictionary including a plurality of words.
Specifically, the plurality of words in the dictionary can be obtained in the following ways:
Mode one: the dictionary setup module imports descriptive words into the system in advance by crawling hot words on the Internet and the like, forming a preset dictionary for editing video. Preferably, after it is detected that the mobile terminal is connected to a network (preferably a wireless network, and secondarily a carrier network), real-time updating of the words in the preset dictionary can be achieved; network vocabulary can be crawled and updated according to preset rules, including mood words, network buzzwords, and the like.
Mode two: the dictionary setup module receives words entered by the user by voice.
Mode three: the dictionary setup module receives words input by the user as text, and converts the text input into speech words and phrases.
Further, the dictionary setup module can classify the dictionary by language to form subdirectories; that is, the dictionary is divided according to the language of the words into Chinese, English, Korean, and the like.
Further, if the dictionary is preset with a user's voice entry in one language, the dictionary setup module automatically translates it into other languages and lists them in the corresponding dictionary subdirectories. For example, if the dictionary is preset with the Chinese speech entry for "excellent", the dictionary setup module automatically translates it into English as "wonderful" and into the corresponding Korean word, and puts "wonderful" and the Korean word into the English subdirectory and the Korean subdirectory, respectively. If multiple synonyms exist in the translation, they are all put into the dictionary subdirectory together.
A receiving module 820, for receiving a speech input.
Specifically, after the video recording function is enabled, the audio content data recorded by the receiving module 820 is passed to the upper-layer software; the data content includes the semantic content and the corresponding recording time point.
Further, the language of the speech input may be Chinese, English, Korean, and the like.
A judging module 830, for judging whether the content of the speech input contains a word belonging to the dictionary; if so, the editing module 840 is triggered, and if not, the filtering module 850 is triggered.
Specifically, the judging module 830 compares the feature vector of the input speech for similarity with each word in the dictionary in turn, and outputs the word with the highest similarity as the recognition result.
More specifically, the judging module 830 can judge in the following two modes:
Mode one: the judging module 830 matches the collected keyword against the words of the preset dictionary. If the matching result is that the collected word belongs to the dictionary, the editing module 840 is triggered for further processing; if the matching result is that it does not belong to the dictionary, the filtering module 850 is triggered.
Mode two: according to the content of the speech input, the judging module 830 searches the preset dictionary for a matching sentence. If words and phrases matching the content of the speech input are found, the editing module 840 is triggered for further processing; if no words and phrases matching the content of the speech input are found, the filtering module 850 is triggered.
Further, if in the content of the speech input only one word or a few words match words of the preset dictionary, then, referring to Fig. 9, the judging module 830 further includes:
A division unit 910, for dividing the content of the speech input into multiple segments according to a preset rule, and extracting the sentence of each segment separately.
Specifically, the preset rule may be: first, dividing the content of the speech input into one of, or any combination of, words, phrases, and common expressions; or second, dividing the content of the speech input according to subject-predicate-object grammar.
A judging unit 920, for judging whether each sentence contains a word belonging to the dictionary; if so, the editing module 840 is triggered, and if not, the filtering module 850 is triggered.
Specifically, the judging unit 920 judges whether each word, phrase, common expression, or any combination thereof contains a word belonging to the dictionary; alternatively, it judges whether each subject, predicate, and object contains a word belonging to the dictionary.
An editing module 840, for displaying the content of the speech input in the video in text form.
Specifically, when the content of the speech input contains a word in the dictionary, the editing module 840 retains the content of the speech input, locates the recording time point corresponding to the retained content, and presents the content of the speech input, in text form, at a preset position of the video image.
Further, the language category for video display may be set, and the editing module 840 displays the content of the speech input in the video image in text of that language category. For example: if the language of the received speech input is English and the language category set for video display is Chinese, then the Chinese word corresponding to the received speech content is extracted from the Chinese lexicon subdirectory and displayed in the video image.
Further, in addition to displaying in text form, the editing module 840 may also display specific punctuation marks, emoticons, and the like according to the semantics and context of the content of the speech input, forming a presentation similar to a bullet screen (barrage).
Further, the display style of the text may be set, for example: stylized and decorative fonts of various colors and sizes, and the like.
Further, as a further improvement of this embodiment, the editing module 840 is also used to receive an edit-video instruction input by the user, so that the displayed content can be edited according to the needs of the user, which increases the interest of video recording.
A filtering module 850, for automatically filtering out the content of the speech input.
Specifically, when the content of the speech input does not contain a word in the dictionary, the filtering module 850 automatically filters out the content of the speech input.
In the video editing apparatus of this embodiment, the enabling module 810 enables the video recording function, the receiving module 820 receives a speech input, and when the judging module 830 judges that the content of the speech input contains a word belonging to the dictionary, the editing module 840 displays the content of the speech input in the video in text form. Compared with the prior art, the video editing apparatus of the present invention eliminates operations such as post-production editing of the video image, since the text can be formed together with the video image; this not only increases the interest of the video recording process but also improves the efficiency of video recording, bringing a better experience to the user.
Embodiment five
The fifth embodiment of the present invention further provides a video editing apparatus. In the fifth embodiment, the video editing apparatus is a further improvement on the basis of the fourth embodiment, differing only in that the apparatus also includes an obtaining module.
An obtaining module, for obtaining a keyword in the speech input received by the receiving module.
Specifically, the obtaining module obtains a keyword from the content of the speech input; the keyword may be a word, a phrase, a common expression, a network buzzword, a mood word, or the like.
Further, the keyword obtained by the obtaining module may be in speech form, in text form, or in speech-to-text form.
Further, from the speech content received by the receiving module, the obtaining module may obtain only one keyword, or may obtain multiple keywords.
In the video editing apparatus of this embodiment, the obtaining module extracts a keyword from the received speech input, and when it is judged that the keyword contains a word belonging to the dictionary, the keyword is displayed in the video in text form, which increases the interest of the video recording process and improves video editing efficiency.
Embodiment six
Referring to Fig. 10, the sixth embodiment of the present invention further provides a video editing apparatus. In the sixth embodiment, the video editing apparatus is a further improvement on the basis of the fourth embodiment, differing only in that the editing module further includes:
An extraction unit 1010, configured to extract the recording time at which the voice input is received.
Specifically, the extraction unit 1010 locates the recording time corresponding to the content of the received voice input and extracts this recording time.
A conversion unit 1020, configured to convert the content of the voice input into text information.
Specifically, the conversion unit 1020 converts the content of the received voice input into text information.
An editing unit 1030, configured to display the text information in the video image at the recording time.
Specifically, at the recording time, the editing unit 1030 displays the text information at a specified position in the video image.
Further, the specified position may be the top, bottom, middle, or another position of the screen displaying the video image, and may be configured according to the needs of the user.
For example, a first voice input "excellent" is received at the 1st minute of recording, and a second voice input "Gao Fushuai" is received at the 2nd minute of recording. The conversion unit 1020 converts "excellent" and "Gao Fushuai" of the voice inputs into text information respectively, and the editing unit 1030 displays "excellent" in the video image at the 1st minute and "Gao Fushuai" in the video image at the 2nd minute.
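The timed-overlay behavior of the extraction, conversion, and editing units can be sketched as a simple data model. This is a simplified illustration under assumptions: the class and function names (`CaptionEvent`, `build_caption_track`, `captions_at`) and the fixed 3-second display duration are hypothetical, and a real apparatus would drive this from the recorder clock and a speech-to-text engine.

```python
# Sketch of the extraction/conversion/editing units: each voice input
# is tagged with its recording time, converted to text, and scheduled
# for display at that time in the video.
# All names and the display duration are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class CaptionEvent:
    recording_time_s: float  # when the voice input was received
    text: str                # the converted text information

def build_caption_track(voice_inputs: list[tuple[float, str]]) -> list[CaptionEvent]:
    """Convert (recording_time, recognized_text) pairs into caption events."""
    return [CaptionEvent(t, text) for t, text in voice_inputs]

def captions_at(track: list[CaptionEvent],
                time_s: float,
                duration_s: float = 3.0) -> list[str]:
    """Return the caption texts that should be on screen at a playback time."""
    return [ev.text for ev in track
            if ev.recording_time_s <= time_s < ev.recording_time_s + duration_s]

# The example from the description: "excellent" at minute 1, "Gao Fushuai" at minute 2.
track = build_caption_track([(60.0, "excellent"), (120.0, "Gao Fushuai")])
```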
A detection unit 1040, configured to detect a face location in the video image.
Specifically, if there is one face in the video image, the detection unit 1040 detects the position coordinates of that face in the video image. If there are multiple faces in the video image, the detection unit 1040 detects the position coordinates of each face in the video image.
A display unit 1050, configured to display the text information while avoiding the face location, and to display a preset special effect.
Specifically, the display unit 1050 displays the text information at a non-face position and accompanies it with a preset special effect, for example a corresponding punctuation mark, an emoticon, a stylized font, or a bullet-screen (danmaku) effect.
If there is no face in the video image, the text information need not avoid any face location and is displayed at a preset specified position.
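The face-avoidance placement described for the display unit 1050 can be sketched as follows. This is a hypothetical illustration, not the patented implementation: the coordinate convention, the candidate position list, and all names are assumptions, and a real apparatus would obtain face boxes from a face detector.

```python
# Sketch of the display unit's face-avoidance logic: pick a text
# position from a preset list of candidate positions, skipping any
# candidate that falls inside a detected face bounding box.
# Coordinates, candidates, and names are illustrative assumptions.

def point_in_box(x, y, box):
    """box is (left, top, right, bottom) in pixel coordinates."""
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom

def choose_text_position(face_boxes, candidates):
    """Return the first candidate (x, y) not covered by any face box,
    or fall back to the first candidate if every one overlaps a face."""
    for x, y in candidates:
        if not any(point_in_box(x, y, box) for box in face_boxes):
            return (x, y)
    return candidates[0]  # fallback: no face-free position found

# One face roughly centered in a 1280x720 frame; try middle, top, bottom.
pos = choose_text_position(
    face_boxes=[(500, 200, 780, 520)],
    candidates=[(640, 360), (640, 60), (640, 680)],
)
```

With no detected faces (`face_boxes=[]`), the first candidate is always chosen, matching the "no face, use the preset specified position" case above.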
In the video editing apparatus of this embodiment, the extraction unit 1010 extracts the recording time at which the voice input is received, the conversion unit 1020 converts the content of the voice input into text information, and the editing unit 1030 displays the text information in the video image at the recording time. This improves the interest and watchability of video recording and brings a better user experience.
It should be noted that herein, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus including a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. In the absence of further restrictions, an element limited by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or apparatus that includes that element.
The serial numbers of the embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, and of course may also be implemented by hardware, although in many cases the former is the better implementation. Based on such an understanding, the technical solution of the present invention, or the part thereof that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit the scope of the claims of the present invention. Any equivalent structural or equivalent process transformation made using the description and accompanying drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
1. A video editing method, characterized in that it is applied to a mobile terminal, the method comprising the steps of:
receiving a voice input;
judging whether the content of the voice input contains a word belonging to a dictionary;
if so, displaying the content of the voice input in the video in text form.
2. The video editing method according to claim 1, characterized in that after the receiving of the voice input, the method further comprises:
obtaining a keyword in the voice input;
correspondingly, the judging whether the content of the voice input contains a word belonging to a dictionary comprises: judging whether the keyword belongs to a word in the dictionary.
3. The video editing method according to claim 1, characterized in that the displaying of the content of the voice input in the video in text form comprises:
extracting the recording time at which the voice input is received;
converting the content of the voice input into text information;
displaying the text information in the video image at the recording time.
4. The video editing method according to claim 3, characterized in that the method further comprises:
detecting a face location in the video image;
displaying the text information while avoiding the face location, and displaying a preset special effect.
5. The video editing method according to claim 1, characterized in that the method further comprises:
if not, automatically filtering out the content of the voice input.
6. A video editing apparatus, characterized in that it is applied to a mobile terminal, the apparatus comprising:
a receiving module, configured to receive a voice input;
a judging module, configured to judge whether the content of the voice input contains a word belonging to a dictionary;
an editing module, configured to, when the judging module judges that the content of the voice input contains a word belonging to the dictionary, display the content of the voice input in the video in text form.
7. The video editing apparatus according to claim 6, characterized in that the apparatus further comprises:
an acquisition module, configured to obtain a keyword in the voice input;
correspondingly, the judging module is specifically configured to judge whether the keyword belongs to a word in the dictionary.
8. The video editing apparatus according to claim 6, characterized in that the editing module comprises:
an extraction unit, configured to extract the recording time at which the voice input is received;
a conversion unit, configured to convert the content of the voice input into text information;
an editing unit, configured to display the text information in the video image at the recording time.
9. The video editing apparatus according to claim 8, characterized in that the editing module further comprises:
a detection unit, configured to detect a face location in the video image;
a display unit, configured to display the text information while avoiding the face location, and to display a preset special effect.
10. The video editing apparatus according to claim 6, characterized in that the apparatus further comprises:
a filtering module, configured to automatically filter out the content of the voice input when the judging module judges that the content of the voice input does not contain a word belonging to the dictionary.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610836707.5A CN106341722A (en) | 2016-09-21 | 2016-09-21 | Video editing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106341722A true CN106341722A (en) | 2017-01-18 |
Family
ID=57839030
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106341722A (en) |
Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101067858A (en) * | 2006-09-28 | 2007-11-07 | 腾讯科技(深圳)有限公司 | Network advertisment realizing method and device |
CN101079988A (en) * | 2006-05-24 | 2007-11-28 | 明基电通股份有限公司 | Video access system and recording method |
CN101739437A (en) * | 2009-11-26 | 2010-06-16 | 杭州鑫方软件有限公司 | Implementation method for network sound-searching unit and specific device thereof |
CN102760436A (en) * | 2012-08-09 | 2012-10-31 | 河南省烟草公司开封市公司 | Voice lexicon screening method |
CN103686450A (en) * | 2013-12-31 | 2014-03-26 | 广州华多网络科技有限公司 | Video processing method and system |
CN104113780A (en) * | 2014-06-25 | 2014-10-22 | 小米科技有限责任公司 | Advertisement processing method and apparatus |
CN104333641A (en) * | 2014-09-26 | 2015-02-04 | 小米科技有限责任公司 | Calling method and device |
EP2840803A1 (en) * | 2013-08-16 | 2015-02-25 | Samsung Electronics Co., Ltd. | Display apparatus and control method thereof |
CN104581351A (en) * | 2015-01-28 | 2015-04-29 | 上海与德通讯技术有限公司 | Audio/video recording method, audio/video playing method and electronic device |
CN104618806A (en) * | 2014-03-17 | 2015-05-13 | 腾讯科技(北京)有限公司 | Method, device and system for acquiring comment information of video |
CN104796740A (en) * | 2015-04-28 | 2015-07-22 | 柳州市一呼百应科技有限公司 | Platform for internet video advertisements |
CN104822093A (en) * | 2015-04-13 | 2015-08-05 | 腾讯科技(北京)有限公司 | Comment issuing method and device thereof |
CN105340014A (en) * | 2013-05-31 | 2016-02-17 | 微软技术许可有限责任公司 | Touch optimized design for video editing |
CN105681753A (en) * | 2016-01-18 | 2016-06-15 | 阿亦睿机器人科技(上海)有限公司 | Intelligent monitoring device |
CN105704538A (en) * | 2016-03-17 | 2016-06-22 | 广东小天才科技有限公司 | Method and system for generating audio and video subtitles |
CN105868176A (en) * | 2016-03-02 | 2016-08-17 | 北京同尘世纪科技有限公司 | Text based video synthesis method and system |
CN105898603A (en) * | 2015-12-15 | 2016-08-24 | 乐视网信息技术(北京)股份有限公司 | Voice danmaku generation method and device |
CN105913845A (en) * | 2016-04-26 | 2016-08-31 | 惠州Tcl移动通信有限公司 | Mobile terminal voice recognition and subtitle generation method and system and mobile terminal |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106804006A (en) * | 2017-03-07 | 2017-06-06 | 杭州当虹科技有限公司 | A kind of VR panoramic videos barrage comments on put-on method and system |
CN108829688A (en) * | 2018-06-21 | 2018-11-16 | 北京密境和风科技有限公司 | Implementation method and device across languages interaction |
CN109215655A (en) * | 2018-10-30 | 2019-01-15 | 维沃移动通信有限公司 | The method and mobile terminal of text are added in video |
CN109635777A (en) * | 2018-12-24 | 2019-04-16 | 广东理致技术有限公司 | A kind of video data editing recognition methods and device |
CN109635777B (en) * | 2018-12-24 | 2022-09-13 | 广东理致技术有限公司 | Video data editing and identifying method and device |
CN109862375A (en) * | 2019-01-07 | 2019-06-07 | 北京汉博信息技术有限公司 | Cloud recording and broadcasting system |
CN110850996A (en) * | 2019-09-29 | 2020-02-28 | 上海萌家网络科技有限公司 | Picture/video processing method and device applied to input method |
Legal Events
Date | Code | Title | Description
---|---|---|---
| C06 | Publication | |
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20170118 |