CN106385548A - Mobile terminal and method for generating video captions - Google Patents
Mobile terminal and method for generating video captions
- Publication number: CN106385548A
- Application number: CN201610801534.3A
- Authority
- CN
- China
- Prior art keywords
- talker
- file
- name
- video file
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/278—Subtitling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
- H04M1/72439—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for image or video messaging
Abstract
The present invention provides a mobile terminal comprising a voice recognition module, a caption generation module, and a face recognition module. The voice recognition module is configured to identify the speech content in a video file through a voice recognition function; the caption generation module is configured to convert the speech content into text and assemble the text into a caption file; and the face recognition module is configured to determine each speaker's name in the video file through a face recognition function and tag the caption file with the corresponding speaker's name. The present invention further provides a method for generating video captions. The mobile terminal and the method reduce the time users spend searching for relevant video files, and the speaker-tagged caption files serve as written records of the video files, making the video content convenient to search.
Description
Technical field
The present invention relates to the field of communications, and more particularly to a mobile terminal and a method for generating video captions.
Background technology
With the popularization of electronic devices with camera functions, more and more people use them to record video. Existing electronic devices, however, cannot provide corresponding captions for the video files they record, so they cannot satisfy users who want to read captions; nor, after a meeting is recorded, can they distinguish the speech content of each speaker or provide detailed meeting minutes.
Summary of the invention
The invention provides a mobile terminal that can generate a subtitle file for a video file and distinguish the speech content of each speaker in the video file. The mobile terminal includes:
a voice recognition module, configured to identify the speech content in a video file through a voice recognition function;
a caption generation module, configured to convert the speech content into text and assemble the text into a subtitle file; and
a face recognition module, configured to determine the facial features of each speaker in the video file through a face recognition function, determine the speaker's name from those facial features, and tag the subtitle file with the corresponding speaker's name.
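The first two steps above — converting the recognized speech to text and assembling it into a subtitle file — can be sketched as follows. The `(start, end, text)` segment tuples stand in for the output of the voice recognition function, and the SRT-style layout is an assumed format; the patent does not prescribe a concrete subtitle format.

```python
# Sketch: turn recognized speech segments into an SRT-style subtitle file.
# The (start, end, text) tuples are stand-ins for speech-recognition output.

def fmt_time(seconds):
    """Format seconds as an SRT timestamp, e.g. 2.5 -> '00:00:02,500'."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def make_subtitle_file(segments):
    """Assemble (start, end, text) segments into one SRT-style string."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{fmt_time(start)} --> {fmt_time(end)}\n{text}\n")
    return "\n".join(blocks)

segments = [(0.0, 2.5, "Welcome, everyone."),
            (2.5, 5.0, "Let's begin the meeting.")]
srt = make_subtitle_file(segments)
```

In the terminal, the face (or voice) recognition module would then prepend the identified speaker's name to each block's text line.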
Further, the voice recognition module is also configured to identify a speaker's vocal features through the voice recognition function, determine that speaker's name in the video file from the vocal features, and tag the subtitle file with the corresponding speaker's name.
Further, the mobile terminal also includes:
a retrieval module, configured to obtain a keyword, retrieve the text paragraphs related to the keyword from the converted text, and find the video files to which those paragraphs correspond.
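The retrieval module's behavior can be sketched as a keyword search over the converted text. The `library` mapping and the file names below are illustrative assumptions; in the terminal, the text would come from the caption generation module.

```python
# Sketch: keyword retrieval over subtitle text, keyed by video file name.

def retrieve(library, keyword):
    """Return (video_file, paragraph) pairs whose text contains the keyword."""
    hits = []
    for video_file, paragraphs in library.items():
        for para in paragraphs:
            if keyword in para:
                hits.append((video_file, para))
    return hits

library = {
    "meeting_0601.mp4": ["Budget review for Q3.", "Action items assigned."],
    "meeting_0608.mp4": ["Hiring plan discussion."],
}
hits = retrieve(library, "Budget")
```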
Further, the mobile terminal also includes:
a document creation module, configured to arrange the subtitle file, the time the subtitle file occupies in the video file, and the speaker's name corresponding to the subtitle file into document information; and
a display module, configured to display the video file and the subtitle file synchronously.
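A minimal sketch of the document creation module's arrangement follows. The line-per-caption layout is an assumed rendering; the patent requires only that the caption, its time in the video, and the speaker's name be collected together into document information.

```python
# Sketch: arrange (time, speaker, caption) entries into a minutes-style document.

def make_minutes(entries):
    """entries: list of (time_str, speaker, caption) -> one document string."""
    lines = [f"[{t}] {speaker}: {caption}" for t, speaker, caption in entries]
    return "\n".join(lines)

entries = [("00:00:01", "Alice", "Welcome, everyone."),
           ("00:00:04", "Bob", "Thanks. First item is the budget.")]
doc = make_minutes(entries)
```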
Further, the mobile terminal also includes:
a control module, configured to receive a first control signal and, according to the first control signal, close the face recognition function and open the voice recognition function, so that the speaker's name in the video file is determined by the voice recognition function; and to receive a second control signal and, according to the second control signal, close the voice recognition function and open the face recognition function, so that the speaker's name in the video file is determined by the face recognition function.
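The control module's mutually exclusive switching can be sketched as a small state machine. The signal names `"FIRST"` and `"SECOND"` are placeholders for whatever control signals the terminal actually receives; the patent specifies only that each signal closes one recognition function and opens the other.

```python
# Sketch: toggle between face recognition and voice recognition exclusively.

class RecognitionController:
    def __init__(self):
        # Assume face recognition is the default speaker-naming function.
        self.face_enabled = True
        self.voice_enabled = False

    def handle(self, signal):
        if signal == "FIRST":    # first control signal: name speakers by voice
            self.face_enabled, self.voice_enabled = False, True
        elif signal == "SECOND": # second control signal: name speakers by face
            self.face_enabled, self.voice_enabled = True, False

ctrl = RecognitionController()
ctrl.handle("FIRST")
state_after_first = (ctrl.face_enabled, ctrl.voice_enabled)
ctrl.handle("SECOND")
state_after_second = (ctrl.face_enabled, ctrl.voice_enabled)
```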
The mobile terminal provided by the present invention identifies the speech content in a video file through a voice recognition function, converts the speech content into text, organizes the text into a subtitle file, displays the subtitle file synchronously with the video file, determines the name of each speaker in the video file, and tags the subtitle corresponding to each speaker's speech with that speaker's name. This makes the captions of a video file convenient to read, distinguishes each speaker's speech content after a video is recorded, provides detailed written records, and simplifies the workflow of transcribing a video file into text.
The present invention also provides a method for generating video captions that can generate a subtitle file for a video file and distinguish the speech content of each speaker in the video file. The method for generating video captions includes:
identifying the speech content in a video file through a voice recognition function;
converting the speech content into text and assembling the text into a subtitle file; and
determining the facial features of each speaker in the video file through a face recognition function, determining the speaker's name from those facial features, and tagging the subtitle file with the corresponding speaker's name.
Further, the method for generating video captions also includes:
identifying a speaker's vocal features through the voice recognition function, determining that speaker's name in the video file from the vocal features, and tagging the subtitle file with the corresponding speaker's name.
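One way the vocal-feature matching could work — purely as an illustration, since the patent does not specify how the sound characteristic is represented or compared — is a nearest-neighbor match of a fixed-length feature vector against enrolled voiceprints:

```python
# Sketch: name a speaker by the closest enrolled voiceprint (Euclidean distance).
# The three-element feature vectors and the enrolled names are illustrative.

def nearest_speaker(voiceprints, feature):
    """Return the enrolled name whose voiceprint is closest to `feature`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(voiceprints, key=lambda name: dist(voiceprints[name], feature))

voiceprints = {"Alice": [0.9, 0.1, 0.3], "Bob": [0.2, 0.8, 0.5]}
name = nearest_speaker(voiceprints, [0.85, 0.15, 0.25])
```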
Further, the method for generating video captions also includes:
obtaining a keyword, retrieving the text paragraphs related to the keyword from the converted text, and finding the video files to which those paragraphs correspond.
Further, the method for generating video captions also includes:
arranging the subtitle file, the time the subtitle file occupies in the video file, and the speaker's name corresponding to the subtitle file into document information; and
displaying the video file and the subtitle file synchronously.
Further, the method for generating video captions also includes:
receiving a first control signal and, according to the first control signal, closing the face recognition function and opening the voice recognition function, the speaker's name in the video file being determined by the voice recognition function; and
receiving a second control signal and, according to the second control signal, closing the voice recognition function and opening the face recognition function, the speaker's name in the video file being determined by the face recognition function.
The method for generating video captions provided by the present invention identifies the speech content in a video file through a voice recognition function, converts the speech content into text, organizes the text into a subtitle file, displays the subtitle file synchronously with the video file, determines the name of each speaker in the video file, and tags the subtitle corresponding to each speaker's speech with that speaker's name. This makes the captions of a video file convenient to read, distinguishes each speaker's speech content after a video is recorded, provides detailed written records, and simplifies the workflow of transcribing a video file into text.
Brief description of the drawings
Fig. 1 is a hardware structure diagram of a mobile terminal implementing embodiments of the present invention;
Fig. 2 is a schematic diagram of a wireless communication system for the mobile terminal shown in Fig. 1;
Fig. 3 is a functional block diagram of a mobile terminal according to a first embodiment of the present invention;
Fig. 4 is a functional block diagram of a mobile terminal according to a second embodiment of the present invention;
Fig. 5 is an application environment diagram of a mobile terminal according to a third embodiment of the present invention;
Fig. 6 is a flowchart of a method for generating video captions according to a fourth embodiment of the present invention;
Fig. 7 is a flowchart of a method for generating video captions according to a fifth embodiment of the present invention.
The realization of the objects, functional features, and advantages of the present invention is further described below with reference to the accompanying drawings in conjunction with the embodiments.
Specific embodiments
It should be appreciated that the specific embodiments described here are intended only to explain the present invention and not to limit it.
Mobile terminals implementing the embodiments of the present invention are now described with reference to the accompanying drawings. In the following description, suffixes such as "module", "part", or "unit" used to denote elements serve only to aid the explanation of the present invention and have no specific meaning in themselves; "module" and "part" may therefore be used interchangeably.
Mobile terminals may be implemented in various forms. For example, the terminals described in the present invention may include mobile terminals such as mobile phones, smartphones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players), and navigation devices, as well as fixed terminals such as digital TVs and desktop computers. In the following it is assumed that the terminal is a mobile terminal; however, those skilled in the art will understand that, apart from elements used especially for mobile purposes, constructions according to the embodiments of the present invention can also be applied to fixed-type terminals.
Fig. 1 is a hardware structure diagram of a mobile terminal implementing embodiments of the present invention.
The mobile terminal 10 may include, but is not limited to, a memory 20, a controller 30, a wireless communication unit 40, an output unit 50, an input unit 60, a camera 70, a microphone 71, an interface unit 80, and a power supply unit 90. Fig. 1 shows the mobile terminal 10 with various components, but it should be understood that not all of the illustrated components are required; more or fewer components may alternatively be implemented. The elements of the mobile terminal 10 are described in detail below.
The wireless communication unit 40 typically includes one or more components that allow radio communication between the mobile terminal 10 and a wireless communication system or a network. For example, the wireless communication unit may include at least one of a broadcast receiving module, a mobile communication module, a wireless internet module, a short-range communication module, and a location information module.
The broadcast receiving module receives broadcast signals and/or broadcast-related information from an external broadcast management server via a broadcast channel. The broadcast channel may include a satellite channel and/or a terrestrial channel. The broadcast management server may be a server that generates and sends broadcast signals and/or broadcast-related information, or a server that receives previously generated broadcast signals and/or broadcast-related information and sends them to a terminal. The broadcast signals may include TV broadcast signals, radio broadcast signals, data broadcast signals, and the like, and may further include broadcast signals combined with TV or radio broadcast signals. The broadcast-related information may also be provided via a mobile communication network, in which case it can be received by the mobile communication module. The broadcast signals may exist in various forms; for example, they may exist in the form of an electronic program guide (EPG) of digital multimedia broadcasting (DMB), an electronic service guide (ESG) of digital video broadcasting-handheld (DVB-H), and so on. The broadcast receiving module may receive signal broadcasts using various types of broadcast systems. In particular, it may receive digital broadcasts using digital broadcast systems such as digital multimedia broadcasting-terrestrial (DMB-T), digital multimedia broadcasting-satellite (DMB-S), digital video broadcasting-handheld (DVB-H), the data broadcast system of MediaFLO (forward link only media), and integrated services digital broadcasting-terrestrial (ISDB-T). The broadcast receiving module may be constructed to suit the various broadcast systems that provide broadcast signals as well as the above digital broadcast systems. Broadcast signals and/or broadcast-related information received via the broadcast receiving module may be stored in the memory 20 (or another type of storage medium).
The mobile communication module sends radio signals to and/or receives radio signals from at least one of a base station (for example, an access point, a node B, etc.), an external terminal, and a server. Such radio signals may include voice call signals, video call signals, or various types of data sent and/or received for text and/or multimedia messages.
The wireless internet module supports wireless internet access for the mobile terminal and may be internally or externally coupled to the terminal. The wireless internet access technologies involved may include WLAN (wireless local area network, Wi-Fi), WiBro (wireless broadband), WiMAX (worldwide interoperability for microwave access), HSDPA (high-speed downlink packet access), and the like.
The short-range communication module supports short-range communication. Examples of short-range communication technologies include Bluetooth™, radio frequency identification (RFID), the Infrared Data Association (IrDA) standard, ultra-wideband (UWB), ZigBee™, and so on.
The location information module checks or obtains the location information of the mobile terminal. A typical example of the location information module is a GPS (global positioning system) module. With current technology, the GPS module calculates distance information from three or more satellites together with accurate time information, and applies triangulation to the calculated information, thereby accurately computing three-dimensional current location information by longitude, latitude, and altitude. The current method for calculating position and time information uses three satellites and corrects the error of the calculated position and time information by using one additional satellite. In addition, the GPS module can calculate speed information by continuously computing the current location in real time.
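The distance-based position fix described above can be illustrated in two dimensions, where three beacons suffice (GPS works analogously in three dimensions, with the fourth satellite absorbing the receiver clock error). Subtracting the circle equations pairwise linearizes the problem; the beacon positions and distances below are illustrative, not taken from the patent.

```python
# Sketch: 2-D trilateration from three beacon positions and measured distances.

def trilaterate_2d(p1, r1, p2, r2, p3, r3):
    """Solve for (x, y) given beacons p1..p3 and distances r1..r3."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtracting the circle equations pairwise yields two linear equations
    # a*x + b*y = c, which are solved here by Cramer's rule.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# A point at (3, 4) measured from beacons at (0,0), (10,0), (0,10):
x, y = trilaterate_2d((0, 0), 5.0, (10, 0), 65 ** 0.5, (0, 10), 45 ** 0.5)
```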
The output unit 50 is configured to provide output signals in a visual, audible, and/or tactile manner (for example, audio signals, video signals, alarm signals, vibration signals, etc.). The output unit 50 may include a display unit 51, an audio output module 52, an alarm unit 53, and so on.
The display unit 51 may display information processed in the mobile terminal 10. For example, when the mobile terminal 10 is in a phone call mode, the display unit 51 may display a user interface (UI) or graphical user interface (GUI) related to the call or to other communication (for example, text messaging, multimedia file downloading, etc.). When the mobile terminal 10 is in a video call mode or an image capture mode, the display unit 51 may display captured and/or received images, and a UI or GUI showing the video or image and the related functions.
Meanwhile, when the display unit 51 and a touch pad are superposed on each other as layers to form a touch screen, the display unit 51 can serve as both an input device and an output device. The display unit 51 may include at least one of a liquid crystal display (LCD), a thin-film-transistor LCD (TFT-LCD), an organic light-emitting diode (OLED) display, a flexible display, a three-dimensional (3D) display, and the like. Some of these displays may be constructed to be transparent to allow viewing from the outside; such a display may be called a transparent display, a typical example being a TOLED (transparent organic light-emitting diode) display. According to a particular embodiment, the mobile terminal 10 may include two or more display units (or other display devices); for example, the mobile terminal may include an external display unit (not shown) and an internal display unit (not shown). The touch screen can detect the pressure, position, and area of a touch input.
The audio output module 52 may, when the mobile terminal is in a call signal reception mode, a call mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like, convert audio data received by the wireless communication unit 40 or stored in the memory 20 into an audio signal and output it as sound. The audio output module 52 may also provide audio output related to a specific function performed by the mobile terminal 10 (for example, a call signal reception sound, a message reception sound, etc.). The audio output module 52 may include a speaker, a buzzer, and so on.
The alarm unit 53 may provide output to notify the mobile terminal 10 of the occurrence of an event. Typical events include call reception, message reception, key signal input, touch input, and so on. In addition to audio or video output, the alarm unit 53 may provide output in a different manner to notify the occurrence of an event. For example, the alarm unit 53 may provide output in the form of vibration: when a call, a message, or some other incoming communication is received, the alarm unit 53 may provide a tactile output (that is, vibration) to notify the user. By providing such tactile output, the user can recognize the occurrence of various events even when the mobile phone is in the user's pocket. The alarm unit 53 may also provide output notifying the occurrence of an event via the display unit 51 or the audio output module 52.
The input unit 60 may generate key input data according to commands input by the user to control various operations of the mobile terminal. The input unit 60 allows the user to input various types of information and may include a keyboard, a dome switch, a touch pad (for example, a touch-sensitive component that detects changes in resistance, pressure, capacitance, and the like caused by being touched), a jog wheel, a jog switch, and so on. In particular, when the touch pad is superposed on the display unit 51 as a layer, a touch screen can be formed. In an embodiment of the present invention, the input unit 60 includes a touch screen and an e-ink screen. The camera 70 is used to capture image data, and the microphone 71 is used to record audio data.
The interface unit 80 serves as an interface through which at least one external device can connect to the mobile terminal 10. For example, the external devices may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device with an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on. The identification module may store various information for verifying a user's use of the mobile terminal 10 and may include a user identification module (UIM), a subscriber identification module (SIM), a universal subscriber identification module (USIM), and the like. In addition, a device with an identification module (hereinafter an "identifying device") may take the form of a smart card; the identifying device can therefore be connected to the mobile terminal 10 via a port or other connecting means. The interface unit 80 may be used to receive input (for example, data, information, electric power, etc.) from an external device and transfer the received input to one or more elements in the mobile terminal 10, or to transmit data between the mobile terminal and the external device.
In addition, when the mobile terminal 10 is connected to an external cradle, the interface unit 80 may serve as a path through which electric power is supplied from the cradle to the mobile terminal 10, or as a path through which various command signals input from the cradle are transmitted to the mobile terminal. The various command signals or the electric power input from the cradle may serve as signals for identifying whether the mobile terminal is accurately mounted on the cradle.
The memory 20 may store software programs for the processing and control operations executed by the controller 30, or may temporarily store data that has been output or is to be output (for example, a phone book, messages, still images, video, etc.). The memory 20 may also store data about the various modes of vibration and audio signals output when a touch is applied to the touch screen. The memory 20 may include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and so on. Moreover, the mobile terminal 10 may cooperate with a network storage device that performs the storage function of the memory 20 over a network connection.
The controller 30 typically controls the overall operation of the mobile terminal. For example, the controller 30 performs the control and processing related to voice calls, data communication, video calls, and so on. In addition, the controller 30 may include a multimedia module for reproducing (or playing back) multimedia data; the multimedia module may be constructed within the controller 30 or separately from it. The controller 30 may perform pattern recognition processing to recognize handwriting input or picture drawing input performed on the touch screen as characters or images.
The power supply unit 90 receives external or internal power under the control of the controller 30 and provides the appropriate electric power required to operate each element and component.
The various embodiments described herein may be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof. For a hardware implementation, the embodiments described herein may be implemented using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, or an electronic unit designed to perform the functions described herein; in some cases, such embodiments may be implemented in the controller 30. For a software implementation, an embodiment such as a process or function may be implemented with a separate software module that performs at least one function or operation. The software code may be implemented by a software application (or program) written in any suitable programming language, and may be stored in the memory 20 and executed by the controller 30.
So far, the mobile terminal has been described in terms of its functions. For brevity, among the various types of mobile terminals such as folder-type, bar-type, swing-type, and slide-type mobile terminals, the slide-type mobile terminal is taken as an example below. However, the present invention can be applied to any type of mobile terminal and is not limited to slide-type mobile terminals.
The mobile terminal 10 as shown in Fig. 1 may be constructed to operate with wired and wireless communication systems, as well as satellite-based communication systems, that transmit data via frames or packets.
A communication system in which the mobile terminal according to the present invention is operable is now described with reference to Fig. 2.
Such communication systems may use different air interfaces and/or physical layers. For example, the air interfaces used by communication systems include frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), the universal mobile telecommunications system (UMTS) (in particular, long term evolution (LTE)), the global system for mobile communications (GSM), and so on. As a non-limiting example, the following description relates to a CDMA communication system, but such teaching applies equally to other types of systems.
With reference to Fig. 2, a CDMA wireless communication system may include a plurality of mobile terminals 10, a plurality of base stations (BS) 270, base station controllers (BSC) 275, and a mobile switching center (MSC) 280. The MSC 280 is configured to form an interface with the public switched telephone network (PSTN) 290. The MSC 280 is also configured to form an interface with the BSCs 275, which can be coupled to the base stations 270 via backhaul links. The backhaul links may be constructed according to any of several known interfaces, including, for example, E1/T1, ATM, IP, PPP, frame relay, HDSL, ADSL, or xDSL. It will be appreciated that the system as shown in Fig. 2 may include a plurality of BSCs 275.
Each BS 270 may serve one or more sectors (or regions), with each sector covered by an omnidirectional antenna or by an antenna pointed in a particular direction radially away from the BS 270. Alternatively, each sector may be covered by two or more antennas for diversity reception. Each BS 270 may be configured to support a plurality of frequency assignments, with each frequency assignment having a particular spectrum (e.g., 1.25 MHz, 5 MHz, etc.).
The intersection of a sector and a frequency assignment may be referred to as a CDMA channel. The BS 270 may also be referred to as a Base Station Transceiver Subsystem (BTS) or by other equivalent terms. In such a case, the term "base station" may be used to collectively refer to a single BSC 275 and at least one BS 270. A base station may also be referred to as a "cell site". Alternatively, the individual sectors of a particular BS 270 may each be referred to as cell sites.
As shown in Fig. 2, a broadcasting transmitter (BT) 295 transmits a broadcast signal to the mobile terminals 10 operating within the system. The broadcast receiving module 111 as shown in Fig. 1 is provided at the mobile terminal 10 to receive the broadcast signal transmitted by the BT 295. In Fig. 2, several Global Positioning System (GPS) satellites 300 are shown. The satellites 300 help locate at least one of the plurality of mobile terminals 10.
Although several satellites 300 are depicted in Fig. 2, it will be understood that useful positioning information may be obtained with any number of satellites. The GPS module 115 as shown in Fig. 1 is typically configured to cooperate with the satellites 300 to obtain the desired positioning information. Instead of or in addition to GPS tracking techniques, other technologies capable of tracking the location of the mobile terminal may be used. In addition, at least one GPS satellite 300 may selectively or additionally handle satellite DMB transmissions.
As one typical operation of the wireless communication system, the BS 270 receives reverse link signals from various mobile terminals 10. The mobile terminals 10 typically engage in calls, messaging, and other types of communications. Each reverse link signal received by a particular base station 270 is processed within that particular BS 270, and the resulting data is forwarded to the associated BSC 275. The BSC provides call resource allocation and mobility management functionality, including the coordination of soft handoff procedures between the BSs 270. The BSC 275 also routes the received data to the MSC 280, which provides additional routing services for forming an interface with the PSTN 290. Similarly, the PSTN 290 forms an interface with the MSC 280, the MSC forms an interface with the BSCs 275, and the BSCs 275 in turn control the BSs 270 to transmit forward link signals to the mobile terminals 10.
Based on the above mobile terminal hardware structure and communication system, various embodiments of the method of the present invention are proposed.
Referring to Fig. 3, Fig. 3 is a functional block diagram of a mobile terminal according to a first embodiment of the present invention. The mobile terminal 10 shown in Fig. 3 includes: a voice recognition module 101, a subtitle generation module 103, and a face recognition module 105. The voice recognition module 101 identifies the voice content in a video file through a voice recognition function, and transfers the voice content to the subtitle generation module 103. The subtitle generation module 103 converts the voice content into text content, and makes the text content into a subtitle file. The face recognition module 105 determines the facial features of a speaker in the video file through a face recognition function, determines the speaker's name according to the facial features, and marks the corresponding speaker's name in the subtitle file. It should be added that the voice recognition module 101 may also identify the voice features of a speaker through the voice recognition function, determine the speaker's name in the video file according to the voice features, and mark the corresponding speaker's name in the subtitle file.
The mobile terminal provided by this embodiment identifies the voice content in a video file through a voice recognition function, converts the voice content into text content, organizes the text content into a subtitle file, displays the subtitle file synchronously with the video file, determines the name of the speaker in the video file, and marks the speaker's name in the subtitle file corresponding to that speaker's voice content, which makes it convenient for the user to read the subtitle content of the video file.
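The pipeline of this embodiment (voice recognition → text content → subtitle file, with a speaker name marked on each subtitle entry) can be sketched as follows. The patent describes modules rather than an implementation, so all data structures and function names here are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class SubtitleEntry:
    start: float          # seconds into the video file
    end: float
    text: str             # text content converted from the recognized voice content
    speaker: str = ""     # speaker name marked by the face/voice recognition step

@dataclass
class SubtitleFile:
    entries: list = field(default_factory=list)

def generate_subtitles(segments, identify_speaker):
    """Make a subtitle file from recognized speech segments.

    segments: (start, end, recognized_text, features) tuples, standing in
    for the output of the voice recognition module 101.
    identify_speaker: callable mapping features -> speaker name, standing in
    for either the face recognition module 105 or the voice-feature path.
    """
    subs = SubtitleFile()
    for start, end, text, feats in segments:
        subs.entries.append(SubtitleEntry(start, end, text, identify_speaker(feats)))
    return subs
```

Either recognition path (facial features or voice features) plugs in as the `identify_speaker` callable, matching the patent's point that the two modules are interchangeable sources of the speaker's name.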
Referring to Fig. 4, Fig. 4 is a functional block diagram of a mobile terminal according to a second embodiment of the present invention. The mobile terminal 10 shown in Fig. 4 includes a voice recognition module 101, a subtitle generation module 103, a face recognition module 105, a retrieval module 109, a document creation module 111, a display module 113, and a control module 115. In this embodiment, the voice recognition module 101 identifies the voice content in a video file through a voice recognition function, and transfers the voice content to the subtitle generation module 103. The subtitle generation module 103 converts the voice content into text content, and makes the text content into a subtitle file. The face recognition module 105 determines the name of the speaker in the video file through a face recognition function, and marks the corresponding speaker's name in the subtitle file. It should be added that the voice recognition module 101 may also identify the voice features of the speaker through the voice recognition function, determine the speaker's name in the video file, and mark the corresponding speaker's name in the subtitle file.
The retrieval module 109 obtains a keyword, retrieves the text paragraphs related to the keyword from the text content, and finds the video file corresponding to the text paragraphs. The subtitle files can thus be searched extensively and in a targeted manner according to the keyword, enabling fast video file positioning and finding for the user the video files related to the keyword, which reduces the time the user spends searching for related video files.
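A minimal sketch of the retrieval module's keyword search, assuming the subtitle files have already been indexed per video file; the index shape and function name are illustrative assumptions:

```python
def search_subtitles(keyword, subtitle_index):
    """Retrieve text paragraphs related to a keyword and the video files
    they belong to.

    subtitle_index: {video_file: [(time_in_seconds, text), ...]}, an assumed
    structure built from the subtitle files in advance.
    Returns one hit per matching paragraph, with its video file and time,
    so matches can be shown as a list and used to position the video.
    """
    hits = []
    for video_file, entries in subtitle_index.items():
        for time, text in entries:
            if keyword in text:      # simple substring match for illustration
                hits.append({"video": video_file, "time": time, "text": text})
    return hits
```

Because each hit carries the time at which the paragraph occurs, a player could jump straight to the matching moment, which is the "fast video file positioning" the description refers to.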
The document creation module 111 arranges the subtitle file, the time at which the subtitle file occurs in the video file, and the speaker's name corresponding to the subtitle file into document information. For example, for a meeting attended by many people, a conference video recording is made, and the document creation module 111 arranges the speech content of the participants in the conference recording, organizing the conference recording, the subtitle file, the time at which the subtitle file occurs in the conference recording, and the name of the conference speaker corresponding to the subtitle file into detailed document information that serves as meeting minutes, making it convenient for participants to review the content of the meeting.
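The document information the document creation module 111 assembles (subtitle text, its time in the video file, and the speaker's name) might be arranged like this; the timestamp format and function name are illustrative assumptions, not the patent's specification:

```python
def build_minutes(entries):
    """Arrange subtitle entries into meeting-minutes document information.

    entries: (time_in_video_seconds, speaker_name, subtitle_text) tuples.
    Entries are sorted by time so the minutes follow the order of speaking
    during the meeting, as the description requires.
    """
    lines = []
    for time, speaker, text in sorted(entries):
        minutes, seconds = divmod(int(time), 60)
        # One line per speech: [mm:ss] name: what was said
        lines.append(f"[{minutes:02d}:{seconds:02d}] {speaker}: {text}")
    return "\n".join(lines)
```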
After the subtitle file is generated, the display module 113 displays the video file and the subtitle file synchronously, making it convenient for the user to read the subtitle file while watching the video file; when the volume of the video file is low or the accent is unclear, this ensures that the user understands the voice content of the video.
It should be noted that in this embodiment, the speaker's name can be determined either through the voice recognition function of the voice recognition module 101 or through the face recognition function of the face recognition module 105. The control module 115 of the mobile terminal 10 selects one of these two modes to determine the speaker's name. Specifically, when the control module 115 receives a first control signal, the control module 115 closes the face recognition function and opens the voice recognition function according to the first control signal, and determines the speaker's name in the video file through the voice recognition function; when the control module 115 receives a second control signal, the control module 115 closes the voice recognition function and opens the face recognition function according to the second control signal, and determines the speaker's name in the video file through the face recognition function.
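The control module's two-signal switching logic can be sketched as a small state holder; the signal codes and class name are illustrative assumptions:

```python
FIRST_CONTROL_SIGNAL = 1    # illustrative code: select voice-based identification
SECOND_CONTROL_SIGNAL = 2   # illustrative code: select face-based identification

class ControlModule:
    """Sketch of control module 115: exactly one identification mode is open."""

    def __init__(self):
        self.speech_id_enabled = False
        self.face_id_enabled = False

    def on_signal(self, signal):
        if signal == FIRST_CONTROL_SIGNAL:
            # Close face recognition, open voice recognition.
            self.face_id_enabled = False
            self.speech_id_enabled = True
        elif signal == SECOND_CONTROL_SIGNAL:
            # Close voice recognition, open face recognition.
            self.speech_id_enabled = False
            self.face_id_enabled = True
```

Making the two flags mutually exclusive mirrors the description: each control signal closes one function before opening the other, so only one mode determines the speaker's name at a time.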
The mobile terminal provided by this embodiment identifies the voice content in a video file through a voice recognition function, converts the voice content into text content, organizes the text content into a subtitle file, displays the subtitle file synchronously with the video file, determines the name of the speaker in the video file, and marks the speaker's name in the subtitle file corresponding to that speaker's voice content, which makes it convenient for the user to read the subtitle content of the video file. After a video recording is made, the speech content of each speaker is identified, detailed recording documents can be provided automatically, and the process of organizing the video file into text is simplified.
Referring to Fig. 5, Fig. 5 is an application environment diagram of a mobile terminal according to a third embodiment of the present invention. The application environment in this embodiment includes the mobile terminal 10 and users A, B, C, and D, where the mobile terminal 10 is the mobile terminal 10 of Fig. 3 or Fig. 4. In this embodiment, when users A, B, C, and D hold a meeting, the mobile terminal 10 records the entire meeting. The mobile terminal 10 opens the camera 70 to capture video of users A, B, C, and D, and opens the microphone 71 to record users A, B, C, and D, so that the speeches during the meeting are recorded on video.
When shooting the video, the user can send a control signal through the touch screen of the mobile terminal 10 to select the mode of determining the speaker's name. According to the user's selection, the speaker's name can be determined through the voice recognition function of the voice recognition module 101, or identified through the face recognition function of the face recognition module 105. Specifically, when the control module 115 receives a first control signal transmitted from the touch screen, the control module 115 closes the face recognition function and opens the voice recognition function according to the first control signal, and determines the speaker's name in the video file through the voice recognition function; when the control module 115 receives a second control signal transmitted from the touch screen, the control module 115 closes the voice recognition function and opens the face recognition function according to the second control signal, and determines the speaker's name in the video file through the face recognition function.
It should be added that the specific steps by which the voice recognition function of the voice recognition module 101 determines the speaker's name include: obtaining the voice features of users A, B, C, and D from the video file; establishing a correspondence between the voice features and the user names; and detecting the voice content in the video file, so that when the detected voice content matches the voice features of user A, it is determined that this voice content was spoken by user A, and in the same way, the voice content spoken by users B, C, and D is determined. The specific steps of determining the speaker's name in the video file through the face recognition function include: obtaining the facial features of users A, B, C, and D from the video file; establishing a correspondence between the facial features and the user names; and performing facial recognition on the faces appearing in the video file, so that when the detected facial features match the facial features of user A, it is determined that the current voice content was spoken by user A, and in the same way, the voice content spoken by users B, C, and D is determined.
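The feature-matching step described above (establish a correspondence between enrolled features and user names, then match detected features against it) can be sketched as follows. The patent does not specify a matching metric, so cosine similarity and the threshold here are illustrative assumptions, and the same sketch applies to voice features and facial features alike:

```python
def identify_speaker(observed, enrolled, threshold=0.75):
    """Return the enrolled user name whose features best match, or None.

    observed: feature vector detected in the video file.
    enrolled: {user_name: feature_vector} correspondence built from the
    video file for users A, B, C, and D.
    threshold: minimum similarity required to declare a match (assumed).
    """
    def similarity(a, b):
        # Cosine similarity, chosen here purely for illustration.
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    best_name, best_score = None, threshold
    for name, feats in enrolled.items():
        score = similarity(observed, feats)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

Returning `None` below the threshold leaves room for speech by someone outside the enrolled correspondence, a case the patent's description does not address.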
Before or after the manner of determining the speaker's name in the video file is decided, the voice recognition module 101 identifies the voice content in the video file through the voice recognition function, and transfers the voice content to the subtitle generation module 103. The subtitle generation module 103 converts the voice content into text content, and makes the text content into a subtitle file. The face recognition module 105 determines the name of the speaker in the video file through the face recognition function and marks the corresponding speaker's name in the subtitle file; alternatively, the voice recognition module 101 identifies the voice features of the speaker through the voice recognition function, determines the speaker's name in the video file, and marks the corresponding speaker's name in the subtitle file.
The user can input the keyword to be retrieved through the touch screen. After the retrieval module 109 obtains the keyword, it retrieves the text paragraphs related to the keyword from the text content, finds the video files corresponding to the text paragraphs, and displays the related video files in the form of a list. The subtitle files can thus be searched extensively and in a targeted manner according to the keyword, enabling fast video file positioning and finding for the user the video files related to the keyword, which reduces the time the user spends searching for related video files.
After the conference recording is finished, the document creation module 111 arranges the subtitle file, the time at which the subtitle file occurs in the video file, and the speaker's name corresponding to the subtitle file into document information. The speech content of users A, B, C, and D is arranged according to the order of speaking during the meeting, and the conference recording, the subtitle file, the time at which the subtitle file occurs in the conference recording, and the name of the conference speaker corresponding to the subtitle file are organized into detailed document information that serves as meeting minutes, making it convenient for participants to review the content of the meeting.
After the subtitle file is generated, the display module 113 displays the video file and the subtitle file synchronously while the conference is being recorded, making it convenient for the user to read the subtitle file while watching the video file; when the volume of the video file is low or the accent is unclear, this ensures that the user understands the voice content of the video.
The mobile terminal provided by this embodiment identifies the voice content in a video file through a voice recognition function, converts the voice content into text content, organizes the text content into a subtitle file, displays the subtitle file synchronously with the video file, determines the name of the speaker in the video file, and marks the speaker's name in the subtitle file corresponding to that speaker's voice content, which makes it convenient for the user to read the subtitle content of the video file. After a video recording is made, the speech content of each speaker is identified, detailed recording documents can be provided automatically, and the process of organizing the video file into text is simplified.
The present invention also provides a method for generating video subtitles. The method is applied to the mobile terminal 10 shown in Fig. 3 or Fig. 4, and is described in detail below.
Referring to Fig. 6, Fig. 6 is a flowchart of a method for generating video subtitles according to a fourth embodiment of the present invention.
In step S601, the voice recognition module 101 identifies the voice content in a video file through a voice recognition function, and transfers the voice content to the subtitle generation module 103.
In step S603, the subtitle generation module 103 converts the voice content into text content, and makes the text content into a subtitle file.
In step S605, the face recognition module 105 determines the name of the speaker in the video file through a face recognition function, and marks the corresponding speaker's name in the subtitle file.
In step S607, the voice recognition module 101 identifies the voice features of the speaker through the voice recognition function, determines the speaker's name in the video file, and marks the corresponding speaker's name in the subtitle file.
It should be added that in this embodiment, either only the voice recognition function of the voice recognition module 101 or only the face recognition function of the face recognition module 105 may be selected to determine the speaker's name. Specifically, when the control module 115 of the mobile terminal 10 receives a first control signal, the control module 115 closes the face recognition function and opens the voice recognition function according to the first control signal, and determines the speaker's name in the video file through the voice recognition function; when the control module 115 receives a second control signal, the control module 115 closes the voice recognition function and opens the face recognition function according to the second control signal, and determines the speaker's name in the video file through the face recognition function.
The method for generating video subtitles provided by this embodiment identifies the voice content in a video file through a voice recognition function, converts the voice content into text content, organizes the text content into a subtitle file, displays the subtitle file synchronously with the video file, determines the name of the speaker in the video file, and marks the speaker's name in the subtitle file corresponding to that speaker's voice content, which makes it convenient for the user to read the subtitle content of the video file and to know the name of the speaker in real time.
Referring to Fig. 7, Fig. 7 is a flowchart of a method for generating video subtitles according to a fifth embodiment of the present invention. The method is applied to the mobile terminal 10 shown in Fig. 3 or Fig. 4, and is described in detail below.
In step S701, the retrieval module 109 obtains a keyword, retrieves the text paragraphs related to the keyword from the text content, and finds the video file corresponding to the text paragraphs. In this step, the keyword may be obtained by the user inputting the corresponding keyword through the touch screen of the mobile terminal 10. The subtitle files can thus be searched extensively and in a targeted manner according to the keyword, enabling fast video file positioning and finding for the user the video files related to the keyword, which reduces the time the user spends searching for related video files.
In step S703, the document creation module 111 arranges the subtitle file, the time at which the subtitle file occurs in the video file, and the speaker's name corresponding to the subtitle file into document information. For example, for a meeting attended by many people, a conference video recording is made, and the document creation module 111 arranges the speech content of the participants in the conference recording, organizing the subtitle file, the time at which the subtitle file occurs in the conference recording, and the name of the conference speaker corresponding to the subtitle file into detailed document information that serves as meeting minutes, making it convenient for participants to review the content of the meeting.
In step S705, after the subtitle file is generated, the display module 113 displays the video file and the subtitle file synchronously, making it convenient for the user to read the subtitle file while watching the video file; when the volume of the video file is low or the accent is unclear, this ensures that the user understands the voice content of the video.
The method for generating video subtitles provided by this embodiment identifies the voice content in a video file through a voice recognition function, converts the voice content into text content, organizes the text content into a subtitle file, displays the subtitle file synchronously with the video file, determines the name of the speaker in the video file, and marks the speaker's name in the subtitle file corresponding to that speaker's voice content, which makes it convenient for the user to read the subtitle content of the video file. After a video recording is made, the speech content of each speaker is identified, detailed recording documents are provided, and the process of organizing the video file into text is simplified.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (10)
1. A mobile terminal, characterized by comprising:
a voice recognition module, configured to identify the voice content in a video file through a voice recognition function;
a subtitle generation module, configured to convert the voice content into text content and make the text content into a subtitle file; and
a face recognition module, configured to determine the facial features of a speaker in the video file through a face recognition function, determine the speaker's name according to the facial features, and mark the corresponding speaker's name in the subtitle file.
2. The mobile terminal according to claim 1, characterized in that the voice recognition module is further configured to identify the voice features of a speaker through the voice recognition function, determine the speaker's name in the video file according to the voice features, and mark the corresponding speaker's name in the subtitle file.
3. The mobile terminal according to any one of claims 1-2, characterized by further comprising:
a retrieval module, configured to obtain a keyword, retrieve the text paragraphs related to the keyword from the text content, and find the video file corresponding to the text paragraphs.
4. The mobile terminal according to any one of claims 1-2, characterized by further comprising:
a document creation module, configured to arrange the subtitle file, the time at which the subtitle file occurs in the video file, and the speaker's name corresponding to the subtitle file into document information; and
a display module, configured to display the video file and the subtitle file synchronously.
5. The mobile terminal according to any one of claims 1-2, characterized by further comprising:
a control module, configured to receive a first control signal, close the face recognition function and open the voice recognition function according to the first control signal, and determine the speaker's name in the video file through the voice recognition function; and to receive a second control signal, close the voice recognition function and open the face recognition function according to the second control signal, and determine the speaker's name in the video file through the face recognition function.
6. A method for generating video subtitles, characterized by comprising:
identifying the voice content in a video file through a voice recognition function;
converting the voice content into text content, and making the text content into a subtitle file; and
determining the facial features of a speaker in the video file through a face recognition function, determining the speaker's name according to the facial features, and marking the corresponding speaker's name in the subtitle file.
7. The method for generating video subtitles according to claim 6, characterized by further comprising:
identifying the voice features of a speaker through the voice recognition function, determining the speaker's name in the video file according to the voice features, and marking the corresponding speaker's name in the subtitle file.
8. The method for generating video subtitles according to any one of claims 6-7, characterized by further comprising:
obtaining a keyword, retrieving the text paragraphs related to the keyword from the text content, and finding the video file corresponding to the text paragraphs.
9. The method for generating video subtitles according to any one of claims 6-7, characterized by further comprising:
arranging the subtitle file, the time at which the subtitle file occurs in the video file, and the speaker's name corresponding to the subtitle file into document information; and
displaying the video file and the subtitle file synchronously.
10. The method for generating video subtitles according to any one of claims 6-7, characterized by further comprising:
receiving a first control signal, closing the face recognition function and opening the voice recognition function according to the first control signal, and determining the speaker's name in the video file through the voice recognition function; and
receiving a second control signal, closing the voice recognition function and opening the face recognition function according to the second control signal, and determining the speaker's name in the video file through the face recognition function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610801534.3A CN106385548A (en) | 2016-09-05 | 2016-09-05 | Mobile terminal and method for generating video captions |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106385548A true CN106385548A (en) | 2017-02-08 |
Family
ID=57939007
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610801534.3A Pending CN106385548A (en) | 2016-09-05 | 2016-09-05 | Mobile terminal and method for generating video captions |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106385548A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102572372A (en) * | 2011-12-28 | 2012-07-11 | 中兴通讯股份有限公司 | Extraction method and device for conference summary |
CN103856689A (en) * | 2013-10-31 | 2014-06-11 | 北京中科模识科技有限公司 | Character dialogue subtitle extraction method oriented to news video |
US20150046164A1 (en) * | 2013-08-07 | 2015-02-12 | Samsung Electronics Co., Ltd. | Method, apparatus, and recording medium for text-to-speech conversion |
CN105704538A (en) * | 2016-03-17 | 2016-06-22 | 广东小天才科技有限公司 | Audio and video subtitle generation method and system |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109587429A (en) * | 2017-09-29 | 2019-04-05 | 北京国双科技有限公司 | Audio-frequency processing method and device |
CN109920428A (en) * | 2017-12-12 | 2019-06-21 | 杭州海康威视数字技术股份有限公司 | A kind of notes input method, device, electronic equipment and storage medium |
CN108366182A (en) * | 2018-02-13 | 2018-08-03 | 京东方科技集团股份有限公司 | Text-to-speech synchronizes the calibration method reported and device, computer storage media |
US10580410B2 (en) | 2018-04-27 | 2020-03-03 | Sorenson Ip Holdings, Llc | Transcription of communications |
CN109151597A (en) * | 2018-09-03 | 2019-01-04 | 聚好看科技股份有限公司 | information display method and device |
WO2020048275A1 (en) * | 2018-09-03 | 2020-03-12 | 聚好看科技股份有限公司 | Information display method and apparatus |
US11418832B2 (en) | 2018-11-27 | 2022-08-16 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Video processing method, electronic device and computer-readable storage medium |
CN111354349A (en) * | 2019-04-16 | 2020-06-30 | 深圳市鸿合创新信息技术有限责任公司 | Voice recognition method and device and electronic equipment |
CN112135197A (en) * | 2019-06-24 | 2020-12-25 | 腾讯科技(深圳)有限公司 | Subtitle display method and device, storage medium and electronic equipment |
CN112135197B (en) * | 2019-06-24 | 2022-12-09 | 腾讯科技(深圳)有限公司 | Subtitle display method and device, storage medium and electronic equipment |
CN110544491A (en) * | 2019-08-30 | 2019-12-06 | 上海依图信息技术有限公司 | Method and device for real-time association of speaker and voice recognition result thereof |
CN110797024A (en) * | 2019-11-07 | 2020-02-14 | 大连海事大学 | VHF (very high frequency) maritime safety communication system based on voice recognition and subtitle display |
CN111986656A (en) * | 2020-08-31 | 2020-11-24 | 上海松鼠课堂人工智能科技有限公司 | Teaching video automatic caption processing method and system |
CN112489683A (en) * | 2020-11-24 | 2021-03-12 | 广州市久邦数码科技有限公司 | Method and device for realizing fast forward and fast backward of audio based on key word positioning |
CN112672099A (en) * | 2020-12-31 | 2021-04-16 | 深圳市潮流网络技术有限公司 | Subtitle data generation and presentation method, device, computing equipment and storage medium |
CN112672099B (en) * | 2020-12-31 | 2023-11-17 | 深圳市潮流网络技术有限公司 | Subtitle data generating and presenting method, device, computing equipment and storage medium |
CN114979054A (en) * | 2022-05-13 | 2022-08-30 | 维沃移动通信有限公司 | Video generation method and device, electronic equipment and readable storage medium |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106385548A (en) | Mobile terminal and method for generating video captions | |
CN105119971A (en) | Mobile terminal and system and method for fast mutual photographing and sharing | |
CN106356065A (en) | Mobile terminal and voice conversion method | |
CN105760057A (en) | Screenshot device and method | |
CN104967802A (en) | Mobile terminal and method and device for recording multiple screen areas | |
CN106453056A (en) | Mobile terminal and method for safely sharing picture | |
CN106254617B (en) | Mobile terminal and control method | |
CN104951549A (en) | Mobile terminal and photo/video sorting and management method thereof | |
CN106909681A (en) | Information processing method and device | |
CN106803860A (en) | Storage processing method and device for terminal applications | |
CN106372607A (en) | Method for reading pictures from videos and mobile terminal | |
CN105430258A (en) | Method and device for taking group selfies | |
CN105049582B (en) | Storage device and method for call recordings, and display method | |
CN106383707A (en) | Picture display method and system | |
CN106254439A (en) | Push method and device, and mobile device management system | |
CN106657643A (en) | Mobile terminal and communication session display method | |
CN107071161A (en) | Aggregation display method for status bar icons and mobile terminal | |
CN106911486A (en) | Message push processing method, apparatus and system | |
CN106469221A (en) | Picture searching method, device and terminal | |
CN106550133A (en) | Incoming call identification device and method | |
CN106227454B (en) | Touch trajectory detection system and method | |
CN105049916A (en) | Video recording method and device | |
CN104735259A (en) | Mobile terminal shooting parameter setting method and device and mobile terminal | |
CN107194243A (en) | Mobile terminal and method for installing an application program | |
CN106534446A (en) | Mobile terminal dialing apparatus and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20170208 |