CN104794104A - Multimedia document generating method and device

Info

Publication number: CN104794104A
Application number: CN201510220843.7A
Authority: CN
Inventors: 马子平
Original assignee: Nubia Technology Co Ltd
Current assignee: Nubia Technology Co Ltd
Priority date: 2015-04-30
Filing date: 2015-04-30
Publication date: 2015-07-22

Abstract

The invention discloses a method for generating a multimedia document. The method comprises receiving audio data and picture data, converting the audio data into text information, and inserting the text information and the picture data into a preset document to generate a multimedia document. The invention further discloses a device for generating a multimedia document. With the method and device, documents are generated automatically from audio files and picture files: after a journalist finishes an interview, a news interview draft can be generated automatically from the audio data and picture data produced during the interview, which improves the journalist's working efficiency.

Description

Method and device for generating a multimedia document

Technical field

The present invention relates to the field of multimedia technology, and in particular to a method and device for generating a multimedia document.

Background art

At present, a journalist conducting an interview generally needs to make an audio recording and take photographs. After the interview, the recorded audio file and the captured picture files must be organized manually to produce a document illustrated with pictures. The defect of the prior art is that picture files and the text information corresponding to the audio file cannot be inserted into a document automatically; the journalist must insert them by hand.

Summary of the invention

The main object of the present invention is to provide a method and device for generating a multimedia document, intended to automatically insert picture files and the text information corresponding to an audio file into a document.

The method for generating a multimedia document provided by the present invention comprises:

receiving audio data and picture data;

converting the audio data into text information;

inserting the text information and the picture data into a preset document to generate a multimedia document.
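The three steps can be sketched in Python as follows. This is only an illustrative outline under stated assumptions: the `MultimediaDocument` class, the `generate_document` function, and the injected `transcribe` callable are hypothetical names that do not appear in the patent, and the speech-to-text engine is deliberately left abstract.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class MultimediaDocument:
    """Hypothetical in-memory stand-in for the preset document."""
    # Each block is ("text", content) or ("picture", path).
    blocks: List[Tuple[str, str]] = field(default_factory=list)


def generate_document(
    audio: bytes,
    pictures: List[str],
    transcribe: Callable[[bytes], str],
) -> MultimediaDocument:
    """Receive audio and pictures, convert the audio, insert both."""
    doc = MultimediaDocument()          # the preset (blank) document
    text = transcribe(audio)            # convert audio data to text information
    doc.blocks.append(("text", text))   # insert the text information
    for path in pictures:               # insert the picture data
        doc.blocks.append(("picture", path))
    return doc
```

Passing the transcriber in as a callable keeps the sketch independent of any particular speech recognition software.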

Preferably, the step of converting the audio data into text information comprises:

extracting first creation-time information of each piece of picture data, and second creation-time information of the audio data;

dividing the audio data into several first sub-audio data according to the first creation-time information and the second creation-time information;

converting each first sub-audio data into sub-text information.

Preferably, the step of inserting the text information and the picture data into the preset document comprises:

inserting each piece of sub-text information into the preset document;

inserting, between two adjacent pieces of sub-text information, the picture data corresponding to the first creation time that separates those two pieces of sub-text information.

Preferably, the step of converting the audio data into text information comprises:

dividing the audio data into several second sub-audio data according to audio feature parameters;

converting each second sub-audio data into text information, wherein the text information corresponding to different second sub-audio data has different display parameters.

Preferably, after the step of receiving audio data and picture data, the method for generating a multimedia document further comprises:

creating a folder named with the current date;

storing the received audio data and picture data in the created folder.
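A minimal sketch of this storage step is shown below. It assumes the received audio and picture data already exist as files on disk, and that the folder name uses the ISO date format (`YYYY-MM-DD`); the patent specifies neither, and the function name `store_in_dated_folder` is hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Iterable, Optional


def store_in_dated_folder(
    files: Iterable[Path],
    root: Path,
    today: Optional[date] = None,
) -> Path:
    """Copy the received files into a folder named with the current date."""
    today = today or date.today()
    # Assumption: ISO format, e.g. "2015-04-30"; any date format would do.
    folder = root / today.isoformat()
    folder.mkdir(parents=True, exist_ok=True)
    for f in files:
        shutil.copy2(f, folder / Path(f).name)
    return folder
```

The optional `today` parameter exists only so that the behaviour can be checked against a fixed date.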

In addition, the device for generating a multimedia document provided by the present invention comprises:

a receiving module, for receiving audio data and picture data;

a conversion module, for converting the audio data into text information;

an insertion module, for inserting the text information and the picture data into a preset document to generate a multimedia document.

Preferably, the conversion module comprises:

an extraction unit, for extracting first creation-time information of each piece of picture data, and second creation-time information of the audio data;

a first division unit, for dividing the audio data into several first sub-audio data according to the first creation-time information and the second creation-time information;

a first conversion unit, for converting each first sub-audio data into sub-text information.

Preferably, the insertion module comprises:

a first insertion unit, for inserting each piece of sub-text information into the preset document;

a second insertion unit, for inserting, between two adjacent pieces of sub-text information, the picture data corresponding to the first creation time that separates those two pieces of sub-text information.

Preferably, the conversion module comprises:

a second division unit, for dividing the audio data into several second sub-audio data according to audio feature parameters;

a second conversion unit, for converting each second sub-audio data into text information, wherein the text information corresponding to different second sub-audio data has different display parameters.

Preferably, the device for generating a multimedia document further comprises:

a creation module, for creating a folder named with the current date;

a storage module, for storing the received audio data and picture data in the created folder.

The method and device for generating a multimedia document provided by the present invention convert the received audio data into text information and insert the text information and the picture data into a preset document to generate a multimedia document. After a journalist finishes an interview, a news interview draft can be generated automatically from the audio data and picture data produced during the interview, which improves the journalist's working efficiency.

Brief description of the drawings

Fig. 1 is a schematic diagram of the hardware structure of a mobile terminal implementing the embodiments of the present invention;

Fig. 2 is a schematic diagram of a wireless communication system for the mobile terminal shown in Fig. 1;

Fig. 3 is a schematic flowchart of a first embodiment of the method for generating a multimedia document of the present invention;

Fig. 4 is a detailed schematic flowchart of a first embodiment of the step of converting audio data into text information in the method for generating a multimedia document of the present invention;

Fig. 5 is a detailed schematic flowchart of a second embodiment of the step of converting audio data into text information in the method for generating a multimedia document of the present invention;

Fig. 6 is a schematic functional block diagram of a first embodiment of the device for generating a multimedia document of the present invention;

Fig. 7 is a detailed schematic functional block diagram of a first embodiment of the conversion module in the device for generating a multimedia document of the present invention;

Fig. 8 is a detailed schematic functional block diagram of a second embodiment of the conversion module in the device for generating a multimedia document of the present invention.

The realization of the objects, functional features, and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.

Detailed description of the embodiments

It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.

A mobile terminal implementing the embodiments of the present invention will now be described with reference to the accompanying drawings. In the following description, suffixes such as "module", "component", or "unit" used to denote elements are adopted only to facilitate the description of the invention and have no specific meaning in themselves. Therefore, "module" and "component" may be used interchangeably.

A mobile terminal can be implemented in various forms. For example, the terminals described in the present invention may include mobile terminals such as mobile phones, smart phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players), and navigation devices, as well as fixed terminals such as digital TVs and desktop computers. In the following, the terminal is assumed to be a mobile terminal. However, those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the structure according to the embodiments of the present invention can also be applied to fixed terminals.

Fig. 1 is a schematic diagram of the hardware structure of a mobile terminal implementing the embodiments of the present invention.

Mobile terminal 100 may include wireless communication unit 110, A/V (audio/video) input unit 120, user input unit 130, sensing unit 140, output unit 150, memory 160, interface unit 170, controller 180, power supply unit 190, and so on. Fig. 1 shows a mobile terminal with various components, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead. The elements of the mobile terminal are described in detail below.

Wireless communication unit 110 typically includes one or more components that allow radio communication between mobile terminal 100 and a wireless communication system or network. For example, the wireless communication unit may include at least one of broadcast reception module 111, mobile communication module 112, wireless Internet module 113, short-range communication module 114, and location information module 115.

Broadcast reception module 111 receives broadcast signals and/or broadcast-related information from an external broadcast management server via a broadcast channel. The broadcast channel may include a satellite channel and/or a terrestrial channel. The broadcast management server may be a server that generates and transmits broadcast signals and/or broadcast-related information, or a server that receives previously generated broadcast signals and/or broadcast-related information and transmits them to the terminal. Broadcast signals may include TV broadcast signals, radio broadcast signals, data broadcast signals, and the like, and may further include a broadcast signal combined with a TV or radio broadcast signal. Broadcast-related information may also be provided via a mobile communication network, in which case it can be received by mobile communication module 112. Broadcast signals may exist in various forms, for example, in the form of an electronic program guide (EPG) of digital multimedia broadcasting (DMB) or an electronic service guide (ESG) of digital video broadcast-handheld (DVB-H). Broadcast reception module 111 can receive signal broadcasts using various types of broadcast systems. In particular, it can receive digital broadcasts using digital broadcast systems such as digital multimedia broadcasting-terrestrial (DMB-T), digital multimedia broadcasting-satellite (DMB-S), digital video broadcast-handheld (DVB-H), the data broadcasting system known as media forward link only (MediaFLO®), and integrated services digital broadcast-terrestrial (ISDB-T). Broadcast reception module 111 may be configured to suit any broadcast system that provides broadcast signals, as well as the digital broadcast systems above. Broadcast signals and/or broadcast-related information received via broadcast reception module 111 may be stored in memory 160 (or another type of storage medium).

Mobile communication module 112 transmits radio signals to and/or receives radio signals from at least one of a base station (e.g., an access point, Node B, etc.), an external terminal, and a server. Such radio signals may include voice call signals, video call signals, or various types of data transmitted and/or received according to text and/or multimedia messages.

Wireless Internet module 113 supports wireless Internet access for the mobile terminal. The module may be internally or externally coupled to the terminal. The wireless Internet access technologies involved may include WLAN (wireless LAN, Wi-Fi), WiBro (wireless broadband), WiMAX (Worldwide Interoperability for Microwave Access), HSDPA (High Speed Downlink Packet Access), and the like.

Short-range communication module 114 is a module for supporting short-range communication. Some examples of short-range communication technologies include Bluetooth™, radio frequency identification (RFID), Infrared Data Association (IrDA), ultra-wideband (UWB), ZigBee™, and the like.

Location information module 115 is a module for checking or obtaining the location information of the mobile terminal. A typical example of the location information module is GPS (Global Positioning System). According to current technology, GPS module 115 calculates distance information and accurate time information from three or more satellites and applies triangulation to the calculated information, thereby accurately calculating three-dimensional current location information in terms of longitude, latitude, and altitude. Currently, the method for calculating position and time information uses three satellites and corrects the error of the calculated position and time information using one additional satellite. In addition, GPS module 115 can calculate speed information by continuously calculating the current location in real time.

A/V input unit 120 receives audio or video signals. It may include camera 121 and microphone 122. Camera 121 processes image data of still pictures or video obtained by an image capture device in video capture mode or image capture mode. The processed image frames may be displayed on display unit 151. Image frames processed by camera 121 may be stored in memory 160 (or another storage medium) or transmitted via wireless communication unit 110, and two or more cameras 121 may be provided depending on the structure of the mobile terminal. Microphone 122 can receive sound (audio data) in operating modes such as phone call mode, recording mode, and voice recognition mode, and can process such sound into audio data. In phone call mode, the processed audio (voice) data can be converted into a format that can be transmitted to a mobile communication base station via mobile communication module 112 and output. Microphone 122 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated while receiving and transmitting audio signals.

User input unit 130 can generate key input data according to commands entered by the user, in order to control various operations of the mobile terminal. User input unit 130 allows the user to enter various types of information and may include a keyboard, a dome switch, a touch pad (e.g., a touch-sensitive component that detects changes in resistance, pressure, capacitance, etc. caused by being touched), a jog wheel, a jog switch, and the like. In particular, when the touch pad is superimposed on display unit 151 as a layer, a touch screen is formed.

Sensing unit 140 detects the current state of mobile terminal 100 (e.g., whether mobile terminal 100 is open or closed), the position of mobile terminal 100, the presence or absence of user contact with mobile terminal 100 (i.e., touch input), the orientation of mobile terminal 100, the acceleration or deceleration movement and direction of mobile terminal 100, and so on, and generates commands or signals for controlling the operation of mobile terminal 100. For example, when mobile terminal 100 is implemented as a slide-type mobile phone, sensing unit 140 can sense whether the slide-type phone is open or closed. In addition, sensing unit 140 can detect whether power supply unit 190 is supplying power or whether interface unit 170 is coupled to an external device. Sensing unit 140 may include proximity sensor 141, which is described below in connection with the touch screen.

Interface unit 170 serves as an interface through which at least one external device can connect to mobile terminal 100. For example, the external devices may include wired or wireless headset ports, external power supply (or battery charger) ports, wired or wireless data ports, memory card ports, ports for connecting a device having an identification module, audio input/output (I/O) ports, video I/O ports, earphone ports, and the like. The identification module may store various information for authenticating a user's right to use mobile terminal 100 and may include a user identity module (UIM), a subscriber identity module (SIM), a universal subscriber identity module (USIM), and the like. In addition, the device having the identification module (hereinafter referred to as the "identification device") may take the form of a smart card, so the identification device can be connected to mobile terminal 100 via a port or other connecting means. Interface unit 170 may be used to receive input (e.g., data, power, etc.) from an external device and transfer the received input to one or more elements within mobile terminal 100, or may be used to transfer data between the mobile terminal and the external device.

In addition, when mobile terminal 100 is connected to an external cradle, interface unit 170 may serve as a path through which power is supplied from the cradle to mobile terminal 100, or as a path through which various command signals input from the cradle are transferred to the mobile terminal. The various command signals or power input from the cradle may serve as signals for recognizing whether the mobile terminal is properly mounted on the cradle.

Output unit 150 is configured to provide output signals (e.g., audio signals, video signals, alarm signals, vibration signals, etc.) in a visual, audible, and/or tactile manner. Output unit 150 may include display unit 151, audio output module 152, alarm unit 153, and the like.

Display unit 151 can display information processed in mobile terminal 100. For example, when mobile terminal 100 is in phone call mode, display unit 151 can display a user interface (UI) or graphical user interface (GUI) related to the call or other communication (e.g., text messaging, multimedia file downloading, etc.). When mobile terminal 100 is in video call mode or image capture mode, display unit 151 can display captured and/or received images, a UI or GUI showing video or images and related functions, and so on.

Meanwhile, when display unit 151 and the touch pad are superimposed on each other as layers to form a touch screen, display unit 151 can function as both an input device and an output device. Display unit 151 may include at least one of a liquid crystal display (LCD), a thin film transistor LCD (TFT-LCD), an organic light-emitting diode (OLED) display, a flexible display, a three-dimensional (3D) display, and the like. Some of these displays may be constructed to be transparent so that the user can see through them from the outside; these may be called transparent displays, a typical example being the TOLED (transparent organic light-emitting diode) display. Depending on the particular desired implementation, mobile terminal 100 may include two or more display units (or other display devices); for example, the mobile terminal may include an external display unit (not shown) and an internal display unit (not shown). The touch screen can be used to detect touch input pressure as well as touch input position and touch input area.

Audio output module 152 can, when the mobile terminal is in a mode such as call signal reception mode, call mode, recording mode, voice recognition mode, or broadcast reception mode, convert audio data received by wireless communication unit 110 or stored in memory 160 into an audio signal and output it as sound. Audio output module 152 can also provide audio output related to a specific function performed by mobile terminal 100 (e.g., a call signal reception sound, a message reception sound, etc.). Audio output module 152 may include a speaker, a buzzer, and the like.

Alarm unit 153 can provide output to notify the user that an event has occurred in mobile terminal 100. Typical events include call reception, message reception, key signal input, touch input, and the like. In addition to audio or video output, alarm unit 153 can provide output in other ways to signal the occurrence of an event. For example, alarm unit 153 can provide output in the form of vibration: when a call, a message, or some other incoming communication is received, alarm unit 153 can provide tactile output (i.e., vibration) to notify the user. With such tactile output, the user can recognize the occurrence of various events even when the mobile phone is in the user's pocket. Alarm unit 153 can also provide output signalling the occurrence of an event via display unit 151 or audio output module 152.

Memory 160 can store software programs for the processing and control operations performed by controller 180, or can temporarily store data that has been or will be output (e.g., a phone book, messages, still images, video, etc.). In addition, memory 160 can store data about the various patterns of vibration and audio signals output when a touch is applied to the touch screen.

Memory 160 may include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and the like. Mobile terminal 100 can also cooperate with a network storage device that performs the storage function of memory 160 over a network connection.

Controller 180 typically controls the overall operation of the mobile terminal. For example, controller 180 performs the control and processing related to voice calls, data communication, video calls, and so on. In addition, controller 180 may include multimedia module 181 for reproducing (or playing back) multimedia data; multimedia module 181 may be built into controller 180 or configured separately from it. Controller 180 can perform pattern recognition processing to recognize handwriting input or drawing input performed on the touch screen as characters or images.

Power supply unit 190 receives external or internal power and, under the control of controller 180, supplies the appropriate power required to operate each element and component.

The various embodiments described herein may be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof. For a hardware implementation, the embodiments described herein may be implemented using at least one of application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and electronic units designed to perform the functions described herein; in some cases, such embodiments may be implemented in controller 180. For a software implementation, embodiments such as procedures or functions may be implemented with separate software modules, each of which performs at least one function or operation. The software code may be implemented as a software application (or program) written in any suitable programming language, stored in memory 160, and executed by controller 180.

So far, the mobile terminal has been described in terms of its functions. In the following, for the sake of brevity, a slide-type mobile terminal is described as an example among the various types of mobile terminals, such as folder-type, bar-type, swing-type, and slide-type terminals. The present invention can nevertheless be applied to any type of mobile terminal and is not limited to slide-type mobile terminals.

Mobile terminal 100 as shown in Fig. 1 may be configured to operate with wired and wireless communication systems that transmit data via frames or packets, as well as with satellite-based communication systems.

A communication system in which the mobile terminal according to the present invention can operate will now be described with reference to Fig. 2.

Such communication systems may use different air interfaces and/or physical layers. For example, the air interfaces used by communication systems include frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), universal mobile telecommunications system (UMTS) (in particular, Long Term Evolution (LTE)), global system for mobile communications (GSM), and the like. As a non-limiting example, the description below relates to a CDMA communication system, but the teaching applies equally to other types of systems.

Referring to Fig. 2, a CDMA wireless communication system may include multiple mobile terminals 100, multiple base stations (BS) 270, base station controllers (BSC) 275, and a mobile switching center (MSC) 280. MSC 280 is configured to interface with a public switched telephone network (PSTN) 290. MSC 280 is also configured to interface with BSCs 275, which may be coupled to base stations 270 via backhaul links. The backhaul links may be configured according to any of several known interfaces, including, for example, E1/T1, ATM, IP, PPP, frame relay, HDSL, ADSL, or xDSL. It will be understood that the system shown in Fig. 2 may include multiple BSCs 275.

Each BS 270 may serve one or more sectors (or regions), each sector being covered by an omnidirectional antenna or by an antenna pointing in a particular direction radially away from BS 270. Alternatively, each sector may be covered by two or more antennas for diversity reception. Each BS 270 may be configured to support multiple frequency assignments, each frequency assignment having a particular spectrum (e.g., 1.25 MHz, 5 MHz, etc.).

The intersection of a sector and a frequency assignment may be referred to as a CDMA channel. BS 270 may also be referred to as a base transceiver subsystem (BTS) or by other equivalent terms. In such cases, the term "base station" may be used to refer collectively to a single BSC 275 and at least one BS 270. A base station may also be referred to as a "cell site". Alternatively, the individual sectors of a particular BS 270 may be referred to as multiple cell sites.

As shown in Fig. 2, a broadcast transmitter (BT) 295 transmits broadcast signals to the mobile terminals 100 operating within the system. Broadcast reception module 111 as shown in Fig. 1 is provided in mobile terminal 100 to receive the broadcast signals transmitted by BT 295. Fig. 2 also shows several Global Positioning System (GPS) satellites 300, which help locate at least one of the multiple mobile terminals 100.

Although multiple satellites 300 are depicted in Fig. 2, it is understood that useful positioning information can be obtained with any number of satellites. GPS module 115 as shown in Fig. 1 is typically configured to cooperate with satellites 300 to obtain the desired positioning information. Other technologies capable of tracking the position of the mobile terminal may be used instead of, or in addition to, GPS tracking technology. In addition, at least one GPS satellite 300 may optionally or additionally handle satellite DMB transmission.

In a typical operation of the wireless communication system, BS 270 receives reverse link signals from various mobile terminals 100. Mobile terminals 100 typically take part in calls, messaging, and other types of communication. Each reverse link signal received by a particular base station 270 is processed within that BS 270, and the resulting data is forwarded to the associated BSC 275. The BSC provides call resource allocation and mobility management functions, including the coordination of soft handoff procedures between BSs 270. BSC 275 also routes the received data to MSC 280, which provides additional routing services for interfacing with PSTN 290. Similarly, PSTN 290 interfaces with MSC 280, the MSC interfaces with BSCs 275, and BSCs 275 in turn control BSs 270 to transmit forward link signals to mobile terminals 100.

Based on the hardware structure of mobile terminal 100 and the communication system described above, the present invention provides a method for generating a multimedia document. In each of the following embodiments of the method, mobile terminal 100 is taken as the executing entity.

Referring to Fig. 3, which is a schematic flowchart of a first embodiment of the method for generating a multimedia document of the present invention, the method proposed by the present invention comprises the following steps:

Step S10, receiving audio data and picture data;

In this embodiment, audio data may be received when the mobile terminal is detected to have started its sound recorder, or picture data may be received when the mobile terminal is detected to have entered photographing mode. Audio data and picture data may also be received from an external device when the mobile terminal establishes a connection with that device, or obtained directly from a preset storage path in the mobile terminal when a multimedia file acquisition instruction is received. It should be noted that when the mobile terminal has received video data (either received from an external device or recorded by the mobile terminal itself) and then receives an extraction instruction, it extracts the audio data from the received video data; once the audio data has been extracted, the audio data is considered to have been received. The form of the extraction instruction can be set according to actual needs. For example, an extraction control may be provided on the mobile terminal, and the extraction instruction is considered received when the user triggers that control.

Step S20, converting the audio data into text information;

In this embodiment, the audio data can be converted into text information by existing speech-to-text software, for example AudioNote or Viovoice; no limitation is imposed here.

Referring to Fig. 4, which is a detailed schematic flowchart of a first embodiment of the step of converting audio data into text information in the method for generating a multimedia document of the present invention, step S20 comprises:

Step S21, extracting first creation-time information of each piece of picture data, and second creation-time information of the audio data;

In this embodiment, the first creation-time information is the creation time of the picture data, and the second creation-time information is the creation time of the audio data. For example, suppose that three picture files and one recording file are created during an interview. The creation times of the three picture files are 9:10, 9:30, and 9:45, the creation time of the recording file is 9:00, and the duration of the recording file is 1 hour; that is, the recording runs from 9:00 to 10:00.

Step S22: divide the audio data into several pieces of first sub-audio data according to the first creation-time information and the second creation-time information.

For example, the time difference between each piece of first creation-time information and the second creation-time information may first be calculated;

In this embodiment, the differences between the creation times of the picture files and the creation time of the recording file are 10 minutes, 30 minutes and 45 minutes, respectively.

The audio data is then divided into several pieces of first sub-audio data according to each time difference and the duration of the audio file;

In this embodiment, the recording file lasts 60 minutes and the split points fall at minute 10, minute 30 and minute 45, so the recording file is divided into four pieces of first sub-audio data: minute 0 to minute 10, minute 10 to minute 30, minute 30 to minute 45, and minute 45 to minute 60.
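The division in step S22 can be sketched as follows. This is a minimal illustration only, not the patented implementation: the function name split_points is hypothetical, and ignoring pictures created outside the recording period is an assumption, since the description does not say how such pictures are handled.

```python
from datetime import datetime, timedelta

def split_points(audio_start, audio_duration, picture_times):
    """Return the (start, end) offsets of each first sub-audio segment."""
    audio_end = audio_start + audio_duration
    # Time difference between each picture and the start of the recording.
    # Assumption: pictures created outside the recording period are ignored,
    # and pictures with identical creation times yield a single split point.
    offsets = sorted(
        {t - audio_start for t in picture_times if audio_start < t < audio_end}
    )
    bounds = [timedelta(0)] + offsets + [audio_duration]
    return list(zip(bounds[:-1], bounds[1:]))

# Worked example from the description: recording 9:00 to 10:00,
# pictures created at 9:10, 9:30 and 9:45.
start = datetime(2015, 4, 20, 9, 0)
pictures = [
    datetime(2015, 4, 20, 9, 10),
    datetime(2015, 4, 20, 9, 30),
    datetime(2015, 4, 20, 9, 45),
]
segments = split_points(start, timedelta(hours=1), pictures)
minutes = [
    (int(a.total_seconds() // 60), int(b.total_seconds() // 60))
    for a, b in segments
]
print(minutes)  # [(0, 10), (10, 30), (30, 45), (45, 60)]
```

Each resulting pair of offsets could then be used to cut the recording before it is passed to the speech-to-text software.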

Step S23: convert each piece of first sub-audio data into sub-text information.

In this embodiment, each piece of first sub-audio data may be converted into sub-text information by existing speech-to-text software, for example AudioNote or Viovoice; no limitation is imposed here. Each piece of first sub-audio data thus yields one piece of sub-text information, giving four pieces of sub-text information in total.

Step S30: insert the text information and the picture data into a preset document to generate a multimedia document.

In this embodiment, the preset document may be a blank Word file, PDF file, PPT file or the like, chosen as needed; no limitation is imposed here. The text information and the picture data are saved together in the document, so that an illustrated interview draft is generated automatically.

Preferably, step S30 comprises:

Insert each piece of sub-text information into the preset document;

In this embodiment, the picture file created at 9:10 corresponds to a time difference of 10 minutes, the picture file created at 9:30 to a time difference of 30 minutes, and the picture file created at 9:45 to a time difference of 45 minutes. These time differences divide the recording file into four pieces of first sub-audio data. As analyzed above, the two pieces of first sub-audio data separated by the 10-minute split point are the piece from minute 0 to minute 10 and the piece from minute 10 to minute 30. The picture file created at 9:10 is therefore inserted between the text information corresponding to these two pieces of first sub-audio data. The remaining picture files are handled in the same way, and the document is finally generated.
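The insertion order described above can be sketched as a simple interleaving. This is a hedged illustration: the function name interleave, the tuple representation of the document and the file names are all hypothetical, and writing the result into an actual Word, PDF or PPT file is left out.

```python
def interleave(sub_texts, pictures):
    """Place each picture between the two sub-texts that its split point separates."""
    # Assumption: pictures are sorted by creation time and every picture
    # produced exactly one split point, so there is one more text than pictures.
    if len(sub_texts) != len(pictures) + 1:
        raise ValueError("expected exactly one more sub-text than pictures")
    document = []
    for i, text in enumerate(sub_texts):
        document.append(("text", text))
        if i < len(pictures):
            document.append(("picture", pictures[i]))
    return document

# Four pieces of sub-text information and three pictures, as in the example.
texts = ["text 0-10", "text 10-30", "text 30-45", "text 45-60"]
photos = ["0910.jpg", "0930.jpg", "0945.jpg"]  # hypothetical file names
layout = interleave(texts, photos)
print([kind for kind, _ in layout])
```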

The method for generating a multimedia document provided by the present invention converts the received audio data into text information and inserts the text information and the picture data into a preset document. After a journalist finishes an interview, a news interview draft can be generated automatically from the audio data and picture data produced during the interview, which improves the journalist's working efficiency.

Further, based on the first embodiment of the method for generating a multimedia document of the present invention, the present invention also proposes a second embodiment of the method. Referring to Fig. 5, Fig. 5 is a detailed flow chart of the second embodiment of the step of converting audio data into text information in the method for generating a multimedia document of the present invention. The second embodiment differs from the first in that step S20 comprises:

Step S24: divide the audio data into several pieces of second sub-audio data according to an audio feature parameter;

In this embodiment, the audio feature parameter may be a timbre parameter, an acoustic parameter, a volume parameter or the like, and is preferably a timbre parameter. Dividing the audio data into several pieces of second sub-audio data according to the timbre parameter makes it possible to distinguish the audio data belonging to different speakers.

Step S25: convert each piece of second sub-audio data into text information, with different display parameters for the text information corresponding to different pieces of second sub-audio data, so that the text information corresponding to different pieces of second sub-audio data is displayed in different formats in the generated document.

In this embodiment, the format of each piece of text information can be set as needed; for example, each piece may be given a different color or a different font. Displaying each piece of text information in a different color in the generated document, for instance, makes it easier for the journalist to tell apart the text information corresponding to different interviewees and thus helps the journalist work more efficiently.
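One possible way to assign display parameters is sketched below. It assumes the timbre-based division has already labelled each piece of second sub-audio data with a speaker identifier; that labelling is not implemented here, and the names PALETTE and assign_colors as well as the color values are hypothetical.

```python
from itertools import cycle

# Hypothetical palette; the description only requires that the formats differ.
PALETTE = ["black", "blue", "red", "green"]

def assign_colors(speaker_labels):
    """Map each distinct speaker label to a color, in order of first appearance."""
    colors = {}
    palette = cycle(PALETTE)
    for label in speaker_labels:
        if label not in colors:
            # With more speakers than palette entries, colors repeat.
            colors[label] = next(palette)
    return colors

# Speaker label of each piece of second sub-audio data, in playback order.
labels = ["journalist", "interviewee", "journalist", "interviewee"]
color_of = assign_colors(labels)
styled = [(label, color_of[label]) for label in labels]
print(styled)
```

The same mapping could carry a font name instead of a color, since the description mentions both as possible display parameters.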

Preferably, step S30 comprises:

Determine the time interval corresponding to each piece of second sub-audio data;

Determine the time interval in which the first creation-time information of each picture file falls;

Insert each picture file into the text information of the second sub-audio data corresponding to its time interval. For example, the picture file may be inserted in the middle of the determined text information, or at its head or tail.
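Finding the time interval that contains a picture's creation time can be sketched with a binary search over the segment start offsets. This is an illustration under assumptions: the function name locate_segment is hypothetical, the offsets are expressed in minutes from the start of the recording, and the example boundaries are invented rather than taken from the description.

```python
from bisect import bisect_right

def locate_segment(segment_starts, total_duration, picture_offset):
    """Return the index of the segment whose time interval contains picture_offset.

    segment_starts must be sorted and begin at 0. Returns None when the
    picture was not created during the recording.
    """
    if not 0 <= picture_offset <= total_duration:
        return None
    # Index of the last segment starting at or before the picture offset.
    return bisect_right(segment_starts, picture_offset) - 1

# Hypothetical second sub-audio data starting at minutes 0, 12, 25 and 40
# of a 60-minute recording.
starts = [0, 12, 25, 40]
print(locate_segment(starts, 60, 30))  # 2
print(locate_segment(starts, 60, 12))  # 1
```

A picture that falls exactly on a boundary is attributed here to the segment that starts at that boundary; the description leaves this choice open.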

It should be noted that the second embodiment of the above method for generating a multimedia document can be combined with the first embodiment. For example, each piece of first sub-audio data obtained in the first embodiment may be further divided into several pieces of second sub-audio data according to the audio feature parameter; that is, step S20 may further comprise:

Divide the first sub-audio data into several pieces of second sub-audio data according to the audio feature parameter;

Convert each piece of second sub-audio data into text information, with different formats for the text information corresponding to different pieces of second sub-audio data, so that the text information corresponding to different pieces of second sub-audio data is displayed in different formats in the generated document.

Further, to make it easier for the journalist to organize the audio data and picture data obtained in an interview, after step S10 the method for generating a multimedia document further comprises:

Create a folder named after the current date;

In addition, the folder named after the current date may also be created when the terminal receives a multimedia file acquisition instruction. For example, the instruction is regarded as received when the mobile terminal is detected to start recording mode or photographing mode; or a control may be provided on the mobile terminal, and the instruction is regarded as received when the user triggers that control; or certain time intervals may be preset, and the instruction is regarded as received when the current time equals the start time of one of those intervals.

A journalist may conduct several interviews in one day, and each interview generally involves recording and photographing. When starting an interview, the journalist can trigger the above control, whereupon the mobile terminal creates a corresponding folder, whose name is preferably the current date plus a serial number. For example, if the journalist conducts three interviews on 2015.04.20, the folder for the first interview may be named 2015.04.20.1, the folder for the second interview 2015.04.20.2, and the folder for the third interview 2015.04.20.3.
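The naming scheme above can be sketched as follows. This is a minimal illustration: the function name folder_name is hypothetical, and passing the existing folder names as a set, rather than reading them from the terminal's storage, is an assumption made to keep the example self-contained.

```python
from datetime import date

def folder_name(day, existing_names):
    """Return 'YYYY.MM.DD.N' using the first serial number N not yet taken."""
    prefix = day.strftime("%Y.%m.%d")
    serial = 1
    while f"{prefix}.{serial}" in existing_names:
        serial += 1
    return f"{prefix}.{serial}"

# Three interviews on 2015.04.20, as in the example.
created = set()
for _ in range(3):
    created.add(folder_name(date(2015, 4, 20), created))
print(sorted(created))  # ['2015.04.20.1', '2015.04.20.2', '2015.04.20.3']
```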

Preferably, the multimedia files may be sorted by time and stored in the created folder.

The present invention further provides a device for generating a multimedia document.

Referring to Fig. 6, Fig. 6 is a functional block diagram of the first embodiment of the device for generating a multimedia document of the present invention. The device for generating a multimedia document provided by the present invention comprises:

Receiver module 10, configured to receive audio data and picture data;

In this embodiment, audio data may be received when the mobile terminal is detected to start its recorder, and picture data may be received when the mobile terminal is detected to enter photographing mode. Audio data and picture data may also be received from an external device when the mobile terminal is connected to that device, or obtained directly from a preset storage path in the mobile terminal when a multimedia file acquisition instruction is received. It should be noted that when the receiver module 10 receives video data (either video data received from an external device or video data recorded by the mobile terminal) and then receives an extraction instruction, it extracts the audio data from the received video data; once the audio data has been extracted, the audio data is regarded as received. The form of the extraction instruction can be set as needed. For example, an extraction control may be provided on the mobile terminal, and the extraction instruction is regarded as received when the user triggers that control.

Conversion module 20, configured to convert the audio data into text information;

Referring to Fig. 7, Fig. 7 is a detailed functional block diagram of the first embodiment of the conversion module in the device for generating a multimedia document of the present invention. The conversion module 20 comprises:

Extraction unit 21, configured to extract the first creation-time information of each piece of picture data and the second creation-time information of the audio data;

First division unit 22, configured to divide the audio data into several pieces of first sub-audio data according to the first creation-time information and the second creation-time information;

First converting unit 23, configured to convert each piece of first sub-audio data into sub-text information.

Insert module 30, configured to insert the text information and the picture data into a preset document to generate a multimedia document.

Preferably, the insert module 30 comprises:

The device for generating a multimedia document provided by the present invention converts the received audio data into text information and inserts the text information and the picture data into a preset document. After a journalist finishes an interview, a news interview draft can be generated automatically from the audio data and picture data produced during the interview, which improves the journalist's working efficiency.

Further, based on the first embodiment of the device for generating a multimedia document of the present invention, the present invention also proposes a second embodiment of the device. Referring to Fig. 8, Fig. 8 is a detailed functional block diagram of the second embodiment of the conversion module in the device for generating a multimedia document of the present invention. The conversion module 20 comprises:

Second division unit 24, configured to divide the audio data into several pieces of second sub-audio data according to an audio feature parameter;

Second converting unit 25, configured to convert each piece of second sub-audio data into text information, with different display parameters for the text information corresponding to different pieces of second sub-audio data, so that the text information corresponding to different pieces of second sub-audio data is displayed in different formats in the generated document.

Preferably, the insert module 30 comprises:

Determining unit, configured to determine the time interval corresponding to each piece of second sub-audio data;

The determining unit is further configured to determine the time interval in which the first creation-time information of each picture file falls;

Inserting unit, configured to insert each picture file into the text information of the second sub-audio data corresponding to its time interval. For example, the picture file may be inserted in the middle of the determined text information, or at its head or tail.

Further, to make it easier for the journalist to organize the audio data and picture data obtained in an interview, the device for generating a multimedia document further comprises:

Creation module, configured to create a folder named after the current date;

A journalist may conduct several interviews in one day, and each interview generally involves recording and photographing. When starting an interview, the journalist can trigger the above control, whereupon the creating unit 11 creates a corresponding folder, whose name is preferably the current date plus a serial number. For example, if the journalist conducts three interviews on 2015.04.20, the folder for the first interview may be named 2015.04.20.1, the folder for the second interview 2015.04.20.2, and the folder for the third interview 2015.04.20.3.

Receiving unit 12, configured to receive multimedia files and store the received multimedia files in the created folder.

It should be noted that, in this document, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises it.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to perform the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims

1. A method for generating a multimedia document, characterized in that the method for generating a multimedia document comprises:

Receiving audio data and picture data;

Converting the audio data into text information;

2. The method for generating a multimedia document as claimed in claim 1, characterized in that the step of converting the audio data into text information comprises:

3. The method for generating a multimedia document as claimed in claim 2, characterized in that the step of inserting the text information and the picture data into a preset document comprises:

Inserting each piece of sub-text information into the preset document;

4. The method for generating a multimedia document as claimed in claim 1, characterized in that the step of converting the audio data into text information comprises:

5. The method for generating a multimedia document as claimed in claim 1, characterized in that, after the step of receiving audio data and picture data, the method for generating a multimedia document further comprises:

Creating a folder named after the current date;

6. A device for generating a multimedia document, characterized in that the device for generating a multimedia document comprises:

A receiver module, configured to receive audio data and picture data;

A conversion module, configured to convert the audio data into text information;

7. The device for generating a multimedia document as claimed in claim 6, characterized in that the conversion module comprises:

8. The device for generating a multimedia document as claimed in claim 7, characterized in that the insert module comprises:

9. The device for generating a multimedia document as claimed in claim 6, characterized in that the conversion module comprises:

10. The device for generating a multimedia document as claimed in claim 6, characterized in that the device for generating a multimedia document further comprises:

A creation module, configured to create a folder named after the current date;