CN107592400A

CN107592400A - Method and terminal for processing a recording file

Info

Publication number: CN107592400A
Application number: CN201710664366.2A
Authority: CN
Inventors: 张承军
Original assignee: Shenzhen Jinli Communication Equipment Co Ltd
Current assignee: Shenzhen Jinli Communication Equipment Co Ltd
Priority date: 2017-08-04
Filing date: 2017-08-04
Publication date: 2018-01-16

Abstract

Embodiments of the invention disclose a method and a terminal for processing a recording file. The method includes: collecting audio data through a voice pickup device to obtain a target recording file; performing speech recognition on the target recording file to obtain a target text file; and sending the target text file to a server. Because the target text file is what is sent to the server, less network bandwidth is occupied, which improves the efficiency and success rate of backup.

Description

Method and terminal for processing a recording file

Technical field

The present invention relates to the field of electronic technology, and in particular to a method and a terminal for processing a recording file.

Background

With the progress of science and technology, more and more terminals have reached the mass market, bringing users great convenience and enriching everyday life.

Most current terminals have a recording function that helps users record important matters. For example, a terminal can record a call while receiving or making it: when the user needs to record the call, triggering the call recording button starts recording the current call, and the recording stops when the call ends.

Users may upload important recording files to a server for backup. When a recording file is large, backing it up to the server occupies a large amount of network bandwidth, and when the network state is poor the backup easily fails.

Summary of the invention

Embodiments of the present invention propose a method and a terminal for processing a recording file, which address the problem that backing up a recording file to a server occupies a large amount of network bandwidth and easily fails, and which can improve the efficiency and success rate of backup.

In a first aspect, an embodiment of the present invention provides a method for processing a recording file, the method including:

collecting audio data through a voice pickup device to obtain a target recording file;

performing speech recognition on the target recording file to obtain a target text file, the target text file including text information, time information, and sound feature information;

sending the target text file to a server.

In a second aspect, an embodiment of the present invention provides a terminal, the terminal including:

a collecting unit, configured to collect audio data through a voice pickup device to obtain a target recording file;

a speech recognition unit, configured to perform speech recognition on the target recording file to obtain a target text file, the target text file including text information, time information, and sound feature information;

a sending unit, configured to send the target text file to a server.

In a third aspect, an embodiment of the present invention provides another terminal, including a processor, an input device, an output device, and a memory that are connected to one another, wherein the memory is configured to store application program code that supports the terminal in performing the above method, and the processor is configured to perform the method of the first aspect.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to perform the method of the first aspect.

In the embodiments of the present invention, audio data is collected through a voice pickup device to obtain a target recording file, speech recognition is performed on the target recording file to obtain a target text file, and the target text file is uploaded to a server. Because the target text file is a text file and the target recording file is an audio file, the target text file is much smaller than the target recording file, which reduces the network bandwidth occupied and improves backup efficiency and success rate. In addition, the target text file contains text information, time information, and sound feature information, which facilitates subsequently synthesizing the target recording file from the target text file and thereby improves the validity of the backup.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

In the drawings:

Fig. 1 is a schematic flowchart of a method for processing a recording file according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of another method for processing a recording file according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of another terminal according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of another terminal according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of another terminal according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.

It should also be understood that the terminology used in this description is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in this description and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" used in this description and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.

As used in this specification and the appended claims, the term "if" may be interpreted, depending on context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".

In specific implementations, the terminal described in the embodiments of the present invention includes, but is not limited to, portable devices such as a mobile phone, laptop computer, or tablet computer with a touch-sensitive surface (for example, a touch-screen display and/or a touchpad). It should also be understood that, in some embodiments, the device is not a portable communication device but a desktop computer with a touch-sensitive surface (for example, a touch-screen display and/or a touchpad).

The following discussion describes a terminal that includes a display and a touch-sensitive surface. It should be understood, however, that the terminal may include one or more other physical user-interface devices such as a physical keyboard, a mouse, and/or a joystick.

The terminal supports various applications, such as one or more of the following: a drawing application, presentation application, word-processing application, website creation application, disc burning application, spreadsheet application, gaming application, telephony application, video conferencing application, email application, instant messaging application, workout support application, photo management application, digital camera application, digital video camera application, web browsing application, digital music player application, and/or digital video player application.

The various applications that can be executed on the terminal may use at least one common physical user-interface device, such as the touch-sensitive surface. One or more functions of the touch-sensitive surface, and the corresponding information displayed on the terminal, may be adjusted and/or changed between applications and/or within a given application. In this way, a common physical architecture of the terminal (for example, the touch-sensitive surface) can support the various applications with user interfaces that are intuitive and transparent to the user.

Referring to Fig. 1, Fig. 1 is a schematic flowchart of a method for processing a recording file according to an embodiment of the present invention. As shown in Fig. 1, the method may include:

101. Collect audio data through a voice pickup device to obtain a target recording file.

In the embodiments of the present invention, the voice pickup device may be a microphone in the terminal, a microphone on an earphone connected to the terminal, or another device. After the recording function is started, the voice pickup device collects audio data: the collected sound is converted into an electrical signal, the electrical signal is converted into a digital signal, and the audio file corresponding to that audio data is taken as the target recording file. The terminal usually stores the target recording file in internal memory or on a memory card, and may of course also upload it to a server.

The terminal collects two main kinds of audio data through the voice pickup device. The first is recorded conversation, in an ordinary conversation scene or a call scene, where recording makes things more convenient. For example, when important matters or a telephone number need to be noted during a call, the user can turn on the phone's recording function to capture the required content; or, when collecting evidence, a conversation with another person is recorded. The second is the user's own recording, such as recording reminders, singing, or reading aloud. The embodiments of the present invention do not limit the application scenario in which audio data is collected; conversation recording is preferred.

102. Perform speech recognition on the target recording file to obtain a target text file.

Speech recognition technology converts the lexical content of human speech into computer-readable input such as key presses, binary codes, or character strings. It is currently widely used in many fields, including voice search, audio dictation (audio to text), and intelligent voice navigation systems (used in customer service systems).

Voiceprint recognition technology converts an acoustic signal into an electrical signal, which is then identified by a computer.

In the embodiments of the present invention, the target text file includes text information, time information, and sound feature information. The text information is the text corresponding to the speech content of the target recording file. The time information is the time record of the target recording file and may include the start time and end time of the recording, the start time and end time of each sentence, the times of certain keywords, and so on. The sound feature information includes tone, timbre, loudness, and so on. The terminal can perform speech recognition on the target recording file using speech recognition technology and voiceprint recognition technology, thereby obtaining the text information, time information, and sound feature information of the target recording file.
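As an illustration only, the three kinds of information could be held in a small record type and serialized as JSON. The source does not specify a file format, so the class names, field names, ISO 8601 timestamps, and the sample values below (sentence end time, loudness figure, timbre label) are all assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SoundFeatures:
    # The source names tone, timbre, and loudness; the value types are assumed.
    tone: str
    timbre: str
    loudness_db: float


@dataclass
class TextSegment:
    start: str                # start time of the sentence (assumed ISO 8601)
    end: str                  # end time of the sentence
    speaker: str
    text: str                 # text information
    features: SoundFeatures   # sound feature information


@dataclass
class TargetTextFile:
    recording_start: str
    recording_end: str
    segments: List[TextSegment] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists.
        return json.dumps(asdict(self), ensure_ascii=False)


example = TargetTextFile(
    recording_start="2017-06-14T14:20:00",
    recording_end="2017-06-14T14:22:00",
    segments=[
        TextSegment(
            start="2017-06-14T14:20:00",
            end="2017-06-14T14:20:03",   # illustrative value
            speaker="Zhang San",
            text="Hello, Li Si, this is Zhang San.",
            features=SoundFeatures(tone="natural", timbre="unknown", loudness_db=60.0),
        )
    ],
)
payload = example.to_json()
```

Keeping each sentence with its own times and features is one way to make later per-sentence synthesis possible; other layouts would serve equally well.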

Specifically, the target recording file is divided into N speech segments according to a preset volume threshold, the N speech segments including a target speech segment; speech recognition is performed on the target speech segment to obtain the text segment of the target speech segment.

N is greater than 1, and the preset volume threshold may be 0; the embodiments of the present invention do not uniquely limit the source or the specific value of the preset volume threshold. Each continuous stretch of the target recording file in which the volume is greater than the preset volume threshold is taken as one speech segment, which yields N speech segments.

The target speech segment is any one of the N speech segments. That is, speech recognition is performed segment by segment to obtain the text information, time information, and sound feature information of each speech segment, which improves the accuracy of speech recognition.
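The division rule can be sketched as follows, assuming the recording has already been reduced to one volume value per frame (how volume is measured is not stated in the source). The function name `split_by_volume` and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_by_volume(volumes: Sequence[float], threshold: float = 0.0) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, of each continuous
    run whose volume is greater than the threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, volume in enumerate(volumes):
        if volume > threshold:
            if start is None:
                start = i                    # a new segment begins
        elif start is not None:
            segments.append((start, i))      # the current segment ends
            start = None
    if start is not None:                    # recording ended mid-segment
        segments.append((start, len(volumes)))
    return segments


frames = [0.0, 0.4, 0.6, 0.0, 0.0, 0.3, 0.5, 0.2, 0.0]
print(split_by_volume(frames))  # [(1, 3), (5, 8)]
```

Each returned index pair would then be passed to the recognizer as one segment.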

For example, Zhang San calls Li Si at 14:20 on June 14, 2017. Li Si starts recording the conversation when the call begins, and the call ends at 14:22 on June 14, 2017, yielding the target recording file. Speech recognition of the target recording file yields the following target text file:

14:20: Zhang San: Hello, Li Si, this is Zhang San.

Li Si: Hello, Zhang San. What is it?

Zhang San: Tomorrow, June 15, 2017, there is a meeting at the science hall at 9:00 a.m. sharp. We gather at the company at 8:30.

14:21: Li Si: OK. What documents do I need? Who else is going?

Zhang San: Your identity card and employee card, and remember to wear formal dress. Alice, Jane, and Tom are going too.

Li Si: OK, I've got it. That's all for now, then.

14:22: Zhang San: Mm, goodbye.

Here, Zhang San's sound feature information is natural when saying "Hello, Li Si, this is Zhang San", serious when saying "Tomorrow, June 15, 2017, there is a meeting at the science hall at 9:00 a.m. sharp. We gather at the company at 8:30" and "Your identity card and employee card, and remember to wear formal dress. Alice, Jane, and Tom are going too", and happy when saying "Mm, goodbye." Li Si's sound feature information is questioning and serious when saying "Hello, Zhang San. What is it?" and "OK. What documents do I need? Who else is going?", and affirmative when saying "OK, I've got it. That's all for now, then."

Optionally, whether the target speech segment is important is judged from its text information, and the target speech segment is deleted when its text information is unimportant. That is, unimportant speech segments are deleted, which reduces the size of the target text file.

Optionally, identity information corresponding to the target recording file is obtained according to the sound feature information, and the name of the target text file is determined according to the identity information and the time information of the target recording file.

That is, the identity information corresponding to the audio data in the target recording file can be determined from the obtained sound feature information, and the name of the target text file is then determined from that identity information and the time information of the target recording file. The target text file can thus be found among the stored text files simply by searching for the time information and identity information of the target recording file, which makes lookup more convenient.

The name of the target text file may combine the identity information with the date, for example: "Zhang San 170614", "Zhang San-170614", or "Incoming call: Zhang San - call duration: 25 minutes". The embodiments of the present invention do not uniquely limit the specific naming scheme.

For example, Li Si answers a call at 14:20 on June 14, 2017 and records it, obtaining a target recording file. Speech recognition yields the sound feature information of the recorded party; matching this sound feature information identifies the recorded party as Zhang San, so the target text file obtained by speech recognition is named "Zhang San 170614".
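The identity-plus-date naming can be sketched in a few lines. The function name `text_file_name` and the `sep` parameter are hypothetical; the two-digit year, month, and day layout follows the "170614" form used in the example.

```python
from datetime import datetime


def text_file_name(identity: str, recorded_at: datetime, sep: str = " ") -> str:
    # Date rendered as YYMMDD, matching the "170614" form in the example.
    return f"{identity}{sep}{recorded_at.strftime('%y%m%d')}"


name = text_file_name("Zhang San", datetime(2017, 6, 14, 14, 20))
print(name)  # Zhang San 170614
```

Passing `sep="-"` gives the hyphenated variant of the same name.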

The party in the target recording file may be calling for the first time, or the terminal may hold no stored sound feature information for that party; in that case the identity information can be determined from the conversation content. For example, a call typically opens as follows: "Hello! Is this Li Si? I am Zhang San", from which it is clear that the two parties are Li Si and Zhang San.
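One simple way to read an identity out of the conversation content is a pattern match on a self-introduction. The sketch below is an assumption, not the method of the source: the pattern handles only English phrasings such as "I am ..." and capitalized names, and a real system would need language-specific rules or a named-entity recognizer.

```python
import re
from typing import Optional

# Hypothetical pattern for self-introductions such as "I am Zhang San".
SELF_INTRO = re.compile(r"\b(?:I am|I'm|this is)\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)")


def caller_from_dialogue(text: str) -> Optional[str]:
    """Return the name given in the first self-introduction, if any."""
    match = SELF_INTRO.search(text)
    return match.group(1) if match else None


print(caller_from_dialogue("Hello! Is this Li Si? I am Zhang San."))  # Zhang San
```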

Optionally, the server classifies target text files according to identity information, which facilitates subsequent lookup of a target text file by identity information.

103. Send the target text file to the server.

In the embodiments of the present invention, the target text file is sent to the server; that is, the target recording file is backed up in text form.

It should be noted that the step of sending the target text file to the server can be performed only when the terminal has a network connection, either as soon as the terminal connects or periodically. The connection may be Wi-Fi or mobile data such as 2G, 3G, or 4G. To save the user's data allowance, avoid unnecessary charges, and improve transmission efficiency, the terminal can be set to back up recording files only over Wi-Fi. The embodiments of the present invention do not limit how the target text file is sent to the server.
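The Wi-Fi-only setting amounts to a small decision rule, sketched below. The `Network` enum and the `should_upload` function are hypothetical names; how the terminal detects its connection type is outside this sketch.

```python
from enum import Enum


class Network(Enum):
    NONE = "none"
    WIFI = "wifi"
    CELLULAR = "cellular"   # 2G/3G/4G mobile data


def should_upload(network: Network, wifi_only: bool = True) -> bool:
    """Decide whether the backup step may run on the current connection."""
    if network is Network.NONE:
        return False                       # no connection: wait until networked
    if wifi_only:
        return network is Network.WIFI     # user chose to spare mobile data
    return True


print(should_upload(Network.CELLULAR))                    # False
print(should_upload(Network.CELLULAR, wifi_only=False))   # True
```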

Optionally, after the target text file has been sent to the server, the target text file and the target recording file are deleted, which frees storage on the terminal.

Optionally, a target text summary of the target text file is extracted, and the target text summary is sent to the server.

The target text summary is summary information of the target text file, generated from keywords, scene information, and the like extracted from the time information, text information, and sound feature information of the target text file. By viewing the target text summary the user can obtain the essential information of the target text file, which makes finding a target text file more efficient.

For example, for the call between Li Si and Zhang San above, the target text summary is: attend a meeting at the science hall at 9:00 a.m. sharp on June 15, 2017; gather at the company at 8:30 with Alice, Jane, and Tom; bring identity card and employee card; remember to wear formal dress.

With this embodiment, because the target text file is a text file and the target recording file is an audio file, the target text file is much smaller than the target recording file, which reduces the network bandwidth occupied and improves backup efficiency and success rate. In addition, the target text file contains text information, time information, and sound feature information, which facilitates subsequently synthesizing the target recording file from the target text file and thereby improves the validity of the backup.
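A rough calculation shows the scale of the size difference for the two-minute call in the example. The audio formats are assumptions, since the source does not say how the recording is encoded: uncompressed 8 kHz 16-bit mono PCM, and AMR-NB at its 12.2 kbit/s mode. The transcript string is abbreviated, and a real target text file would also carry time and sound feature fields, so its size would be somewhat larger than the transcript alone.

```python
def audio_size_bytes(duration_s: int, bitrate_bps: int) -> int:
    """Size of a constant-bit-rate audio stream of the given duration."""
    return duration_s * bitrate_bps // 8


DURATION_S = 120            # the call runs from 14:20 to 14:22
PCM_BPS = 8000 * 16 * 1     # assumed: 8 kHz, 16-bit, mono, uncompressed
AMR_BPS = 12200             # assumed: AMR-NB, 12.2 kbit/s mode

transcript = (
    "Zhang San: Hello, Li Si, this is Zhang San. "
    "Li Si: Hello, Zhang San. What is it? "
    "Zhang San: Tomorrow there is a meeting at the science hall at 9:00 a.m."
)
pcm_bytes = audio_size_bytes(DURATION_S, PCM_BPS)   # 1,920,000 bytes
amr_bytes = audio_size_bytes(DURATION_S, AMR_BPS)   # 183,000 bytes
text_bytes = len(transcript.encode("utf-8"))
print(pcm_bytes, amr_bytes, text_bytes)
```

Under these assumptions the text is far smaller than even the compressed recording.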

Referring to Fig. 2, Fig. 2 is a schematic flowchart of another method for processing a recording file according to an embodiment of the present invention. As shown in Fig. 2, the method may include:

201. Collect audio data through a voice pickup device to obtain a target recording file.

202. Perform speech recognition on the target recording file to obtain a target text file.

203. Send the target text file to a server.

204. Obtain the target text file stored in the server.

In this embodiment, when the user needs to restore a target recording file backed up on the server, the target text file corresponding to the target recording file is downloaded from the server. As described in the embodiment above, the target text file can be looked up by its name or by its target text summary, which makes the lookup more efficient.

205. Perform voice restoration on the target text file to obtain the target recording file.

Speech synthesis technology is the technology of producing artificial speech by mechanical or electronic means.

Using speech synthesis technology together with the text information, time information, and sound feature information contained in the target text file, voice restoration can be performed on the target text file to obtain the target recording file. It should be noted that the target recording file obtained by voice restoration matches, but may not be identical to, the target recording file obtained when the voice pickup device collected the audio data, because of errors introduced in processing.
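The role of each kind of information during restoration can be sketched as follows: the time information places each sentence on the timeline, while the text and sound features are handed to a synthesizer. The `Segment` type and the `synthesize` and `silence` callables are hypothetical stand-ins, since the source names no particular speech synthesis engine; the stubs below return marker bytes rather than real audio.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    start: float               # seconds from the start of the recording
    end: float
    text: str
    features: Dict[str, str]   # e.g. {"tone": "serious"}


def restore_audio(
    segments: List[Segment],
    synthesize: Callable[[str, Dict[str, str]], bytes],
    silence: Callable[[float], bytes],
) -> bytes:
    """Concatenate synthesized segments in time order, padding gaps with silence."""
    out = bytearray()
    cursor = 0.0
    for seg in sorted(segments, key=lambda s: s.start):
        gap = seg.start - cursor
        if gap > 0:
            out += silence(gap)
        out += synthesize(seg.text, seg.features)
        cursor = max(cursor, seg.end)
    return bytes(out)


# Stub engine for demonstration: text bytes stand in for speech, "." for one second of silence.
def fake_synthesize(text: str, features: Dict[str, str]) -> bytes:
    return text.encode("utf-8")


def fake_silence(seconds: float) -> bytes:
    return b"." * int(seconds)


restored = restore_audio(
    [Segment(4.0, 5.0, "b", {"tone": "serious"}), Segment(1.0, 2.0, "a", {"tone": "natural"})],
    fake_synthesize,
    fake_silence,
)
print(restored)  # b'.a..b'
```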

It can be understood that, for the specific implementation of this embodiment, reference may be made to the implementation of the embodiment above, which is not repeated here.

With this embodiment, because the target text file is a text file and the target recording file is an audio file, the target text file is much smaller than the target recording file, which reduces the network bandwidth occupied and improves backup efficiency and success rate. When the user needs to restore the target text file to the target recording file, the target text file is obtained from the server and voice restoration is performed using the text information, time information, and sound feature information it contains, which improves the validity of the backup.

Referring to Fig. 3, Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention. As shown in Fig. 3, the terminal includes:

a collecting unit 301, configured to collect audio data through a voice pickup device to obtain a target recording file;

a speech recognition unit 302, configured to perform speech recognition on the target recording file to obtain a target text file, the target text file including text information, time information, and sound feature information;

a sending unit 303, configured to send the target text file to a server.

With this embodiment, because the target text file is a text file and the target recording file is an audio file, the target text file is much smaller than the target recording file, which reduces the network bandwidth occupied and improves backup efficiency and success rate. In addition, the target text file contains text information, time information, and sound feature information, which facilitates subsequently synthesizing the target recording file from the target text file and thereby improves the validity of the backup.
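The three units of Fig. 3 can be pictured as three pluggable functions chained in order. The sketch below is only one possible arrangement: the `Terminal` class, its `back_up` method, and the stub functions are hypothetical, and the stubs stand in for a real microphone, recognizer, and network client.

```python
from typing import Callable, List


class Terminal:
    def __init__(
        self,
        collect: Callable[[], bytes],
        recognize: Callable[[bytes], str],
        send: Callable[[str], bool],
    ) -> None:
        self.collect = collect        # role of collecting unit 301
        self.recognize = recognize    # role of speech recognition unit 302
        self.send = send              # role of sending unit 303

    def back_up(self) -> bool:
        recording = self.collect()
        text_file = self.recognize(recording)
        return self.send(text_file)


uploaded: List[str] = []


def stub_collect() -> bytes:
    return b"\x00\x01"                # stand-in for recorded audio


def stub_recognize(audio: bytes) -> str:
    return "recognized text"          # stand-in for a recognizer's output


def stub_send(text: str) -> bool:
    uploaded.append(text)             # stand-in for an upload to the server
    return True


ok = Terminal(stub_collect, stub_recognize, stub_send).back_up()
```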

Referring to Fig. 4, Fig. 4 is a schematic structural diagram of another terminal according to an embodiment of the present invention. As shown in Fig. 4, the terminal includes:

a collecting unit 401, configured to collect audio data through a voice pickup device to obtain a target recording file;

a speech recognition unit 402, configured to perform speech recognition on the target recording file to obtain a target text file, the target text file including text information, time information, and sound feature information;

a sending unit 403, configured to send the target text file to a server;

a first acquisition unit 404, configured to obtain the target text file stored in the server;

a voice restoration unit 405, configured to perform voice restoration on the target text file to obtain the target recording file.

In a possible embodiment, the terminal further includes:

a division unit 406, configured to divide the target recording file into N speech segments according to a preset volume threshold, where N is greater than 1 and the N speech segments include a target speech segment;

the speech recognition unit 402 is specifically configured to perform speech recognition on the target speech segment to obtain the text segment of the target speech segment.

In a possible embodiment, the terminal further includes:

a second acquisition unit 407, configured to obtain identity information corresponding to the target recording file according to the sound feature information;

a determining unit 408, configured to determine the name of the target text file according to the identity information and the time information of the target recording file.

In a possible embodiment, the terminal further includes:

an extraction unit 409, configured to extract a target text summary of the target text file;

the sending unit 403 is further configured to send the target text file and the target text summary to the server.

Referring to Fig. 5, Fig. 5 is a schematic structural diagram of another terminal according to an embodiment of the present invention. As shown in Fig. 5, the terminal in this embodiment may include one or more processors 501, one or more input devices 502, one or more output devices 503, and a memory 504. The processor 501, input device 502, output device 503, and memory 504 are connected through a bus 505. The memory 504 is used to store instructions, and the processor 501 is used to execute the instructions stored in the memory 504. The processor 501 is configured to: collect audio data through a voice pickup device to obtain a target recording file; perform speech recognition on the target recording file to obtain a target text file, the target text file including text information, time information, and sound feature information; and send the target text file to a server.

It should be understood that, in the embodiments of the present invention, the processor 501 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.

The input device 502 may include a touchpad, a fingerprint sensor (used to collect the user's fingerprint information and fingerprint direction information), a microphone, and so on; the output device 503 may include a display (an LCD or the like), a loudspeaker, and so on.

The memory 504 may include read-only memory and random access memory, and provides instructions and data to the processor 501. Part of the memory 504 may also include non-volatile random access memory. For example, the memory 504 may also store information about the device type.

In specific implementations, the processor 501, input device 502, and output device 503 described in the embodiments of the present invention can carry out the implementations described in the first and second embodiments of the method for processing a recording file, as well as the implementations of the terminal described in the embodiments of the present invention, which are not repeated here.

Another embodiment of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements: detecting the display duration corresponding to first display content in a first preset area; and, when the display duration exceeds a preset duration, switching the first display content to a second preset area for display.

The computer-readable storage medium may be an internal storage unit of the terminal of any of the foregoing embodiments, such as the terminal's hard disk or memory. The computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card) fitted to the terminal. Further, the computer-readable storage medium may include both an internal storage unit of the terminal and an external storage device. The computer-readable storage medium is used to store the computer program and the other programs and data required by the terminal. It may also be used to temporarily store data that has been output or is about to be output.

Fig. 6 is a block diagram of part of the structure of a mobile phone related to the terminal provided in the embodiments of the present invention. Referring to Fig. 6, the mobile phone includes components such as a radio frequency (Radio Frequency, RF) circuit 610, a memory 620, an input unit 630, a display unit 640, a sensor 650, an audio circuit 660, a Wireless Fidelity (Wi-Fi) module 670, a processor 680, and a power supply 690. Those skilled in the art will understand that the handset structure shown in Fig. 6 does not limit the mobile phone, which may include more or fewer components than shown, combine certain components, or arrange the components differently.

The components of the mobile phone are described in detail below with reference to Fig. 6:

The RF circuit 610 may be used to receive and send signals while information is being sent and received or during a call. In particular, after receiving downlink information from a base station, it passes the information to the processor 680 for processing; in addition, it sends uplink data to the base station. Generally, the RF circuit 610 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier (Low Noise Amplifier, LNA), a duplexer and the like. The RF circuit 610 may also communicate with networks and other devices by wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS) and the like.

The memory 620 may be used to store software programs and modules. The processor 680 runs the software programs and modules stored in the memory 620 to execute the various functional applications and data processing of the mobile phone. The memory 620 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the mobile phone (such as audio data or a phone book). In addition, the memory 620 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.

The input unit 630 may be used to receive input numeric or character information and to generate key signal inputs related to the user settings and function control of the mobile phone. Specifically, the input unit 630 may include a touch panel 631 and other input devices 632. The touch panel 631, also called a touch screen, collects touch operations by the user on or near it (for example, operations performed by the user on or near the touch panel 631 with a finger, a stylus or any other suitable object or accessory) and drives the corresponding connection device according to a preset program. Optionally, the touch panel 631 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into touch point coordinates and sends them to the processor 680, and can also receive and execute commands sent by the processor 680. The touch panel 631 may be implemented in several types, such as resistive, capacitive, infrared and surface acoustic wave. Besides the touch panel 631, the input unit 630 may also include other input devices 632. Specifically, the other input devices 632 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, a joystick and the like.

The display unit 640 may be used to display information input by the user, information provided to the user, and the various menus of the mobile phone. The display unit 640 may include a display panel 641, which may optionally be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED) or the like. Further, the touch panel 631 may cover the display panel 641. When the touch panel 631 detects a touch operation on or near it, it sends the operation to the processor 680 to determine the type of touch event, and the processor 680 then provides the corresponding visual output on the display panel 641 according to the type of touch event. Although in Fig. 6 the touch panel 631 and the display panel 641 are shown as two separate components implementing the input and output functions of the mobile phone, in some embodiments the touch panel 631 and the display panel 641 may be integrated to implement the input and output functions of the mobile phone.

The mobile phone may also include at least one sensor 650, such as a light sensor, a motion sensor or other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 641 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 641 and/or the backlight when the mobile phone is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in all directions (generally three axes) and, when stationary, can detect the magnitude and direction of gravity. It can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors that may also be configured on the mobile phone, such as a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor, are not described further here.

The audio circuit 660, a loudspeaker 661 and a microphone 662 can provide an audio interface between the user and the mobile phone. The audio circuit 660 can convert received audio data into an electrical signal and transmit it to the loudspeaker 661, which converts it into a sound signal for output. Conversely, the microphone 662 converts collected sound signals into electrical signals, which the audio circuit 660 receives and converts into audio data. The audio data is then output to the processor 680 for processing and sent via the RF circuit 610 to, for example, another mobile phone, or output to the memory 620 for further processing.

Wi-Fi is a short-range wireless transmission technology. Through the Wi-Fi module 670, the mobile phone can help the user send and receive e-mail, browse web pages, access streaming media and so on, providing the user with wireless broadband Internet access. Although Fig. 6 shows the Wi-Fi module 670, it will be understood that the module is not an essential part of the mobile phone and may be omitted as needed without changing the essence of the invention.

The processor 680 is the control center of the mobile phone. It connects the various parts of the whole mobile phone through various interfaces and lines, and performs the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 620 and calling the data stored in the memory 620, thereby monitoring the mobile phone as a whole. Optionally, the processor 680 may include one or more processing units. Preferably, the processor 680 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs and the like, and the modem processor mainly handles wireless communication. It will be understood that the modem processor need not be integrated into the processor 680.

The mobile phone also includes a power supply 690 (such as a battery) that powers the components. Preferably, the power supply may be logically connected to the processor 680 through a power management system, so that functions such as charge management, discharge management and power consumption management are implemented through the power management system.

Although not shown, the mobile phone may also include a camera, a Bluetooth module and the like, which are not described further here.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the terminal and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed terminal and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, or may be an electrical, mechanical or other form of connection.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present invention.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.

The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited to them. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the invention, and such modifications or replacements shall all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims

1. A method for processing a recording file, characterized by comprising:

collecting voice data through a voice pickup device to obtain a target recording file;

performing speech recognition on the target recording file to obtain a target text file, the target text file including text information, time information and sound feature information;

sending the target text file to a server.
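
As a non-limiting illustration of the target text file of claim 1, the following minimal sketch shows one way the text information, time information and sound feature information might be packaged before being sent to the server. The JSON encoding, the class names `TextSegment` and `TargetTextFile`, all field names, and the representation of sound features as a list of numbers are assumptions made for illustration; the source does not specify a file format.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class TextSegment:
    text: str                    # text information: the recognized words
    start_s: float               # time information: offset from start of recording
    end_s: float
    voice_features: List[float]  # sound feature information (hypothetical encoding)


@dataclass
class TargetTextFile:
    recorded_at: str             # time information for the whole recording
    segments: List[TextSegment]

    def to_bytes(self) -> bytes:
        """Serialize to UTF-8 JSON, the payload that would be uploaded."""
        return json.dumps(asdict(self), ensure_ascii=False).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload: bytes) -> "TargetTextFile":
        """Rebuild the object from a payload fetched back from the server."""
        raw = json.loads(payload.decode("utf-8"))
        return cls(raw["recorded_at"], [TextSegment(**s) for s in raw["segments"]])
```

Because the payload carries only text and a handful of numbers per segment, it would typically be much smaller than the audio it describes, which is the bandwidth saving the abstract refers to.
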
2. The method according to claim 1, characterized in that, after the sending of the target text file to the server, the method further comprises:

obtaining the target text file stored in the server;

performing speech restoration on the target text file to obtain the target recording file.
3. The method according to claim 1 or 2, characterized in that the performing of speech recognition on the target recording file to obtain the target text file comprises:

dividing the target recording file into N speech segments according to a preset volume threshold, where N is greater than 1 and the N speech segments include a target speech segment;

performing speech recognition on the target speech segment to obtain a text segment of the target speech segment.
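
The division step of claim 3 can be sketched as follows, purely by way of example. The sketch assumes that volume is measured as the root-mean-square (RMS) amplitude of fixed-length frames and that a segment is a run of consecutive frames at or above the threshold; the function names `frame_rms` and `split_by_volume`, the frame length and the threshold value are hypothetical, since the source states only that a preset volume threshold is used.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def split_by_volume(
    samples: Sequence[float], frame_len: int, threshold: float
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs of frames whose RMS
    is at or above the threshold; quieter frames separate the runs."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    segments: List[Tuple[int, int]] = []
    start = None
    for offset in range(0, len(samples), frame_len):
        loud = frame_rms(samples[offset:offset + frame_len]) >= threshold
        if loud and start is None:
            start = offset                    # a loud run begins
        elif not loud and start is not None:
            segments.append((start, offset))  # a loud run ends
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments
```

Each returned index pair could then be passed to the speech recognizer separately, so that silent stretches are never recognized or uploaded.
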
4. The method according to claim 1 or 2, characterized in that, after the obtaining of the target text file, the method further comprises:

obtaining, according to the sound feature information, identity information corresponding to the target recording file;

determining the title of the target text file according to the identity information and the time information of the target recording file.
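
One possible reading of claim 4 is sketched below, under stated assumptions: sound feature information is a numeric vector, identity is found by cosine similarity against a table of previously enrolled speakers, and the title joins the identity and a timestamp. The functions `identify` and `build_title`, the similarity threshold, the `unknown` fallback and the title pattern are all hypothetical and are not taken from the source.

```python
import math
from datetime import datetime
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(
    features: Sequence[float],
    enrolled: Dict[str, Sequence[float]],
    min_similarity: float = 0.8,
) -> str:
    """Return the enrolled name whose features best match, or 'unknown'."""
    best_name, best_score = "unknown", min_similarity
    for name, reference in enrolled.items():
        score = cosine(features, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def build_title(identity: str, recorded_at: datetime) -> str:
    """Combine identity and recording time into a file title."""
    return f"{identity}_{recorded_at:%Y%m%d_%H%M%S}.txt"
```

A title built this way lets the user locate a backed-up recording by speaker and date without opening the file.
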
5. The method according to claim 1 or 2, characterized in that, after the obtaining of the target text file, the method further comprises:

extracting a target text summary of the target text file;

sending the target text summary to the server.
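
The source does not say how the target text summary of claim 5 is extracted. As one hedged possibility, the sketch below uses a simple word-frequency extractive approach: each sentence is scored by the summed frequency of its words across the whole text, and the highest-scoring sentences are kept in their original order. The function name `summarize` and its `max_sentences` parameter are hypothetical, and the raw sum deliberately favors longer sentences.

```python
import re
from collections import Counter


def summarize(text: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, in their original order."""
    # Split on ., ! or ? while keeping the terminator with its sentence.
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]?", text) if s.strip()]
    # Word frequencies over the whole text, case-insensitive.
    freq = Counter(re.findall(r"\w+", text.lower()))
    scored = [
        (sum(freq[w] for w in re.findall(r"\w+", s.lower())), i)
        for i, s in enumerate(sentences)
    ]
    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    keep = sorted(i for _, i in top)
    return " ".join(sentences[i] for i in keep)
```
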
6. A terminal, characterized by comprising:

a collecting unit, configured to collect voice data through a voice pickup device to obtain a target recording file;

a speech recognition unit, configured to perform speech recognition on the target recording file to obtain a target text file, the target text file including text information, time information and sound feature information;

a sending unit, configured to send the target text file to a server.
7. The terminal according to claim 6, characterized in that the terminal further comprises:

a first obtaining unit, configured to obtain the target text file stored in the server;

a speech restoration unit, configured to perform speech restoration on the target text file to obtain the target recording file.
8. The terminal according to claim 6 or 7, characterized in that the terminal further comprises:

a dividing unit, configured to divide the target recording file into N speech segments according to a preset volume threshold, where N is greater than 1 and the N speech segments include a target speech segment;

the speech recognition unit being specifically configured to perform speech recognition on the target speech segment to obtain a text segment of the target speech segment.
9. The terminal according to claim 6 or 7, characterized in that the terminal further comprises:

a second obtaining unit, configured to obtain, according to the sound feature information, identity information corresponding to the target recording file;

a determining unit, configured to determine the title of the target text file according to the identity information and the time information of the target recording file.
10. The terminal according to claim 6 or 7, characterized in that the terminal further comprises:

an extraction unit, configured to extract a target text summary of the target text file;

the sending unit being further configured to send the target text summary to the server.
11. A terminal, characterized by comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being connected to one another, wherein the memory is used to store application program code and the processor is configured to call the program code to perform the method according to any one of claims 1-5.
12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, the computer program including program instructions which, when executed by a processor, cause the processor to perform the method according to any one of claims 1-5.