Method and device for voice recording
Technical field
Embodiments of the present invention relate to the field of intelligent terminals, and in particular to a method and device for voice recording.
Background
With the rapid development of wireless mobile communication technology, voice communication over wireless mobile communication devices has become an essential mode of communication for users.
Users frequently encounter the following scenario when making or receiving calls on call terminals such as mobile phones and tablets: Party A tells Party B some important information over the phone. For example, in a conference call a leader conveys important information to a subordinate through the call terminal, and the subordinate needs to record what was conveyed and organize it into text. Because human memory is limited, some important content is easily forgotten, which causes trouble for subsequent work.
In the prior art, to solve this problem the user can use the recording function built into the call terminal to record the call content and then organize the recording into text. This approach has the following drawback: even though the call has been recorded, the user still has to take pen and paper while replaying the recording later, which remains inconvenient.
Summary of the invention
Embodiments of the present invention provide a method and device for voice recording that make the recording of speech more convenient.
In a first aspect, an embodiment of the present invention provides a voice recording method, the method comprising:
acquiring, while a first collecting unit is in a working state, first audio information and a first audio information generation time, and forming, while the first collecting unit is in an idle state, text data corresponding to the first audio information from the first audio information;
acquiring, while a second collecting unit is in a working state, second audio information and a second audio information generation time, and forming, while the second collecting unit is in an idle state, text data corresponding to the second audio information from the second audio information;
sorting, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
Further, the method also includes:
establishing a correspondence between the text data corresponding to the first audio information and the first audio information generation time;
establishing a correspondence between the text data corresponding to the second audio information and the second audio information generation time.
Further, sorting, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file, includes:
after the first collecting unit and/or the second collecting unit has entered the idle state and a preset time has elapsed, sorting the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
Further, the method also includes:
marking wrong text data in the text data, and establishing a mapping relationship between the marked wrong text data and the audio information corresponding to the wrong text data, wherein the text data includes correct text data and wrong text data.
In a second aspect, an embodiment of the present invention further provides a voice recording device, the device including:
a first text data forming module, configured to acquire, while a first collecting unit is in a working state, first audio information and a first audio information generation time, and to form, while the first collecting unit is in an idle state, text data corresponding to the first audio information from the first audio information;
a second text data forming module, configured to acquire, while a second collecting unit is in a working state, second audio information and a second audio information generation time, and to form, while the second collecting unit is in an idle state, text data corresponding to the second audio information from the second audio information;
a text file forming module, configured to sort, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
Further, the device also includes:
a first correspondence establishing module, configured to establish a correspondence between the text data corresponding to the first audio information and the first audio information generation time;
a second correspondence establishing module, configured to establish a correspondence between the text data corresponding to the second audio information and the second audio information generation time.
Further, the text file forming module is specifically configured to:
after the first collecting unit and/or the second collecting unit has entered the idle state and a preset time has elapsed, sort the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
Further, the device also includes:
a mapping relationship establishing module, configured to mark wrong text data in the text data and to establish a mapping relationship between the marked wrong text data and the audio information corresponding to the wrong text data, wherein the text data includes correct text data and wrong text data.
In the embodiments of the present invention, first audio information and a first audio information generation time are acquired while a first collecting unit is in a working state, and text data corresponding to the first audio information is formed from the first audio information while the first collecting unit is in an idle state; second audio information and a second audio information generation time are acquired while a second collecting unit is in a working state, and text data corresponding to the second audio information is formed from the second audio information while the second collecting unit is in an idle state; and, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information are sorted according to the first audio information generation time and the second audio information generation time to form a text file. This avoids the situation in which important content of a voice call is forgotten because memory is limited, and also avoids the cumbersome process in which the call is recorded and the recording party must take pen and paper while replaying the recording to organize its content into text. The recording of speech thus becomes more convenient, and user experience is improved.
Brief description of the drawings
Fig. 1 is a flowchart of a voice recording method in embodiment one of the present invention;
Fig. 2 is a flowchart of a voice recording method in embodiment two of the present invention;
Fig. 3 is a schematic structural diagram of a voice recording device in embodiment three of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1 is a flowchart of a voice recording method provided by embodiment one of the present invention. This embodiment is applicable to voice recording scenarios. The method can be performed by the voice recording device provided by an embodiment of the present invention, and the device can be implemented in software and/or hardware. As shown in Fig. 1, the method includes the following steps:
S110: acquire, while a first collecting unit is in a working state, first audio information and a first audio information generation time, and form, while the first collecting unit is in an idle state, text data corresponding to the first audio information from the first audio information.
The first collecting unit may be in the working state while two users carry out an ordinary call, or while two users carry out a voice call through a communication application.
The first audio information is the audio information produced by the user making the voice call. The first audio information generation time may be the time at which the first collecting unit detects that the user has started producing audio information, or the time at which the first collecting unit detects that the user has finished producing it; this embodiment is not limited in this respect.
The first collecting unit being in the idle state refers to the state in which, the user not producing any audio information, the first collecting unit has no audio information to collect.
S120: acquire, while a second collecting unit is in a working state, second audio information and a second audio information generation time, and form, while the second collecting unit is in an idle state, text data corresponding to the second audio information from the second audio information.
The second collecting unit may be in the working state while two users carry out an ordinary call, or while two users carry out a voice call through a communication application.
The second audio information is the audio information produced by the other user in the voice call. The second audio information generation time may be the time at which the second collecting unit detects that the other user has started producing audio information, or the time at which the second collecting unit detects that the other user has finished producing it; this embodiment is not limited in this respect.
The second collecting unit being in the idle state refers to the state in which, the other user in the voice call not producing any audio information, the second collecting unit has no audio information to collect.
In a specific example, user A and user B carry out a call. The first collecting unit is on user A's side and the second collecting unit is on user B's side. While user A is speaking, the first collecting unit is in the working state: it collects user A's speech and records the time at which the speech was collected. When user A finishes speaking, user B speaks and the second collecting unit enters the working state: it collects user B's speech and records the time at which it was collected, and the idle period while user B is speaking is used to convert user A's speech into text. When user B finishes and user A speaks again, the idle period while user A is speaking is used to convert user B's speech into text.
S130: sort, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
The text file records the content of each user's speech in the order of the times at which the users spoke.
Specifically, if the first collecting unit is in the idle state, the text data are sorted to form a text file; if the second collecting unit is in the idle state, the text data are sorted to form a text file; and if both the first collecting unit and the second collecting unit are in the idle state, the text data are sorted to form a text file.
Optionally, the method also includes:
establishing a correspondence between the text data corresponding to the first audio information and the first audio information generation time;
establishing a correspondence between the text data corresponding to the second audio information and the second audio information generation time.
Specifically, the correspondence between the text data corresponding to the first audio information and the first audio information generation time, and the correspondence between the text data corresponding to the second audio information and the second audio information generation time, may be established in advance; a text file is then formed according to these correspondences.
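One simple way to hold these correspondences (a sketch under the assumption that text data and times live in ordinary in-memory structures) is a list of (generation_time, text) pairs per collecting unit:

```python
# Per-unit correspondences between text data and generation times,
# held as (generation_time, text) pairs.
first_unit = []   # records for the first audio information
second_unit = []  # records for the second audio information

def record(unit, generation_time, text):
    # Establish the correspondence as soon as the text data is formed.
    unit.append((generation_time, text))

record(first_unit, 0.0, "hello")
record(second_unit, 2.5, "hi there")
record(first_unit, 6.1, "how have you been?")

# The stored pairs carry enough information for the later sorting step
# to interleave both units' text data chronologically.
print(first_unit)  # [(0.0, 'hello'), (6.1, 'how have you been?')]
```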
Optionally, sorting, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file, includes:
after the first collecting unit and/or the second collecting unit has entered the idle state and a preset time has elapsed, sorting the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
The preset time is the interval from the start of audio information collection to the point at which the text data are sorted into a text file. The preset time may be set from empirical values, or it may be a value chosen according to subjective preference; this embodiment is not limited in this respect.
Specifically, if the first collecting unit is in the idle state and the preset time has elapsed, the text data are sorted to form a text file; if the second collecting unit is in the idle state and the preset time has elapsed, the text data are sorted to form a text file; and if both the first collecting unit and the second collecting unit are in the idle state and the preset time has elapsed, the text data are sorted to form a text file.
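The trigger condition just described can be sketched as a small predicate. The function name, the 3-second value, and the use of `None` to mean "still working" are all illustrative assumptions, not details fixed by the embodiment:

```python
import time

PRESET_TIME = 3.0  # seconds; an assumed value, configurable in practice

def should_form_text_file(first_idle_since, second_idle_since, now=None):
    """Return True once either collecting unit has been idle for the preset
    time. A value of None means that unit is still in the working state."""
    if now is None:
        now = time.monotonic()
    return any(
        idle_since is not None and now - idle_since >= PRESET_TIME
        for idle_since in (first_idle_since, second_idle_since)
    )

# The first unit went idle 5 s ago while the second is still working,
# so the preset time has elapsed and sorting may begin.
print(should_form_text_file(95.0, None, now=100.0))  # True
```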
In a specific example, a mode called "voice notes" is added to the call process: the recorded audio information is converted into text by a speech recognition function, and the time of each utterance is marked. During a voice call, after the voice-notes mode is selected, the system divides the work into two parts. One part is the ordinary recording function, which directly records the sound heard; the other part performs noise processing on the recorded audio file in the background (removing sound that is not speech) and, once speech has been collected, converts it into text through speech recognition. For example, users A and B carry out a voice call. Each utterance in the call carries a time, and the dialogue is recorded and ordered chronologically. If user A speaks first, the first collecting unit collects user A's speech, records the time at which user A spoke, and records user A's words in text form; when user A finishes and user B starts to speak, the second collecting unit collects user B's speech, records the time at which user B spoke, and records user B's words in text form. The text data corresponding to the audio information are then sorted in time order to form a text file, for example: user A: ..., user B: ..., user A: ..., user B: ..., user A: ....
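The final sorting step that produces this "user A: ..., user B: ..." layout can be sketched as follows; the record shape (generation_time, text) and the speaker labels are illustrative assumptions:

```python
def form_text_file(first_records, second_records):
    # Merge both parties' (generation_time, text) records and sort them
    # by generation time, producing the transcript layout of the example
    # above. Python's sort orders the tuples by their first element.
    merged = sorted(
        [(t, "user A", text) for t, text in first_records]
        + [(t, "user B", text) for t, text in second_records]
    )
    return "\n".join(f"{speaker}: {text}" for _, speaker, text in merged)

first = [(0.0, "hello"), (6.1, "shall we meet tomorrow?")]
second = [(2.5, "hi there")]
print(form_text_file(first, second))
# user A: hello
# user B: hi there
# user A: shall we meet tomorrow?
```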
In the technical scheme of this embodiment, first audio information and a first audio information generation time are acquired while a first collecting unit is in a working state, and text data corresponding to the first audio information is formed from the first audio information while the first collecting unit is in an idle state; second audio information and a second audio information generation time are acquired while a second collecting unit is in a working state, and text data corresponding to the second audio information is formed from the second audio information while the second collecting unit is in an idle state; and, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information are sorted according to the first audio information generation time and the second audio information generation time to form a text file. This avoids the situation in which important content of a voice call is forgotten because memory is limited, and also avoids the cumbersome process in which the call is recorded and the user must take pen and paper while replaying the recording to organize its content into text. The recording of speech thus becomes more convenient, and user experience is improved.
Embodiment two
Fig. 2 is a flowchart of a voice recording method provided by embodiment two of the present invention. This embodiment is optimized on the basis of embodiment one and provides a preferred voice recording method; specifically, the method also includes: marking wrong text data in the text data, and establishing a mapping relationship between the marked wrong text data and the audio information corresponding to the wrong text data, wherein the text data includes correct text data and wrong text data.
Accordingly, the method of this embodiment includes the following steps:
S210: acquire, while a first collecting unit is in a working state, first audio information and a first audio information generation time, and form, while the first collecting unit is in an idle state, text data corresponding to the first audio information from the first audio information.
S220: acquire, while a second collecting unit is in a working state, second audio information and a second audio information generation time, and form, while the second collecting unit is in an idle state, text data corresponding to the second audio information from the second audio information.
S230: sort, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
S240: mark wrong text data in the text data, and establish a mapping relationship between the marked wrong text data and the audio information corresponding to the wrong text data, wherein the text data includes correct text data and wrong text data.
Specifically, when the wrong text data is clicked, speech conversion software recognizes again the audio information linked to the wrong text data, and the re-recognized text information is displayed in the document in an editable form.
Specifically, when the text data are sorted, the text data include both correct text data and wrong text data. The present invention therefore marks the wrong text data; for example, the wrong text data may be marked with a red underline, with a changed font color, or by means of an annotation. At the same time, the marked wrong text data is associated and linked with the audio information corresponding to the wrong text data. When the wrong text data is clicked, the audio information corresponding to the wrong text data is recognized again, and the re-recognized text information is displayed in the document in an editable form. In this way, the wrong text data can be corrected in the editable display to obtain corrected text information, and the corrected text information replaces the wrong text data.
In a specific example, the content of the first audio information is "Hello, let's go for a walk in the park together". After the first audio information is recognized and converted, parts of the resulting text, such as the greeting and "walk in the park", are rendered as similar-sounding but incorrect words (homophone errors in the original Chinese); these are wrong text data, so when the text file is formed, the wrong text data are marked. The wrong text data can then be corrected manually. The method of correction is to click the marked wrong text data in the text file; because the wrong text data is associated and linked with the first audio information, the first audio information is recognized and converted again, and the result is displayed in the text in an editable form, for example as a list of similar-sounding candidate words. If a correct word appears among the candidates, the user can click it directly, and the selected word then replaces the wrong text data. If the candidates contain no correct word, the user first clicks the closest candidate, after which further words matching that candidate are displayed automatically, and the user selects again; in this way the correction of the wrong text information is completed. Alternatively, the audio information corresponding to the wrong text data may be embedded in the text file, so that the user clicks the audio information, listens to it, and directly corrects the wrong text data by hand.
In the technical scheme of this embodiment, wrong text data in the text data are marked, and a mapping relationship is established between the marked wrong text data and the audio information corresponding to the wrong text data. This makes the recording of speech more convenient and more accurate, and improves user experience.
Embodiment three
Fig. 3 is a schematic structural diagram of a voice recording device provided by embodiment three of the present invention. This embodiment is applicable to voice recording scenarios, and the device can be implemented in software and/or hardware. As shown in Fig. 3, the voice recording device includes a first text data forming module 310, a second text data forming module 320, and a text file forming module 330.
The first text data forming module 310 is configured to acquire, while a first collecting unit is in a working state, first audio information and a first audio information generation time, and to form, while the first collecting unit is in an idle state, text data corresponding to the first audio information from the first audio information.
The second text data forming module 320 is configured to acquire, while a second collecting unit is in a working state, second audio information and a second audio information generation time, and to form, while the second collecting unit is in an idle state, text data corresponding to the second audio information from the second audio information.
The text file forming module 330 is configured to sort, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
Optionally, the device also includes:
a first correspondence establishing module, configured to establish a correspondence between the text data corresponding to the first audio information and the first audio information generation time;
a second correspondence establishing module, configured to establish a correspondence between the text data corresponding to the second audio information and the second audio information generation time.
Optionally, the text file forming module 330 is specifically configured to:
after the first collecting unit and/or the second collecting unit has entered the idle state and a preset time has elapsed, sort the text data corresponding to the first audio information and the text data corresponding to the second audio information according to the first audio information generation time and the second audio information generation time, to form a text file.
Optionally, the device also includes:
a mapping relationship establishing module, configured to mark wrong text data in the text data and to establish a mapping relationship between the marked wrong text data and the audio information corresponding to the wrong text data, wherein the text data includes correct text data and wrong text data.
In the technical scheme of this embodiment, first audio information and a first audio information generation time are acquired while a first collecting unit is in a working state, and text data corresponding to the first audio information is formed from the first audio information while the first collecting unit is in an idle state; second audio information and a second audio information generation time are acquired while a second collecting unit is in a working state, and text data corresponding to the second audio information is formed from the second audio information while the second collecting unit is in an idle state; and, while the first collecting unit and/or the second collecting unit is in the idle state, the text data corresponding to the first audio information and the text data corresponding to the second audio information are sorted according to the first audio information generation time and the second audio information generation time to form a text file. This avoids the situation in which important content of a voice call is forgotten because memory is limited, and also avoids the cumbersome process in which the call is recorded and the user must take pen and paper while replaying the recording to organize its content into text. The recording of speech thus becomes more convenient, and user experience is improved.
The above device can perform the method provided by any embodiment of the present invention, and has the functional modules corresponding to the performed method together with its beneficial effects.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will appreciate that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to the above embodiments; without departing from the inventive concept, it may also include more other equivalent embodiments, and the scope of the present invention is determined by the scope of the appended claims.