CN110459239A

CN110459239A - Role analysis method, apparatus and computer readable storage medium based on voice data

Info

Publication number: CN110459239A
Application number: CN201910210501.5A
Authority: CN
Inventors: 朱浩华; 吕嘉威; 曹鹏程
Original assignee: Shenzhen One Secret Technology Co Ltd
Current assignee: Shenzhen One Secret Technology Co Ltd
Priority date: 2019-03-19
Filing date: 2019-03-19
Publication date: 2019-11-15

Abstract

The invention discloses a role analysis method based on voice data. The method comprises: obtaining voice data and angle data corresponding to the voice data; and performing role analysis on the voice data according to the angle data to obtain role data corresponding to the voice data. The invention also discloses a role analysis apparatus based on voice data and a computer-readable storage medium. The invention provides a new method of separating roles in audio that does not require a separate microphone device for each speaker.

Description

Role analysis method, apparatus and computer readable storage medium based on voice data

Technical field

The present invention relates to the field of recording recognition, and in particular to a role analysis method and apparatus based on voice data and a computer-readable storage medium.

Background art

Modern meetings often involve several people speaking. It is therefore increasingly important that the recording device used in a meeting can identify the different roles and the speech content corresponding to each, which can greatly improve meeting efficiency and simplify the taking of minutes.

In the related art, however, role separation in the meeting recording systems used during meetings is mostly achieved by connecting multiple microphone devices, each of which collects the audio of one speaker. Each microphone device is connected to the main device by a cable, so the connection distance is limited and the system is not portable, which is very inconvenient for users.

Summary of the invention

The main purpose of the present invention is to provide a role analysis method and apparatus based on voice data and a computer-readable storage medium, with the aim of providing a new method of separating roles in audio that does not require a separate microphone device for each speaker.

To achieve the above object, the present invention provides a role analysis method based on voice data, comprising the following steps:

Obtaining voice data and angle data corresponding to the voice data;

Performing role analysis on the voice data according to the angle data to obtain role data corresponding to the voice data.

Optionally, before the step of performing role analysis on the voice data according to the angle data to obtain role data corresponding to the voice data, the method includes:

Converting the voice data to obtain text data;

The step of performing role analysis on the voice data according to the angle data to obtain role data corresponding to the voice data is replaced by:

Performing role analysis on the text data according to the angle data to obtain role data corresponding to the text data.

Optionally, the step of performing role analysis on the text data according to the angle data to obtain role data corresponding to the text data includes:

Dividing the text data according to the angle data to obtain sub-text data and sub-angle data corresponding to each piece of sub-text data;

Performing role analysis on the sub-angle data to obtain the plurality of role data corresponding to the sub-text data.

Optionally, the step of dividing the text data according to the angle data to obtain sub-text data and sub-angle data corresponding to each piece of sub-text data includes:

Traversing the angle data to obtain the change nodes of the angle data;

Dividing the text data corresponding to the angle data according to the change nodes to obtain the sub-text data.

Optionally, the step of performing role analysis on the sub-angle data to obtain the plurality of role data corresponding to the sub-text data includes:

Calculating on the sub-angle data using a preset formula to obtain the role data of the sub-text data corresponding to the sub-angle data.

Optionally, the formula are as follows:

Dp=360/N,

R=(d-d₀+ dp-1)/dp,

Wherein, dp indicates that everyone accounts for angled numerical value；R indicates character data；What d expression was read from merging file Angle value, value range are [0,360]；d₀It is the deviation angle angle value of initialization, value range is [0,30]；N is participant Numerical value.

Optionally, the role analysis method based on voice data further comprises the following steps:

Sending the text data and the role data corresponding to the text data to a mobile terminal, so that the mobile terminal displays them.

Receiving a real name input by a user and the correspondence between the real name and the role data;

Replacing the role data with the corresponding real name according to the correspondence, and saving the real name in association with the text data.

In addition, to achieve the above object, the present invention also provides a role analysis apparatus based on voice data. The apparatus includes a memory, a processor, and a role analysis program based on voice data that is stored in the memory and can run on the processor. When the program is executed by the processor, it implements the steps of the role analysis method based on voice data described above.

In addition, to achieve the above object, the present invention also provides a computer-readable storage medium. A role analysis program based on voice data is stored on the computer-readable storage medium, and when the program is executed by a processor, it implements the steps of the role analysis method based on voice data described above.

The present invention provides a role analysis method and apparatus based on voice data and a computer storage medium. In the method, voice data and angle data corresponding to the voice data are obtained, and role analysis is performed on the voice data according to the angle data to obtain role data corresponding to the voice data. In this way, the present invention provides a new way of separating roles in audio information. Because different speakers occupy different angles when they speak, the angle data can be used to identify the multiple roles contained in the voice data, so that the voice data is separated by role and the speaker corresponding to each piece of voice data is identified. This angle-based analysis does not require a dedicated microphone to be installed for each speaker, and it can pick up sound over the full 360 degrees for role separation.

Brief description of the drawings

Fig. 1 is a schematic structural diagram of the apparatus in the hardware operating environment involved in the embodiments of the present invention;

Fig. 2 is a schematic flowchart of a first embodiment of the role analysis method based on voice data of the present invention;

Fig. 3 is a schematic flowchart of a second embodiment of the role analysis method based on voice data of the present invention;

Fig. 4 is a schematic flowchart of a third embodiment of the role analysis method based on voice data of the present invention;

Fig. 5 is a schematic flowchart of a fourth embodiment of the role analysis method based on voice data of the present invention;

Fig. 6 is a schematic flowchart of a fifth embodiment of the role analysis method based on voice data of the present invention;

Fig. 7 is a schematic flowchart of a sixth embodiment of the role analysis method based on voice data of the present invention;

Fig. 8 is a schematic flowchart of a seventh embodiment of the role analysis method based on voice data of the present invention;

Fig. 9 is a detailed flowchart of the fourth embodiment of the role analysis method based on voice data of the present invention;

Fig. 10 is a detailed flowchart of the seventh embodiment of the role analysis method based on voice data of the present invention.

The realization of the object, the functional features and the advantages of the present invention are further described below with reference to the accompanying drawings and the embodiments.

Detailed description of the embodiments

It should be understood that the specific embodiments described here only illustrate the present invention and are not intended to limit it.

Fig. 1 is a schematic structural diagram of the apparatus in the hardware operating environment involved in the embodiments of the present invention.

The terminal of the embodiments of the present invention may be a PC, or a terminal device with data processing capability such as a smartphone, a tablet computer or a portable computer.

As shown in Fig. 1, the terminal may include a processor 1001 (for example a CPU), a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002. The communication bus 1002 implements connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the processor 1001.

Optionally, the terminal may also include a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a Wi-Fi module and so on. The sensors include, for example, light sensors, motion sensors and other sensors. Specifically, the light sensors may include an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display according to the ambient light, and the proximity sensor can turn off the display and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used in applications that recognize the posture of the mobile terminal (such as switching between landscape and portrait, related games and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer or tap detection). The mobile terminal may of course also be equipped with other sensors such as a gyroscope, barometer, hygrometer, thermometer or infrared sensor, which are not described here.

Those skilled in the art will understand that the terminal structure shown in Fig. 1 does not limit the terminal, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.

As shown in Fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module and a role analysis program based on voice data.

In the terminal shown in Fig. 1, the network interface 1004 is mainly used to connect to a background server and exchange data with it; the user interface 1003 is mainly used to connect to a client (user terminal) and exchange data with it; and the processor 1001 may be used to call the role analysis program based on voice data stored in the memory 1005 and perform the following operations:

Obtaining voice data and angle data corresponding to the voice data;

Further, the processor 1001 may call the role analysis program based on voice data stored in the memory 1005 and also perform the following operations:

Converting the voice data to obtain text data;

The formula is:

dp = 360/N,

R = (d - d₀ + dp - 1)/dp,

The specific embodiments of the role analysis device based on voice data of the present invention are essentially the same as the embodiments of the role analysis method based on voice data described below, and are not repeated here.

Referring to Fig. 2, Fig. 2 is a schematic flowchart of the first embodiment of the role analysis method based on voice data of the present invention. The role analysis method based on voice data includes:

Step S100: obtaining voice data and angle data corresponding to the voice data;

In the embodiments of the present invention, the role analysis method based on voice data is suitable for recording and also for other scenarios. During recording, the voice data is collected by a microphone array. The array contains multiple microphones and may take various forms, such as 4 microphones, 4+1 microphones or 6+1 microphones. A single microphone is directional by nature and cannot collect sound from all directions, whereas a microphone array picks up audio over the full 360 degrees, so the voice of a person at any angle can be collected. As a result, the angle data corresponding to the voice data can be obtained at the same time as the voice data is picked up by the array. The collected voice data is saved in WAV format, and the angle data is saved in DIR format.

Step S200: performing role analysis on the voice data according to the angle data to obtain role data corresponding to the voice data.

This embodiment performs role analysis on the voice data using the corresponding angle data: the different angles that speakers occupy relative to the microphone are used to distinguish the multiple roles in the voice data, which achieves role separation. This embodiment can be applied in a Linux x86 environment. The angle data corresponding to the voice data is used to calculate the angular position of the voice data and thereby determine the corresponding role data. For example, during recording the microphone array collects a segment of voice data from person A together with the angle of that sound relative to the microphone. Calculating on the angle data determines the angle from which the segment originates, and hence that the sound source at that angle is role A speaking.

The present invention is a new way of separating roles. Earlier recording devices separate roles by installing multiple microphones, one per speaker, so that different microphones correspond to different roles and the role of a piece of voice data is determined by the microphone it came from. That is, if a segment of voice data comes from microphone 1, its role is 1; if a segment comes from microphone 2, its role is 2. Earlier recording devices therefore determine role data by installing different microphones. The present application does not require multiple microphones to be installed: a single microphone device with a microphone-array function is enough, and the role is determined from the different angles at which that device collects the voice data. This differs from the role separation of earlier recording devices and is a new way of separating roles.

Referring to Fig. 3, Fig. 3 is a schematic flowchart of the second embodiment of the role analysis method based on voice data of the present invention.

Based on the above embodiment, this embodiment further includes:

Step S010: converting the voice data to obtain text data;

In this embodiment, the voice data can be converted to text data by real-time speech-to-text conversion. The text data can be saved in JSON format. In addition to the text obtained from the speech, the text data may contain other associated information, such as the time at which the sound was received. Saving this information makes subsequent processing easier.

Step S210: performing role analysis on the text data according to the angle data to obtain role data corresponding to the text data.

After the voice data has been converted to text data, role analysis can be performed on the text data according to the angle data: the angle corresponding to a segment of text data is calculated, the role speaking at that angle is determined, and the role data of that segment is thereby determined. In this way, this embodiment uses the different angles corresponding to the text data to determine which role spoke each segment, and obtains the text of the speech content corresponding to each role. The text information may also include other information, such as the time at which the segment was spoken. For example, if the angle corresponding to one segment of text data is a, the segment was spoken by role 1; if the angle corresponding to another segment is b, the segment was spoken by role 2.

Referring to Fig. 4, Fig. 4 is a schematic flowchart of the third embodiment of the role analysis method based on voice data of the present invention.

Based on the above embodiment, in this embodiment step S210 includes:

Step S220: dividing the text data according to the angle data to obtain sub-text data and sub-angle data corresponding to each piece of sub-text data;

Specifically, in this embodiment the text data is divided according to changes in its corresponding angle. The text data is split into multiple pieces of sub-text data, and each piece of sub-text data corresponds to exactly one piece of sub-angle data.

Step S230: performing role analysis on the sub-angle data to obtain the plurality of role data corresponding to the sub-text data.

Each piece of sub-text data corresponds to one and only one piece of sub-angle data. Calculating on that angle gives the role speaking at that angle, and hence the role that spoke the piece of sub-text data. The role corresponding to each part of a passage of text can thus be determined, which achieves role separation for the passage. For example, in a passage of text, the first sentence is spoken by role 1, the second and third sentences by role 2, the fourth to ninth sentences by role 3, and so on.

Referring to Fig. 5, Fig. 5 is a schematic flowchart of the fourth embodiment of the role analysis method based on voice data of the present invention.

Based on the above embodiment, in this embodiment step S220 includes:

Step S221: traversing the angle data to obtain the change nodes of the angle data;

This embodiment is one way of implementing the division of the text data according to the angle data to obtain sub-text data and the corresponding sub-angle data. The angle data can be traversed by a computer algorithm to obtain its change nodes. A change node is a point at which the angle data changes, that is, a point at which the angle of the collected sound switches from one angle to another.

Step S222: dividing the text data corresponding to the angle data according to the change nodes to obtain the sub-text data.

The text data corresponding to the angle data is divided at the change nodes into multiple pieces of sub-text data, each of which corresponds to one angle. The angles of different pieces of sub-text data may be the same or different. The role of each piece of sub-text data can then be calculated from its unique angle, which achieves role separation. The detailed process is shown in Fig. 9.
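Steps S221 and S222 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not specify the layout of the DIR or JSON files, so the sketch assumes that the text has already been aligned to one angle reading per word, and that a change node is any index at which the angle differs from the previous reading. The function names `change_nodes` and `split_text` are hypothetical.

```python
from typing import List, Tuple


def change_nodes(angles: List[int]) -> List[int]:
    """Return the indices at which the angle differs from the previous reading."""
    return [i for i in range(1, len(angles)) if angles[i] != angles[i - 1]]


def split_text(words: List[str], angles: List[int]) -> List[Tuple[int, List[str]]]:
    """Split words into segments at the change nodes of the angle data.

    Assumes (hypothetically) one angle reading per word. Each returned
    segment is (sub-angle, sub-text words) and has exactly one angle.
    """
    if len(words) != len(angles):
        raise ValueError("words and angles must have the same length")
    segments: List[Tuple[int, List[str]]] = []
    start = 0
    # Treat the end of the data as a final boundary so the last segment is kept.
    for node in change_nodes(angles) + [len(angles)]:
        if node > start:
            segments.append((angles[start], words[start:node]))
        start = node
    return segments


if __name__ == "__main__":
    demo_words = ["hello", "everyone", "thanks", "ok"]
    demo_angles = [30, 30, 126, 30]
    for angle, segment in split_text(demo_words, demo_angles):
        print(angle, " ".join(segment))
```

In the demo, the first and third segments share the angle 30, which matches the statement that different pieces of sub-text data may have the same angle. Real angle readings are likely to be noisy, so an exact-equality test is a simplification of this sketch.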

Referring to Fig. 6, Fig. 6 is a schematic flowchart of the fifth embodiment of the role analysis method based on voice data of the present invention.

Based on the above embodiment, in this embodiment step S230 includes:

Step S231: calculating on the sub-angle data using a preset formula to obtain the role data of the sub-text data corresponding to the sub-angle data.

In this embodiment, the role corresponding to a piece of sub-text data is determined by calculating on its sub-angle data with the following formula:

dp = 360/N,

R = (d - d₀ + dp - 1)/dp,

This formula separates out a specific role according to the preset number of participants N and the start angle. Both formulas are rounded up: if the fractional part is greater than 0, the integer part is increased by 1, and only the integer part is kept. In the second formula this rounding is carried out by the term dp - 1 in the numerator, so only the integer part of the quotient is taken. For example, with the default d₀ of 6 and N = 4 participants, if the sub-angle data of a piece of sub-text data is 126 degrees, the formula gives dp = 90 and then R = (126 - 6 + 89)/90 = 2. That is, when the sub-angle data is 126 degrees, the corresponding role data is 2. The formula divides the full circle into multiple angular regions, each corresponding to one role, so the region containing a given angle, and hence the role corresponding to that angle, can be determined.
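The formula can be checked with a short sketch. It assumes integer angle values and reads the rounding rule as integer division after adding dp - 1, which is the reading that reproduces the worked example above; the function names `sector_width` and `role_from_angle` are hypothetical.

```python
def sector_width(num_participants: int) -> int:
    """dp = 360 / N, rounded up to an integer."""
    if num_participants <= 0:
        raise ValueError("num_participants must be positive")
    # Adding N - 1 before integer division rounds the quotient up.
    return (360 + num_participants - 1) // num_participants


def role_from_angle(d: int, d0: int, num_participants: int) -> int:
    """R = (d - d0 + dp - 1) / dp, keeping only the integer part."""
    dp = sector_width(num_participants)
    return (d - d0 + dp - 1) // dp


if __name__ == "__main__":
    # Worked example from the text: d0 = 6, N = 4, d = 126 degrees.
    print(sector_width(4))             # 90
    print(role_from_angle(126, 6, 4))  # 2
```

The source does not say how angles at or below d₀ are handled; the sketch simply applies the formula as written, which yields a value of 0 or less in that case.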

Referring to Fig. 7, Fig. 7 is a schematic flowchart of the sixth embodiment of the role analysis method based on voice data of the present invention.

Based on the above embodiment, this embodiment further includes the following step:

Step S300: sending the text data and the role data corresponding to the text data to a mobile terminal, so that the mobile terminal displays them.

In this embodiment, the text data and role data are sent to a mobile terminal, which displays them, so that the results of role separation and text conversion are finally shown on the mobile terminal. The mobile terminal may be a mobile device running iOS or Android. After receiving the text data and role data, the mobile terminal shows the text conversion result and the role separation result on its interface. The displayed content may include the text obtained from the speech, the relative time, the name of the speaking role, the avatar of the speaking role, and so on. These files can also be saved for later operations, such as playing back the recording, locating the text according to playback progress, locating the playback position according to the text, and re-transcribing the recording file. Synchronizing to the mobile terminal in this way makes the result of role separation visual and intuitive, so the user can see it directly.
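As an illustration of what might be sent to the mobile terminal, the sketch below serializes role-separated segments to JSON. The source names the displayed items (text, relative time, role name) but defines no message format, so every field name here (`segments`, `role`, `relative_time_ms`, `text`) and the function `build_display_payload` are assumptions made for the example.

```python
import json
from typing import List, Tuple


def build_display_payload(segments: List[Tuple[int, int, str]]) -> str:
    """Serialize (role, relative time in ms, text) triples to a JSON string.

    The field names are hypothetical; the source does not define a format.
    """
    items = [
        {"role": role, "relative_time_ms": time_ms, "text": text}
        for role, time_ms, text in segments
    ]
    # ensure_ascii=False keeps non-ASCII text (for example Chinese) readable.
    return json.dumps({"segments": items}, ensure_ascii=False)


if __name__ == "__main__":
    print(build_display_payload([(1, 0, "Good morning"), (2, 1800, "Hello")]))
```

Keeping the numeric role identifier in the payload, rather than a display name, leaves room for the name to be edited later on the terminal.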

Referring to Fig. 8, Fig. 8 is a schematic flowchart of the seventh embodiment of the role analysis method based on voice data of the present invention.

Step S400: receiving a real name input by a user and the correspondence between the real name and the role data;

In this embodiment, after the text data and its corresponding role data have been obtained, the role data can be corrected to the real name or nickname of the actual speaker, which makes the result of role separation more intuitive. For example, once it is known that the role speaking a certain passage is 1, the terminal receives the real name input by the user, such as Huahua, together with the information that the role data corresponding to the name Huahua is 1.

Step S500: replacing the role data with the corresponding real name according to the correspondence, and saving the real name in association with the text data.

After receiving the real name input by the user, such as Huahua, and the information that the role data corresponding to Huahua is 1, role data 1 is replaced with the real name Huahua, so that the speaker of the passage is shown as Huahua and the spoken text is matched one-to-one with the real name. The user edits and sets a specific name, and that name is saved in association with the role data R described above, which achieves the correction to the real name. An icon color or avatar can also be saved in association with the role data R, based on the original preset icon color or avatar, so that the information about the speaker of each passage is more complete and intuitive. The detailed process is shown in Fig. 10.
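Steps S400 and S500 amount to a lookup from role data to a user-supplied name. The sketch below assumes that role-separated text is held as (role, text) pairs and that the correspondence arrives as a dictionary; both representations, the fallback label and the function name `apply_real_names` are hypothetical.

```python
from typing import Dict, List, Tuple


def apply_real_names(
    segments: List[Tuple[int, str]], names: Dict[int, str]
) -> List[Tuple[str, str]]:
    """Replace each role number with the real name the user assigned to it.

    Roles with no assigned name keep a generic label, so no text is lost.
    """
    return [(names.get(role, f"Role {role}"), text) for role, text in segments]


if __name__ == "__main__":
    transcript = [(1, "Good morning"), (2, "Hello"), (1, "Let us begin")]
    for speaker, text in apply_real_names(transcript, {1: "Huahua"}):
        print(f"{speaker}: {text}")
```

Looking the name up at display time, instead of overwriting the stored role number, is one way to keep the association with R intact if the user later edits the name.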

In addition, the embodiments of the present invention also propose a computer-readable storage medium.

A role analysis program based on voice data is stored on the computer-readable storage medium of the present invention, and when the program is executed by a processor, it implements the steps of the role analysis method based on voice data described above.

For the method implemented when the role analysis program based on voice data is run on the processor, refer to the embodiments of the role analysis method based on voice data of the present invention, which are not repeated here.

It should be noted that, in this document, the terms "include" and "comprise" and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article or system. Unless further restricted, an element introduced by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or system that includes the element.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied as a software product. The software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk or optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device or the like) to execute the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not limit its scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims

1. A role analysis method based on voice data, characterized in that the role analysis method based on voice data comprises the following steps:

Obtaining voice data and angle data corresponding to the voice data;

Performing role analysis on the voice data according to the angle data to obtain role data corresponding to the voice data.

2. The role analysis method based on voice data according to claim 1, characterized in that, before the step of performing role analysis on the voice data according to the angle data to obtain role data corresponding to the voice data, the method includes:

Converting the voice data to obtain text data;

The step of performing role analysis on the voice data according to the angle data to obtain role data corresponding to the voice data is replaced by:

Performing role analysis on the text data according to the angle data to obtain role data corresponding to the text data.

3. The role analysis method based on voice data according to claim 2, characterized in that the step of performing role analysis on the text data according to the angle data to obtain role data corresponding to the text data includes:

Dividing the text data according to the angle data to obtain sub-text data and sub-angle data respectively corresponding to the sub-text data;

Performing role analysis on the sub-angle data to obtain the plurality of role data corresponding to the sub-text data.

4. The role analysis method based on voice data according to claim 3, characterized in that the step of dividing the text data according to the angle data to obtain sub-text data and sub-angle data corresponding to the sub-text data includes:

Dividing the text data corresponding to the angle data according to the change nodes to obtain the sub-text data.

5. The role analysis method based on voice data according to claim 3, characterized in that the step of performing role analysis on the sub-angle data to obtain the plurality of role data corresponding to the sub-text data includes:

Calculating on the sub-angle data using a preset formula to obtain the role data of the sub-text data corresponding to the sub-angle data.

6. The role analysis method based on voice data according to claim 5, characterized in that the formula is:

dp = 360/N,

R = (d - d₀ + dp - 1)/dp,

Here, dp is the angle occupied by each person; R is the role data; d is the angle value read from the merged file, in the range [0, 360]; d₀ is the initial offset angle, in the range [0, 30]; and N is the number of participants.

7. The role analysis method based on voice data according to claim 2, characterized in that the role analysis method based on voice data further comprises the following step:

Sending the text data and the role data corresponding to the text data to a mobile terminal, so that the mobile terminal displays them.

8. The role analysis method based on voice data according to claim 2, characterized in that the role analysis method based on voice data further comprises the following step:

Replacing the role data with the corresponding real name according to the correspondence, and saving the real name in association with the text data.

9. A role analysis apparatus based on voice data, characterized in that the role analysis apparatus based on voice data includes a memory, a processor, and a role analysis program based on voice data that is stored in the memory and can run on the processor, wherein the program, when executed by the processor, implements the steps of the role analysis method based on voice data according to any one of claims 1 to 8.

10. A computer-readable storage medium, characterized in that a role analysis program based on voice data is stored on the computer-readable storage medium, and the program, when executed by a processor, implements the steps of the role analysis method based on voice data according to any one of claims 1 to 8.