CN108597521A - Interactive system, method, terminal and medium for conversation-audio role segmentation and text recognition - Google Patents
Interactive system, method, terminal and medium for conversation-audio role segmentation and text recognition
- Publication number
- CN108597521A CN108597521A CN201810421520.8A CN201810421520A CN108597521A CN 108597521 A CN108597521 A CN 108597521A CN 201810421520 A CN201810421520 A CN 201810421520A CN 108597521 A CN108597521 A CN 108597521A
- Authority
- CN
- China
- Prior art keywords
- role
- data stream
- voice data
- module
- audio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention discloses an interactive system for conversation-audio role segmentation and text recognition, comprising a server and a user terminal. The server comprises a speech processing module, a speech-to-text recognition module and an output module. The speech processing module is configured to play back the conversation audio data stream to be recognized, obtain the user terminal's role-assignment operations on the speech and identify the assigned speech roles, mark the audio data stream by role, and segment out the audio data streams corresponding to the different roles according to the role marks. The speech-to-text recognition module is configured to recognize the audio data streams of the different roles as text information, and the output module is configured to output the text information. The server marks and segments the audio according to the role distinctions made at the user terminal, then converts the segmented audio data streams into corresponding text information for output, thereby automatically segmenting the conversation audio of different roles and converting it to text, and realizing conversation-audio role segmentation and text recognition quickly, efficiently and accurately.
Description
Technical field
The present invention relates to the field of audio recognition technology, and in particular to an interactive system, method, terminal and medium for conversation-audio role segmentation and text recognition.
Background technology
Existing technologies that automatically identify conversational roles and perform speech segmentation and role attribution still suffer from limited precision; misrecognition and inaccurate cutting inevitably occur, so manual cutting and role assignment are still needed for fine adjustment. The existing interactive mode for manual audio segmentation mainly consists of setting start and end split points within a section of audio and then extracting that audio; it can neither automatically assign roles to the segmented dialogue nor simultaneously convert the speech into text. In other words, no current interaction mode integrates the functions of segmenting speech, assigning the role each utterance belongs to, and converting the speech into text, so the operation is inefficient.
Invention content
In view of the defects in the prior art, one object of the present invention is to provide an interactive system for conversation-audio role segmentation and text recognition that automatically segments the conversation audio of different roles and converts it into text, realizing conversation-audio role segmentation and text recognition quickly, efficiently and accurately.
In a first aspect, an embodiment of the present invention provides an interactive system for conversation-audio role segmentation and text recognition, comprising a server and a user terminal. The server receives the conversation audio data stream to be recognized sent by the user terminal. The server comprises a speech processing module, a speech-to-text recognition module and an output module. The speech processing module is configured to play back the conversation audio data stream to be recognized, obtain the user terminal's role-assignment operations on the speech and identify the assigned speech roles, mark the audio data stream by role, and segment out the audio data streams corresponding to the different roles according to the role marks. The speech-to-text recognition module is configured to recognize the audio data streams of the different roles as text information. The output module is configured to output the text information.
Optionally, the speech processing module includes a speech playing module configured to play the conversation audio data stream to be recognized.
Optionally, the speech processing module further includes a role marking module configured to mark the played audio data stream by role according to the speech-role assignment information, and to record the time point of the audio data stream corresponding to each role mark.
Optionally, the speech processing module further includes a speech segmentation module configured to split the audio data streams of adjacent time points that are marked as different roles, while leaving unsplit the adjacent audio data streams of adjacent time points that are marked as the same role, thereby segmenting out the audio data stream corresponding to each role.
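As a minimal illustration of the segmentation rule just described (split where adjacent marks carry different roles, do not split where they carry the same role), the following Python sketch groups time-stamped role marks into per-role segments. The data shapes and function name are assumptions for illustration, not part of the claimed system:

```python
def segment_by_role(marks):
    """Group (time_point, role) marks into contiguous segments.

    Adjacent marks with different roles start a new segment (split);
    adjacent marks with the same role are merged into one segment.
    Each resulting segment is (role, first_time, last_time).
    """
    segments = []
    for time_point, role in marks:
        if segments and segments[-1][0] == role:
            # Same role as the previous mark: extend the segment, no split.
            prev_role, start, _ = segments[-1]
            segments[-1] = (prev_role, start, time_point)
        else:
            # Role changed: split here and open a new segment.
            segments.append((role, time_point, time_point))
    return segments


# Marks recorded as the user presses role keys A/B while listening.
marks = [(1.2, "A"), (3.5, "A"), (5.0, "B"), (7.4, "A")]
print(segment_by_role(marks))
# [('A', 1.2, 3.5), ('B', 5.0, 5.0), ('A', 7.4, 7.4)]
```

Note that the two consecutive A marks are merged into one segment, exactly as the module description requires for adjacent audio marked with the same role.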
In a second aspect, an embodiment of the present invention provides an interaction method for audio role segmentation and text recognition, specifically comprising the following steps:
The server receives and obtains the conversation audio data stream to be recognized sent by the user terminal;
The server obtains the user terminal's editing request for the conversation audio data stream to be recognized;
The server plays back the conversation audio data stream to be recognized;
The server obtains the user terminal's role-assignment operations on the speech and identifies the assigned speech roles, marks the conversation audio data stream by role according to the role assignment, and records the time point of the audio data stream corresponding to each role mark;
The server segments out the audio data streams corresponding to the different roles according to the role marks;
The server recognizes the audio data streams corresponding to the different roles and converts them into text information;
The server outputs the text information.
Optionally, the specific method by which the server segments out the audio data streams corresponding to the different roles according to the role marks includes: splitting the audio data streams of adjacent time points that are marked as different roles, while performing no splitting on the adjacent audio data streams of adjacent time points that are marked as the same role.
In a third aspect, an embodiment of the present invention provides a mobile terminal comprising a processor, an input device, an output device and a memory, which are connected to one another. The memory is used to store a computer program comprising program instructions, and the processor is configured to call the program instructions to execute the above method.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program comprising program instructions which, when executed by a processor, cause the processor to execute the above method.
Beneficial effects of the present invention:
The interactive system, method, terminal and medium for conversation-audio role segmentation and text recognition provided by the embodiments of the present invention obtain the user's role distinctions by capturing the user's interactive operation gestures on the user terminal. The server marks and segments the conversation audio data stream according to the role distinctions made at the user terminal, then converts the segmented audio data streams into corresponding text information for output, thereby automatically segmenting the conversation audio of different roles and converting it to text, and realizing conversation-audio role segmentation and text recognition quickly, efficiently and accurately.
Description of the drawings
In order to more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings needed for describing them are briefly introduced below. Throughout the drawings, similar elements or parts are generally identified by similar reference numerals, and the elements or parts are not necessarily drawn to scale.
Fig. 1 shows a functional block diagram of a first embodiment of the interactive system for conversation-audio role segmentation and text recognition provided by the present invention;
Fig. 2 shows a functional block diagram of a second embodiment of the interactive system for conversation-audio role segmentation and text recognition provided by the present invention;
Fig. 3 shows a flow chart of a first embodiment of the interaction method for conversation-audio role segmentation and text recognition provided by the present invention;
Fig. 4 shows a structural schematic diagram of a first embodiment of the mobile terminal provided by the present invention.
Specific implementation mode
Embodiments of the technical solution of the present invention are described in detail below with reference to the drawings. The following embodiments are only used to clearly illustrate the technical solution of the present invention; they are intended only as examples and cannot limit the protection scope of the present invention.
It should be noted that, unless otherwise indicated, the technical or scientific terms used in this application have the ordinary meaning understood by those of ordinary skill in the art to which the present invention belongs.
As shown in Fig. 1, which is a functional block diagram of a first embodiment of the interactive system for conversation-audio role segmentation and text recognition provided by the present invention, the system includes a server 1 and a user terminal 2. The server 1 receives the conversation audio data stream to be recognized sent by the user terminal 2. The server 1 includes a speech processing module 11, a speech-to-text recognition module 12 and an output module 13. The speech processing module 11 is configured to play back the conversation audio data stream to be recognized, obtain the user terminal 2's role-assignment operations on the speech and identify the assigned speech roles, mark the audio data stream by role, and segment out the audio data streams corresponding to the different roles according to the role marks. The speech-to-text recognition module 12 is configured to recognize the audio data streams of the different roles as text information. The output module 13 is configured to output the text information.
The user terminal sends the conversation audio data stream to be recognized to the server, and the server receives and obtains it; here the conversation audio is a dialogue speech segment between two roles, A and B. The user sends a request through the user terminal to edit the conversation audio to be recognized, and the server feeds a conversation-audio editing page back to the user terminal. The speech processing module of the server plays back the conversation audio data stream to be recognized while the user judges the conversational roles. When the user has heard out a sentence and judges that it was spoken by A, the user presses the A-role control key on the speech editing page of the user terminal, and the speech processing module marks the conversational role of this section of the audio data stream as role A. The user continues to play the conversation audio data stream; on hearing out a sentence and judging that it was spoken by B, the user presses the B-role control key on the editing page of the user terminal, and the speech processing module marks the conversational role of this section of the audio data stream as role B. Playback then continues and roles are marked in the same way. After the conversation audio has finished playing, the speech processing module splits the audio data marked as different roles. When the user presses the speech-to-text control key, the speech-to-text recognition module performs speech recognition on the segmented audio data streams and converts each into text, identifying the text information corresponding to the speech, and the output module outputs the recognized text information.
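The key-press interaction above can be sketched in a few lines of Python: each press of a role control key marks the end of the sentence just heard, so a role's section runs from the previous press (or the start of the audio) to the current press. The function name and tuple layout are illustrative assumptions, not the patent's API:

```python
def marks_to_segments(presses):
    """Convert role key presses (press_time, role) into (role, start, end)
    sections. Each press marks the end of the sentence just heard, so a
    section runs from the previous press time (or 0.0) to this press."""
    segments, prev_end = [], 0.0
    for press_time, role in presses:
        segments.append((role, prev_end, press_time))
        prev_end = press_time
    return segments


# User presses A after 2.8 s, B after 6.1 s, A again after 9.0 s.
print(marks_to_segments([(2.8, "A"), (6.1, "B"), (9.0, "A")]))
# [('A', 0.0, 2.8), ('B', 2.8, 6.1), ('A', 6.1, 9.0)]
```

This matches the recorded time points the embodiment describes: the server stores the moment each role control key was pressed and later uses those moments as segment boundaries.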
In the interactive system for conversation-audio role segmentation and text recognition of this embodiment of the present invention, the user's role distinctions are obtained by capturing the user's interactive operation gestures on the user terminal. The server marks and segments the audio according to the role distinctions made at the user terminal, then converts the segmented audio data streams into corresponding text information for output, thereby automatically segmenting the conversation audio of different roles and converting it to text, and realizing conversation-audio role segmentation and text recognition quickly, efficiently and accurately.
As shown in Fig. 2, which is a functional block diagram of a second embodiment of the interactive system for conversation-audio role segmentation and text recognition provided by the present invention, this embodiment differs from the first in that the speech processing module 11 includes a speech playing module 111, a role marking module 112 and a speech segmentation module 113. The speech playing module 111 is configured to play the conversation audio data stream to be recognized. The role marking module 112 is configured to mark the played audio data stream by role according to the speech-role assignment information, and to record the time point of the audio data stream corresponding to each role mark. The speech segmentation module 113 is configured to split the audio data streams of adjacent time points that are marked as different roles, while leaving unsplit the adjacent audio data streams of adjacent time points that are marked as the same role, thereby segmenting out the audio data stream corresponding to each role.
The user terminal sends the conversation audio data stream to be recognized to the server, and the server receives and obtains it; here the conversation audio is a dialogue speech segment between two roles, A and B. The user sends a request through the user terminal to edit the conversation audio to be recognized, and the server feeds a conversation-audio editing page back to the user terminal. The speech playing module plays back the conversation audio data stream to be recognized while the user judges the role each utterance belongs to. When the user has heard out a sentence and judges that it was spoken by A, the user presses the A-role control key on the speech editing page of the user terminal; the speech playing module pauses playback, and the role marking module marks the role of this section of the audio data stream as role A and records the time point at which the user pressed the A-role control key. The user continues to play the conversation audio data stream; on hearing out a sentence and judging that it was spoken by B, the user presses the B-role control key on the editing page of the user terminal; the speech playing module pauses playback, and the role marking module marks the role of this section of the audio data stream as role B and records the time point at which the user pressed the B-role control key. The speech segmentation module in the server splits the audio data streams of adjacent time points that are marked as different roles, performs no splitting on the adjacent audio data streams of adjacent time points that are marked as belonging to the same role, and thereby segments out the audio data stream corresponding to each role. The speech-to-text recognition module in the server recognizes the audio data streams of the different roles as text information, and the output module assigns the text information corresponding to each role's audio data stream to its conversational role and outputs the text information.
In the interactive system for conversation-audio role segmentation and text recognition of this embodiment of the present invention, the user's role distinctions are obtained by capturing the user's interactive operation gestures on the user terminal. The server marks and segments the audio according to the role distinctions made at the user terminal, then converts the segmented audio data streams into corresponding text information for output, thereby automatically segmenting the conversation audio of different roles and converting it to text, and realizing conversation-audio role segmentation and text recognition quickly, efficiently and accurately.
As shown in Fig. 3, which is a flow chart of a first embodiment of the interaction method for conversation-audio role segmentation and text recognition provided by the present invention, the method is suitable for the interactive system for audio role segmentation and text recognition of the above embodiments and specifically comprises the following steps:
S1: The user terminal sends the conversation audio data stream to be recognized to the server.
S2: The server receives and obtains the conversation audio data stream to be recognized sent by the user terminal. The conversation audio to be recognized is a conversation audio data stream of different roles.
S3: The server obtains the user terminal's editing request for the conversation audio data stream to be recognized.
S4: The server plays back the conversation audio data stream to be recognized.
S5: The server obtains the user terminal's role-assignment operations on the speech and identifies the assigned speech roles, marks the conversation audio data stream by role according to the role assignment, and records the time point of the audio data stream corresponding to each role mark.
S6: The server segments out the audio data streams corresponding to the different roles according to the role marks.
Specifically, the audio data streams of adjacent time points that are marked as different roles are split, while no splitting is performed on the adjacent audio data streams of adjacent time points that are marked as the same role.
S7: The server recognizes the audio data streams corresponding to the different roles and converts them into text information.
S8: The server outputs the text information.
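Steps S5 through S8 can be sketched end to end as follows. The speech-recognition step is stubbed out with a placeholder function, since the patent does not specify a particular recognizer; all names and data shapes here are illustrative assumptions:

```python
def recognize_speech(segment_bounds):
    # Placeholder for an actual speech-to-text engine (S7); here it
    # just returns a canned transcript label for the demo.
    return f"<transcript of {segment_bounds}>"


def process_conversation(marks):
    """S5-S8: turn time-stamped role marks into per-role text output."""
    # S6: split at role changes, merge adjacent same-role marks.
    segments, prev_end = [], 0.0
    for press_time, role in marks:
        if segments and segments[-1][0] == role:
            segments[-1] = (role, segments[-1][1], press_time)
        else:
            segments.append((role, prev_end, press_time))
        prev_end = press_time
    # S7-S8: recognize each segment and output "role: text" lines.
    return [f"{role}: {recognize_speech((start, end))}"
            for role, start, end in segments]


output = process_conversation([(2.8, "A"), (6.1, "B")])
for line in output:
    print(line)
```

Running this on two key presses yields one output line per role segment, each pairing the role label with the (stubbed) recognized text, mirroring the server's final output in S8.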
The realization of this method is described in detail below, taking as an example conversation audio containing a dialogue speech segment between roles A and B:
The user terminal sends the conversation audio data stream to be recognized to the server, and the server receives and obtains it. The user sends a request through the user terminal to edit the conversation audio to be recognized, and the server feeds a conversation-audio editing page back to the user terminal. The speech playing module plays back the conversation audio data stream to be recognized while the user judges the role each utterance belongs to. When the user has heard out a sentence and judges that it was spoken by A, the user presses the A-role control key on the speech editing page of the user terminal; the speech playing module pauses playback, and the role marking module marks the role of this section of the audio data stream as role A and records the time point at which the user pressed the A-role control key. The user continues to play the conversation audio data stream; on hearing out a sentence and judging that it was spoken by B, the user presses the B-role control key on the editing page of the user terminal; the speech playing module pauses playback, and the role marking module marks the role of this section of the audio data stream as role B and records the time point at which the user pressed the B-role control key. The speech segmentation module in the server splits the audio data streams of adjacent time points that are marked as different roles, performs no splitting on the adjacent audio data streams of adjacent time points that are marked as belonging to the same role, and thereby segments out the audio data stream corresponding to each role. The speech-to-text recognition module in the server recognizes the audio data streams of the different roles as text information, and the output module assigns the text information corresponding to each role's audio data stream to its conversational role and outputs the text information.
In the interaction method for conversation-audio role segmentation and text recognition of this embodiment of the present invention, the user's role distinctions are obtained by capturing the user's interactive operation gestures on the user terminal. The server marks and segments the audio according to the role distinctions made at the user terminal, then converts the segmented audio data streams into corresponding text information for output, thereby automatically segmenting the conversation audio of different roles and converting it to text, and realizing conversation-audio role segmentation and text recognition quickly, efficiently and accurately.
As shown in Fig. 4, which is a structural schematic diagram of a first embodiment of the mobile terminal provided by the present invention, the mobile terminal includes a processor 31, an input device 32, an output device 33 and a memory 34, which are connected to one another. The memory 34 is used to store a computer program comprising program instructions, and the processor 31 is configured to call the program instructions to execute the method described in the above embodiments.
The mobile terminal provided by this embodiment of the present invention obtains the user's role distinctions by capturing the user's interactive operation gestures on the user terminal. The server marks and segments the audio according to the role distinctions made at the user terminal, then converts the segmented audio data streams into corresponding text information for output, thereby automatically segmenting the conversation audio of different roles and converting it to text, and realizing conversation-audio role segmentation and text recognition quickly, efficiently and accurately.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program comprising program instructions which, when executed by a processor, cause the processor to execute the method described in the above embodiments.
The computer-readable storage medium may be an internal storage unit of the terminal described in the foregoing embodiments, such as a hard disk or memory of the terminal. The computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card or a flash card (Flash Card) provided on the terminal. Further, the computer-readable storage medium may include both an internal storage unit of the terminal and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data needed by the terminal, and may also be used to temporarily store data that has been output or will be output.
Those of ordinary skill in the art may appreciate that the units and algorithm steps described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, in computer software, or in a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
It is apparent to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding processes in the foregoing method embodiments for the specific working processes of the terminal and units described above, and details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed terminal and method may be realized in other ways. For example, the apparatus embodiments described above are merely exemplary; the division of the units is only a division by logical function, and in actual implementation there may be other division manners: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention and are not limiting. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art will understand that the technical solution of the present invention may be modified or equivalently replaced without departing from the spirit and scope of the technical solution of the present invention, and such modifications and replacements shall all be covered by the claims of the present invention.
Claims (9)
1. An interactive system for conversation-audio role segmentation and text recognition, characterized by comprising a server and a user terminal, wherein the server receives the conversation audio data stream to be recognized sent by the user terminal; the server comprises a speech processing module, a speech-to-text recognition module and an output module; the speech processing module is configured to play back the conversation audio data stream to be recognized, obtain the user terminal's role-assignment operations on the speech and identify the assigned speech roles, mark the audio data stream by role, and segment out the audio data streams corresponding to the different roles according to the role marks; the speech-to-text recognition module is configured to recognize the audio data streams of the different roles as text information; and the output module is configured to output the text information.
2. The interactive system for conversation-audio role segmentation and text recognition according to claim 1, characterized in that the speech processing module includes a speech playing module configured to play the conversation audio data stream to be recognized.
3. The interactive system for conversation-audio role segmentation and text recognition according to claim 1, characterized in that the speech processing module includes a role-assignment identification module configured to obtain the user terminal's role-assignment operations on the speech and identify the speech-role assignment information.
4. The interactive system for conversation-audio role segmentation and text recognition according to claim 3, characterized in that the speech processing module further includes a role marking module configured to mark the played audio data stream by role according to the speech-role assignment information, and to record the time point of the audio data stream corresponding to each role mark.
5. The interactive system for conversation-audio role segmentation and text recognition according to claim 4, characterized in that the speech processing module further includes a speech segmentation module configured to split the audio data streams of adjacent time points that are marked as different roles, while leaving unsplit the adjacent audio data streams of adjacent time points that are marked as the same role, thereby segmenting out the audio data stream corresponding to each role.
6. An interactive method for conversational-audio role segmentation and text recognition, characterized by comprising the following steps:
the server receives the dialogue audio data stream to be recognized transmitted by a user terminal;
the server obtains the user terminal's request to edit the dialogue audio data stream to be recognized;
the server plays the dialogue audio data stream to be recognized;
the server obtains the user terminal's operations assigning roles to the speech, identifies the speech-role assignments, applies role labels to the dialogue audio data stream according to those assignments, and records the time point of the voice data stream corresponding to each role label;
the server partitions out the voice data stream corresponding to each role according to the role labels;
the server recognizes the voice data stream corresponding to each role and converts it into text information;
the server outputs the text information.
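The final recognition-and-output steps of the method above can likewise be sketched in Python. The claims do not name a speech recognizer, so `recognize` below is a stand-in callable and `transcribe_segments` is a hypothetical helper:

```python
def transcribe_segments(segments, recognize):
    """Turn role-labelled segments into per-role text lines.

    segments: (role, start_s, end_s) tuples produced by the role-based
    split of the preceding steps; recognize: any speech-to-text backend
    (a stand-in callable here, since the claims name none).
    """
    lines = []
    for role, start, end in segments:
        text = recognize(start, end)     # recognize this role's audio span
        lines.append(f"{role}: {text}")  # label the output with the role
    return "\n".join(lines)              # the text information to output

# toy recognizer standing in for a real ASR engine
def fake_asr(start, end):
    return f"<speech {start:.1f}-{end:.1f}s>"
```

In a real system `recognize` would receive the audio samples for the span rather than bare time points; the sketch only shows the per-role loop and the labelled text output.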
7. The interactive method for audio role segmentation and text recognition according to claim 6, wherein the server partitioning out the voice data stream corresponding to each role according to the role labels specifically comprises: splitting the audio data stream where adjacent time points are labelled with different roles, and performing no split on adjacent voice data streams where adjacent time points are labelled with the same role.
8. A mobile terminal comprising a processor, an input device, an output device and a memory that are interconnected, the memory storing a computer program comprising program instructions, wherein the processor is configured to invoke the program instructions to execute the method according to claim 6 or 7.
9. A computer-readable storage medium storing a computer program comprising program instructions which, when executed by a processor, cause the processor to execute the method according to claim 6 or 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810421520.8A CN108597521A (en) | 2018-05-04 | 2018-05-04 | Audio role divides interactive system, method, terminal and the medium with identification word |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108597521A true CN108597521A (en) | 2018-09-28 |
Family
ID=63620824
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810421520.8A Pending CN108597521A (en) | 2018-05-04 | 2018-05-04 | Audio role divides interactive system, method, terminal and the medium with identification word |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108597521A (en) |
- 2018-05-04: application CN201810421520.8A filed in China (CN); status: active, Pending
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1770262A (en) * | 2004-11-01 | 2006-05-10 | 英业达股份有限公司 | Speech display system and method |
CN102543080A (en) * | 2010-12-24 | 2012-07-04 | 索尼公司 | Audio editing system and audio editing method |
CN102543063A (en) * | 2011-12-07 | 2012-07-04 | 华南理工大学 | Method for estimating speech speed of multiple speakers based on segmentation and clustering of speakers |
US20160217793A1 (en) * | 2015-01-26 | 2016-07-28 | Verint Systems Ltd. | Acoustic signature building for a speaker from multiple sessions |
CN105405439A (en) * | 2015-11-04 | 2016-03-16 | 科大讯飞股份有限公司 | Voice playing method and device |
CN106683661A (en) * | 2015-11-05 | 2017-05-17 | 阿里巴巴集团控股有限公司 | Role separation method and device based on voice |
CN105845129A (en) * | 2016-03-25 | 2016-08-10 | 乐视控股(北京)有限公司 | Method and system for dividing sentences in audio and automatic caption generation method and system for video files |
CN105957531A (en) * | 2016-04-25 | 2016-09-21 | 上海交通大学 | Speech content extracting method and speech content extracting device based on cloud platform |
CN106782507A (en) * | 2016-12-19 | 2017-05-31 | 平安科技(深圳)有限公司 | The method and device of voice segmentation |
CN106851407A (en) * | 2017-01-24 | 2017-06-13 | 维沃移动通信有限公司 | A kind of control method and terminal of video playback progress |
CN107358945A (en) * | 2017-07-26 | 2017-11-17 | 谢兵 | A kind of more people's conversation audio recognition methods and system based on machine learning |
CN107808659A (en) * | 2017-12-02 | 2018-03-16 | 宫文峰 | Intelligent sound signal type recognition system device |
Non-Patent Citations (7)
Title |
---|
YU Xiaoqing et al., "An Improved BIC Speaker Change Detection Algorithm", Journal of Shanghai University (Natural Science Edition) * |
CAO Honglin, LI Jingyang, "On the Expression of Voiceprint Identification Opinions", Evidence Science * |
CAO Honglin et al., "On the Expression of Voiceprint Identification Opinions", Evidence Science * |
LIANG Xiaoxuan, "Cracking the Voice Code", Prosecutorial View * |
TAN Ruilian et al., "Research Progress in Speaker Recognition Technology", Science & Technology Information * |
ZHENG Tieran et al., "A Speaker Segmentation Method Based on Pre-segmentation", Journal on Communications * |
MA Yong et al., "Advances in Speaker Segmentation and Clustering", Journal of Signal Processing * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110166818A (en) * | 2018-11-30 | 2019-08-23 | 腾讯科技(深圳)有限公司 | Wait match generation method, computer equipment and the storage medium of audio-video |
CN112382288A (en) * | 2020-11-11 | 2021-02-19 | 湖南常德牌水表制造有限公司 | Method and system for debugging equipment by voice, computer equipment and storage medium |
CN112382288B (en) * | 2020-11-11 | 2024-04-02 | 湖南常德牌水表制造有限公司 | Method, system, computer device and storage medium for voice debugging device |
CN113192516A (en) * | 2021-04-22 | 2021-07-30 | 平安科技(深圳)有限公司 | Voice role segmentation method and device, computer equipment and storage medium |
CN113192516B (en) * | 2021-04-22 | 2024-05-07 | 平安科技(深圳)有限公司 | Voice character segmentation method, device, computer equipment and storage medium |
CN114339423A (en) * | 2021-12-24 | 2022-04-12 | 咪咕文化科技有限公司 | Short video generation method and device, computing equipment and computer readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108597521A (en) | Audio role divides interactive system, method, terminal and the medium with identification word | |
US8315866B2 (en) | Generating representations of group interactions | |
CN110166816B (en) | Video editing method and system based on voice recognition for artificial intelligence education | |
CN105929980B (en) | Method and apparatus for information input | |
CN103561229B (en) | Meeting label generates and application process, device, system | |
CN105632498A (en) | Method, device and system for generating conference record | |
CN109636345B (en) | Intelligent management method and system for business handling workflow | |
WO2018130173A1 (en) | Dubbing method, terminal device, server and storage medium | |
US20190221213A1 (en) | Method for reducing turn around time in transcription | |
CN111128212A (en) | Mixed voice separation method and device | |
CN109064532A (en) | Automatic lip-shape generation method and device for cartoon characters | |
CN110610698A (en) | Voice labeling method and device | |
CN112562677B (en) | Conference voice transcription method, device, equipment and storage medium | |
CN112911332B (en) | Method, apparatus, device and storage medium for editing video from live video stream | |
CN110312161A (en) | Video dubbing method, device and terminal device | |
CN111583932A (en) | Sound separation method, device and equipment based on human voice model | |
CN111027093A (en) | Access right control method and device, electronic equipment and storage medium | |
CN113076074B (en) | Electronic blackboard writing reproduction method and system, electronic blackboard and readable medium | |
CN112289321B (en) | Explanation synchronization video highlight processing method and device, computer equipment and medium | |
CN109376228A (en) | Information recommendation method, device, equipment and medium | |
CN105930323A (en) | File generating method and apparatus | |
CN113033718B (en) | Artificial intelligence data annotation task allocation method and device | |
EP3962073A1 (en) | Online interview method and system | |
CN111312260A (en) | Human voice separation method, device and equipment | |
CN110689924B (en) | Knockout strategy screening method and system based on multiple knockout types |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| TA01 | Transfer of patent application right | Effective date of registration: 2020-01-21. Address after: 3-25-2, No. 309, Huangpu Avenue Middle, Tianhe District, Guangzhou, Guangdong 510000. Applicant after: Guangzhou Xinyuxinban Internet Information Service Co., Ltd. Address before: Panyu District, Guangzhou, Guangdong 511442 (7-2001). Applicant before: Xu Yong. |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 2018-09-28 |