CN106534108A

CN106534108A - Call party identification method, switch and command voice terminal under multi-party communication scene

Info

Publication number: CN106534108A
Application number: CN201610975040.7A
Authority: CN
Inventors: 叶飞; 淳增辉; 付建强; 白小平; 汤灵; 张康
Original assignee: Wuhan Ship Communication Research Institute
Current assignee: Wuhan Ship Communication Research Institute
Priority date: 2016-10-27
Filing date: 2016-10-27
Publication date: 2017-03-22

Abstract

The invention discloses a call party identification method under a multi-party communication scene. The method comprises the steps that a voice activity indication signal is acquired from a received voice signal through a voice activity detection algorithm; user identification information corresponding to the voice signal is acquired according to the voice activity indication signal; the voice signal and the user identification information are synchronously output; and a receiving party generates a call party instruction according to the mixed voice signal and user identification information. The invention further discloses a switch under a multi-party communication scene and a command voice terminal under a multi-party communication scene. The switch and the command voice terminal are combined with the call party identification method. The instruction signal is generated according to the voice signal decision. The unique identification is combined to identify a call party. The call party is identified in real time on the command voice terminal under a multi-party voice communication scene. The command efficiency can be greatly improved.

Description

Call party identification method, switch and command voice terminal under a multi-party communication scene

Technical field

The invention belongs to the field of wired communication technology, and more particularly relates to a call party identification method, a switch and a command voice terminal under a multi-party communication scene.

Background technology

A ship communication system includes at least one switch, multiple command voice terminals and multiple radio stations, together with the transmission links between switches, between command voice terminals and switches, and between radio stations and switches. A command voice terminal can set up point-to-point voice communication with a radio station, or set up a conference with multiple radio stations for point-to-multipoint voice communication. Each command voice terminal or radio station has a unique number in the local voice network, which identifies it and is used for addressing during call setup.

In conference mode, the command voice terminal sets up the conference and multiple radio stations are added to it. The command voice terminal sends a PTT (Push to Talk) signal and a voice signal. After the first switch receives the PTT signal and the voice signal, they are forwarded within the switch, and the conference mixing module broadcasts them over two links to the radio station switching module and the backbone communication module respectively. The radio station switching module forwards the received signals to the radio station, which transmits the voice signal under control of the PTT signal. The backbone communication module sends the voice signal and the PTT signal to a second switch, which forwards them to another radio station.

On the receiving side, the radio station sends the received voice signal to the radio station switching module of the first switch, which passes the voice signal to the conference mixing module. The other radio station sends the received voice signal to the second switch, which sends it to the first switch for voice extraction; the extracted voice signal is sent to the conference mixing module, and the mixed voice signal is then sent to the terminal switching module. The drawback is that, under multi-party communication scenes such as conference mode, the user of the command voice terminal cannot tell which radio station user is speaking.

Summary of the invention

To address the above defects or improvement needs of the prior art, the invention provides a call party identification method, a switch and a command voice terminal under a multi-party communication scene, with the aim of solving the technical problem that, under a multi-party communication scene, the user of a command voice terminal cannot tell which radio station user is speaking.

To achieve the object of the invention, according to one aspect of the invention, a call party identification method under a multi-party communication scene is provided, comprising the following steps:

(1) a voice activity indication signal is obtained from the received voice signal using a voice activity detection algorithm, and synchronizing delay processing is applied to the voice signal;

(2) the user identification information corresponding to the voice signal is obtained according to the voice activity indication signal;

(3) the voice signal is output synchronously with the user identification information;

(4) the receiving party identifies the call party according to the user identification information when receiving the voice signal.

Preferably, in the above call party identification method, when the voice activity indication signal is valid, the corresponding voice activity detection module number is extracted from the voice activity indication signal, and the corresponding user identification information is obtained by table lookup.

Preferably, in the above call party identification method, each voice activity detection module has a voice activity detection module number that is unique within the device; the number is a hardware number, a logic module number, a process number or a thread number.

Preferably, in the above call party identification method, the user identification information is identification information of a device that is unique in the communication network, such as a radio station number or name, a command voice terminal number or name, a telephone number, an IP address or a MAC address;

A mapping table between the user identification information and the voice activity detection module number is established and saved while the voice communication is being set up; the user identification information and the voice activity detection module number are in one-to-one correspondence.

Preferably, in the above call party identification method, the voice signals corresponding to different user identification information are mixed and then output synchronously with the user identification information.

To achieve the object of the invention, according to another aspect of the invention, a switch under a multi-party communication scene is provided, comprising:

a voice activity detection module, for obtaining a voice activity indication signal from the received voice signal using a voice activity detection algorithm, and for applying synchronizing delay processing to the voice signal;

a call party identification module, for obtaining the user identification information corresponding to the voice signal according to the voice activity indication signal;

a terminal switching module, for synchronizing the received voice signal and user identification information and then outputting them.

To achieve the object of the invention, according to yet another aspect of the invention, a command voice terminal under a multi-party communication scene is provided, which receives the voice signal and the user identification information, identifies the call party from the user identification information, and shows the user identification information on its display unit synchronously with the output of the voice signal.

In general, compared with the prior art, the above technical solution conceived by the invention can achieve the following beneficial effects:

The call party identification method, system and device under a multi-party communication scene provided by the invention analyze the voice signal with a voice activity detection algorithm to decide and generate an indication signal, and identify the call party by combining it with an identifier that is unique in the communication network, such as a radio station number or name, a command voice terminal number or name, a telephone number, an IP address or a MAC address. Applied in conference mode or other multi-party voice communication scenes, the call party can be identified in real time at the command voice terminal, which greatly improves command efficiency.

Description of the drawings

Fig. 1 is a flow chart of the call party identification method under a multi-party communication scene provided by an embodiment of the invention;

Fig. 2 is a system block diagram of the switch under a multi-party communication scene provided by the invention;

Fig. 3 is a system block diagram of the switch under a multi-party communication scene provided by an embodiment of the invention;

Fig. 4 is a system block diagram of the command voice terminal under a multi-party communication scene according to an embodiment of the invention;

Fig. 5 is a schematic diagram of a call and of call party identification in conference mode in the embodiment;

Fig. 6 is a schematic diagram of the voice signal and of the voice activity indication signal produced by the voice activity detection algorithm, wherein Fig. 6(a) is the calculation result of the voice activity detection algorithm and Fig. 6(b) is the voice indication signal produced by the decision of the voice activity detection algorithm.

In all the drawings, identical reference numerals denote identical elements or structures, wherein:

101 - command voice terminal, 102 - call party identification system, 104 - first link, 106 - second link, 301 - terminal switching module, 302 - radio station switching module, 303 - conference mixing module, 304 - backbone communication module, 305 - call party identification module, 306 - voice activity detection module.

Specific embodiment

To make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.

The call party identification method under a multi-party communication scene provided by an embodiment of the invention, as shown in Fig. 1, comprises the following steps:

Step 1001: a voice activity indication signal is obtained from the received voice signal using a voice activity detection algorithm;

Step 1002: the user identification information corresponding to the voice signal is obtained according to the voice activity indication signal;

If the voice activity indication signal is valid, the corresponding voice activity detection module number is extracted from the voice activity indication signal, and the corresponding user identification information is obtained by table lookup.

Each voice activity detection module has a voice activity detection module number that is unique within the device. Depending on the implementation, this number can be a physical hardware number or a logic module number, or a process number (ID) or thread number (ID) in software.

The user identification information can be identification information of a device that is unique in the communication network, such as a radio station number or name, a command voice terminal number or name, a telephone number, an IP address or a MAC address.

Step 1003: the voice signal is output synchronously with the user identification information.

The voice signals corresponding to different user identification information are mixed and then output synchronously with the user identification information;

Step 1004: the receiving party, for example a command voice terminal, identifies the call party according to the received voice signal and user identification information, including the following sub-steps:

2001: receive the voice signal and the user identification information;

2002: identify from the user identification information which device or user the voice signal comes from;

2003: show the user identification information on its display unit, synchronized with the output of the voice signal.
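Steps 1002 and 1003 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `VadIndication` layout, the function names and the table contents are hypothetical, and step 1001 (the detection itself) is taken as given input.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class VadIndication:
    """Voice activity indication signal from one detection module (assumed layout)."""
    module_id: int  # device-internal unique voice activity detection module number
    active: bool    # True when the indication signal is valid (voice detected)


def lookup_user(ind: VadIndication, table: Dict[int, str]) -> Optional[str]:
    """Step 1002: map a valid indication to user identification by table lookup."""
    if not ind.active:
        return None
    return table.get(ind.module_id)


def synchronized_output(frame: List[int], ind: VadIndication,
                        table: Dict[int, str]) -> Tuple[List[int], Optional[str]]:
    """Step 1003: emit the voice frame together with its user identification."""
    return frame, lookup_user(ind, table)


# Mapping table saved at call setup (values are made up for illustration).
table = {1: "Radio 1", 3: "Radio 2"}
frame, user = synchronized_output([10, -12, 7], VadIndication(1, True), table)
```

Returning the frame and the identification as one pair is only one way to keep them aligned; the receiving terminal would then display `user` while playing `frame`.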

The system block diagram of the switch under a multi-party communication scene provided by the invention is shown in Fig. 2. The switch comprises a terminal switching module, a radio station switching module, a conference mixing module, a backbone communication module, a call party identification module and a voice activity detection module;

The terminal switching module has a PTT interface and a voice interface for connecting an external command voice terminal. Its first end is connected to the first end of the conference mixing module, its second end to the first end of the backbone communication module, and its third end to the first end of the call party identification module;

The radio station switching module has a PTT interface and a voice interface for connecting an external radio station. Its first end is connected to the second end of the backbone communication module, and its second end to the first end of the voice activity detection module;

The second end of the conference mixing module is connected to the second end of the voice activity detection module. The backbone communication module has a multiplex link interface and a single link interface for connecting external links; its third end is connected to the third end of the voice activity detection module. The second end of the call party identification module is connected to the fourth end of the voice activity detection module.

The terminal switching module sends the PTT signal and voice signal received from the external command voice terminal to the conference mixing module. In the reverse direction, it receives the voice signal from the conference mixing module and the number of the active radio station from the call party identification module, and sends the voice signal and the radio station number information to the external command voice terminal through the voice interface;

The radio station switching module receives the PTT signal and voice signal from the voice activity detection module and sends them to the external radio station through its PTT interface and voice interface; in the reverse direction, it receives the voice signal from the external radio station and forwards it to the voice activity detection module;

The conference mixing module receives the voice signal and the optional PTT signal from the terminal switching module and broadcasts them to the radio station switching modules participating in the conference, to other terminal switching modules, or to the voice activity detection modules associated with the backbone communication module. It also mixes the voice signals sent by the voice activity detection modules and sends the result to the terminal switching module;

The backbone communication module extracts the multiplexed signal from the external link and extracts the single-channel voice signal and PTT signal from it; in the other direction, it receives the voice signals and PTT signals of the voice activity detection module, the radio station switching module or the terminal switching module, multiplexes them into a multiplexed signal, and sends it to the external link;

The call party identification module judges, according to the voice activity indication signal sent by the voice activity detection module, whether the external radio station or command voice terminal is active;

When the voice activity indication signal is valid, a user's voice signal has been detected; otherwise no voice signal has been detected. For a valid indication signal, the call party identification module obtains the number of the corresponding voice activity detection module, obtains the corresponding terminal device number by table lookup, and sends the number of the active terminal device to the terminal switching module. The terminal device number is then sent to the external command voice terminal, where it can be fed back to the user directly on a display screen, kept synchronous with the output of the voice signal;

The voice activity detection module processes voice signals. It forwards the voice signal and the optional PTT signal sent by the conference mixing module to its associated radio station switching module, command voice terminal switching module or backbone communication module. For the voice signal received from the associated radio station switching module, command voice terminal switching module or backbone communication module, it applies the voice activity detection algorithm to analyze the signal and decide on a voice activity indication signal, then sends the voice activity indication signal to the call party identification module and the synchronously delayed voice signal to the conference mixing module;

A voice activity detection module can be associated with a radio station switching module, a command voice terminal switching module or the backbone communication module when voice communication is set up. Each voice activity detection module processes one voice channel. Each radio station switching module and each command voice terminal switching module can be associated with only one voice activity detection module, whereas the backbone communication module can be associated with multiple voice activity detection modules.

The switch 102 under a multi-party communication scene provided by an embodiment of the invention, whose system is shown in Fig. 3, comprises multiple terminal switching modules 301, radio station switching modules 302, a conference mixing module 303, a backbone communication module 304, a call party identification module 305 and voice activity detection modules 306;

In the embodiment, the terminal switching module 301 receives the PTT signal and the voice signal from the command voice terminal 101 through the PTT interface and the voice interface respectively, and sends them to the conference mixing module 303;

The terminal switching module 301 receives the reverse voice signal from the conference mixing module 303 and the user identification information of the active party from the call party identification module 305. After synchronizing the voice signal and the user identification information, it sends them to the command voice terminal 101 through the voice interface or another interface;

The radio station switching module 302 receives the PTT signal and the voice signal from the voice activity detection module 306 and sends them to the radio station 103 through the PTT interface and the voice interface respectively. The radio station switching module 302 also receives the voice signal from the radio station 103 and forwards it to the voice activity detection module 306;

The backbone communication module 304 is connected to the first link 104 through a physical port and is used for communication with other switches. It receives the multiplexed signal from the first link 104, extracts the single-channel voice signals and PTT signals, and sends them to the voice activity detection module 306, the radio station switching module 302 or the terminal switching module 301;

The backbone communication module 304 also receives voice signals and PTT signals from the voice activity detection module 306, the radio station switching module 302 or the terminal switching module 301, multiplexes them, and transmits them to other switches through the first link 104.

The conference mixing module 303 receives the voice signal and the optional PTT signal from the terminal switching module 301 and broadcasts them to the radio station switching modules 302 participating in the conference, to other terminal switching modules 301, or to the voice activity detection modules 306 associated with the backbone communication module 304;

The conference mixing module 303 receives voice signals from the voice activity detection modules 306, mixes them, and sends the result to the terminal switching module 301, which forwards it to the command voice terminal 101.
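The mixing step can be illustrated with a short Python sketch. The source does not specify the mixing algorithm, so this assumes a plain sample-wise sum of equal-length 16-bit PCM frames with saturation; the function name `mix_frames` is hypothetical.

```python
from typing import List

INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames: List[List[int]]) -> List[int]:
    """Mix equal-length 16-bit PCM frames by saturating sample-wise addition."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all frames must have the same length")
    mixed = []
    for i in range(length):
        total = sum(f[i] for f in frames)
        # Clamp to the 16-bit range so that loud overlapping talkers do not wrap around.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

For example, `mix_frames([[30000, -5], [10000, 5]])` saturates the first sample and cancels the second. A real mixer might instead scale or normalize the inputs; that choice is outside what the text describes.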

Each voice activity detection module 306 has a voice activity detection module number that is unique within the device. Depending on the implementation, this number can be a physical hardware number or a logic module number, or a process number (ID) or thread number (ID) in software.

A voice activity detection module 306 can be associated with a radio station switching module 302, a terminal switching module 301 or the backbone communication module 304 when voice communication is set up. At that point a mapping table between the user identification information and the voice activity detection module number is established; the two are in one-to-one correspondence. Each voice activity detection module 306 processes one voice channel. Each radio station switching module 302 and each terminal switching module 301 can be associated with only one voice activity detection module 306, whereas the backbone communication module 304 can be associated with multiple voice activity detection modules 306.
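A one-to-one mapping table of this kind could be kept as two dictionaries, as in the Python sketch below. The class name `AssociationTable` and its methods are hypothetical, and releasing an entry when the call ends is an assumption that the text does not state.

```python
from typing import Dict


class AssociationTable:
    """One-to-one mapping between detection module numbers and user identification."""

    def __init__(self) -> None:
        self._by_module: Dict[int, str] = {}
        self._by_user: Dict[str, int] = {}

    def associate(self, module_id: int, user_id: str) -> None:
        """Record an association at call setup; reject anything not one-to-one."""
        if module_id in self._by_module or user_id in self._by_user:
            raise ValueError("mapping must stay one-to-one")
        self._by_module[module_id] = user_id
        self._by_user[user_id] = module_id

    def release(self, module_id: int) -> None:
        """Drop the association (assumed to happen when the call ends)."""
        user_id = self._by_module.pop(module_id)
        del self._by_user[user_id]

    def user_for(self, module_id: int) -> str:
        """Table lookup used when a valid indication signal arrives."""
        return self._by_module[module_id]
```

Keeping the reverse dictionary makes the one-to-one check cheap in both directions, at the cost of storing each pair twice.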

The user identification information can be identification information of a device that is unique in the communication network, such as a radio station number or name, a command voice terminal number or name, a telephone number, an IP address, a MAC address or a port number.

The voice activity detection module 306 receives the voice signal and the optional PTT signal from the conference mixing module 303 and forwards them to the associated radio station switching module 302, terminal switching module 301 or backbone communication module 304. In the other direction, it receives the voice signal from the associated radio station switching module 302, terminal switching module 301 or backbone communication module 304, analyzes the input voice signal with the voice activity detection algorithm, and decides on a voice activity indication signal. The indication signal is output to the call party identification module 305, and the synchronously delayed voice signal is output to the conference mixing module 303.
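The synchronizing delay can be pictured as a fixed-length delay line on the voice path. The Python sketch below assumes the detector needs a known, constant number of frames to reach its decision; the source does not say how the delay is chosen, and the class name `SyncDelayLine` is hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional


class SyncDelayLine:
    """Delays voice frames by a fixed number of frames.

    The delay is meant to cover the detector's decision latency, so that a
    frame leaves the module at the moment its indication becomes available.
    """

    def __init__(self, delay_frames: int) -> None:
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        self._delay = delay_frames
        self._buf: Deque[List[int]] = deque()

    def push(self, frame: List[int]) -> Optional[List[int]]:
        """Store the newest frame; return the frame delayed by `delay_frames`, if any."""
        self._buf.append(frame)
        if len(self._buf) > self._delay:
            return self._buf.popleft()
        return None  # still filling the delay line
```

With a delay of two frames, the first two calls to `push` return `None` and the third returns the first frame, which is then forwarded to the mixer alongside the indication computed for it.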

The call party identification module 305 receives the voice activity indication signal from each voice activity detection module 306 and judges from it whether the radio station or command voice terminal is active. If the voice activity indication signal is valid, the user's voice signal has been detected and the corresponding device is active; otherwise no voice signal from the user has been detected. For a valid indication signal, the call party identification module 305 obtains the corresponding voice activity detection module number, looks up the mapping table to obtain the corresponding user identification information, and sends the user identification information of the active party to the terminal switching module 301.

The command voice terminal 101 receives the active voice signal and user identification information from the call party identification system 102, identifies from the user identification information which device or user is speaking, and shows the user identification information on its display unit, synchronized with the output of the voice signal.

In the embodiment, the radio station number or command voice terminal number is a unique number assigned to the terminal device within the voice communication network. Each physical port of the switch that connects a command voice terminal 101 or a radio station 103 is assigned a port number that is unique within the device.

The backbone communication module 304 has multiple internal virtual ports. Each virtual port corresponds to one voice channel, that is, to one channel of the multiplexed signal it sends to or receives from the first link 104. Each virtual port is likewise assigned a port number that is unique within the device. While voice communication is being set up, the mapping among the port number, the user identification information, the voice activity detection module number and the service number must be configured and saved; after association, the port number, the voice activity detection module number and the user identification information are in one-to-one correspondence.

In the embodiments of the invention, the user identification information uses the number or name of the radio station or command voice terminal, a telephone number, an IP address or a MAC address as the unique number. In practical applications, a unique user name, a globally assigned port number and the like can also be used for distinguishing or marking; these are not described further here but all fall within the protection scope of this patent.

The command voice terminal under a multi-party communication scene provided by the invention, as shown in Fig. 4, comprises a voice signal and user identification information receiving module 401, a synchronous identification module 402, a voice signal output module 403, a user identification information conversion module 404 and a display module 405.

The voice signal and user identification information receiving module 401 receives the voice signal and the user identification information from the switch 102.

The voice signal can be an analog signal or a digital signal; in the latter case it may be received in encoded or framed form.

The synchronous identification module 402 identifies from the user identification information which device or user the voice signal comes from.

Optionally, the synchronous identification module 402 synchronizes the received voice signal and user identification information.

The voice signal output module 403 converts the voice signal into sound output.

Optionally, the voice signal output module 403 deframes or decodes the received voice signal in the manner corresponding to the sender.

The user identification information conversion module 404 converts the user identification information into information that is easy to display or easy for the user to understand.

The display module 405 displays the user identification information, or the information converted by the user identification information conversion module 404, synchronized with the voice signal output by the voice signal output module 403.
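The terminal-side flow (receive, convert, display, output) can be sketched in Python as below. The function names, the callback interface and the example address are hypothetical; a real terminal would drive an audio device and a screen rather than plain callbacks.

```python
from typing import Callable, Dict, List, Optional, Tuple

Frame = List[int]


def convert_user_id(user_id: Optional[str], names: Dict[str, str]) -> str:
    """Conversion step (module 404): turn a raw identifier into display text."""
    if user_id is None:
        return ""  # nobody is speaking, so nothing is shown
    return names.get(user_id, user_id)  # fall back to the raw identifier


def handle_packet(packet: Tuple[Frame, Optional[str]], names: Dict[str, str],
                  show: Callable[[str], None],
                  play: Callable[[Frame], None]) -> None:
    """Display the talker and output the matching voice frame in the same step."""
    frame, user_id = packet
    show(convert_user_id(user_id, names))
    play(frame)
```

Handling the identifier and the frame in one call is a simple way to keep the display and the audio aligned. For instance, with `names = {"192.168.0.7": "Radio 1"}` (a made-up address), a packet tagged `"192.168.0.7"` is shown as "Radio 1".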

Fig. 5 is a schematic diagram of a call and of call party identification in conference mode using the system and method provided by the embodiment of the invention. The details are as follows:

(1) The command voice terminal 101 sets up a conference, and radio station 1 and radio station 2 are added to it;

(2) The command voice terminal 101 sends a PTT signal and a voice signal through link 105. After terminal switching module 1 of switch 1 receives them, it sends the voice signal and the PTT signal to the conference mixing module 303 through link 307; they are then broadcast to voice activity detection modules 1 and 3 through links 308-1 and 308-2 respectively. Voice activity detection module 1 forwards them to radio station switching module 1 through link 310, radio station switching module 1 forwards them to radio station 1 through the second link 106, and radio station 1 transmits the voice signal under control of the PTT signal. Voice activity detection module 3 forwards them to backbone communication module 2 through link 311; backbone communication module 2 sends the voice signal and the PTT signal to switch 2 through the first link 104, switch 2 forwards them to radio station 2 through the second link 106, and radio station 2 transmits the voice signal under control of the PTT signal.

Radio station 1 sends the voice signal it receives to switch 1 through the second link 106, where it is received by radio station switching module 1 of switch 1 and passed to voice activity detection module 1 through link 310. Voice activity detection module 1 analyzes it with the voice detection algorithm to produce a voice activity indication signal, which it sends to call party identification module 1 through link 309-1. Voice activity detection module 1 sends the voice signal, delayed so as to be synchronous with the indication signal, to conference mixing module 1 through link 308-1.

The voice detection algorithms used in the embodiment include feature-based detection algorithms and model-based detection algorithms. The feature-based detection algorithms include algorithms based on short-time energy, zero-crossing rate, entropy and LPC cepstral distance. The model-based detection algorithms include, but are not limited to, algorithms based on hidden Markov models, support vector machines and neural networks. Fig. 6 is a schematic diagram of the voice signal used in the embodiment and of the voice activity indication signal produced by the voice activity detection algorithm, wherein Fig. 6(a) is the calculation result of the voice activity detection algorithm and Fig. 6(b) is the voice indication signal produced by its decision. In practical applications, a combination of existing detection algorithms, or other algorithms, can also be used; as long as they serve to detect whether a voice signal is present, they all fall within the protection scope of this patent.
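As an illustration of the feature-based family, the Python sketch below computes short-time energy per frame (the kind of result shown in Fig. 6(a)) and thresholds it into a binary indication (Fig. 6(b)). The threshold value and the function names are assumptions; the source does not give the decision rule used in the embodiment.

```python
from typing import List


def short_time_energy(frame: List[float]) -> float:
    """Mean squared amplitude of one frame."""
    if not frame:
        raise ValueError("frame must not be empty")
    return sum(x * x for x in frame) / len(frame)


def vad_indication(frames: List[List[float]], threshold: float) -> List[bool]:
    """Per-frame decision: True (valid indication) when energy exceeds the threshold."""
    return [short_time_energy(f) > threshold for f in frames]


silence = [0.0, 0.0, 0.0, 0.0]
speech = [0.5, -0.5, 0.5, -0.5]
indication = vad_indication([silence, speech, silence], threshold=0.01)
```

A practical detector would usually add smoothing or a hangover period so that the indication does not flicker between words, and might combine energy with the zero-crossing rate or the other features listed above.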

In the embodiment, radio station 2 sends the voice signal it receives to switch 2 through the second link 106, and switch 2 sends it to switch 1 through the first link 104. Backbone communication module 2 of switch 1 extracts the voice signal of radio station 2 and sends it to voice activity detection module 3 through link 311. Voice activity detection module 3 analyzes it to produce a voice activity indication signal, which it sends to call party identification module 1 through link 309-2. Voice activity detection module 3 sends the voice signal, delayed so as to be synchronous with the indication signal, to conference mixing module 1 through link 308-2.

Conference mixing module 1 sends the mixed voice signal to terminal switching module 1 through link 307.

Call party identification module 1 receives the voice activity indication signals sent by the voice activity detection modules, obtains the corresponding user identification information by looking up the mapping table, and sends the user identification information of the active party to terminal switching module 1.

Terminal switching module 1 receives the voice signal from conference mixing module 1 and the user identification information from call party identification module 1, synchronizes them, and sends them to the command voice terminal 101 through link 105.

The command voice terminal 101 receives the number of the active terminal device from switch 1 and shows the user identification information to the user on its display unit, synchronized with the output of the voice signal.

Those skilled in the art will readily understand that the above is only a preferred embodiment of the invention and is not intended to limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the protection scope of the invention.

Claims

1. A call party identification method under a multi-party communication scene, characterized by comprising the following steps:

(1) a voice activity indication signal is obtained from the received voice signal using a voice activity detection algorithm, and synchronizing delay processing is applied to the voice signal;

(3) the voice signal is output synchronously with the user identification information;

2. The call party identification method of claim 1, characterized in that, when the voice activity indication signal is valid, the corresponding voice activity detection module number is extracted from the voice activity indication signal, and the corresponding user identification information is obtained by table lookup.

3. The call party identification method of claim 2, characterized in that each voice activity detection module has a voice activity detection module number that is unique within the device; the voice activity detection module number is a hardware number, a logic module number, a process number or a thread number.

4. The call party identification method of claim 2, characterized in that the user identification information is identification information of a device that is unique in the communication network, including a radio station number or name, a command voice terminal number or name, a telephone number, an IP address or a MAC address;

a mapping table between the user identification information and the voice activity detection module number is established and saved while the voice communication is being set up, and the user identification information and the voice activity detection module number are in one-to-one correspondence.

5. The call party identification method of claim 1, characterized in that the voice signals corresponding to different user identification information are mixed and then output synchronously with the user identification information.

6. A switch under a multi-party communication scene based on the method of any one of claims 1 to 5, characterized by comprising:

a voice activity detection module, for obtaining a voice activity indication signal from the received voice signal using a voice activity detection algorithm, and for applying synchronizing delay processing to the voice signal;

a call party identification module, for obtaining the user identification information corresponding to the voice signal according to the voice activity indication signal;

7. A command voice terminal under a multi-party communication scene based on the method of any one of claims 1 to 5, characterized in that it receives the voice signal and the user identification information, identifies the call party from the user identification information, and shows the user identification information on its display unit synchronously with the output of the voice signal.