CN107358958B

CN107358958B - Intercommunication method, apparatus and system

Info

Publication number: CN107358958B
Application number: CN201710768017.5A
Authority: CN
Inventors: 仇波
Original assignee: CHANGSHA SHIBANG COMMUNICATION TECHNOLOGY CO LTD
Current assignee: Shibang Communication Co., Ltd
Priority date: 2017-08-30
Filing date: 2017-08-30
Publication date: 2018-09-18
Anticipated expiration: 2037-08-30
Also published as: CN107358958A

Abstract

The present invention provides an intercom method, apparatus and system, and relates to the technical field of intercom communication. The method includes: obtaining first request information of an intercom connection requested at the current moment, where the first request information is audio request information; performing sound analysis on the audio request information to obtain a sound analysis result, where the sound analysis includes at least one of: speech analysis, voiceprint analysis, volume analysis; and determining, according to the sound analysis result, a first response priority of the requested intercom connection, so that response personnel respond to the requested intercom connection based on the first response priority. The invention alleviates the technical problem that traditional intercom systems handle events with a lag and impose a heavy workload on staff.

Description

Intercommunication method, apparatus and system

Technical field

The present invention relates to the technical field of intercom communication, and more particularly to an intercom method, apparatus and system.

Background art

In public venues such as financial institutions, judicial prisons, and public security and traffic facilities, intercom systems are often used so that staff at a dispatching desk can talk with the people being served or managed in the venue when an unexpected event occurs, in order to provide service to them, protect their safety, or manage them remotely. For example, in a bank business hall, a user whose card is retained by an ATM during a withdrawal can talk to bank staff through the intercom system and have the problem handled. As another example, in a prison cell, when a group fight breaks out among inmates and someone triggers the alarm device, prison guards contact the inmates in the cell through the intercom system and issue a deterrent warning to keep the incident from escalating.

Intercom systems in the prior art all rely on a person actively triggering the intercom contact, and a request may not be answered when many requests are pending, so such systems handle events with a lag. In addition, staff at the dispatching desk often receive a large number of call requests within a short time, so the dispatching desk has to assign dedicated staff to arrange the order in which these requests are handled and to direct suitable staff to handle each one, which undoubtedly adds to the staff's workload. Moreover, a large number of call requests inevitably includes invalid ones. For example, a bank security center answers many inquiries every day whose content often falls outside its area of responsibility, yet staff are required by regulation to answer them. Staff therefore waste considerable time on unrelated events, which lowers their efficiency and indirectly increases their workload.

No effective solution has yet been proposed for the above technical problem that traditional intercom systems handle events with a lag and impose a heavy workload on staff.

Summary of the invention

In view of this, the purpose of the present invention is to provide an intercom method, apparatus and system, so as to alleviate the technical problem that traditional intercom systems handle events with a lag and impose a heavy workload on staff.

In a first aspect, an embodiment of the present invention provides an intercom method, including:

obtaining first request information of an intercom connection requested at the current moment, where the first request information is audio request information;

performing sound analysis on the audio request information to obtain a sound analysis result, where the sound analysis includes at least one of: speech analysis, voiceprint analysis, volume analysis;

determining, according to the sound analysis result, a first response priority of the requested intercom connection, so that response personnel respond to the requested intercom connection based on the first response priority.

With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, where, when the sound analysis includes voiceprint analysis, performing sound analysis on the audio request information to obtain the sound analysis result includes:

extracting a human voice signal from the audio request information;

processing the human voice signal to obtain a vector signal to be verified;

obtaining a sound feature signal stored in advance in a voiceprint database, where the sound feature signal is a vector signal obtained by processing the voice of a target speaker in advance;

comparing the vector signal to be verified with the sound feature signal to obtain the matching degree between them, and using the matching degree as the sound analysis result.

With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, where extracting the human voice signal from the audio request information includes:

filtering out the high-frequency signal and the low-frequency signal in the audio signal using an adaptive echo cancellation algorithm to obtain an intermediate signal, where the high-frequency signal is a signal whose frequency is above the voice frequency band and the low-frequency signal is a signal whose frequency is below the voice frequency band;

reducing the background noise signal in the intermediate signal to obtain the human voice signal, where the frequency of the background noise signal lies within the voice frequency band.

With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a third possible implementation of the first aspect, where processing the human voice signal to obtain the vector signal to be verified includes:

performing speech recognition on the human voice signal to obtain the words represented by the human voice signal, and extracting a keyword from the words according to their semantics, where the keyword indicates the core meaning expressed by the words;

extracting, from the human voice signal, the target human voice signal corresponding to the keyword;

processing the target human voice signal to obtain the vector signal to be verified.

With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, where, before obtaining the sound feature signal stored in advance in the voiceprint database, the method further includes:

obtaining a reference audio signal, where the reference audio signal is the audio signal corresponding to the voice of the target speaker;

processing the reference audio signal to obtain the sound feature signal;

storing the sound feature signal in the voiceprint database.

With reference to the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, where, when the sound analysis includes speech analysis, performing sound analysis on the audio request information to obtain the sound analysis result includes:

extracting a human voice signal from the audio request information;

performing speech recognition on the human voice signal to obtain the words represented by the human voice signal;

querying whether the words include a word to be detected, and using the query result as the sound analysis result.

With reference to the first aspect, an embodiment of the present invention provides a sixth possible implementation of the first aspect, where, when the sound analysis includes volume analysis, performing sound analysis on the audio request information to obtain the sound analysis result includes:

extracting a human voice signal from the audio request information;

detecting the volume decibel value of the human voice represented by the human voice signal;

comparing the volume decibel value with a volume threshold to obtain the difference between them, and using the difference as the sound analysis result.

With reference to the first aspect, an embodiment of the present invention provides a seventh possible implementation of the first aspect, where the intercom method further includes: obtaining second request information of the intercom connection requested at the current moment, where the second request information is image request information;

extracting image information to be verified from the image request information, and comparing the image information to be verified with image feature information to obtain an image comparison result, where the image feature information is reference image information stored in an image database;

determining, according to the image comparison result, a second response priority of the requested intercom connection, so that response personnel respond to the requested intercom connection based on the first response priority and the second response priority.

In a second aspect, an embodiment of the present invention further provides an intercom apparatus, including:

a first acquisition module, configured to obtain first request information of an intercom connection requested at the current moment, where the first request information is audio request information;

an analysis module, configured to perform sound analysis on the audio request information to obtain a sound analysis result, where the sound analysis includes at least one of: speech analysis, voiceprint analysis, volume analysis;

a first determining module, configured to determine, according to the sound analysis result, a first response priority of the requested intercom connection, so that response personnel respond to the requested intercom connection based on the first response priority.

In a third aspect, an embodiment of the present invention further provides an intercom system, including a terminal device and a dispatching desk, where

the terminal device is configured to collect first request information of an intercom connection requested at the current moment, where the first request information is audio request information; the dispatching desk is connected to the terminal device and is configured to execute the intercom method described in the first aspect.

The embodiments of the present invention bring the following beneficial effects. First request information of an intercom connection requested at the current moment is obtained, where the first request information is audio request information; sound analysis is performed on the audio request information to obtain a sound analysis result, where the sound analysis includes at least one of: speech analysis, voiceprint analysis, volume analysis; then a first response priority of the requested intercom connection is determined according to the sound analysis result. Response personnel can therefore respond to the requested intercom connection based on the first response priority, which removes the scheduling step for answering intercom connection requests. Moreover, because the response priority of each requested intercom connection is fully taken into account, response personnel answer higher-priority intercom connection requests first. This alleviates the technical problem that traditional intercom systems handle events with a lag and impose a heavy workload on staff.

Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description, or be understood by practicing the invention. The objects and other advantages of the invention are realized and attained by the structures particularly pointed out in the description, the claims and the drawings.

To make the above objects, features and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Brief description of the drawings

To describe the technical solutions in the specific embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of an intercom method provided by Embodiment 1 of the present invention;

Fig. 2 is a flowchart of a method for performing sound analysis on audio request information provided by Embodiment 1 of the present invention;

Fig. 3 is a flowchart of a method for extracting a human voice signal from audio request information provided by Embodiment 1 of the present invention;

Fig. 4 is a flowchart of a method for processing a human voice signal to obtain a vector signal to be verified, provided by Embodiment 1 of the present invention;

Fig. 5 is a flowchart of another intercom method provided by Embodiment 1 of the present invention;

Fig. 6 is a flowchart of another method for performing sound analysis on audio request information provided by Embodiment 1 of the present invention;

Fig. 7 is a flowchart of another method for performing sound analysis on audio request information provided by Embodiment 1 of the present invention;

Fig. 8 is a flowchart of another intercom method provided by Embodiment 1 of the present invention;

Fig. 9 is a structural block diagram of an intercom apparatus provided by Embodiment 2 of the present invention;

Fig. 10 is a structural block diagram of an analysis module provided by Embodiment 2 of the present invention;

Fig. 11 is a structural block diagram of a first extraction unit provided by Embodiment 2 of the present invention;

Fig. 12 is a structural block diagram of a first processing unit provided by Embodiment 2 of the present invention;

Fig. 13 is a structural block diagram of another intercom apparatus provided by Embodiment 2 of the present invention;

Fig. 14 is a structural block diagram of an intercom system provided by Embodiment 3 of the present invention.

Reference numerals: 100 - first acquisition module; 200 - analysis module; 201 - first extraction unit; 2011 - filtering subunit; 2012 - reduction subunit; 202 - first processing unit; 2021 - recognition subunit; 2022 - extraction subunit; 2023 - processing subunit; 203 - first acquisition unit; 204 - comparing unit; 205 - second acquisition unit; 206 - second processing unit; 207 - storage unit; 208 - second extraction unit; 209 - recognition unit; 210 - query unit; 211 - third extraction unit; 212 - detection unit; 213 - comparing unit; 300 - first determining module; 400 - second acquisition module; 500 - extraction module; 600 - second determining module; 700 - terminal device; 800 - dispatching desk.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

At present, an intercom system establishes intercom contact only when a person actively triggers it, and staff also have to schedule the work of answering call requests. Event handling therefore lags, and staff bear a heavy workload. In view of this, the intercom method, apparatus and system provided by the embodiments of the present invention can alleviate the technical problem that traditional intercom systems handle events with a lag and impose a heavy workload on staff.

Embodiment 1

An intercom method provided by an embodiment of the present invention, as shown in Fig. 1, includes:

Step S102: obtain first request information of an intercom connection requested at the current moment, where the first request information is audio request information;

Step S104: perform sound analysis on the audio request information to obtain a sound analysis result, where the sound analysis includes at least one of: speech analysis, voiceprint analysis, volume analysis;

Step S106: determine, according to the sound analysis result, a first response priority of the requested intercom connection, so that response personnel respond to the requested intercom connection based on the first response priority.

It should be noted that the intercom method is applied to a dispatching desk connected to a terminal device. The terminal device collects the first request information of the intercom connection requested at the current moment; the dispatching desk determines the first response priority of the requested intercom connection from that first request information; and the response personnel at the dispatching desk respond to the requested intercom connection according to the first response priority.

It should be noted that the first request information here may be audio information about the terminal device's own surroundings, which the terminal device actively collects at preset intervals.

In the intercom method provided by the embodiment of the present invention, the first response priority of the requested intercom connection is determined from the first request information of the intercom connection requested at the current moment. Response personnel can therefore respond to the requested intercom connection based on the first response priority, which removes the scheduling step for answering intercom connection requests. Moreover, because the response priority of each requested intercom connection is fully taken into account, response personnel answer higher-priority intercom connection requests first. This alleviates the technical problem that traditional intercom systems handle events with a lag and impose a heavy workload on staff.

Specifically, once the first response priority reaches a preset priority, the response personnel may be called automatically, without anyone having to call them manually to obtain a timely response, which helps alleviate the lag with which traditional intercom systems handle events. When the first response priority is below the preset priority, response personnel can respond to the requested intercom connections in order of their first response priority, so response tasks no longer need to be assigned manually. This reduces the workload of staff and alleviates the technical problem that traditional intercom systems impose a heavy workload on staff.
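The two cases above (automatic calling at or above the preset priority, ordered answering below it) can be sketched as a small priority queue. This is only an illustrative sketch: the class name `DispatchQueue`, the numeric priority scale, the value of `PRESET_PRIORITY`, and the first-come-first-served tie-break are all assumptions that do not come from the text.

```python
import heapq
import itertools

# Hypothetical threshold; the text only speaks of a "preset priority".
PRESET_PRIORITY = 8


class DispatchQueue:
    """Toy model of a dispatching desk that orders intercom requests."""

    def __init__(self, preset_priority=PRESET_PRIORITY):
        self.preset_priority = preset_priority
        self.auto_called = []              # requests that triggered an automatic call
        self._heap = []                    # pending requests below the preset priority
        self._counter = itertools.count()  # tie-breaker: earlier requests first

    def submit(self, terminal_id, priority):
        """Register a request together with its first response priority."""
        if priority >= self.preset_priority:
            # At or above the preset priority: call the response personnel at once.
            self.auto_called.append(terminal_id)
            return "auto_call"
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), terminal_id))
        return "queued"

    def next_request(self):
        """Return the pending request that should be answered next, or None."""
        if not self._heap:
            return None
        _, _, terminal_id = heapq.heappop(self._heap)
        return terminal_id
```

With this structure the response personnel simply take `next_request()` whenever they are free, so no one has to assign response tasks by hand.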

In an optional implementation of the embodiment of the present invention, as shown in Fig. 2, when the sound analysis includes voiceprint analysis, performing sound analysis on the audio request information to obtain the sound analysis result includes the following steps:

Step S201: extract a human voice signal from the audio request information;

Step S202: process the human voice signal to obtain a vector signal to be verified;

Step S203: obtain a sound feature signal stored in advance in a voiceprint database, where the sound feature signal is a vector signal obtained by processing the voice of a target speaker in advance;

Step S204: compare the vector signal to be verified with the sound feature signal to obtain the matching degree between them, and use the matching degree as the sound analysis result.

Specifically, a vector signal contains the amplitude and phase of the voiceprint; its features are easier to store and easier to retrieve for comparison. Processing one human voice signal may yield multiple vector signals to be verified. Each vector signal to be verified is compared with the sound feature signal, and the matching degrees obtained from the individual comparisons are then combined as the sound analysis result.
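As a rough illustration of comparing several vector signals to be verified against one stored feature and combining the results, the sketch below uses cosine similarity as the matching degree and the arithmetic mean as the combination. Both choices are assumptions made for the example; the text does not specify the similarity measure or how the matching degrees are combined, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def voiceprint_match(vectors_to_verify, enrolled_feature):
    """Compare every vector to be verified with the stored feature and average the scores."""
    scores = [cosine_similarity(v, enrolled_feature) for v in vectors_to_verify]
    return sum(scores) / len(scores) if scores else 0.0
```

A real system would derive the vectors from a trained speaker model; here they are plain lists of numbers so that the comparison step can be read in isolation.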

The embodiment of the present invention uses voiceprint analysis to perform sound analysis on the audio request information. First, voiceprint recognition is independent of the language spoken and of dialect or accent, and its reliability is comparable to that of other biometric technologies (for example, fingerprint, palm shape and iris). In addition, sound is collected without contact, and sound collection devices are inexpensive: connecting a sound pickup to a terminal device of the intercom system is enough to collect sound, and sound signals are easy to transmit and obtain remotely, so collection is convenient. Furthermore, voiceprint analysis does not involve privacy issues and suits a wide range of people. The intercom method provided by the embodiment of the present invention therefore has a wide scope of application and good applicability.

In another optional implementation of the embodiment of the present invention, as shown in Fig. 3, extracting the human voice signal from the audio request information includes the following steps:

Step S301: filter out the high-frequency signal and the low-frequency signal in the audio signal using an adaptive echo cancellation algorithm to obtain an intermediate signal, where the high-frequency signal is a signal whose frequency is above the voice frequency band and the low-frequency signal is a signal whose frequency is below the voice frequency band;

Step S302: perform a gain operation on the intermediate signal and reduce the background noise signal in the intermediate signal to obtain the human voice signal, where the frequency of the background noise signal lies within the voice frequency band.

Specifically, the frequency of sound produced by a person lies within the voice frequency range. In this embodiment, the lower limit of the voice frequency band may be set to 20 Hz and the upper limit to 20 kHz.

In addition, filtering out the high-frequency and low-frequency signals in the audio signal with the adaptive echo cancellation algorithm works as follows: embedded software on a DSP chip processes the audio signal to obtain its frequency values, and then directly filters out the components of the audio signal that lie outside the voice frequency band. The embedded software here is written according to the signal-filtering principle of the adaptive echo cancellation algorithm. For that principle, existing echo cancellation algorithms in the prior art may be used, and details are not repeated here.

Moreover, the intermediate signal is still mixed with a background noise signal at this point, and the voiceprint features of the background noise partly overlap with those of the human voice, so the background noise signal in the intermediate signal needs to be reduced to obtain the human voice signal. Specifically, the intermediate signal may be transformed with a discrete Fourier transform; then, in the transformed signal, any component whose continuous duration in time is shorter than a preset duration is identified as the background noise signal and filtered out, yielding the human voice signal.
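The out-of-band filtering of step S301 can be pictured with a plain frequency-domain mask. The sketch below is not the DSP firmware or the adaptive echo cancellation algorithm mentioned above; it swaps in a naive discrete Fourier transform that zeroes every bin outside a chosen band, purely to show what "filtering out signals above and below the voice frequency band" means. The function names are hypothetical, and the duration-based noise reduction of step S302 is not covered.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def keep_band(samples, sample_rate, low_hz, high_hz):
    """Zero every frequency bin outside [low_hz, high_hz] and transform back."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies of a real signal.
        freq = min(k, n - k) * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0
    return idft(spectrum)
```

For the 20 Hz to 20 kHz band given in the text, the sampling rate would have to exceed 40 kHz, and a practical implementation would use a fast Fourier transform or a time-domain filter rather than this quadratic-time version.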

In another optional implementation of the embodiment of the present invention, as shown in Fig. 4, processing the human voice signal to obtain the vector signal to be verified includes the following steps:

Step S401: perform speech recognition on the human voice signal to obtain the words represented by the human voice signal, and extract a keyword from the words according to their semantics, where the keyword indicates the core meaning expressed by the words. For example, the keyword "robbery" in first request information collected in a bank business hall indicates that the first request information is a request for help.

Step S402: extract, from the human voice signal, the target human voice signal corresponding to the keyword.

Step S403: process the target human voice signal to obtain the vector signal to be verified.

In this embodiment, the keyword indicates the core meaning expressed by the words, so determining the first response priority from the target human voice signal corresponding to the keyword allows the priority of the requested intercom connection to be determined more accurately.
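Steps S401 and S402 can be sketched under the assumption that a speech recognizer has already produced words with start and end times. The recognizer itself, the `(word, start_s, end_s)` tuple layout, the `KEYWORDS` set, and the function name are hypothetical and only serve to show how the audio belonging to a keyword might be cut out of the human voice signal.

```python
# Hypothetical keyword list; the text gives "robbery" as one example.
KEYWORDS = {"robbery", "help"}


def extract_target_segments(samples, sample_rate, timed_words, keywords=KEYWORDS):
    """Return (keyword, audio slice) pairs for every recognized keyword.

    timed_words is assumed to be a list of (word, start_seconds, end_seconds)
    tuples produced by some speech recognizer that is not shown here.
    """
    segments = []
    for word, start_s, end_s in timed_words:
        if word.lower() in keywords:
            start = int(start_s * sample_rate)
            end = int(end_s * sample_rate)
            segments.append((word.lower(), samples[start:end]))
    return segments
```

Each returned slice plays the role of the target human voice signal, which step S403 would then turn into a vector signal to be verified.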

In another optional implementation of the embodiment of the present invention, as shown in Fig. 5, before obtaining the sound feature signal stored in advance in the voiceprint database, the intercom method further includes the following steps:

Step S501: obtain a reference audio signal, where the reference audio signal is the audio signal corresponding to the voice of the target speaker;

Step S502: process the reference audio signal to obtain the sound feature signal;

Step S503: store the sound feature signal in the voiceprint database.

In this embodiment, the voice of the target speaker is first processed to obtain the sound feature signal, and the vector signal to be verified is then compared with the sound feature signal. The audio signal in the first request information is identified using a deep learning method, which realizes automatic discrimination by machine with a relatively high degree of intelligence. Moreover, by collecting sound feature signals for more target speakers, the voiceprint database holds more sound features, so the audio signal in the first request information is identified more accurately.

The system can continuously and automatically optimize the collected audio and video signal data and enlarge the voiceprint library and the behavior recognition library; the more the system is used, the more accurate the discrimination becomes.
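The enrollment flow of steps S501 to S503 can be sketched as a small in-memory store. The class name `VoiceprintDatabase`, the dictionary storage, and the idea of passing the feature extractor in as a callable are assumptions for illustration; the text says only that the reference audio is processed (with a deep learning method) and the resulting feature is stored.

```python
class VoiceprintDatabase:
    """Minimal in-memory stand-in for the voiceprint database."""

    def __init__(self, extract_feature):
        # extract_feature is a placeholder for the real processing step (S502),
        # for example a trained speaker-embedding model.
        self._extract_feature = extract_feature
        self._features = {}

    def enroll(self, speaker_id, reference_audio):
        """S501-S503: process the reference audio and store the resulting feature."""
        feature = self._extract_feature(reference_audio)
        self._features[speaker_id] = feature
        return feature

    def get(self, speaker_id):
        """Return the stored feature for a speaker, or None if not enrolled."""
        return self._features.get(speaker_id)

    def speakers(self):
        """List the enrolled speaker identifiers in sorted order."""
        return sorted(self._features)
```

Enrolling more target speakers simply adds entries to the store, which matches the remark that a larger voiceprint library improves identification.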

In another optional implementation of the embodiment of the present invention, as shown in Fig. 6, when the sound analysis includes speech analysis, performing sound analysis on the audio request information to obtain the sound analysis result includes the following steps:

Step S601: extract a human voice signal from the audio request information;

Step S602: perform speech recognition on the human voice signal to obtain the words represented by the human voice signal;

Step S603: query whether the words include a word to be detected, and use the query result as the sound analysis result.

In this embodiment, the first response priority is determined from the semantics of the words obtained by speech recognition. For example, in an environment such as a bank business hall, if words to be detected such as "robbery" or "help" are found among the words, the response priority of the requested intercom connection may be set to a higher priority.
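Step S603 and the resulting priority decision can be sketched as a dictionary lookup over the recognized words. The mapping `WORDS_TO_DETECT`, its numeric levels, the default level, and both function names are hypothetical; the text only states that finding such words leads to a higher priority.

```python
# Hypothetical words to be detected and the priority each one implies.
WORDS_TO_DETECT = {"robbery": 9, "help": 8}


def speech_analysis(recognized_words, words_to_detect=WORDS_TO_DETECT):
    """Return the detected words, in order of appearance (the query result)."""
    return [w.lower() for w in recognized_words if w.lower() in words_to_detect]


def priority_from_speech(hits, words_to_detect=WORDS_TO_DETECT, default=1):
    """Map the query result to a first response priority (highest match wins)."""
    return max((words_to_detect[w] for w in hits), default=default)
```

For example, the recognized words `["Please", "HELP", "a", "robbery"]` yield the hits `["help", "robbery"]`, and the sketch assigns the larger of the two configured levels.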

In another optional implementation of the embodiment of the present invention, as shown in Fig. 7, when the sound analysis includes volume analysis, performing sound analysis on the audio request information to obtain the sound analysis result includes the following steps:

Step S701: extract a human voice signal from the audio request information;

Step S702: detect the volume decibel value of the human voice represented by the human voice signal;

Step S703: compare the volume decibel value with a volume threshold to obtain the difference between them, and use the difference as the sound analysis result.

In this embodiment, the response priority of the requested intercom connection is determined from the volume decibel value of the human voice. For example, if the decibel value of the human voice in a prison cell exceeds the volume threshold, it can be determined that a group fight has broken out among the inmates in the cell, and the response priority of such a received intercom request is set to a higher priority.
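Steps S702 and S703 can be sketched with a root-mean-square level expressed in decibels. The choice of RMS, the `reference` amplitude, and the function names are assumptions; with normalized samples the value is a level relative to full scale rather than a calibrated sound pressure level, which a real terminal would need a calibrated pickup to report.

```python
import math


def volume_db(samples, reference=1.0):
    """RMS level of the samples in decibels relative to `reference`."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / reference)


def volume_analysis(samples, threshold_db, reference=1.0):
    """Difference between the measured level and the volume threshold (S703)."""
    return volume_db(samples, reference) - threshold_db
```

A positive difference means the voice is louder than the threshold, which in the prison example would push the request toward a higher response priority.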

In another optional implementation of the embodiment of the present invention, as shown in Fig. 8, the intercom method further includes the following steps:

Step S801: obtain second request information of the intercom connection requested at the current moment, where the second request information is image request information;

Step S802: extract image information to be verified from the image request information, and compare the image information to be verified with image feature information to obtain an image comparison result, where the image feature information is reference image information stored in an image database;

Step S803: determine, according to the image comparison result, a second response priority of the requested intercom connection, so that response personnel respond to the requested intercom connection based on the first response priority and the second response priority.

In this embodiment of the present invention, the response personnel respond to the intercommunication connection request by combining the first response priority determined from the audio request information with the second response priority determined from the image request information, and can therefore respond to intercommunication connection requests in a more timely and reasonable manner.
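The text does not state how the two priorities are combined. As one possible illustration only, the sketch below assumes that priorities are numbers where larger means more urgent, and that a request is treated as being as urgent as its most urgent signal; `combined_priority` is a hypothetical helper, not a rule taken from the patent.

```python
def combined_priority(first_priority, second_priority=None):
    """Combine the audio-based and image-based response priorities.

    Assumed policy: take the larger of the two. When no image request
    information was received, fall back to the audio-based priority alone.
    """
    if second_priority is None:
        return first_priority
    return max(first_priority, second_priority)
```

Other policies, such as a weighted sum, would fit the description equally well; the choice depends on how much weight the operator gives each source.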

Embodiment two

An embodiment of the present invention provides an intercommunication apparatus which, as shown in Fig. 9, includes:

A first acquisition module 100, configured to obtain first request information of an intercommunication connection request at the current time, wherein the first request information is audio request information;

An analysis module 200, configured to perform phonetic analysis on the audio request information to obtain a phonetic analysis result, wherein the phonetic analysis includes at least one of the following: speech analysis, voiceprint analysis, volume analysis;

A first determining module 300, configured to determine a first response priority of the intercommunication connection request according to the phonetic analysis result, so that response personnel respond to the intercommunication connection request based on the first response priority.

In this embodiment of the present invention, the first acquisition module 100 obtains the first request information of the intercommunication connection request at the current time, wherein the first request information is audio request information; the analysis module 200 performs phonetic analysis on the audio request information to obtain a phonetic analysis result; and the first determining module 300 then determines the first response priority of the intercommunication connection request according to the phonetic analysis result. The response personnel can therefore respond to the intercommunication connection request based on the first response priority, which removes the scheduling step from the work of responding to intercommunication connection requests. Moreover, because the response priority of each request is fully taken into account when the response personnel respond on this basis, higher-priority intercommunication connection requests are answered first. This alleviates the technical problems that traditional intercom systems handle events with a lag and impose a heavy workload on staff.
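The effect of answering higher-priority requests first can be sketched with a priority queue. This is an illustrative sketch, not the patented implementation: it assumes numeric priorities where larger means more urgent, and the class name `RequestQueue` and the request identifiers are hypothetical.

```python
import heapq
import itertools


class RequestQueue:
    """Pending intercommunication requests, most urgent first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: arrival order

    def push(self, request_id, priority):
        # heapq is a min-heap, so the priority is negated to pop the
        # largest priority first; the counter keeps equal priorities FIFO.
        heapq.heappush(self._heap, (-priority, next(self._counter), request_id))

    def pop(self):
        """Remove and return the request that should be answered next."""
        _, _, request_id = heapq.heappop(self._heap)
        return request_id

    def __len__(self):
        return len(self._heap)
```

For example, after pushing a priority-1 request and then a priority-5 request, `pop()` returns the priority-5 request first, so the operator does not have to sort the pending calls by hand.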

In an optional implementation of this embodiment of the present invention, as shown in Fig. 10, the analysis module 200 includes:

A first extraction unit 201, configured to extract a human voice signal from the audio request information where the phonetic analysis includes voiceprint analysis;

A first processing unit 202, configured to process the human voice signal to obtain a vector signal to be verified;

A first acquisition unit 203, configured to obtain a sound characteristic signal stored in advance in a voiceprint database, wherein the sound characteristic signal is a vector signal obtained by processing the voice of a target speaker in advance;

A comparing unit 204, configured to compare the vector signal to be verified with the sound characteristic signal to obtain the matching degree between the vector signal to be verified and the sound characteristic signal, and to use the matching degree as the phonetic analysis result.
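The comparison performed by unit 204 can be sketched as follows. The patent does not name a similarity measure, so this example assumes cosine similarity between two equal-length vectors as the matching degree; the function names and the dictionary-shaped voiceprint database are hypothetical.

```python
import math


def matching_degree(vector_to_verify, sound_feature):
    """Cosine similarity between the two vectors (assumed metric)."""
    if len(vector_to_verify) != len(sound_feature):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(vector_to_verify, sound_feature))
    norm_a = math.sqrt(sum(a * a for a in vector_to_verify))
    norm_b = math.sqrt(sum(b * b for b in sound_feature))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def best_match(vector_to_verify, voiceprint_database):
    """Return (speaker_id, degree) for the closest stored voiceprint.

    `voiceprint_database` maps a speaker id to its stored feature vector
    and must not be empty.
    """
    scored = (
        (speaker_id, matching_degree(vector_to_verify, feature))
        for speaker_id, feature in voiceprint_database.items()
    )
    return max(scored, key=lambda pair: pair[1])
```

A matching degree close to 1 suggests the caller is a stored target speaker, which the determining module could then translate into a response priority.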

In an optional implementation of this embodiment of the present invention, as shown in Fig. 11, the first extraction unit 201 includes:

A filtering subunit 2011, configured to filter out the high-frequency signal and the low-frequency signal in the audio signal using an adaptive acoustic echo cancellation algorithm to obtain an intermediate signal, wherein the high-frequency signal is a signal whose frequency is higher than the voice frequency band, and the low-frequency signal is a signal whose frequency is lower than the voice frequency band;

A reduction subunit 2012, configured to reduce the ambient noise signal in the intermediate signal to obtain the human voice signal, wherein the frequency of the ambient noise signal lies within the voice frequency band.
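The band-limiting performed by subunit 2011 can be illustrated with a simple frequency-domain mask. Note that this sketch substitutes a plain discrete Fourier transform band-pass for the adaptive echo cancellation algorithm named in the text, and it assumes the voice frequency band is the conventional telephony range of 300 Hz to 3400 Hz; the patent does not give band limits. It uses a naive O(n²) transform and is meant for short frames only.

```python
import cmath
import math


def band_limit(samples, sample_rate, low_hz=300.0, high_hz=3400.0):
    """Zero every frequency component outside [low_hz, high_hz]."""
    n = len(samples)
    # Forward DFT of the frame.
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins above n/2 represent negative frequencies, so fold them back.
        freq = min(k, n - k) * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0
    # Inverse DFT; the mask is symmetric, so the result is real up to rounding.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]
```

The output corresponds to the intermediate signal; removing in-band ambient noise, as subunit 2012 does, would need a separate technique and is not shown here.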

In an optional implementation of this embodiment of the present invention, as shown in Fig. 12, the first processing unit 202 includes:

A recognition subunit 2021, configured to perform speech recognition on the human voice signal to obtain the text represented by the human voice signal, and to extract a keyword from the text according to the semantics of the text, wherein the keyword expresses the core meaning of the text;

An extraction subunit 2022, configured to extract, from the human voice signal, the target human voice signal corresponding to the keyword;

A processing subunit 2023, configured to process the target human voice signal to obtain the vector signal to be verified.

In another optional implementation of this embodiment of the present invention, as shown in Fig. 10, the analysis module 200 further includes:

A second acquisition unit 205, configured to obtain a reference audio signal before the sound characteristic signal stored in advance in the voiceprint database is obtained, wherein the reference audio signal is the audio signal corresponding to the voice of the target speaker;

A second processing unit 206, configured to process the reference audio signal to obtain the sound characteristic signal;

A storage unit 207, configured to store the sound characteristic signal in the voiceprint database;

A second extraction unit 208, configured to extract a human voice signal from the audio request information where the phonetic analysis includes speech analysis;

A recognition unit 209, configured to perform speech recognition on the human voice signal to obtain the text represented by the human voice signal;

A query unit 210, configured to query whether the text contains a word to be detected, and to use the query result as the phonetic analysis result.
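The query performed by unit 210 can be sketched as a lookup over the recognized text. This is a minimal sketch that assumes speech recognition has already produced a string and that a case-insensitive substring match is sufficient; the function name and the example word list are hypothetical.

```python
def query_words(recognized_text, words_to_detect):
    """Return the words to be detected that occur in the recognized text.

    The result keeps the order of `words_to_detect`; an empty list means
    that none of the words were found.
    """
    lowered = recognized_text.casefold()
    return [word for word in words_to_detect if word.casefold() in lowered]
```

For instance, `query_words("Help, there is a fight in cell 3", ["fight", "fire"])` finds `"fight"`, and that result could then be mapped to a higher response priority.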

In another optional implementation of this embodiment of the present invention, as shown in Fig. 10, the analysis module 200 includes:

A third extraction unit 211, configured to extract a human voice signal from the audio request information where the phonetic analysis includes volume analysis;

A detection unit 212, configured to detect the volume decibel value of the voice represented by the human voice signal;

A comparing unit 213, configured to compare the volume decibel value with a volume threshold to obtain the difference between the volume decibel value and the volume threshold, and to use the difference as the phonetic analysis result.

It should be noted that the first extraction unit 201, the second extraction unit 208 and the third extraction unit 211 may be one and the same unit, which performs the step of extracting the human voice signal from the audio request information.

In another optional implementation of this embodiment of the present invention, as shown in Fig. 13, the intercommunication apparatus further includes:

A second acquisition module 400, configured to obtain second request information of the intercommunication connection request at the current time, wherein the second request information is image request information;

An extraction module 500, configured to extract image information to be verified from the image request information, and to compare the image information to be verified with image feature information to obtain an image comparison result, wherein the image feature information is reference image information stored in an image database;

A second determining module 600, configured to determine a second response priority of the intercommunication connection request according to the image comparison result, so that the response personnel respond to the intercommunication connection request based on the first response priority and the second response priority.

Specifically, the image feature information is reference image information stored in the image database; it is extracted in advance, by a deep learning method, from images of particular concern. In this embodiment of the present invention, the comparison of the image information to be verified with the image feature information is likewise implemented by a deep learning method, which makes the machine's determination of the second response priority more automatic and intelligent.

Embodiment three

An embodiment of the present invention provides an intercom system which, as shown in Fig. 14, includes a terminal device 700 and a dispatching desk 800, wherein

The terminal device 700 is configured to collect first request information of an intercommunication connection request at the current time, wherein the first request information is audio request information;

The dispatching desk 800 is connected to the terminal device 700 and is configured to execute the intercommunication method of Embodiment One.

Specifically, the intercom system may further include a cloud server connected to the dispatching desk 800. The sound characteristic signal and the image feature information may be stored in the cloud server. The dispatching desk 800 reads the sound characteristic signal from the cloud server in order to compare the vector signal to be verified with the sound characteristic signal, or reads the image feature information from the cloud server in order to compare the image information to be verified with the image feature information.

In addition, multiple terminal devices 700 may be provided. The dispatching desk 800 and the multiple terminal devices 700 are connected using an existing distributed network architecture, and the dispatching desk 800 is connected to each terminal device 700.

In this embodiment of the present invention, the dispatching desk 800 executes the intercommunication method of Embodiment One, which alleviates the technical problems that traditional intercom systems handle events with a lag and impose a heavy workload on staff.

The intercom system of this embodiment of the present invention is well suited to the security industry. It can substantially improve the safety, reliability, timeliness and interactivity of that industry, and can help promote the long-term healthy development of the security industry as a whole.

The computer program product of the intercommunication method, apparatus and system provided by the embodiments of the present invention includes a computer-readable storage medium storing program code. The instructions contained in the program code can be used to execute the method described in the foregoing method embodiment; for the specific implementation, refer to the method embodiment, which is not repeated here.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and apparatus described above may be found in the corresponding processes of the foregoing method embodiment, and are not repeated here.

In addition, in the description of the embodiments of the present invention, unless otherwise expressly specified and limited, the terms "installed", "connected to" and "connected" are to be understood broadly. For example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through an intermediary, or an internal connection between two elements. Those of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the specific circumstances.

If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

In the description of the present invention, it should be noted that orientation or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientation or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they are therefore not to be construed as limiting the present invention.

In addition, the terms "first", "second" and "third" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance.

Finally, it should be noted that the embodiments described above are merely specific implementations of the present invention, intended to illustrate its technical solution rather than to limit it, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone skilled in the art may, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of their technical features. Such modifications, variations or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. An intercommunication method, characterized by comprising:

Obtaining first request information of an intercommunication connection request at the current time, wherein the first request information is audio request information;

Performing phonetic analysis on the audio request information to obtain a phonetic analysis result;

Determining a first response priority of the intercommunication connection request according to the phonetic analysis result, so that response personnel respond to the intercommunication connection request based on the first response priority;

Wherein the phonetic analysis includes voiceprint analysis;

Where the phonetic analysis includes voiceprint analysis, performing phonetic analysis on the audio request information to obtain a phonetic analysis result comprises:

Extracting a human voice signal from the audio request information;

Processing the human voice signal to obtain a vector signal to be verified;

Obtaining a sound characteristic signal stored in advance in a voiceprint database, wherein the sound characteristic signal is a vector signal obtained by processing the voice of a target speaker in advance;

Comparing the vector signal to be verified with the sound characteristic signal to obtain the matching degree between the vector signal to be verified and the sound characteristic signal, and using the matching degree as the phonetic analysis result;

Processing the human voice signal to obtain a vector signal to be verified comprises:

Performing speech recognition on the human voice signal to obtain the text represented by the human voice signal, and extracting a keyword from the text according to the semantics of the text, wherein the keyword expresses the core meaning of the text;

Processing the target human voice signal to obtain the vector signal to be verified;

Before obtaining the sound characteristic signal stored in advance in the voiceprint database, the method further comprises:

Obtaining a reference audio signal, wherein the reference audio signal is the audio signal corresponding to the voice of the target speaker;

Storing the sound characteristic signal in the voiceprint database;

The phonetic analysis may further include at least one of speech analysis and volume analysis;

Where the phonetic analysis includes speech analysis, performing phonetic analysis on the audio request information to obtain a phonetic analysis result comprises:

Extracting a human voice signal from the audio request information;

Querying whether the text contains a word to be detected, and using the query result as the phonetic analysis result;

Where the phonetic analysis includes volume analysis, performing phonetic analysis on the audio request information to obtain a phonetic analysis result comprises:

Extracting a human voice signal from the audio request information;

Comparing the volume decibel value with a volume threshold to obtain the difference between the volume decibel value and the volume threshold, and using the difference as the phonetic analysis result.

2. The method according to claim 1, characterized in that extracting a human voice signal from the audio request information comprises:

Filtering out the high-frequency signal and the low-frequency signal in the audio signal using an adaptive acoustic echo cancellation algorithm to obtain an intermediate signal, wherein the high-frequency signal is a signal whose frequency is higher than the voice frequency band, and the low-frequency signal is a signal whose frequency is lower than the voice frequency band;

Reducing the ambient noise signal in the intermediate signal to obtain the human voice signal, wherein the frequency of the ambient noise signal lies within the voice frequency band.

3. The method according to claim 1, characterized in that the method further comprises:

Obtaining second request information of the intercommunication connection request at the current time, wherein the second request information is image request information;

Extracting image information to be verified from the image request information, and comparing the image information to be verified with image feature information to obtain an image comparison result, wherein the image feature information is reference image information stored in an image database;

Determining a second response priority of the intercommunication connection request according to the image comparison result, so that the response personnel respond to the intercommunication connection request based on the first response priority and the second response priority.

4. An intercommunication apparatus, characterized by comprising:

An acquisition module, configured to obtain first request information of an intercommunication connection request at the current time, wherein the first request information is audio request information;

An analysis module, configured to perform phonetic analysis on the audio request information to obtain a phonetic analysis result;

A determining module, configured to determine a first response priority of the intercommunication connection request according to the phonetic analysis result, so that response personnel respond to the intercommunication connection request based on the first response priority;

Wherein the phonetic analysis includes voiceprint analysis;

Where the phonetic analysis includes voiceprint analysis, the analysis module includes:

A first extraction unit, configured to extract a human voice signal from the audio request information;

A first processing unit, configured to process the human voice signal to obtain a vector signal to be verified;

A first acquisition unit, configured to obtain a sound characteristic signal stored in advance in a voiceprint database, wherein the sound characteristic signal is a vector signal obtained by processing the voice of a target speaker in advance;

A comparing unit, configured to compare the vector signal to be verified with the sound characteristic signal to obtain the matching degree between the vector signal to be verified and the sound characteristic signal, and to use the matching degree as the phonetic analysis result;

The first processing unit includes:

A recognition subunit, configured to perform speech recognition on the human voice signal to obtain the text represented by the human voice signal, and to extract a keyword from the text according to the semantics of the text, wherein the keyword expresses the core meaning of the text;

An extraction subunit, configured to extract, from the human voice signal, the target human voice signal corresponding to the keyword;

A processing subunit, configured to process the target human voice signal to obtain the vector signal to be verified;

The analysis module further includes:

A second acquisition unit, configured to obtain a reference audio signal before the sound characteristic signal stored in advance in the voiceprint database is obtained, wherein the reference audio signal is the audio signal corresponding to the voice of the target speaker;

A second processing unit, configured to process the reference audio signal to obtain the sound characteristic signal;

A storage unit, configured to store the sound characteristic signal in the voiceprint database;

Where the phonetic analysis includes speech analysis, the analysis module further includes:

A second extraction unit, configured to extract a human voice signal from the audio request information;

A recognition unit, configured to perform speech recognition on the human voice signal to obtain the text represented by the human voice signal;

A query unit, configured to query whether the text contains a word to be detected, and to use the query result as the phonetic analysis result;

Where the phonetic analysis includes volume analysis, the analysis module further includes:

A third extraction unit, configured to extract a human voice signal from the audio request information;

A detection unit, configured to detect the volume decibel value of the voice represented by the human voice signal;

A comparing unit, configured to compare the volume decibel value with a volume threshold to obtain the difference between the volume decibel value and the volume threshold, and to use the difference as the phonetic analysis result.

5. An intercom system, characterized by comprising a terminal device and a dispatching desk, wherein

The terminal device is configured to collect first request information of an intercommunication connection request at the current time, wherein the first request information is audio request information; the dispatching desk is connected to the terminal device and is configured to execute the intercommunication method according to any one of claims 1-3.