WO2020134646A1 - Distributed voice monitoring method, device, system, storage medium, and equipment - Google Patents
- Publication number
- WO2020134646A1 (PCT/CN2019/116774)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio
- review
- machine
- reviewed
- data
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/72—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for transmitting results of analysis
Definitions
- This application relates to the technical field of voice content monitoring. Specifically, this application relates to a distributed voice monitoring method, device, system, computer-readable storage medium, and computer equipment.
- Monitoring and auditing can be achieved through user reports, patrol inspections, or periodic identification of audio data combined with machine identification.
- User reports and patrol inspections offer low monitoring and auditing coverage, a large information lag, and low monitoring and auditing efficiency.
- With machine identification, all audio data must be sent to a central machine identification system.
- The central machine identification system is huge and complex, with high construction costs. It also incurs substantial cross-operator traffic costs, so the input-output ratio is low.
- The present application provides a distributed voice monitoring method and a corresponding device, system, computer-readable storage medium, and computer equipment, through the following technical solutions.
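- In a first aspect, the embodiments of the present application provide a distributed voice monitoring method, including the following steps:
- Audio stream data belonging to the same computer room is acquired; audio data to be reviewed is collected from the audio stream data according to a preset push review strategy; the audio data to be reviewed is input into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and, according to the predicted value, an audio machine review result is generated.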
- In a second aspect, the embodiments of the present application provide a distributed voice monitoring method, including the following steps:
- The service registration and discovery system broadcasts the address information of the machine review system that belongs to the same computer room as the media server;
- The media server sends audio stream data to the machine review system belonging to the same computer room according to the address information;
- The machine review system acquires the audio stream data belonging to the same computer room; collects audio data to be reviewed from the audio stream data according to a preset push review strategy; inputs the audio data to be reviewed into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generates an audio machine review result according to the predicted value.
- In a third aspect, the embodiments of the present application provide a distributed voice monitoring device, including:
- an audio stream data acquisition module, configured to acquire audio stream data belonging to the same computer room;
- a to-be-reviewed audio data collection module, configured to collect audio data to be reviewed from the audio stream data according to a preset push review strategy;
- an audio recognition module, configured to input the audio data to be reviewed into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model;
- a machine review result generation module, configured to generate an audio machine review result according to the predicted value.
- The embodiments of the present application provide a distributed voice monitoring system, including a service registration and discovery system, a media server, and a machine review system, wherein:
- the service registration and discovery system is used to broadcast the address information of the machine review system that belongs to the same computer room as the media server;
- the media server is configured to send audio stream data to the machine review system belonging to the same computer room according to the address information;
- the machine review system is used to acquire the audio stream data belonging to the same computer room; collect audio data to be reviewed from the audio stream data according to a preset push review strategy; input the audio data to be reviewed into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generate an audio machine review result according to the predicted value.
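The interaction between the service registration and discovery system and the media server can be sketched as follows. This is a minimal single-process illustration, not the patent's implementation: the class names `Registry` and `MediaServer`, the room labels, and the address string are all hypothetical, and a real deployment would broadcast over the network.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaServer:
    """Hypothetical media server that remembers the same-room review address."""
    room: str
    review_address: Optional[str] = None

    def on_review_address(self, address: str) -> None:
        self.review_address = address


class Registry:
    """Toy in-memory stand-in for the service registration and discovery system."""

    def __init__(self) -> None:
        self._servers = defaultdict(list)  # room -> registered media servers
        self._review = {}                  # room -> machine review system address

    def register_media_server(self, server: MediaServer) -> None:
        self._servers[server.room].append(server)
        # A server that registers late still learns an address announced earlier.
        if server.room in self._review:
            server.on_review_address(self._review[server.room])

    def register_review_system(self, room: str, address: str) -> None:
        self._review[room] = address
        # Broadcast only to media servers in the same computer room.
        for server in self._servers[room]:
            server.on_review_address(address)


registry = Registry()
ms_a = MediaServer(room="room-A")
ms_b = MediaServer(room="room-B")
registry.register_media_server(ms_a)
registry.register_media_server(ms_b)
registry.register_review_system("room-A", "10.0.0.5:9000")  # made-up address
```

Only `ms_a` receives the address, because the broadcast is scoped to the computer room that the machine review system belongs to.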
- An embodiment of the present application provides a computer-readable storage medium.
- The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the foregoing distributed voice monitoring method is implemented.
- An embodiment of the present application provides a computer device. The computer device includes one or more processors, a memory, and one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute the foregoing distributed voice monitoring method.
- The distributed voice monitoring method, device, system, computer-readable storage medium, and computer equipment provided in this application do not require building a huge and complex central machine review system at high construction cost. Under normal circumstances the audio stream data circulates within its own computer room and generates no cross-computer-room or cross-operator bandwidth traffic, which achieves a higher input-output ratio and significantly reduces the cost of audio content monitoring. The machine review systems cooperate, each performing machine identification review on the audio data to be reviewed that belongs to its own computer room, which enables real-time push review of the very large volume of audio streams produced by highly active voice social applications and supports low-latency monitoring and review. The audit coverage that machine identification review supports is large enough to achieve a high recognition rate and audit efficiency.
- This method can achieve high input-output ratio, high coverage, low latency, high recognition rate and efficient voice monitoring and auditing, and can meet the audio content monitoring requirements of highly active voice social applications in a multi-operator mixed network deployment network environment.
- FIG. 1 is a flowchart of a first distributed voice monitoring method provided by an embodiment of this application.
- FIG. 2 is a flowchart of a second distributed voice monitoring method provided by an embodiment of this application.
- FIG. 3 is a flowchart of a method for processing a Kafka pending message queue provided by an embodiment of the present application.
- FIG. 5 is a flowchart of a fourth distributed voice monitoring method provided by an embodiment of this application.
- FIG. 6 is a flowchart of a fifth distributed voice monitoring method provided by an embodiment of this application.
- FIG. 7 is a schematic structural diagram of a distributed voice monitoring device provided by an embodiment of this application.
- FIG. 8 is a schematic structural diagram of a first distributed voice monitoring system provided by an embodiment of this application.
- FIG. 9 is a schematic structural diagram of a second distributed voice monitoring system provided by an embodiment of this application.
- FIG. 10 is a schematic structural diagram of a third distributed voice monitoring system provided by an embodiment of this application.
- FIG. 11 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
- Voice social application: a social application that uses voice communication as the main means to communicate, make friends, chat, and broadcast live.
- Service registration and discovery system: a system that registers each service process involved in auditing and monitoring, and broadcasts service online and offline notifications to the registered service processes.
- MS (Media Server): a media server that manages, in real time, the audio stream data generated by voice social applications and pushes the audio stream data to the machine review system.
- Machine review system: a system that provides machine recognition and review services for pure voice content.
- Review system: a system that provides review services for machine identification review results.
- Single-line computer room: a computer room that has only one operator's line, such as a China Telecom, China Unicom, or China Mobile line.
- A single-line computer room can be accessed only by users of the corresponding operator.
- Single-line computer room bandwidth is cheap.
- Two-line computer room: a computer room that has two operators' lines. For example, if it has both China Telecom and China Unicom lines, both China Telecom and China Unicom users can access it. The bandwidth of a two-line computer room is expensive.
- Multi-line computer room: a computer room that has multiple operators' lines at the same time.
- A multi-line computer room can be accessed by users of each of those operators.
- Multi-line computer room bandwidth is expensive.
- Kafka messaging middleware: an open-source distributed publish-subscribe messaging system originally developed at LinkedIn, which is now an Apache top-level project.
- The main features of Kafka are that it handles message consumption in Pull mode, pursues high throughput, and has no strict requirements regarding message duplication, loss, or error. It is suitable for the data collection services of Internet services that generate large amounts of data.
- Disaster recovery: building two or more systems with the same function that can monitor each other's health and switch functions between each other. When one system stops working due to an accident such as a fire or earthquake, the other systems can immediately take over.
- Voice chat and live broadcast require high real-time performance, so a mode in which content is reviewed before it is presented to users is not suitable. Only patrols or delayed audio collection and review can be adopted, with violations found during review punished afterwards. Delays in auditing and punishment can easily allow vicious events to occur and cause a bad social impact, so low-latency auditing needs to be supported.
- The embodiments of the present application provide a distributed voice monitoring method, which is applied to a machine review system. As shown in FIG. 1, the method includes:
- Step S110: Acquire audio stream data belonging to the same computer room.
- The audio stream data is all of the binary audio stream data generated in real time while users of a highly active voice social application hold voice chats in voice rooms and make live voice broadcasts.
- The audio stream data is provided by an MS media server deployed by the voice social application.
- The MS media server and the machine review system are deployed in the same computer room, and the audio stream data circulates within that computer room while the machine review system is operating normally.
- The machine review system of a single-line computer room acquires the audio stream data belonging to the same computer room by receiving the audio stream data pushed by the MS media server of that computer room.
- Step S120: Collect audio data to be reviewed from the audio stream data according to a preset push review strategy.
- The machine review system performs machine identification review on only part of the acquired audio stream data; that part is the audio data to be reviewed.
- A push review strategy is preset. For audio stream data belonging to each preset category, it specifies the collection frequency and collection duration with which audio data to be reviewed is collected from that audio stream data.
- The preset categories include but are not limited to: users, voice rooms, user levels of users in voice social applications, room live broadcast types, and number of room users.
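One way to picture such a strategy is as a lookup table from category values to collection parameters. The sketch below is an assumption-laden illustration: the table keys, the numbers, the `DEFAULT_RULE`, and the "shortest interval wins" tie-break are all invented here and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionRule:
    interval_s: float  # start one collection every `interval_s` seconds
    duration_s: float  # length of each collected clip in seconds


# Hypothetical strategy table keyed by (category, value); all numbers are made up.
PUSH_REVIEW_STRATEGY = {
    ("user_level", "new"): CollectionRule(interval_s=30, duration_s=10),
    ("room_live_type", "dating"): CollectionRule(interval_s=60, duration_s=10),
}
DEFAULT_RULE = CollectionRule(interval_s=300, duration_s=10)


def rule_for(stream_tags: dict) -> CollectionRule:
    """Return the matching rule with the shortest interval, or the default rule."""
    matches = [
        rule
        for (category, value), rule in PUSH_REVIEW_STRATEGY.items()
        if stream_tags.get(category) == value
    ]
    return min(matches, key=lambda rule: rule.interval_s, default=DEFAULT_RULE)
```

A stream tagged with several categories is sampled at the most frequent matching rate, which is one plausible policy among many.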
- Step S130: Input the audio data to be reviewed into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model.
- The audio recognition model is pre-trained on a GPU (Graphics Processing Unit).
- The GPU can be used to process speech. It has powerful computing capabilities and is suitable for accelerating the training of audio recognition model networks.
- The GPU-based pre-trained audio recognition model can provide a machine recognition GPU service. Specifically, the model runs on a GPU machine, intelligently detects and recognizes the features of the audio data to be reviewed, and returns the predicted value corresponding to the audio recognition model, thereby achieving intelligent detection and classification of the audio data to be reviewed.
- There may be a variety of audio recognition models, and the set can be extended to recognize audio data containing various types of bad information, for example an audio recognition model for identifying pornographic information, an audio recognition model for identifying political speech, and an audio recognition model for identifying violent information. The audio recognition model may also be one that identifies other types of bad information. Those skilled in the art can determine the types of bad information that the audio recognition model recognizes according to actual application requirements. This is not limited in the embodiments of the present application.
- For example, the collected audio data to be reviewed is input into a pre-trained audio recognition model for recognizing pornographic information to obtain the predicted value of that model, which can be used to determine whether the audio data to be reviewed contains pornography-related content.
- Step S140: Generate an audio machine review result according to the predicted value.
- Based on the level of the predicted value, the risk that the audio data to be reviewed contains the type of bad information corresponding to the audio recognition model is evaluated, and an audio machine review result for the audio data to be reviewed is generated.
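As a rough illustration of step S140, the sketch below maps a predicted value to a coarse risk level. It assumes the predicted value is a score in [0, 1]; the cut-offs 0.5 and 0.9, the level names, and the result fields are hypothetical, since the application does not specify them.

```python
def machine_review_result(model_name: str, predicted: float) -> dict:
    """Map a predicted value in [0, 1] to a coarse risk level (hypothetical cut-offs)."""
    if not 0.0 <= predicted <= 1.0:
        raise ValueError("predicted value must be in [0, 1]")
    if predicted >= 0.9:
        risk = "high"
    elif predicted >= 0.5:
        risk = "medium"
    else:
        risk = "low"
    return {"model": model_name, "predicted": predicted, "risk": risk}
```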
- the distributed voice monitoring method through the distributed decentralized machine audit mode, does not need to construct a huge and complicated central machine audit system at high construction costs, and makes the audio streaming data in normal conditions Circulation in this equipment room will not generate cross-machine room and cross-operator bandwidth traffic, which can achieve a higher input-output ratio and significantly reduce the cost of audio content monitoring; through the cooperation of various machine review systems, they are assigned to the same machine room for review Audio data can be machine-recognized for auditing, which can open up real-time push review of audio streams with huge orders of magnitude of high-activity voice and social applications, support low-latency monitoring audits, and machine identification audits support large enough audit coverage to achieve high Identification rate and audit efficiency.
- This method can achieve high input-output ratio, high coverage, low latency, high recognition rate and efficient voice monitoring and auditing, and can meet the audio content monitoring requirements of highly active voice social applications in a multi-operator mixed network deployment network environment.
- Step S110, acquiring audio stream data belonging to the same computer room, includes:
- S111: Receive a machine review service call request sent by a media server belonging to the same computer room.
- S112: In response to the machine review service call request, acquire the audio stream data uploaded by the media server.
- The machine review system provides a machine review service call interface.
- The MS media server can push audio stream data to the machine review system by invoking the machine review service call interface of the machine review system.
- The MS media server may preferentially send the machine review service call request to the machine review system of its own computer room and, by calling the machine review service call interface of that machine review system, push all the audio stream data it manages to the machine review system of its own computer room.
- The machine review system receives the machine review service call request sent by the MS media server belonging to the same computer room and, after responding to it, obtains through the machine review service call interface the audio stream data uploaded by that MS media server. In this way, while the machine review system is operating normally, the audio stream data circulates within the computer room, which yields a high input-output ratio, enables real-time push review of the very large volume of audio stream data from highly active voice social applications, and supports low-latency monitoring and auditing.
- Before step S130, in which the audio data to be reviewed is input into the pre-trained audio recognition model to obtain the predicted value corresponding to the audio recognition model, the method further includes:
- Step S310: Save the audio data to be reviewed, and determine the uniform resource locator (URL) at which the audio data to be reviewed is saved.
- After the audio data to be reviewed is collected from the audio stream data, it is uploaded in a binary compressed format to the storage subsystem of the machine review system and saved, and the URL at which it is stored in the storage subsystem is determined.
- Step S320: Generate a pending message for the audio data to be reviewed according to the associated information of the audio data to be reviewed and the URL, and write the pending message into the Kafka pending message queue.
- The associated information of the audio data to be reviewed is information associated with that audio data, for example the user to whom the audio data belongs, the voice room, the user's level in the voice social application, the room live broadcast type, and the number of room users.
- Kafka message middleware is introduced into the machine review system to assist its voice monitoring and auditing. Specifically, a pending message for the audio data to be reviewed is generated according to the associated information of the audio data and the URL, and the pending message is written into the Kafka pending message queue. That is, the associated information of the audio data to be reviewed and the URL at which the audio data is stored in the storage subsystem are saved to the Kafka pending message queue.
- The Kafka pending message queue is a container that stores the pending messages during message transmission.
- The main features of Kafka are that it handles message consumption in Pull mode, pursues high throughput, and has no strict requirements regarding message duplication, loss, or error. It is suitable for the data collection services of Internet services that generate large amounts of data. Therefore, introducing the Kafka pending message queue ensures high reliability and easy horizontal scaling, achieves peak shaving during instantaneous high-concurrency periods, and improves machine utilization, while the persistent storage of messages flexibly supports multiple failure retry strategies.
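Steps S310 and S320 can be sketched as building a small message and placing it on a queue. To stay self-contained, this sketch uses Python's in-process `queue.Queue` as a stand-in for the Kafka pending message queue; the JSON layout, the field names `url` and `associated`, and the example URL are assumptions, not details from the application.

```python
import json
import queue

# In-process stand-in for the Kafka pending message queue (an assumption of this sketch).
pending_queue: "queue.Queue[bytes]" = queue.Queue()


def build_pending_message(url: str, associated: dict) -> bytes:
    """Serialize the storage URL and the associated information as one JSON message."""
    payload = {"url": url, "associated": associated}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def enqueue_pending(url: str, associated: dict) -> None:
    """Write the pending message to the queue; only the URL travels, not the audio."""
    pending_queue.put(build_pending_message(url, associated))


# Hypothetical URL and associated information.
enqueue_pending(
    "http://storage.example/audio/123.bin",
    {"user": "u42", "voice_room": "r7", "room_live_type": "chat"},
)
```

Keeping the audio in the storage subsystem and passing only its URL keeps queue messages small, which fits the high-throughput use described above.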
- Step S330: When a pending message is read from the Kafka pending message queue, download the audio data to be reviewed according to the URL in the pending message.
- The pending message consumption process of the machine review system continuously reads pending messages from the Kafka pending message queue. When a pending message is read, the audio data to be reviewed is downloaded in binary compressed format from the storage subsystem of the machine review system according to the URL in the pending message. After decoding, the audio data can be input into the pre-trained audio recognition model to obtain the predicted value corresponding to the audio recognition model.
- Introducing Kafka message middleware into the machine review system to assist its voice monitoring and auditing keeps the system flexible and easy to scale out and in horizontally. Its peak-shaving and valley-filling characteristics ensure that the system is highly available and highly reliable, and the persistent storage of messages allows a flexible retry strategy. This effectively meets the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency in a multi-operator mixed-network deployment environment.
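The consumption side of step S330, together with one simple retry strategy, might look like the sketch below. It again uses an in-process `queue.Queue` in place of Kafka. The `download` and `recognize` callables, the bounded-retry policy, and the use of `OSError` to signal a failed download are all assumptions made for illustration; decoding is assumed to happen inside `recognize`.

```python
import json
import queue


def consume_once(pending: queue.Queue, download, recognize, max_attempts: int = 3) -> dict:
    """Read one pending message, download its audio by URL, and run recognition.

    `download(url) -> bytes` and `recognize(audio) -> float` are hypothetical
    caller-supplied callables. A download that raises OSError is retried until
    `max_attempts` attempts have been made.
    """
    message = json.loads(pending.get_nowait())
    last_error = None
    for _ in range(max_attempts):
        try:
            audio = download(message["url"])
        except OSError as exc:
            last_error = exc
            continue
        return {"associated": message["associated"], "predicted": recognize(audio)}
    raise RuntimeError(f"download failed after {max_attempts} attempts") from last_error
```

Because the message itself stays intact until processing succeeds, other retry policies (for example re-queuing with a delay) could be swapped in without changing the producer.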
- Before the audio data to be reviewed is collected from the audio stream data according to a preset push review strategy, the method further includes:
- A preset push review strategy that grades push review for different users and voice rooms is generated.
- The push review strategy is placed under configuration management. This provides strong technical support for realizing flexible and varied push review strategies and for keeping monitoring and audit coverage at a reasonable level.
- the preset period may be one day, one week, one month, etc., and a person skilled in the art may determine the specific period of the preset period according to actual application requirements, which is not limited in the embodiments of the present application.
- the user behavior data is behavior data generated when a user performs behaviors such as communication, dating, chatting, and live broadcasting in the voice social application.
- the user tag data is tag data of the user in the voice social application, for example, user personal tag data such as age, gender, personality, etc., or user preference tag data such as dating group, voice room type preference, etc.
- the voice room tag data is tag data such as the voice theme and voice group of the voice room in the voice social application.
- Collecting audio data to be reviewed from the audio stream data according to a preset push review strategy includes:
- According to the preset push review strategy, determine the collection frequency and collection duration of the audio data to be reviewed corresponding to the user and/or voice room;
- The preset push review strategy sets in advance different collection frequencies and collection durations of audio data to be reviewed for different users and voice rooms. When audio data to be reviewed is collected from the audio stream data, the preset push review strategy is therefore applied so that the collection frequency and collection duration differ by user and voice room, and audio data to be reviewed is collected from the audio stream data at that collection frequency and collection duration.
- With a push review strategy based on graded push review, the collection frequency and duration of audio data to be reviewed differ by user and voice room, so that the audit monitoring scope is more targeted and the monitored objects are graded. The strategy achieves reasonable monitoring coverage as well as a higher audit recognition rate and accuracy rate, which significantly improves the operational efficiency of the machine review system.
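Collecting at a given frequency and duration can be illustrated on a buffered byte string. This is a simplified sketch: it assumes raw audio at a constant byte rate held in memory rather than a live stream, and the function and parameter names are hypothetical.

```python
from typing import List


def collect_clips(audio: bytes, bytes_per_second: int,
                  interval_s: float, duration_s: float) -> List[bytes]:
    """Cut one clip of `duration_s` seconds out of `audio` every `interval_s` seconds."""
    step = int(interval_s * bytes_per_second)
    size = int(duration_s * bytes_per_second)
    if step <= 0 or size <= 0:
        raise ValueError("interval and duration must be positive")
    # A clip near the end of the buffer may be shorter than `size`.
    return [audio[start:start + size] for start in range(0, len(audio), step)]
```

For a user or voice room graded as higher risk, a smaller `interval_s` would be chosen, so more of the stream is sent for machine review.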
- Before the audio data to be reviewed is collected from the audio stream data according to a preset push review strategy, the method further includes:
- Audio stream data that belongs to the same operator and comes from another computer room is received.
- When a preset disaster recovery condition is reached, the audio stream data of the MS media server in a single-line computer room is no longer distributed to the machine review system of that computer room, but is distributed to other single-line computer rooms of the same operator.
- When the preset disaster recovery condition is reached, if an MS media server in another single-line computer room of the same operator chooses to send a machine review service call request to the current single-line computer room and, by calling the machine review service call interface of the current single-line computer room, pushes all the audio stream data it manages to the machine review system of the current single-line computer room, then that machine review system, after responding to the call request, receives the cross-computer-room audio stream data belonging to the same operator and takes over machine identification review of it.
- For example, computer room A and computer room B are two single-line computer rooms of the same operator.
- When the preset disaster recovery condition is reached, the MS media server of computer room A no longer pushes audio stream data to the machine review system of computer room A.
- Instead, the audio stream data of the MS media server of computer room A is immediately distributed to the machine review system of computer room B.
- The machine review system of computer room B receives the audio stream data that belongs to the same operator and comes from another computer room (computer room A) and performs further machine identification review on it.
- In this way, the audio stream data normally circulates within its local computer room without cross-computer-room or cross-operator bandwidth traffic. Cross-computer-room traffic within the same operator is generated only in disaster recovery situations, so the bandwidth cost remains controllable while meeting the audio content monitoring needs of highly active voice social applications deployed in a multi-operator mixed-network environment.
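The routing rule in the computer room A and computer room B example can be sketched as a selection function. The sketch models the "preset disaster recovery condition" as a simple `healthy` flag, which is an assumption; the application does not define the condition here. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReviewSystem:
    room: str
    operator: str
    healthy: bool  # False stands in for "disaster recovery condition reached"


def pick_review_system(ms_room: str, ms_operator: str,
                       systems: List[ReviewSystem]) -> ReviewSystem:
    """Prefer the same-room system; otherwise use a healthy one of the same operator."""
    for system in systems:
        if system.room == ms_room and system.healthy:
            return system
    for system in systems:
        if system.operator == ms_operator and system.room != ms_room and system.healthy:
            return system
    raise LookupError("no healthy machine review system for this operator")
```

The fallback never crosses operators, which matches the stated goal of avoiding cross-operator bandwidth traffic even during disaster recovery.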
- After step S140, generating the audio machine review result according to the predicted value, the method further includes:
- Step S150: According to the audio machine review result, determine whether to review the audio data to be reviewed.
- A voice monitoring and auditing method that combines machine review with review is used. After the audio machine review result of the audio data to be reviewed for the audio recognition model is obtained, whether to review the audio data to be reviewed is determined, according to a certain criterion, from the risk reflected in the audio machine review result that the audio data contains the type of bad information corresponding to the audio recognition model.
- The same or different preset thresholds may be set for different audio recognition models. Whether to review the audio data to be reviewed is determined by whether the predicted value of the audio recognition model for that audio data exceeds the preset threshold corresponding to the model. When the predicted value exceeds the preset threshold, it is determined that the audio data to be reviewed is to be reviewed; when the predicted value does not exceed the preset threshold, it is determined that review is not necessary.
- The review is specifically a manual review.
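The threshold rule of step S150 reduces to a per-model comparison. In this sketch the threshold values and model names are hypothetical; only the "exceeds the preset threshold" rule comes from the text.

```python
# Hypothetical per-model thresholds; the text only says they may be the same or different.
REVIEW_THRESHOLDS = {"pornography": 0.8, "violence": 0.7}


def needs_review(model_name: str, predicted: float) -> bool:
    """Manual review is needed only when the predicted value exceeds the model's threshold."""
    return predicted > REVIEW_THRESHOLDS[model_name]
```

Note that a predicted value exactly equal to the threshold does not exceed it, so it does not trigger review under this reading.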
- Step S160: If yes, generate a machine review result message for the audio data to be reviewed according to the audio machine review result, and write the machine review result message into the Kafka machine review result message queue.
- the audio data to be reviewed is converted into a playable wav format file, and the wav format file is uploaded to the storage subsystem of the machine review system and saved, which makes it easy to retrieve and play the file in the subsequent review stage.
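As one possible illustration of the conversion to a playable wav file, the sketch below wraps raw PCM samples in a WAV container with Python's standard `wave` module. The source does not state the input encoding; raw 16-bit mono PCM at 16 kHz and the function name `pcm_to_wav_bytes` are assumptions.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM frames in a WAV container and return the file contents."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)   # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

The returned bytes could then be passed to whatever upload interface the storage subsystem exposes.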
- Kafka message middleware is introduced into the machine audit system to assist the distribution of audio machine audit results of the machine audit system.
- a machine review result message of the audio data to be reviewed is generated and written into the Kafka machine review result message queue, where it is saved.
- the Kafka machine review result message queue is a container for storing the machine review result message during message transmission.
- Kafka handles message consumption in pull mode and is designed for high throughput; in this design, occasional message duplication, loss, or error is tolerated, which suits the data collection services of Internet businesses that generate large amounts of data.
- the message queue provides high reliability and easy horizontal scaling, smooths instantaneous high-concurrency peaks (peak shaving), and improves machine utilization; because messages are stored in the queue, multiple failure retry strategies can be supported flexibly.
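The producer and consumer roles around the machine review result message queue can be sketched as below. To stay self-contained, the sketch uses the standard-library `queue.Queue` as a stand-in for a Kafka topic; it is not Kafka client code. The message fields (`audio_id`, `model_name`, `predicted_value`, `wav_url`) are assumptions, as the source does not specify the message format.

```python
import json
import queue
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MachineReviewResult:
    # Hypothetical message fields, chosen for illustration only.
    audio_id: str
    model_name: str
    predicted_value: float
    wav_url: str


def encode_message(result: MachineReviewResult) -> bytes:
    """Serialize a result to the bytes that would be written to the queue."""
    return json.dumps(asdict(result)).encode("utf-8")


def decode_message(payload: bytes) -> MachineReviewResult:
    return MachineReviewResult(**json.loads(payload.decode("utf-8")))


# In-process stand-in for the Kafka machine review result message queue.
result_queue: "queue.Queue[bytes]" = queue.Queue()


def publish(result: MachineReviewResult, q: "queue.Queue[bytes]" = result_queue) -> None:
    q.put(encode_message(result))


def pull(q: "queue.Queue[bytes]" = result_queue, timeout: float = 0.1) -> Optional[MachineReviewResult]:
    """Pull-mode read: the consumer asks for the next message, or gets None."""
    try:
        return decode_message(q.get(timeout=timeout))
    except queue.Empty:
        return None
```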
- Step S170 When reading the machine review result message from the Kafka machine review result message queue, distribute the machine review results of the audio data to be reviewed to the review system.
- the service process of the machine audit result distribution subsystem of the machine audit system continuously reads machine audit result messages from the Kafka machine review result message queue. When a message is read, the machine review result of the audio data to be reviewed is distributed to a review system, so that the review system reviews the audio data to be reviewed that corresponds to the machine review result message.
- the review system is specifically a person review system.
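A distribution loop with a simple failure retry strategy might look like the following sketch. The retry count, the use of `ConnectionError` as the failure signal, and the `send` callable standing in for the call to the review system are all assumptions; the source only says that multiple retry strategies can be supported.

```python
from typing import Callable, Iterable, List


def distribute(messages: Iterable[dict], send: Callable[[dict], None],
               max_attempts: int = 3) -> List[dict]:
    """Send each message to the review system, retrying on connection errors.

    Returns the messages that still failed after `max_attempts` tries.
    """
    failed: List[dict] = []
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                send(message)
                break  # delivered, move on to the next message
            except ConnectionError:
                if attempt == max_attempts:
                    failed.append(message)
    return failed
```

Messages returned as failed could be written back to the queue or handled by a different strategy.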
- the accuracy of audio content monitoring and auditing can be further improved through the voice monitoring and auditing method of combining machine review and review.
- the embodiment of the present application provides another distributed voice monitoring method. As shown in FIG. 5, the method includes the following steps:
- Step S510 The service registration and discovery system broadcasts the address information of the computer audit system that belongs to the same computer room as the media server.
- the service registration and discovery system is a system that provides registration of each service process in the audit monitoring process and provides broadcast service online and offline notification to the registered service process.
- the service registration and discovery system deploys service registration and discovery processes in the form of service instances.
- the service registration and discovery system can be used to implement distributed service management.
- from this broadcast, the MS server learns the address information of the machine audit system that belongs to the same computer room and preferentially pushes all the audio stream data it manages to that system. Working together with the service processes, this keeps the audio stream data inside the local computer room under normal circumstances, with no cross-computer-room or cross-operator bandwidth traffic. Only when the machine audit system of the local computer room stops working is the data distributed to other single-line computer rooms of the same operator.
- the address information includes IP and port.
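The preference for the same computer room, with fallback to another room of the same operator, can be sketched as a selection over registered instances. The `ServiceInstance` fields and the function name `pick_machine_review` are assumptions made for illustration; only the IP and port come from the source.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ServiceInstance:
    room: str        # computer room identifier (assumed field)
    operator: str    # network operator of the single-line room (assumed field)
    ip: str
    port: int
    online: bool = True


def pick_machine_review(instances: Iterable[ServiceInstance],
                        room: str, operator: str) -> Optional[ServiceInstance]:
    """Prefer an online instance in the same room, else the same operator."""
    online = [inst for inst in instances if inst.online]
    for inst in online:
        if inst.room == room:
            return inst
    for inst in online:
        if inst.operator == operator:
            return inst  # disaster recovery: cross-room, same operator
    return None
```

An instance of a different operator is never selected, so no cross-operator traffic is produced by this choice.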
- Step S520 The media server sends audio stream data to the computer audit system belonging to the same computer room according to the address information.
- the machine audit system provides a call interface for a machine audit service.
- the MS media server can push audio stream data to the machine audit system by invoking the machine audit service call interface of the machine audit system according to the address information corresponding to the call interface.
- after receiving the address information of the machine audit system belonging to the same computer room, the MS media server sends a machine audit service call request to that machine audit system according to the address information, and pushes all the audio stream data it manages to it by calling its machine audit service call interface.
- Step S530 The computer review system acquires the audio stream data belonging to the same computer room; collects the audio data to be reviewed from the audio stream data according to a preset push review strategy; inputs the audio data to be reviewed into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generates an audio machine review result based on the predicted value.
- the machine review system provides pure voice content machine recognition review services, including but not limited to audio stream data reception, push review, and machine recognition.
- the specific function implementation of the machine review system in step S530 is the same as the technical features in steps S110 to S140 of the distributed voice monitoring method applied to the machine review system above. Please refer to the description in the above embodiment; no more details are provided here.
- the machine audit system can also implement other method embodiments of the distributed voice monitoring method applied to the machine audit system.
- the distributed voice monitoring method in the embodiments of the present application implements distributed, decentralized pure voice content monitoring and review through a service registration and discovery system, an MS media server, and a computer auditing system deployed in a single-line computer room. The method can achieve a high input-output ratio, high coverage, low latency, high reliability, easy scaling, a high recognition rate, and efficient voice monitoring and review, which meets the audio content monitoring needs of highly active, instantaneously high-concurrency voice social applications in a multi-operator mixed networking environment.
- the method further includes:
- Step S540 the machine review system determines to review the audio data to be reviewed according to the audio machine review results; and distributes the audio machine review results of the audio data to the review system.
- a voice monitoring and auditing method combining machine auditing and review is used.
- the specific function implementation of the machine review system in step S540 is the same as the technical features in steps S150 to S170 of the distributed voice monitoring method applied to the machine review system above. Please refer to the description in the above embodiment; no more details are provided here.
- Step S550 the review system receives the audio machine review result of the audio data to be reviewed; reviews the audio data to be reviewed according to the audio machine review result to obtain the review result of the audio data to be reviewed.
- the review system is specifically a person review system, and the person review system is a web platform system that provides content review and management.
- the review system receives the audio machine review result of the audio data to be reviewed distributed by the machine review system, writes it into an operation database for manual review, and records the audio machine review result and other related information in a pending work order. After a manual reviewer of the review system, that is, the person review system, obtains the pending work order, the reviewer manually reviews the audio machine review result of the corresponding audio data to be reviewed, among other operations, to obtain the review result of the audio data to be reviewed.
- the review operation can be divided into multiple steps such as first review, second review and final review.
- the review system can also conduct a sample check on the results of the final review in the review operation to verify the correctness and rationality of the review results.
- the review system can also perform manual verification and other review operations on violation reports reported by the voice social application.
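The multi-step review and the sample check on final results can be sketched as below. The stage names follow the text; the sampling rate, the fixed seed, and the helper names `next_stage` and `pick_sample` are assumptions, since the source does not say how the sample is drawn.

```python
import random
from enum import Enum
from typing import List, Optional, Sequence


class ReviewStage(Enum):
    FIRST = 1
    SECOND = 2
    FINAL = 3


def next_stage(stage: ReviewStage) -> Optional[ReviewStage]:
    """Return the stage that follows `stage`, or None after the final review."""
    order = list(ReviewStage)
    index = order.index(stage)
    return order[index + 1] if index + 1 < len(order) else None


def pick_sample(work_orders: Sequence[str], rate: float, seed: int = 0) -> List[str]:
    """Draw a random subset of finally reviewed work orders for a quality check."""
    if not work_orders:
        return []
    count = max(1, round(len(work_orders) * rate))
    return random.Random(seed).sample(list(work_orders), count)
```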
- the reviewing the audio data to be reviewed according to the audio machine review result, after obtaining the review result of the audio data to be reviewed further includes:
- the review system determines the user corresponding to the audio data to be reviewed and, according to the address information of the violation penalty interface of the user's client application broadcast by the service registration and discovery system, sends a violation penalty call request to the violation penalty interface of the user's client application;
- the client application penalizes the user for violations.
- the client application is preset with a violation penalty interface for providing a violation penalty service.
- the review system determines the user corresponding to the audio data to be reviewed to notify the voice social application of the corresponding client to impose a violation penalty on the user.
- the review system is connected to the service registration and discovery system, which broadcasts the address information of the violation penalty interface of the voice social application corresponding to the client. According to this address information, the review system can send a violation penalty service invocation request to the violation penalty interface of that voice social application.
- the review system may also save the review results of violations and send the audio data and other related data corresponding to those results back to the audio recognition model as labeled samples for learning and training the corresponding audio recognition model, continuously improving the accuracy of its recognition review.
- after receiving the violation penalty service invocation request, the voice social application of the client responds to the request and performs a preset violation penalty operation on the user who committed the violation. The violation penalty operations include but are not limited to account freezing, account banning, and banning the corresponding live voice room.
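The penalty request and its handling on the client application side can be sketched as follows. The three penalty kinds come from the text; the request layout and the function names are hypothetical, and the in-memory dictionary stands in for whatever state the voice social application actually keeps.

```python
from enum import Enum
from typing import Dict, Iterable, Set


class Penalty(Enum):
    ACCOUNT_FREEZE = "account_freeze"
    ACCOUNT_BAN = "account_ban"
    ROOM_BAN = "room_ban"


def build_penalty_request(user_id: str, penalties: Iterable[Penalty]) -> dict:
    """Build the (hypothetical) payload the review system would send."""
    return {"user_id": user_id, "penalties": [p.value for p in penalties]}


def handle_penalty_request(request: dict, applied: Dict[str, Set[Penalty]]) -> None:
    """Client-side handler: record each requested penalty for the user."""
    user_penalties = applied.setdefault(request["user_id"], set())
    for name in request["penalties"]:
        user_penalties.add(Penalty(name))
```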
- distributed, decentralized pure voice content monitoring and review is implemented through a service registration and discovery system, an MS media server, and a computer review system deployed in a single-line computer room. A review system is also used to review the machine review results, and based on the review result the client application is requested to penalize users who have violated the rules. This forms an end-to-end closed loop for pure voice content review and monitoring: push review of the voice social application's audio stream data, machine recognition review, review of the machine review results, and penalties taking effect in the voice social application. The process supports high concurrency and low review latency, can quickly eliminate illegal information and content, and prevents vicious incidents from occurring and spreading, which meets the audio content monitoring needs of highly active, instantaneously high-concurrency voice social applications in a multi-operator mixed networking environment.
- an embodiment of the present application provides a distributed voice monitoring device.
- the device includes: an audio stream data acquisition module 71, a pending audio data collection module 72, an audio recognition module 73, and a machine review result generation module 74; wherein,
- the audio stream data obtaining module 71 is used to obtain audio stream data belonging to the same computer room;
- the audio data collection module 72 to be reviewed is used to collect audio data to be reviewed from the audio stream data according to a preset push review strategy
- the audio recognition module 73 is configured to input the pending audio data into a pre-trained audio recognition model to obtain a prediction value corresponding to the audio recognition model;
- the machine review result generation module 74 is configured to generate an audio machine review result based on the predicted value.
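The way modules 72 to 74 chain together can be sketched as one function. Here `collect` stands in for the pending audio data collection module and each entry of `models` stands in for a pre-trained audio recognition model; both are placeholders supplied by the caller, and the per-clip dictionary of predicted values is an assumed shape for the machine review result.

```python
from typing import Callable, Dict, List


def run_machine_review(stream: bytes,
                       collect: Callable[[bytes], List[bytes]],
                       models: Dict[str, Callable[[bytes], float]]) -> List[Dict[str, float]]:
    """Collect clips from a stream and score each clip with every model."""
    results: List[Dict[str, float]] = []
    for clip in collect(stream):
        # One predicted value per audio recognition model for this clip.
        results.append({name: model(clip) for name, model in models.items()})
    return results
```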
- the audio stream data acquisition module 71 is specifically used to:
- before the audio data to be reviewed is input into the pre-trained audio recognition model to obtain the predicted value corresponding to the audio recognition model, the method further includes:
- before collecting the audio data to be reviewed from the audio stream data according to a preset push review strategy, the method further includes:
- the audio data collection module 72 to be reviewed is specifically used for:
- according to a preset push review strategy, determining the collection frequency and collection duration of the audio data to be reviewed corresponding to the user and/or voice room;
- before collecting the audio data to be reviewed from the audio stream data according to a preset push review strategy, the method further includes:
- the audio streaming data belonging to the same operator across the computer room is received.
- the method further includes:
- the machine review result of the audio data to be reviewed is distributed to a review system.
- the distributed voice monitoring device provided in this application can achieve the following. With the distributed, decentralized machine review method, there is no need to build a huge and complex central machine review system at high construction cost, and the audio stream data normally circulates within the local computer room without generating cross-computer-room or cross-operator bandwidth traffic, which yields a higher input-output ratio and significantly reduces the cost of audio content monitoring. Through the cooperation of the machine review systems, each one performs machine recognition review on the audio data to be reviewed that belongs to its own computer room, which enables real-time push review of the very large volume of audio stream data produced by highly active voice social applications, supports low-latency monitoring and review, and gives machine recognition review enough coverage to reach a high recognition rate and review efficiency.
- this method can achieve a high input-output ratio, high coverage, low latency, a high recognition rate, and efficient voice monitoring and review, and can meet the audio content monitoring requirements of highly active voice social applications in a multi-operator mixed networking environment. It can also achieve the following: introducing Kafka message middleware into the machine review system keeps the system flexible and easy to scale out and in horizontally, and its peak-shaving and valley-filling behavior ensures high availability and high reliability; the stored messages allow a flexible retry strategy, which effectively meets the audio content monitoring needs of highly active, instantaneously high-concurrency voice social applications deployed in a multi-operator mixed networking environment; the push review strategy collects audio data to be reviewed at different frequencies and durations depending on the user and voice room, which makes the monitoring scope more targeted and grades the monitored objects, so that reasonable monitoring coverage is achieved together with a higher recognition rate and accuracy, significantly improving the operating efficiency of the machine review system.
- the distributed voice monitoring apparatus provided in the embodiments of the present application can implement the above-mentioned method embodiment provided for application to the machine review system.
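A push review strategy that assigns a collection frequency and duration per user or voice room could be sketched as below. The risk levels, the interval and duration values, and the names `CollectionPlan`, `plan_for`, and `collection_windows` are assumptions; the source only states that frequency and duration may differ by user and voice room.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class CollectionPlan:
    interval_seconds: int   # how often a clip is collected
    duration_seconds: int   # how long each collected clip is


# Hypothetical grading of monitored users or rooms.
STRATEGY: Dict[str, CollectionPlan] = {
    "high_risk": CollectionPlan(interval_seconds=30, duration_seconds=20),
    "normal": CollectionPlan(interval_seconds=300, duration_seconds=10),
}


def plan_for(risk_level: str) -> CollectionPlan:
    """Look up the plan for a risk level, defaulting to the normal plan."""
    return STRATEGY.get(risk_level, STRATEGY["normal"])


def collection_windows(plan: CollectionPlan, stream_seconds: int) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in seconds of the clips to collect."""
    return [(start, min(start + plan.duration_seconds, stream_seconds))
            for start in range(0, stream_seconds, plan.interval_seconds)]
```

For a 70-second stream under the `high_risk` plan this yields three windows, the last one cut short at the end of the stream.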
- the distributed voice monitoring system includes: a service registration and discovery system 81, a media server 82, and a machine review system 83; wherein,
- the service registration and discovery system 81 is used to broadcast the address information of the machine review system that belongs to the same computer room as the media server;
- the media server 82 is configured to send audio stream data to the computer audit system belonging to the same computer room according to the address information;
- the machine review system 83 is used to obtain the audio stream data belonging to the same computer room; collect the audio data to be reviewed from the audio stream data according to a preset push review strategy; input the audio data to be pre-trained To obtain the predicted value corresponding to the audio recognition model; based on the predicted value, generate an audio machine review result.
- the distributed voice monitoring system further includes a review system 84; wherein,
- the machine review system 83 is further configured to determine to review the audio data to be reviewed based on the audio machine review results; distribute the audio machine review results of the audio data to be reviewed to the review system;
- the review system 84 is configured to receive an audio machine review result of the audio data to be reviewed; review the audio data to be reviewed according to the audio machine review result to obtain a review result of the audio data to be reviewed.
- FIG. 10 shows a specific embodiment to further elaborate the distributed voice monitoring system:
- the distributed voice monitoring system includes a service registration and discovery system, at least two machine review systems, at least two voice social application-MS media servers, and a person review system.
- the service registration and discovery system is a system that provides registration of each service process in the audit monitoring process, and provides broadcast service online and offline notification to the registered service process.
- the service registration and discovery system is connected to the MS media server of the voice social application, the machine review system, and the person review system that serves as the review system.
- the service registration and discovery system deploys service registration and discovery processes in the form of service instances. As shown in FIG. 10, multiple service registration and discovery service instances are deployed in the service registration and discovery system.
- the service registration and discovery system can implement distributed service management and, working with the service processes, keeps audio stream data inside the local computer room under normal conditions, which avoids cross-computer-room traffic costs. Only when none of the machine review service processes in the local computer room is working is the data distributed to other single-line computer rooms of the same operator.
- the computer audit system is deployed as a single-line computer room. As shown in Figure 10, computer room 1 and computer room 2 have their own computer audit systems.
- the machine review system provides pure voice content machine identification review services, including but not limited to audio stream data reception, push review, storage, machine identification, and machine review result distribution. It includes: a push review strategy subsystem, an audio machine identification subsystem, a machine review result distribution subsystem, a storage subsystem, an audio pending message queue serving as the Kafka pending message queue, and an audio review result queue serving as the Kafka machine review result message queue.
- Push review strategy subsystem: receives the audio streams pushed within the same computer room, manages push review strategies, compresses audio stream files and saves them to the storage subsystem, and writes pending messages into the Kafka audio pending message queue.
- Audio pending message queue: stores the pending messages of the audio data to be reviewed.
- the producer of the pending message is the service process of the push review strategy subsystem, and the consumer is the service process of the audio machine identification subsystem.
- Audio machine identification subsystem: obtains audio stream files from the storage subsystem, performs machine identification, writes the machine review results into the audio review result queue, and saves the corresponding wav format audio stream files to the storage subsystem.
- Audio review result queue: stores the machine review result messages of the audio data to be reviewed.
- the producer of the machine review result message is the service process of the audio machine identification subsystem, and the consumer is the service process of the machine review result distribution subsystem.
- Machine review result distribution subsystem: pushes the machine review results to the person review system.
- Storage subsystem: stores the original audio stream files and the transcoded high-risk WAV format audio files. It provides a file upload API that receives binary audio stream data and returns the stored URL, and it supports automatic cleanup of files according to a specific storage aging strategy.
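The storage subsystem's upload interface and aging cleanup can be sketched with an in-memory stand-in. The class name `AudioStore`, the base URL, and the seven-day default retention are assumptions; a real deployment would use persistent storage and its own aging policy.

```python
import uuid
from typing import Dict, Optional, Tuple


class AudioStore:
    """In-memory stand-in for the storage subsystem (illustration only)."""

    def __init__(self, base_url: str = "http://storage.example/audio",
                 max_age_seconds: float = 7 * 24 * 3600) -> None:
        self.base_url = base_url
        self.max_age_seconds = max_age_seconds
        self._files: Dict[str, Tuple[float, bytes]] = {}

    def upload(self, data: bytes, now: float) -> str:
        """Store binary audio data and return the URL it can be fetched from."""
        url = f"{self.base_url}/{uuid.uuid4().hex}"
        self._files[url] = (now, data)
        return url

    def get(self, url: str) -> Optional[bytes]:
        entry = self._files.get(url)
        return entry[1] if entry else None

    def clean_expired(self, now: float) -> int:
        """Delete files older than the aging limit; return how many were removed."""
        expired = [url for url, (stored_at, _) in self._files.items()
                   if now - stored_at > self.max_age_seconds]
        for url in expired:
            del self._files[url]
        return len(expired)
```

Timestamps are passed in explicitly so that the aging behavior is easy to exercise without waiting.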
- the person review system is a web platform system that provides content review and management. It can receive the machine review results of the machine review system of all computer rooms, manually review the machine review results, manually review and confirm the violation reports reported by the voice social application, review the quality of the review results, and initiate punishment requests and management for violations.
- the person review system also manages reviewer information, the organizational structure, and the configuration of reviewer role permissions.
- MS media servers for voice social applications are deployed in a single-line computer room. As shown in Figure 10, computer room 1 and computer room 2 have their own MS media servers. Voice social application-MS media server provides the same room audio stream push and behavior penalty API.
- the distributed voice monitoring system provided by this application can achieve the following: distributed, decentralized pure voice content monitoring and review through a service registration and discovery system, an MS media server, and a machine review system deployed in a single-line computer room. A review system can also be used to review the machine review results and, based on the review result, to request the client application to penalize users for violations. This forms an end-to-end closed loop for pure voice content review and monitoring, from push review of the voice social application's audio stream data, through machine recognition review and review of the machine review results, to penalties taking effect in the voice social application. The process supports high concurrency and low review latency, can quickly eliminate illegal information and content, and prevents vicious incidents from occurring and spreading, which meets the audio content monitoring requirements of highly active, instantaneously high-concurrency voice social applications in a multi-operator mixed networking environment.
- the distributed voice monitoring system provided by the embodiments of the present application can implement the method embodiments provided above.
- embodiments of the present application provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the distributed voice monitoring method described in the above embodiments is implemented.
- the computer-readable storage medium includes, but is not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, the storage device includes any medium that stores or transmits information in a form readable by a device (e.g., a computer or mobile phone), and may be a read-only memory, a magnetic disk, or an optical disk.
- the computer-readable storage medium provided by this application can achieve the following. With the distributed, decentralized machine review method, there is no need to build a huge and complex central machine review system at high construction cost, and the audio stream data normally circulates within the local computer room without generating cross-computer-room or cross-operator bandwidth traffic, which yields a higher input-output ratio and significantly reduces the cost of audio content monitoring. Through the cooperation of the machine review systems, each one performs machine recognition review on the audio data to be reviewed that belongs to its own computer room, which enables real-time push review of the very large volume of audio stream data produced by highly active voice social applications, supports low-latency monitoring and review, and gives machine recognition review enough coverage to reach a high recognition rate and review efficiency.
- this method can achieve a high input-output ratio, high coverage, low latency, a high recognition rate, and efficient voice monitoring and review, and can meet the audio content monitoring needs of highly active voice social applications deployed in a multi-operator mixed networking environment. It can also achieve the following: introducing Kafka message middleware into the machine review system keeps the system flexible and easy to scale out and in horizontally, and its peak-shaving and valley-filling behavior ensures high availability and high reliability; the stored messages allow a flexible retry strategy, which effectively meets the audio content monitoring needs of highly active, instantaneously high-concurrency voice social applications deployed in a multi-operator mixed networking environment; the push review strategy collects audio data to be reviewed at different frequencies and durations depending on the user and voice room, which makes the monitoring scope more targeted and grades the monitored objects, so that reasonable monitoring coverage is achieved together with a high recognition rate and accuracy, significantly improving the operating efficiency of the machine review system; achieve distributed decentralized pure
- the computer-readable storage medium provided by the embodiments of the present application may implement the method embodiments provided above.
- an embodiment of the present application also provides a computer device, as shown in FIG. 11.
- the computer device described in this embodiment may be a server, a personal computer, a network device, and other devices.
- the computer device includes a processor 1002, a memory 1003, an input unit 1004, a display unit 1005, and other devices.
- the memory 1003 may be used to store a computer program 1001 and various functional modules.
- the processor 1002 runs the computer program 1001 stored in the memory 1003 to execute various functional applications and data processing of the device.
- the memory may be internal memory or external memory, or include both internal memory and external memory.
- the internal memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory.
- the external memory may include hard disks, floppy disks, ZIP disks, U disks, magnetic tapes, etc.
- the memories disclosed in this application include but are not limited to these types of memories.
- the memory disclosed in this application is only an example and not a limitation.
- the input unit 1004 is used to receive an input of a signal and a keyword input by a user.
- the input unit 1004 may include a touch panel and other input devices.
- the touch panel can collect the user's touch operations on or near it (such as operations performed on or near the touch panel with a finger, stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program; other input devices may include but are not limited to one or more of a physical keyboard, function keys (such as playback control keys, switch keys, etc.), trackball, mouse, joystick, etc.
- the display unit 1005 can be used to display information input by the user or information provided to the user and various menus of the computer device.
- the display unit 1005 may take the form of a liquid crystal display, an organic light-emitting diode, or the like.
- the processor 1002 is the control center of the computer device. It connects the parts of the entire computer through various interfaces and lines, runs or executes the software programs and/or modules stored in the memory 1003, and calls the data stored in the memory to perform various functions and process data.
- the computer device includes: one or more processors 1002, a memory 1003, and one or more computer programs 1001, wherein the one or more computer programs 1001 are stored in the memory 1003 and configured to be executed by the one or more processors 1002, and the one or more computer programs 1001 are configured to execute the distributed voice monitoring method described in any of the above embodiments.
- the computer equipment provided in this application can achieve the following. With distributed, decentralized machine review, there is no need to build a huge and complex central machine review system at high construction cost, and the audio stream data normally circulates within the local computer room without generating cross-computer-room or cross-operator bandwidth traffic, which yields a higher input-output ratio and significantly reduces the cost of audio content monitoring. Through the cooperation of the machine review systems, each one performs machine recognition review on the audio data to be reviewed that belongs to its own computer room, which enables real-time push review of the very large volume of audio stream data produced by highly active voice social applications, supports low-latency monitoring and review, and gives machine recognition review enough coverage to reach a high recognition rate and review efficiency.
- this method can achieve a high input-output ratio, high coverage, low latency, a high recognition rate, and efficient voice monitoring and review, and can meet the audio content monitoring needs of highly active voice social applications deployed in a multi-operator mixed networking environment. It can also achieve the following: introducing Kafka message middleware into the machine review system keeps the system flexible and easy to scale out and in horizontally, and its peak-shaving and valley-filling behavior ensures high availability and high reliability; the stored messages allow a flexible retry strategy, which effectively meets the audio content monitoring needs of highly active, instantaneously high-concurrency voice social applications deployed in a multi-operator mixed networking environment; the push review strategy collects audio data to be reviewed at different frequencies and durations depending on the user and voice room, which makes the monitoring scope more targeted and grades the monitored objects, so that reasonable monitoring coverage is achieved together with a high recognition rate and accuracy, significantly improving the operating efficiency of the machine review system; achieve distributed decentralized pure
- the computer equipment provided in the embodiments of the present application may implement the method embodiments provided above.
- each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module.
- the above integrated modules can be implemented in the form of hardware or software function modules. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Telephonic Communication Services (AREA)
Abstract
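A distributed voice monitoring method, comprising: acquiring audio stream data belonging to the same machine room (S110); collecting pending audio data from the audio stream data according to a preset push-review strategy (S120); inputting the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model (S130); and generating an audio machine review result according to the predicted value (S140). The method can achieve voice monitoring and review with a high input-output ratio, high coverage, low latency, a high recognition rate and high efficiency, and can meet the audio content monitoring needs of highly active voice social applications deployed in a multi-operator hybrid networking environment.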
Description
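This application relates to the technical field of voice content monitoring, and in particular to a distributed voice monitoring method, apparatus and system, a computer-readable storage medium, and a computer device.
With the rapid spread of the Internet, social applications that rely on voice as the main means of communicating, making friends, chatting and live streaming have become very popular. However, with a huge user base, the content of voice live streams and voice chats conducted in voice rooms is highly unpredictable, and bad actors use voice social applications to spread prohibited and harmful information, which disrupts the normal operation of the platform. Chat and live content in audio form therefore needs to be reviewed and recognized in real time so that violations within voice social applications can be dealt with promptly.
At present, monitoring and review can be carried out through user reports, patrols by room administrators, or periodic collection of audio data combined with machine recognition. These approaches have limitations. User reports and administrator patrols offer low coverage, lag well behind events and are inefficient, so a malicious incident may already have occurred and caused serious social harm before it is caught. Periodic audio collection combined with machine recognition is usually offered as a centralized service: in a multi-operator networking deployment, all audio data must be sent to a central machine recognition system, which is huge and complex, costly to build, and also incurs heavy cross-operator traffic charges, giving a poor input-output ratio.
Existing monitoring and review methods therefore struggle to meet the voice content review needs of highly active voice social applications. For such applications, with their enormous volumes of audio data, achieving voice monitoring with a high input-output ratio, high coverage, low latency, a high recognition rate and high efficiency is a very great challenge.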
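Summary of the Invention
To solve at least one of the above technical deficiencies, this application provides a distributed voice monitoring method with the following technical solutions, together with a corresponding apparatus, system, computer-readable storage medium and computer device.
According to a first aspect, an embodiment of this application provides a distributed voice monitoring method, comprising the following steps:
acquiring audio stream data belonging to the same machine room;
collecting pending audio data from the audio stream data according to a preset push-review strategy;
inputting the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model;
generating an audio machine review result according to the predicted value.
According to a second aspect, an embodiment of this application provides a distributed voice monitoring method, comprising the following steps:
a service registration and discovery system broadcasts the address information of the machine review system that belongs to the same machine room as a media server;
the media server sends audio stream data to the machine review system belonging to the same machine room according to the address information;
the machine review system acquires the audio stream data belonging to the same machine room; collects pending audio data from the audio stream data according to a preset push-review strategy; inputs the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generates an audio machine review result according to the predicted value.
In addition, according to a third aspect, an embodiment of this application provides a distributed voice monitoring apparatus, comprising:
an audio stream data acquisition module, configured to acquire audio stream data belonging to the same machine room;
a pending audio data collection module, configured to collect pending audio data from the audio stream data according to a preset push-review strategy;
an audio recognition module, configured to input the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model;
a machine review result generation module, configured to generate an audio machine review result according to the predicted value.
According to a fourth aspect, an embodiment of this application provides a distributed voice monitoring system, comprising: a service registration and discovery system, a media server and a machine review system; wherein
the service registration and discovery system is configured to broadcast the address information of the machine review system that belongs to the same machine room as the media server;
the media server is configured to send audio stream data to the machine review system belonging to the same machine room according to the address information;
the machine review system is configured to acquire the audio stream data belonging to the same machine room; collect pending audio data from the audio stream data according to a preset push-review strategy; input the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generate an audio machine review result according to the predicted value.
According to a fifth aspect, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the distributed voice monitoring method described above.
According to a sixth aspect, an embodiment of this application provides a computer device comprising one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute the distributed voice monitoring method described above.
Compared with the prior art, this application has the following beneficial effects:
With the distributed voice monitoring method, apparatus, system, computer-readable storage medium and computer device provided by this application, through distributed, decentralized machine review, there is no need to build a huge and complex central machine review system at high construction cost, and audio stream data normally circulates within its own machine room, so no cross-machine-room or cross-operator bandwidth traffic is generated; a higher input-output ratio can be achieved and the cost of audio content monitoring is significantly reduced. Through cooperation among the machine review systems, each performs machine recognition review on the pending audio data belonging to its own machine room, which enables real-time push review of the huge volume of audio stream data of highly active voice social applications, supports low-latency monitoring and review, and gives machine recognition review sufficiently large coverage, so that a high recognition rate and review efficiency can be achieved. The method can achieve voice monitoring and review with a high input-output ratio, high coverage, low latency, a high recognition rate and high efficiency, and can meet the audio content monitoring needs of highly active voice social applications deployed in a multi-operator hybrid networking environment.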
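The above and/or additional aspects and advantages of this application will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Figure 1 is a flowchart of the first distributed voice monitoring method provided by an embodiment of this application;
Figure 2 is a flowchart of the second distributed voice monitoring method provided by an embodiment of this application;
Figure 3 is a flowchart of the Kafka pending message queue processing method provided by an embodiment of this application;
Figure 4 is a flowchart of the third distributed voice monitoring method provided by an embodiment of this application;
Figure 5 is a flowchart of the fourth distributed voice monitoring method provided by an embodiment of this application;
Figure 6 is a flowchart of the fifth distributed voice monitoring method provided by an embodiment of this application;
Figure 7 is a schematic structural diagram of the distributed voice monitoring apparatus provided by an embodiment of this application;
Figure 8 is a schematic structural diagram of the first distributed voice monitoring system provided by an embodiment of this application;
Figure 9 is a schematic structural diagram of the second distributed voice monitoring system provided by an embodiment of this application;
Figure 10 is a schematic structural diagram of the third distributed voice monitoring system provided by an embodiment of this application;
Figure 11 is a schematic structural diagram of the computer device provided by an embodiment of this application.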
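Embodiments of this application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain this application and are not to be construed as limiting it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural. It should be further understood that the word "comprising" used in the specification of this application indicates the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. The expression "and/or" used herein includes all or any one of the associated listed items and all combinations thereof.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which this application belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as here, are not to be interpreted in an idealized or overly formal sense.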
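Terms used in the embodiments of this application:
Voice social application: a social application that relies on voice as the main means of communicating, making friends, chatting and live streaming.
Service registration and discovery system: a system that registers the service processes involved in review and monitoring and broadcasts service online and offline notifications to the registered service processes.
MS (Media Server): manages the audio stream data generated in real time by the voice social application and pushes that audio stream data to the machine review system.
Machine review system: a system that provides a machine recognition review service for pure voice content.
Re-review system: a system that provides a re-review service for machine recognition review results.
Single-line machine room: a machine room with a line from only one operator, such as a China Telecom, China Unicom or China Mobile line; it can be accessed only by users of that operator. Bandwidth in a single-line machine room is cheap.
Dual-line machine room: a machine room with lines from two operators; for example, with both China Telecom and China Unicom lines, users of both operators can access it. Bandwidth in a dual-line machine room is expensive.
Multi-line machine room: a machine room with lines from several operators at once; it can be accessed by users of those operators. Bandwidth in a multi-line machine room is expensive.
Kafka message middleware: a distributed publish-subscribe messaging system open-sourced by LinkedIn and currently an Apache top-level project. Kafka's main characteristics are that message consumption is handled in a pull-based mode and that it pursues high throughput without strict requirements on message duplication, loss or errors, which makes it suitable for the data collection workloads of Internet services that generate large amounts of data.
Disaster recovery: building two or more systems with the same functions that monitor each other's health and can switch functions between them, so that when one system stops working because of an accident such as a fire or an earthquake, the others can immediately take over processing.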
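It is necessary first to give the following introductory explanation of the characteristics of voice monitoring and review.
Compared with monitoring and reviewing text chat content or live video content, monitoring and reviewing pure voice content is more difficult, specifically in the following respects:
Voice chat and live content demand a high degree of real-time delivery, so a model in which content is reviewed before being presented to users is unsuitable. The only options are patrols, or collecting audio with a delay and pushing it for review, followed by penalties for the violations that review uncovers. Delays in review and penalties mean that a malicious incident may already have occurred and caused serious social harm. Low-latency review must therefore be supported.
Text recognition and image recognition have been developed for a long time, and mature machine recognition technology can quickly assist reviewers; manual review of pictures and text is also faster. Audio content recognition technology lags behind by comparison. Audio content and scenarios are diverse and often accompanied by ambient noise and background music; channels are complex, voice quality is uneven, the signal-to-noise ratio is low, and audio durations vary, with most utterances being very short and carrying too little information. A human reviewer has to listen for long enough before judging whether audio violates the rules, so the review workload is heavy and efficiency is low.
At present, to support high concurrency and disaster recovery, highly active voice social applications usually deploy audio media servers across multiple network operators and multiple single-line machine rooms. The volume of audio data generated each day is very large, typically reaching the 10 TB level. For such applications, achieving voice monitoring and review with a high input-output ratio, high coverage, low latency, high reliability, easy scalability, a high recognition rate and high efficiency is a very great challenge.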
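To this end, an embodiment of this application provides a distributed voice monitoring method applied to a machine review system. As shown in Figure 1, the method includes:
Step S110: acquire audio stream data belonging to the same machine room.
For this embodiment, the audio stream data is all the binary audio stream data generated in real time while users of a highly active voice social application chat or live-stream by voice in voice rooms. The audio stream data is provided by the MS media servers deployed by the voice social application.
For this embodiment, the MS media servers and the machine review systems are both deployed per machine room, and while the machine review system is operating normally the audio stream data circulates within its own machine room. The machine review system of a single-line machine room acquires audio stream data belonging to the same machine room by receiving the audio stream data pushed by the MS media servers of that machine room.
Step S120: collect pending audio data from the audio stream data according to a preset push-review strategy.
For this embodiment, the machine review system performs machine recognition review on part of the audio stream data it has acquired; the part of the audio stream data that is to undergo machine recognition review is the pending audio data.
For this embodiment, a push-review strategy is set in advance. The preset push-review strategy specifies, for audio stream data belonging to different preset classifications, the collection frequency and collection duration with which pending audio data is collected from the audio stream data. After the audio stream data is acquired, the preset classification to which it belongs is determined, and pending audio data is collected from it at the collection frequency and for the collection duration that the preset push-review strategy assigns to that classification. The preset classifications include, but are not limited to: user, voice room, the user's level in the voice social application, room live-streaming type, and number of users in the room.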
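The lookup from a stream's classification to its collection frequency and duration can be sketched as follows. This is a minimal illustration only: the `SamplingRule` type, the policy table, its numeric values and the "strictest rule wins" tie-break are all assumptions made for the example and are not specified in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingRule:
    interval_s: int  # collect one clip every interval_s seconds (collection frequency)
    duration_s: int  # length of each collected clip in seconds (collection duration)


# Hypothetical policy table keyed by (classification kind, classification value).
POLICY = {
    ("user_level", "new"): SamplingRule(interval_s=30, duration_s=15),
    ("user_level", "trusted"): SamplingRule(interval_s=300, duration_s=10),
    ("room_type", "live"): SamplingRule(interval_s=60, duration_s=20),
}
DEFAULT_RULE = SamplingRule(interval_s=120, duration_s=10)


def select_rule(stream_tags: dict) -> SamplingRule:
    """Pick the rule for a stream from its classifications.

    Assumption: when several classifications match, the rule that samples
    most often (smallest interval) is used.
    """
    matches = [POLICY[(k, v)] for k, v in stream_tags.items() if (k, v) in POLICY]
    if not matches:
        return DEFAULT_RULE
    return min(matches, key=lambda rule: rule.interval_s)
```

For example, a live room hosted by a new user would be sampled under the new-user rule here, because that rule has the shorter interval.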
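Step S130: input the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model.
For this embodiment, the audio recognition model is an audio recognition model pre-trained on a GPU (Graphics Processing Unit). A GPU can be used to process speech, has powerful computing capability, and is suitable for accelerating the network training of an audio recognition model.
An audio recognition model pre-trained on a GPU can provide a GPU-based machine recognition service. Specifically, it is executed on a GPU machine to perform intelligent machine detection and recognition on the features of the pending audio data and, after recognition, returns the predicted value corresponding to the audio recognition model, thereby intelligently detecting and classifying the pending audio data.
The audio recognition model may be any of several audio recognition models and can be extended to recognize audio data carrying several types of harmful information, for example an audio recognition model for recognizing pornographic information, an audio recognition model for recognizing political speech, an audio recognition model for recognizing violent information, and so on. It should be made clear that the audio recognition model may also be one for recognizing other types of harmful information; those skilled in the art can determine, according to actual application requirements, which types of harmful information the audio recognition model recognizes, and the embodiments of this application place no limitation on this.
For example, in an application scenario that aims to determine whether pending audio data contains pornographic content, the collected pending audio data is input into a pre-trained audio recognition model for recognizing pornographic information to obtain the predicted value corresponding to that model; the predicted value can be used to determine whether the pending audio data contains pornographic content.
Step S140: generate an audio machine review result according to the predicted value.
For this embodiment, after the predicted value of the pending audio data for the audio recognition model is obtained, the risk that the pending audio data contains the type of harmful information corresponding to the audio recognition model is assessed according to how high the predicted value is, and the audio machine review result of the pending audio data is generated.
With the distributed voice monitoring method provided by the embodiments of this application, through distributed, decentralized machine review, there is no need to build a huge and complex central machine review system at high construction cost, and audio stream data normally circulates within its own machine room, so no cross-machine-room or cross-operator bandwidth traffic is generated; a higher input-output ratio can be achieved and the cost of audio content monitoring is significantly reduced. Through cooperation among the machine review systems, each performs machine recognition review on the pending audio data belonging to its own machine room, which enables real-time push review of the huge volume of audio stream data of highly active voice social applications, supports low-latency monitoring and review, and gives machine recognition review sufficiently large coverage, so that a high recognition rate and review efficiency can be achieved. The method can achieve voice monitoring and review with a high input-output ratio, high coverage, low latency, a high recognition rate and high efficiency, and can meet the audio content monitoring needs of highly active voice social applications deployed in a multi-operator hybrid networking environment.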
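In one embodiment, as shown in Figure 2, step S110 of acquiring audio stream data belonging to the same machine room includes:
S111: receive a machine review service call request sent by a media server belonging to the same machine room.
S112: in response to the machine review service call request, acquire the audio stream data uploaded by the media server.
For this embodiment, the machine review system provides a call interface for the machine review service, and an MS media server can push audio stream data to the machine review system by calling that interface.
For this embodiment, because the MS media servers and the machine review systems are both deployed per machine room, an MS media server can preferentially send its machine review service call request to the machine review system of its own machine room and push all the audio stream data it manages to that system through the system's machine review service call interface. Accordingly, while the machine review system is operating normally, it receives the machine review service call requests sent by the MS media servers of the same machine room and, after responding to a request, acquires the audio stream data of the same machine room that the MS media server uploads through the call interface. In this way the audio stream data circulates within its own machine room while the machine review system is operating normally, which gives a high input-output ratio, enables real-time push review of the huge volume of audio stream data of highly active voice social applications, and supports low-latency monitoring and review.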
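In one embodiment, as shown in Figure 3, after step S120 of collecting pending audio data from the audio stream data according to the preset push-review strategy, and before step S130 of inputting the pending audio data into the pre-trained audio recognition model to obtain the predicted value corresponding to the audio recognition model, the method further includes:
Step S310: save the pending audio data and determine the uniform resource locator (URL) under which the pending audio data is saved.
For this embodiment, after the pending audio data is collected from the audio stream data, the pending audio data in binary compressed format is uploaded to and saved in the storage subsystem of the machine review system, which yields the URL under which the pending audio data is saved in the storage subsystem.
Step S320: generate a pending message for the pending audio data according to the associated information of the pending audio data and the URL; write the pending message of the pending audio data into a Kafka pending message queue.
For this embodiment, the associated information of the pending audio data is information related to the pending audio data, for example the user to whom the pending audio data belongs, the voice room, the user's level in the voice social application, the room live-streaming type, the number of users in the room, and other related information.
For this embodiment, Kafka message middleware is introduced into the machine review system to assist its voice monitoring and review. Specifically, a pending message for the pending audio data is generated from the associated information of the pending audio data and the URL, and the pending message is written into the Kafka pending message queue; that is, the associated information of the pending audio data and the URL under which the pending audio data is saved in the storage subsystem are stored in the Kafka pending message queue. The Kafka pending message queue is the container that holds pending messages during message transmission. Kafka's main characteristics are that message consumption is handled in a pull-based mode and that it pursues high throughput without strict requirements on message duplication, loss or errors, which makes it suitable for the data collection workloads of Internet services that generate large amounts of data. Introducing the Kafka pending message queue therefore ensures high reliability and easy horizontal scaling, shaves the peaks of instantaneous high concurrency, improves machine utilization, and, through the instantiated storage of messages, flexibly supports a variety of failure retry strategies.
Step S330: when the pending message is read from the Kafka pending message queue, download the pending audio data according to the URL in the pending message.
For this embodiment, the pending message consumer process of the machine review system continuously reads pending messages from the Kafka pending message queue. When the pending message is read from the Kafka pending message queue, the pending audio data in binary compressed format is downloaded from the storage subsystem of the machine review system according to the URL in the pending message; once decoded, the pending audio data can be input into the pre-trained audio recognition model to obtain the predicted value corresponding to the audio recognition model.
In this embodiment, by introducing Kafka message middleware into the machine review system to assist its voice monitoring and review, the system remains flexible and easy to scale out and in horizontally; its peak-shaving and valley-filling characteristics ensure high availability and high reliability; and the instantiated storage of messages enables flexible retry strategies, effectively meeting the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment.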
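Steps S310 to S330 can be sketched as a small producer/consumer pair. To keep the example self-contained, an in-memory `queue.Queue` stands in for the Kafka pending message queue and a dictionary stands in for the storage subsystem; the URL scheme, the JSON message layout and all function names are assumptions for illustration, not details taken from the application or from any Kafka client library.

```python
import json
import queue

storage = {}                   # stand-in for the storage subsystem: URL -> bytes
pending_queue = queue.Queue()  # stand-in for the Kafka pending message queue


def save_clip(clip_id: str, data: bytes) -> str:
    """Step S310: save the clip and return the URL it was saved under."""
    url = f"http://storage.example/clips/{clip_id}"  # hypothetical URL scheme
    storage[url] = data
    return url


def enqueue_pending(clip_id: str, data: bytes, user_id: str, room_id: str) -> str:
    """Step S320: build the pending message (associated info + URL) and enqueue it."""
    url = save_clip(clip_id, data)
    message = {"user_id": user_id, "room_id": room_id, "url": url}
    pending_queue.put(json.dumps(message))
    return url


def consume_one():
    """Step S330: read one pending message and download the clip by its URL."""
    message = json.loads(pending_queue.get())
    return message, storage[message["url"]]
```

Only the small message travels through the queue; the audio bytes stay in storage and are fetched by URL when the consumer is ready, which is what lets the queue absorb bursts cheaply.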
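In one embodiment, before collecting pending audio data from the audio stream data according to the preset push-review strategy, the method further includes:
collecting in-application user behavior data and user tag data at a preset period, and generating a preset push-review strategy graded by user; and/or
collecting in-application voice room tag data at a preset period, and generating a preset push-review strategy graded by voice room.
For this embodiment, in-application user behavior data, user tag data and voice room tag data are collected at a preset period to generate graded preset push-review strategies for different users and voice rooms, and these graded strategies are placed under configuration management. This provides strong technical support for implementing flexible and varied push-review strategies and for keeping monitoring and review coverage within a reasonable range.
The preset period may be one day, one week, one month or another length of time; those skilled in the art can determine the specific length of the preset period according to actual application requirements, and the embodiments of this application place no limitation on this.
The user behavior data is the behavior data generated when a user communicates, makes friends, chats, live-streams and so on within the voice social application.
The user tag data is the user's tag data within the voice social application, for example personal tags such as age, gender and personality, or preference tags such as preferred friend groups and preferred voice room types.
The voice room tag data is tag data such as the voice topic and the voice audience of a voice room within the voice social application.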
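In one embodiment, collecting pending audio data from the audio stream data according to the preset push-review strategy includes:
determining the user and/or voice room corresponding to the audio stream data;
determining, according to the preset push-review strategy, the collection frequency and collection duration of pending audio data for that user and/or voice room;
collecting pending audio data from the audio stream data at that collection frequency and for that collection duration.
For this embodiment, the preset push-review strategy defines in advance different collection frequencies and collection durations of pending audio data for different users and voice rooms. When pending audio data is collected from the audio stream data, different collection frequencies and durations can therefore be applied per user and per voice room according to the preset push-review strategy, and pending audio data is collected from the audio stream data accordingly.
In this embodiment, with a push-review strategy based on graded push review, different collection frequencies and collection durations of pending audio data are applied per user and per voice room, which makes the scope of review and monitoring more targeted, applies a grading strategy to monitored objects so as to reach reasonable monitoring coverage, achieves a higher review recognition rate and accuracy, and significantly improves the operating efficiency of the machine review system.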
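Given a collection frequency and a collection duration, the clips to cut from a stream can be computed as time windows. The sketch below assumes the frequency is expressed as "one clip every `interval_s` seconds" and that a clip running past the end of the stream is truncated; both conventions, and the function name, are illustrative assumptions.

```python
def sampling_windows(stream_length_s: int, interval_s: int, duration_s: int):
    """Return (start, end) offsets in seconds of the clips to collect.

    One clip of duration_s seconds starts every interval_s seconds.
    """
    if interval_s <= 0 or duration_s <= 0:
        raise ValueError("interval_s and duration_s must be positive")
    windows = []
    start = 0
    while start < stream_length_s:
        end = min(start + duration_s, stream_length_s)  # truncate at stream end
        windows.append((start, end))
        start += interval_s
    return windows
```

With a 100-second stream, a 40-second interval and 10-second clips, this yields three clips covering 30 of the 100 seconds, which shows how the two parameters trade review coverage against machine review load.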
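In one embodiment, before collecting pending audio data from the audio stream data according to the preset push-review strategy, the method further includes:
when a preset disaster recovery condition is met, receiving audio stream data from another machine room of the same operator.
For this embodiment, the preset disaster recovery condition is met when the machine review system of a single-line machine room has completely stopped working and cannot perform audio content monitoring and review. In that case the audio stream data of the MS media servers of that single-line machine room is no longer distributed to the machine review system of its own machine room but to another single-line machine room of the same operator.
Therefore, when the preset disaster recovery condition is met, if an MS media server in another single-line machine room of the same operator chooses to send a machine review service call request to the current single-line machine room and pushes all the audio stream data it manages to the machine review system of the current machine room through that system's machine review service call interface, the machine review system of the current machine room, after responding to the request, receives the cross-machine-room audio stream data of the same operator and performs machine recognition review on it, thereby taking over processing.
For example, machine room A and machine room B are two single-line machine rooms of the same operator. When the machine review system of machine room A fails and stops working, the MS media servers of machine room A no longer push audio stream data to the machine review system of machine room A; their audio stream data is instead distributed to the machine review system of machine room B, which receives the cross-machine-room audio stream data (from machine room A) of the same operator and carries out further machine recognition review.
In this embodiment, by receiving cross-machine-room audio stream data of the same operator and taking over processing when the disaster recovery condition is met, audio stream data normally circulates within its own machine room without generating cross-machine-room or cross-operator bandwidth traffic, and traffic between machine rooms of the same operator arises only in disaster recovery situations. Bandwidth costs remain controllable, which meets the audio content monitoring needs of highly active voice social applications deployed in a multi-operator hybrid networking environment.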
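The routing rule in this embodiment (local machine room first, then another machine room of the same operator, never another operator) can be sketched as below. The `ReviewSystem` record, the `healthy` flag and the operator names are hypothetical; how health is actually detected is not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReviewSystem:
    room: str       # machine room the machine review system is deployed in
    operator: str   # network operator of that single-line machine room
    healthy: bool   # assumed health flag, e.g. derived from online/offline notices


def choose_target(ms_room: str, ms_operator: str,
                  systems: List[ReviewSystem]) -> Optional[ReviewSystem]:
    """Pick where a media server should push its audio stream data."""
    # Normal case: stay inside the media server's own machine room.
    for system in systems:
        if system.room == ms_room and system.healthy:
            return system
    # Disaster recovery: fall back to another room of the same operator only.
    for system in systems:
        if system.operator == ms_operator and system.room != ms_room and system.healthy:
            return system
    return None  # no same-operator system available
```

Restricting the fallback to the same operator is what keeps disaster-recovery traffic from crossing operator boundaries.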
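In one embodiment, as shown in Figure 4, after step S140 of generating the audio machine review result according to the predicted value, the method further includes:
Step S150: determine, according to the audio machine review result, whether to re-review the pending audio data.
For this embodiment, voice monitoring and review combines machine review with re-review. After the audio machine review result of the pending audio data for the audio recognition model is obtained, whether to re-review the pending audio data is decided under a given policy, according to the risk, reflected in the audio machine review result, that the pending audio data contains the type of harmful information corresponding to the audio recognition model.
Specifically, the same or different preset thresholds can be set in advance for different audio recognition models, and whether to re-review the pending audio data is decided by whether its predicted value for an audio recognition model exceeds the preset threshold of that model: when the predicted value exceeds the preset threshold, the pending audio data is to be re-reviewed; when the predicted value does not exceed the preset threshold, the pending audio data does not need to be re-reviewed.
The re-review is specifically a manual review.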
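The per-model threshold check can be sketched as follows. The model names, the threshold values and the fallback threshold for unlisted models are assumptions for illustration; the application only states that each model may have its own preset threshold and that a predicted value exceeding it triggers re-review.

```python
# Hypothetical per-model thresholds; predicted values are assumed to lie in [0, 1].
THRESHOLDS = {"porn": 0.8, "violence": 0.7}
DEFAULT_THRESHOLD = 0.9  # assumed fallback for models without an explicit threshold


def models_needing_rereview(predictions: dict) -> list:
    """Return the models whose predicted value exceeds that model's threshold."""
    return sorted(
        model
        for model, value in predictions.items()
        if value > THRESHOLDS.get(model, DEFAULT_THRESHOLD)
    )


def needs_rereview(predictions: dict) -> bool:
    """A clip goes to manual re-review if any model exceeds its threshold."""
    return bool(models_needing_rereview(predictions))
```

Note that the comparison is strict: a predicted value exactly equal to the threshold does not trigger re-review, matching the "exceeds" wording above.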
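Step S160: if so, generate a machine review result message for the pending audio data according to the audio machine review result; write the machine review result message of the pending audio data into a Kafka machine review result message queue.
For this embodiment, after it is determined from the audio machine review result that the pending audio data is to be re-reviewed, the pending audio data is converted into a playable wav file, and the wav file is uploaded to and saved in the storage subsystem of the machine review system, which makes it convenient to retrieve and play the file in the subsequent re-review stage.
For this embodiment, Kafka message middleware is introduced into the machine review system to assist the distribution of audio machine review results.
Specifically, a machine review result message for the pending audio data is generated from its audio machine review result and written into the Kafka machine review result message queue; that is, the machine review result message of the pending audio data is stored in the Kafka machine review result message queue. The Kafka machine review result message queue is the container that holds machine review result messages during message transmission. Kafka's main characteristics are that message consumption is handled in a pull-based mode and that it pursues high throughput without strict requirements on message duplication, loss or errors, which makes it suitable for the data collection workloads of Internet services that generate large amounts of data. Introducing the Kafka machine review result message queue therefore ensures high reliability and easy horizontal scaling, shaves the peaks of instantaneous high concurrency, improves machine utilization, and, through the instantiated storage of messages, flexibly supports a variety of failure retry strategies.
Step S170: when the machine review result message is read from the Kafka machine review result message queue, distribute the machine review result of the pending audio data to the re-review system.
For this embodiment, the service process of the machine review result distribution subsystem of the machine review system continuously reads machine review result messages from the Kafka machine review result message queue. When the machine review result message is read from the Kafka machine review result message queue, the machine review result of the pending audio data is distributed to the re-review system so that the re-review system re-reviews the pending audio data corresponding to that message. The re-review system is specifically a manual review system.
In this embodiment, combining machine review with re-review further improves the accuracy of audio content monitoring and review.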
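In addition, an embodiment of this application provides another distributed voice monitoring method. As shown in Figure 5, the method includes the following steps:
Step S510: the service registration and discovery system broadcasts the address information of the machine review system that belongs to the same machine room as the media server.
For this embodiment, the service registration and discovery system is a system that registers the service processes involved in review and monitoring and broadcasts service online and offline notifications to the registered service processes. It deploys its service registration and discovery processes as service instances. The service registration and discovery system can be used to implement distributed service management: by broadcasting the address information of the machine review system that belongs to the same machine room as an MS media server, it lets the MS media server learn the address of the machine review system in its own machine room and preferentially push all the audio stream data it manages within that machine room. The service processes thus work in coordination, so that audio stream data normally circulates within its own machine room without generating cross-machine-room or cross-operator bandwidth traffic, and is distributed to another single-line machine room of the same operator only when the machine review system of its own machine room has stopped working. The address information includes an IP address and a port.
Step S520: the media server sends audio stream data to the machine review system belonging to the same machine room according to the address information.
For this embodiment, the machine review system provides a call interface for the machine review service, and an MS media server can push audio stream data to the machine review system by calling that interface at the address information corresponding to it. In this embodiment, after learning the address information of the machine review system belonging to the same machine room, the MS media server sends a machine review service call request to that machine review system according to the address information and pushes all the audio stream data it manages to the machine review system of its own machine room through that system's machine review service call interface.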
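The broadcast of online and offline notices, and the media server's use of them to find the machine review system in its own machine room, can be sketched with a simple in-process publish-subscribe pair. The class names, the event strings and the `(ip, port)` tuples are hypothetical; a real deployment would use a networked registry rather than direct callbacks.

```python
from collections import defaultdict


class ServiceRegistry:
    """Toy registry that broadcasts online/offline notices to subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def announce(self, event: str, room: str, address: tuple):
        # Broadcast the notice to every registered service process.
        for callback in self._subscribers:
            callback(event, room, address)


class MediaServer:
    """Keeps a view of machine review addresses per machine room."""

    def __init__(self, room: str, registry: ServiceRegistry):
        self.room = room
        self.known = defaultdict(set)  # room -> set of (ip, port)
        registry.subscribe(self._on_event)

    def _on_event(self, event: str, room: str, address: tuple):
        if event == "online":
            self.known[room].add(address)
        elif event == "offline":
            self.known[room].discard(address)

    def local_targets(self):
        """Addresses of machine review systems in this server's own room."""
        return sorted(self.known[self.room])
```

When `local_targets()` comes back empty, the media server would fall back to a same-operator machine room, as in the disaster recovery embodiment.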
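Step S530: the machine review system acquires the audio stream data belonging to the same machine room; collects pending audio data from the audio stream data according to a preset push-review strategy; inputs the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generates an audio machine review result according to the predicted value.
For this embodiment, the machine review system provides a machine recognition review service for pure voice content, including but not limited to the reception, push review and machine recognition of audio stream data.
For this embodiment, the specific functions of the machine review system in step S530 have the same technical features as steps S110 to S140 of the distributed voice monitoring method applied to the machine review system described above; for the specific implementation of step S530, refer to the description in the above embodiments, which is not repeated here.
In the distributed voice monitoring method provided by this embodiment, the machine review system can also implement the other method embodiments of the distributed voice monitoring method applied to the machine review system described above; for the specific implementation, refer to the description in those method embodiments, which is likewise not repeated here.
The distributed voice monitoring method provided by the embodiments of this application achieves distributed, decentralized monitoring and review of pure voice content through the service registration and discovery system and the MS media servers and machine review systems deployed per single-line machine room. The method can achieve voice monitoring and review with a high input-output ratio, high coverage, low latency, high reliability, easy scalability, a high recognition rate and high efficiency, and can meet the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment.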
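In one embodiment, as shown in Figure 6, after generating the audio machine review result according to the predicted value in step S530, the method further includes:
Step S540: the machine review system determines, according to the audio machine review result, that the pending audio data is to be re-reviewed, and distributes the audio machine review result of the pending audio data to the re-review system.
For this embodiment, voice monitoring and review combines machine review with re-review.
For this embodiment, the specific functions of the machine review system in step S540 have the same technical features as steps S150 to S170 of the distributed voice monitoring method applied to the machine review system described above; for the specific implementation of step S540, refer to the description in the above embodiments, which is not repeated here.
Step S550: the re-review system receives the audio machine review result of the pending audio data, and re-reviews the pending audio data according to the audio machine review result to obtain the re-review result of the pending audio data.
For this embodiment, the re-review system is specifically a manual review system, which is a web platform system providing content review and management. The re-review system receives the audio machine review result of the pending audio data distributed by the machine review system, writes it into the operational database for manual review, and enters the audio machine review result and other related information into a pending work order. After obtaining the pending work order, a human reviewer of the re-review system, that is, of the manual review system, performs re-review operations such as manual confirmation on the audio machine review result of the pending audio data corresponding to the work order, thereby obtaining the re-review result of the pending audio data. The re-review operation can be divided into several stages such as first review, second review and final review. In addition, the re-review system can spot-check the results of the final review to verify that the re-review results are correct and reasonable. The re-review system can also carry out review operations such as manual confirmation on violation reports submitted through the voice social application.
In this embodiment, combining machine review with re-review further improves the accuracy of voice content monitoring and review.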
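In one embodiment, after re-reviewing the pending audio data according to the audio machine review result to obtain the re-review result of the pending audio data, the method further includes:
when the re-review result is that a violation exists, the re-review system determines the user corresponding to the pending audio data and, according to the violation penalty interface address information of the user's client application broadcast by the service registration and discovery system, sends a violation penalty call request to the violation penalty interface of the user's client application;
the client application penalizes the user for the violation.
For this embodiment, the client application is preconfigured with a violation penalty interface that provides a violation penalty service. When the re-review result confirms that the pending audio data contains a violation, the re-review system determines the user corresponding to the pending audio data in order to notify the voice social application of the corresponding client to penalize that user. Specifically, the re-review system is connected to the service registration and discovery system, which broadcasts the violation penalty interface address information of the voice social application of the corresponding client; the re-review system can send a violation penalty service call request to that violation penalty interface according to the address information.
In other embodiments, the re-review system can also save the re-review result in which a violation exists and send the related data, such as the audio review data corresponding to that re-review result, to the audio recognition model as labeled data for the learning and training of the corresponding audio recognition model, continuously improving the accuracy of the model's recognition review.
For this embodiment, after receiving the violation penalty service call request, the voice social application of the client responds to it and performs a preset violation penalty operation on the user who committed the violation; violation penalty operations include, but are not limited to, freezing the account, banning the account, and banning the corresponding live voice room.
In this embodiment, through the service registration and discovery system and the MS media servers and machine review systems deployed per single-line machine room, distributed, decentralized monitoring and review of pure voice content is achieved; combined with the re-review system, machine review results are re-reviewed, and the client application is requested to penalize users who commit violations according to the re-review result, forming an end-to-end closed loop of pure voice content review and monitoring that runs from push review of the voice social application's audio stream data, through machine recognition review and re-review of machine review results, to the penalty taking effect in the voice social application. This supports high concurrency and low review latency, can quickly stamp out violating information and content, prevents malicious incidents from occurring and spreading, and can meet the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment.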
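In addition, an embodiment of this application provides a distributed voice monitoring apparatus. As shown in Figure 7, the apparatus includes: an audio stream data acquisition module 71, a pending audio data collection module 72, an audio recognition module 73 and a machine review result generation module 74; wherein
the audio stream data acquisition module 71 is configured to acquire audio stream data belonging to the same machine room;
the pending audio data collection module 72 is configured to collect pending audio data from the audio stream data according to a preset push-review strategy;
the audio recognition module 73 is configured to input the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model;
the machine review result generation module 74 is configured to generate an audio machine review result according to the predicted value.
In one embodiment, the audio stream data acquisition module 71 is specifically configured to:
receive a machine review service call request sent by a media server belonging to the same machine room;
in response to the machine review service call request, acquire the audio stream data uploaded by the media server.
In one embodiment, after collecting pending audio data from the audio stream data according to the preset push-review strategy and before inputting the pending audio data into the pre-trained audio recognition model to obtain the predicted value corresponding to the audio recognition model, the following is further included:
saving the pending audio data and determining the uniform resource locator (URL) under which the pending audio data is saved;
generating a pending message for the pending audio data according to the associated information of the pending audio data and the URL; writing the pending message of the pending audio data into a Kafka pending message queue;
when the pending message is read from the Kafka pending message queue, downloading the pending audio data according to the URL in the pending message.
In one embodiment, before collecting pending audio data from the audio stream data according to the preset push-review strategy, the following is further included:
collecting in-application user behavior data and user tag data at a preset period, and generating a preset push-review strategy graded by user; and/or
collecting in-application voice room tag data at a preset period, and generating a preset push-review strategy graded by voice room.
In one embodiment, the pending audio data collection module 72 is specifically configured to:
determine the user and/or voice room corresponding to the audio stream data;
determine, according to the preset push-review strategy, the collection frequency and collection duration of pending audio data for that user and/or voice room;
collect pending audio data from the audio stream data at that collection frequency and for that collection duration.
In one embodiment, before collecting pending audio data from the audio stream data according to the preset push-review strategy, the following is further included:
when a preset disaster recovery condition is met, receiving audio stream data from another machine room of the same operator.
In one embodiment, after generating the audio machine review result according to the predicted value, the following is further included:
determining, according to the audio machine review result, whether to re-review the pending audio data;
if so, generating a machine review result message for the pending audio data according to the audio machine review result; writing the machine review result message of the pending audio data into a Kafka machine review result message queue;
when the machine review result message is read from the Kafka machine review result message queue, distributing the machine review result of the pending audio data to the re-review system.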
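The distributed voice monitoring apparatus provided by this application can achieve the following: through distributed, decentralized machine review, there is no need to build a huge and complex central machine review system at high construction cost, and audio stream data normally circulates within its own machine room, so no cross-machine-room or cross-operator bandwidth traffic is generated; a higher input-output ratio can be achieved and the cost of audio content monitoring is significantly reduced. Through cooperation among the machine review systems, each performs machine recognition review on the pending audio data belonging to its own machine room, which enables real-time push review of the huge volume of audio stream data of highly active voice social applications, supports low-latency monitoring and review, and gives machine recognition review sufficiently large coverage, so that a high recognition rate and review efficiency can be achieved. The method can achieve voice monitoring and review with a high input-output ratio, high coverage, low latency, a high recognition rate and high efficiency, and can meet the audio content monitoring needs of highly active voice social applications deployed in a multi-operator hybrid networking environment. It can also achieve the following: by introducing Kafka message middleware into the machine review system to assist its voice monitoring and review, the system remains flexible and easy to scale out and in horizontally; its peak-shaving and valley-filling characteristics ensure high availability and high reliability; and the instantiated storage of messages enables flexible retry strategies, effectively meeting the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment; with a push-review strategy based on graded push review, different collection frequencies and collection durations of pending audio data are applied per user and per voice room, which makes the scope of review and monitoring more targeted, applies a grading strategy to monitored objects so as to reach reasonable monitoring coverage, achieves a higher review recognition rate and accuracy, and significantly improves the operating efficiency of the machine review system.
The distributed voice monitoring apparatus provided by the embodiments of this application can implement the method embodiments applied to the machine review system provided above; for the specific implementation, refer to the description in the method embodiments, which is not repeated here.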
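In addition, an embodiment of this application provides a distributed voice monitoring system. As shown in Figure 8, the distributed voice monitoring system includes: a service registration and discovery system 81, a media server 82 and a machine review system 83; wherein
the service registration and discovery system 81 is configured to broadcast the address information of the machine review system that belongs to the same machine room as the media server;
the media server 82 is configured to send audio stream data to the machine review system belonging to the same machine room according to the address information;
the machine review system 83 is configured to acquire the audio stream data belonging to the same machine room; collect pending audio data from the audio stream data according to a preset push-review strategy; input the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generate an audio machine review result according to the predicted value.
In one embodiment, as shown in Figure 9, the distributed voice monitoring system further includes a re-review system 84; wherein
the machine review system 83 is further configured to determine, according to the audio machine review result, that the pending audio data is to be re-reviewed, and to distribute the audio machine review result of the pending audio data to the re-review system;
the re-review system 84 is configured to receive the audio machine review result of the pending audio data, and to re-review the pending audio data according to the audio machine review result to obtain the re-review result of the pending audio data.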
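Below, with reference to Figure 10, a specific embodiment is given to further describe the distributed voice monitoring system:
The distributed voice monitoring system includes a service registration and discovery system, at least two machine review systems, at least two voice social application MS media servers, and a manual review system.
(1) Service registration and discovery system. The service registration and discovery system is a system that registers the service processes involved in review and monitoring and broadcasts service online and offline notifications to the registered service processes. It is connected to the MS media servers of the voice social application, to the machine review systems, and to the manual review system that serves as the re-review system. It deploys its service registration and discovery processes as service instances; as shown in Figure 10, several service registration and discovery service instances are deployed in it. The service registration and discovery system implements distributed service management and coordinates the service processes so that audio stream data is normally transmitted within its own machine room, avoiding cross-machine-room traffic charges; the data is distributed to another single-line machine room of the same operator only when none of the machine review service processes in its own machine room is working.
(2) Machine review system. The machine review systems are deployed per single-line machine room; as shown in Figure 10, machine room 1 and machine room 2 each have their own machine review system. The machine review system provides a machine recognition review service for pure voice content, including but not limited to the reception, push review, storage and machine recognition of audio stream data and the distribution of machine review results. It specifically includes: a push-review strategy subsystem, an audio machine recognition subsystem, a machine review result distribution subsystem, a storage subsystem, an audio pending message queue serving as the Kafka pending message queue, and an audio review result queue serving as the Kafka machine review result message queue.
a. Push-review strategy subsystem. Receives the audio streams pushed within the same machine room, manages push-review strategies, compresses audio stream files and saves them to the storage subsystem, and writes push-review messages into the Kafka audio pending message queue.
b. Audio pending message queue. Holds the pending messages of pending audio data. The producer of pending messages is the service process of the push-review strategy subsystem, and the consumer is the service process of the audio machine recognition subsystem.
c. Audio machine recognition subsystem. Fetches audio stream files from the storage subsystem, performs machine recognition, writes machine review results into the audio review result queue, and saves the corresponding audio stream files in wav format to the storage subsystem.
d. Audio review result queue. Holds the machine review result messages of pending audio data. The producer of machine review result messages is the service process of the audio machine recognition subsystem, and the consumer is the service process of the machine review result distribution subsystem.
e. Machine review result distribution subsystem. Pushes machine review results to the manual review system.
f. Storage subsystem. Stores the original audio stream files and the transcoded WAV audio files of high-risk clips. It provides a file upload API that receives binary audio stream data and returns the URL under which the data is stored, and it supports automatic cleanup of files according to a specific retention policy.
(3) Manual review system serving as the re-review system. The manual review system is a web platform system providing content review and management. It receives the machine review results of the machine review systems of all machine rooms, supports manual review of machine review results, manual review and confirmation of violation reports submitted through the voice social application, quality spot checks of re-review results, and the initiation of penalty requests for violations, and it manages the information, organizational structure and role permissions of the manual review system's reviewers.
(4) Voice social application MS media server. The MS media servers of the voice social application are deployed per single-line machine room; as shown in Figure 10, machine room 1 and machine room 2 each have their own MS media servers. The voice social application MS media server pushes audio streams within the same machine room and provides the behavior penalty API.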
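The storage subsystem described in item f above (upload returns a URL, files are cleaned up under a retention policy) can be sketched as an in-memory store. The class name, the base URL, the sequential identifiers and the single time-to-live value are all assumptions for illustration; the actual retention policy is not detailed in the application.

```python
import itertools


class ClipStore:
    """In-memory stand-in for the storage subsystem's upload API."""

    def __init__(self, ttl_s: float, base_url: str = "http://storage.example/audio/"):
        self._ttl_s = ttl_s            # assumed retention period in seconds
        self._base_url = base_url      # hypothetical URL prefix
        self._ids = itertools.count(1)
        self._files = {}               # url -> (stored_at, data)

    def upload(self, data: bytes, now: float) -> str:
        """Store binary audio data and return the URL it is stored under."""
        url = f"{self._base_url}{next(self._ids)}"
        self._files[url] = (now, data)
        return url

    def download(self, url: str) -> bytes:
        return self._files[url][1]

    def purge_expired(self, now: float) -> int:
        """Delete files older than the retention period; return how many."""
        expired = [u for u, (stored_at, _) in self._files.items()
                   if now - stored_at > self._ttl_s]
        for url in expired:
            del self._files[url]
        return len(expired)
```

Passing `now` explicitly keeps the sketch deterministic; a real service would read the clock and run the purge on a schedule.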
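The distributed voice monitoring system provided by this application can achieve the following: through the service registration and discovery system and the MS media servers and machine review systems deployed per single-line machine room, distributed, decentralized monitoring and review of pure voice content is achieved; combined with the re-review system, machine review results are re-reviewed, and the client application is requested to penalize users who commit violations according to the re-review result, forming an end-to-end closed loop of pure voice content review and monitoring that runs from push review of the voice social application's audio stream data, through machine recognition review and re-review of machine review results, to the penalty taking effect in the voice social application. This supports high concurrency and low review latency, can quickly stamp out violating information and content, prevents malicious incidents from occurring and spreading, and can meet the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment.
The distributed voice monitoring system provided by the embodiments of this application can implement the method embodiments provided above; for the specific implementation, refer to the description in the method embodiments, which is not repeated here.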
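In addition, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the distributed voice monitoring method described in the above embodiments. The computer-readable storage medium includes, but is not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards or optical cards. That is, a storage device includes any medium that stores or transmits information in a form readable by a device (for example, a computer or a mobile phone), and may be a read-only memory, a magnetic disk, an optical disk, or the like.
The computer-readable storage medium provided by this application can achieve the following: through distributed, decentralized machine review, there is no need to build a huge and complex central machine review system at high construction cost, and audio stream data normally circulates within its own machine room, so no cross-machine-room or cross-operator bandwidth traffic is generated; a higher input-output ratio can be achieved and the cost of audio content monitoring is significantly reduced. Through cooperation among the machine review systems, each performs machine recognition review on the pending audio data belonging to its own machine room, which enables real-time push review of the huge volume of audio stream data of highly active voice social applications, supports low-latency monitoring and review, and gives machine recognition review sufficiently large coverage, so that a high recognition rate and review efficiency can be achieved. The method can achieve voice monitoring and review with a high input-output ratio, high coverage, low latency, a high recognition rate and high efficiency, and can meet the audio content monitoring needs of highly active voice social applications deployed in a multi-operator hybrid networking environment. It can also achieve the following: by introducing Kafka message middleware into the machine review system to assist its voice monitoring and review, the system remains flexible and easy to scale out and in horizontally; its peak-shaving and valley-filling characteristics ensure high availability and high reliability; and the instantiated storage of messages enables flexible retry strategies, effectively meeting the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment; with a push-review strategy based on graded push review, different collection frequencies and collection durations of pending audio data are applied per user and per voice room, which makes the scope of review and monitoring more targeted, applies a grading strategy to monitored objects so as to reach reasonable monitoring coverage, achieves a higher review recognition rate and accuracy, and significantly improves the operating efficiency of the machine review system; through the service registration and discovery system and the MS media servers and machine review systems deployed per single-line machine room, distributed, decentralized monitoring and review of pure voice content is achieved; combined with the re-review system, machine review results are re-reviewed, and the client application is requested to penalize users who commit violations according to the re-review result, forming an end-to-end closed loop of pure voice content review and monitoring that runs from push review of the voice social application's audio stream data, through machine recognition review and re-review of machine review results, to the penalty taking effect in the voice social application. This supports high concurrency and low review latency, can quickly stamp out violating information and content, prevents malicious incidents from occurring and spreading, and can meet the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment.
The computer-readable storage medium provided by the embodiments of this application can implement the method embodiments provided above; for the specific implementation, refer to the description in the method embodiments, which is not repeated here.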
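In addition, an embodiment of this application provides a computer device, as shown in Figure 11. The computer device described in this embodiment may be a server, a personal computer, a network device or a similar device. The computer device includes components such as a processor 1002, a memory 1003, an input unit 1004 and a display unit 1005. Those skilled in the art will understand that the device structure shown in Figure 11 does not limit all devices; a device may include more or fewer components than shown, or combine certain components. The memory 1003 can be used to store the computer program 1001 and the functional modules, and the processor 1002 runs the computer program 1001 stored in the memory 1003 to execute the various functional applications and data processing of the device. The memory may be internal memory, external memory, or both. Internal memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory. External memory may include hard disks, floppy disks, ZIP disks, USB flash drives, magnetic tapes and the like. The memory disclosed in this application includes, but is not limited to, these types of memory, which are given only as examples and not as limitations.
The input unit 1004 is used to receive signal input and keywords entered by the user. The input unit 1004 may include a touch panel and other input devices. The touch panel can collect the user's touch operations on or near it (such as operations performed on or near the touch panel with a finger, a stylus or any other suitable object or accessory) and drive the corresponding connected apparatus according to a preset program; other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control keys and power keys), a trackball, a mouse and a joystick. The display unit 1005 can be used to display information entered by the user or provided to the user, as well as the various menus of the computer device. The display unit 1005 may take the form of a liquid crystal display, an organic light-emitting diode display, or the like. The processor 1002 is the control center of the computer device; it connects the parts of the whole computer through various interfaces and lines, and performs various functions and processes data by running or executing the software programs and/or modules stored in the memory 1003 and calling the data stored in the memory.
As one embodiment, the computer device includes: one or more processors 1002, a memory 1003, and one or more computer programs 1001, wherein the one or more computer programs 1001 are stored in the memory 1003 and configured to be executed by the one or more processors 1002, and the one or more computer programs 1001 are configured to execute the distributed voice monitoring method described in any of the above embodiments.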
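The computer device provided by this application can achieve the following: through distributed, decentralized machine review, there is no need to build a huge and complex central machine review system at high construction cost, and audio stream data normally circulates within its own machine room, so no cross-machine-room or cross-operator bandwidth traffic is generated; a higher input-output ratio can be achieved and the cost of audio content monitoring is significantly reduced. Through cooperation among the machine review systems, each performs machine recognition review on the pending audio data belonging to its own machine room, which enables real-time push review of the huge volume of audio stream data of highly active voice social applications, supports low-latency monitoring and review, and gives machine recognition review sufficiently large coverage, so that a high recognition rate and review efficiency can be achieved. The method can achieve voice monitoring and review with a high input-output ratio, high coverage, low latency, a high recognition rate and high efficiency, and can meet the audio content monitoring needs of highly active voice social applications deployed in a multi-operator hybrid networking environment. It can also achieve the following: by introducing Kafka message middleware into the machine review system to assist its voice monitoring and review, the system remains flexible and easy to scale out and in horizontally; its peak-shaving and valley-filling characteristics ensure high availability and high reliability; and the instantiated storage of messages enables flexible retry strategies, effectively meeting the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment; with a push-review strategy based on graded push review, different collection frequencies and collection durations of pending audio data are applied per user and per voice room, which makes the scope of review and monitoring more targeted, applies a grading strategy to monitored objects so as to reach reasonable monitoring coverage, achieves a higher review recognition rate and accuracy, and significantly improves the operating efficiency of the machine review system; through the service registration and discovery system and the MS media servers and machine review systems deployed per single-line machine room, distributed, decentralized monitoring and review of pure voice content is achieved; combined with the re-review system, machine review results are re-reviewed, and the client application is requested to penalize users who commit violations according to the re-review result, forming an end-to-end closed loop of pure voice content review and monitoring that runs from push review of the voice social application's audio stream data, through machine recognition review and re-review of machine review results, to the penalty taking effect in the voice social application. This supports high concurrency and low review latency, can quickly stamp out violating information and content, prevents malicious incidents from occurring and spreading, and can meet the audio content monitoring needs of highly active voice social applications with instantaneous high concurrency deployed in a multi-operator hybrid networking environment.
The computer device provided by the embodiments of this application can implement the method embodiments provided above; for the specific implementation, refer to the description in the method embodiments, which is not repeated here.
In addition, the functional units in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software function module. If the integrated module is implemented as a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The above are only some of the implementations of this application. It should be noted that a person of ordinary skill in the art can make a number of improvements and refinements without departing from the principles of this application, and such improvements and refinements shall also be regarded as falling within the scope of protection of this application.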
Claims (15)
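- A distributed voice monitoring method, characterized by comprising the following steps: acquiring audio stream data belonging to the same machine room; collecting pending audio data from the audio stream data according to a preset push-review strategy; inputting the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generating an audio machine review result according to the predicted value.
- The distributed voice monitoring method according to claim 1, characterized in that acquiring audio stream data belonging to the same machine room comprises: receiving a machine review service call request sent by a media server belonging to the same machine room; and, in response to the machine review service call request, acquiring the audio stream data uploaded by the media server.
- The distributed voice monitoring method according to claim 1, characterized in that, after collecting pending audio data from the audio stream data according to the preset push-review strategy and before inputting the pending audio data into the pre-trained audio recognition model to obtain the predicted value corresponding to the audio recognition model, the method further comprises: saving the pending audio data and determining the uniform resource locator (URL) under which the pending audio data is saved; generating a pending message for the pending audio data according to the associated information of the pending audio data and the URL; writing the pending message of the pending audio data into a Kafka pending message queue; and, when the pending message is read from the Kafka pending message queue, downloading the pending audio data according to the URL in the pending message.
- The distributed voice monitoring method according to claim 1, characterized in that, before collecting pending audio data from the audio stream data according to the preset push-review strategy, the method further comprises: collecting in-application user behavior data and user tag data at a preset period, and generating a preset push-review strategy graded by user; and/or collecting in-application voice room tag data at a preset period, and generating a preset push-review strategy graded by voice room.
- The distributed voice monitoring method according to claim 4, characterized in that collecting pending audio data from the audio stream data according to the preset push-review strategy comprises: determining the user and/or voice room corresponding to the audio stream data; determining, according to the preset push-review strategy, the collection frequency and collection duration of pending audio data for that user and/or voice room; and collecting pending audio data from the audio stream data at that collection frequency and for that collection duration.
- The distributed voice monitoring method according to claim 1, characterized in that, before collecting pending audio data from the audio stream data according to the preset push-review strategy, the method further comprises: when a preset disaster recovery condition is met, receiving audio stream data from another machine room of the same operator.
- The distributed voice monitoring method according to claim 1, characterized in that, after generating the audio machine review result according to the predicted value, the method further comprises: determining, according to the audio machine review result, whether to re-review the pending audio data; if so, generating a machine review result message for the pending audio data according to the audio machine review result; writing the machine review result message of the pending audio data into a Kafka machine review result message queue; and, when the machine review result message is read from the Kafka machine review result message queue, distributing the machine review result of the pending audio data to a re-review system.
- A distributed voice monitoring method, characterized by comprising the following steps: a service registration and discovery system broadcasts the address information of the machine review system that belongs to the same machine room as a media server; the media server sends audio stream data to the machine review system belonging to the same machine room according to the address information; the machine review system acquires the audio stream data belonging to the same machine room; collects pending audio data from the audio stream data according to a preset push-review strategy; inputs the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generates an audio machine review result according to the predicted value.
- The distributed voice monitoring method according to claim 8, characterized in that, after generating the audio machine review result according to the predicted value, the method further comprises: the machine review system determines, according to the audio machine review result, that the pending audio data is to be re-reviewed, and distributes the audio machine review result of the pending audio data to a re-review system; the re-review system receives the audio machine review result of the pending audio data, and re-reviews the pending audio data according to the audio machine review result to obtain the re-review result of the pending audio data.
- The distributed voice monitoring method according to claim 9, characterized in that, after re-reviewing the pending audio data according to the audio machine review result to obtain the re-review result of the pending audio data, the method further comprises: when the re-review result is that a violation exists, the re-review system determines the user corresponding to the pending audio data and, according to the violation penalty interface address information of the user's client application broadcast by the service registration and discovery system, sends a violation penalty call request to the violation penalty interface of the user's client application; and the client application penalizes the user for the violation.
- A distributed voice monitoring apparatus, characterized by comprising: an audio stream data acquisition module, configured to acquire audio stream data belonging to the same machine room; a pending audio data collection module, configured to collect pending audio data from the audio stream data according to a preset push-review strategy; an audio recognition module, configured to input the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and a machine review result generation module, configured to generate an audio machine review result according to the predicted value.
- A distributed voice monitoring system, characterized by comprising: a service registration and discovery system, a media server and a machine review system; wherein the service registration and discovery system is configured to broadcast the address information of the machine review system that belongs to the same machine room as the media server; the media server is configured to send audio stream data to the machine review system belonging to the same machine room according to the address information; and the machine review system is configured to acquire the audio stream data belonging to the same machine room; collect pending audio data from the audio stream data according to a preset push-review strategy; input the pending audio data into a pre-trained audio recognition model to obtain a predicted value corresponding to the audio recognition model; and generate an audio machine review result according to the predicted value.
- The distributed voice monitoring system according to claim 12, characterized by further comprising a re-review system; wherein the machine review system is further configured to determine, according to the audio machine review result, that the pending audio data is to be re-reviewed, and to distribute the audio machine review result of the pending audio data to the re-review system; and the re-review system is configured to receive the audio machine review result of the pending audio data, and to re-review the pending audio data according to the audio machine review result to obtain the re-review result of the pending audio data.
- A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when executed by a processor the computer program implements the distributed voice monitoring method according to any one of claims 1 to 10.
- A computer device, characterized by comprising: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute the distributed voice monitoring method according to any one of claims 1 to 10.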
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811628102.2 | 2018-12-28 | ||
CN201811628102.2A CN111383659B (zh) | 2018-12-28 | 2018-12-28 | 分布式语音监控方法、装置、系统、存储介质和设备 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020134646A1 true WO2020134646A1 (zh) | 2020-07-02 |
Family
ID=71128655
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/116774 WO2020134646A1 (zh) | 2018-12-28 | 2019-11-08 | 分布式语音监控方法、装置、系统、存储介质和设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111383659B (zh) |
WO (1) | WO2020134646A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112765518A (zh) * | 2021-01-19 | 2021-05-07 | 广州趣丸网络科技有限公司 | 一种内容审核方法、装置及设备 |
CN115756875B (zh) * | 2023-01-06 | 2023-05-05 | 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) | 面向流式数据的机器学习模型在线服务部署方法及系统 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010033955A1 (en) * | 2008-09-22 | 2010-03-25 | Personics Holdings Inc. | Personalized sound management and method |
CN101998138A (zh) * | 2009-08-25 | 2011-03-30 | 北京达鸣慧科技有限公司 | 电视频道监控系统及其实时监控方法 |
CN102014278A (zh) * | 2010-12-21 | 2011-04-13 | 四川大学 | 一种基于语音识别技术的智能视频监控方法 |
CN104065836A (zh) * | 2014-05-30 | 2014-09-24 | 小米科技有限责任公司 | 监控通话的方法和装置 |
CN106328134A (zh) * | 2016-08-18 | 2017-01-11 | 都伊林 | 监狱语音数据识别及监测预警系统 |
US20170345416A1 (en) * | 2011-12-06 | 2017-11-30 | Nuance Communications, Inc. | System and Method for Machine-Mediated Human-Human Conversation |
CN107465657A (zh) * | 2017-06-22 | 2017-12-12 | 武汉斗鱼网络科技有限公司 | 直播视频监控方法、存储介质、电子设备及系统 |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030236663A1 (en) * | 2002-06-19 | 2003-12-25 | Koninklijke Philips Electronics N.V. | Mega speaker identification (ID) system and corresponding methods therefor |
US7272565B2 (en) * | 2002-12-17 | 2007-09-18 | Technology Patents Llc. | System and method for monitoring individuals |
US8301447B2 (en) * | 2008-10-10 | 2012-10-30 | Avaya Inc. | Associating source information with phonetic indices |
CN103916837B (zh) * | 2014-04-18 | 2018-05-04 | 广东欧珀移动通信有限公司 | 数据收发方法及智能终端 |
CN106331695B (zh) * | 2016-08-24 | 2018-08-07 | 合肥数酷信息技术有限公司 | 一种基于视频音频检测与数据分析系统 |
US9838538B1 (en) * | 2016-09-21 | 2017-12-05 | Noble Systems Corporation | Using real-time speech analytics to navigate a call that has reached a machine or service |
CN108717408B (zh) * | 2018-05-11 | 2023-08-22 | 杭州排列科技有限公司 | 一种敏感词实时监控方法、电子设备、存储介质及系统 |
CN108932303B (zh) * | 2018-06-12 | 2022-02-08 | 中国电子科技集团公司第二十八研究所 | 一种分布式可见光遥感影像动态目标检测与分析系统 |
CN109033231A (zh) * | 2018-07-03 | 2018-12-18 | 芜湖威灵数码科技有限公司 | 一种从多媒体文件中提取信息的方法 |
CN109005425A (zh) * | 2018-08-26 | 2018-12-14 | 俞绍富 | 网络视频监控系统 |
- 2018-12-28: CN, CN201811628102.2A, patent/CN111383659B/zh, active (Active)
- 2019-11-08: WO, PCT/CN2019/116774, patent/WO2020134646A1/zh, active (Application Filing)
Also Published As
Publication number | Publication date |
---|---|
CN111383659B (zh) | 2021-03-23 |
CN111383659A (zh) | 2020-07-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19903654 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19903654 Country of ref document: EP Kind code of ref document: A1 |