WO2017054515A1 - Method and system for detecting erotic images - Google Patents

Method and system for detecting erotic images

Info

Publication number
WO2017054515A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
detected
module
erotic
detection
Prior art date
Application number
PCT/CN2016/085882
Other languages
English (en)
French (fr)
Inventor
贺镇海
杨斌
刘志军
陈正江
王国俊
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017054515A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction

Definitions

  • This document relates to, but is not limited to, the field of communications, and in particular to a method and system for detecting erotic images.
  • RCS high-throughput service convergence communication
  • CDN Content Delivery Network
  • the content pipeline that provides unified acceleration services is an effective carrier for obtaining user Internet access data.
  • the content security issues associated with these two types of carriers also need to be effectively curbed.
  • A typical example is the spread of erotic images: under RCS and CDN the distribution channels are wider and the throughput is huge, so fast and effective technical means are needed to detect and filter such images.
  • Embodiments of the present invention provide a method and system for detecting erotic images, which solves the problem of pornographic security in high-throughput services.
  • An embodiment of the present invention provides a method for detecting an erotic image, including:
  • the module to be detected performs erotic image detection on the image to be detected to obtain a detection result.
  • the high throughput service interface comprises a converged communication interface and/or a content distribution network interface.
  • the method further includes: after obtaining the detection result, performing manual confirmation on the image to be detected when the detection result includes the image to be detected being a suspected erotic image.
  • the method further includes:
  • the plurality of to-be-detected images are packed into data packets according to a preset size
  • Distributing the to-be-detected image to the corresponding to-be-detected module includes: distributing the data packet to the to-be-detected module.
  • the distributing the to-be-detected image to the corresponding to-be-detected module includes: distributing, by the routing module, the to-be-detected image to the corresponding to-be-detected module.
  • determining at least one detection module from the plurality of detection modules as the module to be detected includes: when multiple to-be-detected images are acquired from the high-throughput service interface, determining multiple detection modules from the plurality of detection modules as modules to be detected;
  • distributing the to-be-detected image to the corresponding to-be-detected module includes: distributing the plurality of to-be-detected images to the corresponding to-be-detected modules asynchronously.
  • the to-be-detected module performing erotic image detection on the image to be detected to obtain a detection result includes:
  • the detecting module uses the global feature retrieval algorithm to perform erotic image detection on the image to be detected to obtain a detection result.
  • the global feature comprises: a color histogram feature on the hexagonal pyramid model space, a region shape feature, and a gradient direction histogram feature of the local region.
  • the detecting module using the global feature retrieval algorithm to perform erotic image detection on the image to be detected to obtain the detection result includes:
  • the color histogram features, the region shape features, and the gradient direction histogram features of the local regions are retrieved to obtain a plurality of similar sample images;
  • the detection result is obtained according to the mark corresponding to the plurality of similar sample images.
  • the obtaining the detection result according to the mark corresponding to the multiple similar sample images comprises:
  • the detection result is an erotic image
  • the detection result is a suspected erotic image, and the first threshold is greater than the second threshold;
  • the detection result is a non-erotic image.
  • the embodiment of the invention further provides an erotic image detection system, comprising an acquisition module, a determination module, a distribution module and a plurality of detection modules:
  • the obtaining module is configured to acquire at least one image to be detected from the high throughput service interface
  • the determining module is configured to determine at least one detection module from the plurality of detection modules as the module to be detected;
  • the distribution module is configured to distribute the image to be detected to the corresponding module to be detected;
  • the detecting module is configured to perform erotic image detection on the image to be detected to obtain a detection result.
  • the system further includes a manual agent module, and the manual agent module is configured to: after the detection result is obtained, perform manual confirmation on the image to be detected when the detection result indicates that the image to be detected is a suspected erotic image.
  • the distribution module is further configured to package the plurality of to-be-detected images into data packets according to a preset size before distributing the to-be-detected image to the corresponding to-be-detected module;
  • the distribution module distributes the to-be-detected image to the corresponding to-be-detected module by distributing the data packet to the to-be-detected module.
  • the distribution module is configured to distribute the to-be-detected image to the corresponding to-be-detected module by using the routing module.
  • the determining module is configured to, when multiple to-be-detected images are acquired from the high-throughput service interface, determine multiple detection modules from the plurality of detection modules as modules to be detected; the distribution module is further configured to distribute the plurality of to-be-detected images to the corresponding to-be-detected modules asynchronously.
  • the detecting module is configured to perform erotic image detection on the image to be detected using a global feature retrieval algorithm to obtain a detection result.
  • the global feature comprises: a color histogram feature on the hexagonal pyramid model space, a region shape feature, and a gradient direction histogram feature of the local region.
  • the detecting module implements, by using a global feature retrieval algorithm, performing erotic image detection on the image to be detected to obtain a detection result:
  • the detection result is obtained according to the mark corresponding to the plurality of similar sample images.
  • the detecting module obtains the detection result according to the marking corresponding to the plurality of similar sample images by:
  • the detection result is an erotic image
  • the detection result is a suspected erotic image, and the first threshold is greater than the second threshold;
  • the detection result is a non-erotic image.
  • the embodiment of the invention further provides a computer readable storage medium storing computer executable instructions that implement the erotic image detection method when executed.
  • the erotic image detection method and system acquire at least one image to be detected from a high-throughput service interface; determine at least one detection module from a plurality of detection modules as a module to be detected; distribute the image to be detected to the corresponding module to be detected; and have the module to be detected perform erotic image detection on the image to be detected to obtain a detection result.
  • it can perform pornographic detection on a large number of pictures in high-throughput services, avoiding the security risks of pornographic transmission through high-throughput services, improving network security of high-throughput services, and improving user experience.
  • Other aspects will be apparent upon reading and understanding the drawings and detailed description.
  • FIG. 1 is a flowchart of a method for detecting erotic images according to Embodiment 1 of the present invention
  • FIG. 2-1 is a schematic diagram of an erotic image detection system according to Embodiment 2 of the present invention.
  • FIG. 3-1 is a schematic diagram of an erotic image detection system according to Embodiment 3 of the present invention.
  • FIG. 4 is a schematic diagram of an erotic image detection system according to Embodiment 4 of the present invention.
  • FIG. 5-1 is a schematic diagram of an erotic image detection system according to Embodiment 5 of the present invention.
  • FIG. 6-1 is a schematic structural diagram of an erotic image detecting system according to Embodiment 6 of the present invention.
  • FIG. 6-2 is another schematic structural diagram of an erotic image detecting system according to Embodiment 6 of the present invention.
  • FIG. 6-3 is still another schematic structural diagram of an erotic image detecting system according to Embodiment 6 of the present invention.
  • Embodiment 1:
  • the erotic image detecting method of this embodiment is as shown in FIG. 1, and the method includes:
  • Step S101 Acquire at least one image to be detected from the high-throughput service interface.
  • the high throughput service interface may include a converged communication interface RCS and/or a content distribution network interface CDN.
  • the image transmission format of RCS/CDN includes bmp, jpg, tiff, gif, png, etc., and the size ranges from several K to several M. Messages can be transmitted in tens of thousands per second. The images obtained by sorting need to be processed quickly and efficiently, and the identification and recognition capabilities should be quickly adapted as the traffic volume and recognition characteristics change.
  • Step S102 Determine at least one detection module from the plurality of detection modules as the module to be detected
  • the detection module acts as a module to be tested.
  • the module to be detected may be one or more detection modules; it refers to a detection module that is currently required to detect an image to be detected, that is, a detection module in an active state.
  • Step S103 Distribute the image to be detected to the corresponding module to be detected
  • the corresponding image to be detected is distributed to the corresponding module to be detected.
  • Step S104 The module to be detected performs erotic image detection on the image to be detected to obtain a detection result.
  • the module to be detected performs pornographic detection to obtain a detection result, after which filtering and similar processing can be applied to improve network security.
  • the method further includes: after obtaining the detection result, manually confirming the image to be detected when the detection result includes the image to be detected being a suspected erotic image. That is, it is possible to confirm the suspected erotic image by hand.
  • the method further includes: before distributing the image to be detected to the corresponding module to be detected, packaging the plurality of to-be-detected images according to a preset size.
  • the distributing the image to be detected to the corresponding module to be detected comprises: distributing the data packet to the corresponding module to be detected. That is, the image to be detected can be packed according to a certain size, so that it can be distributed in a fixed size packet.
  • on the one hand, the message distribution module needs to go online or offline as the traffic volume and type increase or decrease; on the other hand, the detection module should avoid modifying its connection configuration and restarting, so as to maintain service continuity and reduce operation and maintenance costs. Accordingly, distributing the image to be detected to the corresponding module to be detected includes: distributing the image to be detected to the corresponding module to be detected through a routing module. That is, processing such as caching is performed by means of the routing module.
  • the distribution may be performed in an asynchronous manner.
  • determining the module to be detected from the plurality of detection modules includes: when a plurality of to-be-detected images are acquired from the high-throughput service interface, determining multiple detection modules from the plurality of detection modules as modules to be detected; distributing the image to be detected to the corresponding module to be detected includes: distributing the plurality of to-be-detected images to the corresponding modules to be detected asynchronously.
  • the module to be detected performing erotic image detection on the image to be detected to obtain a detection result includes: performing erotic image detection on the image to be detected by using a global feature retrieval algorithm to obtain a detection result.
  • the global features include: a color histogram feature on the hexagonal pyramid model space, that is, a CD feature; a region shape feature, that is, an SCD feature; and a gradient direction histogram feature of the local region, that is, an HOG feature.
  • other global features can also be used; it should be understood that any feature is acceptable as long as it allows the image to be easily identified and retrieved.
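The hexagonal pyramid (hexcone) model is commonly identified with the HSV color space. Assuming that reading, a color histogram feature could be sketched as below. The function name `hsv_histogram` and the bin counts are illustrative assumptions; the source does not specify how the CD feature is quantized.

```python
import colorsys

def hsv_histogram(pixels, h_bins=8, s_bins=4, v_bins=4):
    """Return a normalized HSV histogram for (r, g, b) pixels in 0..255.

    Bin counts are assumptions for illustration, not values from the source.
    """
    hist = [0.0] * (h_bins * s_bins * v_bins)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a component equal to 1.0 falls into the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
        count += 1
    if count:
        hist = [x / count for x in hist]
    return hist
```

Normalizing by the pixel count keeps histograms comparable across images of different sizes, which matters when the vectors are later compared in a nearest-neighbor index.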
  • the detection module performing erotic image detection on the image to be detected by using the global feature retrieval algorithm to obtain the detection result includes: collecting image samples, marking an image sample as a pornographic sample when it is an erotic image and as a non-pornographic sample when it is a non-erotic image; extracting, for each image sample, the color histogram feature on the hexagonal pyramid model space, the region shape feature, and the gradient direction histogram feature of the local region;
  • the detection result is obtained according to the mark corresponding to the plurality of similar sample images.
  • obtaining the detection result according to the marks corresponding to the plurality of similar sample images comprises: obtaining the proportion of pornographic sample images according to the marks corresponding to the plurality of similar sample images; if the ratio is greater than or equal to the first threshold, the detection result is an erotic image; if the ratio is less than the first threshold and greater than the second threshold, the detection result is a suspected erotic image, wherein the first threshold is greater than the second threshold; and if the ratio is less than or equal to the second threshold, the detection result is a non-erotic image.
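The two-threshold decision described above can be sketched as follows. The function name `classify` is hypothetical, and the default thresholds of 0.65 and 0.35 are an assumption borrowed from the 65/35 scores given in Embodiment 2; Embodiment 1 does not fix their values.

```python
def classify(labels, first_threshold=0.65, second_threshold=0.35):
    """Classify an image from the marks of its similar sample images.

    labels: iterable of booleans, True for a pornographic sample.
    Threshold defaults are assumed values, not mandated by the source.
    """
    labels = list(labels)
    if not labels:
        raise ValueError("at least one similar sample is required")
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must exceed second threshold")
    ratio = sum(1 for is_porn in labels if is_porn) / len(labels)
    if ratio >= first_threshold:
        return "erotic"
    if ratio > second_threshold:
        return "suspected"
    return "non-erotic"
```

Only the "suspected" outcome would be forwarded for manual confirmation, so the gap between the two thresholds controls how much work reaches the human reviewers.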
  • Embodiment 2:
  • the corresponding erotic detection system in this embodiment is shown in FIG. 2-1.
  • the erotic detection system in this embodiment includes a message distribution module, a cluster resident module, a core detection module, a manual agent module, and a cluster management module, as shown in FIG. 2-1.
  • the erotic detection method of this embodiment includes:
  • Step A1 The message distribution module requests the image to be detected from the RCS service interface.
  • Step A2 The message distribution module distributes the obtained image to be detected to a core detection module of any server in the detection cluster;
  • Step A3 The core detection module uses the global feature retrieval algorithm to score the image, and determines that the image type is an erotic image, a non-erotic image, or a suspected erotic image;
  • Step A4 If the image type is a suspected erotic image, it is sent to the manual agent module for confirmation, and the confirmation result is returned;
  • Step A5 The detection result of the core detection module is sent to the cluster management module for aggregation;
  • the process in step A2, in which the message distribution module distributes the obtained image to be detected to the core detection module of any server in the detection cluster, includes:
  • Step A22 The core detection module of the service node starts one or more processes for requesting image data from the message distribution module and performing detection;
  • Step A23 The message distribution module requests image files from the RCS service interface, sets a fixed threshold, packs one or more small files, which may occur, together with their corresponding file information into a chunk-sized data packet, and places the data packets in a queue;
  • Step A24 When a process of the core detection module is idle, it asynchronously requests and obtains a chunk data packet from the message distribution module, and unpacks it in memory to restore the original image data.
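Steps A23 and A24 can be illustrated with a minimal packing and unpacking sketch. The wire format here (a 2-byte name length and a 4-byte data length before each file) and the names `pack_chunks` and `unpack_chunk` are assumptions made for illustration; the source only says that small files and their file information are packed into chunk-sized packets.

```python
import struct

_HEADER = struct.Struct(">HI")  # assumed layout: name length, data length

def _encode(name, data):
    raw = name.encode("utf-8")
    return _HEADER.pack(len(raw), len(data)) + raw + data

def pack_chunks(files, chunk_size=1 << 20):
    """Greedily group (name, data) pairs into packets of about chunk_size bytes."""
    packets, current, size = [], [], 0
    for name, data in files:
        record = _encode(name, data)
        # Start a new packet when adding this record would exceed the limit.
        if current and size + len(record) > chunk_size:
            packets.append(b"".join(current))
            current, size = [], 0
        current.append(record)
        size += len(record)
    if current:
        packets.append(b"".join(current))
    return packets

def unpack_chunk(packet):
    """Restore the original (name, data) pairs from one packet."""
    files, offset = [], 0
    while offset < len(packet):
        name_len, data_len = _HEADER.unpack_from(packet, offset)
        offset += _HEADER.size
        name = packet[offset:offset + name_len].decode("utf-8")
        offset += name_len
        files.append((name, packet[offset:offset + data_len]))
        offset += data_len
    return files
```

In this sketch a file larger than `chunk_size` is simply sent as its own packet, which is one possible reading of the fixed threshold.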
  • the process of the global feature retrieval algorithm in step A3 is:
  • Step A31 collecting a plurality of positive sample images and a plurality of negative sample images (generally more than 100,000 images);
  • Step A32 Calculate three feature values, namely the CD feature, the SCD feature, and the HOG feature, for each sample image, and obtain a binary set pairing the feature vector values of each sample image with its positive or negative identifier;
  • the SCD feature (Cosmetic Textness Descriptor) characterizes the shape of the region: skin and non-skin areas are represented in white and black, respectively, to form a binary image.
  • the binary image is divided into equal blocks and the skin-point probability in each block is recorded, in order: the whole image, its 4 equal parts, and its 16 equal parts, for a total of 21 (1+4+16) probability values.
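The 21-value layout can be sketched as below, assuming that "4 equal parts" means a 2x2 grid and "16 equal parts" a 4x4 grid, and that the binary skin mask is already available. The function name `skin_pyramid` and the row-major block order are illustrative assumptions.

```python
def skin_pyramid(mask):
    """Return 21 skin probabilities from a binary mask (rows of 0/1, 1 = skin).

    Grid levels 1x1, 2x2 and 4x4 give 1 + 4 + 16 blocks; the 2x2 and 4x4
    interpretation of the equal parts is an assumption.
    """
    height, width = len(mask), len(mask[0])
    features = []
    for parts in (1, 2, 4):
        for by in range(parts):
            for bx in range(parts):
                y0, y1 = by * height // parts, (by + 1) * height // parts
                x0, x1 = bx * width // parts, (bx + 1) * width // parts
                area = (y1 - y0) * (x1 - x0)
                skin = sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
                # Empty blocks can occur for masks smaller than 4x4 pixels.
                features.append(skin / area if area else 0.0)
    return features
```

Coarse and fine blocks together capture both how much skin an image contains and roughly where it is located.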
  • the HOG (Histogram of Oriented Gradients) feature is constructed by computing and accumulating gradient direction histograms over local regions of the image, and it maintains good invariance to geometric and optical deformations of the image.
  • Step A33 Build a kd-tree index file for the vector values of each of the three features;
  • Step A34 Calculate a CD feature, an SCD feature, and an HOG feature for the image to be tested obtained from the RCS interface;
  • Step A35 For each of the three features of the image to be tested, use the kd-tree index to retrieve the 1000 most similar values and score them from high to low, with scores ranging from 1 to 1000;
  • Step A36 Take the 1000 sample images with the highest total score and count the number of positive samples according to the positive and negative identifiers of the samples; the proportion of positive samples among the 1000 samples multiplied by 100 is the score;
  • Step A37 Images with scores above 65 are erotic images, images with scores below 35 are non-erotic images, and images with scores between 35 and 65 are suspected erotic images.
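Steps A35 to A37 can be sketched as follows. To stay self-contained, this sketch replaces the kd-tree lookup with a brute-force linear scan, which returns the same neighbors but more slowly. The names `rank_scores` and `detect` are hypothetical, and awarding `k` points to the nearest sample down to 1 point for the k-th is one reading of "scoring from high to low".

```python
import math
from collections import defaultdict

def rank_scores(query, samples, k):
    """Score the k nearest samples for one feature: nearest gets k, k-th gets 1.

    A linear scan stands in for the kd-tree index described in the source.
    """
    order = sorted(range(len(samples)), key=lambda i: math.dist(query, samples[i]))
    return {idx: k - rank for rank, idx in enumerate(order[:k])}

def detect(query_feats, sample_feats, labels, k=1000, high=65, low=35):
    """Return (score, verdict) for one image.

    query_feats: feature name -> vector; sample_feats: feature name -> list of
    vectors; labels[i] is True for a positive (erotic) sample.
    """
    total = defaultdict(int)
    for name, query in query_feats.items():
        for idx, score in rank_scores(query, sample_feats[name], k).items():
            total[idx] += score
    top = sorted(total, key=total.get, reverse=True)[:k]
    if not top:
        raise ValueError("sample library is empty")
    score = 100.0 * sum(1 for i in top if labels[i]) / len(top)
    if score > high:
        return score, "erotic"
    if score < low:
        return score, "non-erotic"
    return score, "suspected"
```

Summing rank scores across the three features means a sample must be close to the query in several feature spaces to reach the final top list, which reduces the influence of any single feature.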
  • Embodiment 3:
  • the above embodiment can implement fast and effective determination of erotic pictures related to a specific sample, and can obtain a good discriminating effect when the sample size coverage is relatively good.
  • the discriminating effect will continue to decrease. Therefore, it is necessary to iteratively add new types of bad images to the sample library.
  • a sample training module is added on the basis of Embodiment 2.
  • on the one hand, the positive and negative images that have been confirmed by the manual agent are obtained as samples; on the other hand, new positive and negative image samples are manually imported.
  • the feature values recalculated for these sample images are appended to the original feature set, and a new kd-tree index is calculated and generated.
  • Step B1 The service node of the image detection applies for a subscription to the cluster management module in a message queue of a “publish-subscribe” mode;
  • Step B2 The manual agent module transmits the manual determination result to the sample training module
  • Step B3 The sample training module extracts the SCD, CD, and HOG features of the manually determined positive and negative sample images, appends the features and identifiers to the original binary set, and calculates and generates a kd-tree index file;
  • Step B4 The sample training module transmits the index file to the cluster management module.
  • Step B5 The cluster management module publishes the index file to the cluster resident module of each service node that is subscribed in step B1;
  • Step B6 The cluster resident module replaces the original index file with the new index file, and notifies the core detection module to re-read the index and perform subsequent determination.
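The publish-subscribe flow of steps B1, B5 and B6 can be sketched in-process as below. The class and method names are hypothetical, and a real deployment would use a message queue between machines rather than direct method calls.

```python
class CoreDetector:
    """Holds the index used for detection (stands in for the core detection module)."""
    def __init__(self):
        self.index = None

    def reload(self, index):
        # Step B6: re-read the index for subsequent determinations.
        self.index = index

class ResidentModule:
    """Per-node agent that swaps in a new index and notifies the detector."""
    def __init__(self, detector):
        self.detector = detector
        self.current_index = None

    def on_index(self, index):
        self.current_index = index      # replace the original index
        self.detector.reload(index)     # notify the core detection module

class ClusterManager:
    """Publishes new index files to every subscribed service node."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, resident):      # step B1
        self.subscribers.append(resident)

    def publish(self, index):           # step B5
        for resident in self.subscribers:
            resident.on_index(index)
```

Because nodes subscribe themselves, the sample training side never needs a list of service nodes, which matches the goal of adding or removing nodes without reconfiguration.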
  • Embodiment 4:
  • a message distribution module is used to interface with the RCS service interface to obtain an image to be detected.
  • a single message distribution module cannot withstand the data throughput from the RCS service interface and the changes in the service docking scenario.
  • the message distribution module in this embodiment is applicable to the CDN Cache device (hereinafter referred to as Cache) on the basis of the foregoing Embodiments 2 and 3, and requests the cached image to determine whether the image is pornographic.
  • the message distribution module node needs to be expanded, on the one hand, it can meet the connection with the new RCS interface or the Cache interface, and on the other hand, when the image is distributed to the service node.
  • Step C1 Determine a new RCS service interface or Cache service interface, and add a new message distribution module node to connect to that service interface;
  • Step C2 changing the configuration information of the message distribution module in the cluster resident module of the service node, and the message distribution module does not need to configure the service node information;
  • Step C3 performing subsequent image discrimination and training processes as shown in the foregoing embodiment
  • the specific docking process includes:
  • Step C31 The message distribution module acquires the bill data of the current time interval from the Cache management device.
  • Step C32 The message distribution module parses the URL (Uniform Resource Locator) of the hit image part of the bill data;
  • Step C33 The message distribution module sends a request for the URL to the Cache to obtain an image file, and performs subsequent distribution and detection processing.
  • Embodiment 5:
  • the message distribution module group can form a cloud service to provide external universal erotic image detection, as shown in FIG. 5-1.
  • the purpose is that scaling the service capability of the message distribution module has no impact on the configuration of the core detection modules in the background, and scaling the detection service nodes does not affect the configuration of the message distribution module; the message routing module is responsible for reasonable scheduling and buffering of the messages transmitted between the two sides, reducing system detection delay.
  • the erotic image detecting method of this embodiment includes:
  • Step D1 The core detection module sends a request for asynchronously acquiring an image to the message routing module, sets an image file acquisition event, and starts detection;
  • Step D2 The message routing module obtains the asynchronous acquisition requests sent by the core detection module of each service node, together with the service node information corresponding to each request, to form a request queue;
  • Step D3 The message distribution module obtains an image request queue member, which includes the corresponding service node information, from the message routing module.
  • Step D4 The message distribution module acquires image data from a service interface such as RCS or Cache, packs it into a chunk file block, and, according to the obtained image request, encapsulates the chunk file together with the service node information and sends it to the message routing module;
  • Step D5 The message routing module obtains the packet sent by the message distribution module, parses out the service node information, and sends the chunk packet, with the service node information removed, to the core detection module of the corresponding service node;
  • Step D6 The core detection module detects the chunk packet receiving event, parses the image file in the chunk packet, and performs subsequent detection work
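The request-driven flow of steps D1 to D6 can be sketched with in-memory queues. The names `MessageRouter` and `distribute` are hypothetical, and the sketch ignores networking and event notification entirely.

```python
from collections import deque

class MessageRouter:
    """Queues image requests from service nodes and routes chunks back to them."""
    def __init__(self):
        self.requests = deque()   # pending service node ids (step D2)
        self.inboxes = {}         # node id -> chunks delivered to that node

    def request_image(self, node_id):
        # Step D1: a core detection module asks for work.
        self.inboxes.setdefault(node_id, deque())
        self.requests.append(node_id)

    def next_request(self):
        # Step D3: hand the oldest request to the message distribution module.
        return self.requests.popleft() if self.requests else None

    def deliver(self, node_id, chunk):
        # Step D5: strip the node info and forward only the chunk.
        self.inboxes[node_id].append(chunk)

def distribute(router, source_chunks):
    """Steps D3 and D4: answer queued requests until requests or chunks run out."""
    sent = 0
    while source_chunks:
        node_id = router.next_request()
        if node_id is None:
            break
        router.deliver(node_id, source_chunks.popleft())
        sent += 1
    return sent
```

Since the distribution side only ever sees node identifiers taken from the request queue, neither side needs the other's configuration, which is the decoupling this embodiment aims for.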
  • each module is coordinated with each other, and the message distribution module is configured to acquire an image to be discriminated and a compressed file from the RCS/CDN interface, and distribute the image to the image detection cluster.
  • the cluster management module is used to monitor the service status of the image detection cluster, control services going online and offline, deliver service configuration, and report statistics.
  • the cluster resident module is deployed on the service nodes and is used to receive commands from the cluster management module and carry out operations such as command execution and service control.
  • the core detection module is configured to run the detection algorithm on the received image and output the type of the image (pornographic, non-pornographic, or suspected pornographic).
  • the sample training module is used to collect existing pornographic images and non-erotic images, mark them, store them in the database, and build the index.
  • the manual agent module is used for manual confirmation of the suspected erotic images discriminated by the core detection module.
  • the message routing module is used to enable the message distribution module and the core detection module to be deployed in a scalable manner, decoupling the two
  • the erotic image detection method of the embodiment may further include:
  • the method may also be marked in other manners, input to the sample training module, extract the image global retrieval feature index, and distribute the index data to the detection cluster module through the cluster management module.
  • the message distribution module obtains the image to be discriminated and other additional messages from the RCS/CDN interface in real time to form a message queue.
  • the message distribution module distributes the queue message to the core detection module according to the service node where the core detection module is located.
  • the core detection module obtains the image to be discriminated and discriminates in parallel, and outputs a score for the image.
  • the training is input to the sample training module, and the training result is redistributed to the service node to improve the accuracy of subsequent discrimination.
  • the erotic image detection method of the embodiment provides high accuracy in erotic image recognition, provides high processing performance for a large number of messages, allows the system deployment to scale as the service volume changes, and adapts quickly to the processing of new erotic images.
  • the erotic image detection system includes an acquisition module, a determination module, a distribution module, and a plurality of detection modules, wherein: the acquisition module is configured to acquire at least one image to be detected from a high-throughput service interface; the determining module is configured to determine at least one detection module from the plurality of detection modules as a module to be detected; the distribution module is configured to distribute the image to be detected to the corresponding module to be detected; and the detecting module is configured to perform erotic image detection on the image to be detected to obtain a detection result.
  • the functions of the obtaining module, the determining module, and the distributing module in this embodiment may be implemented by using the message distributing module in the foregoing embodiment; the function of the detecting module in this embodiment may be implemented by using the core detecting module in the foregoing embodiment.
  • the embodiment further provides an erotic image detection system, as shown in FIG. 6-2, further including a manual agent module, where the manual agent module is configured to, after the detection result is obtained, manually confirm the image to be detected when the detection result indicates that the image to be detected is a suspected erotic image.
  • the distribution module is further configured to package the plurality of to-be-detected images into data packets according to a preset size before distributing the to-be-detected image to the corresponding to-be-detected module;
  • the distribution module distributes the to-be-detected image to the corresponding to-be-detected module by distributing the data packet to the to-be-detected module.
  • the device further includes a routing module, where the distribution module is configured to distribute the to-be-detected image to the corresponding to-be-detected module by using the routing module.
  • the determining module is configured to: when acquiring a plurality of to-be-detected images from the high-throughput service interface, determine, from the plurality of detection modules, the plurality of detection modules as the module to be detected; the distribution module is further configured to The plurality of to-be-detected images are separately distributed to the corresponding to-be-detected modules by using an asynchronous manner.
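  • The detection module is configured to perform pornographic image detection on the image to be detected using a global feature retrieval algorithm to obtain a detection result.
  • The global features comprise: a color histogram feature in the hexcone model (HSV) space, a region shape feature, and a histogram of oriented gradients feature of local regions.
  • The detection module performs pornographic image detection with the global feature retrieval algorithm as follows: image samples are collected and marked as pornographic or non-pornographic samples, and the three global features are extracted from them; a pornographic image index is built from these features; the index is searched with the same three features of the image to be detected to obtain a plurality of similar sample images;
  • the detection result is obtained according to the marks of the plurality of similar sample images.
  • The detection module obtains the detection result from the marks of the similar sample images as follows: the proportion of pornographic samples among the similar sample images is computed;
  • if the proportion is greater than or equal to a first threshold, the detection result is a pornographic image;
  • if the proportion is less than the first threshold and greater than a second threshold, the detection result is a suspected pornographic image, the first threshold being greater than the second threshold;
  • if the proportion is less than or equal to the second threshold, the detection result is a non-pornographic image.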
  • Note that a module in this embodiment may correspond one-to-one to a module in the preceding embodiments, one module may correspond to several modules, or several modules may correspond to one module. For example, the acquisition module and the distribution module in this embodiment together correspond to the message distribution module described above, and the detection module in this embodiment corresponds to the core detection module described above. Modules may be mapped or combined in any way that achieves the same function.
  • Each module may be a hardware device, a corresponding software program, or a combination of the two.
  • An embodiment of the invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed, implement the pornographic image detection method.
  • Each module/unit in the above embodiments may be implemented in hardware, for example by an integrated circuit that realizes its function, or as a software function module, for example by a processor executing programs/instructions stored in a memory.
  • This application is not limited to any specific combination of hardware and software.
  • The above technical solution can perform pornographic image detection on the large number of pictures carried by a high-throughput service, avoiding the security risk of pornographic content spreading through such services, improving their network security, and improving the user experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Image Analysis (AREA)

Abstract

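A pornographic image detection method and system. The method includes: acquiring at least one image to be detected from a high-throughput service interface; determining, from a plurality of detection modules, at least one detection module as the target detection module (the detection module selected to perform the detection); distributing the image to be detected to the corresponding target detection module; and performing, by the target detection module, pornographic image detection on the image to be detected to obtain a detection result. Compared with the related art, this technical solution can perform pornographic image detection on the large number of pictures carried by a high-throughput service, avoiding the security risk of pornographic content spreading through such services, improving their network security, and improving the user experience.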

Description

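A Pornographic Image Detection Method and System
Technical Field
This document relates to, but is not limited to, the field of communications, and in particular to a pornographic image detection method and system.
Background
Compared with traditional SMS/MMS services, message transmission in the high-throughput Rich Communication Suite service (hereinafter RCS) has more of the characteristics of an Internet service; and the converged content delivery network (hereinafter CDN, Content Delivery Network), as a content pipeline that provides unified acceleration for Internet services, is an effective carrier for obtaining users' Internet access data. The content security problems that accompany these two carriers also need to be effectively contained. A typical example is the spread of pornographic images: under RCS and CDN the distribution channels are broader and the throughput is huge, so fast and effective technical means of detection and filtering are required.
Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present invention provide a pornographic image detection method and system that address pornographic-content security in high-throughput services.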
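An embodiment of the present invention provides a pornographic image detection method, including:
acquiring at least one image to be detected from a high-throughput service interface;
determining, from a plurality of detection modules, at least one detection module as the target detection module (the detection module selected to perform the detection);
distributing the image to be detected to the corresponding target detection module;
performing, by the target detection module, pornographic image detection on the image to be detected to obtain a detection result.
Optionally, the high-throughput service interface includes a Rich Communication Suite (RCS) interface and/or a content delivery network (CDN) interface.
Optionally, the method further includes: after the detection result is obtained, when the detection result indicates that the image to be detected is a suspected pornographic image, manually confirming the image to be detected.
Optionally, the method further includes:
before distributing the image to be detected to the corresponding target detection module, packaging a plurality of images to be detected into data packets according to a preset size;
wherein distributing the image to be detected to the corresponding target detection module includes: distributing the data packets to the corresponding target detection module.
Optionally, distributing the image to be detected to the corresponding target detection module includes: distributing the image to be detected to the corresponding target detection module through a routing module.
Optionally, determining at least one detection module from the plurality of detection modules as the target detection module includes: when a plurality of images to be detected are acquired from the high-throughput service interface, determining a plurality of detection modules from among the detection modules as target detection modules;
and distributing the image to be detected to the corresponding target detection module includes: distributing the plurality of images to be detected to the corresponding target detection modules asynchronously.
Optionally, performing, by the target detection module, pornographic image detection on the image to be detected to obtain a detection result includes:
the detection module performing pornographic image detection on the image to be detected using a global feature retrieval algorithm to obtain the detection result.
Optionally, the global features include: a color histogram feature in the hexcone model (HSV) space, a region shape feature, and a histogram of oriented gradients feature of local regions.
Optionally, the detection module performing pornographic image detection on the image to be detected using the global feature retrieval algorithm to obtain the detection result includes:
collecting image samples, marking an image sample as a pornographic sample when it is a pornographic image and as a non-pornographic sample when it is a non-pornographic image; extracting from the image samples the color histogram feature in the hexcone model space, the region shape feature, and the histogram of oriented gradients feature of local regions;
building a pornographic image index from these three features of the image samples;
searching the pornographic image index with the same three features of the image to be detected to obtain a plurality of similar sample images;
obtaining the detection result according to the marks of the plurality of similar sample images.
Optionally, obtaining the detection result according to the marks of the plurality of similar sample images includes:
obtaining, from the marks of the similar sample images, the proportion of pornographic sample images among them;
if the proportion is greater than or equal to a first threshold, the detection result is a pornographic image;
if the proportion is less than the first threshold and greater than a second threshold, the detection result is a suspected pornographic image, the first threshold being greater than the second threshold;
if the proportion is less than or equal to the second threshold, the detection result is a non-pornographic image.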
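An embodiment of the present invention further provides a pornographic image detection system, including an acquisition module, a determination module, a distribution module, and a plurality of detection modules:
the acquisition module is configured to acquire at least one image to be detected from a high-throughput service interface;
the determination module is configured to determine, from the plurality of detection modules, at least one detection module as the target detection module;
the distribution module is configured to distribute the image to be detected to the corresponding target detection module;
the detection module is configured to perform pornographic image detection on the image to be detected to obtain a detection result.
Optionally, the system further includes a manual agent module configured to, after the detection result is obtained, manually confirm the image to be detected when the detection result indicates that it is a suspected pornographic image.
Optionally, the distribution module is further configured to package a plurality of images to be detected into data packets according to a preset size before distributing the image to be detected to the corresponding target detection module;
the distribution module distributes the image to be detected to the corresponding target detection module by distributing the data packets to the corresponding target detection module.
Optionally, the system further includes a routing module, and the distribution module is configured to distribute the image to be detected to the corresponding target detection module through the routing module.
Optionally, the determination module is configured to, when a plurality of images to be detected are acquired from the high-throughput service interface, determine a plurality of detection modules from among the detection modules as target detection modules; the distribution module is further configured to distribute the plurality of images to be detected to the corresponding target detection modules asynchronously.
Optionally, the detection module is configured to perform pornographic image detection on the image to be detected using a global feature retrieval algorithm to obtain the detection result.
Optionally, the global features include: a color histogram feature in the hexcone model (HSV) space, a region shape feature, and a histogram of oriented gradients feature of local regions.
Optionally, the detection module performs pornographic image detection on the image to be detected using the global feature retrieval algorithm as follows:
collecting image samples, marking an image sample as a pornographic sample when it is a pornographic image and as a non-pornographic sample when it is a non-pornographic image; extracting from the image samples the color histogram feature in the hexcone model space, the region shape feature, and the histogram of oriented gradients feature of local regions;
building a pornographic image index from these three features of the image samples;
searching the pornographic image index with the same three features of the image to be detected to obtain a plurality of similar sample images;
obtaining the detection result according to the marks of the plurality of similar sample images.
Optionally, the detection module obtains the detection result according to the marks of the plurality of similar sample images as follows:
obtaining, from the marks of the similar sample images, the proportion of pornographic sample images among them;
if the proportion is greater than or equal to a first threshold, the detection result is a pornographic image;
if the proportion is less than the first threshold and greater than a second threshold, the detection result is a suspected pornographic image, the first threshold being greater than the second threshold;
if the proportion is less than or equal to the second threshold, the detection result is a non-pornographic image.
An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed, implement the pornographic image detection method.
The beneficial effects of the embodiments of the present invention are as follows:
The pornographic image detection method and system provided by the embodiments of the present invention acquire at least one image to be detected from a high-throughput service interface; determine, from a plurality of detection modules, at least one detection module as the target detection module; distribute the image to be detected to the corresponding target detection module; and have the target detection module perform pornographic image detection on the image to obtain a detection result. Compared with the related art, they can perform pornographic image detection on the large number of pictures carried by a high-throughput service, avoiding the security risk of pornographic content spreading through such services, improving their network security, and improving the user experience. Other aspects will become apparent upon reading and understanding the drawings and the detailed description.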
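Brief Description of the Drawings
FIG. 1 is a flowchart of the pornographic image detection method provided by Embodiment 1 of the present invention;
FIG. 2-1 is a schematic diagram of the pornographic image detection system provided by Embodiment 2 of the present invention;
FIG. 2-2 is a flowchart of the pornographic image detection method provided by Embodiment 2 of the present invention;
FIG. 3-1 is a schematic diagram of the pornographic image detection system provided by Embodiment 3 of the present invention;
FIG. 3-2 is a flowchart of the pornographic image detection method provided by Embodiment 3 of the present invention;
FIG. 4 is a schematic diagram of the pornographic image detection system provided by Embodiment 4 of the present invention;
FIG. 5-1 is a schematic diagram of the pornographic image detection system provided by Embodiment 5 of the present invention;
FIG. 5-2 is a flowchart of the pornographic image detection method provided by Embodiment 5 of the present invention;
FIG. 6-1 is a schematic structural diagram of the pornographic image detection system provided by Embodiment 6 of the present invention;
FIG. 6-2 is another schematic structural diagram of the pornographic image detection system provided by Embodiment 6 of the present invention;
FIG. 6-3 is yet another schematic structural diagram of the pornographic image detection system provided by Embodiment 6 of the present invention.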
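Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
Embodiment 1:
The pornographic image detection method of this embodiment, as shown in FIG. 1, includes:
Step S101: acquire at least one image to be detected from a high-throughput service interface;
In this step, the high-throughput service interface may include an RCS interface and/or a CDN interface. Pictures transmitted over RCS/CDN come in formats including bmp, jpg, tiff, gif, and png, with sizes ranging from a few KB to a few MB. The number of messages transmitted per second can exceed ten thousand. The images sorted out of this traffic must be processed quickly and effectively, and the recognition method and capability must adapt quickly as the traffic volume and the characteristics to be recognized change.
Step S102: determine, from a plurality of detection modules, at least one detection module as the target detection module;
In this step, because a high-throughput service carries a large amount of information, there are multiple detection modules performing pornographic image detection, so a detection module must be assigned to each image to be detected; that is, whichever detection module is to perform pornographic detection on the current image becomes the target detection module. In this embodiment the target detection module may be one or more detection modules; it refers to the detection module currently needed to detect the image to be detected, i.e. a detection module in the working state.
Step S103: distribute the image to be detected to the corresponding target detection module;
In this step, once it has been determined which detection module (the target detection module) will perform the pornographic detection, the corresponding image to be detected is distributed to that module.
Step S104: the target detection module performs pornographic image detection on the image to be detected to obtain a detection result.
In this step, the target detection module performs pornographic detection to obtain the detection result, after which processing such as filtering can be carried out, improving network security.
Optionally, the method further includes: after the detection result is obtained, when the detection result indicates that the image to be detected is a suspected pornographic image, manually confirming the image. That is, suspected pornographic images that cannot be classified with certainty are confirmed manually.
Optionally, to improve processing efficiency and to avoid real-time sending occupying too many network resources, the method further includes: before distributing the images to be detected to the corresponding target detection module, packaging a plurality of images to be detected into data packets according to a preset size; distributing the images to the corresponding target detection module then includes distributing the data packets to that module. In other words, the images to be detected can be packed to a given size so that they are distributed as fixed-size data packets.
Optionally, message distribution modules need to be brought online or taken offline as the traffic volume and service types grow or shrink, while the detection modules should not have to change their connection configuration and restart each time, so that service continuity is maintained and operation and maintenance costs are reduced. To this end, distributing the image to be detected to the corresponding target detection module includes: distributing the image through a routing module. That is, the routing module performs processing such as caching on the images to be detected.
Optionally, when a plurality of images to be detected need to be distributed to a plurality of detection modules, asynchronous distribution may be used to avoid conflicts during distribution. Determining the target detection module from the plurality of detection modules includes: when a plurality of images to be detected are acquired from the high-throughput service interface, determining a plurality of detection modules as target detection modules; distributing the images to the corresponding target detection modules includes: distributing the plurality of images to be detected to the corresponding target detection modules asynchronously.
Optionally, in step S104 above, the target detection module performing pornographic image detection on the image to be detected includes: using a global feature retrieval algorithm to perform pornographic image detection on the image to obtain the detection result. Any other approach capable of pornographic image detection may of course also be used.
In this embodiment, the global features include: the color histogram feature in the hexcone model (HSV) space, i.e. the SCD feature; the region shape feature, i.e. the CD feature; and the histogram of oriented gradients feature of local regions, i.e. the HOG feature. Other global features may of course also be used; any feature that facilitates recognizing and retrieving images is covered.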
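The two simpler features can be illustrated in a few lines of Python. This is a minimal sketch only, assuming the quantization given in Embodiment 2 (hue split into 16 parts, saturation and value into 4 parts each; skin probabilities over the whole image, its 4 quadrants and its 16 cells). The function names, the bin ordering, and the 0/1 skin-mask input are assumptions of this sketch; how skin pixels are detected is not specified in the text, and HOG is omitted.

```python
import colorsys


def scd_histogram(pixels):
    """256-bin HSV colour histogram (16 H x 4 S x 4 V), normalised to probabilities.

    `pixels` is an iterable of (r, g, b) tuples with components in 0..255.
    The bin ordering (h_bin * 16 + s_bin * 4 + v_bin) is an assumption.
    """
    counts = [0] * 256
    total = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a component equal to 1.0 falls into the last bin.
        h_bin = min(int(h * 16), 15)
        s_bin = min(int(s * 4), 3)
        v_bin = min(int(v * 4), 3)
        counts[h_bin * 16 + s_bin * 4 + v_bin] += 1
        total += 1
    if total == 0:
        return [0.0] * 256
    return [c / total for c in counts]


def cd_descriptor(mask):
    """21 skin-pixel probabilities: whole image, 2x2 grid, then 4x4 grid.

    `mask` is a non-empty list of equal-length rows of 0/1 values, 1 = skin pixel.
    """
    height, width = len(mask), len(mask[0])
    values = []
    for parts in (1, 2, 4):
        for i in range(parts):
            for j in range(parts):
                top, bottom = i * height // parts, (i + 1) * height // parts
                left, right = j * width // parts, (j + 1) * width // parts
                area = (bottom - top) * (right - left)
                skin = sum(mask[y][x]
                           for y in range(top, bottom)
                           for x in range(left, right))
                values.append(skin / area if area else 0.0)
    return values
```

Both functions return plain lists of floats, so their outputs can be concatenated or indexed separately, whichever the retrieval step requires.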
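Optionally, the detection module performing pornographic image detection on the image to be detected with the global feature retrieval algorithm to obtain the detection result includes: collecting image samples, marking an image sample as a pornographic sample when it is a pornographic image and as a non-pornographic sample when it is a non-pornographic image; extracting from the image samples the color histogram feature in the hexcone model space, the region shape feature, and the histogram of oriented gradients feature of local regions;
building a pornographic image index from these three features of the image samples;
searching the pornographic image index with the same three features of the image to be detected to obtain a plurality of similar sample images;
obtaining the detection result according to the marks of the plurality of similar sample images.
Note that a fixed number of the similar samples may be selected according to their degree of similarity, for example the 1000 most similar. Obtaining the detection result according to the marks of the similar sample images includes: obtaining from the marks the proportion of pornographic sample images; if the proportion is greater than or equal to a first threshold, the detection result is a pornographic image; if the proportion is less than the first threshold and greater than a second threshold, the detection result is a suspected pornographic image, where the first threshold is greater than the second threshold; if the proportion is less than or equal to the second threshold, the detection result is a non-pornographic image. Further, a score may be derived from the proportion: on a 100-point scale, for example, the score is the proportion multiplied by 100. Corresponding score thresholds are then set to judge whether the image is pornographic. Of course, the judgment can also be made directly from the proportion.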
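The two-threshold decision described above can be sketched as a small Python function. This is an illustration under assumptions: the function name and return format are hypothetical, and the default thresholds of 0.65 and 0.35 are borrowed from the 65/35 score thresholds given in Embodiment 2, since this embodiment does not fix their values.

```python
def classify_by_ratio(labels, first_threshold=0.65, second_threshold=0.35):
    """Classify an image from the labels of its retrieved similar samples.

    `labels` holds one boolean per similar sample, True = pornographic sample.
    Returns (result, score), where score is the proportion on a 100-point scale.
    """
    if not labels:
        raise ValueError("at least one similar sample is required")
    if first_threshold <= second_threshold:
        raise ValueError("first_threshold must be greater than second_threshold")
    positives = sum(1 for is_porn in labels if is_porn)
    ratio = positives / len(labels)
    score = 100.0 * positives / len(labels)
    if ratio >= first_threshold:
        result = "pornographic"
    elif ratio > second_threshold:
        result = "suspected"
    else:
        result = "non-pornographic"
    return result, score
```

For example, `classify_by_ratio([True] * 5 + [False] * 5)` falls between the two thresholds, so the image would be routed to manual confirmation.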
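Embodiment 2:
The pornographic image detection system of this embodiment is shown in FIG. 2-1 and includes a message distribution module, a cluster resident module, a core detection module, a manual agent module, and a cluster management module. As shown in FIG. 2-2, the pornographic image detection method of this embodiment includes:
Step A1: the message distribution module requests the images to be detected from the RCS service interface;
Step A2: the message distribution module distributes the images obtained to the core detection module of any server in the detection cluster;
Step A3: the core detection module scores the image with the global feature retrieval algorithm and determines its type: pornographic, non-pornographic, or suspected pornographic;
Step A4: if the image type is suspected pornographic, the image is sent to the manual agent module for confirmation, and the confirmation result is returned;
Step A5: the detection results of the core detection module are sent to the cluster management module for aggregation.
Optionally, to achieve high-throughput transfer of the image files to be detected, the flow by which the message distribution module in step A2 distributes the images to the core detection module of any server in the detection cluster includes:
Step A21: the message distribution module and the core detection modules of one or more service nodes that perform image detection form an asynchronous request-response relationship;
Step A22: the core detection module of a service node starts one or more processes that request image data from the message distribution module and perform detection;
Step A23: the message distribution module requests image files from the RCS service interface, sets a fixed threshold, packs the one or more small files that may occur, together with their file information, into chunk-sized data packets, and places the packets in a queue;
Step A24: when a process of the core detection module is idle, it asynchronously requests and obtains a chunk packet from the message distribution module and unpacks it in memory to restore the original image data.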
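Steps A23 and A24 can be sketched in Python as follows. This is a minimal illustration, not the wire format of the patent: the 4 MB chunk size, the length-prefixed JSON header, and the function names are all assumptions of this sketch. The text only states that small files and their file information are packed into chunk-sized packets, queued, and unpacked in memory by the detector.

```python
import json
import struct
from collections import deque

CHUNK_SIZE = 4 * 1024 * 1024  # assumed value; the text only says "a fixed threshold"


def _encode(entries):
    """Serialise (name, data) pairs: 4-byte header length, JSON header, raw bytes."""
    header = json.dumps(
        [{"name": name, "size": len(data)} for name, data in entries]
    ).encode("utf-8")
    return struct.pack(">I", len(header)) + header + b"".join(d for _, d in entries)


def pack_chunks(files, chunk_size=CHUNK_SIZE):
    """Group (name, bytes) files into packets of at most chunk_size payload bytes.

    A single file larger than chunk_size is sent alone in its own packet.
    Returns a deque that serves as the packet queue of step A23.
    """
    queue = deque()
    current, current_size = [], 0
    for name, data in files:
        if current and current_size + len(data) > chunk_size:
            queue.append(_encode(current))
            current, current_size = [], 0
        current.append((name, data))
        current_size += len(data)
    if current:
        queue.append(_encode(current))
    return queue


def unpack_chunk(chunk):
    """Restore the original (name, bytes) pairs from one packet (step A24)."""
    (header_len,) = struct.unpack(">I", chunk[:4])
    meta = json.loads(chunk[4:4 + header_len].decode("utf-8"))
    offset = 4 + header_len
    files = []
    for entry in meta:
        files.append((entry["name"], chunk[offset:offset + entry["size"]]))
        offset += entry["size"]
    return files
```

An idle detection process would pop one packet from the queue with `queue.popleft()` and pass it to `unpack_chunk`; the asynchronous request itself is left out of the sketch.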
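Optionally, the global feature retrieval algorithm of step A3 proceeds as follows:
Step A31: collect multiple positive sample images and multiple negative sample images (generally more than one hundred thousand);
Step A32: for each sample image compute three feature values, the CD feature, the SCD feature and the HOG feature, obtaining a binary collection of the sample images' feature vector values together with their positive/negative labels;
The SCD (Scalable Color Descriptor) feature describes the color histogram of the picture in HSV (Hue, Saturation, Value) space: the image is converted from RGB (Red, Green, Blue) space to HSV space and quantized into 256 color bins (hue H divided equally into 16 parts, saturation S into 4 parts, value V into 4 parts, 16*4*4=256); the number of pixels in each bin is counted and normalized into a probability distribution over the bins.
The CD (Compactness Descriptor) feature describes region shape. White and black represent skin and non-skin regions respectively, forming a binary image. The binary image is divided equally into blocks and the skin-pixel probability of each region is recorded, in the order: whole image, each region of the 4-way split, each region of the 16-way split, for a total of 21 (1+4+16) probability values.
The HOG (Histogram of Oriented Gradient) feature is built by computing and accumulating histograms of gradient orientation over local regions of the image, and it remains largely invariant to geometric and photometric deformations of the image.
Step A33: build a separate kd-tree index file for the vector values of each of the three features;
Step A34: for the image under test obtained from the RCS interface, compute the CD, SCD and HOG features;
Step A35: using the kd-tree indexes, retrieve for each of the three features of the image under test the 1000 most similar values, and score them from high to low, with scores from 1 to 1000;
Step A36: take the 1000 sample images with the highest total score, count the positive samples among them according to the samples' positive/negative labels, and multiply their proportion among these 1000 samples by 100 to obtain the score;
Step A37: an image scoring 65 or above is a pornographic image, an image scoring 35 or below is a non-pornographic image, and an image scoring between 35 and 65 is a suspected pornographic image.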
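Steps A35 to A37 can be sketched in Python as below. Two things in this sketch are assumptions rather than the patented method: a brute-force nearest-neighbour search stands in for the kd-tree indexes, and the rank scoring is read as "the most similar sample receives `top_k` points, the next `top_k - 1`, and so on". All function and parameter names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_scores(query, vectors, top_k):
    """Score the top_k nearest samples: nearest gets top_k points, then top_k - 1, ...

    Brute-force search used here in place of a kd-tree index.
    """
    order = sorted(range(len(vectors)),
                   key=lambda i: squared_distance(query, vectors[i]))
    return {idx: top_k - rank for rank, idx in enumerate(order[:top_k])}


def score_image(query_features, sample_features, sample_labels, top_k=1000):
    """Return the 0..100 score of step A36.

    query_features:  dict feature name -> vector of the image under test
    sample_features: dict feature name -> list of sample vectors (same sample order)
    sample_labels:   list of booleans, True = positive (pornographic) sample
    """
    totals = {}
    for name, query in query_features.items():
        for idx, points in rank_scores(query, sample_features[name], top_k).items():
            totals[idx] = totals.get(idx, 0) + points
    if not totals:
        raise ValueError("the sample library is empty")
    best = sorted(totals, key=lambda i: totals[i], reverse=True)[:top_k]
    positives = sum(1 for i in best if sample_labels[i])
    return 100.0 * positives / len(best)


def classify(score):
    """Apply the 65/35 thresholds of step A37."""
    if score >= 65:
        return "pornographic"
    if score <= 35:
        return "non-pornographic"
    return "suspected"
```

With a sample library of the size mentioned in step A31, the brute-force search would be far too slow; it is used here only so that the scoring logic can be read without an index implementation.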
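Embodiment 3:
The above embodiment can quickly and effectively judge pornographic pictures related to specific samples, and achieves good discrimination when sample coverage is good. However, during service operation in an Internet environment new types of objectionable images appear, and as such images become more numerous the discrimination performance keeps declining, so new types of objectionable images need to be added iteratively to the sample library.
This embodiment adds a "sample training module" to the system of Embodiment 2, as shown in FIG. 3-1. On the one hand it obtains the positive and negative images already judged and confirmed by the manual agents as samples; on the other hand new positive and negative image samples can be passed in manually. Feature values are recomputed for the sample pictures and appended to the existing feature collection, and a new kd-tree index is computed.
As shown in FIG. 3-2, the pornographic image detection method of this embodiment includes:
Step B1: the image detection service nodes subscribe with the cluster management module through a message queue in "publish-subscribe" mode;
Step B2: the manual agent module sends the manual judgment results to the sample training module;
Step B3: the sample training module extracts the SCD, CD and HOG features of the manually judged positive and negative sample images, appends them with their labels to the existing binary collection, and computes a kd-tree index file;
Step B4: the sample training module sends the index file to the cluster management module;
Step B5: the cluster management module publishes the index file to the cluster resident modules of the service nodes that subscribed in step B1;
Step B6: the cluster resident module replaces the original index file with the new one and notifies the core detection module to re-read the index before making subsequent judgments.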
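The publish-subscribe index update of steps B1 and B4 to B6 can be sketched in Python with in-process objects. This is an illustration only: the class and method names are hypothetical, the index is represented by an opaque value, and a real deployment would use a message queue between machines rather than direct method calls.

```python
class CoreDetector:
    """Stand-in for the core detection module; holds the index it searches."""

    def __init__(self):
        self.index = None
        self.reloads = 0

    def reload_index(self, index):
        self.index = index
        self.reloads += 1


class ClusterResident:
    """Runs on a service node and receives index files from the cluster manager."""

    def __init__(self, detector, index_file):
        self.detector = detector
        self.index_file = index_file
        detector.reload_index(index_file)

    def on_index_published(self, new_index_file):
        self.index_file = new_index_file            # replace the old index (step B6)
        self.detector.reload_index(new_index_file)  # tell the detector to re-read it


class ClusterManager:
    """Publishes new index files to every subscribed resident module."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, resident):                  # step B1
        self.subscribers.append(resident)

    def publish_index(self, new_index_file):        # steps B4 and B5
        for resident in self.subscribers:
            resident.on_index_published(new_index_file)
```

Because nodes register themselves, the manager needs no static list of service nodes, which is the property the later embodiments rely on when nodes are added or removed.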
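Embodiment 4:
In the above embodiments a single message distribution module interfaces with the RCS service interface to obtain the images to be detected. As the service develops and the volume of transmitted data keeps growing, a single message distribution module cannot sustain the data throughput from the RCS service interface, nor accommodate changes in the interfacing scenarios.
As shown in FIG. 4, the message distribution module of this embodiment, building on Embodiments 2 and 3, is adapted to interface with CDN Cache devices (hereinafter Cache), requesting cached images and judging whether they are pornographic. The message distribution module nodes then need to be expanded: on the one hand this allows interfacing with new RCS or Cache interfaces, and on the other hand it splits the traffic when images are distributed to the service nodes.
The pornographic image detection method of this embodiment includes:
Step C1: identify the new RCS service interface or Cache service interface, and add a new message distribution module node to interface with it;
Step C2: change the message distribution module configuration in the cluster resident module of the service nodes; the message distribution module itself needs no service-node configuration;
Step C3: carry out the subsequent image judgment and training flows as in the preceding embodiments.
For judging pornographic images cached by the CDN Cache, the specific interfacing flow includes:
Step C31: the message distribution module obtains the call detail record (CDR) data of the current time interval from the Cache management device;
Step C32: the message distribution module parses out the URLs (Uniform Resource Locator) of the hit images in the CDR data;
Step C33: the message distribution module requests those URLs from the Cache to obtain the image files, and performs the subsequent distribution and detection.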
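Step C32 can be sketched in Python as a filter over CDR lines. The record format is not given in the text, so everything about it here is hypothetical: the sketch assumes one record per line in the form `timestamp|url|status`, with `HIT` marking a cache hit. The image extensions are the formats listed in Embodiment 1.

```python
from urllib.parse import urlparse

# Formats named in Embodiment 1 (bmp, jpg, tiff, gif, png).
IMAGE_EXTENSIONS = (".bmp", ".jpg", ".tiff", ".gif", ".png")


def hit_image_urls(cdr_lines):
    """Return the distinct URLs of cache-hit images, in first-seen order.

    Assumes a hypothetical "timestamp|url|status" line format; malformed
    lines are skipped.
    """
    urls = []
    seen = set()
    for line in cdr_lines:
        fields = line.strip().split("|")
        if len(fields) != 3:
            continue
        _timestamp, url, status = fields
        if status != "HIT":
            continue
        # Compare against the path only, so query strings do not hide the extension.
        path = urlparse(url).path.lower()
        if path.endswith(IMAGE_EXTENSIONS) and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

The returned URLs would then be requested from the Cache as in step C33 and fed into the same packing and distribution path as RCS images.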
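Embodiment 5:
In the above embodiments, as the picture recognition traffic grows further or the service scenarios become richer, linear scaling of the message distribution module is further required. On the one hand, message distribution modules need to be brought online or taken offline as the traffic volume and service types grow or shrink; on the other hand, the image detection service nodes should not have to modify their connection configuration and restart accordingly, so that service continuity is maintained and operation and maintenance costs are reduced. The group of message distribution modules can then be networked as a cloud service that offers general-purpose pornographic image detection externally. As shown in FIG. 5-1, a "message routing module" is added to the system on the basis of Embodiments 2, 3 and 4. Its purpose is that scaling the service capacity of the message distribution modules does not affect the configuration of the back-end core detection, and scaling the detection service nodes likewise does not affect the configuration of the message distribution modules; the message routing module is also responsible for sensibly scheduling and caching the messages passed between the two sides, reducing the detection latency of the system.
As shown in FIG. 5-2, the pornographic image detection method of this embodiment includes:
Step D1: the core detection module sends the message routing module an asynchronous request to fetch images, sets up an image-file-received event and starts listening for it;
Step D2: the message routing module receives the asynchronous fetch requests sent by the core detection modules of the service nodes, together with the service node information for each request, and forms a request queue;
Step D3: the message distribution module takes members of the image request queue, including the corresponding service node information, from the message routing module;
Step D4: the message distribution module obtains image data from service interfaces such as RCS and Cache and packs it into chunk file blocks; according to the image request obtained, it encapsulates the chunk file together with the service node information and sends it to the message routing module;
Step D5: the message routing module receives the packet sent by the message distribution module, parses out the service node information, and sends the chunk packet, with the service node information removed, to the core detection module of the corresponding service node;
Step D6: the core detection module detects the chunk-packet-received event, parses the image files out of the chunk packet, and carries out the subsequent detection.
In the embodiments of the present invention the modules work in coordination with one another. The message distribution module obtains the images and compressed files to be judged from the RCS/CDN interfaces and distributes them to the image detection cluster. The cluster management module monitors the service status of the image detection cluster, controls services going online and offline, delivers service configuration, and reports statistics. The cluster resident module is deployed on the cluster servers, where it receives instructions from the cluster management module and carries out operations such as service control. The core detection module runs the detection algorithm on received images and outputs the type of each image (pornographic, non-pornographic, or suspected pornographic). The sample training module collects existing pornographic and non-pornographic images, marks them accordingly, stores them and builds the index. The manual agent module manually confirms the suspected pornographic images identified by the core detection module. The message routing module allows the message distribution modules and core detection modules to be deployed scalably and decouples the two.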
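The request and delivery flow of steps D1 to D5 can be sketched in Python with in-memory queues. This is a simplified, single-process illustration: the class names, the dictionary packet layout, and the per-node inbox are assumptions of the sketch, and the asynchronous events of steps D1 and D6 are replaced by plain method calls.

```python
from collections import deque


class MessageRouter:
    """Sits between distributors and detectors so neither side configures the other."""

    def __init__(self):
        self.requests = deque()  # pending image requests, one node id per request
        self.inboxes = {}        # node id -> chunks delivered to that node

    def request_images(self, node_id):
        """Steps D1/D2: a core detection module asks for a chunk."""
        self.inboxes.setdefault(node_id, deque())
        self.requests.append(node_id)

    def next_request(self):
        """Step D3: hand the oldest pending request to a distributor."""
        return self.requests.popleft() if self.requests else None

    def deliver(self, packet):
        """Step D5: strip the node information and forward only the chunk."""
        self.inboxes[packet["node_id"]].append(packet["chunk"])


class MessageDistributor:
    """Holds chunks obtained from RCS/Cache interfaces and answers requests."""

    def __init__(self, router, chunks):
        self.router = router
        self.chunks = deque(chunks)

    def serve_one(self):
        """Steps D3/D4: answer one pending request; return False if nothing was sent."""
        if not self.chunks:
            return False
        node_id = self.router.next_request()
        if node_id is None:
            return False
        self.router.deliver({"node_id": node_id, "chunk": self.chunks.popleft()})
        return True
```

Since detectors pull work by posting requests, adding or removing a distributor or a detector node changes only what is registered with the router, which matches the decoupling goal stated for this embodiment.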
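Optionally, the pornographic image detection method of this embodiment may further include:
Collecting image samples and marking each as a positive sample (a pornographic image) or a negative sample (a non-pornographic image). This is one example of collecting image samples and marking them as pornographic or non-pornographic samples; other ways of marking may also be used. The samples are input to the sample training module, which extracts the global retrieval features of the images and builds the index; the index data is distributed and configured to the detection cluster through the cluster management module.
The message distribution module obtains the images to be judged, together with other additional messages, from interfaces such as RCS/CDN in real time and forms a message queue.
The message distribution module distributes the queued messages to the core detection modules according to the service nodes on which they reside.
The core detection module obtains the images to be judged, judges them in parallel, and outputs a score for each image.
Images with higher scores are treated as pornographic, images with lower scores as normal, and images falling in the middle range as suspected; suspected images are sent to the manual agent module for manual confirmation.
After manual confirmation, the images are input to the sample training module for training, and the training results are redistributed to the service nodes to improve the accuracy of subsequent judgments.
The pornographic image detection method of this embodiment can provide high recognition accuracy for pornographic images, high processing performance for massive message volumes, scalable system deployment as traffic changes, and fast adaptation to new types of pornographic pictures.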
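Embodiment 6:
As shown in FIG. 6-1, the pornographic image detection system provided by this embodiment includes an acquisition module, a determination module, a distribution module, and a plurality of detection modules, wherein the acquisition module is configured to acquire at least one image to be detected from a high-throughput service interface; the determination module is configured to determine, from the plurality of detection modules, at least one detection module as the target detection module; the distribution module is configured to distribute the image to be detected to the corresponding target detection module; and the detection module is configured to perform pornographic image detection on the image to be detected to obtain a detection result.
Note that the functions of the acquisition module, the determination module, and the distribution module in this embodiment may be implemented by the message distribution module of the foregoing embodiments, and the function of the detection module in this embodiment may be implemented by the core detection module of the foregoing embodiments.
Optionally, this embodiment further provides a pornographic image detection system which, as shown in FIG. 6-2, further includes a manual agent module configured to, after the detection result is obtained, manually confirm the image to be detected when the detection result indicates that it is a suspected pornographic image.
Optionally, the distribution module is further configured to package a plurality of images to be detected into data packets according to a preset size before distributing the image to be detected to the corresponding target detection module;
the distribution module distributes the image to be detected to the corresponding target detection module by distributing the data packets to the corresponding target detection module.
Optionally, as shown in FIG. 6-3, the system further includes a routing module, and the distribution module is configured to distribute the image to be detected to the corresponding target detection module through the routing module.
Optionally, the determination module is configured to, when a plurality of images to be detected are acquired from the high-throughput service interface, determine a plurality of detection modules from among the detection modules as target detection modules; the distribution module is further configured to distribute the plurality of images to be detected to the corresponding target detection modules asynchronously.
Optionally, the detection module is configured to perform pornographic image detection on the image to be detected using a global feature retrieval algorithm to obtain the detection result.
Optionally, the global features include: a color histogram feature in the hexcone model (HSV) space, a region shape feature, and a histogram of oriented gradients feature of local regions.
Optionally, the detection module performs pornographic image detection on the image to be detected using the global feature retrieval algorithm as follows:
collecting image samples, marking an image sample as a pornographic sample when it is a pornographic image and as a non-pornographic sample when it is a non-pornographic image; extracting from the image samples the color histogram feature in the hexcone model space, the region shape feature, and the histogram of oriented gradients feature of local regions;
building a pornographic image index from these three features of the image samples;
searching the pornographic image index with the same three features of the image to be detected to obtain a plurality of similar sample images;
obtaining the detection result according to the marks of the plurality of similar sample images.
Optionally, the detection module obtains the detection result according to the marks of the plurality of similar sample images as follows:
obtaining, from the marks of the similar sample images, the proportion of pornographic sample images among them;
if the proportion is greater than or equal to a first threshold, the detection result is a pornographic image;
if the proportion is less than the first threshold and greater than a second threshold, the detection result is a suspected pornographic image, the first threshold being greater than the second threshold;
if the proportion is less than or equal to the second threshold, the detection result is a non-pornographic image.
Note that a module in this embodiment may correspond one-to-one to a module in the preceding embodiments, one module may correspond to several modules, or several modules may correspond to one module. For example, the acquisition module and the distribution module in this embodiment together correspond to the message distribution module described above, and the detection module in this embodiment corresponds to the core detection module described above. Modules may be mapped or combined in any way that achieves the same function. Each module may be a hardware device, a corresponding software program, or a combination of a software program and a hardware device.
An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed, implement the pornographic image detection method.
A person of ordinary skill in the art will understand that all or some of the steps of the above method may be completed by a program instructing the relevant hardware (for example a processor), and the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments may also be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in hardware, for example by an integrated circuit that realizes its function, or as a software function module, for example by a processor executing programs/instructions stored in a memory. This application is not limited to any specific combination of hardware and software. A person of ordinary skill in the art should understand that modifications or equivalent substitutions made to the technical solutions of this application without departing from their spirit and scope shall all be covered by the claims of this application.
Industrial Applicability
The above technical solution can perform pornographic image detection on the large number of pictures carried by a high-throughput service, avoiding the security risk of pornographic content spreading through such services, improving their network security, and improving the user experience.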

Claims (19)

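  1. A pornographic image detection method, comprising:
    acquiring at least one image to be detected from a high-throughput service interface;
    determining, from a plurality of detection modules, at least one detection module as the target detection module;
    distributing the image to be detected to the corresponding target detection module;
    performing, by the target detection module, pornographic image detection on the image to be detected to obtain a detection result.
  2. The pornographic image detection method according to claim 1, wherein the high-throughput service interface comprises a Rich Communication Suite (RCS) interface and/or a content delivery network (CDN) interface.
  3. The pornographic image detection method according to claim 1, further comprising: after the detection result is obtained, when the detection result indicates that the image to be detected is a suspected pornographic image, manually confirming the image to be detected.
  4. The pornographic image detection method according to claim 1, further comprising:
    before distributing the image to be detected to the corresponding target detection module, packaging a plurality of images to be detected into data packets according to a preset size;
    wherein distributing the image to be detected to the corresponding target detection module comprises: distributing the data packets to the corresponding target detection module.
  5. The pornographic image detection method according to claim 1, wherein distributing the image to be detected to the corresponding target detection module comprises: distributing the image to be detected to the corresponding target detection module through a routing module.
  6. The pornographic image detection method according to any one of claims 1 to 5, wherein determining at least one detection module from the plurality of detection modules as the target detection module comprises: when a plurality of images to be detected are acquired from the high-throughput service interface, determining a plurality of detection modules from among the detection modules as target detection modules;
    and distributing the image to be detected to the corresponding target detection module comprises: distributing the plurality of images to be detected to the corresponding target detection modules asynchronously.
  7. The pornographic image detection method according to any one of claims 1 to 5, wherein performing, by the target detection module, pornographic image detection on the image to be detected to obtain a detection result comprises:
    the detection module performing pornographic image detection on the image to be detected using a global feature retrieval algorithm to obtain the detection result.
  8. The pornographic image detection method according to claim 7, wherein the global features comprise: a color histogram feature in the hexcone model (HSV) space, a region shape feature, and a histogram of oriented gradients feature of local regions.
  9. The pornographic image detection method according to claim 8, wherein the detection module performing pornographic image detection on the image to be detected using the global feature retrieval algorithm to obtain the detection result comprises:
    collecting image samples, marking an image sample as a pornographic sample when it is a pornographic image and as a non-pornographic sample when it is a non-pornographic image; extracting from the image samples the color histogram feature in the hexcone model space, the region shape feature, and the histogram of oriented gradients feature of local regions;
    building a pornographic image index from these three features of the image samples;
    searching the pornographic image index with the same three features of the image to be detected to obtain a plurality of similar sample images;
    obtaining the detection result according to the marks of the plurality of similar sample images.
  10. The pornographic image detection method according to claim 9, wherein obtaining the detection result according to the marks of the plurality of similar sample images comprises:
    obtaining, from the marks of the similar sample images, the proportion of pornographic sample images among them;
    if the proportion is greater than or equal to a first threshold, the detection result is a pornographic image;
    if the proportion is less than the first threshold and greater than a second threshold, the detection result is a suspected pornographic image, the first threshold being greater than the second threshold;
    if the proportion is less than or equal to the second threshold, the detection result is a non-pornographic image.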
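  11. A pornographic image detection system, comprising an acquisition module, a determination module, a distribution module, and a plurality of detection modules:
    the acquisition module is configured to acquire at least one image to be detected from a high-throughput service interface;
    the determination module is configured to determine, from the plurality of detection modules, at least one detection module as the target detection module;
    the distribution module is configured to distribute the image to be detected to the corresponding target detection module;
    the detection module is configured to perform pornographic image detection on the image to be detected to obtain a detection result.
  12. The pornographic image detection system according to claim 11, further comprising a manual agent module configured to, after the detection result is obtained, manually confirm the image to be detected when the detection result indicates that it is a suspected pornographic image.
  13. The pornographic image detection system according to claim 11, wherein the distribution module is further configured to package a plurality of images to be detected into data packets according to a preset size before distributing the image to be detected to the corresponding target detection module;
    the distribution module distributes the image to be detected to the corresponding target detection module by distributing the data packets to the corresponding target detection module.
  14. The pornographic image detection system according to claim 11, wherein the system further comprises a routing module;
    the distribution module is configured to distribute the image to be detected to the corresponding target detection module through the routing module.
  15. The pornographic image detection system according to any one of claims 11 to 14, wherein the determination module is configured to, when a plurality of images to be detected are acquired from the high-throughput service interface, determine a plurality of detection modules from among the detection modules as target detection modules; the distribution module is further configured to distribute the plurality of images to be detected to the corresponding target detection modules asynchronously.
  16. The pornographic image detection system according to any one of claims 11 to 14, wherein the detection module is configured to perform pornographic image detection on the image to be detected using a global feature retrieval algorithm to obtain the detection result.
  17. The pornographic image detection system according to claim 16, wherein the global features comprise: a color histogram feature in the hexcone model (HSV) space, a region shape feature, and a histogram of oriented gradients feature of local regions.
  18. The pornographic image detection system according to claim 17, wherein the detection module performs pornographic image detection on the image to be detected using the global feature retrieval algorithm as follows:
    collecting image samples, marking an image sample as a pornographic sample when it is a pornographic image and as a non-pornographic sample when it is a non-pornographic image; extracting from the image samples the color histogram feature in the hexcone model space, the region shape feature, and the histogram of oriented gradients feature of local regions;
    building a pornographic image index from these three features of the image samples;
    searching the pornographic image index with the same three features of the image to be detected to obtain a plurality of similar sample images;
    obtaining the detection result according to the marks of the plurality of similar sample images.
  19. The pornographic image detection system according to claim 18, wherein the detection module obtains the detection result according to the marks of the plurality of similar sample images as follows:
    obtaining, from the marks of the similar sample images, the proportion of pornographic sample images among them;
    if the proportion is greater than or equal to a first threshold, the detection result is a pornographic image;
    if the proportion is less than the first threshold and greater than a second threshold, the detection result is a suspected pornographic image, the first threshold being greater than the second threshold;
    if the proportion is less than or equal to the second threshold, the detection result is a non-pornographic image.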
PCT/CN2016/085882 2015-09-30 2016-06-15 Pornographic image detection method and system WO2017054515A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510641416.6 2015-09-30
CN201510641416.6A CN106557527A (zh) 2015-09-30 2015-09-30 Pornographic image detection and system

Publications (1)

Publication Number Publication Date
WO2017054515A1 true WO2017054515A1 (zh) 2017-04-06

Family

ID=58417816

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/085882 WO2017054515A1 (zh) 2015-09-30 2016-06-15 Pornographic image detection method and system

Country Status (2)

Country Link
CN (1) CN106557527A (zh)
WO (1) WO2017054515A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109446461A (zh) * 2018-10-29 2019-03-08 成都思维世纪科技有限责任公司 Method for auditing objectionable content cached in CDN and cache
CN111259304A (zh) * 2020-02-17 2020-06-09 猎港信息技术(上海)有限公司 Forum monitoring system based on image recognition

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
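CN102547794A (zh) * 2012-01-12 2012-07-04 郑州金惠计算机系统工程有限公司 Platform for identifying and supervising pornographic images, videos and objectionable content in WAP mobile media
CN102567101A (zh) * 2012-01-12 2012-07-11 郑州金惠计算机系统工程有限公司 Multi-process management system for identifying and supervising pornographic images in WAP mobile media
CN102842032A (zh) * 2012-07-18 2012-12-26 郑州金惠计算机系统工程有限公司 Mobile Internet pornographic image recognition method based on a multi-mode combination strategy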
US20150221097A1 (en) * 2014-02-05 2015-08-06 Electronics And Telecommunications Research Institute Harmless frame filter, harmful image blocking apparatus having the same, and method for filtering harmless frames

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101303734B (zh) * 2008-06-25 2011-06-22 深圳市腾讯计算机系统有限公司 Picture detection system and method
US8374914B2 (en) * 2008-08-06 2013-02-12 Obschestvo S Ogranichennoi Otvetstvennostiu “Kuznetch” Advertising using image comparison
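CN102306287B (zh) * 2011-08-24 2017-10-10 百度在线网络技术(北京)有限公司 Method and device for identifying sensitive images
CN104134059B (zh) * 2014-07-25 2017-07-14 西安电子科技大学 Objectionable image detection method under a hybrid deformable model that preserves color information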

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUO, HONGTAO ET AL.: "Design of Internet Porn Image Monitoring System and Node Monitoring System", NETINFO SECURITY, 30 April 2010 (2010-04-30), pages 41 - 43, ISSN: 1671-1122 *

Also Published As

Publication number Publication date
CN106557527A (zh) 2017-04-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16850140

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16850140

Country of ref document: EP

Kind code of ref document: A1