WO2009144330A1 - Method for detecting objectionable content in still or moving digital images - Google Patents

Method for detecting objectionable content in still or moving digital images Download PDF

Info

Publication number
WO2009144330A1
WO2009144330A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
thumbnail
values
skin
thumbnails
Prior art date
Application number
PCT/EP2009/056765
Other languages
English (en)
Inventor
Gérard YAHIAOUI
Pierre Da Silva Dias
Original Assignee
Crystal Content
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Crystal Content filed Critical Crystal Content
Publication of WO2009144330A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/23Recognition of whole body movements, e.g. for sport training
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/242Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Definitions

  • the present invention generally relates to computer-implemented methods for detecting undesirable content, such as illegal content or content intended for a specific audience, such as explicit sexual content for adults.
  • the present invention aims at providing a new and effective detection method based on a combination of processes or agents.
  • the present invention provides a method for detecting objectionable content in a bi-dimensional pixel image, comprising: implementing a first process for detecting geometric features of the image, implementing a second process for detecting skin colors in the image, implementing a third process for detecting object shapes indicative of sexual contents in the image.
  • Preferred but non-limiting aspects of this method are as follows:
  • said bi-dimensional pixel image belongs to a video sequence, and the method further comprises: implementing a fourth process for detecting recurrent movement in a succession of images.
  • said first process includes a sub-process for detecting circular patterns, said sub-process comprising the following steps:
  • said first process includes a sub-process for detecting areas having axial symmetry in a bi-dimensional pixel image, said sub-process comprising the following steps:
  • each pixel of the image has values in a color space having N dimensions (such as RGB), and said second process comprises the following steps:
  • the second process further comprises, before applying the vector to the neural network, a step of eliminating non-eligible areas of the image (too green, too blue, too dark, etc.).
  • said elimination step comprises converting the color pixel values (RGB) into HSL (Hue/Saturation/Lightness) pixel values and applying to said HSL values a gauge eliminating non-skin type hue values, as well as dark pixels and black & white pixels.
  • said neural network outputs degree of membership values among three classes, i.e. normal skin, dark skin, and non-skin.
  • the second process comprises a further step of applying said degree of membership values to a table of fuzzy decision rules for determining a score of skin color associated with an index score of reliability.
  • the second process comprises a step of generating an image of skin color scores for a plurality of eligible areas of the starting image, and a step of noise removal and binarization to retain as skin areas those areas whose adjacent pixels all have the same binary value.
  • said third process comprises the following steps: - for an image of a sequence, providing a vector (Dx, Dy) of apparent movement relative to a previous image of the sequence in different positions of the image,
  • the third process comprises a step of determining the amplitudes of the periodic signal values.
  • the removal step comprises computing a linear regression of S(t) against t, so as to remove the trend component thereof.
  • said fourth process comprises the following steps: - providing a plurality of decomposition thumbnails having a size of X x Y diagrammatically representing different object appearances,
  • said fourth process comprises post-processing the membership function, said post-processing including dynamics reshaping by repositioning the slope of the membership function.
  • the present invention also provides a computer system or computing environment including sets of instructions for performing the above method, as well as information media or carriers containing said sets of instructions.
  • Fig. 1 illustrates the scanning of a bidimensional pixel image with a region of interest
  • Fig. 2 illustrates a process for detecting circular patterns
  • Fig. 3 illustrates a process for detecting symmetric patterns
  • Fig. 4 is a series of curves illustrating the variability of bi-dimensional color vectors, for use in designing a skin color detection process
  • Fig. 5A illustrates a membership function used in hue detection for this process
  • Fig. 5B illustrates a membership function used in saturation detection for said process
  • Fig. 6 is a synoptic representation of a pixel eligibility scheme for the skin color detection process
  • Fig. 7 is a synoptic representation of the skin color detection process
  • Figs. 8A, 8B and 8C illustrate fuzzy logic rules used in the skin color detection process
  • Fig. 9 illustrates a pixel region used for de-noising and binarizing pixel values for the skin color detection process
  • Figs. 10A and 10B illustrate two types of optical flows for detecting recurrent movement
  • Fig. 11 illustrates a trendless signal generation step used in a recurrent movement detection process of the invention
  • Fig. 12 illustrates an autocorrelation step used in said process
  • Fig. 13 illustrates a linear correlation coefficient generation step used in said process
  • Fig. 14 shows a set of model thumbnails used in a penetration detection process
  • Fig. 15 illustrates a neural network used for said process
  • Figs. 16A and 16B illustrate two fuzzy logic membership functions used in said neural network
  • Fig. 17 shows a set of model thumbnails used in a breast nipple detection process
  • Fig. 18 illustrates a neural network used for said process
  • Figs. 19A and 19B illustrate two fuzzy logic membership functions used in said neural network.
  • the present invention may implement conventional processes for edge detection and corner detection, such as the corner detectors of Harris and Moravec (cf. en.wikipedia.org/wiki/Corner_detection).
  • the system scans each image entirely with a Region of Interest ("ROI") that contains a thumbnail to be analyzed, having a size of e.g. 32 x 32 pixels. This ROI successively assumes all possible locations in the complete picture, as illustrated in Fig. 1.
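The exhaustive ROI scan described above can be sketched as follows (an illustrative sketch: the function name, `step` parameter and image dimensions are assumptions; only the 32 x 32 ROI size comes from the text):

```python
def roi_positions(width, height, roi=32, step=1):
    """Yield the top-left corner of every position a roi x roi Region
    of Interest can assume inside a width x height image.  step=1
    reproduces the 'all possible locations' scan of Fig. 1; a larger
    step would trade exhaustiveness for speed."""
    for y in range(0, height - roi + 1, step):
        for x in range(0, width - roi + 1, step):
            yield (x, y)

# A 64 x 48 image admits (64 - 32 + 1) * (48 - 32 + 1) = 561 positions.
assert sum(1 for _ in roi_positions(64, 48)) == 561
```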
  • the system computes a "circular area" score, obtained by calculating a linear correlation coefficient (ranging between -1 and 1) between two images derived from the same thumbnail being analyzed. This coefficient is however assigned the value 0 when it is negative, so as to obtain a circular area score ranging between 0 and 1.
  • the two images are, on the one hand, the thumbnail itself and, on the other hand, the same thumbnail after a rotation by an angle of 90°.
  • each image is entirely scanned with a Region of Interest (ROI) that contains a thumbnail to be analyzed, having a size of e.g. 32 x 32 pixels, which successively assumes all possible locations in the complete picture, as illustrated in Fig. 1.
  • a "symmetric data" score is computed, obtained by calculating a linear correlation coefficient (ranging between -1 and 1) between two images derived from the same thumbnail being analyzed. This coefficient is however assigned the value 0 when it is negative, so as to obtain a symmetric data score ranging between 0 and 1.
  • the two images are, on the one hand, the thumbnail itself and, on the other hand, the same thumbnail after a rotation by an angle of 180°.
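Both the circular-area and symmetric-data scores above can be sketched with a single helper (a hedged sketch: numpy and the helper name are assumptions; the clamped Pearson correlation against a 90° or 180° rotation is from the text):

```python
import numpy as np

def rotation_score(thumb: np.ndarray, quarter_turns: int) -> float:
    """Correlation between a thumbnail and a rotated copy of itself,
    clamped at 0 so the score lies in [0, 1].  quarter_turns=1 (90
    degrees) gives the circular-area score, quarter_turns=2 (180
    degrees) gives the symmetric-data score."""
    rotated = np.rot90(thumb, quarter_turns)
    # Pearson linear correlation coefficient, in [-1, 1]
    r = np.corrcoef(thumb.ravel(), rotated.ravel())[0, 1]
    return max(float(r), 0.0)

# A filled disc is invariant under rotation about the thumbnail centre,
# so both scores are maximal.
yy, xx = np.mgrid[0:32, 0:32]
disc = (((xx - 15.5) ** 2 + (yy - 15.5) ** 2) < 100).astype(float)
assert rotation_score(disc, 1) > 0.99
```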
  • the factors of the variability of skin color are mainly:
  • the purpose of the process is to achieve automatic evaluation of the quality of lighting and white balance.
  • the second solution tested was to "measure" the distortion of the cloud of points, and to provide these measures to a color recognition system. Pursuant to this testing, the second solution was selected as preferred: it is equally efficient and does not require recalculating a re-normalized image.
  • each (R, G, B) pixel is converted into an (H, S, L) pixel and a broad gauge is applied that selects the areas which may be skin
  • the membership function for the hue (H) parameter is illustrated in Fig. 5A, and the membership function for the saturation (S) parameter is illustrated in Fig. 5B.
  • a similar membership function is used for the lightness variable (L).
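The eligibility gauge can be sketched as follows. The trapezoidal shape is a common choice for such membership functions, but the breakpoints below are illustrative assumptions: the calibrated values of Figs. 5A and 5B are not reproduced here.

```python
import colorsys

def trapezoid(x, a, b, c, d):
    """Trapezoidal fuzzy membership: 0 below a, ramping to 1 on [b, c],
    back to 0 above d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def is_eligible_skin_pixel(r, g, b):
    """Broad HSL gauge keeping pixels that *may* be skin: a skin-like
    hue band, enough saturation (rejects black & white pixels) and
    enough lightness (rejects dark pixels).  All breakpoints are
    illustrative, not the patent's values."""
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    mu_h = trapezoid(h, -0.02, 0.0, 0.10, 0.15)  # reddish/orange hues
    mu_s = trapezoid(s, 0.05, 0.15, 0.90, 1.01)  # not grey, not neon
    mu_l = trapezoid(l, 0.10, 0.25, 0.85, 0.95)  # not dark, not blown out
    return min(mu_h, mu_s, mu_l) > 0.0

assert is_eligible_skin_pixel(220, 170, 140)   # a typical skin tone
assert not is_eligible_skin_pixel(20, 20, 20)  # too dark
assert not is_eligible_skin_pixel(30, 60, 200) # too blue
```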
  • the neural network is applied on a thumbnail of p x p pixels, e.g. 5 x 5 pixels, whose center successively assumes all possible locations of eligible pixels in the image, as illustrated in Fig. 7.
  • the neural network has been previously trained on a large number (e.g. 1000) of labeled images, i.e. area thumbnails that were identified by the human expert as being human skin.
  • the distinction between skin color and dark skin color is important because scenes of nudity that expose genitals commonly show contrasts between normal skin and darker skin elements (for example, areolae are darker than the rest of the breast).
  • the neural network is capable of outputting 3 values each between -1 and 1 (because of the activation function being hyperbolic tangent).
  • a table of fuzzy decision rules is then advantageously used to calculate a score of skin color associated with an index score of reliability.
  • the index score of reliability is determined by first calculating a value FP as the "FUZZY LOGIC OR" between the values FPS and FPC, and then calculating the absolute gap E between FP and NPF. Finally the "FUZZY LOGIC AND" between E and FP is calculated.
  • the score of skin color is then computed in the same manner as indicated above.
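The reliability computation above can be sketched as follows (a minimal sketch, assuming the standard Zadeh operators — max for "FUZZY LOGIC OR", min for "FUZZY LOGIC AND" — and membership values already mapped to [0, 1]):

```python
def fuzzy_or(a, b):
    # Zadeh fuzzy logic OR = max
    return max(a, b)

def fuzzy_and(a, b):
    # Zadeh fuzzy logic AND = min
    return min(a, b)

def reliability_index(fps, fpc, npf):
    """Reliability of the skin decision from the three class memberships
    (normal skin FPS, dark skin FPC, non-skin NPF)."""
    fp = fuzzy_or(fps, fpc)  # evidence for any kind of skin
    e = abs(fp - npf)        # gap between skin and non-skin evidence
    return fuzzy_and(e, fp)  # reliable only if skin evidence is both
                             # strong and well separated from non-skin

# Strong, unambiguous skin evidence -> high reliability
assert abs(reliability_index(0.9, 0.1, 0.1) - 0.8) < 1e-9
# Conflicting evidence (skin ~ non-skin) -> zero reliability
assert reliability_index(0.5, 0.2, 0.5) == 0.0
```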
  • the result of this process is a score of skin color for each pixel of the image, where: for ineligible areas the score is zero, and for eligible areas the score is that calculated by the process described above.
  • the image of skin color scores may display irregularities, such as a high score alone surrounded by low scores, or a low score alone surrounded by high scores.
  • a mathematical automaton is preferably used and performs the following functions:
  • a Boolean value B is obtained for each pixel.
  • a set of regions (with their location and size) is obtained. All regions with a size smaller than a given threshold Tmin are eliminated.
  • the regions that remain are those considered to have a color of skin. If the proportion of these regions in the image is greater than a threshold, the image is considered as showing sex-related nudity.
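The binarization and region-filtering steps can be sketched as follows (a pure-Python sketch; the binarization threshold, the 4-connectivity choice and the Tmin value are illustrative assumptions):

```python
def skin_regions(scores, threshold=0.5, tmin=4):
    """Binarize a 2-D grid of skin-color scores, then keep only
    4-connected regions of at least `tmin` pixels, discarding isolated
    high scores as noise."""
    h, w = len(scores), len(scores[0])
    binary = [[scores[i][j] >= threshold for j in range(w)] for i in range(h)]
    seen = [[False] * w for _ in range(h)]
    regions = []
    for i in range(h):
        for j in range(w):
            if binary[i][j] and not seen[i][j]:
                stack, region = [(i, j)], []
                seen[i][j] = True
                while stack:  # iterative flood fill over 4-neighbours
                    y, x = stack.pop()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if len(region) >= tmin:  # drop specks smaller than Tmin
                    regions.append(region)
    return regions

scores = [
    [0.9, 0.9, 0.0, 0.0],
    [0.9, 0.9, 0.0, 0.8],  # the lone 0.8 pixel is noise
    [0.0, 0.0, 0.0, 0.0],
]
assert len(skin_regions(scores)) == 1
assert len(skin_regions(scores)[0]) == 4
```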
  • the present invention advantageously includes an optical flow process as published by Pedro Cobos Arribas and Felix Monasterio Huelin Macia, which offers a real-time version of the Horn &amp; Schunck algorithm.
  • the optical flow can track points that move from one picture to another.
  • the optical flow provides a vector of apparent movement of every pixel from the previous image to the current image.
  • This displacement is a 2-component vector computed at each point of the image.
  • Figs. 10A and 10B show two examples of optical flow, respectively the optical flow of a panoramic view (the pixels move in majority in a given direction) and of a traveling sequence (the pixels move closer to the camera, without common direction).
  • Dx and Dy respectively represent the apparent movement along x between image i and image i+1, and the apparent movement along y between image i and image i+1
  • the process checks the periodic features of Sx and Sy. The process is the same for Sx and Sy, so that it will be described below with reference to a generic signal S.
  • the first step is to convert S into a stationary signal, i.e. to remove components due to slow or linear movements of the camera.
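This detrending step (the linear regression of S(t) against t mentioned earlier) can be sketched as follows; numpy and the function name are assumptions:

```python
import numpy as np

def detrend(s: np.ndarray) -> np.ndarray:
    """Turn S(t) into the trendless signal Sd(t) by fitting a linear
    regression of S against t and subtracting it, removing the slow or
    linear camera-motion component (Fig. 11)."""
    t = np.arange(len(s), dtype=float)
    slope, intercept = np.polyfit(t, s, 1)  # least-squares line
    return s - (slope * t + intercept)

# A sine riding on a linear camera drift: after detrending, the drift
# is gone, the residual has zero mean and a much smaller range.
t = np.arange(200, dtype=float)
s = np.sin(2 * np.pi * t / 25) + 0.03 * t + 2.0
sd = detrend(s)
assert abs(sd.mean()) < 1e-9
assert np.ptp(sd) < np.ptp(s)
```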
  • the next step is to compute the autocorrelation function (FAC) of the Sd signal, as illustrated in Fig. 12:
  • FAC(τ) = correlation(Sd(t), Sd(t−τ))
  • FAC is also a periodic signal. If S is a periodic signal with a noise component, then FAC is a signal that rapidly becomes periodic as τ grows.
  • the next step is to count the number of crossings of zero of the FAC signal.
  • the cosine signal CNpz with the same number of points and having the same number of crossings of zero is then generated, and the correlation between FAC(τ) and CNpz(τ) is computed (cf. Fig. 13).
  • the frequency of Sd is the inverse of the period of the cosine CNpz.
  • the frequency F is then calculated; it allows determining whether the period may or may not correspond to a type of movement (a human movement cannot exceed certain speeds: very high frequencies correspond to periodic movements that cannot be generated by a human being).
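The autocorrelation, zero-crossing and frequency steps can be sketched as follows (a simplified sketch: the lag range, the sign-change crossing count and the direct npz-to-frequency conversion are assumptions standing in for the cosine-correlation machinery of Figs. 12-13):

```python
import numpy as np

def autocorr(sd: np.ndarray) -> np.ndarray:
    """FAC(tau) = correlation(Sd(t), Sd(t - tau)) for tau = 1 .. n//2."""
    n = len(sd)
    return np.array([np.corrcoef(sd[lag:], sd[:n - lag])[0, 1]
                     for lag in range(1, n // 2)])

def estimate_frequency(sd: np.ndarray) -> float:
    """Frequency in cycles per sample: count the zero crossings of the
    FAC, then read off the frequency of a cosine with that many zero
    crossings over the same support (its period's inverse)."""
    fac = autocorr(sd)
    npz = int(np.sum(np.diff(np.sign(fac)) != 0))  # zero crossings
    # A cosine with npz zero crossings over len(fac) points spans
    # npz/2 periods.
    return (npz / 2) / len(fac)

t = np.arange(400, dtype=float)
sd = np.sin(2 * np.pi * t / 20)  # true frequency: 1/20 = 0.05
assert abs(estimate_frequency(sd) - 0.05) < 0.01
```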
  • the standard deviation of Sd is considered as the effective amplitude of motion in the analyzed direction.
  • the frequencies of these two signals are tested independently from each other.
  • the vector sum of amplitudes gives the direction of the periodic movement in the image.
  • Movements with amplitude and frequencies within prescribed ranges are considered by the process as illustrating sex-related recurrent movements.
  • Porture is defined as the shape of the torus of contact when a tube is embedded in a hole.
  • a representation space is needed that does not dilute the details that are of greatest interest to the penetration detection process (those that represent elements to recognize). Also needed is a representation space that is insensitive to noise. To ensure this insensitivity to noise, analyzed signals include redundancy (unlike a base of the vector space signals/images).
  • the process uses a set of images constructed for this ad-hoc decomposition into basic components.
  • the set of these components is then considered a signature of the analyzed image.
  • the component of the image along a given decomposition thumbnail is, by definition, the coefficient of linear correlation between the analyzed image and that decomposition thumbnail.
  • a preferred set of decomposition thumbnails is a set of synthetic images that represent each feature of the torus of a possible penetration which the process attempts to recognize.
  • a set of 14 such thumbnails is illustrated in Fig. 14.
  • R subsequent thumbnails are deduced through R rotations by 360/R degrees, and are then used for the decomposition of the image on the 14 x R thumbnails.
  • the process then keeps the maximum value of the coefficients among the rotated thumbnails as the coefficient for the original thumbnail. This generates 14 coefficients. Alternatively, if the process knows the specific direction of interest in the image, it directly computes the coefficient of every original thumbnail rotated in this direction. This also generates 14 coefficients. These 14 coefficients form a "curve", which can be considered as a "signature" of the analyzed image.
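The signature computation can be sketched as follows (a hedged sketch: np.rot90 restricts the rotations to quarter turns, whereas the patent allows R rotations by 360/R degrees, and the toy thumbnails below are not the 14 synthetic images of Fig. 14):

```python
import numpy as np

def signature(roi: np.ndarray, thumbnails: list) -> np.ndarray:
    """Signature of a ROI as one coefficient per decomposition
    thumbnail: the linear correlation between the ROI and the
    thumbnail, maximised over rotated copies of the thumbnail."""
    def corr(a, b):
        return np.corrcoef(a.ravel(), b.ravel())[0, 1]
    sig = []
    for thumb in thumbnails:
        # Keep the best-matching orientation (here: 4 quarter turns).
        sig.append(max(corr(roi, np.rot90(thumb, k)) for k in range(4)))
    return np.array(sig)

# Two toy 8x8 "decomposition thumbnails": a vertical and a diagonal edge.
vert = np.zeros((8, 8)); vert[:, 4:] = 1.0
diag = np.tri(8)
roi = np.zeros((8, 8)); roi[:4, :] = 1.0  # horizontal edge
sig = signature(roi, [vert, diag])
# A horizontal edge is a rotated vertical edge -> near-perfect match.
assert sig[0] > 0.99
```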
  • the decomposition thumbnails that serve as a basis for decomposition can be extended beyond the 14 images presented above. Furthermore, in certain cases it may be necessary to add additional thumbnails: if two images exhibit the same 14-coefficient signature although they are known to be different, the process clearly needs additional decomposition thumbnails.
  • the process then discriminates the signature using a feed-forward neural network trained on thousands of labeled images (labeled by a human operator) using a back-propagation learning rule, as illustrated in Fig. 15.
  • the activation functions of neurons are hyperbolic tangents.
  • the two outputs are post-processed by a set of decision rules. These outputs vary between -1 and +1. They are processed through a fuzzy set membership function that is used for two purposes: to reduce them to between 0 and 1, and to reshape the dynamics by positioning the slope of the membership function in a range between (average - standard deviation) and (average + standard deviation).
  • Penetration Score = FUZZY LOGIC AND(Ap, D)
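The post-processing and final score can be sketched as follows (a sketch under stated assumptions: a piecewise-linear membership with its slope repositioned between mean ± standard deviation, Zadeh min for the "FUZZY LOGIC AND", and caller-supplied mean/std values):

```python
def membership(x, mean, std):
    """Map a raw network output in [-1, 1] to [0, 1], with the linear
    slope of the membership function repositioned between
    (mean - std) and (mean + std) of the observed outputs."""
    lo, hi = mean - std, mean + std
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def penetration_score(ap, d):
    """Penetration Score = FUZZY LOGIC AND(Ap, D), with Zadeh AND = min."""
    return min(ap, d)

ap = membership(0.6, mean=0.0, std=0.5)  # saturates at 1.0
d = membership(0.2, mean=0.0, std=0.5)   # lands on the linear slope
assert abs(penetration_score(ap, d) - 0.7) < 1e-9
```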
  • the problem is generally the same as for the tube-hole interface torus detection described above.
  • the process uses a previously constructed set of images for ad-hoc decomposition into basic components.
  • the set of these components is then considered a signature of the analyzed image.
  • the component of the image along a given decomposition thumbnail is, by definition, the coefficient of linear correlation between the analyzed image and that decomposition thumbnail.
  • the set of decomposition thumbnails is a set of synthetic images that represent each feature of breast nipple that the process is attempting to recognize.
  • An exemplary set of twelve such thumbnails is illustrated in Fig. 17.
  • the 12 coefficients obtained form a "curve", which is the "signature" of the ROI image to be analyzed.
  • the set of decomposition thumbnails that serve as a basis for decomposition can be extended beyond the 12 images shown in Fig. 17. In this regard, if two images exhibit the same 12-coefficient signature although they have been visually identified as being different from each other, additional decomposition thumbnails can be used.
  • the signature is then discriminated using a feed-forward neural network trained on thousands of labeled images (labeled by a human operator) using a back-propagation learning rule, as illustrated in Fig. 18.
  • the activation functions of neurons are hyperbolic tangents.
  • the two outputs are post-processed by a set of decision rules. These outputs vary between -1 and +1. They are processed through a fuzzy set membership function that is used for two purposes: to reduce them to between 0 and 1, and to reshape the dynamics by repositioning the slope of the membership function.
  • the method of the invention receives the scores provided by the different processes, and uses any conventional decision scheme (weighted sums and thresholds, etc.) to identify whether the image or the sequence of images is objectionable (and should be filtered out or made accessible only through a specific procedure) or not.
  • the present invention is not limited to the described embodiment, but the skilled person will be able to devise many variants and alternate embodiments.
  • process executions on each ROI can be implemented in different ways; it is preferred, however, that the ROI iteration be shared between the different processes involving such ROIs.
  • the present invention shall also include the different processes as described above and recited in the claims, considered individually, and any combination of at least two of such processes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for detecting objectionable content in a bi-dimensional pixel image, comprising: implementing a first process for detecting geometric features of the image, implementing a second process for detecting skin colors in the image, and implementing a third process for detecting object shapes indicative of sexual content in the image.
PCT/EP2009/056765 2008-05-30 2009-06-02 Method for detecting objectionable content in still or moving digital images WO2009144330A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13043008P 2008-05-30 2008-05-30
US61/130,430 2008-05-30

Publications (1)

Publication Number Publication Date
WO2009144330A1 (fr) 2009-12-03

Family

ID=40935572

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2009/056765 WO2009144330A1 (fr) 2008-05-30 2009-06-02 Method for detecting objectionable content in still or moving digital images

Country Status (1)

Country Link
WO (1) WO2009144330A1 (fr)


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ASHWIN THANGALI ET AL: "Periodic Motion Detection and Estimation via Space-Time Sampling", 1 January 2005, 2005 SEVENTH IEEE WORKSHOPS ON APPLICATIONS OF COMPUTER VISION (WACV/MOTION'05) - 5-7 JAN. 2005 - BRECKENRIDGE, CO, USA, IEEE, LOS ALAMITOS, CALIF., USA, PAGE(S) 176 - 182, ISBN: 978-0-7695-2271-5, XP031059164 *
CUTLER R ET AL: "ROBUST REAL-TIME PERIODIC MOTION DETECTION, ANALYSIS, AND APPLICATIONS", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, IEEE SERVICE CENTER, LOS ALAMITOS, CA, US, vol. 22, no. 8, 1 August 2000 (2000-08-01), pages 781 - 796, XP000976486, ISSN: 0162-8828 *
N. REA ET AL.: "Multimodal periodicity analysis for illicit content detection in videos", PROC. 3-RD CONF. ON VISUAL MEDIA PRODUCTION, 2006, pages 106 - 114, XP002541426, Retrieved from the Internet <URL:http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4156017> [retrieved on 20090814] *
TADILO ENDESHAW ET AL: "Classification of indecent videos by low complexity repetitive motion detection", 15 October 2008, APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP, 2008. AIPR '08. 37TH IEEE, IEEE, PISCATAWAY, NJ, USA, PAGE(S) 1 - 7, ISBN: 978-1-4244-3125-0, XP031451786 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2510905C2 (ru) * 2011-12-28 2014-04-10 Федеральное Государственное Бюджетное Образовательное Учреждение Высшего Профессионального Образования "Саратовский Государственный Университет Имени Н.Г. Чернышевского" Method for evaluating photo, film and video materials containing an undesirable image (variants)
WO2014090262A1 (fr) * 2012-12-11 2014-06-19 Unify Gmbh & Co. Kg Method of processing video data, device, computer program product, and data element
US9501702B2 (en) 2012-12-11 2016-11-22 Unify Gmbh & Co. Kg Method of processing video data, device, computer program product, and data construct
US10368129B2 (en) 2012-12-11 2019-07-30 Unify Gmbh & Co. Kg Method of processing video data, device, computer program product, and data construct
US10516916B2 (en) 2012-12-11 2019-12-24 Unify Gmbh & Co. Kg Method of processing video data, device, computer program product, and data construct
US20150023552A1 (en) * 2013-07-18 2015-01-22 GumGum, Inc. Systems and methods for determining image safety
US9355406B2 (en) * 2013-07-18 2016-05-31 GumGum, Inc. Systems and methods for determining image safety
US9947087B2 (en) 2013-07-18 2018-04-17 GumGum, Inc. Systems and methods for determining image safety
CN115331286A (zh) * 2022-07-29 2022-11-11 中国兵器工业信息中心 A content security detection system based on deep learning

Similar Documents

Publication Publication Date Title
US20230005238A1 (en) Pixel-level based micro-feature extraction
Khawaja et al. An improved retinal vessel segmentation framework using frangi filter coupled with the probabilistic patch based denoiser
CN102426649B (zh) 一种简单的高准确率的钢印数字自动识别方法
US7957560B2 (en) Unusual action detector and abnormal action detecting method
Frucci et al. WIRE: Watershed based iris recognition
JP4725298B2 (ja) 画像による外観検査方法
CN1523533A (zh) 人的检测方法和设备
CN103886589A (zh) 面向目标的自动化高精度边缘提取方法
CN105243667A (zh) 基于局部特征融合的目标再识别方法
CN105975906B (zh) 一种基于面积特征的pca静态手势识别方法
CN106874867A (zh) 一种融合肤色及轮廓筛选的人脸自适应检测与跟踪方法
WO2009144330A1 (fr) Method for detecting objectionable content in still or moving digital images
Chaabane et al. Color image segmentation using automatic thresholding and the fuzzy C-means techniques
Jamil et al. Illumination-invariant ear authentication
KR20110019969A (ko) 얼굴 검출 장치
CN111507968B (zh) 一种图像融合质量检测方法及装置
JPH07311833A (ja) 人物の顔の検出装置
Cheng The distinctiveness of a curve in a parameterized neighborhood: extraction and applications
CN110245590B (zh) 一种基于皮肤图像检测的产品推荐方法及系统
CN109934190B (zh) 基于变形高斯核函数的自适应强光人脸图像纹理恢复方法
Aminian et al. Face detection using color segmentation and RHT
Wang et al. Research on an improved algorithm of face detection based on skin color features and cascaded Ada Boost
Schmugge et al. Task-based evaluation of skin detection for communication and perceptual interfaces
Ghimire et al. A lighting insensitive face detection method on color images
CN111507969B (zh) 一种图像融合质量检测方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09753985

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09753985

Country of ref document: EP

Kind code of ref document: A1