WO2008037042A2 - Progressive randomization process and equipment for multimedia analysis and reasoning

Info

Publication number: WO2008037042A2
Application number: PCT/BR2007/000156
Authority: WO
Inventors: Anderson De Rezende Rocha; Siome Klein Goldestein
Original assignee: Universidade Estadual De Campinas - Unicamp
Priority date: 2006-09-29
Filing date: 2007-06-15
Publication date: 2008-04-03
Also published as: BRPI0605994A; BRPI0605994B1; WO2008037042A3

Abstract

The present invention discloses a process and equipment to extract high-level information from multimedia content (e.g. image, sound, video). More specifically, it discloses a progressive process of disturbance of multimedia content designed to extract characteristics that are not visible in the natural state of the content analyzed. The process and equipment are able to gather information extracted from various media classes to solve known problems such as multimedia content categorization, hidden message detection, content tampering detection, robust watermarking, and recognition of the originating means of the media (e.g. type of camera used in the capture process), among other uses.

Description

SPECIFICATION

"PROGRESSIVE RANDOMIZATION PROCESS AND EQUIPMENT FOR MULTIMEDIA ANALYSIS AND REASONING"

Field of the Invention

The present invention discloses a process and equipment to extract high-level information from multimedia content (e.g. image, sound, video). More specifically, it discloses a progressive process of disturbance of multimedia content designed to extract characteristics that are not visible in the natural state of the content analyzed. Given a set of multimedia objects to be analyzed, the process performs a series of steps to extract intrinsic statistical characteristics of the objects analyzed.

The process and equipment of the present invention are able to aggregate information extracted from various media classes to solve known problems such as multimedia content categorization, hidden message detection, content tampering detection, robust watermarking, and recognition of the originating media means (e.g. type of camera used in the capture process), among other uses. The process can be encompassed within the most varied devices, each designed to solve a specific application. For example, it can be encompassed within routers in order to analyze the media that pass through the router and, in a non-invasive way, to highlight suspect media that may house hidden content. Moreover, the process can also be encompassed within digital cameras in such a way as to categorize the photographs of the clients (image types).

Background of the Invention

The semantic knowledge of a certain medium allows us to develop intelligent techniques for processing such medium based on its content (Szummer & Picard, 1998). Digital cameras or computer applications can adjust color and brightness automatically, taking into consideration the properties of the scene analyzed. On the whole, multimedia reasoning is defined as the process of analyzing a set of multimedia data to extract high-level or semantic information therefrom. Given an arbitrary multimedia item, a person may wish to discover a certain type of information therein. Such information may be the automatic determination of the class of the multimedia object analyzed, information on the presence or absence of content hidden in the object, a test for authenticity, or an evaluation of potential tampering, among others.

On the whole, multimedia processing techniques use local intrinsic properties of the media analyzed. For example, in the context of digital images, color histograms and an analysis of shape and texture are some of the techniques used. However, this kind of approach limits the process to solving only the kind of problem that is directly linked to the media used. The various kinds of approaches for extracting information and subsequent problem solving using multimedia are described in the state of the art.

Patent document US 7.075.571 describes a system for detecting falsification of an image. The system is able to determine whether a certain photograph originated from a digital camera or whether it was digitally modified. Other similar patents are US 6.516.078 and US 6.970.259, which detect copyright infringements in printed documents. These approaches generally require the original media for comparison and subsequent identification of content tampering.

Patent documents US 7.099.510 and US 6.990.239 describe systems and methods for detecting objects in digital images. The systems are unable to classify the analyzed image within a certain semantic class, only providing the user with information on the absence/presence of a certain object in the image analyzed.

Patent documents US 7.039.856 and US 7.039.239 describe methods for classifying digital documents using content information. The first approach groups similar documents, but it is unable to state which class they belong to. The second approach groups regions of images, but is unable to generalize which class of images the analyzed image belongs to.

Patent document US 7.027.065 describes a method for describing image texture. Patent US 6.985.628 describes image type classification using edge features. However, this kind of approach is extremely dependent upon context. Images having few edge features generally create problems in this kind of classification approach. Patent document US 6.810.149 describes a method for cataloging images. However, the approach described represents more of a retrieval system based on content rather than a classification system per se.

Patent document US 6.480.627 describes image classification using evolved parameters that is able to learn the most important features in a digital media and subsequently use this information in a classification system.

Patent document US 6.246.793 presents a method and apparatus for transforming an image for classification or pattern recognition. The method and apparatus use information on textures, edges, shapes and other information. Other similar patents are US 5.995.651, which discloses image content classification methods using texture patterns; US 5.787.201, which teaches high-order fractal feature extraction for classification of objects in images; and US 5.781.650, which presents a method for classifying human faces.

Patent documents US 2005140791, US 6.862.038, US 2005231611 and JP 2003330941 disclose image categorization methods using color, shape and texture.

Patent documents EP 1107179, US 6.504.951 and DE 60008486 describe a system for distinguishing outdoor from indoor images using middle-level information such as the presence/absence of grass, sky, snow, trees and water textures in the image analyzed.

Patent document US 6.535.636 describes a method for detecting the general quality of a digital image.

On the whole, the methods presented are closely tied to the problem to be solved. In the case of classifying indoor/outdoor images, for example, only these two classes can be differentiated, as in patent US 6.504.951. The state of the art of hidden message detection in digital content and content tampering detection has also been well researched, comprising a highly important field in the industry as a whole.

Patent US 6.831.991 describes a hidden message detection system in images using the statistical properties of color and grayscale images. However, the system is not considered to be of the "blind detection" kind since it is not independent of the type of process used in message hiding. With a dependent detection system, the developers create a system to detect a certain masking algorithm and not a general set of masking techniques. Therefore, the detection process may fail with digital content whose masking process (insertion of hidden message) is not known.

Patent document US 6.804.377 discloses a system of hidden message detection in digital images based on altered information in the color schemes.

Patent documents US 6.064.764, US 7.039.215 and US 6.735.325 disclose systems for detecting watermarks in digital images. The systems are also able to determine whether watermarks have been tampered with. They can also be used as techniques for creating more robust watermarks.

In general terms, the approaches are dependent upon the context analyzed. This means that a system that can detect watermark tampering is generally rather different from a system that can detect the presence of hidden messages in digital content.

Accordingly, it would be useful to develop a simple and unified system for hidden message detection, for detection of content tampering and attempts to counterfeit multimedia, and for multimedia content categorization. Such a system would only require an initial training step presenting examples of typical content for each class of the problem being analyzed. The process could then be adapted to a different problem in a simple manner, normally implying only direct changes in the learning process and in the process by which the media is analyzed.

Brief Description of the Invention

The present invention discloses a process and equipment to extract high-level multimedia information (e.g. image, sound, video) for subsequent use in a classification system. Given a set of media to be analyzed, the present process consists of altering certain properties of these media so as to obtain high-level information that these media present when disturbed by a process of progressive randomization of their content. Different types of media behave differently to the process developed and these different behaviors can be used to solve certain problems associated to the media analyzed.

More specifically, given a set of multimedia objects to be analyzed, the process executes a set of steps to extract intrinsic statistical characteristics from the objects analyzed. Next, a descriptor set is created which represents the objects analyzed. Each group of objects has a specific descriptor set. In order to differentiate one group of objects from another, it suffices to compare the two sets of descriptors and analyze their differences. As an example, consider the problem of hidden message detection in digital content. The distinction between media groups with and without hidden messages is performed by applying the process described herein to a sample set from each one of the groups. Any new example can then be analyzed directly by comparing its descriptor set to the sets of descriptors already processed by the procedure described herein.

Therefore, the present invention presents a simple and unified system for categorizing images in terms of their semantic class. The process merely requires an initial learning step to capture the statistical behavior of a new class. Based on this learning, any new media sample provided to the system will be identified and the invention will accurately highlight the class to which this object belongs, provided that the training has been effective for the classes of interest.

Brief Description of the Drawings

Figure 1 presents the progressive randomization procedure applied to a region (a) of an image. The indicator (b) presents the least significant bits (LSBs) of the image region in (a). The progressive randomization applied to (b) results in the bits being disturbed and shown in (c-h). Analytical descriptors are then applied to each one of the sets generated (b-h) to construct a set of identifying characteristics of the region presented in (a).

Figure 2 shows the selection of some characteristic regions considered important for the progressive randomization procedure. More specifically, eight regions were selected, of which four were overlapping (1-4) and four were not (5-8).

Figure 3 presents the result of a disturbance in four of the least significant bits of an image. The disturbance inserted is represented by bits 1110. The original bits are 1000 and, after disturbance, the resulting bits are 1110.
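The substitution described for Figure 3 can be illustrated with a short Python sketch. Only the bit patterns 1000 and 1110 come from the figure description; the four pixel values and the helper name `embed_lsb` are hypothetical and chosen purely for illustration.

```python
def embed_lsb(values, bits):
    """Overwrite the least significant bit of each value with the matching bit."""
    return [(v & ~1) | b for v, b in zip(values, bits)]


# Four hypothetical 8-bit pixel values whose LSBs read 1, 0, 0, 0.
pixels = [201, 118, 54, 90]
original_lsbs = [v & 1 for v in pixels]

# Disturbance bits taken from the Figure 3 description.
disturbance = [1, 1, 1, 0]
disturbed = embed_lsb(pixels, disturbance)
disturbed_lsbs = [v & 1 for v in disturbed]
```

Each pixel value changes by at most one unit, which is why a disturbance of this kind is not visible to a human observer while still altering the statistics of the LSB plane.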

Detailed Description of the Invention

Minor disturbances in the most varied kinds of media are imperceptible to human beings (Wayner, 2002). Yet such disturbances are statistically detectable and useful for inference on the kind of media analyzed. The present invention presents PROGRESSIVE RANDOMIZATION as a useful process for multimedia reasoning and analysis. This is an unprecedented process that captures the change dynamics of statistical artifacts inserted during the process of disturbance in each semantic class of the media of interest. Different types of media behave differently under the process presented herein and hereafter referred to as Progressive Randomization (PR).

The present invention has a much wider field of applications, and can be used for categorization of the semantic class of the media, content tampering detection, hidden message detection (steganalysis), and recognition of the originating media means (e.g. type of camera used in the capture process), among other applications.

Generally, the Progressive Randomization process comprises up to seven steps, namely: (1) Acquisition of the media to be analyzed; (2) Multi-scale decomposition; (3) Randomization; (4) Selection of regions; (5) Statistical descriptors; (6) Invariance (normalization); (7) Inference. The process takes as its prerequisite the acquisition of the media to be analyzed and results in a set of characteristics that describe the media analyzed. Not all problems require all of the steps set forth.

Illustrative Examples:

Progressive Randomization - Procedure

Requirements: - Set of percentages representing the disturbance intensities

Step 1. Acquisition of the media to be analyzed. Comprises the formation of an M media sampling space with examples of media that are representative of the problem being solved. For each m media present in the M media set apply the steps described in 2-7.

Step 2. Multi-scale decomposition. Comprises the decomposition of the m media provided in j scales. For each scale generated, repeat the procedures in steps 3-7.

Step 3. Randomization. Comprises the disturbance of the least significant statistical properties of the original m media.

Step 4. Selection of regions. Comprises the selection of r regions of interest in the media analyzed. It can be understood as a sampling process for the media provided for analysis.

Step 5. Statistical descriptors. Comprises the calculation of d statistical descriptors on each region generated by the procedure described in step 4. The descriptors used are related to the specific application being developed.

Step 6. Invariance. Comprises the normalization of the descriptors in relation to the measurements present in the original media prior to the progressive randomization procedure.

Step 7. Inference. Comprises the use of any machine learning technique to differentiate the set of descriptors collected from the set of descriptors already learned in the procedure on other media present in the sampling space of M media provided for training.

Step (1): comprises the construction of a sampling space for the media to be analyzed. The progressive randomization procedure is then applied to all the examples of the sampling space provided. If the problem is one of hidden message detection in images, then a set of images with and without hidden messages should be provided. If the problem is one of image categorization, then examples of each one of the classes required for classification should be provided.

Step (2): the image can be decomposed into multiple levels. Decomposition is optional and depends on the problem being analyzed. Carrying out just one scale is a particular case and is the most used. The type of decomposition used is related to the problem and the time available to obtain a satisfactory response from the system.

Step (3): randomization comprises receiving the original m media belonging to an M media set and applying a progressive process of disturbance on m resulting in T(m, 0), T(m, P₁), ..., T(m, Pₙ).

The transformation T(m, Pᵢ) represents a disturbance of intensity Pᵢ in the least significant properties of the media analyzed. In the case of sound and digital images, such properties can be the least significant bits. The percentages used and the number of transformations necessary are related to the problem being solved. The process of disturbance can be defined as a sequence of samples drawn from a random variable X with some statistical distribution, such as the Gaussian or Bernoulli distribution. For a given media m, the transformation T(m, p) is defined as a disturbance of percentage p: p percent of the bits of m are altered (disturbed) according to the statistical distribution of the random variable X. This process is similar, but not equivalent, to adding noise to the media analyzed.
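A minimal Python sketch of the transformation T(m, p) follows, assuming the media is a flat list of integer samples (e.g. pixel or audio values) and that X is a Bernoulli variable with probability 0.5. The function names `randomize_lsb` and `progressive_randomization`, the seed parameter, and the choice to express p as a fraction between 0 and 1 are illustrative assumptions, not part of the specification.

```python
import random


def randomize_lsb(samples, p, rng):
    """Sketch of T(m, p): disturb the LSB of a fraction p of the samples.

    Each selected sample has its LSB overwritten with a Bernoulli(0.5) draw,
    so roughly half of the selected samples keep their original value.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    out = list(samples)
    count = round(p * len(out))
    for idx in rng.sample(range(len(out)), count):
        bit = rng.randint(0, 1)  # Bernoulli(0.5) draw (assumed distribution)
        out[idx] = (out[idx] & ~1) | bit
    return out


def progressive_randomization(samples, percentages, seed=0):
    """Return [T(m, 0), T(m, P1), ..., T(m, Pn)] for the given fractions."""
    rng = random.Random(seed)
    return [list(samples)] + [randomize_lsb(samples, p, rng) for p in percentages]
```

For example, `progressive_randomization(samples, [0.1, 0.5, 1.0])` would return the untouched samples followed by three progressively more disturbed copies; the three fractions here are placeholders, since the specification leaves the set of percentages to the problem being solved.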

Step (4): Local properties of multimedia content are not captured by a global analysis (Wayner, 2002). Statistical descriptors computed on localized regions may be used to capture the change dynamics of the statistical artifacts inserted by the progressive randomization process.

Given a media m, r sampling regions can be selected on m so as to represent m as well as possible. The type and number of regions selected are linked to the problem being solved. In one problem, it may be appropriate to select regions that are rich in detail, whereas in another it may be appropriate to select regions that are poor in detail.

Step (5): A disturbance inserted by the progressive randomization alters a set of least significant properties in the media analyzed, inducing local statistical changes. Accordingly, step five consists of selecting a set of statistical descriptors that is capable of measuring these alterations and supplying a descriptor set of the media analyzed. Different types of media behave differently when submitted to Progressive Randomization. For example, if the media evaluated has a message hidden in some way in its properties, then its behavior under Progressive Randomization differs from that of a media without hidden content.
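Region selection and descriptor calculation can be sketched as follows, assuming a grayscale image stored as a list of rows and rectangular regions given as (top, left, height, width). The specification does not fix the descriptors; the two used here (the fraction of LSBs equal to 1 and a chi-square statistic over value pairs) are assumptions chosen because they react to LSB disturbance, and all function names are hypothetical.

```python
from collections import Counter


def select_region(image, top, left, height, width):
    """Return the pixels of a rectangular region as a flat list."""
    return [v for row in image[top:top + height] for v in row[left:left + width]]


def lsb_ones_ratio(pixels):
    """Fraction of pixels whose least significant bit is 1."""
    return sum(v & 1 for v in pixels) / len(pixels)


def pair_chi_square(pixels):
    """Chi-square statistic over value pairs (2k, 2k+1).

    It compares the count of each even value with the mean count of its pair,
    which is what the counts would tend to if the LSBs were fully random.
    """
    counts = Counter(pixels)
    stat = 0.0
    for even in {v & ~1 for v in counts}:
        expected = (counts[even] + counts[even + 1]) / 2
        stat += (counts[even] - expected) ** 2 / expected
    return stat


def describe(image, regions):
    """Compute both descriptors on every region; returns a flat feature list."""
    features = []
    for region in regions:
        pixels = select_region(image, *region)
        features += [lsb_ones_ratio(pixels), pair_chi_square(pixels)]
    return features
```

Applying `describe` to the original media and to each of its disturbed versions yields one descriptor vector per disturbance level, which is the raw material for the later steps.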

Step (6): In certain problems, a final step may be needed to apply an invariance transformation to the set of descriptors collected. Given that the procedure consists of progressively disturbing a set of properties of a certain type of media, it may be necessary to report only how these descriptors behave relative to the media before the randomization process. In this manner, the set of descriptors is normalized in relation to the original media provided for analysis.
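One possible reading of this normalization, shown here only as a sketch, divides each descriptor measured on a disturbed version by the same descriptor measured on the original media. The use of a ratio (rather than, say, a difference), the `eps` guard against division by zero, and the function name `normalize` are assumptions.

```python
def normalize(descriptor_series, eps=1e-12):
    """Divide each descriptor vector by the one measured on the original media.

    descriptor_series[0] holds the descriptors of T(m, 0); the remaining
    entries hold those of T(m, P1), ..., T(m, Pn).
    """
    baseline = descriptor_series[0]
    return [
        [value / (base if abs(base) > eps else eps) for value, base in zip(vector, baseline)]
        for vector in descriptor_series[1:]
    ]
```

With this choice, a value of 1.0 means a descriptor was unaffected by the disturbance, so media of different sizes or brightness can be compared on the same scale.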

Step (7): Consists of using a machine learning system to separate (differentiate) the sets of characteristics created after carrying out the previous steps. The classifier used depends on the problem being solved. In general, there are various equivalent possibilities.
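Since the specification allows any machine learning technique, the sketch below uses a nearest-centroid classifier as a simple stand-in; it is not the classifier used in the reported experiments. The function names and the class labels in the usage note are hypothetical.

```python
import math


def train_centroids(examples):
    """examples: list of (feature_vector, label). Returns label -> mean vector."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))
```

In a hidden message detection setting, the training examples would be normalized descriptor vectors labeled, for instance, "clean" and "stego", and `classify` would be called on the descriptor vector of each new media sample.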

The present invention is able to resolve various problems related to digital and analog multimedia. Experiments have shown that the process described herein is able to detect messages hidden in digital images with an accuracy rate of 96.3%, without using the multi-scale stage, surpassing techniques described in the state of the art such as those set forth in patents US 2004/6.831.991 and US 2004/6.804.377 and in (Westfeld, 1999), (Fridrich, 2000), (Fridrich, 2001), (Farid, 2002) and (Provos, 2001).

The process described herein is able to differentiate the semantic class of digital images with great accuracy. The process distinguishes digital photographs from computer-generated (photo-realistic) images with an accuracy rate of 94.3%, surpassing techniques such as (Athitsos, 1997), (Oliveira, 2002) and (Farid, 2005). The process distinguishes outdoor from indoor images with an accuracy rate of 93.9%, surpassing techniques such as those described in patents EP 1107179, JP 1195591, US 6504951 and DE 60008486 and in the works by (Picard, 1998), (Luo, 2001) and (Savakis, 2002).

Besides the positive results that spotlight the wide-scale applicability of the procedure described herein, this invention offers a simplified and unified manner of solving certain kinds of problems that were previously only solved on a case-by-case basis.

The procedure described can be adapted to different kinds of problems, and can be encompassed within different mechanisms such as routers, video boards, digital cameras, digital signal processors (DSPs), video recorders and other digital mechanisms. It is possible to develop specific equipment containing the procedure described herein and applied to a certain problem. One example is equipment designed to analyze internet suspects that is able to create logs and records on individuals who are using messages hidden in digital media in their internet communications. Such acts may qualify as a crime, such as the disclosure and sale of child-pornography images hidden in apparently inoffensive media.

The above description of the present invention was set forth for purposes of illustration and description. Furthermore, the description is not intended to limit the invention to the embodiment disclosed herein. Consequently, variations and modifications compatible with the teachings above, and with the skill or knowledge of the relevant technique, are within the scope of the present invention.

Accordingly, the embodiments described above are designed to illustrate more clearly the known ways of carrying out the invention and to enable persons skilled in the art to use the invention in these or other embodiments and with the various modifications needed in specific applications or uses of the present invention. It is intended that this present invention include all modifications and variations of the invention, within the scope described in the specification and in the claims appended hereto.

Claims

1. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", characterized by its ability to extract high-level multimedia information through a progressive process of disturbance of content and by comprising any subsets of the following steps: (1) Acquisition of the media to be analyzed; (2) Multi-scale decomposition; (3) Randomization; (4) Selection of regions; (5) Statistical descriptors; (6) Invariance; and/or (7) Inference;

2. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claim 1, characterized wherein the step of the acquisition of the media to be analyzed comprises, more specifically, the formation of an M media sampling space with examples of media representative of the problem being resolved.

3. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claim 1, characterized wherein the step of multi-scale decomposition preferably comprises decomposition of the m media provided in j scales, and the realization of just one scale is a particular case and the most used.

4. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claim 1, characterized wherein the step of randomization preferably comprises disturbance of the least significant properties of the original m media.

5. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claim 1, characterized wherein the step of selection of regions comprises the selection of r regions of interest in the media analyzed, which can be understood as a sampling process of the media provided for analysis.

6. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claim 1, characterized wherein the step of statistical descriptor analysis comprises the calculation of a certain set of statistical descriptors on each generated region being analyzed, wherein the descriptors to be used are related to the specific use being developed.

7. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claim 1, characterized wherein the step of invariance comprises the normalization of descriptors in relation to the measurements present in the original media prior to the progressive randomization procedure.

8. "PROGRESSIVE RANDOMIZATION PROCESS AND EQUIPMENT FOR MULTIMEDIA ANALYSIS AND REASONING", according to claim 1, characterized wherein the step of inference comprises the use of a machine learning technique to differentiate the set of descriptors collected according to a criterion associated to the resolved problem.

9. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claim 1, characterized by its ability to be used in multimedia content categorization, hidden message detection, content tampering detection, robust watermarking, recognition of the originating media means (e.g. type of camera used in the capture process), identification of counterfeiting of artworks and any other uses in which progressive randomization can be used.

10. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claims 1 and 9, characterized wherein the steps of application in classification of images is more specifically: (1) acquisition of a database of images that represent each class of image needed in the final classification; (2) multi-scale decomposition of the image; (3) randomization of the least significant bits representing a disturbance process of said bits; (4) selection of regions that are rich in detail in the image analyzed; (5) use of statistical descriptors that evaluate the sensitivity of the image to the randomization process; (6) invariance or normalization of the sensitivity measurements collected; (7) inference using a machine learning technique.

11. "PROGRESSIVE RANDOMIZATION PROCESS FOR MULTIMEDIA ANALYSIS AND REASONING", according to claims 1 and 9, characterized wherein the steps of application in hidden message detection in images is more specifically: (1) acquisition of a database of images that represent images without hidden messages; (2) multi-scale decomposition of the image; (3) randomization of the least significant bits representing a disturbance process of said bits and also simulating the presence of hidden messages in the images analyzed; (4) selection of regions that are rich in detail in the image analyzed; (5) use of statistical descriptors that evaluate the sensitivity of the image to the randomization process; (6) invariance or normalization of the sensitivity measurements collected; (7) inference using a machine learning technique.

12. "PROGRESSIVE RANDOMIZATION EQUIPMENT FOR MULTIMEDIA ANALYSIS AND REASONING", characterized by using the method of claims 1 to 11 wherein it can be used in isolation or encompassed within any multimedia equipment.

13. "PROGRESSIVE RANDOMIZATION EQUIPMENT FOR MULTIMEDIA ANALYSIS AND REASONING", characterized wherein the multimedia equipment is more specifically a router, digital cameras, video recorders, video boards, micro controller units, FPGA boards, Digital Signal Processors (DSPs) and any other equipment that can handle multimedia content.