WO2010041785A1 - Method and apparatus for recognizing object in image - Google Patents



Publication number
WO2010041785A1
WO2010041785A1 (PCT/KR2008/006579)
Authority
WO
WIPO (PCT)
Prior art keywords
image
color
cursor
objects
client
Prior art date
Application number
PCT/KR2008/006579
Other languages
French (fr)
Inventor
Dae Woon Lim
Ouk Hyung Kim
Original Assignee
Kogooryeo Media Solution
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kogooryeo Media Solution filed Critical Kogooryeo Media Solution
Publication of WO2010041785A1 publication Critical patent/WO2010041785A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/40 Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/758 Involving statistics of pixels or of feature values, e.g. histogram matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20092 Interactive image processing based on input by user
    • G06T2207/20101 Interactive definition of point of interest, landmark or seed
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/179 Human faces, e.g. facial parts, sketches or expressions; metadata assisted face recognition

Definitions

  • FIG. 3 is a diagram for describing similarity between color distributions of each object in an image illustrated in FIG. 2, according to an embodiment of the present invention.
  • FIG. 3 illustrates color distributions of red, green, blue, hue, and saturation, with respect to four objects (a), (b), (c), and (d).
  • the server extracts red, green, blue, hue, and saturation as information about color distributions corresponding to the object (a).
  • the color distributions of each of the pre-stored objects are searched, so as to check whether an object having a high similarity with the object (a) exists.
  • the color distributions of the objects (a), (b), (c), and (d) are different, and thus when an object is searched for by referring to a similarity between color distributions, the time taken to recognize an object may be minimized.
  • the color distributions are compared after adjusting a size of an image corresponding to the received area information and sizes of images of each pre-stored object so that they are all the same. For example, when a size of an image of the object (a) is different from sizes of images of the pre-stored objects that are to be compared, the sizes are adjusted so that they are all the same, and color distributions of the images having the adjusted sizes are compared.
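The size-normalization and comparison step described above can be sketched as follows. This is a minimal illustration, not the patent's actual implementation; the function names and the choice of histogram intersection as the similarity measure are assumptions:

```python
def resize_nearest(img, w, h):
    # Nearest-neighbor resize of an image given as rows of (r, g, b) tuples,
    # so that both images cover the same number of pixels before comparison.
    src_h, src_w = len(img), len(img[0])
    return [[img[y * src_h // h][x * src_w // w] for x in range(w)]
            for y in range(h)]

def color_histogram(img, bins=8):
    # Per-channel R/G/B histograms, each normalized to sum to 1.
    hist = [[0.0] * bins for _ in range(3)]
    n = len(img) * len(img[0])
    for row in img:
        for px in row:
            for c in range(3):
                hist[c][px[c] * bins // 256] += 1
    return [[v / n for v in channel] for channel in hist]

def histogram_similarity(h1, h2):
    # Histogram intersection averaged over channels; 1.0 means identical
    # color distributions, values near 0 mean dissimilar ones.
    return sum(min(a, b) for c1, c2 in zip(h1, h2)
               for a, b in zip(c1, c2)) / len(h1)
```

In this sketch, both the area received from the client and each pre-stored object image would be passed through resize_nearest to a common size, and their histograms compared with histogram_similarity.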
  • FIG. 4 is a diagram for describing a similarity between color distributions of different objects, according to an embodiment of the present invention.
  • FIG. 4 illustrates color distributions of red, green, blue, hue, and saturation, with respect to three objects (a), (b), and (c).
  • the color distributions of the objects (a) and (b) are clearly different from the color distribution of the object (c). Accordingly, when the object (a) or (b) is recognized, the object (c) is excluded since the color distribution of the object (c) is different from the color distribution of the object (a) or (b). Thus, the time taken to search for similar color distributions may be reduced.
  • objects whose color distributions have a higher similarity are sequentially extracted from the pre-stored objects in operation 104.
  • color distributions, each based on red, green, blue, hue, and saturation, of the object to be compared and of the pre-stored objects are compared, and objects having a higher similarity are sequentially extracted.
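Operation 104 amounts to ordering the pre-stored objects by similarity so that the most promising candidates are matched first. A minimal sketch, with hypothetical names and flat normalized histograms standing in for the full five-channel distributions:

```python
def intersection(h1, h2):
    # Similarity of two normalized histograms (higher = more similar).
    return sum(min(a, b) for a, b in zip(h1, h2))

def rank_by_similarity(query_hist, stored_hists):
    # Return object IDs ordered from most to least similar color
    # distribution, so candidates can be extracted sequentially.
    return sorted(stored_hists,
                  key=lambda oid: intersection(query_hist, stored_hists[oid]),
                  reverse=True)
```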
  • the object on which the cursor is located is recognized in operation 106, by determining whether any one of the sequentially extracted objects matches it.
  • the object on the coordinate of the cursor is recognized.
  • An object is recognized via various algorithms for recognizing an object, along with comparing color distributions as described above.
  • the object may be a static object or a moving object. Examples of a moving object include a face of a person, a car, or clothes.
  • Recognizing an object is generally performed by searching for at least one object in a static image or a moving image from a conventional database.
  • a method of recognizing an object may vary according to the complexity of a background in an image, a ratio of an object to an entire image, and lighting.
  • An object is detected mainly by using one or more algorithms based on the texture, depth, and shape of an image.
  • a method of recognizing an object may be classified according to characteristics of the object, and examples thereof may include a method of using an object template when a background is simple, a method of searching for features of the object, and a method of using symmetry of the object. When an object template is used, a process of classifying the background and the object is simplified, and the object is accurately separated from the background.
  • the object is recognized by changing the size of the entire object or a part of the object template via an algorithm called a rational object model (ROM).
  • a pattern of the object is mainly used.
  • a hierarchical knowledge-based pattern recognition system recognizes a location and a size of an object in low resolution and checks the object in a higher resolution, thereby determining only an object that is verified as an object to be recognized.
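The coarse-to-fine idea in the hierarchical system above can be illustrated with a toy sketch. This is a hypothetical simplification: a real system would average blocks rather than subsample, since naive subsampling can miss targets that fall between coarse samples.

```python
def downsample(img, f):
    # Keep every f-th pixel in each dimension (naive low-resolution view).
    return [row[::f] for row in img[::f]]

def coarse_to_fine(img, target, f=2):
    # Stage 1: search the small, downsampled image for candidate locations.
    # Stage 2: verify only inside the f x f full-resolution block of each
    # candidate, instead of scanning the whole high-resolution image.
    hits = []
    small = downsample(img, f)
    for cy, row in enumerate(small):
        for cx, v in enumerate(row):
            if v == target:
                for y in range(cy * f, min((cy + 1) * f, len(img))):
                    for x in range(cx * f, min((cx + 1) * f, len(img[0]))):
                        if img[y][x] == target:
                            hits.append((y, x))
    return hits
```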
  • features of the object are searched for by comparing a vector obtained by using a Gaussian differential filter with a template vector, preparing a probability model, and then performing graph matching.
  • the features of the object may also be recognized by using a neural network of the kind used to recognize a face. Such a neural network pre-learns the input image by continuously sub-sampling it, and determines whether the input image contains the object to be searched for.
  • a face recognition algorithm is used to recognize an object as a face.
  • the face recognition algorithm is a computer application program that automatically identifies each person from a digital image.
  • the face recognition algorithm is performed by comparing facial features with face information in a face database.
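In its simplest form, comparing facial features with face information in a database is a nearest-neighbor search over feature vectors. The sketch below assumes the feature extraction (e.g. by an Eigenfaces-style method) has already happened; the names and the distance threshold are invented for illustration:

```python
def euclidean(u, v):
    # Euclidean distance between two feature vectors.
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def identify_face(query_features, face_db, threshold=0.5):
    # Return the closest stored identity, or None when no stored face
    # is near enough to the query features.
    best = min(face_db, key=lambda name: euclidean(query_features, face_db[name]))
    return best if euclidean(query_features, face_db[best]) <= threshold else None
```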
  • the face recognition algorithm recognizes a face by standardizing the facial characteristics extracted from an obtained image, such as the features, brightness, or geometry of the face, and comparing the obtained image with an image in a database. Examples of the face recognition algorithm include a geometric method, an Eigenfaces method, a Fisherfaces method, a method based on a support vector machine (SVM), a neural network method, and a Wavelet and Elastic method. Since the face recognition algorithm is well known in the related art, details thereof are omitted herein.
  • a result of recognizing the object on which the cursor is located is transmitted to the client in operation 108.
  • the server transmits such a matching result to the client.
  • When the client receives the matching result, the client indicates, on the provided image, that the object on which the cursor is located has been recognized.
  • the object on which the cursor is located is displayed in such a way that it is distinguished from the other objects.
  • meta information corresponding to the recognized object may be transmitted together with the result. The meta information is periodically updated.
  • Various pieces of meta information are stored according to each object; when an object is a person, information about the person is stored; and when an object is a product, various pieces of information about the product are stored. For example, when the recognized object is an actor, information about the actor is transmitted to the client with the matching result.
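The per-object meta information described above can be modeled as a simple keyed store whose schema varies with the object type. All identifiers and entries below are invented placeholders:

```python
# Hypothetical metadata store: person objects and product objects carry
# different fields, mirroring the type-dependent storage described above.
META_DB = {
    "object_001": {"type": "person", "name": "Example Actor",
                   "filmography": ["Example Film"]},
    "object_002": {"type": "product", "brand": "Example Brand",
                   "price": 19.99},
}

def meta_for(object_id):
    # Look up the metadata to transmit to the client with the matching result.
    return META_DB.get(object_id)
```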
  • FIG. 5 is a diagram for describing each operation of the present invention, performed between a server and a client.
  • a server provides an image to a client
  • the client extracts a color area having colors similar to those of a color area of an object on which a cursor is located from the provided image.
  • Area information about the extracted color area is provided to the server.
  • the server extracts an image corresponding to the area information from the image, and detects a color distribution of the corresponding extracted image. Then, the detected color distribution is compared with color distributions of each object pre-stored in the server to determine similarity, and objects having a high similarity are sequentially extracted.
  • the server transmits the matching result and corresponding meta information to the client.
  • Upon receiving the result and the meta information, the client displays the matching result and the meta information on the image.
  • About 10 ms is taken to detect information about the color areas having colors similar to those of the color area of the object in a real image, and about 35 ms is taken to compare color distributions. Accordingly, about 45 ms is needed to recognize an object, and thus an object can be recognized in a real time image of 30 frames per second.
  • FIG. 6 is a block diagram for describing an apparatus for recognizing an object in an image, according to an embodiment of the present invention.
  • the apparatus includes an interface 220, a memory 230, a comparing and extracting unit 240, an object recognizer 250, and a meta information provider 260.
  • the apparatus is included in a server 200 for providing an image service.
  • the server 200 includes an image provider 210 for providing an image to a client 300.
  • the image provider 210 transmits image content stored in the image provider 210 to the interface 220 according to a request of a user or a predetermined condition.
  • the interface 220 provides an image to the client 300.
  • the client 300 displays the image received from the server 200.
  • a user of the client locates a cursor on an object to be recognized on the image.
  • the client 300 extracts area information about color areas having colors similar to those of a color area of a coordinate of the cursor.
  • the client 300 transmits the extracted area information to the server 200.
  • the interface 220 of the server 200 receives the area information, and transmits the received area information to the comparing and extracting unit 240.
  • the comparing and extracting unit 240 compares the color distributions stored in the memory 230 with the color distribution corresponding to the received area information.
  • the memory 230 pre-stores information about color distributions of each object.
  • the comparing and extracting unit 240 detects an image corresponding to the area information included in the image that was transmitted to the client 300, by referring to the received area information. Then, the comparing and extracting unit 240 detects a color distribution of the detected image. Next, the comparing and extracting unit 240 compares the color distributions of each image stored in the memory 230 and the color distribution of the detected image to determine similarity.
  • the comparing and extracting unit 240 compares at least one of red, green, blue, hue, and saturation so as to compare the color distributions. As illustrated in FIG. 3, the comparing and extracting unit 240 detects red, green, blue, hue, and saturation as information about a color distribution corresponding to a certain object. Then, in order to check whether an object having a color distribution that is highly similar to the color distribution of the certain object exists, the comparing and extracting unit 240 searches for the color distributions of each object stored in the memory 230.
  • the color distributions are compared after changing the size of the image corresponding to the received area information and the sizes of the images of each of the pre-stored objects so that they are all the same.
  • the comparing and extracting unit 240 adjusts the sizes so that they are all the same, and then compares the color distributions.
  • the comparing and extracting unit 240 sequentially extracts the objects having a higher similarity with the object on which the cursor is located.
  • the object recognizer 250 determines whether any one of the sequentially extracted objects matches the object on which the cursor is located, thereby recognizing the object on which the cursor is located.
  • the object recognizer 250 sequentially determines whether each of the extracted objects matches the object on which the cursor is located.
  • the object recognizer 250 outputs a result of recognizing the object on which the cursor is located.
  • the object recognizer 250 recognizes an object by using various algorithms for recognizing an object, along with comparing the color distributions.
  • the object recognizer 250 recognizes an object by combining at least one algorithm based on texture, depth, and shape of an image.
  • the object recognizer 250 uses various algorithms for recognizing an object, according to complexity of background, ratio of an object in the entire image, and lighting.
  • Examples of the algorithm for recognizing an object include a recognition algorithm using an object template, an algorithm for searching for features of an object, and an algorithm using symmetry of an object.
  • the object recognizer 250 uses a face recognition algorithm so as to recognize an object as a face.
  • Examples of the face recognition algorithm include a geometric method, an Eigenfaces method, a Fisherfaces method, a method based on a support vector machine (SVM), a neural network method, and a Wavelet and Elastic method.
  • the object recognizer 250 transmits the result to the meta information provider 260 and the interface 220.
  • the meta information provider 260 extracts meta information corresponding to the recognized object, and transmits the meta information to the interface 220.
  • the meta information provider 260 pre-stores meta information about each object.
  • the meta information is periodically updated.
  • the meta information provider 260 stores various pieces of meta information about each object. In other words, when an object is a person, information about the person is stored, and when an object is a product, various pieces of information about the product are stored. For example, when the recognized object is a certain actor, the meta information provider 260 extracts meta information about the actor, and transmits the meta information to the interface 220.
  • the interface 220 transmits the result received from the object recognizer 250 and the meta information corresponding to the result received from the meta information provider 260 to the client 300.
  • Upon receiving the result and the meta information, the client 300 displays the result and the meta information on the image.
  • an embodiment of the present invention provides a computer readable recording medium for executing a method of recognizing an object in an image, the method including: receiving area information about color areas having a color similar to that of a color area of an object on which a cursor is located, in an image provided to a client; comparing color distributions of each of pre-stored objects and a color distribution corresponding to the received area information to determine similarity; sequentially extracting objects having a color distribution having a higher similarity from among each of the pre-stored objects; and recognizing the object on which the cursor is located, by determining whether any one of the sequentially extracted objects matches the object on which the cursor is located.
  • the embodiments of the present invention can be implemented in general-use digital computers that execute the codes/instructions/programs using a computer readable recording medium.
  • Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the Internet).
  • the computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. Also, functional programs, codes, and code segments for accomplishing the present invention can be easily construed by programmers of ordinary skill in the art to which the present invention pertains.

Abstract

Provided is a method of recognizing an object in an image, the method including: receiving area information about color areas having a color similar to that of a color area of an object on which a cursor is located, in an image provided to a client; comparing color distributions of each of pre-stored objects and a color distribution corresponding to the received area information to determine a similarity therebetween; sequentially extracting objects having a color distribution having higher similarity from among each of the pre-stored objects; and recognizing the object on which the cursor is located, by determining whether any one of the sequentially extracted objects matches the object on which the cursor is located. Accordingly, an object in a real time image can be quickly recognized by minimizing a size of a recognized area and reducing a range for recognizing an object according to a similarity between a color distribution of the object to be recognized and color distributions of pre-stored objects, instead of trying to recognize an object in an entire image.

Description

Description
METHOD AND APPARATUS FOR RECOGNIZING OBJECT IN IMAGE
Technical Field
[1] The present invention relates to recognizing an object in an image, and more particularly, to a technology for quickly recognizing an object according to a similarity between a color distribution of the object and color distributions of pre-stored objects, while minimizing a size of a recognition area of the object in an image provided in real time. Background Art
[2] Conventional systems for recognizing an object in an image generally use a recognition method based on an outline. However, in such a method, the image may be distorted by the external environment, such as lighting, and thus recognition of an object in the image may fail; this can be seen from the different edges detected in the same object when the lighting changes. Accordingly, two images to be compared are analyzed by applying various recognition algorithms, and when the similarity between the two images is high, the two images are determined to contain the same object. When the similarity between the two images is low, it is determined that the objects in the two images are different, and thus the searching mode ends or another image is searched. Since the time for recognizing an object in a stored image is not limited, corresponding objects can be compared very accurately.
[3] However, real time images are continuous static images having 30 frames or more per second. Accordingly, an object in the real time images is recognized in one static frame within an average of 33 ms. Thus, conventional methods cannot be applied to real time images, where speed is important. As time is wasted while searching for and comparing corresponding objects in images, conventional methods cannot be used in a real time search. Disclosure of Invention Technical Problem
[4] The present invention provides a method of recognizing an object in a real time image, wherein an object recognition time is minimized by minimizing a size of a recognized area, and color distributions of an original object and a corresponding object having an adjusted size are compared, instead of trying to recognize an object in an entire image. Technical Solution [5] According to an aspect of the present invention, there is provided a method of recognizing an object in an image, the method including: receiving area information about color areas having a color similar to that of a color area of an object on which a cursor is located, in an image provided to a client; comparing color distributions of each of pre-stored objects and a color distribution corresponding to the received area information to determine a similarity therebetween; sequentially extracting objects having a color distribution having higher similarity from among each of the pre-stored objects; and recognizing the object on which the cursor is located, by determining whether any one of the sequentially extracted objects matches the object on which the cursor is located.
[6] According to another aspect of the present invention, there is provided an apparatus for recognizing an object in an image, the apparatus including: an interface which receives area information about color areas having a color similar to that of a color area of an object on which a cursor is located, in an image provided to a client; a memory which stores information about color distributions of each object; a comparing and extracting unit which compares color distributions stored in the memory and a color distribution corresponding to the received area information to determine similarity, and sequentially extracts objects having a color distribution having a higher similarity; and an object recognizer which recognizes the object on which the cursor is located, by determining whether any one of the sequentially extracted objects matches the object on which the cursor is located. Advantageous Effects
[7] According to the present invention, an object in a real time image can be quickly recognized by minimizing a size of a recognized area and reducing a range for recognizing an object according to a similarity between a color distribution of the object to be recognized and color distributions of pre-stored objects, instead of trying to recognize an object in an entire image. Description of Drawings
[8] FIG. 1 is a flowchart of a method of recognizing an object in an image, according to an embodiment of the present invention;
[9] FIG. 2 is a diagram for describing a process of extracting area information in an image;
[10] FIG. 3 is a diagram for describing a similarity between color distributions of each object in an image illustrated in FIG. 2, according to an embodiment of the present invention;
[11] FIG. 4 is a diagram for describing a similarity between color distributions of different objects, according to an embodiment of the present invention; [12] FIG. 5 is a diagram for describing each operation of the present invention, performed between a server and a client; and
[13] FIG. 6 is a block diagram for describing an apparatus for recognizing an object in an image, according to an embodiment of the present invention. Mode for Invention
[14] Hereinafter, the present invention will be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown.
[15] FIG. 1 is a flowchart of a method of recognizing an object in an image, according to an embodiment of the present invention. The method of FIG. 1 is performed in a server for providing an image.
[16] First, area information about color areas having colors similar to those of a color area of an object on which a cursor is located, in an image provided to a client, is received in operation 100. The server may provide a stored image to the client, or provide an image to the client in real time. When the server provides an image to the client, a user of the client locates the cursor on an object to be recognized in the provided image. When the cursor is located on an object, area information about certain color areas having colors similar to colors on a coordinate of the cursor is extracted based on the coordinate. The extracted area information is transmitted from the client to the server. Accordingly, the server receives the area information.
[17] FIG. 2 is a diagram for describing a process of extracting area information in an image. An image illustrated in FIG. 2 corresponds to the image provided from the server to the client. When the user locates the cursor on a face of a person or on a part of a body, color areas having colors similar to those of a color area of the cursor are classified by lines on the image, based on a coordinate of the cursor. When the cursor is located on the face, coordinate values of certain color areas having a color similar to that of the face are transmitted to the server as area information.
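The cursor-seeded extraction described above can be sketched as a simple region-growing pass over the image. The function names, the Euclidean RGB distance measure, and the tolerance value below are illustrative assumptions, not details taken from the patent.

```python
import numpy as np
from collections import deque

def extract_color_area(image, seed, tol=30.0):
    """Grow a region of pixels whose color is within `tol` (Euclidean
    RGB distance) of the color under the cursor coordinate `seed`.
    Returns a boolean mask marking the extracted color area."""
    h, w, _ = image.shape
    sy, sx = seed
    seed_color = image[sy, sx].astype(float)
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([(sy, sx)])
    mask[sy, sx] = True
    while queue:
        y, x = queue.popleft()
        # 4-connected neighbors within the image bounds
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                if np.linalg.norm(image[ny, nx].astype(float) - seed_color) <= tol:
                    mask[ny, nx] = True
                    queue.append((ny, nx))
    return mask

# Example: a synthetic image with a uniformly colored square on a dark background.
img = np.zeros((20, 20, 3), dtype=np.uint8)
img[5:15, 5:15] = (200, 150, 120)          # the "object" under the cursor
area = extract_color_area(img, seed=(10, 10))
ys, xs = np.nonzero(area)
# The bounding box of the grown mask is the kind of coordinate-based
# area information the client would send to the server.
bbox = (ys.min(), xs.min(), ys.max(), xs.max())
print(bbox)  # → (5, 5, 14, 14)
```

In this sketch the client does only cheap, local work (a flood fill from the cursor), which matches the patent's goal of keeping the recognized area small before the server is involved.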
[18] After operation 100, color distributions of each pre-stored object and a color distribution corresponding to the received area information are compared to determine similarity in operation 102. Upon receiving the area information, the server detects an image corresponding to the area information included in the image that was transmitted to the client, by referring to the area information received from the client. Then, a color distribution of the detected image is detected. Next, the color distribution of the detected image and the color distributions of each of the pre-stored objects are compared to determine similarity.
[19] Specifically, when the similarity is determined, at least one of red, green, blue, hue, and saturation is determined.
[20] FIG. 3 is a diagram for describing similarity between color distributions of each object in an image illustrated in FIG. 2, according to an embodiment of the present invention. FIG. 3 illustrates color distributions of red, green, blue, hue, and saturation, with respect to four objects (a), (b), (c), and (d). When the client locates the cursor on the object (a) and corresponding area information is transmitted to the server, the server extracts red, green, blue, hue, and saturation as information about color distributions corresponding to the object (a). Then, the color distributions of each of the pre-stored objects are searched for, so as to check whether an object having a high similarity with the object (a) exists. As illustrated in FIG. 3, the color distributions of the objects (a), (b), (c), and (d) are different, and thus when an object is searched for by referring to a similarity between color distributions, the time taken to recognize an object may be minimized.
[21] When color distributions are compared, the color distributions are compared after adjusting a size of an image corresponding to the received area information and sizes of images of each pre-stored object so that they are all the same. For example, when a size of an image of the object (a) is different from sizes of images of the pre-stored objects that are to be compared, the sizes are adjusted so that they are all the same, and color distributions of the images having the adjusted sizes are compared.
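The resize-then-compare step above might look like the following sketch. The 32-bin per-channel histograms, the nearest-neighbor resize, and the histogram-intersection similarity are assumptions chosen for illustration; the patent does not specify these details.

```python
import numpy as np

def color_histogram(image, bins=32):
    """Concatenated, normalized per-channel histograms (R, G, B)."""
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def nearest_resize(image, size):
    """Nearest-neighbor resize so both images share the same size
    before their color distributions are compared."""
    h, w = size
    ys = np.arange(h) * image.shape[0] // h
    xs = np.arange(w) * image.shape[1] // w
    return image[ys][:, xs]

def similarity(img_a, img_b, size=(64, 64)):
    """Histogram-intersection similarity in [0, 1] after resizing."""
    ha = color_histogram(nearest_resize(img_a, size))
    hb = color_histogram(nearest_resize(img_b, size))
    return float(np.minimum(ha, hb).sum())

a = np.full((40, 30, 3), (200, 60, 60), dtype=np.uint8)   # reddish object
b = np.full((80, 90, 3), (200, 60, 60), dtype=np.uint8)   # same color, other size
c = np.full((50, 50, 3), (60, 60, 200), dtype=np.uint8)   # bluish object
print(round(similarity(a, b), 6))           # → 1.0 (same distribution after resizing)
print(similarity(a, c) < similarity(a, b))  # → True
```

Because both images are normalized to a common size first, the comparison is unaffected by the object appearing at different scales in the cursor area and in the stored image, which is the point of the size adjustment described above.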
[22] FIG. 4 is a diagram for describing a similarity between color distributions of different objects, according to an embodiment of the present invention. FIG. 4 illustrates color distributions of red, green, blue, hue, and saturation, with respect to three objects (a), (b), and (c). As illustrated in FIG. 4, the color distributions of the objects (a) and (b) are clearly different from the color distribution of the object (c). Accordingly, when the object (a) or (b) is recognized, the object (c) is excluded since the color distribution of the object (c) is different from that of the object (a) or (b). Thus, the time taken to search for similar color distributions may be reduced.
[23] After operation 102, objects having color distributions having a higher similarity are sequentially extracted from the pre-stored objects in operation 104. As illustrated in FIG. 3 or 4, when an object to be recognized is searched for, color distributions, each based on red, green, blue, hue, and saturation, of the object to be compared and of the pre-stored objects are compared, and objects having a higher similarity are sequentially extracted.
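The candidate-narrowing step in operation 104 can be sketched as a similarity-ranked search: pre-stored distributions are sorted by similarity to the query so that the costlier matching step tries the likeliest objects first, and dissimilar objects are excluded entirely. The object names, toy three-bin histograms, and the 0.5 cut-off below are illustrative assumptions.

```python
import numpy as np

def rank_candidates(query_hist, stored, threshold=0.5):
    """Return object ids ordered from most to least similar, keeping
    only those whose histogram-intersection similarity exceeds the
    threshold -- clearly dissimilar objects are dropped from the search."""
    scores = {oid: float(np.minimum(query_hist, h).sum())
              for oid, h in stored.items()}
    return [oid for oid, s in sorted(scores.items(),
                                     key=lambda kv: kv[1], reverse=True)
            if s > threshold]

stored = {
    "actor_a": np.array([0.7, 0.2, 0.1]),
    "actor_b": np.array([0.1, 0.2, 0.7]),
    "car_c":   np.array([0.5, 0.4, 0.1]),
}
query = np.array([0.65, 0.25, 0.10])
print(rank_candidates(query, stored))  # → ['actor_a', 'car_c']
```

Here "actor_b" never reaches the matching stage at all, mirroring how the patent excludes objects such as object (c) in FIG. 4 whose distributions are clearly different.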
[24] Then, the object on which the cursor is located is recognized in operation 106 by determining whether one of the sequentially extracted objects matches it. In operation 106, it is determined whether the extracted objects match the object on which the cursor is located. When any one of the objects pre-stored in the server and the object on the coordinate of the cursor are identical, the object on the coordinate of the cursor is recognized. An object is recognized via various algorithms for recognizing an object, along with comparing color distributions as described above. In the present invention, the object may be a static object or a moving object. Examples of a moving object include a face of a person, a car, or clothes. Recognizing an object is generally performed by searching for at least one object in a static image or a moving image from a conventional database. A method of recognizing an object may vary according to the complexity of a background in an image, the ratio of an object to an entire image, and lighting. An object is detected mainly by using a combination of at least one algorithm based on texture, depth, and shape of an image. Methods of recognizing an object may be classified according to characteristics of the object, and examples thereof include a method of using an object template when a background is simple, a method of searching for features of the object, and a method of using symmetry of the object. When an object template is used, the process of classifying the background and the object is simplified, and the object is accurately separated from the background. Accordingly, the object is recognized by changing the size of the entire object or a part of the object template via an algorithm called a rational object model (ROM). In the method of searching for features of the object, a pattern of the object is mainly used.
A hierarchical knowledge-based pattern recognition system recognizes the location and size of an object at low resolution and verifies the object at a higher resolution, thereby retaining only an object that is verified as the object to be recognized. Here, features of the object are searched for by comparing a vector obtained by using a Gaussian differential filter with a template vector, preparing a probability model, and then performing graph matching. Also, the features of the object may be recognized by using a neural network of the kind used to recognize a face. Whether an input image contains the object to be searched for is determined via a neural network that has pre-learned from input images by continuously sub-sampling them.
[25] A face recognition algorithm is used to recognize an object as a face. The face recognition algorithm is a computer-supported application program that automatically identifies each person via a digital image. The face recognition algorithm is performed by comparing facial features with face information in a face database. The face recognition algorithm recognizes a face by standardizing facial characteristics extracted from an obtained image, such as the features, brightness, or geometry of a face, and comparing the obtained image with an image in a database. Examples of the face recognition algorithm include a geometric method, an Eigenfaces method, a Fisherfaces method, a method based on a support vector machine (SVM), a neural network method, and a Wavelet and Elastic method. Since the face recognition algorithm is well known in the related art, details thereof are omitted herein.
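As a toy illustration of one of the face-recognition methods named above, the Eigenfaces approach projects faces onto the principal components of a training set and matches by nearest projection. The synthetic 8x8 "faces" and all function names below are assumptions for the sketch; a real system would train on an actual face database.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_eigenfaces(faces, k=2):
    """faces: (n, d) matrix, one flattened face per row.
    Returns the mean face, the top-k eigenfaces, and the
    training projections (weights)."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD yields the principal axes without forming the d x d covariance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                      # top-k eigenfaces
    weights = centered @ basis.T        # training projections
    return mean, basis, weights

def identify(face, mean, basis, weights):
    """Index of the training face whose projection is nearest."""
    w = (face - mean) @ basis.T
    return int(np.argmin(np.linalg.norm(weights - w, axis=1)))

# Two synthetic identities plus small per-image noise, flattened to vectors.
base_a = rng.normal(size=64)
base_b = rng.normal(size=64)
train = np.stack([base_a + 0.05 * rng.normal(size=64),
                  base_b + 0.05 * rng.normal(size=64)])
mean, basis, weights = train_eigenfaces(train, k=1)
probe = base_a + 0.05 * rng.normal(size=64)   # a new image of identity A
print(identify(probe, mean, basis, weights))  # → 0
```

The projection step is what makes the search cheap: each stored face is reduced to a few weights, and matching compares those weights rather than full images.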
[26] After operation 106, a result of recognizing the object on which the cursor is located is transmitted to the client in operation 108. When one of the objects pre-stored in the server matches the object on which the cursor is located, the server transmits such a matching result to the client. When the client receives the matching result, it indicates on the image provided to the client that the object on which the cursor is located has been recognized. The object on which the cursor is located is displayed in such a way that it is distinguished from other objects. Meanwhile, when the result is transmitted, meta information corresponding to the recognized object may be transmitted together with the result. The meta information is periodically updated. Various pieces of meta information are stored according to each object: when an object is a person, information about the person is stored, and when an object is a product, various pieces of information about the product are stored. For example, when the recognized object is an actor, information about the actor is transmitted to the client with the matching result.
[27] FIG. 5 is a diagram for describing each operation of the present invention, performed between a server and a client. As illustrated in FIG. 5, when a server provides an image to a client, the client extracts color areas having colors similar to those of a color area of an object on which a cursor is located from the provided image. Area information about the extracted color areas is provided to the server. The server extracts an image corresponding to the area information from the image, and detects a color distribution of the extracted image. Then, the detected color distribution is compared with the color distributions of each object pre-stored in the server to determine similarity, and objects having a high similarity are sequentially extracted. Then, one of the sequentially extracted objects that matches the object on which the cursor is located is determined as the object to be recognized. When the object is recognized, the server transmits the matching result and corresponding meta information to the client. Upon receiving the result and the meta information, the client displays them on the image.
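The client/server exchange in FIG. 5 can be condensed into a single-process sketch. The message shapes, the object database contents, and the names ("actor_kim", "car_x") are illustrative assumptions; the toy three-bin histograms stand in for full color distributions.

```python
# Hypothetical server-side object database: one color distribution and
# one meta-information record per pre-stored object.
OBJECT_DB = {
    "actor_kim": {"hist": [0.7, 0.2, 0.1], "meta": "actor profile"},
    "car_x":     {"hist": [0.1, 0.2, 0.7], "meta": "product details"},
}

def intersection(a, b):
    """Histogram-intersection similarity of two distributions."""
    return sum(min(x, y) for x, y in zip(a, b))

def server_handle(area_info):
    """Server side: rank stored objects against the reported color
    distribution and return the best match with its meta information."""
    best = max(OBJECT_DB,
               key=lambda k: intersection(OBJECT_DB[k]["hist"],
                                          area_info["hist"]))
    return {"object": best, "meta": OBJECT_DB[best]["meta"]}

# Client side: the cursor's color area, reduced to a bounding box and a
# color distribution, is sent as the area information.
request = {"bbox": (5, 5, 14, 14), "hist": [0.65, 0.25, 0.10]}
response = server_handle(request)
print(response["object"])  # → 'actor_kim'
```

A production system would additionally run the full matching algorithms of operation 106 on the ranked candidates before replying; the sketch shows only the message flow.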
[28] In one example, about 10 ms is taken to detect information about the color areas having colors similar to those of the color area of the object in a real image, and about 35 ms is taken to compare color distributions. Accordingly, about 45 ms is needed to recognize an object in a real-time image, and thus an object can be recognized every second even in an image of 30 FPS.
[29] An apparatus for recognizing an object in an image will now be described with reference to FIG. 6.
[30] FIG. 6 is a block diagram for describing an apparatus for recognizing an object in an image, according to an embodiment of the present invention. Referring to FIG. 6, the apparatus includes an interface 220, a memory 230, a comparing and extracting unit 240, an object recognizer 250, and a meta information provider 260.
[31] The apparatus is included in a server 200 for providing an image service. The server 200 includes an image provider 210 for providing an image to a client 300.
[32] The image provider 210 transmits image content stored in the image provider 210 to the interface 220 according to a request of a user or a predetermined condition.
[33] The interface 220 provides an image to the client 300.
[34] The client 300 displays the image received from the server 200. When the image is displayed on the client 300, a user of the client 300 locates a cursor on an object to be recognized on the image. Then, the client 300 extracts area information about color areas having colors similar to those of a color area of a coordinate of the cursor. The client 300 transmits the extracted area information to the server 200.
[35] The interface 220 of the server 200 receives the area information, and transmits the received area information to the comparing and extracting unit 240.
[36] The comparing and extracting unit 240 compares color distributions stored in the memory 230 and a color distribution corresponding to the received area information to determine similarity, and sequentially extracts objects having a color distribution having a higher similarity. Here, the memory 230 pre-stores information about color distributions of each object.
[37] The comparing and extracting unit 240 detects an image corresponding to the area information included in the image that was transmitted to the client 300, by referring to the received area information. Then, the comparing and extracting unit 240 detects a color distribution of the detected image. Next, the comparing and extracting unit 240 compares the color distributions of each image stored in the memory 230 and the color distribution of the detected image to determine similarity.
[38] The comparing and extracting unit 240 compares at least one of red, green, blue, hue, and saturation so as to compare the color distributions. As illustrated in FIG. 3, the comparing and extracting unit 240 detects red, green, blue, hue, and saturation as information about a color distribution corresponding to a certain object. Then, in order to check whether an object having a color distribution that is highly similar to the color distribution of the certain object exists, the comparing and extracting unit 240 searches for the color distributions of each object stored in the memory 230.
[39] The color distributions are compared after changing a size of the image corresponding to the received area information and sizes of images of each of the pre-stored objects so that they are all the same. When an image size of the object in an original image is different from image sizes of pre-stored objects to be compared, the comparing and extracting unit 240 adjusts the sizes so that they are all the same, and then compares the color distributions.
[40] The comparing and extracting unit 240 sequentially extracts the objects having a higher similarity with the object on which the cursor is located.
[41] The object recognizer 250 determines whether any one of the sequentially extracted objects matches the object on which the cursor is located, thereby recognizing the object on which the cursor is located. The object recognizer 250 sequentially determines whether each of the extracted objects matches the object on which the cursor is located. When any one of the objects stored in the memory 230 matches the object on which the cursor is located, the object recognizer 250 outputs a result of recognizing the object on which the cursor is located. The object recognizer 250 recognizes an object by using various algorithms for recognizing an object, along with comparing the color distributions. The object recognizer 250 recognizes an object by combining at least one algorithm based on texture, depth, and shape of an image. The object recognizer 250 uses various algorithms for recognizing an object, according to the complexity of the background, the ratio of the object to the entire image, and lighting. Examples of the algorithm for recognizing an object include a recognition algorithm using an object template, an algorithm for searching for features of an object, and an algorithm using symmetry of an object. Specifically, the object recognizer 250 uses a face recognition algorithm so as to recognize an object as a face. Examples of the face recognition algorithm include a geometric method, an Eigenfaces method, a Fisherfaces method, a method based on a support vector machine (SVM), a neural network method, and a Wavelet and Elastic method.
[42] The object recognizer 250 transmits the result to the meta information provider 260 and the interface 220.
[43] The meta information provider 260 extracts meta information corresponding to the recognized object, and transmits the meta information to the interface 220. The meta information provider 260 pre-stores meta information about each object. The meta information is periodically updated. The meta information provider 260 stores various pieces of meta information about each object. In other words, when an object is a person, information about the person is stored, and when an object is a product, various pieces of information about the product are stored. For example, when the recognized object is a certain actor, the meta information provider 260 extracts meta information about the actor, and transmits the meta information to the interface 220.
[44] The interface 220 transmits the result received from the object recognizer 250 and the meta information corresponding to the result received from the meta information provider 260 to the client 300.
[45] Upon receiving the result and the meta information, the client 300 displays the result and the meta information on the image.
[46] Meanwhile, the above method may be realized as computer readable code/instructions/programs. In other words, an embodiment of the present invention provides a computer readable recording medium for executing a method of recognizing an object in an image, the method including: receiving area information about color areas having a color similar to that of a color area of an object on which a cursor is located, in an image provided to a client; comparing color distributions of each of pre-stored objects and a color distribution corresponding to the received area information to determine similarity; sequentially extracting objects having a color distribution having a higher similarity from among each of the pre-stored objects; and recognizing the object on which the cursor is located, by determining whether any one of the sequentially extracted objects matches the object on which the cursor is located.
[47] The embodiments of the present invention can be implemented in general-use digital computers that execute the codes/instructions/programs using a computer readable recording medium. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the Internet). The computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. Also, functional programs, codes, and code segments for accomplishing the present invention can be easily construed by programmers of ordinary skill in the art to which the present invention pertains.
[48] While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.

Claims
[1] 1. A method of recognizing an object in an image, the method comprising: receiving area information about color areas having a color similar to that of a color area of an object on which a cursor is located, in an image provided to a client; comparing color distributions of each of pre-stored objects and a color distribution corresponding to the received area information to determine similarity; sequentially extracting objects having a color distribution having a higher similarity from among each of the pre-stored objects; and recognizing the object on which the cursor is located, by determining whether any one of the sequentially extracted objects matches the object on which the cursor is located.
[2] 2. The method of claim 1, wherein the image is any one of an image that is provided in real time and a stored image.
[3] 3. The method of claim 1, wherein in the comparing, the color distributions are compared by adjusting a size of an image corresponding to the received area information and a size of an image of each of the pre-stored objects so that they are the same.
[4] 4. The method of claim 1, wherein in the comparing, at least one of red, green, blue, hue, and saturation is compared as the color distributions.
[5] 5. The method of claim 1, wherein the method is used to recognize a moving object.
[6] 6. The method of claim 5, wherein the moving object comprises a face of a person and moving things.
[7] 7. The method of claim 1, further comprising transmitting a result of recognizing the object on which the cursor is located to the client.
[8] 8. The method of claim 7, wherein when the result is transmitted to the client, meta information corresponding to the recognized object is transmitted as well.
[9] 9. An apparatus for recognizing an object in an image, the apparatus comprising: an interface which receives area information about color areas having a color similar to that of a color area of an object on which a cursor is located, in an image provided to a client; a memory which stores information about color distributions of each object; a comparing and extracting unit which compares color distributions stored in the memory and a color distribution corresponding to the received area information to determine similarity, and sequentially extracts objects having a color distribution having a higher similarity; and an object recognizer which recognizes the object on which the cursor is located, by determining whether any one of the sequentially extracted objects matches the object on which the cursor is located.
[10] 10. The apparatus of claim 9, wherein the image is any one of an image provided in real time or a stored image.
[11] 11. The apparatus of claim 9, wherein the comparing and extracting unit compares the color distributions after adjusting a size of an image corresponding to the received area information and a size of an image of each stored object so that they are the same.
[12] 12. The apparatus of claim 9, wherein the comparing and extracting unit compares at least one of red, green, blue, hue, and saturation, as the color distribution.
[13] 13. The apparatus of claim 9, wherein the apparatus is used to recognize a moving object.
[14] 14. The apparatus of claim 13, wherein the moving object comprises a face of a person and moving things.
[15] 15. The apparatus of claim 9, wherein the interface transmits a result of recognizing the object on which the cursor is located to the client.
[16] 16. The apparatus of claim 15, further comprising a meta information provider which provides meta information corresponding to the recognized object, while transmitting the result to the client.
[17] 17. The apparatus of claim 9, wherein the apparatus is provided in a server for providing an image service.
PCT/KR2008/006579 2008-10-07 2008-11-07 Method and apparatus for recognizing object in image WO2010041785A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR20080098171 2008-10-07
KR10-2008-0098171 2008-10-07

Publications (1)

Publication Number Publication Date
WO2010041785A1 true WO2010041785A1 (en) 2010-04-15

Family

ID=42100725

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2008/006579 WO2010041785A1 (en) 2008-10-07 2008-11-07 Method and apparatus for recognizing object in image

Country Status (1)

Country Link
WO (1) WO2010041785A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000002111A2 (en) * 1998-07-06 2000-01-13 Koninklijke Philips Electronics N.V. Color quantization and similarity measure for content based image retrieval
JP2000268044A (en) * 1999-03-16 2000-09-29 Canon Inc Picture processing method, storage medium and picture processor
KR20020041898A (en) * 2000-11-29 2002-06-05 구자홍 Method to search picture data
KR20030029410A (en) * 2001-10-08 2003-04-14 한국전자통신연구원 System for searching image data being based on web and method thereof



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08877298

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02/08/11)

122 Ep: pct application non-entry in european phase

Ref document number: 08877298

Country of ref document: EP

Kind code of ref document: A1