KR101606760B1 - Apparatus and Method of Transforming Emotion of Image based on Object in Image - Google Patents
- Publication number
- KR101606760B1 (Application No. KR1020150104116A)
- Authority
- KR
- South Korea
- Prior art keywords
- information
- image
- target
- candidate
- emotion
- Prior art date
Classifications
- G06T7/0044
- G06K9/6261
- G06T7/0059
- G06T7/408
- H04N5/232
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/64—Circuits for processing colour signals
- H04N9/74—Circuits for processing colour signals for obtaining special effects
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to an apparatus and a method for converting emotions appearing in an image.
An object of the present invention is to convert an image so that it better expresses the target emotion desired by the user, based on the fact that the emotion a person feels from a photograph is most strongly affected by the objects it contains, by using information about those objects together with the color information of the photograph. To this end, the image emotion conversion apparatus according to the present invention comprises: an object recognition unit that receives an emotion conversion target image, extracts a target object to be emotion-converted from the target image, generates target object information including information on the position or area of the target object within the target image, recognizes the target object, and generates semantic information, which is information on the meaning of the target object; a candidate selection unit that receives target emotion information indicating the target emotion into which the target image is to be converted, compares at least one candidate object with the target object based on the semantic information and emotion information, and selects a candidate object according to the comparison result; and an image conversion unit that, using the candidate object information of the selected candidate object, converts the image signal of the region within the target image corresponding to the target object, thereby generating a converted image in which the target image expresses the emotion corresponding to the target emotion information.
Description
The present invention relates to an apparatus and a method for converting emotions appearing in an image.
Various image conversion technologies exist that process image signals on smart devices such as smartphones and tablets, or on computers, to give specific effects to images. In particular, as the use of social network services has expanded, applications have been developed with which a user can take a picture with a smart device or download one from the web, apply a desired image effect, and store or retransmit the picture.
Commonly used image processing techniques for applying a specific effect include adjusting the saturation and brightness of an image, or applying filters that blur or sharpen it. Beyond these, there are also methods that convert images using preset image filters or color palettes so as to reflect a certain emotion. For example, if the user wishes to make the mood of a specific image more pleasant or more unpleasant, existing image emotion conversion methods change the color, lightness, and so on of the image to reflect that emotion information.
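The whole-image color adjustments described here can be sketched in a few lines. This is a toy illustration of the conventional approach (the function name and parameters are ours, and RGB values in [0, 1] are assumed), not code from the patent:

```python
import colorsys

def shift_mood(pixels, sat_scale=1.2, light_shift=0.05):
    """Scale saturation and shift lightness of RGB pixels in [0, 1].
    A toy stand-in for the whole-image 'mood' filters described in
    the text, not the patented per-object method."""
    out = []
    for r, g, b in pixels:
        h, l, s = colorsys.rgb_to_hls(r, g, b)
        l = min(1.0, max(0.0, l + light_shift))   # brighten slightly
        s = min(1.0, max(0.0, s * sat_scale))     # boost saturation
        out.append(colorsys.hls_to_rgb(h, l, s))
    return out
```

Because such a filter touches every pixel uniformly, it cannot treat the objects in the image differently — which is exactly the limitation the following paragraph points out.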
However, existing emotion conversion methods express effects for specific emotion information by converting only the overall color information of an image; they cannot take into account the objects contained in the image, which is a significant limitation.
Korean Patent Laid-Open Publication No. 2014-0037373 (2014. 03. 27)
The problem to be solved by the present invention is to provide a method, and an apparatus associated therewith, that converts an image so that it better expresses the target emotion desired by the user, by using information about the objects in a photograph together with its color information, based on the fact that the emotion a person feels is most strongly influenced by the objects the photograph contains.
According to one aspect of the present invention, an image emotion conversion apparatus comprises: an object recognition unit that receives an emotion conversion target image, extracts a target object to be emotion-converted from the target image, generates target object information including information on the position or area of the target object within the target image, recognizes the target object, and generates semantic information, which is information on the meaning of the target object; a candidate selection unit that receives target emotion information indicating the target emotion into which the target image is to be converted, compares at least one candidate object with the target object based on the semantic information and emotion information, and selects a candidate object according to the comparison result; and an image conversion unit that converts, using the candidate object information of the selected candidate object, the image signal of the corresponding region in the target image, thereby generating an image in which the target image is transformed to express the emotion corresponding to the target emotion information.
Here, the object recognition unit may analyze the image signal of the pixels included in the target image region corresponding to the extracted target object to extract a predetermined feature.
The image emotion conversion apparatus may further include an emotion information extraction unit that extracts emotion information of the target object by inputting the semantic information and the extracted feature of the target object into a previously learned emotion information classifier, and determines emotion information of the target image based on the extracted emotion information of the target object.
Here, the object recognition unit may perform image segmentation on the target image to divide it into a plurality of regions, select some of the divided regions to extract the target image region corresponding to the target object, and generate the target object information including information on the position or area of the target object within the target image.
Here, the object recognition unit may determine, using the image signal of the target image or the extracted feature, which of the predetermined object classes the target object corresponds to, and generate the semantic information according to the determined class of the object.
Here, the object recognition unit may include: an image segmentation unit that divides the target image into a plurality of regions by performing image segmentation and extracts the target image region corresponding to the target object by selecting some of the divided regions; a feature extraction unit that extracts a predetermined feature by analyzing the image signal of the pixels included in the target image region corresponding to the target object; and a semantic information generation unit that determines which of the predetermined object classes the target object corresponds to and generates the semantic information according to the determined class.
Here, the candidate objects are stored in a candidate database, together with, for each candidate object, its semantic information, which is information on the meaning of the candidate object, and emotion information set in advance for it. The candidate selection unit compares the semantic information and emotion information of the candidate objects stored in the candidate database with the semantic information of the target object generated by the object recognition unit and the target emotion information, and selects the candidate object matching them according to the comparison result.
Here, the candidate selection unit may calculate the similarity between the semantic information and emotion information of the candidate objects stored in the candidate database and the semantic information of the target object generated by the object recognition unit and the target emotion information, and select the candidate object based on the calculated similarity.
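The patent does not fix a particular similarity measure. As one hedged sketch, if the semantic and emotion information of each object were encoded as a numeric feature vector, cosine similarity could rank the candidates (the vector encoding and all names here are assumptions):

```python
import math

def cosine_similarity(u, v):
    """One plausible similarity measure for comparing a candidate's
    semantic/emotion feature vector with the target's."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pick_most_similar(target_vec, candidates):
    # Each candidate carries a precomputed feature vector under "vec".
    return max(candidates, key=lambda c: cosine_similarity(target_vec, c["vec"]))
```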
Here, the object recognition unit may extract the target image region corresponding to the target object and use information on the extracted region as position information of the target object, generating the target object information including that position information. The candidate object information stored in the candidate database may include, for each candidate object, its position information within its image. The candidate selection unit then compares the semantic information, emotion information, and candidate object information stored in the candidate database with the semantic information of the target object generated by the object recognition unit, the target emotion information, and the target object information, and selects the candidate object according to the comparison result.
Here, the object recognition unit may extract the target image region corresponding to the target object and use the texture information of the extracted region as texture information of the target object, generating the target object information including that texture information. The candidate object information stored in the candidate database may include, for each candidate object, its texture information. The candidate selection unit then compares the semantic information, emotion information, and candidate object information of the candidate objects stored in the candidate database with the semantic information of the target object generated by the object recognition unit, the target emotion information, and the target object information, and selects the candidate object according to the comparison result.
Here, the candidate object information may include color information of the candidate object, and the image conversion unit may include a color conversion unit that, using the color information included in the candidate object information of the candidate object selected by the candidate selection unit, converts the image signal so that the color distribution of the image signal of the target image region corresponding to the target object matches the color information of the candidate object, thereby generating the converted image.
Here, the candidate object information may include position information of the candidate object within its image, and the image conversion unit may include a position conversion unit that, using the in-image position information included in the candidate object information of the selected candidate object, performs an image transformation that moves the target object within the target image so that its position corresponds to the in-image position information of the candidate object, thereby generating the converted image.
Here, the candidate object information may include texture information of the candidate object, and the image conversion unit may include a texture conversion unit that, using the texture information included in the candidate object information of the selected candidate object, converts the image signal of the target image region corresponding to the target object so that the texture of the target object corresponds to the texture information of the candidate object, thereby generating the converted image.
Here, the image conversion unit may include an object addition unit that searches a word database for a related word corresponding to the semantic information of the target object, searches an image database for an image patch corresponding to the retrieved related word, and adds the image patch to the target image to generate the converted image.
Here, the object recognition unit may extract a predetermined feature, including a color feature or a texture feature, from the region in the target image corresponding to the target object. The emotion information classifier used by the emotion information extraction unit may be a classifier in which the parameters of a classification function, which takes the semantic information and the feature as input and outputs emotion information, are set by learning from a plurality of training data in which semantic information, features, and emotion information are given in advance. The emotion information extraction unit may extract the emotion information of the target object by inputting the semantic information of the target object and the feature, including the color feature or texture feature, into the classification function of the emotion information classifier.
The emotion information classifier may be a classifier whose classification-function parameters are set by learning the training data using a linear regression model or a support-vector-based regression model.
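As a minimal illustration of the regression-based learning mentioned here, the following fits ordinary least squares for a single scalar feature. A real system would regress multi-dimensional color/texture features, for example with a support vector regressor; the feature and the scores below are hypothetical:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b with one scalar feature.
    A stand-in for the emotion-score regressor, for illustration only."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

# Hypothetical training pairs: mean saturation -> pleasantness score.
xs = [0.1, 0.4, 0.7, 0.9]
ys = [0.2, 0.5, 0.8, 1.0]
a, b = fit_linear(xs, ys)
```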
According to another aspect of the present invention, an image emotion conversion method includes: an object recognition step of receiving an emotion conversion target image, extracting a target object to be emotion-converted from the target image, generating target object information including information on the position or area of the target object within the target image, recognizing the target object, and generating semantic information, which is information on the meaning of the target object; a candidate selection step of comparing at least one candidate object with the target object based on the semantic information and emotion information and selecting a candidate object according to the comparison result; and an image conversion step of converting, using the candidate object information of the selected candidate object, the image signal of the region within the target image corresponding to the target object, thereby generating an image in which the target image is converted to express the emotion corresponding to the target emotion information.
The object recognition step may further include extracting a predetermined feature by analyzing a video signal of pixels included in the target image area corresponding to the extracted target object.
Here, the image emotion conversion method may include inputting the semantic information and the characteristic of the extracted target object to a previously learned emotion information classifier, extracting emotion information of the target object, and based on the extracted emotion information of the target object And an emotion information extracting step of determining emotion information of the target image.
Here, the candidate objects are stored in a candidate database, together with, for each candidate object, its semantic information, which is information on the meaning of the candidate object, and the emotion information set in advance for it. The candidate selection step compares the semantic information and emotion information of the candidate objects stored in the candidate database with the semantic information of the target object generated in the object recognition step and the target emotion information, and selects the matching candidate object according to the comparison result.
Here, the candidate object information may include color information of the candidate object, and the image conversion step may include a color conversion step of converting the image signal, using the color information included in the candidate object information of the candidate object selected in the candidate selection step, so that the color distribution of the image signal of the target image region corresponding to the target object matches the color information of the candidate object, thereby generating the converted image.
The image conversion step may include an object addition step of searching a word database for a related word corresponding to the semantic information of the target object, searching an image database for an image patch corresponding to the retrieved related word, and adding the image patch to the target image to generate the converted image.
According to another aspect of the present invention, an image emotion conversion program may be a computer program stored in a medium for executing the image emotion conversion method in combination with a computer.
According to the present invention, an image can be converted so as to better express the target emotion desired by the user, by using information about the objects in a photograph together with its color information, based on the fact that the emotion a person feels is most strongly influenced by the objects contained in the photograph.
FIG. 1 is a block diagram of an image emotion conversion apparatus according to an exemplary embodiment of the present invention.
FIG. 2 is a block diagram of an image emotion conversion apparatus further including an emotion information extraction unit.
FIG. 3 is a detailed block diagram of the object recognition unit.
FIG. 4 is a block diagram for explaining the operation of the candidate selection unit.
FIG. 5 is a block diagram for explaining the operation of the candidate selection unit in the case of selecting candidate objects by further using object information in addition to emotion information and semantic information.
FIG. 6 is a reference diagram for explaining the operation of the image emotion conversion apparatus according to the present invention.
FIG. 7 is a detailed block diagram of the image conversion unit.
FIG. 8 is a block diagram for explaining the operation of the emotion information extraction unit.
FIG. 9 is a flowchart of an image emotion conversion method according to the present invention.
FIG. 10 is a flowchart of an image emotion conversion method further including an emotion information extraction step.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the drawings, the same reference numerals are used to designate the same or similar components throughout the drawings. In the following description of the present invention, a detailed description of known functions and configurations incorporated herein will be omitted when it may make the subject matter of the present invention rather unclear. In addition, the preferred embodiments of the present invention will be described below, but it is needless to say that the technical idea of the present invention is not limited thereto and can be variously modified by those skilled in the art.
Various image conversion technologies exist that process image signals on smart devices such as smartphones and tablets, or on computers, to give specific effects to images. In particular, as the use of social network services has expanded, applications have been developed with which a user can take a picture with a smart device or download one from the web, apply a desired image effect, and store or retransmit the picture. Traditionally, image processing techniques for applying a specific effect have generally adjusted the saturation and brightness of the image, or applied blurring or sharpening effects to it.
Meanwhile, people tend to communicate their feelings to others by shooting images and posting or delivering them, and many existing studies have accordingly attempted to recognize, from an image, the emotion information it evokes. These studies have focused primarily on which combinations of features to use for extracting emotion from images. However, since a person perceives the objects contained in an image and interprets information about them, the emotion an image evokes is strongly influenced by the meaning of the objects present in it. Existing studies have concentrated only on finding more efficient feature combinations, and attempts to extract emotion from images using the semantic information of the objects they contain have been neglected.
Meanwhile, image processing methods have been proposed for converting the emotion of an image so that it expresses the emotion desired by the user. That is, there are methods that convert images using a preset image filter or a color palette so as to reflect a particular emotion more intuitively. For example, if the user wishes to make the mood of a specific image more pleasant or more unpleasant, existing image emotion conversion methods change the color, lightness, and so on of the image to reflect that emotion information. However, these methods express effects for specific emotion information by converting only the overall color information of an image; they cannot take the objects contained in the image into account, which is a significant limitation.
The present invention pays attention to the fact that the emotion felt by a person viewing a photograph is most strongly affected by the objects it contains, and provides a method, and an apparatus associated therewith, for converting an image so that it better expresses the target emotion, by using information about those objects together with the color information of the photograph.
The image emotion conversion method according to the present invention does not perform image signal processing for emotion conversion on the entire image, as conventional image emotion conversion methods do, but recognizes the objects in the image and performs image signal processing based on them. More specifically, the method extracts semantic information, which is information about the meaning of an object present in the image, together with features derived from the image signal, and extracts the emotion of the image based on them. To this end, the present invention uses an emotion information classifier built by learning training data in which the semantic information, features, and emotion information of objects contained in images are given in advance.
In addition, the image emotion conversion method according to the present invention extracts an object from the target image, searches a candidate database for a candidate object matching the semantic information of the extracted object and the conversion target emotion information for the target image, and converts the extracted object using the selected candidate object. More specifically, candidate information is stored in advance in the candidate database using emotion information and semantic information and, where necessary, information on the object's position in the image or on the characteristics of its image signal; a candidate object whose information is similar to that of the extracted object is selected from the candidate database, and the image signal of the extracted object is transformed using the selected candidate object.
With this configuration, the image emotion conversion method according to the present invention can extract the emotion an image contains more accurately, based on the semantic information of its objects, and can convert the emotion of the target image on a per-object basis so that the target emotion desired by the user is better expressed.
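The overall flow just described (recognize an object, select a similar candidate carrying the target emotion, then transform only the object's region) can be sketched as follows. Every structure and function here is a placeholder for illustration, not the patented implementation:

```python
def recognize_object(image):
    # Stand-in recognizer: return the object record the image dict declares.
    return image["object"]

def select_candidate(db, semantics, target_emotion):
    # Pick the first candidate sharing the semantic tag and target emotion.
    for cand in db:
        if cand["semantics"] == semantics and cand["emotion"] == target_emotion:
            return cand
    return None

def convert_emotion(image, target_emotion, db):
    obj = recognize_object(image)
    cand = select_candidate(db, obj["semantics"], target_emotion)
    if cand is None:
        return image  # no suitable candidate: leave the image unchanged
    # Transform only the object's region toward the candidate's color.
    converted = dict(image)
    converted["region_color"] = cand["color"]
    return converted
```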
Hereinafter, the image emotion conversion apparatus and method according to the present invention will be described in detail.
First, the image emotion conversion apparatus according to the present invention will be described.
FIG. 1 is a block diagram of an image emotion conversion apparatus according to an exemplary embodiment of the present invention.
The image emotion conversion apparatus according to the present invention may include an object recognition unit, a candidate selection unit, and an image conversion unit.
The object recognition unit receives the emotion conversion target image, extracts the target object to be emotion-converted from the target image, generates target object information including information on the position or area of the target object within the target image, recognizes the target object, and generates semantic information, which is information on the meaning of the target object.
The candidate selection unit receives target emotion information indicating the target emotion into which the target image is to be converted, compares at least one candidate object with the target object based on the semantic information and emotion information, and selects a candidate object according to the comparison result.
The image conversion unit converts, using the candidate object information of the selected candidate object, the image signal of the region within the target image corresponding to the target object, thereby generating an image in which the target image is converted to express the emotion corresponding to the target emotion information.
Here, the image emotion conversion apparatus according to the present invention may further include an emotion information extraction unit 400 if necessary.
FIG. 2 is a block diagram of an image emotion conversion apparatus further including an emotion information extraction unit 400.
In this case, the object recognition unit may analyze the image signal of the pixels included in the target image region corresponding to the extracted target object to extract a predetermined feature.
The emotion information extraction unit 400 extracts emotion information of the target object by inputting the semantic information and the extracted feature of the target object into a previously learned emotion information classifier, and can determine the emotion information of the target image based on the extracted emotion information of the target object.
Here, the image emotion conversion apparatus according to the present invention may be configured such that each component is implemented as independent hardware, or such that some or all of the components are selectively combined and embodied as a computer program having program modules that perform some or all of the combined functions. The apparatus may also be implemented as a software program running on a processor or signal processing module, or in hardware form so as to be included in various processors, chips, or semiconductors. Furthermore, it may be included, in the form of hardware or software modules, in various embedded systems or devices such as a computer, mobile phone, tablet, handheld device, or wearable device.
Hereinafter, the operation of the object recognition unit will be described in more detail.
The object recognition unit receives the emotion conversion target image, extracts the target object to be emotion-converted from it, and generates the target object information and the semantic information of the target object.
Here, the target object information may include information about the area in which the target object exists within the target image. As described below, the target object may be extracted as a certain region within the target image through image segmentation, and information about the extracted region may be included in the target object information. The target object information may also include position information of the target object within the target image, determined according to the position of the extracted region.
Here, the semantic information means information about the meaning of the target object. It may include a tag indicating the meaning of the target object, as well as information indicating a category to which the target object belongs. For example, the semantic information may include a tag such as 'pine' and category information such as 'tree'. It may also include various other information indicating the meaning of the object.
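One possible way to carry such semantic information in code; the field names are ours, not the patent's:

```python
# Illustrative semantic-information record for a recognized object.
semantic_info = {
    "tag": "pine",       # specific meaning of the target object
    "category": "tree",  # broader category the object belongs to
}
```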
First, the object recognition unit may perform image segmentation on the target image to divide it into a plurality of regions and extract the target image region corresponding to the target object.
Image segmentation is a technique of dividing an image into a plurality of segments, i.e., sets of pixels, and is widely used for detecting objects or object boundaries. It classifies each pixel of the image, with its image signal value, into a category according to a predetermined criterion. For example, adjacent pixels with similar signal values, or pixels corresponding to the same object, can be classified into one category to form a segment, and a plurality of segments can be generated in the image in this manner.
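A toy version of this grouping idea, flood-filling 4-adjacent pixels whose values differ by at most a tolerance into segments, might look like this (assumed representation: a 2-D list of scalar intensities):

```python
def segment(grid, tol=10):
    """Group 4-adjacent pixels whose values differ by at most `tol`
    into segments via flood fill. Returns a label grid and the
    number of segments found."""
    h, w = len(grid), len(grid[0])
    labels = [[None] * w for _ in range(h)]
    seg = 0
    for y in range(h):
        for x in range(w):
            if labels[y][x] is not None:
                continue
            stack = [(y, x)]           # start a new segment here
            labels[y][x] = seg
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] is None
                            and abs(grid[ny][nx] - grid[cy][cx]) <= tol):
                        labels[ny][nx] = seg
                        stack.append((ny, nx))
            seg += 1
    return labels, seg
```

Real systems use far more sophisticated criteria, but the principle — pixels with similar signals merge into one segment — is the same.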
Preferably, the
In addition, the
Here, the
Here, the
Alternatively, the object recognition unit may select the target object from among the divided regions based on saliency.
In the field of image processing, saliency is a numerical measure of how important a part of an image appears to a human viewer. Depending on the color, brightness, and contours of the objects in an image, some parts draw the eye more strongly, and saliency quantifies this degree, for example according to differences in color or brightness and the strength of contours. For instance, Yan, Q., Xu, L., Shi, J., & Jia, J. (2013, June), "Hierarchical saliency detection," in Computer Vision and Pattern Recognition (CVPR), pp. 1155-1162, IEEE, decomposes an image into layers, from a coarse level where the image is simplified to a fine level where fine details remain, computes a regional contrast for each layer, and combines the per-layer scores into a final saliency score. In the present invention, a saliency value can be computed using such existing methods, and the target object can be selected from among the divided regions based on that value.
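A drastically simplified, single-level regional-contrast score in the spirit of such methods (not the cited paper's hierarchical algorithm) could be computed as:

```python
def regional_contrast(regions):
    """Toy regional-contrast saliency: a region scores higher the more
    its mean intensity differs from the other regions, weighted by
    their pixel counts. `regions` is a list of (mean, size) pairs."""
    scores = []
    for i, (mean_i, _) in enumerate(regions):
        s = sum(size_j * abs(mean_i - mean_j)
                for j, (mean_j, size_j) in enumerate(regions) if j != i)
        scores.append(s)
    return scores
```

A small bright region surrounded by large dark regions receives the highest score, matching the intuition that it stands out to a viewer.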
Here, the
In addition, the
Alternatively, the
Here, the
For this, the
Alternatively, the
Here, a class of an object is a predetermined classification for objects, a concept used in the field of object recognition. A class may be a value or a label set according to the purpose of the recognition.
Here, the
Here, the
FIG. 3 is a detailed block diagram of the object recognition unit.
The object recognition unit may include an image segmentation unit, a feature extraction unit, and a semantic information generation unit.
The image segmentation unit divides the target image into a plurality of regions by performing image segmentation, and extracts the target image region corresponding to the target object by selecting some of the divided regions.
The feature extraction unit extracts a predetermined feature by analyzing the image signal of the pixels included in the target image region corresponding to the target object.
The feature may include a color feature or a texture feature extracted from the region corresponding to the target object.
Here, the image signal means the signal value that each pixel of the image has according to a predefined color space. For example, when an RGB color space is used, it may be the signal value of each of the R, G, and B channels, or it may be a luminance or color-difference signal of a pixel. The image signal includes signal values in various color spaces such as YCbCr, CMYK, and YIQ.
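For instance, the same pixel can be re-expressed as a luminance/color-difference signal; below is the common full-range JPEG/BT.601 RGB-to-YCbCr conversion:

```python
def rgb_to_ycbcr(r, g, b):
    """8-bit RGB -> YCbCr using the full-range JPEG/BT.601
    coefficients: Y is luminance, Cb/Cr are color-difference
    signals centered at 128."""
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr
```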
The semantic information generation unit determines which of the predetermined object classes the target object corresponds to, and generates the semantic information according to the determined class.
Next, the operation of the candidate selection unit will be described in more detail.
The candidate selection unit receives the target emotion information, compares at least one candidate object with the target object based on the semantic information and emotion information, and selects a candidate object according to the comparison result.
Here, the emotion information is information indicating a specific emotion and may include a label or a numerical value. For example, it can include labels expressing qualitative emotions such as 'joy', 'sadness', or 'loveliness', labels indicating the nature of an emotion such as 'positive' or 'negative', and numerical values representing the degree of a certain emotion.
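An illustrative record combining these three forms of emotion information (all field names and values are assumed):

```python
# Emotion information as a label, its nature, and a numeric degree.
emotion_info = {
    "label": "joy",          # qualitative emotion label
    "polarity": "positive",  # nature of the emotion
    "intensity": 0.8,        # numeric degree in [0, 1] (assumed scale)
}
```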
Here, the candidate object refers to an object used for transforming the target image in the image conversion unit.
Here, the
Here, the
At this time, the
FIG. 4 is a block diagram for explaining the operation of the candidate selection unit.
Here, the
Here, the
Here, when the emotion information is given as emotion labels, a candidate object can be selected if its emotion information has the same emotion label as the target emotion information, or a label classified into the same category. That is, a plurality of emotion labels can be classified into categories in advance according to the nature of the emotions, and two labels can be judged similar if they belong to the same category.
For example, when the target emotion information corresponds to the emotion label 'joy', the candidate selection unit can select a candidate object whose emotion information has the label 'joy' or a label belonging to the same category.
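A sketch of this label-and-category matching rule; the category assignments below are invented for illustration:

```python
# Emotion labels grouped into categories in advance (assumed groupings).
EMOTION_CATEGORY = {
    "joy": "positive", "loveliness": "positive",
    "sadness": "negative", "fear": "negative",
}

def emotions_match(candidate_label, target_label):
    """A candidate qualifies if the labels are identical or fall in
    the same pre-assigned category."""
    if candidate_label == target_label:
        return True
    ca = EMOTION_CATEGORY.get(candidate_label)
    ta = EMOTION_CATEGORY.get(target_label)
    return ca is not None and ca == ta
```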
In the same manner, the
Or the
Here, the
Here, the
FIG. 5 is a block diagram for explaining the operation of the candidate selection unit in the case of selecting candidate objects by further using object information in addition to emotion information and semantic information.
In this case, the
The candidate object information including the in-image position information of the candidate object may be stored in the candidate database for each candidate object.
In this case, the
In addition, the
The candidate object information including the texture information of the candidate object may be stored in the candidate database for each candidate object.
In this case, the
Here, the method by which the
FIG. 6 is a reference diagram for explaining the operation of the image emotion conversion apparatus according to the present invention.
As shown in FIG. 6A, the
Next, the
Next, the operation of the image conversion unit will be described in more detail.
The image conversion unit converts the image signal of the region within the target image corresponding to the target object, using the candidate object information of the selected candidate object, and generates an image in which the target image is converted to express the emotion corresponding to the target emotion information.
Here, the conversion of the image signal of the target image corresponding to the target object may be performed by converting the image signal within the region corresponding to the target object.
Here, the candidate object information may include information on the color, position, or texture of the candidate object, and the image conversion unit may convert the target image using such information.
FIG. 7 is a detailed block diagram of the image conversion unit.
The image conversion unit 300 may include a color conversion unit 310, a position conversion unit 320, a texture conversion unit 330, and an object adding unit 340.
Here, the candidate object information may include color information of the candidate object.
Preferably, the color information may be color distribution information of the candidate object, and the color conversion unit 310 may convert the image signal so that the color distribution of the target object follows the color distribution of the candidate object.
Here, the color distribution can be a histogram of each of the L, a, and b channels in the three-dimensional Lab color space. Besides the Lab color space, any color space such as RGB, HSV, HSI, CMYK, or YCbCr can be used.
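The per-channel color distribution described here can be sketched as follows. This assumes the object's pixels have already been converted to the Lab color space and flattened into an N x 3 array; the bin count and value range are illustrative choices, not values from the patent.

```python
import numpy as np

def channel_histograms(pixels_lab, bins=16, value_range=(0.0, 255.0)):
    """Return a (3, bins) array: one normalized histogram per L, a, b channel."""
    hists = []
    for ch in range(3):
        h, _ = np.histogram(pixels_lab[:, ch], bins=bins, range=value_range)
        hists.append(h / max(h.sum(), 1))  # normalize each channel to a distribution
    return np.stack(hists)

# Synthetic stand-in for an object's Lab pixels.
pixels = np.random.default_rng(0).uniform(0, 255, size=(1000, 3))
hists = channel_histograms(pixels)
print(hists.shape)  # → (3, 16)
```

Histograms like these can then be compared (e.g., by histogram distance) or used as the target distribution for the color conversion below.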
Referring to FIGS. 6B and 6C, the color conversion unit 310 can convert the color of the target object so that it follows the color distribution of the selected candidate object.
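The patent does not give the exact transfer algorithm. One common way to make a target region's color distribution follow a candidate's is per-channel mean/standard-deviation matching (a Reinhard-style transfer); the sketch below uses that technique as an assumed stand-in.

```python
import numpy as np

def match_color_distribution(target, candidate):
    """Shift and scale each channel of `target` (N x 3 pixel array) so its
    per-channel mean and standard deviation match those of `candidate` (M x 3)."""
    t_mean, t_std = target.mean(axis=0), target.std(axis=0) + 1e-8
    c_mean, c_std = candidate.mean(axis=0), candidate.std(axis=0)
    return (target - t_mean) / t_std * c_std + c_mean

rng = np.random.default_rng(1)
t = rng.normal(100, 20, size=(500, 3))   # synthetic target-object pixels
c = rng.normal(150, 10, size=(500, 3))   # synthetic candidate-object pixels
out = match_color_distribution(t, c)
```

After the transfer, `out` has (up to floating-point error) the candidate's per-channel mean and standard deviation while preserving the target's spatial structure.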
The candidate object information may include position information in the image of the candidate object.
Here, the position of the candidate object in the image may be the position information of the candidate object in the candidate image including the candidate object.
For example, in FIG. 6(a), an image transformation can be performed that moves the position of the target object T in accordance with the position information of the candidate object.
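A minimal sketch of such a position transformation follows. It assumes a grayscale H x W array, a bounding-box representation of the target object, and a normalized (y, x) position taken from the candidate object's in-image position information; all of these representational choices are assumptions, and a real implementation would also need inpainting of the vacated region.

```python
import numpy as np

def move_object(image, bbox, norm_pos, fill=0):
    """bbox = (top, left, h, w); norm_pos = (y, x) in [0, 1] for the new top-left."""
    top, left, h, w = bbox
    patch = image[top:top + h, left:left + w].copy()
    out = image.copy()
    out[top:top + h, left:left + w] = fill        # erase the original location
    ny = int(norm_pos[0] * (image.shape[0] - h))  # new top-left from candidate position
    nx = int(norm_pos[1] * (image.shape[1] - w))
    out[ny:ny + h, nx:nx + w] = patch
    return out

img = np.zeros((10, 10), dtype=int)
img[1:3, 1:3] = 7                                 # a 2x2 "object" near the top-left
moved = move_object(img, (1, 1, 2, 2), (1.0, 1.0))  # move to the bottom-right
```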
The candidate object information may include texture information of the candidate object. Here, the texture information means information characterizing the texture of the image signal of the candidate object.
Here, an existing related-word search method can be used to find a related word corresponding to the word given by the semantic information; for example, the method proposed by ConceptNet may be used. Here, the image database may be a database storing a plurality of image patches together with the word defined for each patch.
For example, when the semantic information of the target object is 'Tree', the object adding unit 340 can search the word database for words related to 'Tree', retrieve an image patch corresponding to a found related word from the image database, and add it onto the target image.
Referring to FIG. 6(d), the object adding unit 340 can generate the converted image by adding the retrieved image patch onto the target image.
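The object-adding flow (related-word lookup, patch lookup, then compositing) can be sketched as below. The word database, image database, and related-word relation here are toy in-memory stand-ins, not ConceptNet's actual API, and the compositing is a simple paste rather than a blended insertion.

```python
import numpy as np

WORD_DB = {"Tree": ["leaf", "bird"]}       # hypothetical related-word database
IMAGE_DB = {"leaf": np.full((2, 2), 5)}    # hypothetical word -> image-patch database

def add_related_object(image, semantic_word, pos=(0, 0)):
    """Find a related word with an available patch and paste it at `pos`."""
    for related in WORD_DB.get(semantic_word, []):
        patch = IMAGE_DB.get(related)
        if patch is not None:
            out = image.copy()
            y, x = pos
            out[y:y + patch.shape[0], x:x + patch.shape[1]] = patch
            return out
    return image                            # no patch found: image unchanged

result = add_related_object(np.zeros((4, 4), dtype=int), "Tree", pos=(1, 1))
```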
Next, the operation of the emotion information extracting unit 400 will be described in more detail.
As described above, the emotion information extracting unit 400 according to the present invention draws on the fact that the emotion a person feels when viewing a photograph is most affected by the objects included in it, and extracts the emotion contained in the image based on the semantic information of the recognized objects together with features extracted from the image signal.
FIG. 8 is a block diagram for explaining the operation of the emotion information extracting unit 400.
The emotion information extracting unit 400 according to the present invention extracts emotion from the target image by using the extracted features together with the semantic information of the target object recognized by the object recognition unit 100.
To this end, the present invention uses an emotion information classifier trained on learning data in which the semantic information, features, and emotion information of objects included in images are set in advance.
The emotion information extracting unit 400 extracts the emotion information of the target object by inputting the semantic information and the feature of the extracted target object into the previously learned emotion information classifier, and can determine the emotion information of the target image based on the extracted emotion information of the target object.
If there is one target object, its extracted emotion information may be the emotion information of the target image. If there are a plurality of target objects, the set of the emotion information of the target objects, or a part of it, may be the emotion information of the target image. For example, the emotion information of the object selected as the main target object may be used as the emotion information of the target image.
Here, the emotion information classifier can extract the emotion information of the target object from the semantic information and the feature by using a classification function whose parameters are set in advance. The emotion information extracting unit 400 may extract the emotion information of the target object by inputting the semantic information of the target object and the feature, including the color feature or texture feature, into the classification function of the emotion information classifier.
For example, the emotion information extracting unit 400 may calculate and extract the emotion information of the target object according to the result of the classification function as shown in Equation (1).

E = C(S, F) … (1)

where S is the semantic information, F is the feature, C() is the classification function, and E is the result of the classification function. The extracted emotion information may be the resultant value of the classification function itself, or an emotion label or numerical value assigned according to the predetermined numerical range into which the resultant value falls.
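As a concrete but hypothetical instance of the classification function referenced in Equation (1), C could be a linear function over a one-hot encoding of the semantic class S concatenated with the feature vector F. The class list, weights, and label threshold below are toy values, not the patent's parameters.

```python
import numpy as np

CLASSES = ["tree", "sky", "person"]          # hypothetical semantic classes

def classify_emotion(semantic, feature, weights, bias):
    """E = C(S, F): a linear classification function with preset parameters."""
    s_onehot = np.array([1.0 if c == semantic else 0.0 for c in CLASSES])
    x = np.concatenate([s_onehot, feature])  # input: semantic encoding + features
    return float(weights @ x + bias)

w = np.array([0.5, -0.2, 0.1, 0.3, 0.3])     # preset parameters (toy values)
e = classify_emotion("tree", np.array([0.8, 0.4]), w, bias=0.1)
label = "positive" if e > 0 else "negative"  # map the score to an emotion label
```

Here `e` is the raw classifier output, and `label` illustrates classifying it into an emotion label by numerical range, as the text describes.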
Here, the emotion information classifier used by the emotion information extracting unit 400 is a classifier whose classification-function parameters are set by learning a plurality of learning data in which the semantic information, the feature, and the emotion information are set in advance for a plurality of learning objects, so that the classifier takes the semantic information and the feature as input and outputs the emotion information.
Preferably, the emotion information classifier sets the correspondence between the semantic information and feature on the one hand and the emotion information on the other using a linear regression model or a support vector-based regression model, and learns this model using the learning data. That is, the emotion information classifier may learn the learning data using a linear regression model or a support vector-based regression model, and the learned model parameters become the parameters of the classification function.
In addition, the parameters of the classification function of the emotion information classifier can be set through machine learning, and various conventional classification functions and corresponding machine learning methods can be used: for example, ensemble learning, boosting-based learning, nearest-neighbor-search-based learning, support vector machine-based learning, and AdaBoost-based learning. The emotion information extracting unit 400 is not limited to the above-described methods and can set the parameters of the classification function using various other classification functions and machine learning methods.
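Setting the classification-function parameters from learning data can be sketched with ordinary least-squares linear regression, one of the regression models the text mentions. The learning data below is synthetic: inputs stand in for concatenated semantic encodings and features, and targets stand in for preset emotion scores.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))                 # learning inputs: [semantic one-hot ; features]
true_w = np.array([0.5, -0.2, 0.1, 0.3, 0.3])
y = X @ true_w + 0.01 * rng.normal(size=200)  # preset emotion scores (with small noise)

# Learn the classification-function parameters from the learning data.
w_learned, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With enough learning data, `w_learned` recovers the underlying parameters, after which `E = C(S, F)` is evaluated as `w_learned @ x` for a new object.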
The image emotion conversion method according to another embodiment of the present invention may include an object recognition step S100, a candidate selection step S200, and an image conversion step S300. The image emotion conversion method according to the present invention can operate in the same manner as the image emotion conversion apparatus described in detail with reference to FIGS. 1 to 8 above. Therefore, overlapping portions will be omitted and briefly described below. In addition, the image emotion conversion method according to the present invention can be included in a computer or an embedded system in the form of hardware or software modules.
FIG. 9 is a flowchart of the image emotion conversion method according to the present invention.
In the object recognition step S100, the image to be emotion-transformed is received, a target object to be emotion-transformed is extracted from the target image, target object information including information about the position or area of the target object in the target image is generated, and the target object is recognized to generate semantic information, which is information on the meaning of the target object.
In the candidate selection step S200, target emotion information indicating the target emotion into which the target image is to be transformed is received, at least one candidate object is compared with the target object based on the semantic information and emotion information, and a candidate object is selected according to the comparison result.
In the image transformation step S300, the image signal of the area in the target image corresponding to the target object is transformed using the candidate object information of the candidate object selected in the candidate selection step, so as to generate an image converted such that the target image expresses the emotion corresponding to the target emotion information.
Here, the image emotion conversion method according to the present invention may further include the emotion information extraction step S50 as needed.
FIG. 10 is a flowchart of the image emotion conversion method further including the emotion information extraction step S50.
In this case, the object recognition step S100 may further include analyzing the image signal of the pixels included in the target image area corresponding to the extracted target object to extract a predetermined feature.
The emotion information extraction step S400 extracts the emotion information of the target object by inputting the semantic information and the feature of the extracted target object into a previously learned emotion information classifier, and determines the emotion information of the target image based on the extracted emotion information of the target object.
Here, the candidate object is stored in the candidate database 50 together with its semantic information and emotion information.
Here, the candidate selection step S200 compares the semantic information and emotion information of the candidate objects stored in the candidate database 50 with the semantic information of the target object and the target emotion information, calculates the similarity, and selects a candidate object according to the similarity.
Here, the candidate object information may include color information of the candidate object.
Here, the image transformation step S300 may include a color conversion step of converting the image signal so that the color distribution of the image signal in the area of the target image corresponding to the target object follows the color information included in the candidate object information of the candidate object selected in the candidate selection step, thereby generating the converted image.
In the image conversion step S300, an associated word corresponding to the semantic information of the target object may be searched for in a word database, an image patch corresponding to the found related word may be searched for in an image database, and the patch may be added as an object on the target image to generate the transformed image.
The image emotion conversion computer program according to another embodiment of the present invention may be a computer program stored in a medium for executing, in combination with a computer, the image emotion conversion method described above.
While all the elements constituting the embodiments of the present invention described above have been described as being combined into one or operating in combination, the present invention is not necessarily limited to these embodiments. That is, within the scope of the present invention, all of the components may be selectively combined with one or more of the others.
In addition, although all of the components may each be implemented as one independent piece of hardware, some or all of the components may be selectively combined so that a part or all of their functions are performed by one or more pieces of hardware as a computer program. Such a computer program may be stored in computer-readable media such as USB memory, a CD, or flash memory, and read and executed by a computer to implement an embodiment of the present invention. The recording medium of the computer program may include magnetic recording media, optical recording media, carrier-wave media, and the like.
Furthermore, unless otherwise defined in the Detailed Description, all terms including technical or scientific terms have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Commonly used terms, such as those defined in dictionaries, should be interpreted consistently with their contextual meaning in the related art, and are not to be construed in an ideal or overly formal sense unless expressly so defined.
It will be apparent to those skilled in the art that various modifications, additions, and substitutions are possible without departing from the scope and spirit of the invention as disclosed in the accompanying claims. Therefore, the embodiments disclosed in the present invention and the accompanying drawings are intended to illustrate, not to limit, the technical spirit of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments and drawings. The scope of protection of the present invention should be construed according to the following claims, and all technical ideas within the scope of their equivalents should be construed as falling within the scope of the present invention.
100: Object recognition unit
200: candidate selection unit
300: Image conversion unit
400: Emotion information extracting unit
50: Candidate database
110: Image Segmentation Unit
120: object extracting unit
130: Feature extraction unit
140: Semantic information generating unit
310: Color conversion unit
320: Position conversion unit
330: texture conversion unit
340: Object adding unit
S100: Object recognition
S200: Candidate selection
S300: Image conversion
Claims (21)
An object recognition unit that receives an image to be emotion-transformed, extracts a target object to be emotion-transformed from the target image, generates target object information including information about the position or area of the target object in the target image, and recognizes the target object to generate semantic information, which is information on the meaning of the target object;
A candidate selection unit that receives target emotion information representing a target emotion into which the user wants to transform the target image, compares, for each of at least one candidate object preset and stored in the candidate database, the candidate object's semantic information and emotion information with the semantic information of the target object and the target emotion information to calculate a similarity, and selects, as the candidate object corresponding to the target object, a candidate object whose calculated similarity is greater than a preset reference; And
An image conversion unit that converts the image signal of the area in the target image corresponding to the target object using the candidate object information of the candidate object selected by the candidate selection unit, thereby generating an image converted so that the target image expresses the emotion corresponding to the target emotion information;
Wherein the object recognizer extracts a predetermined feature including a color feature or a texture feature by analyzing an image signal of pixels included in the target image region corresponding to the extracted target object,
And an emotion information extracting unit that inputs the semantic information and the predetermined feature of the extracted target object into an emotion information classifier, whose classification-function parameters are set in advance by learning a plurality of learning data in which the semantic information, the predetermined feature, and the emotion information are preset so that the classifier takes the semantic information and the predetermined feature as input and outputs the emotion information, to extract the emotion information of the target object, and determines the emotion information of the target image based on the extracted emotion information of the target object.
Wherein the object recognition unit performs image segmentation on the target image to divide it into a plurality of regions, selects a part of the divided regions to extract the target image area corresponding to the target object, and generates the target object information including information on the position or area of the target object in the target image.
Wherein the object recognition unit determines which of predetermined object classes the class of the target object corresponds to, using the image signal of the target image or the extracted feature, and generates the semantic information according to the determined class of the object.
An image segmentation unit dividing the target image into a plurality of regions by performing image segmentation;
An object extraction unit for selecting a part of the divided areas and extracting the target image area corresponding to the target object;
A feature extraction unit for analyzing the image signal of the pixels included in the target image region corresponding to the target object and extracting the predetermined feature; And
A semantic information generation unit that determines which of predetermined object classes the target object corresponds to, using the predetermined feature extracted from the target image, and generates the semantic information according to the determined class of the object.
Wherein the object recognition unit extracts an area in the target image corresponding to the target object and uses the position information of the extracted area as the position information of the target object to obtain the target object information including the position information of the target object Generate,
The candidate object information including position information of the candidate object in the image for each candidate object is stored in the candidate database,
Wherein the candidate selection unit compares the semantic information, emotion information, and candidate object information of the candidate objects stored in the candidate database with the semantic information of the target object, the target emotion information, and the target object information, calculates the similarity, and selects the candidate object corresponding to the target object according to the similarity.
Wherein the object recognition unit extracts an area in the target image corresponding to the target object and uses the texture information of the extracted area as texture information of the target object to generate the target object information including the texture information of the target object and,
The candidate object information including texture information of the candidate object for each candidate object is stored in the candidate database,
Wherein the candidate selection unit compares the semantic information, emotion information, and candidate object information of the candidate objects stored in the candidate database with the semantic information of the target object, the target emotion information, and the target object information, calculates the similarity, and selects the candidate object corresponding to the target object according to the similarity.
Wherein the candidate object information includes color information of the candidate object,
Wherein the image transformation unit uses the color information included in the candidate object information of the candidate object selected by the candidate selection unit so that a color distribution of the image signal of the target image in the target image corresponding to the target object is obtained from the color information of the candidate object And a color conversion unit for converting the image signal according to the color information to generate the converted image.
Wherein the candidate object information includes position information in the image of the candidate object,
Wherein the image conversion unit includes a position conversion unit that, using the in-image position information included in the candidate object information of the candidate object selected by the candidate selection unit, performs an image transformation that moves the position of the target object in the target image in accordance with the in-image position information of the candidate object, and generates the converted image.
Wherein the candidate object information includes texture information of the candidate object,
Wherein the image conversion unit includes a texture conversion unit that, using the texture information included in the candidate object information of the candidate object selected by the candidate selection unit, converts the image signal of the area in the target image corresponding to the target object so that the target object matches the texture information of the candidate object, and generates the converted image.
Wherein the image conversion unit includes an object adding unit that searches a word database for a related word corresponding to the semantic information of the target object, searches an image database for an image patch corresponding to the found related word, and adds the patch onto the target image to generate the converted image.
Wherein the emotion information classifier learns the learning data using a linear regression model or a support vector-based regression model, and the parameters of the classification function are set by the learned model.
An object recognition step of receiving an image to be emotion-transformed, extracting a target object to be emotion-transformed from the target image, generating target object information including information about the position or area of the target object in the target image, and recognizing the target object to generate semantic information, which is information on the meaning of the target object;
A candidate selection step of receiving target emotion information representing a target emotion into which the user wants to transform the target image, comparing, for each of at least one candidate object preset and stored in the candidate database, the candidate object's semantic information and emotion information with the semantic information of the target object and the target emotion information to calculate a similarity, and selecting, as the candidate object corresponding to the target object, a candidate object whose calculated similarity is greater than a preset reference; And
An image conversion step of converting the image signal of the area in the target image corresponding to the target object using the candidate object information of the candidate object selected in the candidate selection step, thereby generating an image converted so that the target image expresses the emotion according to the target emotion information,
The object recognition step
Extracting a predetermined feature including a color feature or a texture feature by analyzing a video signal of pixels included in the target intra-region corresponding to the extracted target object; And
Extracting the emotion information of the target object by inputting the semantic information of the extracted target object and the predetermined feature including the color feature or the texture feature into an emotion information classifier, whose classification-function parameters are set in advance by learning a plurality of learning data in which the semantic information, the predetermined feature, and the emotion information are preset so that the classifier takes the semantic information and the predetermined feature as input and outputs the emotion information; And
Determining the emotion information of the target image based on the extracted emotion information of the target object.
Wherein the candidate object information includes color information of the candidate object,
Wherein the image transformation step includes a color conversion step of converting the image signal, using the color information included in the candidate object information of the candidate object selected in the candidate selection step, so that the color distribution of the image signal in the area of the target image corresponding to the target object follows the color information of the candidate object, and generating the converted image.
Wherein the image conversion step includes an object adding step of searching a word database for a related word corresponding to the semantic information of the target object, searching an image database for an image patch corresponding to the found related word, and adding the patch onto the target image to generate the transformed image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150104116A KR101606760B1 (en) | 2015-07-23 | 2015-07-23 | Apparatus and Method of Transforming Emotion of Image based on Object in Image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150104116A KR101606760B1 (en) | 2015-07-23 | 2015-07-23 | Apparatus and Method of Transforming Emotion of Image based on Object in Image |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101606760B1 true KR101606760B1 (en) | 2016-03-28 |
Family
ID=57007635
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150104116A KR101606760B1 (en) | 2015-07-23 | 2015-07-23 | Apparatus and Method of Transforming Emotion of Image based on Object in Image |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101606760B1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101830512B1 (en) * | 2016-07-08 | 2018-02-20 | 전자부품연구원 | Optimized Image Segmentation Methods and System with DL and PDE |
WO2019066373A1 (en) * | 2017-09-27 | 2019-04-04 | 삼성전자주식회사 | Method of correcting image on basis of category and recognition rate of object included in image and electronic device implementing same |
KR20190044762A (en) * | 2017-10-23 | 2019-05-02 | 연세대학교 산학협력단 | Apparatus Transforming Object based on Target Emotion and Method thereof |
WO2019156508A1 (en) * | 2018-02-08 | 2019-08-15 | Samsung Electronics Co., Ltd. | Method and electronic device for rendering background in image |
KR20200027794A (en) * | 2018-09-05 | 2020-03-13 | 삼성전자주식회사 | Image display device and operating method for the same |
CN117412450A (en) * | 2023-12-13 | 2024-01-16 | 深圳市千岩科技有限公司 | Atmosphere lamp equipment, lamp effect color matching method thereof, corresponding device and medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003248819A (en) * | 2002-02-22 | 2003-09-05 | Toshiba Corp | Device and method for image processing, program, and storage medium |
KR20110000910A (en) * | 2009-06-29 | 2011-01-06 | 가톨릭대학교 산학협력단 | Method for converting color of image based on relationship between color and emotion |
JP2013011944A (en) * | 2011-06-28 | 2013-01-17 | Nikon Corp | Image processing apparatus, imaging apparatus, image processing method, and image processing program |
KR20140037373A (en) | 2012-09-17 | 2014-03-27 | 중앙대학교 산학협력단 | Apparatus and method for transferring a color of image |
KR20140091554A (en) * | 2011-11-17 | 2014-07-21 | 마이크로소프트 코포레이션 | Automatic tag generation based on image content |
-
2015
- 2015-07-23 KR KR1020150104116A patent/KR101606760B1/en active IP Right Grant
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003248819A (en) * | 2002-02-22 | 2003-09-05 | Toshiba Corp | Device and method for image processing, program, and storage medium |
KR20110000910A (en) * | 2009-06-29 | 2011-01-06 | 가톨릭대학교 산학협력단 | Method for converting color of image based on relationship between color and emotion |
JP2013011944A (en) * | 2011-06-28 | 2013-01-17 | Nikon Corp | Image processing apparatus, imaging apparatus, image processing method, and image processing program |
KR20140091554A (en) * | 2011-11-17 | 2014-07-21 | 마이크로소프트 코포레이션 | Automatic tag generation based on image content |
KR20140037373A (en) | 2012-09-17 | 2014-03-27 | 중앙대학교 산학협력단 | Apparatus and method for transferring a color of image |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101830512B1 (en) * | 2016-07-08 | 2018-02-20 | 전자부품연구원 | Optimized Image Segmentation Methods and System with DL and PDE |
WO2019066373A1 (en) * | 2017-09-27 | 2019-04-04 | 삼성전자주식회사 | Method of correcting image on basis of category and recognition rate of object included in image and electronic device implementing same |
KR20190036168A (en) * | 2017-09-27 | 2019-04-04 | 삼성전자주식회사 | Method for correcting image based on category and recognition rate of objects included image and electronic device for the same |
US11270420B2 (en) | 2017-09-27 | 2022-03-08 | Samsung Electronics Co., Ltd. | Method of correcting image on basis of category and recognition rate of object included in image and electronic device implementing same |
KR102383129B1 (en) * | 2017-09-27 | 2022-04-06 | 삼성전자주식회사 | Method for correcting image based on category and recognition rate of objects included image and electronic device for the same |
KR20190044762A (en) * | 2017-10-23 | 2019-05-02 | 연세대학교 산학협력단 | Apparatus Transforming Object based on Target Emotion and Method thereof |
KR102022479B1 (en) * | 2017-10-23 | 2019-09-18 | 연세대학교 산학협력단 | Apparatus Transforming Object based on Target Emotion and Method thereof |
WO2019156508A1 (en) * | 2018-02-08 | 2019-08-15 | Samsung Electronics Co., Ltd. | Method and electronic device for rendering background in image |
KR20200027794A (en) * | 2018-09-05 | 2020-03-13 | 삼성전자주식회사 | Image display device and operating method for the same |
KR102579452B1 (en) | 2018-09-05 | 2023-09-15 | 삼성전자주식회사 | Image display device and operating method for the same |
CN117412450A (en) * | 2023-12-13 | 2024-01-16 | 深圳市千岩科技有限公司 | Atmosphere lamp equipment, lamp effect color matching method thereof, corresponding device and medium |
CN117412450B (en) * | 2023-12-13 | 2024-04-02 | 深圳市千岩科技有限公司 | Atmosphere lamp equipment, lamp effect color matching method thereof, corresponding device and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101606760B1 (en) | Apparatus and Method of Transforming Emotion of Image based on Object in Image | |
US10762608B2 (en) | Sky editing based on image composition | |
KR101289085B1 (en) | Images searching system based on object and method thereof | |
US10839573B2 (en) | Apparatus, systems, and methods for integrating digital media content into other digital media content | |
US10332266B2 (en) | Method and device for traffic sign recognition | |
KR101611895B1 (en) | Apparatus and Method of Automatic Text Design based on Emotion | |
CN105184763B (en) | Image processing method and device | |
Mehrani et al. | Saliency Segmentation based on Learning and Graph Cut Refinement. | |
US11651477B2 (en) | Generating an image mask for a digital image by utilizing a multi-branch masking pipeline with neural networks | |
US11393100B2 (en) | Automatically generating a trimap segmentation for a digital image by utilizing a trimap generation neural network | |
Ge et al. | Co-saliency detection via inter and intra saliency propagation | |
JP2010026603A (en) | Image processor, image processing method and computer program | |
KR101833943B1 (en) | Method and system for extracting and searching highlight image | |
CN113626444A (en) | Table query method, device, equipment and medium based on bitmap algorithm | |
CN110196917B (en) | Personalized LOGO format customization method, system and storage medium | |
KR20090065099A (en) | System for managing digital image features and its method | |
US11461880B2 (en) | Generating image masks from digital images utilizing color density estimation and deep learning models | |
Sravani et al. | Robust detection of video text using an efficient hybrid method via key frame extraction and text localization | |
JP2020087165A (en) | Learning data generation program, learning data generation device, and learning data generation method | |
Deshmukh et al. | Real-time traffic sign recognition system based on colour image segmentation | |
KR101374726B1 (en) | Feature descriptor generating device and method, image object recognition device and method using the same | |
Khan et al. | Systematic skin segmentation: merging spatial and non-spatial data | |
Ghandour et al. | Building shadow detection based on multi-thresholding segmentation | |
JP5158974B2 (en) | Attention area extraction method, program, and image evaluation apparatus | |
US20230169708A1 (en) | Image and video matting |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |