WO2011148562A1 - Image Information Processing Apparatus - Google Patents
Image Information Processing Apparatus
- Publication number
- WO2011148562A1 (application PCT/JP2011/002235)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- attention
- map
- tag
- processing apparatus
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5838—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
Definitions
- The present disclosure relates to technology that supports the attachment of classification tags to images.
- In Non-Patent Document 1, a plurality of faces appearing in a plurality of images are detected, the detected faces are divided into groups based on similarity, and name tags can be attached to each group at once.
- Although the conventional technique can attach a tag identifying the person shown in an image, it can hardly be said to attach a tag that accurately represents the classification of the image itself rather than of the person.
- The present invention has been made against this background, and its object is to provide an image information processing apparatus that can attach an appropriate tag to an image by paying attention to the direction of an object such as a person.
- An image information processing apparatus includes an extraction unit that extracts an object from an image, a calculation unit that calculates the direction the extracted object faces, and an assignment unit that attaches a tag to the image according to the calculated direction.
- With this configuration, an appropriate tag can be assigned to an image by paying attention to the direction of the object.
- The image information processing apparatus 10 includes an image storage unit 12, an object extraction unit 14, a calculation unit 16, an object information storage unit 18, an attention vector information storage unit 20, an assignment condition storage unit 22, an assignment unit 24, an input I/F (interface) unit 26, an output I/F unit 28, and an image tag storage unit 30.
- The storage units 12, 18, 20, 22, and 30 are configured from hardware such as an HDD (Hard Disk Drive) and RAM (Random Access Memory). A general PC (Personal Computer) can be used as the image information processing apparatus 10.
- the image storage unit 12 stores a large number of images.
- The image storage unit 12 stores a large number (for example, several thousand) of images, including "image A", "image B", and "image C".
- These images are images handled by the user at home: for example, frame images from a moving image captured with the digital movie camera 1, or still images taken with the DSC (Digital Still Camera) 2.
- The object extraction unit 14 extracts human-body and human-face objects included in the images stored in the image storage unit 12.
- A human-body object is an object covering the entire human body, including the face (head), torso, and limbs.
- A method that extracts only the upper body as the person's body object can also be adopted.
- A general method can be used for this extraction.
- Patent Document 4: Japanese Patent Laid-Open No. 2008-250444
- The extracted face may also be recognized and classified by type.
- The method of Non-Patent Document 3 can be used for recognizing and extracting the human body.
- The calculation unit 16 obtains the occupation ratio, which indicates the proportion of the image occupied by the person's face and body.
- The calculation unit 16 also calculates the rotation and orientation of the person's face and body based on the information about the person extracted by the object extraction unit 14.
- The calculation unit 16 stores the calculated results in the object information storage unit 18 and the attention vector information storage unit 20.
- First, the object extraction unit 14 extracts the person's face and body from the image X (S1).
- Next, the calculation unit 16 obtains the rotation and orientation of each extracted body and face (S2).
- The calculation unit 16 calculates the occupation ratio of the person's body by dividing the area (S_B) of the rectangular region surrounding the extracted body by the area (S_A) of the entire image X.
- Similarly, the occupation ratio of the person's face is calculated by dividing the area (S_C) of the rectangular region surrounding the face by the area (S_A) of the entire image X (S3).
- The calculation unit 16 then calculates the attention vector based on the "rotation", "direction", "occupation ratio", and the like of each object (S4).
- Next, step S2, in which the calculation unit 16 calculates the orientation and rotation of the person's face, is described in detail.
- The calculation unit 16 determines the rotation and direction of the face by comparing the face extracted by the object extraction unit 14 against a table 17 as shown in FIG. 4.
- Table 17 has a rotation axis 17a with three sections, a: "−90 to −25.5", b: "−25.5 to 25.5", and c: "25.5 to 90" (all ranges in degrees), and an orientation axis 17b with five sections, A: "−90 to −67.5", B: "−67.5 to −25.5", C: "−25.5 to 25.5", D: "25.5 to 67.5", and E: "67.5 to 90".
- Section C of the "direction" axis 17b indicates that the face is facing front.
- For this determination, the calculation unit 16 can use known face-orientation estimation methods.
- The calculation unit 16 determines the "rotation" and "orientation" classes according to table 17.
- the determined classification is stored in the object information storage unit 18.
- The object information storage unit 18 stores, for each image, object information including the items "type", "direction", "rotation", and "occupation ratio" for every object in the image.
- "Type" indicates the kind of object, with values such as "face" and "person (upper)", the latter indicating a person's upper body.
- "Direction" indicates the direction class corresponding to table 17 of FIG. 4 when the type is a face; when the type is a body, it indicates the body orientation.
- "Rotation" likewise corresponds to table 17 of FIG. 4 and indicates the rotation class of the face (or, for a body, of the body).
- "Occupation ratio" is the proportion of the image occupied by the object, as described for FIG. 3.
- FIG. 6A illustrates the objects extracted from image A.
- Image A contains two persons (persons A and B) on the right side, two persons (persons C and D) on the left side, a tower, and clouds.
- The object extraction unit 14 extracts a total of eight objects: the faces O1, O2, O5, and O6 and the bodies O3, O4, O7, and O8. In the present embodiment, the object extraction unit 14 extracts only human objects from the image and does not extract objects such as the tower.
- the calculation unit 16 calculates “type”, “direction”, “rotation”, and “occupation ratio” for each of the extracted eight objects.
- For object O1, for example, the calculation unit 16 calculates "face" as the "type", class "C" (front) as the "direction", class "b" (no rotation) as the "rotation", and "3.7%" as the "occupation ratio", and stores them in the object information storage unit 18.
- The calculation unit 16 recognizes a face and a body as belonging to the same person when the face region is contained in the body region. For example, in image A of FIG. 6A, O1 and O3 are recognized as person A, O2 and O4 as person B, O5 and O7 as person C, and O6 and O8 as person D.
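- A minimal Python sketch of this containment test, assuming axis-aligned bounding boxes (the box representation and helper names are illustrative, not from the patent):

```python
from dataclasses import dataclass

@dataclass
class Box:
    x: float  # left
    y: float  # top
    w: float  # width
    h: float  # height

    def contains(self, other: "Box") -> bool:
        # True if `other` lies entirely inside this box.
        return (self.x <= other.x and self.y <= other.y
                and self.x + self.w >= other.x + other.w
                and self.y + self.h >= other.y + other.h)

def associate_faces_with_bodies(faces, bodies):
    """Pair each face box with a body box that contains it.

    Returns (face_index, body_index) pairs; each pair is one person,
    e.g. (O1, O3) -> person A in image A of FIG. 6A.
    """
    persons = []
    for i, face in enumerate(faces):
        for j, body in enumerate(bodies):
            if body.contains(face):
                persons.append((i, j))
                break  # a face belongs to at most one body
    return persons
```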
- The calculation unit 16 then sets areas for the recognized persons.
- An area may be set per person, but in this embodiment persons close to each other are collectively set as one area.
- The calculation unit 16 sets the area occupied by persons A and B as "area 1". Similarly, it sets the area occupied by persons C and D as "area 2". Areas 1 and 2 are shown in the figure.
- The calculation unit 16 acquires the object information of the objects included in an area from the object information storage unit 18, and obtains the attention vector based on the "direction", "rotation", and "occupation ratio" contained in the acquired object information.
- The direction component of the attention vector is obtained from the objects' "direction" and "occupation ratio", and the rotation component from their "rotation" and "occupation ratio".
- From area 1, which includes O1 to O4, the calculation unit 16 first obtains the "direction" ("C", "C") of the face objects O1 and O2. It then obtains a vector V_{O1,O2} whose direction is C and whose magnitude corresponds to the "occupation ratio" (3.7) of O1 and O2 (see Formula 1 below for the specific calculation).
- The vector V_{O1,O2} may instead be obtained by separately calculating the two vectors V_{O1} and V_{O2} and then combining them.
- For the vector magnitude, not only the "occupation ratio" but also the matching accuracy, a value indicating the reliability of the face recognition, may be used.
- Similarly, the calculation unit 16 obtains V_{O3,O4} for the body objects O3 and O4 in area 1.
- FIG. 6C shows the orientation component / rotation component of the attention vectors 1 and 2 calculated by the calculation unit 16 in this way.
- The direction components of attention vectors 1 and 2 on the left side of FIG. 6C represent directions as if image A were viewed from directly above; the downward direction in the figure, in which V_{O1,O2} and V_{O3,O4} point, is therefore the front direction.
- In Formula 1, the number of objects is k, the occupation ratio of object j is R_j [%], the direction of object j's vector is D_j [degrees], the direction segments of the attention vector are indexed by i, and the minimum and maximum angles of segment i are Mi_i and Ma_i. The magnitude F(i) of the attention vector for segment i then sums the occupation ratios of the objects whose directions fall within segment i (reconstructed from these definitions):

  F(i) = Σ_{j : Mi_i ≤ D_j < Ma_i} R_j
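- As an illustration of the reconstruction above, a minimal Python sketch (using the orientation sections of table 17 as the direction segments is an assumption of convenience):

```python
def attention_vector_magnitudes(objects, segments):
    """F(i) per direction segment i, following the reconstructed Formula 1.

    objects:  (occupation_ratio_percent, direction_degrees) per object
    segments: (min_angle, max_angle) per segment
    """
    return [sum(r for r, d in objects if mi <= d < ma) for mi, ma in segments]

# Sections A-E of table 17 and the two face objects O1, O2 (front, 3.7 % each):
segments = [(-90, -67.5), (-67.5, -25.5), (-25.5, 25.5), (25.5, 67.5), (67.5, 90)]
objects = [(3.7, 0.0), (3.7, 0.0)]
print(attention_vector_magnitudes(objects, segments))
# -> [0, 0, 7.4, 0, 0]: the magnitude concentrates in front section C
```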
- The storage contents of the attention vector information storage unit 20 are shown in FIG. 7.
- For each image, the attention vector information storage unit 20 stores the "type" of each attention vector in the image, its "size", and the "area" used as the occupation ratio in the vector calculation.
- The assignment condition storage unit 22 stores conditions relating to tag assignment: specifically, the following conditions (1) to (5) and the names of the tags to be assigned for each combination matching the conditions. Conditions (1) to (5) are merely examples, and the branching conditions can be changed as appropriate.
- The size of a region is the size of the attention vector corresponding to that region, and regions of at least a certain size (for example, 0.15 or more) are counted as effective regions.
- For example, the size of attention vector 1 corresponding to area 1 is 0.23 (≥ 0.15), so area 1 is an effective region.
- Area 2, by contrast, is not counted as an effective region.
- One condition asks whether the arrangement of the objects, when there are two or more in (3), is regular or irregular. For example, if the variation in size among the objects is within a certain range, the arrangement is regular; in particular, with three or more objects, it is regular when the intervals at which they are arranged are close to equal (see the sketch below).
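- A hedged Python sketch of this regularity test; the tolerance values are assumptions, since the text only says "within a certain range" and "close to equal intervals":

```python
def is_regular(sizes, centers_x, size_tol=0.2, spacing_tol=0.2):
    """Heuristic regularity test for two or more objects.

    sizes:     occupation ratio of each object
    centers_x: horizontal center of each object region
    Regular if sizes vary within a tolerance and, for three or more
    objects, neighbouring gaps are close to equal.
    """
    mean_size = sum(sizes) / len(sizes)
    if any(abs(s - mean_size) > size_tol * mean_size for s in sizes):
        return False
    if len(centers_x) >= 3:
        xs = sorted(centers_x)
        gaps = [b - a for a, b in zip(xs, xs[1:])]
        mean_gap = sum(gaps) / len(gaps)
        if any(abs(g - mean_gap) > spacing_tol * mean_gap for g in gaps):
            return False
    return True
```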
- Another condition asks whether the object in the effective region (the region counted in (3)) is a person or the background.
- If the occupation ratio of the effective region is 1/3 or more (about 33% or more), it is treated as a person; if less than 1/3, as the background.
- When there are two or more effective regions, the sum of their occupation ratios is used for this determination.
- The assignment unit 24 assigns a tag to each image by comparing the attention vector information stored in the attention vector information storage unit 20 against the contents of the assignment condition storage unit 22.
- The manner in which the assignment unit 24 attaches a tag is conventional: an image and information indicating the tag attached to it may be associated and stored in the image tag storage unit 30.
- The method is not limited to this; the tag may instead be written directly into the Exif (Exchangeable Image File Format) data of each image.
- the input I / F unit 26 receives input from general input devices such as the keyboard 3 and the mouse 4.
- the output I / F unit 28 causes the display 5 to perform various displays.
- The processing subject of each step in FIGS. 8 and 9 is basically the assignment unit 24.
- First, the assignment unit 24 specifies the image to be tagged (S11). The image may be specified by the output I/F unit 28 displaying a menu or the like on the display 5 and the input I/F unit 26 receiving the user's input; alternatively, when a new image is added to the image storage unit 12, it may automatically be specified as the target.
- The assignment unit 24 acquires the information on the specified image from the attention vector information storage unit 20 (S12). For example, if image A is specified, the information on attention vectors 1 and 2 of image A (see FIG. 7) is acquired.
- The assignment unit 24 determines whether the size of the attention vector is equal to or larger than a predetermined value (for example, 0.1) (S13).
- Step S13 determines whether attention is present in the image.
- If not, the assignment unit 24 counts the number of human objects (S20) and, if it is one or more, assigns a "landscape" tag (S21).
- Otherwise, the assignment unit 24 determines whether the direction of the attention vector is front or non-front (S14).
- If front, the assignment unit 24 counts the number of regions (effective regions) whose corresponding attention vector is at least a certain size (S15); if there are two or more (S15: 2 or more), it determines whether their arrangement is regular or irregular (S16).
- Steps S17 to S19 are similar: the assignment unit 24 determines whether the occupation ratio of the effective region is 1/3 or more. When there are two or more effective regions, the total obtained by adding their occupation ratios is used.
- An image in which a person appears large has an occupation ratio of 1/3 or more (S17: 1/3 or more, S18: 1/3 or more, S19: 1/3 or more),
- and a portrait-system tag is assigned (S21).
- By contrast, an image in which the person appears small and the background large has an occupation ratio of less than 1/3 (S17: less than 1/3, S18: less than 1/3, S19: less than 1/3),
- and the assignment unit 24 assigns a landmark-system tag (S21). If step S14 determines that the direction is not front, processing moves to the flow of FIG. 9. Step S23 in FIG. 9 corresponds to step S15 in FIG. 8, step S24 to step S16, steps S25 to S27 to step S17, and step S28 to step S21, so their description is omitted.
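- The branching of FIGS. 8 and 9 can be sketched as follows. The thresholds are the example values from the text (0.1, 0.15, 1/3); the numeric suffixes of the example tag names (e.g. "landmark 3") are omitted because the text does not spell out how they are chosen:

```python
def assign_tag(whole_vector_magnitude, faces_front, region_vectors, total_occupation):
    """Sketch of the tag-decision flow of FIGS. 8 and 9.

    region_vectors:   attention-vector magnitude per person area
    total_occupation: summed occupation ratio of the effective regions
    """
    if whole_vector_magnitude < 0.1:        # S13: no attention present
        return "landscape"                  # S20/S21 (landscape-system tag)
    effective = [v for v in region_vectors if v >= 0.15]   # S15 / S23
    person_dominant = total_occupation >= 1 / 3            # S17-S19 / S25-S27
    if faces_front:                         # S14: front
        return "portrait" if person_dominant else "landmark"
    # S14: non-front -> flow of FIG. 9 (regularity check of S24 omitted)
    return "person surrounding" if len(effective) >= 2 else "target"

# Image A: front, one effective region (0.23 >= 0.15), occupation < 1/3
print(assign_tag(0.23, True, [0.23, 0.11], 0.2))   # -> "landmark"
```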
- The assignment unit 24 acquires the attention vector information (attention vectors 1 and 2) corresponding to image A from the attention vector information storage unit 20 (S12).
- The determination in step S13 is Yes.
- Since V_{O1,O2} and V_{O3,O4}, which face front, are sufficiently larger than V_{O5,O6} and V_{O7,O8}, which face left, the assignment unit 24 determines that the direction component is front (S14: front).
- The size of attention vector 1 corresponding to area 1 is "0.23", and the size of attention vector 2 corresponding to area 2 is "0.11".
- The assignment unit 24 determines that the occupation ratio of the effective region is less than 1/3 (S19: less than 1/3).
- As a result, the assignment unit 24 assigns the tag "landmark 3" to image A.
- Image B is an image in which two persons facing the camera stand side by side.
- Its attention vector faces front (S14: front), and the number of regions of at least a certain size is one (S15: 1).
- The occupation ratio of the effective region of image B is 1/3 or more (S19: 1/3 or more).
- The assignment unit 24 therefore assigns the "portrait 3" tag to image B.
- Image C is an image showing a plurality of persons riding bicycles.
- Since the "direction" component in particular points diagonally down and to the left, the assignment unit 24 determines that the direction of the attention vector is non-front (S14: non-front).
- The assignment unit 24 determines that there are two effective regions (S23: 2 or more) and, because the two effective regions are approximately the same size, that the arrangement is regular (S24: regular).
- The assignment unit 24 therefore assigns the tag "person surrounding 1".
- Image D is an image showing a person calling a dog.
- The assignment unit 24 determines that the direction of the attention vector is non-front (S14: non-front).
- The assignment unit 24 assigns the tag "target 3".
- Tags such as "landmark 3" may be associated with an alias or an icon, as shown in FIG. 12, so that the user can immediately understand the tag's meaning.
- In this way, a tag is attached to each image based on the attention vectors in that image.
- Such tags are useful as clues for classifying images, for image retrieval, and for letting the user grasp the contents of an image from its tags.
- Embodiment 2 relates to a mechanism that comprehensively considers the attention vectors of the objects in an image, calculates the level of attention across the image, and extracts regions with a particularly high attention level.
- In other words, a region (attention region) on which the objects are estimated to be focusing is set in the image.
- FIG. 13 is a functional block diagram of the image information processing apparatus 11 according to the second embodiment.
- The image information processing apparatus 11 includes an attention level map creation unit 32 and an area setting unit 34.
- The attention level map creation unit 32 generates a corresponding attention level map for each object included in the image.
- An attention level map indicates the degree of attention over the image in the situation where the image was captured: a portion with a high attention level is a portion that was likely attended to when the image was shot, i.e., that the photographer likely paid attention to.
- the attention level map creation unit 32 generates a total attention level map by adding all the generated attention level maps.
- the region setting unit 34 sets, as the attention region, a rectangular region having a degree of attention equal to or greater than a threshold in the total attention level map.
- the attention level map creation unit 32 acquires necessary information from the object information storage unit 18 and the attention vector information storage unit 20 (S31).
- The attention level map creation unit 32 selects one object in the image as the map-creation target (S32).
- The attention level map creation unit 32 creates an attention level map based on the object information and attention vector information of the target object (S33).
- Step S33 is described further in the following steps (1) to (3).
- (1) With the center of gravity of the object as the starting point (the starting point is not limited to this, as long as it lies within the area the object occupies), determine on which side of the attention vector the margin of the image is wider.
- For example, the attention vector of object O3 points toward the bottom of image A.
- Taking the area occupied by object O3 as the reference and comparing the margin below (ahead of this attention vector) with the margin above, the upper margin is wider; therefore, a high attention level is set upward of the area occupied by O3.
- The attention level map creation unit 32 repeats steps S32 and S33 until no object remains for which an attention level map has not been created (S34). In the case of image A (see FIG. 6), there are eight objects O1 to O8, so the attention level map creation unit 32 repeats steps S32 and S33 eight times, creating eight attention level maps.
- The attention level map creation unit 32 then calculates the total attention level map by adding all the attention level maps (S35).
- FIG. 16A shows the attention level map corresponding to persons A and B (O1 to O4) in image A,
- and FIG. 16B shows the map corresponding to persons C and D (O5 to O8). Since the objects O5 to O8 of persons C and D are relatively small, the attention level map of FIG. 16B has a distribution of relatively low attention levels compared with that of FIG. 16A.
- The area setting unit 34 sets (extracts as the attention region) a region whose value in the total attention level map is equal to or greater than the threshold Th (S36).
- For image A, the attention level map creation unit 32 adds the attention level maps of FIGS. 16A and 16B to create the total attention level map.
- Region A is a region where attention is present.
- The area setting unit 34 sets, as the attention region, a rectangular region B that surrounds the part of region A at or above the threshold Th.
- FIG. 18 shows the total attention level map of image D (see FIG. 11D) and region C, which is its attention region.
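- A minimal numpy sketch of this thresholding step (S36), assuming the total attention level map is a 2-D array:

```python
import numpy as np

def attention_region(total_map: np.ndarray, th: float):
    """Bounding rectangle (x0, y0, x1, y1) of cells at or above threshold
    Th in the total attention level map, or None if no cell qualifies."""
    ys, xs = np.nonzero(total_map >= th)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```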
- The attention level is calculated according to the size and direction of an object and the distance along that direction. Note that when the face is turned to the front, it is difficult to estimate the direction of interest within the image from the face, so the attention level is calculated mainly using the direction of the human body.
- The attention level map Fh(i) derived from the i-th attention vector of a human body is computed under the following conventions: the non-object area is treated as effective only in the direction of greatest extent as seen from the object area within the entire image area, and the orientation and rotation direction of the human body are combined and converted into a single direction in the two-dimensional image.
- Otherwise, the direction of attention in the image can be estimated mainly from the face, so the attention level is calculated mainly using the face direction. With the number of face objects Q, object index p, face size fh_p, vertical distance from the face direction fd_p, and a constant fw_p for normalizing the image size and weighting the region size, the attention level map Ff(j) derived from the j-th attention vector is computed from these quantities. The attention level map Fa(x) of a person X is then obtained by fusing the face and body maps, with weight cw1 on the face and weight cw2 on the human body.
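- Only the weighted fusion is fully determined by these definitions; a compact reconstruction (the component maps Fh and Ff themselves appear as formula images in the original and are not reproduced here):

```latex
% Fused attention level map of person X at image position x,
% with weight cw1 on the face map Ff and cw2 on the body map Fh:
F_a(x) = cw_1 \, F_f(x) + cw_2 \, F_h(x)
```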
- The embodiments use detection information of persons, in particular faces and human bodies, as objects; however, anything that can be detected with high accuracy, for example pets such as dogs and cats, or general object recognition, can also be used as object information.
- It is also conceivable to change the weight for each object type, and to change the object types and weight values used for each image type.
- The attention region (and the attention level map) can be visualized and used as auxiliary information when the user selects a region.
- The attention region can also be set as the extraction target for feature amounts (for example, edge information, texture, luminance, and color information), and a more appropriate tag can be assigned using the extracted feature amounts. For example, if many green components are extracted from the attention region, a "green" tag or a natural-scenery tag with a high affinity for green can be attached. Likewise, if the building "tower" is extracted from region B, the attention region of image A, "tower" can be assigned as a tag.
- Embodiment 3 aims to enable more appropriate tag assignment by considering not only the degree of attention to objects in the shooting situation but also information about how the captured image is viewed (for example, a saliency map).
- The image information processing apparatus 100 includes a saliency map creation unit 36, a depth-of-field map creation unit 38, a detection content determination unit 40, and a total interest level map creation unit 42.
- The other functional blocks are the same as those already described.
- The saliency map creation unit 36 creates a saliency map, a map representing the level of human visual attention, i.e., which parts of an image draw human attention and which parts do not.
- The map is created by performing predetermined calculations based on the luminance components (intensity), color components (colors), and orientation components (orientations) of the input image.
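- As a stand-in sketch (the text describes an Itti-style computation from intensity, colors, and orientations; OpenCV's spectral-residual detector from opencv-contrib-python is substituted here purely for illustration):

```python
import cv2

def saliency_map(image_bgr):
    """Compute a saliency map by the spectral-residual method.

    Returns a float32 map in [0, 1], or None on failure. This is not
    the luminance/color/orientation model the text specifies.
    """
    detector = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal = detector.computeSaliency(image_bgr)
    return sal if ok else None
```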
- The depth-of-field map creation unit 38 creates a depth-of-field map indicating the depth of field across the image: which parts of the image are deep and which parts are shallow.
- The detection content determination unit 40 determines the content to be detected from the "type" (see FIG. 7) in the object information storage unit 18 or the attention vector information storage unit 20, or from the values of the total attention level map created by the attention level map creation unit 32.
- For portrait-system images, which are tagged with the person at the center, no further detection is performed.
- For landmark-system images, where there is an attention region in front of a person or an object in the background, the search is performed centering on building-type objects.
- For person-periphery images, the search is performed centering on objects that the person wears or holds, around the person.
- For target-system images, it is determined whether an object exists inside the attention region.
- In Embodiment 3, a total interest level map is created by multiplying the total attention level map described in Embodiment 2 by the saliency map (or the depth-of-field map), and a region (total region of interest) is set based on this total interest level map.
- First, the detection content determination unit 40 determines the detection content from the total attention level map of the image (S41).
- the saliency map creation unit 36 creates a saliency map (S42).
- The total interest level map creation unit 42 creates a total interest level map by multiplying the total attention level map created by the attention level map creation unit 32 by the saliency map (S43).
- The area setting unit 34 sets (extracts) a region at or above a threshold in the total interest level map as the total region of interest (S44).
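- A minimal sketch of S43-S44, assuming both maps are 2-D arrays on the same grid (normalizing each map first is an added assumption, so that neither map dominates the product):

```python
import numpy as np

def total_interest_region(total_attention, saliency, ta):
    """Multiply the two maps elementwise (S43), then return the bounding
    rectangle of cells at or above threshold Ta (S44), or None."""
    def norm(m):
        m = m.astype(float)
        return m / m.max() if m.max() > 0 else m
    interest = norm(total_attention) * norm(saliency)   # total interest map
    ys, xs = np.nonzero(interest >= ta)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```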
- FIG. 21 is a saliency map of the image A created by the saliency map creation unit 36.
- FIG. 22 shows the total interest level map created by the total interest level map creation unit 42 by multiplying the saliency map of FIG. 21 by the total attention level map.
- Since the saliency of the tower behind persons A and B is relatively high, and its value in the total attention level map is also high, the area near the tower has a particularly high total interest level in the total interest level map of FIG. 22.
- The area setting unit 34 sets, as the total region of interest, a rectangular region D containing the locations whose total interest level is equal to or higher than the threshold Ta.
- According to Embodiment 3, a more appropriate region can thus be set by using the saliency map, which indicates the portions that easily draw attention when a human views the image.
- The region D (total region of interest) just surrounds the tower; therefore, if region D is made the target of extraction of various features, the tower can be detected and tags related to the tower can be assigned.
- a depth-of-field map may be used instead of the saliency map.
- Since the depth of field reflects the intent with which the image was shot (how the degree of focus was adjusted, and so on), a more appropriate region setting can be expected.
- The total interest level map may also be calculated by combining three maps: total attention level map × saliency map × depth-of-field map.
- Depending on the type of image determined by the detection content determination unit 40, the type of visual-characteristic information or photographer-intention information used may be changed, or each type may be weighted.
- The saliency map is not limited to the type described above; any method that models human visual attention may be used.
- FIG. 23 is a functional block diagram of the image information processing apparatus 102. The same functional blocks as those in FIG. 1 are given the same reference numerals, and their description is omitted.
- the image information processing apparatus 102 includes a sorting unit 44.
- Embodiment 4 addresses images that contain many objects: the sorting unit 44 sorts the objects in an image into important objects and trivial objects, and trivial objects are regarded as noise and excluded from consideration in tag assignment.
- Sorting methods include the following two.
- (1) Method 1: only some persons among the plurality of persons are selected as important objects.
- (2) Method 2: some of the plurality of persons are grouped, and the grouped persons are selected as important objects.
- Image P is an image showing ten persons a to j.
- The solid-line arrows in the figure are the attention vectors corresponding to each person.
- In Method 1, the sorting unit 44 selects only the highly reliable persons among persons a to j.
- Reliability is determined based on the matching accuracy at the time of extraction as a person and on the occupation ratio of the person's region.
- Method 2 groups some of the persons and selects the grouped persons as important objects.
- First, the sorting unit 44 determines whether there are multiple person regions in the image (S51).
- the calculation unit 16 calculates the attention vector of each person (S52).
- The sorting unit 44 detects a polygon from the directions of the calculated attention vectors and groups the persons (the regions containing them) that constitute the polygon (S53).
- the sorting unit 44 sorts the grouped persons as important objects (S54).
- An example of the processing in step S53 is described with reference to FIG. 26.
- the image K in FIG. 26A is an image in which four persons P, Q, R, and S are shown from the left side.
- FIG. 26 (b) is a diagram showing four attention vectors when it is assumed that the image K is viewed from directly above.
- Based on the direction and magnitude of each attention vector, the sorting unit 44 detects a triangle from the attention vectors of persons P, R, and S, and groups the three persons P, R, and S that form the triangle. The sorting unit 44 then sorts persons P, R, and S as important objects and person Q as a trivial object.
- Grouping may instead be performed based on the similarity of the objects' attention vectors.
- For example, persons A and B, whose attention vectors both face front, may be grouped together, and persons C and D, whose vectors both face left, may be grouped together.
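- A Python sketch of such similarity-based grouping; the angular tolerance is an assumption, since the text only says the vectors are "both front" or "both left":

```python
def group_by_direction(directions_deg, tol=30.0):
    """Greedily group persons whose attention-vector directions are similar.

    directions_deg: one direction per person; returns lists of indices.
    """
    groups = []
    for i, d in enumerate(directions_deg):
        for g in groups:
            d0 = directions_deg[g[0]]
            if abs((d - d0 + 180) % 360 - 180) <= tol:  # smallest angular gap
                g.append(i)
                break
        else:
            groups.append([i])
    return groups

# Persons A, B face front (0 deg); persons C, D face left (-90 deg):
print(group_by_direction([0, 5, -88, -92]))   # -> [[0, 1], [2, 3]]
```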
- In Embodiment 5, a plurality of line segments are extracted from an image, and a convergence region is set in the direction in which the extracted line segments converge.
- The set convergence region can be used for various purposes, like the attention region of Embodiment 2.
- FIG. 28 is a functional block diagram of the image information processing apparatus 104. The same functional blocks as those in FIG. 1 are given the same reference numerals, and their description is omitted.
- the edge extraction unit 46 extracts a place where the shading of the image changes abruptly as an edge.
- The extracted edges can have any two-dimensional shape, such as a circle, a curve, or a line segment.
- The region setting unit 48 sets a convergence region on the side toward which the line segments extracted from the image converge.
- FIG. 29 is a flowchart showing the flow of region setting processing by the region setting unit 48.
- the region setting unit 48 acquires a line segment of the image from the edge extraction unit 46 (S61).
- The region setting unit 48 determines whether the acquired line segments have a certain convergence (S62).
- Convergence here means that the straight lines obtained by extending each line segment gather (converge) at a certain position.
- If so, the region setting unit 48 sets a region at the destination of the convergence direction (extracting it as the convergence region) (S63).
- In image L, which shows the Arc de Triomphe (FIG. 30(a)), axes (straight lines) obtained by extending the line segments extracted from the lane markings painted on the road and from the bus are considered, as shown in FIG. 30(b).
- These axes converge at a certain position (they gather at one position, and many of them intersect there).
- The region setting unit 48 therefore sets region E so as to surround this position.
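- Locating that position can be sketched as a least-squares problem over the axes obtained by extending the segments (a standard construction, offered here as an assumption rather than the patent's stated algorithm):

```python
import numpy as np

def convergence_point(segments):
    """Point minimizing the summed squared distance to every axis.

    segments: ((x1, y1), (x2, y2)) per line segment; each segment is
    extended to an infinite axis. Raises if the axes are all parallel.
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for (x1, y1), (x2, y2) in segments:
        d = np.array([x2 - x1, y2 - y1], dtype=float)
        d /= np.linalg.norm(d)
        P = np.eye(2) - np.outer(d, d)   # projector onto the axis normal
        A += P
        b += P @ np.array([x1, y1], dtype=float)
    return np.linalg.solve(A, b)         # (x, y) around which region E is set
```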
- A main line segment may first be selected from among the acquired line segments.
- the two-dimensional shape used for setting the region is not limited to a line segment.
- The region setting unit 48 may, for example, set the closed region inside an extracted ellipse as a region-setting target.
- A plurality of elements constituting one image may also be classified into different types by using differences in convergence direction.
- The two-dimensional shape extraction method is not limited to the method using edges; other general methods can also be used.
- The convergence region may be set using not only edge components but also texture, luminance, color information, and the like.
- In Embodiment 6, a more detailed index is set for each tagged image.
- The set index can be used for analyzing and evaluating individual images and for image retrieval.
- For an image with a person-centered tag (portrait, person periphery), the object extraction unit 14 executes person recognition processing (for example, processing that extracts a face region from the image and recognizes the face) to identify the persons in the image.
- The calculation unit 16 determines the identified person types (person index types) and the appearance frequency of each type, and the assignment unit 24 sets the results as an index.
- the calculation unit 16 calculates a region of interest and its degree (including the size of the region and the intensity of attention).
- The object extraction unit 14 recognizes objects within the attention region, and the assignment unit 24 sets information indicating the presence or absence of the recognized objects and their types as an index.
- The assignment unit 24 likewise sets information indicating the type of scenery, the appearance frequency of each type, and the results of object recognition as an index.
- Embodiment 7 supports the generation of albums and slide shows for a group of tagged images (see FIG. 33).
- the image information processing apparatus 106 includes a template storage unit 52 and a generation unit 54.
- Other functional blocks are the same as those of the image information processing apparatus 10 in FIG.
- The generation unit 54 generates albums and slide shows using the album and slide-show templates stored in the template storage unit 52.
- the template storage unit 52 stores an album layout 52a and a table 52b.
- The layout 52a indicates the arrangement of five frames, frame a through frame e.
- The table 52b shows the correspondence between the frames of the layout 52a and tags.
- Based on the layout 52a and the table 52b, the generation unit 54 generates an album by inserting into each frame one image bearing the tag that corresponds to that frame.
- An example of the generated album is shown in the figure.
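- A minimal sketch of this frame-filling step; all tag names and image identifiers are illustrative:

```python
def generate_album(layout_frames, frame_to_tag, tagged_images):
    """Insert one image with the matching tag into each frame.

    layout_frames: frame names in layout order, e.g. ["a", "b", "c", "d", "e"]
    frame_to_tag:  table 52b, mapping frame name -> required tag
    tagged_images: mapping tag -> list of image identifiers
    Picks the first candidate per frame; index-based scoring
    (Embodiment 6) could replace this choice.
    """
    return {frame: (tagged_images.get(frame_to_tag[frame]) or [None])[0]
            for frame in layout_frames}

# Hypothetical example values (not from the patent):
album = generate_album(
    ["a", "b", "c"],
    {"a": "landscape", "b": "landmark", "c": "portrait"},
    {"landmark": ["image A"], "portrait": ["image B"], "landscape": []},
)
print(album)   # -> {'a': None, 'b': 'image A', 'c': 'image B'}
```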
- The image may be selected based on input received from the user, or the score of each image may be calculated from the index set for each image (see Embodiment 6) and, for example, the highest-scoring image selected automatically.
- Since the generation uses the tags attached to the images, an album can be created in which people, landscapes, landmarks, and the like are arranged in a balanced manner.
- Templates for a plurality of album types may be prepared in the template storage unit 52, and the generation unit 54 may automatically select the template corresponding to the tag types of the images to be inserted into the album frames (or recommend a selection to the user).
- For example, the generation unit 54 can select a landscape template from among the template types stored in the template storage unit 52.
- The generation unit 54 may also set the frame itself and the decoration around it according to the tag of the image to be inserted; information about these decorations may be included in the album template.
- For the frame itself, its size, shape, and decoration can be varied.
- For the decoration around the frame, a tag name, a symbol indicating the tag type, an icon indicating the tag, and the like can be used.
- For example, when creating an album, the generation unit 54 may make the shape of frame c elliptical for portraits, set frame c as a portrait frame, or display the character string "portrait", the name of the tag, around frame c.
- (2) Generation of a slide show. The generation unit 54 generates a slide show using the person regions and attention regions in the images.
- An example is shown in FIG. 36: an action pattern is set that zooms into the person region or the attention region of image D, or pans from the person region to the attention region.
- The action pattern is not limited to the example of FIG. 36; various patterns used in general slide-show creation applications and presentation applications can be set.
- the “action pattern” is sometimes called “animation” or “visual effect”.
- A plurality of slide-show template types may be prepared in the template storage unit 52, and the generation unit 54 may automatically select the template corresponding to the tag types of the images used in the slide show (or recommend a selection to the user).
- For example, the generation unit 54 can select a slide-show template including pan/slide and zoom from among the template types stored in the template storage unit 52.
- The "type" item in the object information storage unit 18 and the attention vector information storage unit 20 can be used as follows.
- The attention vector of the human body may be weighted more heavily, because the body orientation often indirectly represents the target of attention.
- Alternatively, the vector values may be used with greater weight on the attention vector of the face.
- Basic attribute information may be extracted from the image, and a tag attached using the extracted attribute information.
- Attribute information includes, for example, EXIF (Exchangeable Image File Format) information.
- Information defined by EXIF, such as the shooting date and time, GPS (Global Positioning System) information, shooting mode information, and the camera parameters at the time of shooting, can be used.
- For example, the assignment conditions of the assignment unit 24 may be changed so that a natural-landscape tag is assigned more easily.
- Basic low-order feature amounts of the image, such as edges, colors, and textures, may also be extracted.
- The "basic feature amounts representing the change characteristics of the image" are the luminance information, color information, direction information, edge information, and texture information of the image.
- "Camera parameter information" includes focus area information, depth-of-field information, date/time information, location information, shutter speed, sensitivity, white balance, flash information, and the like.
- Tags with a high affinity for night include, for example, night view, party, and fireworks.
- Existing model data may cover general objects such as dogs, cats, and cars, and landscape scenes such as the sea and mountains.
- The assignment unit 24 may then attach a tag using a model determined to match in the determination process.
- The various regions in FIGS. 3, 6, 10, and 11 have been described as rectangular, but the shape of a region is not limited to a rectangle; it can also be a circle, an ellipse, or a polygon. A region may also be set in units of image pixels, without restricting its shape.
- In the embodiments, the assignment unit 24 attaches one tag to one image, but a plurality of tags may be attached to a single image.
- In the embodiments, the objects to be extracted are persons, but the objects are not limited to persons.
- An object may be a pet (living body) such as a dog or a cat, or an object such as a flower, a building, or a car.
- Any object that can be detected with a certain degree of accuracy can be extracted.
- HOG (Histograms of Oriented Gradients)
- SIFT (Scale-Invariant Feature Transform)
- Reference 1: Hironobu Fujiyoshi, "Gradient-based feature extraction: SIFT and HOG," IPSJ SIG Technical Report CVIM 160, pp. 211-224, 2007.
- For example, the small vectors V_{O5,O6} and V_{O7,O8} may be excluded from consideration in the determination of step S14.
- Alternatively, the face vectors (V_{O5,O6} and V_{O1,O2}) may be combined, and whether the combined vector faces front may be determined. In short, when there are multiple vector components in an image, it suffices to compute a vector for the image as a whole by combining the various components.
- Each functional block in FIG. 1 and the like may be realized as an LSI integrated circuit. Each block may be made into an individual chip, or a single chip may include some or all of the blocks. Although referred to as LSI here, it may be called IC, system LSI, super LSI, or ultra LSI depending on the degree of integration. The method of circuit integration is not limited to LSI; implementation with dedicated circuitry or a general-purpose processor is also possible. An FPGA (Field Programmable Gate Array) that can be programmed after LSI manufacture, or a reconfigurable processor whose internal circuit-cell connections and settings can be reconfigured, may also be used. Furthermore, if integrated-circuit technology that replaces LSI emerges from advances in semiconductor technology or derivative technologies, the functional blocks may naturally be integrated using that technology.
- A control program composed of program code for causing processors of various information processing apparatuses, and various circuits connected to those processors, to execute the operations described in the embodiments can be recorded on recording media or distributed via various communication channels.
- Such recording media include non-transitory recording media such as IC cards, hard disks, optical discs, flexible disks, and ROMs.
- The distributed control program is used by being stored in a processor-readable memory or the like, and the various functions described in the embodiments are realized by the processor executing the control program.
- ⁇ Supplement 2> The present embodiment includes the following aspects.
- An image information processing apparatus includes an extraction unit that extracts an object from an image, a calculation unit that calculates the direction in which the extracted object faces, and an assignment unit that attaches a tag to the image according to the calculated direction.
- The calculation unit may calculate the magnitude of the proportion that the extracted object occupies in the image, and the assignment unit may attach a tag to the image based on the calculated direction or magnitude.
- In this way, the assignment unit attaches a tag based on the calculated direction or magnitude, which contributes to assigning a tag that reflects the calculated magnitude.
- The extraction unit may extract, as the object, a region containing a person's face or body from the image; the calculation unit may calculate the direction based on the orientation or rotation direction of the person's face or body in the extracted region, and calculate the magnitude based on the proportion of the image occupied by the face or body in the extracted region.
- The extraction unit may extract a plurality of objects from the image; the calculation unit may calculate, for each extracted object, a vector consisting of the direction in which that object is attending and the magnitude of the proportion it occupies in the image, and combine the calculated vectors to compute a vector for the entire image; and the assignment unit may attach a tag to the image based on the direction or magnitude of the computed vector for the entire image.
- The assignment unit may attach a first tag indicating a portrait if the direction of the vector for the entire image is front, and a second tag different from the first tag otherwise.
- In this way, the first tag indicating a portrait, or the second tag different from it, can be assigned according to the vector direction of the entire image.
- The assignment unit may attach a tag indicating that a person is the focus of attention when the vector magnitude of the entire image is larger than a predetermined value, and a tag indicating that the background is the focus of attention when it is not.
- The extraction unit may extract a plurality of objects from the image, extracting person regions each containing a face and a body as the objects, and the assignment unit may vary the tag to be attached depending on whether the number of objects extracted by the extraction unit is one or more than one.
- The apparatus may include creation means for creating, on the image, a first map indicating the degree to which the object is attending, based on the calculated direction and magnitude, and setting means for setting, in the created first map, a region containing locations where the degree is equal to or greater than a predetermined value.
- The creation means may create a second map indicating the degree of human visual attention in the image and then create a total map combining the attention degree of the first map with the visual attention degree of the second map, and the setting means may set a region containing locations where the degree in the created total map is equal to or greater than a predetermined value.
- the second map may be a saliency map based on the color, brightness, and direction of the image.
- The creation means may create a third map indicating the depth of field in the image and then create a total map combining the attention degree of the first map with the depth of field of the third map, and the setting means may set a region containing locations where the degree in the created total map is equal to or greater than a predetermined value.
- The extraction unit may extract from the image, as the objects, a plurality of regions each containing a person; sorting means may select some of the extracted regions as the regions to be used for tag assignment; and the assignment unit may attach a tag based on the direction a person faces in those regions or on the proportion of the image the person occupies.
- The sorting means may group two or more regions from among the plurality of regions based on the direction each person in the extracted regions faces, and select the regions constituting the group as the partial regions.
- The extraction unit may extract a plurality of line segments from the image, and setting means may be provided that sets, for the image, a region in the direction in which the extracted line segments converge.
- The setting means may define a plurality of axes obtained by extending the extracted line segments and set the region so as to surround the position where the axes intersect.
- A tagging method including an extraction step of extracting an object from an image, a calculation step of calculating the direction in which the extracted object faces, and an assignment step of attaching a tag to the image according to the calculated direction is also possible.
- A program causing a computer to execute tagging processing including such extraction, calculation, and assignment steps is also possible.
- An integrated circuit including an extraction unit that extracts an object from an image, a calculation unit that calculates the direction in which the extracted object faces, and an assignment unit that attaches a tag to the image according to the calculated direction is also possible.
- The image information processing apparatus according to the present invention is useful because it can attach classification tags to images.
- Reference signs: 10, 100, 102, 104, 106 image information processing apparatus; 12 image storage unit; 14 object extraction unit; 16 calculation unit; 18 object information storage unit; 20 attention vector information storage unit; 22 assignment condition storage unit; 24 assignment unit; 32 attention level map creation unit; 34 area setting unit; 36 saliency map creation unit; 38 depth-of-field map creation unit; 40 detection content determination unit; 42 total interest level map creation unit; 44 sorting unit; 46 edge extraction unit; 48 region setting unit
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Library & Information Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Processing Or Creating Images (AREA)
- Image Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (18)
- An image information processing apparatus comprising: extraction means for extracting an object from an image; calculation means for calculating a direction in which the extracted object faces; and assignment means for attaching a tag to the image according to the calculated direction.
- The image information processing apparatus according to claim 1, wherein the calculation means calculates the magnitude of the proportion that the extracted object occupies in the image, and the assignment means attaches a tag to the image based on the calculated direction or magnitude.
- The image information processing apparatus according to claim 2, wherein the extraction means extracts, as the object, a region containing a person's face or body from the image, and the calculation means calculates the direction based on the orientation or rotation direction of the person's face or body in the extracted region and calculates the magnitude based on the proportion that the person's face or body in the extracted region occupies in the image.
- The image information processing apparatus according to claim 2, wherein the extraction means extracts a plurality of objects from the image; the calculation means calculates, for each extracted object, a vector consisting of the direction in which that object is attending and the magnitude of the proportion it occupies in the image, and combines the calculated vectors to compute a vector for the entire image; and the assignment means attaches a tag to the image based on the direction or magnitude of the computed vector for the entire image.
- The image information processing apparatus according to claim 4, wherein the assignment means attaches a first tag indicating a portrait if the direction of the vector for the entire image is front, and attaches a second tag different from the first tag otherwise.
- The image information processing apparatus according to claim 4, wherein the assignment means attaches a tag indicating that a person is the focus of attention if the magnitude of the vector for the entire image is larger than a predetermined value, and attaches a tag indicating that the background is the focus of attention if it is equal to or smaller than the predetermined value.
- The information processing apparatus according to claim 4, wherein the extraction means extracts a plurality of objects from the image, the extraction means extracts, as the objects, person regions each containing a face and a body from the image, and the assignment means varies the tag to be attached depending on whether the number of objects extracted by the extraction means is one or more than one.
- The image information processing apparatus according to claim 2, further comprising: creation means for creating, on the image, a first map indicating the degree to which the object is attending, based on the calculated direction and magnitude; and setting means for setting, in the created first map, a region containing locations where the degree is equal to or greater than a predetermined value.
- The image information processing apparatus according to claim 8, wherein the creation means creates a second map indicating the degree of human visual attention in the image and then creates a total map indicating the combined degree of the attention degree in the first map and the visual attention degree in the second map, and the setting means sets a region containing locations where the degree in the created total map is equal to or greater than a predetermined value.
- The image information processing apparatus according to claim 9, wherein the second map is a saliency map based on the color, luminance, and directionality of the image.
- The image information processing apparatus according to claim 8, wherein the creation means creates a third map indicating the depth of field in the image and then creates a total map indicating the combined degree of the attention degree in the first map and the depth of field in the third map, and the setting means sets a region containing locations where the degree in the created total map is equal to or greater than a predetermined value.
- The image information processing apparatus according to claim 2, wherein the extraction means extracts from the image, as the objects, a plurality of regions each containing a person; the apparatus comprises sorting means for selecting, from among the extracted regions, some regions as the regions to be used for tag assignment; and the assignment means attaches a tag based on the direction in which a person faces in the selected regions or the proportion that the person occupies in the image.
- The image information processing apparatus according to claim 12, wherein the sorting means groups two or more regions from among the plurality of regions based on the direction each person in the extracted regions faces, and selects the regions constituting the group as the partial regions.
- The image information processing apparatus according to claim 1, wherein the extraction means extracts a plurality of line segments from the image, and the apparatus comprises setting means for setting, for the image, a region in the direction in which the extracted line segments converge.
- The image information processing apparatus according to claim 14, wherein the setting means defines a plurality of axes obtained by extending the extracted line segments and sets the region so as to surround the position where the axes intersect.
- A tagging method comprising: an extraction step of extracting an object from an image; a calculation step of calculating a direction in which the extracted object faces; and an assignment step of attaching a tag to the image according to the calculated direction.
- A program causing a computer to execute tagging processing comprising: an extraction step of extracting an object from an image; a calculation step of calculating a direction in which the extracted object faces; and an assignment step of attaching a tag to the image according to the calculated direction.
- An integrated circuit comprising: extraction means for extracting an object from an image; calculation means for calculating a direction in which the extracted object faces; and assignment means for attaching a tag to the image according to the calculated direction.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/696,662 US8908976B2 (en) | 2010-05-26 | 2011-04-15 | Image information processing apparatus |
CN201180025428.6A CN102906790B (zh) | 2010-05-26 | 2011-04-15 | 图像信息处理装置 |
JP2012517103A JP5837484B2 (ja) | 2010-05-26 | 2011-04-15 | 画像情報処理装置 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010120613 | 2010-05-26 | ||
JP2010-120613 | 2010-05-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011148562A1 (ja) | 2011-12-01 |
Family
ID=45003563
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/002235 WO2011148562A1 (ja) | 2010-05-26 | 2011-04-15 | 画像情報処理装置 |
Country Status (4)
Country | Link |
---|---|
US (1) | US8908976B2 (ja) |
JP (1) | JP5837484B2 (ja) |
CN (1) | CN102906790B (ja) |
WO (1) | WO2011148562A1 (ja) |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5389724B2 (ja) * | 2010-03-31 | 2014-01-15 | 富士フイルム株式会社 | Image processing apparatus, image processing method, and program |
US8908976B2 (en) * | 2010-05-26 | 2014-12-09 | Panasonic Intellectual Property Corporation Of America | Image information processing apparatus |
TWI459310B (zh) * | 2011-12-30 | 2014-11-01 | Altek Corp | Image capture device capable of simplifying its image feature value set, and control method thereof |
CN104284055A (zh) * | 2013-07-01 | 2015-01-14 | 索尼公司 | Image processing method and apparatus, and electronic device |
JP6271917B2 (ja) * | 2013-09-06 | 2018-01-31 | キヤノン株式会社 | Image recording apparatus and imaging apparatus |
US20150154466A1 (en) * | 2013-11-29 | 2015-06-04 | Htc Corporation | Mobile device and image processing method thereof |
CN104899820B (zh) * | 2014-03-11 | 2018-11-20 | 腾讯科技(北京)有限公司 | Method, system, and apparatus for adding tags to an image |
US9773156B2 (en) * | 2014-04-29 | 2017-09-26 | Microsoft Technology Licensing, Llc | Grouping and ranking images based on facial recognition data |
CN105096299B (zh) * | 2014-05-08 | 2019-02-26 | 北京大学 | Polygon detection method and polygon detection apparatus |
KR102330322B1 (ko) * | 2014-09-16 | 2021-11-24 | 삼성전자주식회사 | Image feature extraction method and apparatus |
CN105808542B (zh) * | 2014-12-29 | 2019-12-24 | 联想(北京)有限公司 | Information processing method and information processing apparatus |
JP2016191845A (ja) * | 2015-03-31 | 2016-11-10 | ソニー株式会社 | Information processing apparatus, information processing method, and program |
CN105306678A (zh) * | 2015-09-14 | 2016-02-03 | 联想(北京)有限公司 | Information processing method and electronic device |
CN108229519B (zh) * | 2017-02-17 | 2020-09-04 | 北京市商汤科技开发有限公司 | Image classification method, apparatus, and system |
CN107343189B (zh) * | 2017-07-10 | 2019-06-21 | Oppo广东移动通信有限公司 | White balance processing method and apparatus |
CN107392982A (zh) * | 2017-07-27 | 2017-11-24 | 深圳章鱼信息科技有限公司 | Online design method, apparatus, and system |
US10984536B2 (en) * | 2018-01-25 | 2021-04-20 | Emza Visual Sense Ltd | Motion detection in digital images and a communication method of the results thereof |
CN108399381B (zh) * | 2018-02-12 | 2020-10-30 | 北京市商汤科技开发有限公司 | Pedestrian re-identification method, apparatus, electronic device, and storage medium |
US11373407B2 (en) * | 2019-10-25 | 2022-06-28 | International Business Machines Corporation | Attention generation |
US11450021B2 (en) * | 2019-12-30 | 2022-09-20 | Sensetime International Pte. Ltd. | Image processing method and apparatus, electronic device, and storage medium |
US11381730B2 (en) * | 2020-06-25 | 2022-07-05 | Qualcomm Incorporated | Feature-based image autofocus |
US11790665B2 (en) * | 2021-04-29 | 2023-10-17 | Hitachi Astemo, Ltd. | Data driven dynamically reconfigured disparity map |
CN114881995A (zh) * | 2022-05-25 | 2022-08-09 | 广州市奥威亚电子科技有限公司 | Face image quality assessment method and apparatus, electronic device, and storage medium |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003087815A (ja) | 2001-09-06 | 2003-03-20 | Canon Inc | Image processing apparatus, image processing system, image processing method, storage medium, and program |
US8593542B2 (en) * | 2005-12-27 | 2013-11-26 | DigitalOptics Corporation Europe Limited | Foreground/background separation using reference images |
US8369570B2 (en) * | 2005-09-28 | 2013-02-05 | Facedouble, Inc. | Method and system for tagging an image of an individual in a plurality of photos |
US8265349B2 (en) | 2006-02-07 | 2012-09-11 | Qualcomm Incorporated | Intra-mode region-of-interest video object segmentation |
US8027541B2 (en) * | 2007-03-15 | 2011-09-27 | Microsoft Corporation | Image organization based on image content |
JP4798042B2 (ja) | 2007-03-29 | 2011-10-19 | オムロン株式会社 | Face detection device, face detection method, and face detection program |
JP4983643B2 (ja) | 2008-02-22 | 2012-07-25 | 株式会社ニコン | Imaging device and correction program |
JP2009290255A (ja) | 2008-05-27 | 2009-12-10 | Sony Corp | Imaging apparatus, imaging apparatus control method, and computer program |
US8477207B2 (en) | 2008-06-06 | 2013-07-02 | Sony Corporation | Image capturing apparatus, image capturing method, and computer program |
JP5251547B2 (ja) | 2008-06-06 | 2013-07-31 | ソニー株式会社 | Image capturing apparatus, image capturing method, and computer program |
JP5093031B2 (ja) | 2008-09-29 | 2012-12-05 | カシオ計算機株式会社 | Imaging apparatus and program |
JP4849163B2 (ja) | 2009-09-29 | 2012-01-11 | ソニー株式会社 | Image processing apparatus, image processing method, and computer program |
JP4968346B2 (ja) | 2010-01-20 | 2012-07-04 | カシオ計算機株式会社 | Imaging apparatus, image detection apparatus, and program |
US8908976B2 (en) * | 2010-05-26 | 2014-12-09 | Panasonic Intellectual Property Corporation Of America | Image information processing apparatus |
2011
- 2011-04-15 US US13/696,662 patent/US8908976B2/en active Active
- 2011-04-15 CN CN201180025428.6A patent/CN102906790B/zh active Active
- 2011-04-15 WO PCT/JP2011/002235 patent/WO2011148562A1/ja active Application Filing
- 2011-04-15 JP JP2012517103A patent/JP5837484B2/ja active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002245471A (ja) * | 2000-12-07 | 2002-08-30 | Eastman Kodak Co | Double-print photofinishing service including a second print modified based on subject content |
JP2004297305A (ja) * | 2003-03-26 | 2004-10-21 | Sharp Corp | Database construction apparatus, database construction program, image retrieval apparatus, image retrieval program, and image recording/reproducing apparatus |
WO2006082979A1 (ja) * | 2005-02-07 | 2006-08-10 | Matsushita Electric Industrial Co., Ltd. | Image processing apparatus and image processing method |
JP2006350552A (ja) * | 2005-06-14 | 2006-12-28 | Canon Inc | Image data retrieval apparatus |
JP2007041987A (ja) * | 2005-08-05 | 2007-02-15 | Sony Corp | Image processing apparatus and method, and program |
Non-Patent Citations (1)
Title |
---|
LAURENT ITTI ET AL.: "A saliency-based search mechanism for overt and covert shifts of visual attention", VISION RESEARCH, vol. 40, no. 10-12, June 2000 (2000-06-01), pages 1489 - 1506 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013254302A (ja) * | 2012-06-06 | 2013-12-19 | Sony Corp | Image processing apparatus, image processing method, and program |
CN103473737A (zh) * | 2012-06-06 | 2013-12-25 | 索尼公司 | 图像处理装置、图像处理方法和程序 |
US9633443B2 (en) | 2012-06-06 | 2017-04-25 | Sony Corporation | Image processing device, image processing method, and program for cutting out a cut-out image from an input image |
CN103473737B (zh) * | 2012-06-06 | 2017-08-15 | 索尼公司 | 图像处理装置、图像处理方法和程序 |
US11410461B2 (en) | 2018-12-04 | 2022-08-09 | Nec Corporation | Information processing system, method for managing object to be authenticated, and program |
US11915519B2 (en) | 2018-12-04 | 2024-02-27 | Nec Corporation | Information processing system, method for managing object to be authenticated, and program |
Also Published As
Publication number | Publication date |
---|---|
JPWO2011148562A1 (ja) | 2013-07-25 |
CN102906790A (zh) | 2013-01-30 |
JP5837484B2 (ja) | 2015-12-24 |
CN102906790B (zh) | 2015-10-07 |
US8908976B2 (en) | 2014-12-09 |
US20130058579A1 (en) | 2013-03-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5837484B2 (ja) | Image information processing apparatus | |
JP5934653B2 (ja) | Image classification apparatus, image classification method, program, recording medium, integrated circuit, and model creation apparatus | |
JP5782404B2 (ja) | Image quality evaluation | |
Su et al. | Preference-aware view recommendation system for scenic photos based on bag-of-aesthetics-preserving features | |
US20130326417A1 (en) | Textual attribute-based image categorization and search | |
JP5525757B2 (ja) | Image processing apparatus, electronic device, and program | |
US9626585B2 (en) | Composition modeling for photo retrieval through geometric image segmentation | |
CN106560809A (zh) | Modifying at least one attribute of an image with at least one attribute extracted from another image | |
JP5018614B2 (ja) | Image processing method, program for executing the method, storage medium, imaging device, and image processing system | |
JP2011517818A (ja) | Image classification using capture-position sequence information | |
JP2012530287A (ja) | Method and apparatus for selecting a representative image | |
CN107836109A (zh) | Method for an electronic device to automatically focus on a region of interest | |
Farinella et al. | Scene classification in compressed and constrained domain | |
CN112215964A (zh) | AR-based scene tour method and device | |
Qian et al. | POI summarization by aesthetics evaluation from crowd source social media | |
Kim et al. | Classification and indexing scheme of large-scale image repository for spatio-temporal landmark recognition | |
CN111309957A (zh) | Method for automatically generating a travel-album MV | |
Dong et al. | Effective and efficient photo quality assessment | |
JP6586402B2 (ja) | 画像分類装置、画像分類方法及びプログラム | |
WO2022266878A1 (zh) | Scene type determination method and apparatus, and computer-readable storage medium | |
EP3152701A1 (en) | Method of and system for determining and selecting media representing event diversity | |
Wang et al. | Online photography assistance by exploring geo-referenced photos on MID/UMPC | |
Christodoulakis et al. | Contextual Geospatial Picture Understanding, Management and Visualization | |
Souza et al. | Generating an Album with the Best Media Using Computer Vision | |
Khan et al. | Various Image Classification Using Certain Exchangeable Image File Format (EXIF) Metadata of Images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201180025428.6 Country of ref document: CN |
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11786266 Country of ref document: EP Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase |
Ref document number: 2012517103 Country of ref document: JP Ref document number: 13696662 Country of ref document: US |
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 11786266 Country of ref document: EP Kind code of ref document: A1 |