WO2012147256A1 - Image processing device - Google Patents
Image processing device
- Publication number
- WO2012147256A1 (PCT/JP2012/001392)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- subject
- information
- shooting
- unit
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/765—Interface circuits between an apparatus for recording and another apparatus
- H04N5/77—Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera
- H04N5/772—Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera the recording apparatus and the television camera being placed in the same enclosure
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/91—Television signal processing therefor
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/80—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
- H04N9/82—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only
- H04N9/8205—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only involving the multiplexing of an additional signal and the colour video signal
Definitions
- the present invention relates to an image processing apparatus that tags each image included in an image group held by a user.
- image organization support techniques that automatically classify and tag the subjects included in each image of an image group held by the user have attracted attention as a way to search for the user's desired image efficiently.
- in one known method, an image recognition engine and a model dictionary are prepared for each of a plurality of themes so that information related to an image can be acquired easily and quickly; when the user specifies a theme, the corresponding model dictionary and recognition engine are set.
- this method can assign an appropriate tag to an object in the target image and effectively extract the related information about the object that the user wants to know (see Patent Document 1).
- in Patent Document 1, however, when tagging a plurality of images, the user must specify the single image recognition engine and model dictionary to be used for all the images to be processed. As the number of images to be processed increases, specifying the image recognition engine and model dictionary used for the tagging process becomes more difficult and only increases the burden on the user.
- an object of the present invention is therefore to provide an image processing device, a processing method, a computer program, and an integrated circuit that reduce the burden on the user when associating images with subjects.
- the present invention is an image processing apparatus comprising: subject information storage means for storing in advance, for each of a plurality of events, a subject that can be included in an image shot at the event, in correspondence with a shooting attribute indicating a shooting condition estimated to be satisfied when shooting an image related to the event; extraction means for extracting, from an image group consisting of shot images, a shooting attribute common to a predetermined number of the images; specifying means for specifying the subject stored in the subject information storage means in association with the event corresponding to the extracted shooting attribute; and association means for associating, with each of the plurality of images included in the image group, the subject specified by the specifying means.
- with this configuration, the image processing apparatus identifies, from the event associated with the shooting attributes of the image group, the subjects that can be included in images shot at that event, and associates them with each image. This eliminates the need for the user to specify the subjects to be used for the association, reducing the user's burden during the association process between images and subjects.
- FIG. 1 is a block diagram showing the configuration of an image classification device (image processing device) 1.
- a diagram showing an example of the data structure of metadata information table T1.
- a diagram showing an example of the image features extracted by the image feature amount calculation means 12.
- a block diagram showing the configuration of the common attribute extraction means 13.
- a diagram showing an example of the shooting units extracted by the shooting unit extraction means 132.
- a diagram showing an example of the data structure of table T10, which stores the model information (feature amounts) for each object category.
- a diagram showing the configuration of an image classification system 4000.
- a block diagram showing the configuration of an image classification device 4100.
- a block diagram showing the configuration of a server device 4500.
- a flowchart showing the model information transmission process.
- a diagram showing the configuration of an image classification system 4000A.
- a block diagram showing the configuration of an image classification device 4100A.
- a block diagram showing the configuration of a terminal device 4600.
- a diagram showing an example of the data structure of table T100, which contains diversity information.
- Embodiment 1: Embodiments of the present invention will be described below with reference to the drawings.
- the first embodiment relates to a mechanism, in an image classification apparatus that organizes an image group composed of a user's many local images and moving image data (such as in the home), for automatically classifying the subject objects (objects) included in each image by using attributes common to the image group.
- FIG. 1 is a block diagram showing a configuration of an image classification device (image processing device) 1.
- the image classification device 1 includes local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 14, classification dictionary creation means 15, classification model information storage means 16, and image attribute information storage means 17.
- the local data storage means 11 is a recording medium that stores file data held by a certain limited set of users, such as a household; for example, family photographic images and moving image data.
- the local data storage means 11 is a storage device such as a large-capacity media disc (for example, an HDD or DVD) or a semiconductor memory.
- image metadata information includes, for example, shooting date and time information, shooting location information such as GPS (Global Positioning System) information, shooting mode information, and various other shooting method information included in EXIF (Exchangeable Image File Format) data; these are camera parameters at the time of shooting.
- the metadata information included in the metadata information table T1 is associated with each image data number that is an identifier for uniquely identifying an image.
- the metadata information includes a file name; shooting time information indicating the shooting time; longitude and latitude information obtained from GPS information as geographical position information at the time of shooting; ISO (International Organization for Standardization) sensitivity information for adjusting brightness at the time of shooting; exposure information for adjusting brightness so that the image can be viewed properly; and camera parameter information such as white balance (WB) information for adjusting the color balance during shooting.
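- as an illustrative aside (not part of the patent), metadata of this kind can be read from a file's EXIF data; the following is a minimal sketch using the Pillow library, where the file name and the selection of tags are assumptions:

```python
# A minimal sketch of reading the EXIF shooting metadata described above,
# using Pillow. Which tags are actually present depends on the camera.
from PIL import Image
from PIL.ExifTags import TAGS

def read_shooting_metadata(path):
    """Return EXIF tags (e.g. DateTime, ISOSpeedRatings, WhiteBalance) as a dict."""
    exif = Image.open(path).getexif()
    return {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}

# usage (hypothetical file name):
# meta = read_shooting_metadata("IMG_0001.jpg")
# print(meta.get("DateTime"), meta.get("ISOSpeedRatings"))
```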
- alternatively, image feature amount information calculated by image analysis in the image feature amount calculation means 12 may be used as image information.
- the image feature amount calculation means 12 calculates, as image features, high-order feature quantities specific to an object from basic low-order feature quantities of an image such as edges, colors, and textures.
- the high-order feature amount is, for example, a local feature amount such as SIFT (Scale-Invariant Feature Transform) that represents a feature of a local region around a characteristic point, or an HOG (Histogram of oriented gradient) that represents a shape feature of an object.
- alternatively, a feature amount specific to recognizing a subject object (object) in the image, such as a face or a person, may be calculated.
- a face detection device in practical use is described in a patent document (Japanese Patent Laid-Open No. 2008-250444); details of human body detection and general object detection are described in "Gradient-based feature extraction - SIFT and HOG -" (Information Processing Society of Japan Research Report CVIM 160, pp. 211-224, 2007).
- as the image feature amount, it is conceivable to calculate a high-dimensional feature amount capable of expressing the features of the subject object from low-dimensional feature amounts, such as color or texture information, that are the image's basic feature information.
- the image feature amount is associated with each image data number, and includes color 1, 2, edge 1, 2, local 1, 2, face, number of faces, and the like.
- colors 1 and 2 are color information of the image, calculated as in-image statistics from RGB values. Note that the color information may instead be hue information converted into the HSV or YUV color space, or statistical information such as a color histogram or color moments.
- edges 1 and 2 are texture information: statistics computed within the image, at fixed angles, from line-segment features detected in the image.
- locals 1 and 2 are high-dimensional features that represent the features of a local region around a characteristic point or the shape of an object; concrete examples include SIFT, SURF, and HOG (see the sketch after this list).
- the face indicates the presence or absence of a face in the image based on face information obtained from the face detection technique or the like, and the number of faces indicates the number of faces detected by the face detection technique.
- image recognition information related to a person, obtained from the size of the face, the color and shape of clothes, and human detection technology, may also be used as an image feature.
- image recognition techniques such as car detection and pet detection (for example, dogs and cats) may likewise be used.
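- as a hedged illustration of the feature table above (colors, edges, locals), the following sketch computes comparable statistics with OpenCV; the bin counts, thresholds, and window size are assumptions, not values from the patent:

```python
# Sketch: low-order color/edge statistics and a high-order HOG descriptor,
# roughly corresponding to the color 1,2 / edge 1,2 / local 1,2 fields above.
import cv2

def image_features(path):
    img = cv2.imread(path)                      # BGR image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # color: a joint 8x8x8 histogram over the three channels
    color_hist = cv2.calcHist([img], [0, 1, 2], None, [8, 8, 8],
                              [0, 256, 0, 256, 0, 256]).flatten()
    # edge: mean intensity of a Canny edge map as a simple texture statistic
    edge_stat = cv2.Canny(gray, 100, 200).mean()
    # local: HOG over a fixed 64x128 window as a shape descriptor
    hog_vec = cv2.HOGDescriptor().compute(cv2.resize(gray, (64, 128))).flatten()
    return color_hist, edge_stat, hog_vec
```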
- the common attribute extraction unit 13 extracts one or more common attributes from the image group including a plurality of images stored in the local data storage unit 11.
- the common attribute extracting unit 13 acquires common metadata information and tag information as common attributes by using metadata information and tag information directly assigned by the user when extracting common attributes from the image group. Further, the common attribute information may be extracted using personal information detectable by face detection or human body detection technology, photographing reason information provided by the user, or the like.
- the common attribute extraction unit 13 includes an image information extraction unit 131, a photographing unit extraction unit 132, and a common attribute determination unit 133.
- Image information extraction means 131 acquires metadata information and tag information as image information from each of the images included in the image group to be classified.
- the photographing unit extraction means 132 uses the image information extracted by the image information extraction means 131 to divide the image group into groups, each consisting of a series of images considered to have been shot by the user at the same shooting event.
- the divided groups are referred to as photographing units.
- for example, the photographing unit extraction means 132 forms a unit from images whose photographing intervals are within a certain time width, or whose photographing places are within a certain distance of one another.
- the photographing unit extraction means 132 may also form a unit when the similarity of detected faces between photographed images, or the similarity of person information such as the number of people and clothing, exceeds a certain value. It may likewise form a unit when information such as the camera's shooting mode and the various camera parameters at the time of shooting agree between the photographed images to more than a certain degree.
- the photographing unit extraction means 132 may also use tag information to form a unit from images that the user has intentionally grouped by directly assigning a photographing event name. A sketch of this unitization appears below.
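- a minimal sketch of the unitization above (splitting on time gaps and on distance between shooting places); the one-hour and 100 m thresholds and the record layout are illustrative assumptions:

```python
# Sketch: split a time-sorted image list into shooting units whenever the
# time gap or the distance between consecutive shots exceeds a threshold.
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 6371000 * 2 * asin(sqrt(a))

def extract_units(images, max_gap_s=3600, max_dist_m=100):
    """images: non-empty, time-sorted list of dicts with 'time' (epoch s), 'lat', 'lon'."""
    units, current = [], [images[0]]
    for prev, cur in zip(images, images[1:]):
        near_in_time = cur["time"] - prev["time"] <= max_gap_s
        near_in_space = haversine_m(prev["lat"], prev["lon"],
                                    cur["lat"], cur["lon"]) <= max_dist_m
        if near_in_time and near_in_space:
            current.append(cur)
        else:                      # gap too large: start a new shooting unit
            units.append(current)
            current = [cur]
    units.append(current)
    return units
```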
- for each shooting unit determined by the photographing unit extraction means 132, the common attribute determination means 133 extracts common attributes using the image information extracted by the image information extraction means 131 for each image included in that shooting unit.
- the types of common attributes include the following:
- time information such as season, temperature, shooting frequency, and shooting time period
- location information such as the degree of movement, indoors and outdoors, landmarks
- personal information such as participants and family composition and age for the shooting event
- shooting method information such as a shooting mode and shooting parameters of the shooting camera
- shooting reason information such as a shooting event given to the image group by the user.
- the season can be specified from the shooting time.
- the temperature may be acquired from an external device based on the shooting time and shooting location; alternatively, the image classification device 1 may be equipped with a thermometer that measures the temperature when an image is shot, with the measurement result included in the metadata information.
- the common attribute may be a statistic calculated using at least one piece of information indicated as a type.
- the period of the shooting unit is specified from time information obtained from each of one or more images included in the shooting unit, and the season to which the specified period belongs is specified as a statistic.
- similarly, the shooting range of the shooting unit is specified, it is determined whether that range is in the home or in the neighborhood, and the result is used as a common attribute.
- family composition information indicating the family composition may also be calculated from the person information as a statistic. For example, the number of images in which a child, father, mother, brother, and so on appear can be calculated from the person information within the photographing unit. When the calculation shows that all the persons constituting the family appear in at least one of the images constituting the photographing unit, family composition information consisting of the photographed person information is generated as a common attribute. Based on the person information, information about friends or relatives may be generated as well.
- information about a person obtained from the person information may also be estimated, and common attributes extracted from the estimated subject person information. For example, the age of each person may be estimated and a statistic calculated from the estimates: a person is extracted from each of the images included in the photographing unit, the age of each extracted person is estimated, and the distribution over age groups such as teens, twenties, and thirties is calculated.
- the estimation target is not limited to age; anything that can be estimated from person information, such as sex or whether the person is an adult or a child, may be used. A sketch of such a statistic appears below.
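- a sketch of one such statistic: deriving the "season" common attribute from a shooting unit's timestamps. The month-to-season mapping is an assumption (Northern Hemisphere), not taken from the patent:

```python
# Sketch: pick the most frequent season among a shooting unit's timestamps.
from collections import Counter
from datetime import datetime

SEASON = {12: "winter", 1: "winter", 2: "winter",
          3: "spring", 4: "spring", 5: "spring",
          6: "summer", 7: "summer", 8: "summer",
          9: "autumn", 10: "autumn", 11: "autumn"}

def season_of_unit(timestamps):
    """timestamps: iterable of epoch seconds for the images in one unit."""
    counts = Counter(SEASON[datetime.fromtimestamp(t).month] for t in timestamps)
    return counts.most_common(1)[0][0]
```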
- FIG. 5 shows an example of the photographing unit extracted by the photographing unit extraction means 132 from the image group photographed in the short term (one day).
- the photographing unit extraction means 132 extracts photographing units grouped by photographing interval as units T1-T6, and extracts units P1-P3 grouped by photographing range (the change in photographing position between images is within 100 m).
- when the common attribute determination means 133 extracts common attributes, it first determines, for each of the smallest photographing units (here, units T1-T6), the items of image information that are the same within the unit. It then hierarchizes the units by their common items, determines the same items for the higher-level shooting units (here, units P1-P3), and outputs the resulting same-item information as common attributes.
- FIG. 6 shows an example of the common attribute acquired from the example shown in FIG.
- same-item information exists at layer 4 for the items common within each of the photographing units T1-T6, and at layer 3 for the items common within the photographing units P1-P3, which share a location.
- person information is acquired using image feature amount information such as face detection; the items shared by the people appearing in each shooting unit form layer 2, and the items common to all the shooting units form layer 1.
- more detailed shooting units may be extracted from information such as the shooting mode and camera parameters at the time of shooting, and hierarchization may also be performed on units that the user has directly tagged or grouped. Furthermore, common attributes may be extracted from long-term image information, such as shooting units spanning several days (for example, travel), or the shooting style and family structure for each user event.
- Classification means 14: the classification means 14 uses a classifier to perform a determination process that determines the models included in the image to be classified, using the feature amounts calculated from that image and one or more pieces of model information indicated by the classification dictionary created by the classification dictionary creation means 15.
- Common classifiers include GMM (Gaussian mixture model) and SVM (Support Vector Machine).
- the model information is information obtained by modeling feature amount data of an image that can discriminate, for example, a face or a human body.
- the classifier outputs a discrimination result indicating a model included in the image to be classified, together with a likelihood representing the reliability of the discrimination.
- the likelihood generally means that the greater the value, the higher the reliability.
- the classifying unit 14 stores the model discrimination result output from the classifier and its likelihood in the image attribute information storage unit 17 in association with the image data number indicating the image to be classified.
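- the patent leaves the classifier unspecified beyond naming GMM and SVM; as a hedged stand-in, the sketch below uses a scikit-learn SVM whose class probabilities play the role of the likelihood:

```python
# Sketch: determine, for each image feature vector, the best-matching model
# and a likelihood (class probability) as the discrimination reliability.
from sklearn.svm import SVC

def classify_images(features, model_vectors, model_labels):
    """features: (n, d) array; model_vectors/labels: training data per model."""
    clf = SVC(probability=True).fit(model_vectors, model_labels)
    probs = clf.predict_proba(features)        # one likelihood per model
    return [(clf.classes_[row.argmax()], row.max()) for row in probs]
```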
- Classification model information storage means 16: the classification model information storage means 16 is a storage medium that stores model information corresponding to an object, for each object category. For example, the result of weighting the image feature amounts by the importance of each feature amount may be used directly as model information. As described above, GMM and SVM are methods for calculating the feature amounts of an image as model information; another such method is AdaBoost. Since these methods are known techniques, their description is omitted here.
- FIG. 7 shows an example of a table T10 that stores model information that is a feature amount for each object category. The table T10 has an area for storing a plurality of sets of object category names and model information. For example, if the object category name is “cherry blossom”, it is associated with the feature amount of the cherry blossom image.
- the classification model information storage means 16 stores a basic event object table T20 shown in FIG.
- the basic event object table T20 includes basic event object information used by the classification dictionary creating means 15.
- the classification priority is higher for objects that are more likely to be shot at an event. The basic event object table T20 therefore lists subject objects that are likely to be photographed at each photographing event, together with similar objects of the same type.
- the classification priority of each subject object is set to 1.0 by default.
- the classification model information storage unit 16 stores an attribute information table T30 including object priority attribute information used by the classification dictionary creation unit 15.
- An example of the attribute information table T30 is shown in FIG.
- object priority attribute information such as season, indoor / outdoor, participant, and place is associated with each object category name.
- the classification model information storage means 16 stores an event information table T40 including event-related objects associated with shooting events and priority attribute information thereof.
- an example of the event information table T40 is shown in the list of FIG.
- event-related objects, time information, location information, and person information are associated with each shooting event.
- an event-related object is an object that characterizes the corresponding event. For example, for "cherry blossom viewing", objects that characterize it include "cherry blossom", "dumpling", "stall", and "beer".
- the time information indicates the shooting time and the shooting time length for the corresponding event.
- the location information indicates a location where the corresponding event is performed, and the person information indicates information on a person who participates in the corresponding event.
- Classification dictionary creation means 15: the classification dictionary creation means 15 identifies one candidate event among a plurality of events based on the common attributes extracted by the common attribute extraction means 13, and creates a classification dictionary consisting of one or more object categories related to the event-related objects included in the identified candidate event.
- specifically, for each of the one or more shooting units extracted by the common attribute extraction means 13, the classification dictionary creation means 15 increases or decreases the priorities using the shooting unit's common attributes and the object priority attribute information. For example, as shown in FIG. 6, the common attributes included in unit T1 are "spring", "neighborhood", "morning", "indoor", "early morning", and the like.
- from these common attributes and the attribute information table T30 shown in FIG. 9, the classification dictionary creation means 15 identifies the matching object priority attribute information, and updates the priority of the corresponding object category name according to the identified object priority attribute information.
- FIG. 11 shows the updated basic event object table T21.
- the classification dictionary creation means 15 calculates, for each event included in the event information table T40, the total of the priorities of the related event objects that characterize that event, and identifies the event with the highest total as the candidate event.
- the classification dictionary creation means 15 then creates a classification dictionary composed of one or more of the related event objects (object categories) that characterize the identified candidate event, selected by priority. For example, when the "Ohanami" (cherry blossom viewing) event has the highest total, the classification dictionary creation means 15 creates a classification dictionary by limiting the event-related objects, such as cherry blossoms, dumplings, stalls, and beer, to those with high priority.
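- a minimal sketch of this selection, assuming simple dictionary-of-sets stand-ins for tables T30 and T40 and an illustrative priority boost (the patent does not give concrete numbers):

```python
# Sketch: boost priorities for categories whose attribute information matches
# the common attributes, pick the event with the highest total priority, and
# keep only its high-priority related objects as the classification dictionary.
def create_classification_dictionary(common_attrs, attr_table, event_table,
                                     base_priority=1.0, boost=0.5, threshold=1.0):
    # attr_table:  {object_category: set of attribute strings}   (cf. table T30)
    # event_table: {event_name: [related object categories]}     (cf. table T40)
    priority = {obj: base_priority + boost * len(attrs & set(common_attrs))
                for obj, attrs in attr_table.items()}
    candidate = max(event_table, key=lambda ev: sum(priority.get(o, 0.0)
                                                    for o in event_table[ev]))
    return [o for o in event_table[candidate] if priority.get(o, 0.0) > threshold]
```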
- Image attribute information storage means 17: the image attribute information storage means 17 is a storage medium that stores the classification model discrimination results obtained by the classification means 14, the likelihoods representing discrimination reliability, and so on.
- a classification result information table T50 is shown in FIG. 12 as an example of the result.
- the classification result information table T50 has an area for storing one or more sets of image data numbers, object categories, reliability, and likelihoods.
- the image data number is an identifier for uniquely identifying an image
- the object category indicates model information used for classification.
- the likelihood is a value indicating the likelihood that the object existing in the image indicated by the corresponding image data number matches the model information used for classification.
- “Reliability” indicates whether or not the classification result is reliable.
- when the corresponding likelihood is at or above a predetermined value (for example, 0.7), the reliability value is set to 1 and the classification result is considered reliable; when the likelihood is below the predetermined value, the reliability value is set to 0 and the classification result is not considered reliable.
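- expressed as code, the reliability rule above is simply:

```python
def reliability(likelihood, threshold=0.7):
    """1 if the classification result is considered reliable, else 0."""
    return 1 if likelihood >= threshold else 0
```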
- the image classification device 1 starts the classification process for subject objects when an image group to be classified is selected by the user, or when all the locally stored data that can be automatically classified, or a certain number of images or moving images, has accumulated.
- common attributes are extracted from the image group to be classified, a classification dictionary is created based on the extracted common attributes, and subject objects in each image included in the image group are classified.
- the image feature amount calculation means 12 acquires an image group consisting of a plurality of images to be classified from the local data storage means 11, and performs an image feature amount calculation process for each image included in the acquired image group (step S1).
- the image information extraction means 131 of the common attribute extraction means 13 acquires metadata information and tag information as image information from each of the images included in the image group, and the photographing unit extraction means 132 uses the extracted image information to divide the image group into shooting units, each consisting of a series of images considered to have been shot by the user at the same shooting event (step S2).
- the common attribute determination means 133 extracts, for each divided shooting unit, the common attributes of the image group using the image information extracted by the image information extraction means 131 for the images belonging to that shooting unit (step S3).
- the classification dictionary creating unit 15 uses the one or more common attributes extracted by the common attribute extraction unit 13 and the object category stored in the classification model information storage unit 16 to generate a classification dictionary used by the classification unit 14. Create (step S4).
- the classification means 14 determines, for the classification target image group, whether each image has features matching the model information corresponding to the object categories included in the classification dictionary created by the classification dictionary creation means 15, and stores the determination result and likelihood as image attribute information in the image attribute information storage means 17, associated with the image data number of the image to be classified (step S5).
- the image classification device 1 determines whether the classification process has been completed for all the photographing units (step S6). If it is determined that the process has been completed ("Yes" in step S6), the classification process is terminated. If it is determined that the process has not been completed ("No" in step S6), the process returns to step S3.
- the classification dictionary creating means 15 acquires the basic event object table T20 and the attribute information table T30 in order to limit the classification model (step S11).
- the classification dictionary creating unit 15 updates the priorities of one or more object categories included in the basic event object table T20 indicated by the common attribute extracted by the common attribute extracting unit 13 (step S12).
- the classification dictionary creation means 15 determines whether or not the priority has been updated for all the common attributes (step S13). If it is determined that the priority has not been updated for all the common attributes (“No” in step S13), the process returns to step S12.
- next, the classification dictionary creation means 15 identifies, from the updated basic event object information and the event information table T40, a candidate event at which the group of images to be classified may have been taken (step S14). Specifically, using the updated priorities, the classification dictionary creation means 15 calculates, for each event in the event information table T40, the total priority of the related event objects (object categories) that characterize the event, and identifies the event with the highest total as the candidate event. In this way, the classification dictionary creation means 15 can identify a candidate for the shooting event of the image group to be classified.
- the classification dictionary creation means 15 creates a classification dictionary including one or more related event objects (object categories) whose priority is at or above a predetermined threshold, among the related event objects that characterize the identified candidate event (step S15).
- thus, the image classification device 1 in Embodiment 1 does not classify all general objects mainly from in-image feature amounts as in the past; instead, it performs the classification process with the model information used for classification limited by the common attributes of the image group held by the user. It can therefore classify the user's images and moving images with high accuracy, automatically tag and organize them, and allow the user's desired image to be searched efficiently.
- Embodiment 2: in an image classification apparatus that organizes an image group composed of a user's many local images and moving image data (such as in the home), the second embodiment relates to a mechanism for automatically classifying the subject objects included in each image with high accuracy by performing the classification process recursively, using common attributes and the classification results. Note that in the second embodiment, configurations having the same functions as in the first embodiment are given the same reference numerals, and their description is omitted because the earlier description applies.
- the following describes in detail how the set of subject objects to be classified is recursively updated during the classification process, so that even images containing various subject objects, captured by the user, can be classified with high accuracy.
- FIG. 15 is a block diagram showing a configuration of an image classification device 1000 according to the second embodiment.
- the image classification device 1000 includes local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 1400, classification dictionary creation means 1500, classification model information storage means 16, and image attribute information storage means 17.
- since the local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification model information storage means 16, and image attribute information storage means 17 are the same as in the first embodiment, their description is omitted here.
- the classification means 1400 and the classification dictionary creation means 1500 are described below.
- Classification means 1400: the classification means 1400 has the following functions in addition to the same functions as the classification means 14 described in the first embodiment.
- the classification means 1400 determines whether the classification result is appropriate when the classification of each divided photographing unit is completed. Specifically, if the ratio of the number of images determined to include an object category to be classified, to the total number of images constituting the photographing unit, is greater than a predetermined value, the classification is determined to be appropriate; if the ratio is at or below the predetermined value, the classification is determined not to be appropriate.
- the classification unit 1400 determines that the classification is not appropriate, the classification unit 1400 outputs to the classification dictionary creation unit 1500 a creation instruction to recreate the dictionary.
- the classification unit 1400 stores the classification result in the image attribute information storage unit 17 when determining that the classification is appropriate.
- Classification dictionary creation means 1500: the classification dictionary creation means 1500 has the following functions in addition to the same functions as the classification dictionary creation means 15 described in the first embodiment.
- upon receiving a creation instruction from the classification means 1400, the classification dictionary creation means 1500 recreates the classification dictionary.
- specifically, using the updated basic event object table and the event information table T40, the classification dictionary creation means 1500 identifies, from the remaining object categories excluding those already classified, a candidate event different from the one identified previously (hereinafter referred to as a re-candidate event).
- the classification dictionary creation means 1500 then creates a classification dictionary composed of one or more related event objects (object categories) whose priority is at or above a predetermined threshold, among the related event objects characterizing the identified re-candidate event.
- when the classification process is started, the classification target images are acquired from the local data storage means 11, and a classification process in which the classification means 1400 extracts the image attribute information of each image to be classified is performed (step S21).
- the process of step S21 is the same as steps S1 to S5 which are the processes shown in FIG.
- the classification means 1400 determines whether the classification result is appropriate (step S22). Specifically, with N the number of images in the photographing unit to be classified, M the number of images determined to contain a related event object (object category), and T a predetermined value, the classification result is determined to be appropriate if the conditional expression "M / N > T" is satisfied, and not appropriate otherwise.
- if the result is not appropriate, the classification means 1400 notifies the classification dictionary creation means 1500 with a creation instruction.
- the classification dictionary creation means 1500 then recreates the classification dictionary (step S23). Specifically, using the updated basic event object table and event information table T40, it identifies the re-candidate event from the remaining object categories excluding those already classified, and creates a classification dictionary that includes one or more related event objects (object categories), selected by priority, among the related event objects characterizing the identified re-candidate event.
- the classification unit 1400 stores the classification result in the image attribute information storage unit 17 (step S24).
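- a sketch of this recursive flow (steps S21-S24), assuming callables for the classification and dictionary-recreation steps and a bounded number of retries, none of which are specified by the patent:

```python
# Sketch: classify, test the appropriateness condition "M / N > T", and
# recreate the dictionary from the remaining categories when it fails.
def classify_unit(images, dictionary, classify, recreate_dictionary,
                  t=0.5, max_rounds=3):
    results = classify(images, dictionary)                    # step S21
    for _ in range(max_rounds):
        m = sum(1 for r in results if r["matched"])           # step S22
        if m / len(images) > t:
            return results                                    # step S24: store
        dictionary = recreate_dictionary(dictionary)          # step S23
        results = classify(images, dictionary)
    return results
```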
- FIG. 17 is a block diagram showing a configuration of an image classification device 1000A according to this modification.
- the image classification device 1000A includes local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 1400A, classification dictionary creation means 1500A, classification model information storage means 16, image attribute information storage means 17, and axis object extraction means 1800.
- since the local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification model information storage means 16, and image attribute information storage means 17 are the same as in the first embodiment, their description is omitted here.
- the classification means 1400A, the classification dictionary creation means 1500A, and the axis object extraction means 1800 are described below.
- Axis object extraction means 1800: the axis object extraction means 1800 extracts an object category that serves as a highly reliable axis, based on the classification results of the classification means 1400A.
- the axis object extraction means 1800 determines whether the number of classified objects is biased to one object category using the result of classification by the classification means 1400A.
- the axis object extraction unit 1800 identifies one object category that is biased as an axis object, and outputs a classification dictionary creation instruction based on the identified axis object to the classification dictionary creation unit 1500A.
- for example, when the classification means 1400A has 20 images determined to contain any of the object categories to be classified, and 18 of them are detected for one object category (for example, "sakura"), it determines that the results are biased toward one object category and specifies that category ("cherry blossom") as the axis object.
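- a minimal sketch of this bias test; the 0.8 ratio is an assumption read off the 18-of-20 example, not a value the patent fixes:

```python
# Sketch: if one object category dominates the classified images, return it
# as the axis object; otherwise return None.
from collections import Counter

def find_axis_object(detected_categories, bias_ratio=0.8):
    """detected_categories: the detected object category of each classified image."""
    if not detected_categories:
        return None
    category, count = Counter(detected_categories).most_common(1)[0]
    return category if count / len(detected_categories) >= bias_ratio else None
```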
- Classification dictionary creation means 1500A: the classification dictionary creation means 1500A has the following functions in addition to the same functions as the classification dictionary creation means 1500.
- upon receiving an instruction from the axis object extraction means 1800, the classification dictionary creation means 1500A recreates the classification dictionary based on the axis object specified by the axis object extraction means 1800.
- specifically, using the updated basic event object table and event information table T40, the classification dictionary creation means 1500A extracts all events that include the object category specified as the axis object, and creates a classification dictionary from one or more object categories to be classified, extracted from each of those events.
- for example, the classification dictionary creation means 1500A extracts "cherry blossom viewing", "entrance ceremony", "graduation ceremony", and so on as events that include "sakura", and generates a classification dictionary including the object categories contained in these events.
- Classification means 1400A: in addition to the same functions as the classification means 1400, the classification means 1400A performs classification using the classification dictionary created by the classification dictionary creation means 1500A based on the axis object.
- the axis object extracting unit 1800 determines whether the number of classified items is biased to one object category using the result of classification by the classifying unit 1400A (step S31).
- the axis object extraction unit 1800 identifies one biased object category as an axis object (step S32).
- the classification dictionary creation means 1500A extracts all events including the object category identified as the axis object by the axis object extraction means 1800, and creates a classification dictionary from one or more object categories to be classified, extracted from each of the extracted events (step S33).
- the classification means 1400A performs classification using the classification dictionary created by the classification dictionary creation means 1500A based on the axis object (step S34).
- if it is determined that there is no bias ("No" in step S31), the process ends.
- as described above, the image classification device 1000 and the image classification device 1000A do not perform classification processing on all general objects; instead, they perform the classification process while recursively limiting the object categories using common attributes extracted from the image group held by the user, which makes it possible to classify the user's images and moving images more accurately.
- Embodiment 3: Embodiment 3 according to the present invention will be described below with reference to the drawings.
- the third embodiment relates to a mechanism, in an image classification apparatus that organizes an image group composed of a user's many local images and moving image data (such as in the home), for automatically classifying with high accuracy the subject objects included in each area of each image, by acquiring common attributes using area information obtained from the images to be classified and then performing classification.
- the area information covers, for example, a face detection area where a human face is detected, a human body detection area where a human body is detected, a person-peripheral area including the area around the limbs, and a background detection area, which is the area other than these. Note that in the third embodiment, components having the same functions as in the first or second embodiment are given the same reference numerals, and their description is omitted because the earlier description applies.
- FIG. 19 is a block diagram illustrating a configuration of an image classification device 2000 according to the third embodiment.
- the image classification device 2000 includes local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 14, classification dictionary creation means 15, classification model information storage means 16, image attribute information storage means 17, and area information calculation means 2800.
- since the local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 14, classification dictionary creation means 15, classification model information storage means 16, and image attribute information storage means 17 are the same as in the first embodiment, their description is omitted here.
- Area information calculation means 2800: the area information calculation means 2800 calculates specific area information for each image in the image group to be classified, stored in the local data storage means 11.
- the area information calculation unit 2800 calculates the face detection area, the human body detection area, and the other background detection area as the area information using the previously-described face detection and human body detection techniques.
- for example, a face detection device in practical use is described in a patent document (Japanese Patent Laid-Open No. 2008-250444), and details are described in Hironobu Fujiyoshi, "Gradient-based feature extraction - SIFT and HOG -" (Information Processing Society of Japan Research Report CVIM 160, pp. 211-224, 2007).
- the area information calculation unit 2800 can estimate and calculate the human body area from the face area, but here, it is assumed that the human area is calculated by the human body detector in addition to the face detector.
- FIG. 20 shows an example of face area detection and human body area detection results.
- the face detection areas G101 and G102 are detected for two persons, and the human body detection areas G201 and G202 are detected.
- the area information calculation means 2800 also calculates a limb-peripheral area, a fixed margin around the human area, as the person-peripheral area, and treats the remaining areas as the background area. A sketch of this area split appears below.
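- a hedged sketch of this area split using an OpenCV face detector; the body and limb-peripheral boxes are derived with illustrative geometric ratios, since the patent defers the detectors themselves to the cited literature:

```python
# Sketch: detect faces, derive rough body and limb-peripheral boxes from each
# face box, and treat everything not covered by those boxes as background.
import cv2

def compute_areas(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    areas = {"face": [], "body": [], "peripheral": []}
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
        areas["face"].append((x, y, w, h))
        areas["body"].append((x - w // 2, y + h, 2 * w, 4 * h))   # assumed ratios
        areas["peripheral"].append((x - w, y + h, 3 * w, 5 * h))  # assumed margin
    # the background area is the image minus the union of the boxes above
    return areas
```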
- the image feature quantity calculating means 12 calculates a feature quantity for each area information calculated for the image to be classified.
- the classification dictionary creating means 15 has the same function as that of the first embodiment, but differs in that a classification dictionary is created for each detected area.
- the area information calculation means 2800 acquires an image group composed of a plurality of images to be classified from the local data storage means 11, and calculates at least one piece of area information for each image in the acquired image group (step S41). For example, as shown in FIG. 20, four types of area are calculated as the area information: the face detection areas G101 and G102, the human body detection areas G201 and G202, the person-peripheral areas around the human body detection areas G201 and G202, and the background areas other than these.
- the image feature amount calculating unit 12 calculates the in-image feature amount for each region (step S42).
- the image feature amount calculation means 12 calculates, for each area, the information needed to express that area: for example, Gabor features, which respond strongly to faces, for the face area; HOG features for the human body area; local features such as SIFT for the person-peripheral and foreground areas; and overall feature amounts such as a color histogram, color moments, and edge features for the background area. These feature amounts may also be used in combination, or the feature amounts computed at area-detection time may be retained and reused.
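- a sketch of this per-area dispatch in OpenCV; ORB stands in for the SIFT-class local features, and the Gabor parameters are assumptions:

```python
# Sketch: choose a feature extractor by area type, as in the text above.
import numpy as np
import cv2

def area_features(gray, area_type):
    if area_type == "face":        # Gabor response, strong on facial texture
        kernel = cv2.getGaborKernel((21, 21), 4.0, 0.0, 10.0, 0.5)
        return np.array([cv2.filter2D(gray, cv2.CV_32F, kernel).mean()])
    if area_type == "body":        # HOG shape feature for the human body
        return cv2.HOGDescriptor().compute(cv2.resize(gray, (64, 128))).flatten()
    if area_type == "peripheral":  # local keypoint features (ORB in place of SIFT)
        _, desc = cv2.ORB_create().detectAndCompute(gray, None)
        return desc
    # background: overall color statistics (histogram over gray levels here)
    return cv2.calcHist([gray], [0], None, [32], [0, 256]).flatten()
```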
- the common attribute extraction unit 13 extracts a common attribute for the classification target image group (step S43). Since this process is the same as that in step S3 in the first embodiment, a detailed description thereof will be omitted.
- the classification dictionary creating means 15 creates a classification dictionary for each area indicated by the area information using the extracted common attribute (step S44).
- the basic process is the same as step S4 in the first embodiment, but differs in that the type of subject object that is the target of the classification dictionary is limited and used depending on the region information.
- the dictionary for the face area is limited to items related to individual attributes, such as the individual person, race, age, family attributes, and the presence or absence of glasses or hats.
- the dictionary for the human body area is limited to items such as the type of clothes and the uniformity of clothes within the image.
- the dictionary for the area around the human body is limited to artificial objects that tend to be present depending on the season or on the indoor/outdoor setting.
- the dictionary for the background area is limited to natural objects that tend to be present depending on the season or on the indoor/outdoor setting.
- classification processing is then performed on the image group that is the classification target (step S45).
- the basic process is the same as step S5 in the first embodiment; the difference is that the classification means 14 in the third embodiment determines, for each area of the classification target images, whether each image has features matching the model information corresponding to the object categories included in that area's classification dictionary.
- the image classification device 2000 determines whether the classification process has been completed for all the photographing units (step S46). If it is determined that the process has been completed ("Yes" in step S46), the classification process is terminated. If it is determined that the process has not been completed ("No" in step S46), the process returns to step S43.
- the area information is calculated before extracting the image feature amount, but the present invention is not limited to this.
- the region information may be extracted after extracting the image feature amount.
- FIG. 22 is a block diagram showing the configuration of an image classification device 2000A according to this modification.
- the image classification device 2000A includes local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 2400A, classification dictionary creation means 15, classification model information storage means 16, image attribute information storage means 17, and area information calculation means 2800A.
- the local data storage unit 11, the image feature amount calculation unit 12, the common attribute extraction unit 13, the classification dictionary creation unit 15, the classification model information storage unit 16, and the image attribute information storage unit 17 are the same as those in the third embodiment. Therefore, explanation here is omitted.
- Area information calculation means 2800A: the area information calculation means 2800A calculates area information for each image in the image group to be classified, stored in the local data storage means 11, using the feature amounts calculated by the image feature amount calculation means 12.
- the area information calculation unit 2800A includes a person area extraction unit 2811, a season extraction unit 2812, and a location extraction unit 2813 as shown in FIG.
- the human area extraction unit 2811 identifies a face detection region, a human body detection region, and a human body peripheral region from each feature amount calculated by the image feature amount calculation unit 12.
- Season extraction unit 2812 specifies an area other than that specified by human area extraction unit 2811, that is, a background area.
- the season extraction unit 2812 extracts, from the specified background area and using the feature amounts calculated by the image feature amount calculation means 12, regions of objects that indicate the season (for example, cherry blossoms or seasonal dolls).
- the location extraction unit 2813 extracts, from the background area, regions of objects that indicate whether the shooting location is indoors or outdoors (for example, a building, or a sofa as an interior furnishing).
- the classification means 2400A has a clothing / hat classification unit 2411, a seasonal classification unit 2412, a place classification unit 2413, and a general classification unit 2414.
- the clothing/hat classification unit 2411 limits the classification process, for the detected human areas (face detection area, human body detection area), to items related to human attributes such as the presence or absence of glasses and hats, the type of clothing, and the uniformity of clothing within the image.
- the seasonal classification unit 2412 performs classification processing only on items related to an artificial object representing a season or an object related to a natural object.
- the place classifying unit 2413 performs the classification process only on items related to an artificial object or a natural object indicating a place (indoor or outdoor).
- General classification unit 2414 performs classification processing only for individual persons, races, ages, and family attributes.
- as described above, Embodiment 3 does not perform classification processing on all general objects; instead, using common attributes extracted from the image group held by the user, it limits the model information used for classification separately for each area of the image. It can therefore classify the user's images and moving images more accurately, automatically tag and organize them, and allow the user's desired image to be searched efficiently.
- Embodiment 4: Embodiment 4 according to the present invention will be described below with reference to the drawings.
- the fourth embodiment relates to a mechanism for automatically classifying with high accuracy the subject objects included in each image, by registering common attributes together with the classification target when the user registers a target to be classified, so that even a newly registered classification target can be classified using the common attributes registered in advance. Note that in this embodiment, configurations having the same functions as in Embodiments 1, 2, and 3 are given the same reference numerals, and their description is omitted because the earlier description applies.
- FIG. 23 is a block diagram illustrating a configuration of an image classification device 3000 according to the fourth embodiment.
- The image classification device 3000 includes a local data storage unit 11, an image feature quantity calculation unit 12, a common attribute extraction unit 13, a classification unit 14, a classification dictionary creation unit 3815, a classification model information storage unit 16, an image attribute information storage unit 17, an input unit 3800, and a registration unit 3801.
- The local data storage unit 11, the image feature amount calculation unit 12, the common attribute extraction unit 13, the classification unit 14, the classification model information storage unit 16, and the image attribute information storage unit 17 are the same as in the first embodiment, so their description is omitted here.
- Input means 3800: The input unit 3800 accepts user operations for the registration processing performed on the local data stored in the local data storage unit 11.
- Registration means 3801: The registration unit 3801 performs tagging processing and registration processing based on the input from the input unit 3800.
- The registration unit 3801 extracts a common attribute from the image group used for these processes, or from an image group related to it, and stores the extracted attribute in the classification model information storage unit 16 in association with the registered object category name, as a common attribute belonging to that object category.
- Classification dictionary creating means 3815: The classification dictionary creation unit 3815 has the following function in addition to the functions shown in the first embodiment.
- The classification dictionary creation unit 3815 adds the registered object category to the classification dictionary.
- The registration unit 3801 first selects the image group from which common attributes are to be extracted (step S51). For example, it selects an image group associated with a tag specified by the user, such as "My pet", "Fireworks display", "Chestnut picking", "Christmas", or "Birthday party"; selects a related image group based on the selected image group; or selects an image group that is continuous in a time series.
- The registration unit 3801 extracts the common attributes from the extracted image group using the same method as the common attribute extraction unit 13 in the first embodiment (step S52).
- The registration unit 3801 then extracts, from the extracted common attributes, those specific to the object category to be associated (step S43).
- The common attributes are extracted from whatever content allows them to be extracted, for example in the same format as shown in FIG. 5.
- Image metadata information as shown in FIG. 5 is extracted from each image included in the image group, and common attributes are obtained by abstracting the metadata into specific items. For example, common information can be extracted by converting time information into time-zone information such as seasons, converting location information into landmark (location area) information such as an amusement park, or extracting character information that frequently appears in the images.
- The extracted common attributes are registered in the classification model information storage unit 16 in association with the object category, as common attributes belonging to the registered object category (step S44).
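The abstraction step above (raw metadata lifted to shared attributes) could be sketched as follows. This is a minimal sketch under assumed conventions: the `month`/`landmark` fields, the season mapping, and the majority rule are all illustrative choices, not part of the patent.

```python
# Hypothetical sketch of the registration flow's abstraction step:
# per-image metadata from a tagged image group is lifted to attributes
# shared by the group (e.g. time -> season, GPS -> landmark).
from collections import Counter

def season_of(month: int) -> str:
    return {12: "winter", 1: "winter", 2: "winter",
            3: "spring", 4: "spring", 5: "spring",
            6: "summer", 7: "summer", 8: "summer"}.get(month, "autumn")

def extract_common_attributes(images: list) -> dict:
    """Abstract per-image metadata into attributes shared by the group."""
    seasons = Counter(season_of(img["month"]) for img in images)
    places = Counter(img.get("landmark", "unknown") for img in images)
    common = {}
    # Keep an attribute only if a majority of the group shares it.
    for name, counts in (("season", seasons), ("landmark", places)):
        value, n = counts.most_common(1)[0]
        if n >= len(images) / 2:
            common[name] = value
    return common

group = [{"month": 4, "landmark": "amusement_park"},
         {"month": 4, "landmark": "amusement_park"},
         {"month": 5, "landmark": "unknown"}]
print(extract_common_attributes(group))
# {'season': 'spring', 'landmark': 'amusement_park'}
```

The returned dictionary is what would then be stored alongside the registered object category in step S44.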
- Effects of Embodiment 4: As described above, by registering common attributes in advance in association with the object category being registered, the object categories restricted by the associated common attributes can be used as classification targets when a new image group is classified or an already stored image group is re-classified, so image classification that better matches the user's intention becomes possible.
- Embodiment 5: Embodiment 5 according to the present invention will be described below with reference to the drawings.
- In the first embodiment, all the components were contained in a single device.
- In the fifth embodiment, some of the components are held by an external device connected via a network.
- Configurations having the same functions as those in the first embodiment are given the same reference numerals, and their description is omitted because it applies here as well.
- The image classification system 4000 includes an image classification device 4100 and a server device 4500.
- The image classification device 4100 and the server device 4500 are connected via a network 4001 such as the Internet.
- Image classification device 4100: As shown in FIG. 26, the image classification device 4100 includes a local data storage unit 11, an image feature amount calculation unit 12, a common attribute extraction unit 13, a classification unit 14, an image attribute information storage unit 17, a classification dictionary creation unit 4115, an event related information storage unit 4116, and a transmission/reception unit 4110.
- The local data storage unit 11, the image feature quantity calculation unit 12, the common attribute extraction unit 13, the classification unit 14, and the image attribute information storage unit 17 are the same as those in the first embodiment, so their description is omitted here.
- Event related information storage means 4116: stores the basic event object table T20, the attribute information table T30, and the event information table T40 shown in the first embodiment.
- Classification dictionary creation means 4115: Like the classification dictionary creation unit 15 shown in the first embodiment, the classification dictionary creation unit 4115 creates a classification dictionary composed of one or more related event objects (object categories) that, among the related event objects characterizing the identified candidate event, have a priority equal to or higher than a predetermined threshold and mutually different similarity attributes.
- The difference from the first embodiment is that the server device 4500 is asked for the model information corresponding to the created classification dictionary.
- Specifically, the classification dictionary creation unit 4115 generates request information that requests model information and contains information (for example, a name or an identifier) identifying every object category included in the created classification dictionary.
- The generated request information is transmitted to the server device 4500 via the transmission/reception unit 4110.
- The classification dictionary creation unit 4115 then receives from the server device 4500, via the transmission/reception unit 4110, the model information associated with each object category included in the generated classification dictionary.
- The classification dictionary creation unit 4115 outputs the model information associated with each object category of the created classification dictionary to the classification unit 14.
- The classification unit 14 can thus classify images from the model information associated with each object category of the classification dictionary created by the classification dictionary creation unit 4115 and from the image feature amounts calculated by the image feature amount calculation unit 12.
- (1-3) Transmission/reception means 4110: When the transmission/reception unit 4110 receives request information from the classification dictionary creation unit 4115, it transmits the received request information to the server device 4500 via the network 4001.
- When the transmission/reception unit 4110 receives, from the server device 4500 via the network 4001, the model information associated with each object category of the classification dictionary created by the classification dictionary creation unit 4115, it outputs the received model information to the classification dictionary creation unit 4115.
- the server device 4500 includes a model information storage unit 4510, a control unit 4511, and a transmission / reception unit 4512.
- Model information storage means 4510: The model information storage unit 4510 stores the table T10 that holds the model information, i.e., the feature amounts for each object category, shown in the first embodiment.
- Control means 4511: receives request information from the image classification device 4100 via the transmission/reception unit 4512.
- The control unit 4511 acquires, from the table T10 of the model information storage unit 4510, the model information corresponding to each piece of information included in the received request information that identifies an object category of the classification dictionary created by the image classification device 4100.
- The control unit 4511 associates the acquired model information with each object category included in the classification dictionary created by the image classification device 4100, and transmits it to the image classification device 4100 via the transmission/reception unit 4512.
- the transmission / reception means 4512 receives model information associated with each object category of the classification dictionary created by the classification dictionary creation means 4115 from the control means 4511 and transmits it to the image classification device 4100 via the network 4001.
- The image classification device 4100 adds the following two steps between step S4 and step S5 shown in FIG. 13.
- A step in which the classification dictionary creation unit 4115 generates request information and transmits it to the server device 4500 via the transmission/reception unit 4110 (hereinafter, step S100).
- A step in which the classification dictionary creation unit 4115 receives from the server device 4500 the model information associated with each object category included in the classification dictionary it generated (hereinafter, step S101).
- After steps S100 and S101 have been executed, the images are classified by executing step S5.
- the control unit 4511 of the server device 4500 receives request information from the image classification device 4100 via the network 4001 (step S150).
- the control unit 4511 acquires model information corresponding to each piece of information for identifying the object category included in the received request information from the table T10 of the model information storage unit 4510 (step S151).
- the control unit 4511 associates the acquired model information with each object category included in the received request information, and transmits it to the image classification device 4100 via the network 4001 (step S152).
- In the fifth embodiment above, the image classification system 4000 that stores model information in an external device (the server device 4500) has been described.
- However, the configuration of the system is not limited to this.
- A system in which an external device stores the image group to be classified may also be used.
- The image classification system 4000A includes an image classification device 4100A and a terminal device 4600.
- The image classification device 4100A and the terminal device 4600 are connected via a network 4001 such as the Internet.
- Image classification device 4100A: As shown in FIG. 26, the image classification device 4100A includes an image feature amount calculation unit 12, a common attribute extraction unit 13, a classification unit 14, a classification dictionary creation unit 15, a classification model information storage unit 16, an image attribute information storage unit 17, and a reception unit 4150.
- The image feature quantity calculation unit 12, the common attribute extraction unit 13, the classification unit 14, the classification dictionary creation unit 15, the classification model information storage unit 16, and the image attribute information storage unit 17 are the same as those in the first embodiment, so their description is omitted here.
- The reception unit 4150 receives, from the terminal device 4600 via the network 4001, an image group including one or more images to be classified and the metadata information corresponding to each image, and outputs the received image group and metadata information to the image feature quantity calculation unit 12 and the common attribute extraction unit 13.
- the image feature quantity calculation unit 12 calculates the feature quantity of each image included in the image group received from the reception unit 4150.
- the common attribute extracting unit 13 extracts the common attribute using the image group and metadata information received from the receiving unit 4150.
- Terminal device 4600 As shown in FIG. 31, the terminal device 4600 includes data storage means 4610, control means 4611, and transmission means 4612.
- Since the data storage means 4610 is the same as the local data storage means 11 of the first embodiment, its description is omitted here.
- In response to a user operation, the control unit 4611 obtains an image group composed of one or more images stored in the data storage unit 4610 and the metadata information corresponding to each image, and transmits them to the image classification device 4100A via the transmission unit 4612.
- the transmission unit 4612 transmits the image group and metadata information received from the control unit 4611 to the image classification device 4100A via the network 4001.
- Before executing step S1, the image classification device 4100A adds a step in which the reception unit 4150 receives, from the terminal device 4600, the image group including one or more images to be classified and the metadata information corresponding to each image.
- Thus, the image classification device 4100A can classify an image group received from an external device (the terminal device 4600).
- the terminal device 4600 may be any device that can store a group of images and can be connected to a network, such as a personal computer, a digital camera, or a digital video camera.
- A function of receiving an image group from an external device, as described in this modification, may be added to the components of the image classification device 1 shown in the first embodiment. In this case, not only the image group stored in the image classification device 1 but also an image group stored in an external device can be classified.
- The image group acquired from the external device is not limited to images captured by the user who wants to classify them.
- For example, a shooting unit may be generated from both an image group taken by the user and an image group taken by an acquaintance, or from an image group taken by an acquaintance alone.
- the event-related object corresponding to each event may be of any type as long as it is an object photographed within the event.
- The priority may be weighted in advance according to the degree of association. Alternatively, the event priority and the object category priority may be calculated individually and summed with arbitrary weights to calculate a composite priority, and the object categories to be classified may then be determined based on the composite priority.
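A sketch of that composite-priority idea follows; the weights, scores, and threshold are invented numbers chosen only to show the arithmetic, not values from the patent.

```python
# Composite priority: weighted sum of an event priority and an object
# category priority; categories clearing a threshold are kept.

def composite_priority(event_score: float, category_score: float,
                       w_event: float = 0.6, w_category: float = 0.4) -> float:
    """Weighted sum of the two priorities (weights are tunable)."""
    return w_event * event_score + w_category * category_score

candidates = {
    "cherry_blossom": composite_priority(event_score=0.9, category_score=0.7),
    "bento_box":      composite_priority(event_score=0.9, category_score=0.1),
}
# Keep only categories whose composite priority clears a threshold.
selected = [c for c, p in candidates.items() if p >= 0.6]
print(selected)   # ['cherry_blossom']
```

With these example weights, a category strongly tied to the candidate event but weak on its own (bento_box at 0.58) falls below the cutoff while the dominant one (cherry_blossom at 0.82) survives.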
- In the above description, the predetermined value T in the conditional expression is a fixed value, but this is not limiting.
- The ease of shooting a subject at an event may be defined in advance for each object category, and appropriateness may be judged by checking whether a subject object that should be easy to shoot is actually classified.
- For example, it is not appropriate when an object category is determined to be present in every image, or when object categories are classified excessively, such as many object categories per image.
- When object categories are extracted from the re-selected candidate event, the previously extracted object categories may appear again; they may therefore either be included in, or excluded from, the extraction targets.
- Alternatively, from the remaining object categories excluding those used last time, object categories within a certain priority range, or a certain number of them in descending order of priority, may be extracted and used for the classification dictionary.
- The classification dictionary may also be limited to only the higher-priority subject objects among those subjected to the classification process.
- In this case, the candidate event is the same as before the re-creation.
- In the above description, the area information calculation unit 2800 includes a face detector and a human body detector, but is not limited to these.
- A moving object detector may also be provided.
- In that case, the area information calculation means 2800 can calculate, as area information, the moving object area detected by moving object detection and the remaining background area. Furthermore, by providing a detector based on another method, a region of interest, or a region of interest together with the remaining background area, may be calculated as area information.
- In the above description, the classification dictionary creation means creates a classification dictionary by identifying one candidate event among a plurality of events, but this is not limiting.
- When the common attributes indicate a season, a place, and event contents, the classification dictionary creation means may create the classification dictionary by extracting only the object categories that match that season, place, and event contents (for example, seasonal dolls).
- In the above description, the common attributes and the events are associated with each other via the object categories included in the events, but the present invention is not limited to this.
- The family composition information and subject person information shown in the above embodiments may include time transition information indicating secular change, i.e., the degree to which faces and bodies change over time. For example, for an event held every year, the degree of change of the subjects included in the images shot in each year can be indicated, so that they can be recognized as the same person. This makes it possible to identify an annually held event as a single candidate event instead of as separate events.
- In the above description, the classification unit performs classification based on the feature amounts of the entire image to be classified, but this is not limiting.
- Classification may be performed in consideration of the diversity of the objects to be photographed.
- the classification model information storage means stores a table T100 shown in FIG. 32 instead of the table T10.
- the table T100 has an area for storing a plurality of sets of object category names, model information, and diversity information. For example, if the object category name is “cherry blossom”, it is associated with the feature amount of the cherry blossom image and its diversity information.
- The diversity information indicates the level of diversity of the corresponding object.
- Here, diversity refers to the range of combinations of an object to be imaged and the backgrounds against which that object appears.
- When the diversity information indicates "high", there are many combinations of the object and its backgrounds; when it indicates "low", there are few. For example, when the object is an "airplane", the background is "sky" or "the ground (a runway)", and there is almost no other background, so the diversity information corresponding to "airplane" is "low".
- A flowerpot, by contrast, can be placed in various locations such as a "window", a "road", a "house (entrance)", or a "garden", and each location forms the background. Since the combinations of "flowerpot" and background are diverse, its diversity information is "high".
- The classification means acquires, using the table T100, the diversity information of the object indicated by the model information it uses when classifying images. If the acquired diversity information indicates "low", the classification unit performs classification using the feature amounts of the entire image; if it indicates "high", the classification unit identifies an ROI (Region of Interest) within the image and performs classification on the identified area using the model information.
- the identification of the ROI is a well-known technique and will not be described here.
- For a highly diverse object (for example, a flowerpot), more accurate classification can be performed by running the classification process on the area excluding the background (the area containing only the flowerpot).
- In the above description, the region of a highly diverse object (the object to be classified) was identified by ROI, but the identification method is not limited to this. Any method that can identify the region of the object to be classified may be used.
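The diversity-driven switch between whole-image and ROI matching can be sketched as below. The feature vectors, the toy similarity measure, and the ROI values are stand-ins; a real system would plug in its own feature extractor and ROI detector.

```python
# Illustrative sketch of diversity-aware classification with table T100:
# low-diversity objects are matched against whole-image features, while
# high-diversity objects are matched only inside a region of interest.

TABLE_T100 = {
    # object category: (model feature vector, diversity)
    "airplane":  ([0.55, 0.10, 0.90], "low"),
    "flowerpot": ([0.40, 0.70, 0.20], "high"),
}

def similarity(a: list, b: list) -> float:
    """Toy similarity: negative L1 distance (higher is more similar)."""
    return -sum(abs(x - y) for x, y in zip(a, b))

def classify(category: str, whole_image: list, roi: list) -> float:
    model, diversity = TABLE_T100[category]
    # High diversity -> background varies a lot, so compare the ROI only.
    target = roi if diversity == "high" else whole_image
    return similarity(model, target)

whole = [0.50, 0.30, 0.80]   # features of the entire image (assumed)
roi   = [0.42, 0.68, 0.22]   # features of the detected ROI (assumed)
for cat in TABLE_T100:
    print(cat, round(classify(cat, whole, roi), 2))
# airplane -0.35   (matched on the whole image)
# flowerpot -0.06  (matched on the ROI only)
```

The flowerpot scores well only because its background was excluded; matched against the whole image it would be penalized by background variation, which is exactly the failure mode the diversity flag avoids.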
- In the fifth embodiment above, the processing related to image classification, specifically the processing in the image feature quantity calculation means 12, the common attribute extraction means 13, the classification dictionary creation means 4115, and the classification means 14, is performed on the image classification device side.
- However, the present invention is not limited to this.
- The server device 4500 may perform the processing of at least one of the image feature quantity calculation unit 12, the common attribute extraction unit 13, the classification dictionary creation unit 4115, and the classification unit 14.
- The methods described in the above embodiments may be realized by having a CPU (Central Processing Unit) execute a program describing their procedures.
- A program describing the procedure of the method may also be stored in a recording medium and distributed.
- Each configuration according to the above embodiments may be realized as an LSI (Large Scale Integration), i.e., an integrated circuit. These configurations may each be made into one chip, or made into one chip including some or all of them.
- Although referred to here as LSI, it may also be called IC (Integrated Circuit), system LSI, super LSI, or ultra LSI depending on the degree of integration.
- The method of circuit integration is not limited to LSI; circuit integration may also be achieved with a dedicated circuit or a general-purpose processor.
- An FPGA (Field Programmable Gate Array) that can be programmed after LSI manufacture, or a reconfigurable processor whose circuit-cell connections and settings inside the LSI can be reconfigured, may also be used.
- The computation of these functional blocks can be performed using, for example, a DSP (Digital Signal Processor) or a CPU (Central Processing Unit). These processing steps can also be recorded on a recording medium as a program and executed.
- An image processing apparatus according to one aspect of the present invention comprises: attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when shooting an image related to the event; subject information storage means for storing in advance, for each of the plurality of events, subjects that can be included in images shot for the event; extraction means for extracting, for an image group made up of a plurality of shot images, a shooting attribute common to a predetermined number of images; specifying means for specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; and association means for associating, for each of the plurality of images included in the image group, the image with a subject specified by the specifying means when the image contains that subject.
- With this configuration, the image processing apparatus identifies, from the event associated with the shooting attributes of the image group, the subjects that can be included in images shot at that event, and performs the association. This eliminates the need for the user to specify the subjects used for the association, reducing the user's burden in the association process between images and subjects. Furthermore, because the image processing apparatus limits the subjects used for the association to those corresponding to the event associated with the shooting attribute, classification can be performed with high accuracy.
- Here, the extraction means may divide the image group into one or more image sets based on the information related to shooting corresponding to each of the plurality of images included in the image group, and extract one or more shooting attributes for each divided image set.
- With this configuration, the image processing apparatus divides the image group into one or more image sets and extracts the shooting attributes for each divided set, so the shooting attributes can be extracted with high accuracy.
- Here, the information related to shooting may include at least one of: time information indicating the time when the image was shot, place information indicating the place, person information on a person as a subject, shooting information indicating the shooting method, and environment information indicating the environment at the time of shooting.
- With this configuration, the image processing apparatus can divide the image group into one or more image sets based on at least one of the time information, place information, person information, shooting information, and environment information.
- Here, the extraction means may calculate the similarity between the pieces of information used for the division, and perform the division so that similar images are grouped together, using the calculated similarity.
- With this configuration, the image processing apparatus divides the images into image sets using the similarity between the pieces of information, so similar images can be gathered into one image set.
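As a minimal sketch, similarity can be reduced to closeness in shooting time: consecutive shots stay in one image set while the gap stays small, and a new set starts otherwise. The one-hour gap threshold, and using time alone rather than the full metadata, are assumptions for illustration.

```python
# Divide a time-sorted image group into sets of "similar" images, where
# similarity is approximated by closeness of shooting timestamps.

def split_into_sets(timestamps: list, max_gap: float = 3600.0):
    """Split a non-empty, ascending list of timestamps into image sets."""
    sets, current = [], [timestamps[0]]
    for prev, now in zip(timestamps, timestamps[1:]):
        if now - prev <= max_gap:       # similar enough: same set
            current.append(now)
        else:                           # dissimilar: close the set
            sets.append(current)
            current = [now]
    sets.append(current)
    return sets

shots = [0, 600, 1200, 90000, 90500]    # seconds; two bursts a day apart
print([len(s) for s in split_into_sets(shots)])   # [3, 2]
```

A fuller implementation would combine time with place and person information into one similarity score, as the surrounding text allows, but the grouping loop keeps the same shape.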
- Here, the extraction means may use, as the shooting attribute, statistical information acquired using at least one piece of information included in the attribute.
- With this configuration, the image processing apparatus can extract statistical information as a shooting attribute.
- Here, for each image set, when the family of one person can be identified from one or more pieces of person information in the image set, the extraction means may acquire, as the statistical information, family composition information indicating that family, or person subject information indicating the sex or age distribution of persons obtained from each piece of person information in the image set.
- With this configuration, the image processing apparatus can use family composition information or subject information as the statistical information.
- Here, the family composition information or the subject information may include time transition information indicating secular change, i.e., the degree of change of the face and body over time.
- With this configuration, the image processing apparatus can include, in the family composition information or the subject information, time transition information indicating secular change as the degree of change of the face and body over time.
- Here, a plurality of subjects may be associated with each event, and the shooting attribute may be associated with the event by associating the shooting attribute with the subjects that can be included in images shot for the event.
- In that case, the specifying means counts, for each shooting attribute, a priority for the subjects corresponding to that shooting attribute, specifies as a candidate event the event whose subjects have the highest total priority among the plurality of events, and specifies, among the plurality of objects related to the specified candidate event, the subjects having a priority equal to or higher than a predetermined value.
- With this configuration, because the image processing apparatus selects subjects using the priority, image classification can be performed using high-priority subjects, that is, subjects that are more effective for classifying the images.
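The counting described above can be sketched as follows: each extracted shooting attribute votes priority onto the subjects it is associated with, the event whose subjects collect the highest total becomes the candidate, and only subjects above a threshold are kept. The tables, scores, and threshold are hypothetical.

```python
# Hypothetical sketch of the priority-counting step used to pick the
# candidate event and its high-priority subjects.
from collections import defaultdict

# event -> {subject: priority}
EVENTS = {
    "cherry_viewing": {"cherry_blossom": 3, "bento_box": 1},
    "christmas":      {"christmas_tree": 3, "cake": 2},
}
# shooting attribute -> subjects it supports
ATTRIBUTE_SUBJECTS = {"season:spring": ["cherry_blossom", "bento_box"],
                      "season:winter": ["christmas_tree", "cake"]}

def pick_candidate(attributes: list, threshold: int = 2):
    totals = defaultdict(int)
    for attr in attributes:
        for subj in ATTRIBUTE_SUBJECTS.get(attr, []):
            for event, subjects in EVENTS.items():
                totals[event] += subjects.get(subj, 0)
    event = max(totals, key=totals.get)          # candidate event
    kept = [s for s, p in EVENTS[event].items() if p >= threshold]
    return event, kept

print(pick_candidate(["season:spring"]))
# ('cherry_viewing', ['cherry_blossom'])
```

Here the low-priority "bento_box" still helps select the event but is dropped from the final subject list, matching the two-stage behavior described in the text.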
- Here, a priority corresponding to each shooting attribute may be assigned to that shooting attribute, and the specifying means may count, for each shooting attribute, the priority assigned to it for the corresponding subjects.
- With this configuration, the image processing apparatus assigns a priority per shooting attribute; for example, by assigning a higher priority to a subject that is more important for associating images, the likelihood of its being used for the association can be raised.
- The image processing apparatus can thereby perform the matching with subjects more accurately.
- Here, the association means may determine, according to the result of the classification, whether classification is necessary again; when it is determined that classification is necessary again, the specifying means may specify another set that does not include the set of subjects used for that classification, a set that includes all of the subjects, or another set that includes part of the set of subjects.
- With this configuration, the image processing apparatus performs the association processing recursively according to the association results. That is, by repeating the association processing, the association can be performed with high accuracy.
- Here, when the number of images classified under one subject is equal to or greater than a predetermined number, the association means may determine that classification is necessary again, and the specifying means may specify another event that includes that one subject and specify, among the plurality of subjects corresponding to the specified other event, the subjects having a priority equal to or higher than the predetermined value.
- With this configuration, when the number of images classified under one subject is equal to or greater than the predetermined number, the results are biased, so performing the association again allows the classification to be carried out more accurately.
- Here, a value corresponding to the shooting difficulty may be assigned to each subject, and the specifying means may count as a priority, for each shooting attribute, the value corresponding to the difficulty assigned to the subject corresponding to that shooting attribute.
- With this configuration, the image processing apparatus can extract shooting attributes according to the difficulty of shooting.
- Here, the image processing apparatus may further include region division means for dividing, for each of the plurality of images included in the image group, the image into a plurality of regions according to its composition, and the extraction means may extract one or more shooting attributes for each divided region.
- With this configuration, the image processing apparatus divides each image according to its composition and extracts the shooting attributes for each divided region, so more accurate shooting attributes can be extracted.
- Here, the region division means may divide the image into a person region and the other regions.
- With this configuration, because the image processing apparatus divides the image into a person region and the other regions, shooting attributes relating to the person, and shooting attributes relating to the other regions such as the background, can be extracted more accurately.
- Here, the image processing apparatus may further include: accepting means for accepting, from the user, a subject extraction instruction for one image group belonging to the same event; and registration means for, when the extraction instruction is accepted, extracting from the one image group the subjects of the event to which the one image group belongs, associating the extracted subjects with that event, and registering them in the subject information storage means.
- With this configuration, because the image processing apparatus can use the event registered by the user and the subjects corresponding to that event for the association, the association can be specialized for the user.
- Here, the registration means may extract a shooting attribute from the one image group and associate the extracted shooting attribute with the event to which the one image group belongs.
- With this configuration, the image processing apparatus associates each of the one or more shooting attributes extracted from the one image group with the event to which the one image group belongs, so when the association is performed for other image groups, the event specialized for the user can be identified.
- Here, the image processing apparatus may further include acquisition means for acquiring, from an external device via a network, model information that is associated with a subject specified by the specifying means and consists of the feature amounts of that subject.
- In that case, the association means determines, for each of the plurality of images, whether the image contains the subject specified by the specifying means, from the feature amounts of the image and the feature amounts indicated by the model information acquired by the acquisition means.
- With this configuration, because the image classification apparatus acquires the model information about the subjects from an external device, it does not need to store the model information for all subjects in advance, and can therefore save storage capacity.
- Here, the image processing device may further include acquisition means for acquiring the image group from an external device via a network.
- With this configuration, because the image classification device acquires the image group to be classified from an external device, it does not need to store the image group to be classified in advance, and can therefore save storage capacity.
- An image processing system according to one aspect of the present invention comprises an image processing apparatus and a server device connected to it via a network. The image processing apparatus comprises: attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when shooting an image related to the event; subject information storage means for storing, for each of the plurality of events, subjects that can be included in images shot for the event; extraction means for extracting, for an image group made up of a plurality of shot images, a shooting attribute common to a predetermined number of images based on the information related to shooting corresponding to each of the plurality of images; specifying means for specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; acquisition means for acquiring from the server device via the network the model information that is associated with a subject specified by the specifying means and consists of the feature amounts of that subject; and association means for determining, for each of the plurality of images included in the image group, whether the image contains the subject specified by the specifying means, from the feature amounts of the image and the feature amounts indicated by the model information acquired by the acquisition means, and associating the image with the subject when the image contains it. The server device comprises: model information storage means for storing, for each subject to be stored in the subject information storage means, model information consisting of the feature amounts of that subject in association with the subject; and transmission means for transmitting the model information corresponding to the subject specified by the image processing apparatus to the image processing apparatus via the network.
- With this configuration, the image processing apparatus of the image processing system identifies, from the event associated with the shooting attributes of the image group, the subjects that can be included in images shot at that event, and performs the association. This eliminates the need for the user to specify the subjects used for the association, reducing the user's burden in the association process between images and subjects. Because the subjects used for the association are limited to those corresponding to the event associated with the shooting attribute, classification can be performed with high accuracy. Furthermore, because the image classification device acquires the model information about the subjects from the server device, it does not need to store the model information for all subjects in advance and can save storage capacity.
- An image processing system according to another aspect comprises an image processing apparatus and a terminal device connected to it via a network. The terminal device comprises: image storage means for storing an image group made up of a plurality of shot images; and transmission means for transmitting the image group to the image processing apparatus via the network. The image processing apparatus comprises: attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when shooting an image related to the event; subject information storage means for storing, for each of the plurality of events, subjects that can be included in images shot for the event; acquisition means for acquiring the image group from the terminal device via the network; extraction means for extracting, for the acquired image group, a shooting attribute common to a predetermined number of images based on the information related to shooting corresponding to each of the plurality of images; specifying means for specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; and association means for associating, for each of the plurality of images included in the image group, the image with a subject specified by the specifying means when the image contains that subject.
- With this configuration, the image processing apparatus of the image processing system identifies, from the event associated with the shooting attributes of the image group, the subjects that can be included in images shot at that event, and performs the association. This eliminates the need for the user to specify the subjects used for the association, reducing the user's burden in the association process between images and subjects. Because the subjects used for the association are limited to those corresponding to the event associated with the shooting attribute, classification can be performed with high accuracy. Furthermore, because the image classification apparatus acquires the image group to be classified from the terminal device, it does not need to store that image group in advance and can save storage capacity.
- The image classification device is effective for tagging an image group consisting of many images with high accuracy. For example, when automatically organizing images or searching for a desired image, association processing matched to the shooting events of the user's local data can be performed, so an image group containing an arbitrary target can be extracted efficiently.
- The image processing apparatus can also be applied to uses such as DVD recorders, televisions, personal computers, and data servers that process images.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Library & Information Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Television Signal Processing For Recording (AREA)
- Studio Devices (AREA)
- Processing Or Creating Images (AREA)
- Image Analysis (AREA)
Abstract
Description
Embodiments of the present invention are described below with reference to the drawings. Embodiment 1 relates to a mechanism, in an image classification device that organizes an image group consisting of a user's many local images and video data such as those kept at home, for accurately and automatically classifying the subject objects contained in each image by exploiting attributes common to the image group.
FIG. 1 is a block diagram showing the configuration of an image classification device (image processing device) 1. In FIG. 1, the image classification device 1 comprises local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 14, classification dictionary creation means 15, classification model information storage means 16, and image attribute information storage means 17.
The local data storage means 11 is a recording medium that stores file data, such as home data, held by a limited set of users; for example, family photographs and moving image data. It is a storage device such as a large-capacity media disc (HDD, DVD, etc.) or a semiconductor memory.
The image feature amount calculation means 12 calculates image features ranging from basic low-order feature amounts of an image, such as edges, colors, and texture, to high-order feature amounts specific to an object.
The common attribute extraction means 13 extracts one or more common attributes from an image group consisting of a plurality of images stored in the local data storage means 11.
The image information extraction means 131 acquires metadata information and tag information, i.e. the image information, from each image included in the image group to be classified.
Using the image information extracted by the image information extraction means 131, the shooting unit extraction means 132 divides the image group into groups, each of which treats as one unit a series of images considered to have been shot by the user at the same shooting event. Hereinafter, a divided group is called a shooting unit.
For each shooting unit determined by the shooting unit extraction means 132, the common attribute determination means 133 extracts the common attributes of that shooting unit, using the image information extracted by the image information extraction means 131 for each image included in the unit.
The specific processing for extracting common attributes for each shooting unit extracted by the shooting unit extraction means 132 is described next. FIG. 5 shows an example of shooting units extracted by the shooting unit extraction means 132 from an image group shot over a short period (one day).
For an image to be classified, the classification means 14 performs determination processing, using a classifier, on the feature amounts calculated from the image and on one or more pieces of model information indicated by the classification dictionary created by the classification dictionary creation means 15, thereby determining the models contained in the image.
The classification model information storage means 16 is a storage medium that stores, for each object category (object), the model information corresponding to that object. For example, the result of weighting image feature amounts by their importance may be used directly as the model information. As described above, GMM and SVM are methods for calculating image feature amounts as model information; AdaBoost is another. Since these methods are known techniques, their description is omitted here. FIG. 7 shows an example of a table T10 that stores the model information, i.e. the feature amounts, for each object category. The table T10 has an area for storing a plurality of sets of an object category name and model information. For example, the object category name "cherry blossom" is associated with the feature amounts of cherry blossom images.
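The lookup-and-score shape of table T10 can be illustrated with the toy stand-in below. Real model information would be GMM/SVM/AdaBoost parameters, as the paragraph notes; the plain per-feature weights and all numbers here are invented for illustration.

```python
# Toy stand-in for table T10: each object category is stored with model
# information (here, plain feature weights instead of trained models).

TABLE_T10 = {
    "cherry_blossom": [0.7, 0.2, 0.1],   # model info: per-feature weights
    "sofa":           [0.1, 0.3, 0.6],
}

def score(category: str, image_features: list) -> float:
    """Dot product of stored model weights with the image's features."""
    weights = TABLE_T10[category]
    return sum(w * f for w, f in zip(weights, image_features))

features = [0.9, 0.5, 0.1]               # low-level features of one image
best = max(TABLE_T10, key=lambda c: score(c, features))
print(best)   # cherry_blossom
```

The key point is the table's shape: the classification dictionary names a restricted set of categories, and only those rows of T10 are consulted when scoring an image.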
Based on the common attributes extracted by the common attribute extraction means 13, the classification dictionary creation means 15 identifies one candidate event among a plurality of events, and creates a classification dictionary consisting of one or more object categories related to the event-related objects included in the identified candidate event.
The image attribute information storage means 17 is a storage medium that stores the information resulting from the classification determination by the classification means 14, i.e., the determination results of the classification models and the likelihood serving as the reliability of each determination.
The operation of the image classification device 1 is described next.
The image classification device 1 starts classification processing of the subject objects in images when the user selects an image group to be classified, or when all automatically classifiable local data, or a certain number of images or videos, has accumulated. When classification processing starts, common attributes are extracted from the image group to be classified, a classification dictionary is created based on the extracted common attributes, and the subject objects in each image of the image group are classified.
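The end-to-end flow in the paragraph above can be summarized as a hedged sketch: extract common attributes, build a limited dictionary of object categories, then tag each image with the categories whose models match. Every helper below is a hypothetical stand-in for the corresponding means (13, 15, 14).

```python
# End-to-end shape of the classification flow; all helpers are toys.

def extract_common_attrs(image_group):
    # Stand-in for common attribute extraction means 13,
    # e.g. "all photos were taken in spring".
    return {"season": "spring"}

def create_dictionary(attrs):
    # Stand-in for dictionary creation means 15: limit the candidate
    # categories using the common attributes.
    return ["cherry_blossom"] if attrs.get("season") == "spring" else []

def matches(category, image):
    # Stand-in for the classifier in classification means 14.
    return image["features"].get(category, 0.0) > 0.5

def classify_group(image_group):
    dictionary = create_dictionary(extract_common_attrs(image_group))
    return {img["id"]: [c for c in dictionary if matches(c, img)]
            for img in image_group}

group = [{"id": 1, "features": {"cherry_blossom": 0.8}},
         {"id": 2, "features": {"cherry_blossom": 0.1}}]
print(classify_group(group))   # {1: ['cherry_blossom'], 2: []}
```

The narrowing step is what distinguishes this design from classifying against every known object: only dictionary categories are ever scored.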
The processing for creating the classification dictionary in step S4 shown in FIG. 13 is described here using the flowchart shown in FIG. 14.
As described above, the image classification device 1 of Embodiment 1 does not classify all general objects mainly on the basis of intra-image feature amounts, as in the conventional art; instead, it performs classification while limiting the model information used for classification by the common attributes within the image group held by the user. As a result, the images and moving images held by the user can be classified accurately, tagged and organized automatically, and the user's desired images can be searched for efficiently.
Embodiment 2 according to the present invention is described below with reference to the drawings. Embodiment 2 relates to a mechanism, in an image classification device that organizes an image group consisting of a user's many local images and video data such as those kept at home, for accurately and automatically classifying the subject objects contained in each image by using common attributes and by performing classification recursively using the classification results. In Embodiment 2, configurations having the same functions as in Embodiment 1 are given the same reference signs, and their description is omitted because it applies here as well.
FIG. 15 is a block diagram showing the configuration of an image classification device 1000 according to Embodiment 2. In FIG. 15, the image classification device 1000 comprises local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 1400, classification dictionary creation means 1500, classification model information storage means 16, and image attribute information storage means 17.
The classification means 1400 has the following functions in addition to the same functions as the classification means 14 shown in Embodiment 1.
The classification dictionary creation means 1500 has the following functions in addition to the same functions as the classification dictionary creation means 15 shown in Embodiment 1.
The operation of the image classification device 1000 is described here using the flowchart shown in FIG. 16.
Another embodiment that creates the classification dictionary recursively is described next.
FIG. 17 is a block diagram showing the configuration of an image classification device 1000A according to this modification. In FIG. 17, the image classification device 1000A comprises local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 1400A, classification dictionary creation means 1500A, classification model information storage means 16, image attribute information storage means 17, and axis object extraction means 1800.
The axis object extraction means 1800 extracts an object category that serves as a highly reliable axis from the results classified by the classification means 1400A.
For example, suppose there are 20 images determined to contain one of the object categories to be classified, and 18 of them are detected for a single object category (e.g., "cherry blossom"). The classification means 1400A then determines that the results are biased toward that one object category and identifies that object category ("cherry blossom") as the axis object.
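The bias test in the example above (18 of 20 images landing on one category) can be sketched as follows; the 0.8 dominance ratio is an assumption chosen to match the example, not a value from the patent.

```python
# Sketch of the axis-object test: if one category dominates the
# per-image classification results, treat it as the axis object and
# re-create the classification dictionary around it.
from collections import Counter

def find_axis_object(per_image_labels: list, ratio: float = 0.8):
    counts = Counter(per_image_labels)
    label, n = counts.most_common(1)[0]
    return label if n / len(per_image_labels) >= ratio else None

labels = ["cherry_blossom"] * 18 + ["bento_box"] * 2
print(find_axis_object(labels))   # cherry_blossom
```

A returned axis object would then seed the next, recursive round of dictionary creation; a `None` result means the distribution is balanced and no re-classification is triggered.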
The classification dictionary creation means 1500A has the following functions in addition to the same functions as the classification dictionary creation means 1500.
In addition to the same functions as the classification means 1400, the classification means 1400A performs classification using the classification dictionary created on the basis of the axis object by the classification dictionary creation means 1500A.
The operation of the image classification device 1000A, in particular the classification using the axis object, is described here using the flowchart shown in FIG. 18.
As described above, the image classification devices 1000 and 1000A do not perform classification on all general objects; they limit the object categories by the common attributes extracted from the user's image group and recursively. Therefore, the images and moving images held by the user can be classified more accurately, tagged and organized automatically, and the user's desired images can be searched for efficiently.
Embodiment 3 according to the present invention is described below with reference to the drawings. Embodiment 3 relates to a mechanism, in an image classification device that organizes an image group consisting of a user's many local images and video data such as those kept at home, for accurately and automatically classifying the subject objects contained in each region of each image, by acquiring common information using region information obtained from each image to be classified and then classifying. Here, region information means, for example, a face detection region in which a person's face was detected, a human body detection region in which a human body was detected, a person peripheral region including the areas around the limbs of a detected human body detection region, and a background detection region, which is the region other than these. In Embodiment 3, configurations having the same functions as those in Embodiment 1 or 2 are given the same reference signs, and their description is omitted because it applies here as well.
FIG. 19 is a block diagram showing the configuration of the image classification device 2000 in Embodiment 3. In FIG. 19, the image classification device 2000 comprises local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 14, classification dictionary creation means 15, classification model information storage means 16, image attribute information storage means 17, and region information calculation means 2800.
The region information calculation means 2800 calculates, for each image of the image group to be classified in the local data storage means 11, the specific region information contained in that image.
The operation of the image classification device 2000 is described here using the flowchart shown in FIG. 21.
In Embodiment 3 above, the region information is calculated before the image feature amounts are extracted, but this is not limiting; the region information may be extracted after the image feature amounts have been extracted.
FIG. 22 shows the configuration of an image classification device 2000A for this case.
The region information calculation means 2800A calculates region information for each image of the image group to be classified in the local data storage means 11, using the feature amounts calculated by the image feature amount calculation means 12.
As shown in FIG. 22, the classification means 2400A has a clothing/hat classification unit 2411, a seasonal-object classification unit 2412, a place-object classification unit 2413, and a general-object classification unit 2414.
The operation of the image classification device 2000A can be realized by swapping steps S41 and S42 shown in FIG. 21, so its description is omitted here.
As described above, instead of performing classification on all general objects, the model information used for classification is limited per fixed region in the image using the common attributes extracted from the user's image group. Therefore, the images and moving images held by the user can be classified more accurately, tagged and organized automatically, and the user's desired images can be found efficiently.
Embodiment 4 according to the present invention is described below with reference to the drawings. Embodiment 4 relates to a mechanism, in an image classification device that organizes an image group consisting of a user's many local images and video data such as those kept at home, whereby, when the user registers a target to be classified, the common attributes related to that classification target are registered together, so that the subject objects contained in each image can be classified accurately and automatically using the pre-registered common attributes even for a newly registered classification target. In this embodiment, configurations having the same functions as those in Embodiments 1, 2, and 3 are given the same reference signs, and their description is omitted because it applies here as well.
FIG. 23 is a block diagram showing the configuration of the image classification device 3000 in Embodiment 4. In FIG. 23, the image classification device 3000 comprises local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 14, classification dictionary creation means 3815, classification model information storage means 16, image attribute information storage means 17, input means 3800, and registration means 3801. The local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 14, classification model information storage means 16, and image attribute information storage means 17 are the same as in Embodiment 1, so their description is omitted here.
The input means 3800 accepts input of user operations for the registration processing performed on the local data stored in the local data storage means 11.
The registration means 3801 performs tagging processing and registration processing based on the input from the input means 3800.
The classification dictionary creation means 3815 has the following functions in addition to the functions shown in Embodiment 1.
The operation performed by the registration means 3801 is described here using the flowchart shown in FIG. 24. The classification processing for an image group is the same as the processing shown in Embodiment 1 (see FIGS. 13 and 14), so its description is omitted here.
As described above, by registering common attributes in advance in association with the object category being registered, the object categories restricted by the associated common attributes can be used as classification targets when a new image group is classified or an already stored image group is re-classified, so image classification that better matches the user's intention becomes possible.
Embodiment 5 according to the present invention is described below with reference to the drawings. In Embodiment 1, all the components were contained in a single device; in Embodiment 5, some of the components are held by an external device connected via a network. In Embodiment 5, configurations having the same functions as in Embodiment 1 are given the same reference signs, and their description is omitted because it applies here as well.
As shown in FIG. 25, the image classification system 4000 comprises an image classification device 4100 and a server device 4500, which are connected via a network 4001 such as the Internet.
As shown in FIG. 26, the image classification device 4100 comprises local data storage means 11, image feature amount calculation means 12, common attribute extraction means 13, classification means 14, image attribute information storage means 17, classification dictionary creation means 4115, event related information storage means 4116, and transmission/reception means 4110.
The event related information storage means 4116 stores the basic event object table T20, the attribute information table T30, and the event information table T40 shown in Embodiment 1.
Like the classification dictionary creation means 15 shown in Embodiment 1, the classification dictionary creation means 4115 creates a classification dictionary consisting of one or more related event objects (object categories) that, among the related event objects characterizing the identified candidate event, have a priority equal to or higher than a predetermined threshold and mutually different similarity attributes.
Upon receiving request information from the classification dictionary creation means 4115, the transmission/reception means 4110 transmits the received request information to the server device 4500 via the network 4001.
As shown in FIG. 27, the server device 4500 comprises model information storage means 4510, control means 4511, and transmission/reception means 4512.
The model information storage means 4510 stores the table T10 holding the model information, i.e., the feature amounts for each object category, shown in Embodiment 1.
The control means 4511 receives request information from the image classification device 4100 via the transmission/reception means 4512.
Upon receiving request information from the image classification device 4100 via the network 4001, the transmission/reception means 4512 outputs the received request information to the control means 4511.
As the operation of the image classification system 4000, the operations of the image classification device 4100 and the server device 4500 are each described here.
Regarding the operation of the image classification device 4100, only the differences between Embodiment 1 and Embodiment 5 are described, using the flowchart shown in FIG. 13.
The operation of the server device 4500 is described using the flowchart shown in FIG. 28.
In Embodiment 5 above, the image classification system 4000 that stores the model information in an external device (the server device 4500) has been described, but the configuration of the system is not limited to this.
As shown in FIG. 29, the image classification system 4000A comprises an image classification device 4100A and a terminal device 4600, which are connected via a network 4001 such as the Internet.
As shown in FIG. 26, the image classification device 4100A comprises image feature amount calculation means 12, common attribute extraction means 13, classification means 14, classification dictionary creation means 15, classification model information storage means 16, image attribute information storage means 17, and reception means 4150.
As shown in FIG. 31, the terminal device 4600 comprises data storage means 4610, control means 4611, and transmission means 4612.
As the operation of the image classification system 4000A, particularly the operation of the image classification device 4100A, the differences from Embodiment 1 are described here.
In this modification, the terminal device 4600 may be any device that can store an image group and connect to a network, such as a personal computer, a digital camera, or a digital video camera.
The present invention has been described above based on the embodiments, but it is not limited to them. For example, the following modifications are conceivable.
(1) An image processing device according to one aspect of the present invention comprises: attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when an image related to the event is shot; subject information storage means for storing in advance, for each of the plurality of events, subjects that can be included in images shot for the event; extraction means for extracting, for an image group made up of a plurality of shot images, a shooting attribute common to a predetermined number of images, based on information related to shooting corresponding to each of the plurality of images; specifying means for specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; and associating means for associating, for each of the plurality of images included in the image group, the image with a subject specified by the specifying means when the image contains that subject.
11 Local data storage means
12 Image feature amount calculation means
13 Common attribute extraction means
14, 1400, 2400 Classification means
15, 1500, 3815, 4115 Classification dictionary creation means
16 Classification model information storage means
17 Image attribute information storage means
131 Image information extraction means
132 Shooting unit extraction means
133 Common attribute determination means
1800 Axis object extraction means
2800 Region information calculation means
3800 Input means
3801 Registration means
3815 Classification dictionary creation means
4000 Image classification system
4110, 4512 Transmission/reception means
4116 Event related information storage means
4150 Reception means
4500 Server device
4510 Model information storage means
4511, 4611 Control means
4600 Terminal device
4610 Data storage means
4612 Transmission means
Claims (24)
- 1. An image processing device comprising: attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when an image related to the event is shot; subject information storage means for storing, for each of a plurality of events, subjects that can be included in images shot for the event; extraction means for extracting, for an image group made up of a plurality of shot images, a shooting attribute common to a predetermined number of images, based on information related to shooting corresponding to each of the plurality of images; specifying means for specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; and associating means for associating, for each of the plurality of images included in the image group, the image with a subject specified by the specifying means when the image contains that subject.
- 2. The image processing device according to claim 1, wherein the extraction means divides the image group into one or more image sets based on the information related to shooting corresponding to each of the plurality of images included in the image group, and extracts one or more of the shooting attributes for each divided image set.
- 3. The image processing device according to claim 2, wherein the information related to shooting includes at least one of time information indicating the time when the image was shot, place information indicating the place, person information on a person as a subject, shooting information indicating the shooting method, and environment information indicating the environment at the time of shooting.
- 4. The image processing device according to claim 3, wherein the extraction means calculates the similarity between the pieces of information used for the division, and performs the division so that similar images are grouped together, using the calculated similarity.
- 5. The image processing device according to claim 3, wherein the extraction means uses, as the shooting attribute, statistical information acquired using at least one piece of information included in the attribute.
- 6. The image processing device according to claim 5, wherein, for each image set, when the family of one person is identified from one or more pieces of person information in the image set, the extraction means acquires family composition information indicating that family as the statistical information, or acquires, as the statistical information, person subject information indicating the sex or age distribution of persons obtained from each of the one or more pieces of person information in the image set.
- 7. The image processing device according to claim 6, wherein the family composition information or the subject information includes time transition information indicating secular change as the degree of change of the face and body over time.
- 8. The image processing device according to claim 2, wherein a plurality of subjects are associated with the event; the shooting attribute is associated with the event by associating the shooting attribute with subjects that can be included in images shot for the event; and the specifying means counts, for each of the shooting attributes, a priority for the subjects corresponding to the shooting attribute, specifies, as a candidate event, the event having the highest total subject priority among the plurality of events, and specifies, among a plurality of objects related to the specified candidate event, subjects having a priority equal to or higher than a predetermined value.
- 9. The image processing device according to claim 8, wherein a priority corresponding to each shooting attribute is assigned to that shooting attribute, and the specifying means counts, for each of the shooting attributes, the priority assigned to the shooting attribute for the subjects corresponding to it.
- 10. The image processing device according to claim 8, wherein similarity identification information identifying each set of similar subjects among the plurality of subjects is associated with that set, and the specifying means specifies, for each set of similar subjects, the subject having the highest priority among those equal to or higher than the predetermined value.
- 11. The image processing device according to claim 8, wherein the associating means determines, according to the result of the classification, whether classification is necessary again, and when it is determined that classification is necessary again, the specifying means specifies another set that does not include the set of subjects used for the classification, a set that includes all of the subjects, or another set that includes part of the set of subjects.
- 12. The image processing device according to claim 11, wherein the associating means determines that classification is necessary again when, according to the classification result, the number of images classified under one subject is a predetermined number or more, and the specifying means specifies another event that includes the one subject and specifies, among a plurality of subjects corresponding to the specified other event, subjects having a priority equal to or higher than the predetermined value.
- 13. The image processing device according to claim 8, wherein a value corresponding to the difficulty of shooting is assigned to each subject, and the specifying means counts as a priority, for each of the shooting attributes, the value corresponding to the difficulty assigned to the subject corresponding to the shooting attribute.
- 14. The image processing device according to claim 1, further comprising region division means for dividing, prior to the extraction by the extraction means, each of the plurality of images included in the image group into a plurality of regions according to the composition within the image, wherein the extraction means extracts one or more shooting attributes for each divided region.
- 15. The image processing device according to claim 14, wherein the region division means divides the image into a person region and the other regions.
- 16. The image processing device according to claim 1, further comprising: accepting means for accepting, from a user, a subject extraction instruction for one image group belonging to the same event; and registration means for, when the extraction instruction is accepted, extracting from the one image group the subjects of the event to which the one image group belongs, associating the extracted subjects with the event to which the one image group belongs, and registering them in the subject information storage means.
- 17. The image processing device according to claim 16, wherein the registration means extracts a shooting attribute from the one image group and associates the extracted shooting attribute with the event to which the one image group belongs.
- 18. The image processing device according to claim 1, further comprising acquisition means for acquiring, from an external device via a network, model information that is associated with a subject specified by the specifying means and consists of the feature amounts of that subject, wherein the associating means determines, for each of the plurality of images, whether the image contains the subject specified by the specifying means, from the feature amounts of the image and the feature amounts indicated by the model information acquired by the acquisition means.
- 19. The image processing device according to claim 1, further comprising acquisition means for acquiring the image group from an external device via a network.
- 20. A processing method used in an image processing device comprising attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when an image related to the event is shot, subject information storage means for storing in advance, for each of a plurality of events, subjects that can be included in images shot for the event, extraction means, specifying means, and associating means, the method comprising: an extraction step in which the extraction means extracts, for an image group made up of a plurality of shot images, a shooting attribute common to a predetermined number of images, based on information related to shooting corresponding to each of the plurality of images; a specifying step in which the specifying means specifies, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; and an associating step in which the associating means associates, for each of the plurality of images included in the image group, the image with a subject specified by the specifying means when the image contains that subject.
- 21. A computer program used in an image processing device comprising attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when an image related to the event is shot, subject information storage means for storing in advance, for each of a plurality of events, subjects that can be included in images shot for the event, extraction means, specifying means, and associating means, the program causing the extraction means to execute an extraction step of extracting, for an image group made up of a plurality of shot images, a shooting attribute common to a predetermined number of images, based on information related to shooting corresponding to each of the plurality of images; the specifying means to execute a specifying step of specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; and the associating means to execute an associating step of associating, for each of the plurality of images included in the image group, the image with a subject specified by the specifying means when the image contains that subject.
- 22. An integrated circuit used in an image processing device, comprising: attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when an image related to the event is shot; subject information storage means for storing in advance, for each of a plurality of events, subjects that can be included in images shot for the event; extraction means for extracting, for an image group made up of a plurality of shot images, a shooting attribute common to a predetermined number of images, based on information related to shooting corresponding to each of the plurality of images; specifying means for specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; and associating means for associating, for each of the plurality of images included in the image group, the image with a subject specified by the specifying means when the image contains that subject.
- 23. An image processing system comprising an image processing device and a server device connected to the image processing device via a network, wherein the image processing device comprises: attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when an image related to the event is shot; subject information storage means for storing, for each of a plurality of events, subjects that can be included in images shot for the event; extraction means for extracting, for an image group made up of a plurality of shot images, a shooting attribute common to a predetermined number of images, based on information related to shooting corresponding to each of the plurality of images; specifying means for specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; acquisition means for acquiring, from the server device via the network, model information that is associated with a subject specified by the specifying means and consists of the feature amounts of that subject; and associating means for determining, for each of the plurality of images included in the image group, whether the image contains the subject specified by the specifying means, from the feature amounts of the image and the feature amounts indicated by the model information acquired by the acquisition means, and associating the image with the subject when it is determined that the image contains the subject; and the server device comprises: model information storage means for storing, for each subject to be stored in the subject information storage means, model information consisting of the feature amounts of the subject in association with the subject; and transmission means for transmitting the model information corresponding to the subject specified by the image processing device to the image processing device via the network.
- 24. An image processing system comprising an image processing device and a terminal device connected to the image processing device via a network, wherein the terminal device comprises: image storage means for storing an image group made up of a plurality of shot images; and transmission means for transmitting the image group to the image processing device via the network; and the image processing device comprises: attribute storage means for storing, for each of a plurality of events, in association with the event, a shooting attribute indicating a shooting condition presumed to be satisfied when an image related to the event is shot; subject information storage means for storing, for each of a plurality of events, subjects that can be included in images shot for the event; acquisition means for acquiring the image group from the terminal device via the network; extraction means for extracting, for the image group acquired by the acquisition means, a shooting attribute common to a predetermined number of images, based on information related to shooting corresponding to each of the plurality of images; specifying means for specifying, for the event associated with the extracted shooting attribute, the subjects stored in the subject information storage means; and associating means for associating, for each of the plurality of images included in the image group, the image with a subject specified by the specifying means when the image contains that subject.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/805,153 US9008438B2 (en) | 2011-04-25 | 2012-02-29 | Image processing device that associates photographed images that contain a specified object with the specified object |
CN201280001795.7A CN102959551B (zh) | 2011-04-25 | 2012-02-29 | 图像处理装置 |
JP2013511884A JP5848336B2 (ja) | 2011-04-25 | 2012-02-29 | 画像処理装置 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011097669 | 2011-04-25 | ||
JP2011-097669 | 2011-04-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012147256A1 true WO2012147256A1 (ja) | 2012-11-01 |
Family
ID=47071792
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/001392 WO2012147256A1 (ja) | 2011-04-25 | 2012-02-29 | 画像処理装置 |
Country Status (4)
Country | Link |
---|---|
US (1) | US9008438B2 (ja) |
JP (1) | JP5848336B2 (ja) |
CN (1) | CN102959551B (ja) |
WO (1) | WO2012147256A1 (ja) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20140061186A (ko) * | 2012-11-13 | 2014-05-21 | 에스케이하이닉스 주식회사 | 반도체 시스템 |
JP2015210780A (ja) * | 2014-04-30 | 2015-11-24 | キヤノン株式会社 | 画像処理装置、画像処理方法及びプログラム |
CN108090503A (zh) * | 2017-11-28 | 2018-05-29 | 东软集团股份有限公司 | 多分类器的在线调整方法、装置、存储介质及电子设备 |
WO2018189962A1 (ja) * | 2017-04-12 | 2018-10-18 | 株式会社日立製作所 | 物体認識装置、物体認識システム、及び物体認識方法 |
JP2018200597A (ja) * | 2017-05-29 | 2018-12-20 | 株式会社Nttドコモ | 画像推定装置 |
JP2019159539A (ja) * | 2018-03-09 | 2019-09-19 | オムロン株式会社 | メタデータ評価装置、メタデータ評価方法、およびメタデータ評価プログラム |
US10678828B2 (en) | 2016-01-03 | 2020-06-09 | Gracenote, Inc. | Model-based media classification service using sensed media noise characteristics |
WO2022190618A1 (ja) * | 2021-03-09 | 2022-09-15 | 富士フイルム株式会社 | レコメンド情報提示装置、レコメンド情報提示装置の作動方法、レコメンド情報提示装置の作動プログラム |
Families Citing this family (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6025522B2 (ja) * | 2012-11-27 | 2016-11-16 | キヤノン株式会社 | 画像処理装置、画像処理方法、画像処理システム及びプログラム |
JP5859471B2 (ja) * | 2013-03-19 | 2016-02-10 | 富士フイルム株式会社 | 電子アルバム作成装置および電子アルバムの製造方法 |
KR20150007723A (ko) * | 2013-07-12 | 2015-01-21 | 삼성전자주식회사 | 복수의 오브젝트들 사이에 공통된 오브젝트 속성에 대응하는 작업을 수행하는 전자 장치 제어 방법 |
KR20150100113A (ko) * | 2014-02-24 | 2015-09-02 | 삼성전자주식회사 | 영상 처리 장치 및 이의 영상 처리 방법 |
US10419657B2 (en) | 2014-10-26 | 2019-09-17 | Galileo Group, Inc. | Swarm approach to consolidating and enhancing smartphone target imagery by virtually linking smartphone camera collectors across space and time using machine-to machine networks |
US10049273B2 (en) * | 2015-02-24 | 2018-08-14 | Kabushiki Kaisha Toshiba | Image recognition apparatus, image recognition system, and image recognition method |
US10255703B2 (en) | 2015-12-18 | 2019-04-09 | Ebay Inc. | Original image generation system |
TWI579718B (zh) * | 2016-06-15 | 2017-04-21 | 陳兆煒 | 圖形資源管理系統及方法與內儲圖形資源管理程式之電腦程式產品 |
CN106485199A (zh) * | 2016-09-05 | 2017-03-08 | 华为技术有限公司 | 一种车身颜色识别的方法及装置 |
US11681942B2 (en) | 2016-10-27 | 2023-06-20 | Dropbox, Inc. | Providing intelligent file name suggestions |
US9852377B1 (en) * | 2016-11-10 | 2017-12-26 | Dropbox, Inc. | Providing intelligent storage location suggestions |
US10893182B2 (en) | 2017-01-10 | 2021-01-12 | Galileo Group, Inc. | Systems and methods for spectral imaging with compensation functions |
US10554909B2 (en) | 2017-01-10 | 2020-02-04 | Galileo Group, Inc. | Systems and methods for spectral imaging with a transmitter using a plurality of light sources |
WO2018163906A1 (ja) * | 2017-03-06 | 2018-09-13 | 株式会社ミックウェア | 情報処理装置、情報処理システム及び情報処理プログラム |
US10970552B2 (en) * | 2017-09-28 | 2021-04-06 | Gopro, Inc. | Scene classification for image processing |
EP3635513B1 (en) * | 2018-05-04 | 2021-07-07 | Google LLC | Selective detection of visual cues for automated assistants |
CN108848306B (zh) * | 2018-06-25 | 2021-03-02 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image processing method and apparatus, electronic device, and computer-readable storage medium |
CN108881740B (zh) * | 2018-06-28 | 2021-03-02 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image method and apparatus, electronic device, and computer-readable storage medium |
JP6812387B2 (ja) * | 2018-07-02 | 2021-01-13 | Canon Inc. | Image processing apparatus, image processing method, program, and storage medium |
KR20200017306A (ko) * | 2018-08-08 | 2020-02-18 | Samsung Electronics Co., Ltd. | Electronic device for providing information about an item based on its category |
US11055539B2 (en) * | 2018-09-27 | 2021-07-06 | Ncr Corporation | Image processing for distinguishing individuals in groups |
CN109151320B (zh) * | 2018-09-29 | 2022-04-22 | Lenovo (Beijing) Co., Ltd. | Target object selection method and device |
CN111382296B (zh) * | 2018-12-28 | 2023-05-12 | Shenzhen Intellifusion Technologies Co., Ltd. | Data processing method, device, terminal, and storage medium |
JP7129383B2 (ja) * | 2019-07-03 | 2022-09-01 | FUJIFILM Corporation | Image processing apparatus, image processing method, image processing program, and recording medium storing the program |
CN112069344A (zh) * | 2020-09-03 | 2020-12-11 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image processing method and apparatus, electronic device, and storage medium |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030210808A1 (en) * | 2002-05-10 | 2003-11-13 | Eastman Kodak Company | Method and apparatus for organizing and retrieving images containing human faces |
US20050104976A1 (en) * | 2003-11-17 | 2005-05-19 | Kevin Currans | System and method for applying inference information to digital camera metadata to identify digital picture content |
US7756866B2 (en) * | 2005-08-17 | 2010-07-13 | Oracle International Corporation | Method and apparatus for organizing digital images with embedded metadata |
JP2007058532A (ja) | 2005-08-24 | Information processing system, information processing apparatus and method, program, and recording medium |
US20080208791A1 (en) * | 2007-02-27 | 2008-08-28 | Madirakshi Das | Retrieving images based on an example image |
JP4798042B2 (ja) | 2007-03-29 | Face detection device, face detection method, and face detection program |
US8611677B2 (en) * | 2008-11-19 | 2013-12-17 | Intellectual Ventures Fund 83 Llc | Method for event-based semantic classification |
JP5063632B2 (ja) | 2009-03-10 | Learning model generation device, object detection system, and program |
JP5471749B2 (ja) * | 2010-04-09 | Content search apparatus and method, and program |
US8655893B2 (en) * | 2010-07-16 | 2014-02-18 | Shutterfly, Inc. | Organizing images captured by multiple image capture devices |
2012
- 2012-02-29 CN CN201280001795.7A patent/CN102959551B/zh active Active
- 2012-02-29 WO PCT/JP2012/001392 patent/WO2012147256A1/ja active Application Filing
- 2012-02-29 US US13/805,153 patent/US9008438B2/en active Active
- 2012-02-29 JP JP2013511884A patent/JP5848336B2/ja not_active Expired - Fee Related
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002010178A (ja) * | 2000-06-19 | 2002-01-11 | Sony Corp | Image management system, image management method, and storage medium |
JP2003016448A (ja) * | 2001-03-28 | 2003-01-17 | Eastman Kodak Co | Event clustering of images using foreground/background segmentation |
JP2006203574A (ja) * | 2005-01-20 | 2006-08-03 | Matsushita Electric Ind Co Ltd | Image display device |
JP2007213183A (ja) * | 2006-02-08 | 2007-08-23 | Seiko Epson Corp | Digital image data classification device, digital image data classification method, and digital image data classification program |
WO2011001587A1 (ja) * | 2009-07-01 | 2011-01-06 | NEC Corporation | Content classification device, content classification method, and content classification program |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102013583B1 | 2012-11-13 | SK Hynix Inc. | Semiconductor system |
KR20140061186A * | 2012-11-13 | 2014-05-21 | SK Hynix Inc. | Semiconductor system |
JP2015210780A * | 2014-04-30 | 2015-11-24 | Canon Inc. | Image processing apparatus, image processing method, and program |
US10902043B2 | 2016-01-03 | 2021-01-26 | Gracenote, Inc. | Responding to remote media classification queries using classifier models and context parameters |
US10678828B2 | 2016-01-03 | 2020-06-09 | Gracenote, Inc. | Model-based media classification service using sensed media noise characteristics |
JP2018180879A * | 2017-04-12 | 2018-11-15 | Hitachi, Ltd. | Object recognition device, object recognition system, and object recognition method |
WO2018189962A1 * | 2017-04-12 | 2018-10-18 | Hitachi, Ltd. | Object recognition device, object recognition system, and object recognition method |
US10963736B2 | 2017-04-12 | 2021-03-30 | Hitachi, Ltd. | Object recognition apparatus, object recognition system, and object recognition method |
JP2018200597A * | 2017-05-29 | 2018-12-20 | NTT Docomo, Inc. | Image estimation device |
CN108090503A * | 2017-11-28 | 2018-05-29 | Neusoft Corporation | Online adjustment method and apparatus for multiple classifiers, storage medium, and electronic device |
CN108090503B * | 2017-11-28 | 2021-05-07 | Neusoft Corporation | Online adjustment method and apparatus for multiple classifiers, storage medium, and electronic device |
JP2019159539A * | 2018-03-09 | 2019-09-19 | Omron Corporation | Metadata evaluation device, metadata evaluation method, and metadata evaluation program |
JP7143599B2 | 2018-03-09 | 2022-09-29 | Omron Corporation | Metadata evaluation device, metadata evaluation method, and metadata evaluation program |
WO2022190618A1 * | 2021-03-09 | 2022-09-15 | FUJIFILM Corporation | Recommendation information presentation device, method for operating recommendation information presentation device, and program for operating recommendation information presentation device |
Also Published As
Publication number | Publication date |
---|---|
CN102959551A (zh) | 2013-03-06 |
JPWO2012147256A1 (ja) | 2014-07-28 |
JP5848336B2 (ja) | 2016-01-27 |
CN102959551B (zh) | 2017-02-08 |
US20130101223A1 (en) | 2013-04-25 |
US9008438B2 (en) | 2015-04-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5848336B2 (ja) | Image processing device | |
US8953895B2 (en) | Image classification apparatus, image classification method, program, recording medium, integrated circuit, and model creation apparatus | |
JP6023058B2 (ja) | Image processing device, image processing method, program, and integrated circuit | |
KR101417548B1 (ko) | Method and system for generating and labeling events in photo collections | |
US9552374B2 (en) | Imaging workflow using facial and non-facial features | |
JP5537557B2 (ja) | Method for semantic classification by event | |
US8055081B2 (en) | Image classification using capture-location-sequence information | |
US20130111373A1 (en) | Presentation content generation device, presentation content generation method, presentation content generation program, and integrated circuit | |
US9141856B2 (en) | Clothing image analysis apparatus, method, and integrated circuit for image event evaluation | |
US20110184953A1 (en) | On-location recommendation for photo composition | |
US20140112530A1 (en) | Image recognition device, image recognition method, program, and integrated circuit | |
US20160179846A1 (en) | Method, system, and computer readable medium for grouping and providing collected image content | |
US20120148118A1 (en) | Method for classifying images and apparatus for the same | |
US20090091798A1 (en) | Apparel as event marker | |
US8320609B2 (en) | Device and method for attaching additional information | |
JP2014093058A (ja) | Image management device, image management method, program, and integrated circuit | |
JP2014092955A (ja) | Similar content search processing device, similar content search processing method, and program | |
US20140233811A1 (en) | Summarizing a photo album | |
US8270731B2 (en) | Image classification using range information | |
CN112069342A (zh) | Image classification method and apparatus, electronic device, and storage medium | |
US8533196B2 (en) | Information processing device, processing method, computer program, and integrated circuit | |
JP2012058926A (ja) | Keyword assignment device and program | |
WO2014186392A2 (en) | Summarizing a photo album | |
US11210829B2 (en) | Image processing device, image processing method, program, and recording medium | |
Cadik et al. | Camera elevation estimation from a single mountain landscape photograph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | WIPO information: entry into national phase | Ref document number: 201280001795.7; Country of ref document: CN |
| WWE | WIPO information: entry into national phase | Ref document number: 13805153; Country of ref document: US |
| ENP | Entry into the national phase | Ref document number: 2013511884; Country of ref document: JP; Kind code of ref document: A |
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 12777076; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 12777076; Country of ref document: EP; Kind code of ref document: A1 |