WO2023175931A1 - Image classification device, image classification method, and recording medium - Google Patents

Image classification device, image classification method, and recording medium

Info

Publication number
WO2023175931A1
WO2023175931A1 (PCT/JP2022/012704)
Authority
WO
WIPO (PCT)
Prior art keywords
image
target subject
image classification
images
captured
Prior art date
Application number
PCT/JP2022/012704
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
尊裕 中川
Original Assignee
日本電気株式会社 (NEC Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社 (NEC Corporation)
Priority to JP2024507433A (granted as JP7729466B2)
Priority to PCT/JP2022/012704
Publication of WO2023175931A1

Classifications

    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K29/00 Other apparatus for animal husbandry
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/215 Motion-based segmentation

Definitions

  • the present disclosure relates to a technique for classifying captured images.
  • Patent Document 1 describes a device that identifies and classifies the motion of a subject from a plurality of image data obtained by capturing images of the subject.
  • Even according to Patent Document 1, however, it is difficult to classify the photos and videos that pet owners prefer.
  • One objective of the present disclosure is to provide an image classification device that can classify a user's preferred images from among a plurality of images.
  • an image classification device includes: image acquisition means for acquiring images in which a target subject is captured; image classification means for classifying, from the images, images in which a predetermined state of the target subject is captured, using a model in which the relationship between an image in which the target subject is captured and a predetermined state of the target subject has been machine-learned; and output means for outputting the image and the classification result.
  • an image classification method includes: acquiring images in which a target subject is captured; classifying, from the images, images in which a predetermined state of the target subject is captured, using a model in which the relationship between an image in which the target subject is captured and a predetermined state of the target subject has been machine-learned; and outputting the image and the classification result.
  • the recording medium records a program that causes a computer to execute processing of: acquiring images in which a target subject is captured; classifying, from the images, images in which a predetermined state of the target subject is captured, using a model in which the relationship between an image in which the target subject is captured and a predetermined state of the target subject has been machine-learned; and outputting the image and the classification result.
  • FIG. 1 shows the overall configuration of an image classification system according to a first embodiment.
  • FIG. 2 is a block diagram showing the configurations of a server and a user terminal.
  • FIG. 3 is a block diagram showing the functional configuration of the server.
  • FIG. 4 is a block diagram showing the functional configuration of a learning device.
  • FIG. 5 is a flowchart of processing by the image classification system.
  • FIG. 6 is a block diagram showing the functional configuration of Modification 1 of the first embodiment.
  • FIG. 7 is a block diagram showing the functional configuration of an image classification device according to a second embodiment.
  • FIG. 8 is a flowchart of processing by the image classification device of the second embodiment.
  • FIG. 1 shows the overall configuration of an image classification system to which an image classification device according to the present disclosure is applied.
  • the image classification system 1 includes a server 200 and a user terminal 300 used by an owner.
  • Server 200 is an example of an image classification device.
  • the server 200 and the owner's user terminal 300 can communicate wirelessly.
  • the server 200 acquires images showing a predetermined state of the pet based on the video transmitted from the owner's user terminal 300. Specifically, when playing with the pet P, the owner keeps the user terminal 300 in recording mode and shoots a video. The user terminal 300 then transmits the captured video (hereinafter also referred to as the "shot video") to the server 200.
  • the server 200 extracts still images frame by frame from the video shot by the user terminal 300, and uses AI (Artificial Intelligence) image analysis to classify whether each image depicts a predetermined state of the pet.
  • an image that shows a predetermined state of a pet includes, for example, an image that shows the pet's face, an image that shows the pet jumping, or an image that shows the pet playing. In other words, it is an image of the pet that the pet owner feels is a good shot.
  • the server 200 attaches, to each still image extracted from the video shot by the user terminal 300 (hereinafter also referred to as an "extracted image"), a classification result as to whether or not it is a GOOD shot, associates it with the owner, and saves it in the database.
  • the owner accesses the server 200 from the user terminal 300 or another terminal, and views only the GOOD shots as a slide show or the like. This allows the owner to obtain images of the pet without missing a photo opportunity. Furthermore, by using smart glasses as the user terminal 300, the pet owner can obtain GOOD shots while interacting with the pet. Note that instead of smart glasses, other glasses-type wearable terminals such as AR (Augmented Reality) glasses, MR (Mixed Reality) glasses, and VR (Virtual Reality) glasses may be used.
  • images classified as GOOD shots are not limited to still images, and may be moving images.
  • the server 200 extracts video clips from the video captured by the user terminal 300 at predetermined time intervals. The server 200 then classifies each clip as to whether or not it contains a GOOD shot, and saves the extracted clip (also referred to as an "extracted image") together with the classification result. A minimal sketch of this extraction step is shown below.
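  • As a rough illustration of the extraction step, the following is a minimal sketch assuming OpenCV as the video-processing library; the function name extract_frames and the fixed sampling interval are illustrative assumptions, not details taken from this publication.

```python
# Illustrative sketch only: sample one still frame per interval_sec from a
# captured video. The publication does not specify the extraction mechanism.
import cv2


def extract_frames(video_path: str, interval_sec: float = 1.0) -> list:
    """Return still frames sampled from the video at a fixed interval."""
    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unknown
    step = max(1, int(round(fps * interval_sec)))
    frames = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:  # keep one frame per interval
            frames.append(frame)
        index += 1
    capture.release()
    return frames
```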
  • FIG. 2(A) is a block diagram showing the configuration of the server 200.
  • the server 200 mainly includes a communication unit 211, a processor 212, a memory 213, a recording medium 214, and a database (DB) 215.
  • the communication unit 211 sends and receives data to and from external devices. Specifically, the communication unit 211 transmits and receives information to and from the owner's user terminal 300.
  • the processor 212 is a computer such as a CPU (Central Processing Unit), and controls the entire server 200 by executing a program prepared in advance.
  • the processor 212 may be a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), or the like.
  • the memory 213 is composed of ROM (Read Only Memory), RAM (Random Access Memory), and the like.
  • the memory 213 is also used as a working memory while the processor 212 executes various processes. Furthermore, the memory 213 temporarily stores a series of moving images shot by the user terminal 300 under the control of the processor 212. This moving image is stored in the memory 213 in association with, for example, the owner's identification information and time stamp information.
  • the recording medium 214 is a non-volatile, non-transitory recording medium such as a disk-shaped recording medium or a semiconductor memory, and is configured to be removably attached to the server 200.
  • the recording medium 214 records various programs executed by the processor 212.
  • a database (DB) 215 stores extracted images with classification results as to whether they are GOOD shots or not.
  • the DB 215 may include an external storage device such as a hard disk connected to or built into the server 200, or may include a removable storage medium such as a flash memory. Note that instead of providing the DB 215 in the server 200, the DB 215 may be provided in an external server or the like, and the extracted images with classification results as to whether they are GOOD shots or not may be stored in the server through communication.
  • the server 200 may include an input unit, such as a keyboard and a mouse, and a display unit, such as a liquid crystal display, with which an administrator or the like can give instructions and make inputs.
  • FIG. 2(B) is a block diagram showing the internal configuration of the user terminal 300 used by the owner.
  • the user terminal 300 is, for example, a terminal device such as smart glasses or a smartphone.
  • the user terminal 300 includes a communication section 311, a processor 312, a memory 313, a display section 314, a camera 315, and a microphone 316.
  • the communication unit 311 transmits and receives data to and from an external device. Specifically, the communication unit 311 transmits and receives information to and from the server 200.
  • the processor 312 is a computer such as a CPU, and controls the entire user terminal 300 by executing a program prepared in advance.
  • the processor 312 may be a GPU, FPGA, DSP, ASIC, or the like.
  • the processor 312 transmits the moving image shot by the camera 315 to the server 200 by executing a program prepared in advance.
  • the memory 313 is composed of ROM, RAM, and the like. The memory 313 stores various programs executed by the processor 312, and is also used as a working memory while the processor 312 executes various processes.
  • the moving image captured by the camera 315 is stored in the memory 313 and then transmitted to the server 200.
  • the display unit 314 is, for example, a liquid crystal display device, and displays a moving image taken by the camera 315, an extracted image of a GOOD shot stored in the server 200, and the like.
  • the camera 315 includes a camera that photographs the user's field of view (also referred to as an "out camera") and a camera that photographs the user's eyeballs (also referred to as an "eye camera”).
  • the out camera is mounted on the outside of the user terminal 300.
  • the out camera photographs the user's field of view, including subjects such as pets, and the captured video is transmitted to the server 200.
  • the server 200 can acquire an image of a subject such as a pet.
  • the eye camera is mounted inside the user terminal 300 to photograph the user's eyeball.
  • the eye camera photographs the user's eyeball and sends the image to the processor 312.
  • the processor 312 detects the movement of the user's line of sight based on the image of the user's eyeball taken by the eye camera. Thereby, the user terminal 300 can acquire information such as the user's line of sight direction.
  • the microphone 316 collects the user's voice and surrounding sounds and transmits them to the server 200.
  • the server 200 can infer that the user has uttered a predetermined word, or that the user has given an instruction or command to the pet, based on the user's voice or the pet's cry.
  • FIG. 3 is a block diagram showing the functional configuration of the server 200.
  • the server 200 includes an image acquisition section 411 and an image classification section 412.
  • a video shot by the user terminal 300 is input to the server 200.
  • the video captured by the user terminal 300 is input to the image acquisition unit 411.
  • the image acquisition unit 411 extracts a still image or a moving image from a moving image captured by the user terminal 300 as an extracted image.
  • the image acquisition unit 411 outputs the extracted image to the image classification unit 412.
  • the image classification unit 412 classifies whether the extracted image acquired from the image acquisition unit 411 is a GOOD shot or not using a previously prepared image recognition model or the like.
  • This image recognition model is a machine learning model trained in advance to classify whether an image is a GOOD shot or not, and is also referred to as an "image classification model" hereinafter.
  • when the image classification model classifies the extracted image as a GOOD shot, the image classification unit 412 adds additional information to the extracted image indicating that it is a GOOD shot.
  • when the image classification model classifies the extracted image as not a GOOD shot, that is, as a BAD shot, the image classification unit 412 adds additional information to the extracted image indicating that it is a BAD shot.
  • a BAD shot is an image other than a GOOD shot, such as an image that does not include a pet's face.
  • the image classification unit 412 outputs the extracted image with additional information added to the DB 215.
  • FIG. 4 is a block diagram illustrating the learning of the image classification model, and shows learning data 511 and a learning device 512.
  • the learning data 511 is image data (hereinafter also referred to as "teacher data") that is labeled in advance as to whether it is a GOOD shot or not. Labeling of image data is performed based on whether a predetermined part of the pet is captured, whether the pet is performing a predetermined movement, etc.
  • the predetermined part of the pet refers to the pet's face or the like. For example, an image showing a pet's face is labeled as GOOD shot. On the other hand, images that do not include the pet, images that show the pet facing backwards, and images that only include the pet's torso or legs are labeled as BAD shots.
  • the predetermined motion of the pet refers to a motion of the pet that attracts attention. For example, a GOOD shot label is given to an image of a pet jumping or an image of a pet holding a tool in its mouth.
  • a pet owner may select images of a plurality of pets to determine whether they are GOOD shots or not, and use the images labeled with the results as training data. Thereby, it is possible to generate an image classification model that can classify images that more closely match the pet owner's preferences.
  • furthermore, images of animals posted on an SNS (Social Networking Service) may be used as learning data. In this case, images posted by pet owners or third parties on the SNS are labeled as GOOD shots. This increases the amount of training data, making it possible to generate a more accurate image classification model.
  • the learning device 512 learns the pattern of GOOD shots based on the learning data 511, and outputs an image classification model as a learned model. As a result, an image classification model is generated that has learned the relationship between images of pets and the states of pets that correspond to GOOD shots. A minimal training sketch follows below.
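  • As one way such a learned model could be produced, the following sketch assumes PyTorch and a ResNet-18 backbone with a two-class (GOOD/BAD) output head; the publication does not specify the model architecture or training framework, so everything here is an illustrative assumption.

```python
# Illustrative sketch only: train a two-class GOOD/BAD image classifier.
import torch
import torch.nn as nn
from torchvision import models


def build_and_train(loader, epochs: int = 5) -> nn.Module:
    """Train a GOOD/BAD classifier on batches of labeled (image, label) pairs."""
    model = models.resnet18(weights=None)          # backbone; pretrained weights could also be used
    model.fc = nn.Linear(model.fc.in_features, 2)  # two outputs: BAD (0) / GOOD (1)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:  # labels: 1 = GOOD shot, 0 = BAD shot
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```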
  • the image classification unit 412 uses the image classification model to estimate whether an image is a GOOD shot. Specifically, the image classification model calculates a score indicating the probability that the input image is a GOOD shot (referred to as the "GOOD shot score") and a score indicating the probability that the image is a BAD shot (referred to as the "BAD shot score"). For example, the image classification model calculates each score such that the sum of the GOOD shot score and the BAD shot score is "1". The image classification model then compares the GOOD shot score and the BAD shot score with a predetermined threshold TH, and adopts the label whose score is greater than the threshold TH as the classification result.
  • for example, suppose the image classification model calculates a GOOD shot score of "0.8" and a BAD shot score of "0.2", and the threshold TH is set to "0.5". Since the GOOD shot score exceeds the threshold TH, the image classification model estimates that the image is a GOOD shot. A minimal sketch of this scoring logic follows below.
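  • The sketch below illustrates the scoring logic with a two-logit PyTorch model whose softmax outputs play the roles of the BAD shot score and GOOD shot score, so the two scores sum to 1 as described above; the output ordering and function name are assumptions for illustration.

```python
# Illustrative sketch only: softmax over two logits yields a BAD shot score
# and a GOOD shot score summing to 1; the label whose score exceeds the
# threshold TH is taken as the classification result.
import torch
import torch.nn.functional as F

TH = 0.5  # threshold from the example above


def classify(model: torch.nn.Module, image: torch.Tensor) -> str:
    model.eval()
    with torch.no_grad():
        logits = model(image.unsqueeze(0))            # add a batch dimension
        bad_score, good_score = F.softmax(logits, dim=1)[0].tolist()
    # e.g. good_score = 0.8, bad_score = 0.2 -> "GOOD" because 0.8 > TH
    return "GOOD" if good_score > TH else "BAD"
```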
  • FIG. 5 is a flowchart of image classification processing performed in the server 200. This processing is realized by the processor 212 shown in FIG. 2 executing a program prepared in advance and operating as each element shown in FIG. 3.
  • the image acquisition unit 411 acquires a captured video from the user terminal 300. Then, the image acquisition unit 411 acquires an image (still image or video) from the captured video (step S11). Next, the image classification unit 412 classifies whether the image acquired by the image acquisition unit 411 is a GOOD shot or not (step S12). Specifically, the image classification unit 412 calculates a score indicating the probability that the image is a GOOD shot, and a score indicating the probability that the image is a BAD shot. The image classification unit 412 compares each calculated score with a threshold TH and classifies whether the image is a GOOD shot or a BAD shot.
  • the image classification unit 412 attaches a classification result to the image acquired by the image acquisition unit 411 and stores it in the database (DB) 215 (step S13). For example, the image classification unit 412 attaches a flag of "1" to an image classified as a GOOD shot and "0" to an image classified as a BAD shot, and stores the image with the flag in the DB 215. The image classification process then ends. An end-to-end sketch of steps S11 to S13 follows below.
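  • Tying the earlier sketches together, the following sketch walks through steps S11 to S13; the SQLite storage and the helper functions to_tensor and frame_to_bytes are illustrative assumptions, and extract_frames and classify refer to the sketches above.

```python
# Illustrative sketch only: acquire frames (S11), classify them (S12), and
# store each frame with a 1/0 GOOD-shot flag (S13).
import sqlite3

import cv2
import numpy as np
import torch


def to_tensor(frame: np.ndarray) -> torch.Tensor:
    """Convert an OpenCV BGR frame to a normalized CHW float tensor."""
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    return torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0


def frame_to_bytes(frame: np.ndarray) -> bytes:
    """Encode a frame as PNG bytes so it can be stored as a BLOB."""
    _, buffer = cv2.imencode(".png", frame)
    return buffer.tobytes()


def run_image_classification(video_path: str, db_path: str, model) -> None:
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS extracted_images "
        "(id INTEGER PRIMARY KEY, frame BLOB, good_shot INTEGER)"
    )
    for frame in extract_frames(video_path):       # step S11: acquire images
        label = classify(model, to_tensor(frame))  # step S12: GOOD or BAD
        flag = 1 if label == "GOOD" else 0         # step S13: attach result
        conn.execute(
            "INSERT INTO extracted_images (frame, good_shot) VALUES (?, ?)",
            (frame_to_bytes(frame), flag),
        )
    conn.commit()
    conn.close()
```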
  • in this way, GOOD shot images that match the user's taste are extracted from the huge number of images taken by the user and stored in the DB 215 of the server 200.
  • the user can access the server 200 and view the GOOD shot images stored in the DB 215. Further, the user can download images of GOOD shots from the server 200 and save them on a terminal device such as the user terminal 300.
  • the server 200 classifies images based on extracted images extracted from captured videos.
  • the server 200 may also determine whether a predetermined state occurrence condition is satisfied, and classify the image using the determination result.
  • the predetermined state occurrence condition is a condition under which it is estimated that a GOOD shot has been taken, and hereinafter also referred to as a "GOOD shot occurrence condition.”
  • the conditions for generating a GOOD shot are determined based on, for example, biometric information and behavior information of the photographer.
  • FIG. 6 shows the functional configuration of the server 200a of Modification 1.
  • a condition determination unit 413 is provided in the server 200a.
  • the condition determination unit 413 acquires biometric information of the photographer and a time stamp from the user terminal 300. The condition determination unit 413 then determines, using a model trained in advance, whether the biometric information of the photographer or the like satisfies a predetermined condition, and outputs the determination result to the image classification unit 412.
  • Biometric information of the photographer includes line of sight, voice, heart rate, etc.
  • the biometric information of the photographer is acquired by the user terminal 300.
  • the user terminal 300 may acquire biological information from a camera, microphone, sensor, etc. installed in the user terminal 300, or may wirelessly communicate with external devices using Bluetooth (registered trademark), Wi-Fi (registered trademark), etc. Biometric information may be acquired from an external device through communication.
  • the predetermined conditions include, for example: the photographer's gaze being directed toward the pet; the photographer's voice being louder than a predetermined threshold; the photographer uttering a predetermined word such as "like"; and the photographer's heart rate exceeding a predetermined threshold.
  • the condition determination unit 413 may determine that the predetermined condition is satisfied not only at the time when the biometric information satisfies the predetermined condition but also at times shortly before and after that time, and may output those determination results to the image classification unit 412.
  • the condition determination unit 413 may determine whether the conditions for generating a GOOD shot are satisfied based on behavioral information of the photographer or the pet. For example, if the photographer gives a signal and the pet acts according to the signal, or if the photographer gives an instruction or command and the pet acts according to it, the condition determination unit 413 determines that the GOOD shot occurrence condition is satisfied and outputs the determination result to the image classification unit 412.
  • the behavior information of the photographer and the pet may be obtained from a microphone, a sensor, etc. installed in the user terminal 300, or may be obtained from a video captured by the user terminal 300.
  • the image classification unit 412 classifies whether the extracted image is a GOOD shot or not based on the extracted image input from the image acquisition unit 411 and the determination result input from the condition determination unit 413.
  • in this case, the image classification model used by the image classification unit 412 is a trained model that has been trained in advance to estimate whether or not an image is a GOOD shot based on both the extracted image and the determination result; a simplified sketch of combining these two inputs follows below.
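  • The publication states only that the extracted image and the determination result both feed a trained model. As one simplified illustration of combining the two inputs, the sketch below boosts the GOOD shot score when the occurrence condition is satisfied; the weighting scheme is purely an assumption.

```python
# Illustrative sketch only: combine the image score with the condition
# determination result from the condition determination unit 413.
def classify_with_condition(good_score: float,
                            condition_satisfied: bool,
                            th: float = 0.5,
                            bonus: float = 0.2) -> str:
    """Boost the GOOD shot score when the occurrence condition is satisfied."""
    adjusted = min(1.0, good_score + bonus) if condition_satisfied else good_score
    return "GOOD" if adjusted > th else "BAD"
```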
  • (Modification 2) Teacher data for relearning the image classification model may be created based on the GOOD shots classified according to the first embodiment described above. Specifically, the pet owner determines whether or not each GOOD shot classified by the server 200 is necessary. The server 200 keeps the GOOD shot label on images the pet owner has determined are necessary. On the other hand, for images the pet owner has determined are unnecessary, the server 200 changes the label to BAD shot. The server 200 then uses the image data of the GOOD shots and the image data of the BAD shots as learning data to relearn the image classification model. This allows the server 200 to classify GOOD shots that more closely match the owner's preferences. A minimal sketch of this relabeling step follows below.
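  • A minimal sketch of how the owner feedback in Modification 2 could be turned into relearning data; the data structures classified_good and owner_keeps are hypothetical. The resulting pairs could then be fed back into a training loop like the build_and_train sketch above.

```python
# Illustrative sketch only: turn owner feedback into (image, label) pairs.
def relabel_for_retraining(classified_good, owner_keeps):
    """classified_good: iterable of (image_id, image) previously labeled GOOD.
    owner_keeps: set of image ids the owner judged as necessary."""
    training_data = []
    for image_id, image in classified_good:
        label = 1 if image_id in owner_keeps else 0  # 1 = GOOD, 0 = BAD
        training_data.append((image, label))
    return training_data
```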
  • in the embodiment described above, the user terminal 300 always keeps the camera in recording mode and sends the captured video to the server 200. Instead, the user terminal 300 may start recording when the subject appears on the camera, end recording when the subject no longer appears on the camera, and send the video captured from the start of recording to the end of recording to the server 200. Specifically, the user terminal 300 captures images with the camera at predetermined timings and transmits them to the server 200. The server 200 determines whether or not the pet is captured by the camera of the user terminal 300 based on an image recognition model created in advance. If the pet is captured by the camera of the user terminal 300, the server 200 puts the user terminal 300 into recording mode and starts recording. After that, if the pet is no longer visible on the camera of the user terminal 300, the server 200 ends the recording mode of the user terminal 300. This makes it possible to reduce the amount of captured video data transmitted from the user terminal 300 to the server 200.
  • alternatively, the user terminal 300 may determine whether the pet is captured by its camera. In this case, the user terminal 300 uses an image recognition model created in advance or the like to determine whether or not the pet is captured by the camera, and controls the start and end of recording according to the determination result, as sketched below.
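  • A sketch of the recording control described in this modification, whichever of the server 200 or the user terminal 300 runs it; the pet_detector callable stands in for the "image recognition model created in advance" and is an assumption.

```python
# Illustrative sketch only: start recording when the pet appears in the
# camera image and stop when it disappears.
class RecordingController:
    def __init__(self, pet_detector):
        self.pet_detector = pet_detector  # callable: frame -> bool
        self.recording = False

    def on_frame(self, frame) -> str:
        pet_visible = self.pet_detector(frame)
        if pet_visible and not self.recording:
            self.recording = True
            return "start_recording"  # put the terminal into recording mode
        if not pet_visible and self.recording:
            self.recording = False
            return "stop_recording"   # end recording and send the video
        return "no_change"
```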
  • in the above description, the server 200 classifies GOOD shots based on videos shot with pets as subjects, but the subject is not limited to pets and may be a different subject, for example, a child.
  • the information acquired by the user terminal 300 is basically transmitted to the server 200 as is, and the GOOD shots are classified based on the information received by the server 200.
  • the user terminal 300 may perform processing for classifying GOOD shots and transmit the processing results to the server 200.
  • the server 200 may not be used, and the processing for classifying GOOD shots and the storage of the processing results may be performed on the user terminal 300. Thereby, the communication load from the user terminal 300 to the server 200 and the processing load on the server 200 can be reduced.
  • user terminal 300 is an example of an image classification device.
  • FIG. 7 is a block diagram showing the functional configuration of the image classification device 50 of the second embodiment.
  • the image classification device 50 of the second embodiment includes an image acquisition means 51, an image classification means 52, and an output means 53.
  • FIG. 8 is a flowchart of processing by the image classification device 50.
  • the image acquisition means 51 acquires an image in which the target subject is photographed (step S51).
  • the image classification means 52 classifies, from the images, images in which a predetermined state of the target subject is captured, using a model in which the relationship between an image in which the target subject is captured and a predetermined state of the target subject has been machine-learned (step S52).
  • the output means 53 outputs the image and the classification result (step S53).
  • according to the image classification device 50 of the second embodiment, it is possible to easily classify images that the user likes.
  • Appendix 1 An image classification device comprising: an image acquisition means for acquiring images in which a target subject is captured; an image classification means for classifying, from the images, images in which a predetermined state of the target subject is captured, using a model in which the relationship between an image in which the target subject is captured and a predetermined state of the target subject has been machine-learned; and an output means for outputting the image and the classification result.
  • Appendix 5 The image classification device according to appendix 4, wherein the condition determining means determines whether or not the occurrence condition is satisfied based on the line of sight direction of the photographer of the target subject.
  • Appendix 6 The image classification device according to appendix 4 or 5, wherein the condition determining means determines whether the occurrence condition is satisfied based on the heart rate of the photographer of the target subject.
  • Appendix 7 The image classification device according to any one of appendices 4 to 6, wherein the condition determining means determines whether the occurrence condition is satisfied based on the voice of the photographer of the target subject.
  • Appendix 8 The image classification device according to any one of appendices 4 to 7, wherein the condition determining means detects the voice of the photographer, and the occurrence condition is that the target subject acted in response to the voice of the photographer.
  • Appendix 9 The image classification device according to any one of appendices 1 to 8, wherein the image acquisition means starts acquiring images in which the target subject is captured when the target subject is captured by the camera of the terminal device, and finishes acquiring images in which the target subject is captured when the target subject is no longer captured by the camera of the terminal device.
  • Appendix 11 An image classification method comprising: acquiring images in which a target subject is captured; classifying, from the images, images in which a predetermined state of the target subject is captured, using a model in which the relationship between an image in which the target subject is captured and a predetermined state of the target subject has been machine-learned; and outputting the image and the classification result.
  • Appendix 12 A recording medium recording a program that causes a computer to execute processing of: acquiring images in which a target subject is captured; classifying, from the images, images in which a predetermined state of the target subject is captured, using a model in which the relationship between an image in which the target subject is captured and a predetermined state of the target subject has been machine-learned; and outputting the image and the classification result.

Landscapes

  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Environmental Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Animal Husbandry (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/JP2022/012704 2022-03-18 2022-03-18 Image classification device, image classification method, and recording medium WO2023175931A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2024507433A 2022-03-18 2022-03-18 Image classification device, image classification method, and program
PCT/JP2022/012704 2022-03-18 2022-03-18 Image classification device, image classification method, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/012704 2022-03-18 2022-03-18 Image classification device, image classification method, and recording medium

Publications (1)

Publication Number Publication Date
WO2023175931A1 2023-09-21

Family

ID=88022966

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/012704 Image classification device, image classification method, and recording medium 2022-03-18 2022-03-18

Country Status (2)

Country Link
JP (1) JP7729466B2
WO (1) WO2023175931A1


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009059257A 2007-09-03 2009-03-19 Sony Corp Information processing device, information processing method, and computer program
JP2009259122A 2008-04-18 2009-11-05 Canon Inc Image processing device, image processing method, and image processing program
US20200322518A1 (en) 2016-06-10 2020-10-08 Sony Corporation Information processing apparatus, information processing method, and program

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11205773A * 1998-01-20 1999-07-30 Fujitsu General Ltd Monitoring video recording system
JP2019047234A * 2017-08-31 2019-03-22 ソニーセミコンダクタソリューションズ株式会社 Information processing device, information processing method, and program
JP2020170916A * 2019-04-02 2020-10-15 パナソニックIpマネジメント株式会社 Information processing device and information processing method
JP2021064825A * 2019-10-10 2021-04-22 Kddi株式会社 Imaging device, learning device, control method, learning method, and computer program
WO2022050093A1 * 2020-09-01 2022-03-10 パナソニックIpマネジメント株式会社 Pet state estimation system, pet camera, server, pet state estimation method, and program

Also Published As

Publication number Publication date
JPWO2023175931A1 2023-09-21
JP7729466B2 (ja) 2025-08-26

Similar Documents

Publication Publication Date Title
US11089985B2 Systems and methods for using mobile and wearable video capture and feedback platforms for therapy of mental disorders
CN111291841B Image recognition model training method and apparatus, computer device, and storage medium
JP6431231B1 Imaging system, learning device, and imaging device
US9875445B2 (en) Dynamic hybrid models for multimodal analysis
US20200175262A1 (en) Robot navigation for personal assistance
US9408562B2 (en) Pet medical checkup device, pet medical checkup method, and non-transitory computer readable recording medium storing program
CN114514585A Disease prediction system, insurance premium calculation system, and disease prediction method
US20140316881A1 (en) Estimation of affective valence and arousal with automatic facial expression measurement
EP2813064A1 (en) Method and apparatus for unattended image capture
CN110688874A (zh) 人脸表情识别方法及其装置、可读存储介质和电子设备
KR102198337B1 Electronic apparatus, control method of the electronic apparatus, and computer-readable medium
US12307817B2 (en) Method and system for automatically capturing and processing an image of a user
CN106874922B Method and device for determining service parameters
WO2024038114A1 (en) Determining failure cases in trained neural networks using generative neural networks
KR102396794B1 Electronic device and control method thereof
KR102247481B1 Device and method for generating occupational images with age-converted faces
CN118430024B Deep-learning-based method and system for identifying stomach regions in capsule gastroscopy
KR20230164384A (ko) 컴퓨팅 장치에서 객체인식 모델 학습방법
WO2024253769A1 (en) Ai highlight detection using cascaded filtering of captured content
CN112884158B Training method, apparatus, and device for a machine learning program
WO2024253768A2 (en) Ai highlight detection trained on shared video
WO2023175931A1 Image classification device, image classification method, and recording medium
US20240160275A1 (en) Method of implementing content reacting to user responsiveness in metaverse environment
Yang et al. Automated recognition and classification of cat pain through deep learning
JP2023014766A Weight estimation device, body orientation determination device, body part image generation device, machine learning device, inference device, weight estimation method, and weight estimation program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22932201

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2024507433

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22932201

Country of ref document: EP

Kind code of ref document: A1