US20210279452A1 - Action-estimating device - Google Patents

Action-estimating device

Info

Publication number
US20210279452A1
Authority
US
United States
Prior art keywords
estimating
action
choices
probability
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US17/324,190
Inventor
Daisuke Kimura
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Asilla Inc
Original Assignee
Asilla Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Asilla Inc
Priority to US17/324,190
Assigned to ASILLA, INC. (Assignor: KIMURA, DAISUKE)
Publication of US20210279452A1
Legal status: Abandoned


Classifications

    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/23 Recognition of whole body movements, e.g. for sport training
    • G06K 9/00342
    • G16H 40/67 ICT specially adapted for the operation of medical equipment or devices, for remote operation
    • A61B 5/107 Measuring physical dimensions, e.g. size of the entire body or parts thereof
    • A61B 5/1128 Measuring movement of the entire body or parts thereof using image analysis
    • G06Q 50/10 Services
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G16H 80/00 ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
    • G06T 2207/10016 Video; Image sequence
    • G06T 2207/20076 Probabilistic image processing
    • G06T 2207/20081 Training; Learning
    • G06T 2207/30196 Human being; Person
    • G16H 40/20 ICT specially adapted for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • G16H 50/20 ICT specially adapted for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • the present invention relates to an action-estimating device for estimating an action of a subject appearing in a plurality of time-series images.
  • a device is known which detects a posture of a human appearing in time-series data based on the articulations of the human, and recognizes an action of the human based on a change of the posture (for example, Patent Document 1).
  • Patent Document 1: Japanese Patent Application Publication No. 2017-228100.
  • in action estimation, a highly probable choice is generally selected from among a plurality of choices based on the detected posture. Therefore, precise selection of the choice leads to action estimation with high accuracy.
  • it is therefore an object of the invention to provide an action-estimating device that estimates an action of a subject appearing in a plurality of time-series images with high accuracy.
  • the present invention provides an action-estimating device including: an estimating-side obtaining unit configured to obtain a plurality of time-series images in which a subject appears; an estimating-side detecting unit configured to detect a plurality of articulations appearing in each time-series image; an estimating-side measuring unit configured to measure coordinates of the detected plurality of articulations in each time-series image; an estimating unit configured to estimate an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images; and a storing unit configured to store a plurality of choices of the action to be estimated.
  • the estimating-side detecting unit further detects a background appearing in each time-series image.
  • the estimating unit calculates a probability of each of the plurality of choices based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and corrects the calculated probability of each of the plurality of choices based on the detected background.
  • the estimating unit excludes one or more choices from among the plurality of choices based on the detected background in order to estimate the action of the subject.
  • the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the storing unit.
  • the action estimating unit increases the probability of the associated choice which was not excluded or whose probability was not decreased, in order to estimate the action of the subject.
  • Another aspect of the present invention provides an action-estimating program installed on a computer storing a plurality of choices of action to be estimated.
  • the program including: a step for obtaining a plurality of time-series images in which a subject appears; a step for detecting a plurality of articulations appearing in each time-series image; a step for measuring coordinates of the detected plurality of articulations in each time-series image; a step for estimating an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images; and a step for detecting a background appeared in each time-series image.
  • a probability of each of the plurality of choices is calculated based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and the calculated probability of each of the plurality of choices is corrected based on the detected background.
  • one or more choices are excluded from among the plurality of choices based on the detected background.
  • the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer.
  • the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
  • an action-estimating device including: an estimating-side obtaining unit configured to obtain a plurality of time-series images in which a subject appears; an estimating-side detecting unit configured to detect a plurality of articulations appearing in each time-series image; an estimating-side measuring unit configured to measure coordinates of the detected plurality of articulations in each time-series image; an estimating unit configured to estimate an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images; a setting unit configured to set a purpose or application for the estimating of the action of the subject; and a storing unit configured to store a plurality of choices of the action to be estimated.
  • the estimating unit calculates a probability of each of the plurality of choices based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and corrects the calculated probability of each of the plurality of choices based on the set purpose or application.
  • the estimating unit excludes one or more choices from among the plurality of choices based on the set purpose or application in order to estimate the action of the subject.
  • the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the storing unit.
  • the action estimating unit increases the probability of the associated choice which was not excluded or whose probability was not decreased, in order to estimate the action of the subject.
  • Another aspect of the present invention provides an action-estimating program installed on a computer storing a plurality of choices of action to be estimated and in which a purpose or application for estimating of action of a subject is set including: a step for obtaining a plurality of time-series images in which the subject appears; a step for detecting a plurality of articulations appearing in each time-series image; a step for measuring coordinates of the detected plurality of articulations in each time-series image; and a step for estimating an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images.
  • a probability of each of the plurality of choices is calculated based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and the calculated probability of each of the plurality of choices is corrected based on the set purpose or application.
  • one or more choices are excluded from among the plurality of choices based on the set purpose or application.
  • the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer.
  • the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
  • with the action-estimating device of the present invention, it becomes possible to estimate an action of a subject appearing in a plurality of time-series images with high accuracy.
  • FIG. 1 is an explanatory view of a usage state of the action-estimating device according to an embodiment of the present invention.
  • FIG. 2 is a block diagram of a learning device and the action-estimating device according to the embodiment of the present invention.
  • FIG. 3 is an explanatory view of an articulation group according to the embodiment of the present invention.
  • FIG. 4 is an explanatory view of a correction of action choices based on a background according to the embodiment of the present invention.
  • FIG. 5 is a flowchart of an action-estimating in the action-estimating device according to the embodiment of the present invention.
  • FIG. 6 is an explanatory view of a usage state of the action-estimating device according to a modification of the present invention.
  • An action-estimating device 1 according to a preferred embodiment of the present invention will be described below, while referring to FIGS. 1 to 5 .
  • the action-estimating device 1 is used to estimate an action of a subject Z appearing in a plurality of time-series images Y (e.g., the frames constituting a video or the like) photographed by a photographing means X (in this embodiment, for ease of understanding, the subject Z is displayed only as a skeleton).
  • in the action estimation, information learned by a learning device 2 (see FIG. 2) and stored in a storing unit 3 is referred to.
  • the learning device 2 includes a learning-side identifier 21 , a learning-side obtaining unit 22 , a learning-side detecting unit 23 , a correct-answer obtaining unit 24 , a learning-side measuring unit 25 , and a learning unit 26 .
  • the learning-side identifier 21 is used to identify a plurality of articulations A (in the present embodiment, neck, right elbow, left elbow, waist, right knee, and left knee) of a subject Z.
  • the learning-side identifier 21 stores articulation-identifying information, such as shape, direction, and size, as references for identifying each articulation A. Further, the learning-side identifier 21 also stores supplemental identifying information as references for various “basic postures” (“walking”, “standing up”, etc.) of a subject Z, the “motion range of each articulation A”, and the “distance between articulations A.”
  • the learning-side identifier 21 also stores background identifying information (presence, color, and angle of an object; presence of a person, or the like) as the reference for identifying the background (i.e. “hospital room”, “office”, “outside” and the like).
  • the learning-side obtaining unit 22 obtains a plurality of time-series images Y as video images in which the appearing actions are known.
  • the plurality of time-series images Y is inputted by the user of the action-estimating device 1 .
  • the learning-side detecting unit 23 detects a plurality of articulations A appearing in each time-series image Y. Specifically, the learning-side detecting unit 23 detects the parts corresponding to the articulation-identifying information stored in the learning-side identifier 21, using an inference model based on a CNN (Convolutional Neural Network). Each of the detected articulations A (A1 to A6 in FIG. 1) is selectably displayed on a display unit (not shown).
  • the learning-side detecting unit 23 also detects the background appearing in each time-series image Y. Specifically, the learning-side detecting unit 23 detects, in each time-series image Y, the parts corresponding to the background identifying information stored in the learning-side identifier 21.
  • the correct-answer obtaining unit 24 obtains a correct action (hereinafter referred to as correct-action) of the subject Z appearing in the plurality of time-series images Y, on each articulation A detected by the learning-side detecting unit 23 .
  • the correct-action is inputted by the user of the action-estimating device 1 .
  • the user selects each articulation A on the display unit (not shown) and inputs the correct-action “fall-down.”
  • the correct-answer obtaining unit 24 also obtains a correct-background appearing in the plurality of time-series images Y. For example, if the correct-background is a “hospital room”, the user inputs a “hospital room” tag. Note that the choices of the correct-action and the correct-background are stored in the storing unit 3.
  • the learning-side measuring unit 25 measures coordinates and depths of the plurality of articulations A detected by the learning-side detecting unit 23 . This measurement is performed on each time-series image Y.
  • the coordinate and the depth of the articulation A1 in the time-series image Y at time t1 can be expressed as XA1(t1), YA1(t1), ZA1(t1).
  • the depth is not necessarily expressed using the coordinate and may be expressed as a relative depth in the plurality of time-series images Y.
  • the depth may be measured by a known method.
  • a depth of each articulation A, which has been inputted in advance in the correct-answer obtaining unit 24, may also be used.
  • in that case, the learning unit 26 learns rules such as “when the articulation has this size and angle, the articulation is at a depth of XX meters.”
  • the learning unit 26 learns the displacement in the plurality of time-series images Y of the coordinate and the depth of the whole of the plurality of articulations A belonging to each subject Z. Specifically, the learning unit 26 specifies the plurality of articulations A belonging to each subject Z selected by the correct-answer obtaining unit 24 as an articulation group B (see FIG. 3 ), and then, learns the displacement in the plurality of time-series images Y of the coordinate and the depth of the whole of the articulation group B.
  • as the displacement of the coordinate and the depth of the whole articulation group B, it is possible to use the displacement of the coordinate and the depth of the center point of the coordinates of all the detected articulations A, or the displacement of the coordinate and the depth of the center of gravity, which is closely related to body movement. Both may also be used together to increase precision.
  • the displacement of the coordinate and the depth of each articulation A may be taken into account to increase the precision.
  • the coordinate and the depth of the center of gravity can be calculated based on the coordinate and the depth of each articulation A and the weight of each articulation A (including muscle, fat, etc.). In this case, information on the weight of each articulation A will be stored in the learning-side identifier 21 or the like in advance.
  • the learning unit 26 learns the said displacement in the plurality of time-series images Y of the coordinate and the depth of the whole of the articulation group B, in connection with the correct-action inputted in the correct-answer obtaining unit 24 . For example, when the correct-action is “fall forward”, the displacement of the coordinate of the whole of the articulation group B is learned as “move downward by first distance”, and the displacement of the depth of the whole of the articulation group B is learned as “move forward by second distance.”
  • the learning unit 26 learns the background detected by the learning-side detecting unit 23 (the background identifying information) and the correct-background obtained by the correct-answer obtaining unit 24 in association with each other. In this way, it becomes possible to estimate; “The background with such background identifying information is expected to be a hospital room”; “In the case of such background identifying information, the probability for the background to be a hospital room is 80%” and so on.
  • the learning unit 26 determines the relation between the correct-action and the correct-background, which are obtained by the correct-answer obtaining unit 24.
  • for example, in the case of the background “hospital room”, “walking” is the most common action, the “fall-down” action occurs occasionally, the “run” action rarely occurs, and the “pitching” action never happens.
  • according to these relations, when the background is “hospital room”, for example, the relation is determined as “walk: high”, “fall-down: middle”, “run: low”, and “pitching: non”.
  • the relations determined as such are stored in the storing unit 3 .
  • it is preferable that the learning unit 26 learns from a large number of images having various viewpoints, in addition to the plurality of time-series images Y described above. For example, in the case of “hospital room”, a large number of images such as “hospital rooms photographed at different angles”, “hospital rooms having variously colored interiors”, and “hospital rooms with and without nurses and patients” may be collected and learned by the learning unit 26.
  • in addition to the learning result by the learning unit 26 described above, the storing unit 3 stores the various choices of action and background that the user can select in the correct-answer obtaining unit 24.
  • the action-estimating device 1 includes an estimating-side identifier 11 , an estimating-side obtaining unit 12 , an estimating-side detecting unit 13 , an estimating-side measuring unit 14 , and an estimating unit 15 .
  • the estimating-side identifier 11 is used to identify a plurality of articulations A (in the present embodiment, neck, right elbow, left elbow, waist, right knee, and left knee) of a subject Z.
  • the estimating-side identifier 11 stores articulation-identifying information as references for identifying each articulation A, such as shape, direction, and size. Further, the estimating-side identifier 11 also stores supplemental identifying information as references on various “basic posture” (“walking”, “stand-up” etc.) of a subject Z, “motion range of each articulation A”, and “distance between articulations A.” In the present embodiment, the same information as in the learning-side identifier 21 is stored.
  • the estimating-side identifier 11 also stores background identifying information (presence, color, and angle of an object; presence of a person, or the like) as the reference for identifying the background (i.e. “hospital room”, “office”, “outside” and the like). In the present embodiment, the same information as in the learning-side identifier 21 is stored.
  • the estimating-side obtaining unit 12 is connected to the photographing means X and obtains video images (i.e., a plurality of time-series images Y) taken by the photographing means X.
  • in the present embodiment, the plurality of time-series images Y is obtained in real time; however, it may be obtained later depending on the intended purpose of the action-estimating device 1.
  • the estimating-side detecting unit 13 detects a plurality of articulations A appearing in each time-series image Y. Specifically, the estimating-side detecting unit 13 detects the parts corresponding to the articulation-identifying information stored in the estimating-side identifier 11, using an inference model based on a CNN (Convolutional Neural Network). When the estimating-side detecting unit 13 detects an articulation A, it can be considered that a subject Z appears in the time-series image Y.
  • the estimating-side detecting unit 13 also detects the background appearing in each of the time-series images Y. In detail, the estimating-side detecting unit 13 detects the parts corresponding to the background identifying information stored in the estimating-side identifier 11 in each of the time-series images Y. Then, the estimating-side detecting unit 13 determines the background while referring to the learning result by the learning unit 26 stored in the storing unit 3. For example, since a “bed” and an “infusion” are present in FIG. 1, it is determined that “the background is a hospital room.”
  • the estimating-side measuring unit 14 measures coordinates and depths of the plurality of articulations A detected by the estimating-side detecting unit 13 . This measurement is performed on each time-series image Y.
  • the coordinate and the depth of an articulation A1 at time t1 in the time-series images Y can be expressed as XA1(t1), YA1(t1), ZA1(t1).
  • the depth is not necessarily expressed using the coordinate and may be expressed as a relative depth in the plurality of time-series images Y.
  • the depth may be measured by a known method. However, it is also possible to specify the depth by referring to the learning unit 26 when the learning unit 26 has already learned the depth.
  • the estimating unit 15 estimates the action of the subject Z, based on the displacement in the plurality of time-series images Y of the coordinate and the depth of the whole of the articulation group B. Specifically, the estimating unit 15 selects one or more actions with high probability from among various action choices (“fall-down”, “walk”, “running” and “throwing”, etc.), while referring to the learning result by the learning unit 26 stored in the storing unit 3 .
  • the coordinate and the depth of the whole articulation group B of each subject Z are inputted into a time-series inference model using an LSTM (Long Short-Term Memory), and an action-identifying label such as “walking” or “standing” is outputted.
  • the estimating unit 15 also considers the background appearing in the time-series images Y in order to estimate the action of the subject Z. In detail, while referring to the relation between the correct-action and the correct-background stored in the storing unit 3, the estimating unit 15 corrects the probability of the choices of action according to the background detected (determined) by the estimating-side detecting unit 13.
  • an action may also be excluded from the action choices. For example, under the condition “an action with a probability of less than 30% is excluded”, the actions “run” and “pitching” are excluded, as shown in FIG. 4(c).
  • actions such as “fall-down” and “pitching”, which have a relation greater than a prescribed value between each other, may be associated with each other. Then, when one of the associated actions is excluded or its probability is decreased, the probability of the other may be increased. In the example of FIG. 4(d), since “pitching” is excluded, the probability of “fall-down” is increased; a sketch of this exclusion and boosting step follows.
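  • As a non-limiting illustration of the exclusion and association steps above, the following Python sketch applies the 30% exclusion condition and then boosts the surviving member of an associated pair; the boost amount, the association list, and the input probabilities are assumptions for illustration only.

```python
# Sketch of the exclusion and association steps of FIG. 4(c) and (d).
# The threshold comes from the text; the boost amount, the association list
# and the input probabilities are illustrative assumptions.

EXCLUSION_THRESHOLD = 0.30                       # "probability less than 30% is excluded"
ASSOCIATED_PAIRS = [("fall-down", "pitching")]   # related above a prescribed value
BOOST = 0.10                                     # assumed increase for the surviving choice

def exclude_and_boost(corrected_probs):
    # FIG. 4(c): drop choices whose corrected probability is below the threshold.
    kept = {a: p for a, p in corrected_probs.items() if p >= EXCLUSION_THRESHOLD}
    # FIG. 4(d): if one member of an associated pair was excluded,
    # increase the probability of the member that survived.
    for a, b in ASSOCIATED_PAIRS:
        if a in kept and b not in kept:
            kept[a] = min(1.0, kept[a] + BOOST)
        elif b in kept and a not in kept:
            kept[b] = min(1.0, kept[b] + BOOST)
    return kept

print(exclude_and_boost({"walk": 0.80, "fall-down": 0.85,
                         "run": 0.25, "pitching": 0.15}))
# -> "run" and "pitching" are excluded, and "fall-down" is boosted to 0.95
```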
  • since the action-estimating device 1 according to the embodiment considers the background appearing in the time-series images Y in order to estimate the action of the subject Z, it becomes possible to perform action estimation with higher accuracy.
  • the coordinates and the depths of the plurality of articulations A detected in S 2 are measured by the estimating-side measuring unit 14 (S 3 ). This measurement is performed for each time-series image Y.
  • the action of subject Z is estimated by the estimating unit 15 based on the displacement in the plurality of time-series images Y of the coordinates and the depths of the plurality of articulations A measured in S 3 (S 4 ).
  • the action-estimating device 1 having such a configuration can be used, for example, as follows: in a nursing home, the action-estimating device 1 constantly photographs the inside of a room where care-receivers (subjects Z) are present. Then, when a fall or the like of a care-receiver is estimated based on the photographed images, the action-estimating device 1 gives an alert to that effect to a caregiver, as sketched below.
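  • The overall flow (obtaining images, detecting articulations, measuring coordinates and depths, estimating the action) combined with such an alert might be wired together roughly as follows; every interface name and the notify_caregiver helper are hypothetical placeholders, not part of the disclosed device.

```python
# Rough sketch of the estimation loop (obtain -> detect -> measure -> estimate)
# with a caregiver alert. All component interfaces here are hypothetical.

def run_monitoring(camera, detector, measurer, estimator, alert_threshold=0.8):
    """camera yields time-series images Y; detector, measurer and estimator
    stand in for the estimating-side units of FIG. 2."""
    window = []                                          # per-frame measurements
    for image in camera:                                 # obtain the next image Y
        joints = detector.detect_articulations(image)    # detect articulations A
        if not joints:                                   # no subject Z in this frame
            continue
        window.append(measurer.measure(joints, image))   # coordinates and depths
        if len(window) < 16:                             # wait for enough frames
            continue
        background = detector.detect_background(image)
        probs = estimator.estimate(window[-16:], background)  # action probabilities
        if probs.get("fall-down", 0.0) >= alert_threshold:
            notify_caregiver("Possible fall detected")

def notify_caregiver(message):
    print("[ALERT]", message)          # stand-in for a real notification channel
```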
  • as described above, the action-estimating device 1 considers the background appearing in the time-series images Y in order to estimate the action of the subject Z.
  • in order to estimate the action of the subject Z, the action-estimating device 1 calculates the probability of each of the plurality of choices based on the measured displacement of the coordinates of the plurality of articulations A in the plurality of time-series images Y, and corrects the calculated probability of each of the plurality of choices based on the detected background.
  • in order to estimate the action of the subject Z, the action-estimating device 1 excludes one or more choices from among the plurality of choices based on the detected background.
  • the choices whose actions have a relation greater than a prescribed value between each other are associated with each other.
  • the probability of the associated choice which was not excluded or whose probability was not decreased is increased in order to estimate the action of the subject Z.
  • in a modification, a setting unit 16 is provided in the action-estimating device 1, and the user sets the purpose or the application (crime prevention, medical/nursing care, etc.).
  • in this case, the relation between the correct-action and the purpose or the application is stored in advance in the storing unit 3.
  • a set value may also be stored in advance in the storing unit 3 .
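  • One way such a purpose- or application-keyed relation could be stored and applied is sketched below; the purposes, levels, and correction factors are assumptions for illustration only.

```python
# Sketch of a purpose/application-keyed relation table and correction.
# The purposes, levels and factors below are illustrative assumptions.

RELATION_BY_PURPOSE = {
    "medical nursing care": {"walk": "high", "fall-down": "middle",
                             "run": "low", "pitching": "non"},
    "crime prevention":     {"walk": "high", "fall-down": "middle",
                             "run": "middle", "pitching": "low"},
}
LEVEL_FACTOR = {"high": 1.2, "middle": 1.0, "low": 0.6, "non": 0.2}   # assumed

def correct_by_purpose(probs, purpose):
    """probs: action -> probability from the displacement-based estimation."""
    levels = RELATION_BY_PURPOSE.get(purpose, {})
    return {action: min(1.0, p * LEVEL_FACTOR[levels.get(action, "middle")])
            for action, p in probs.items()}
```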
  • in the embodiment described above, the storing unit 3 is arranged separately from the action-estimating device 1 and the learning device 2.
  • however, the storing unit 3 may be mounted on the action-estimating device 1 side or on the learning device 2 side.
  • the displacement in the plurality of time-series images Y of the coordinate and the depth of the articulation group B is considered in order to estimate the action of the subject Z.
  • the displacement of each articulation A in the plurality of time-series images Y may be used.
  • the present invention is also applicable to a program that performs the processing of the action-estimating device 1, or to a recording medium storing the program.
  • the program is installed on a computer or the like.
  • the recording medium storing the program may be reusable rather than limited to one-time use.
  • as a reusable recording medium, a CD-ROM may be employed, for example, but the recording medium is not limited to this.
  • a plurality of choices of action to be estimated may be stored in the computer later.
  • the purpose or application of the estimation of the targeted action may also be set in the computer later.

Abstract

[Problem] To provide an action-estimating device capable of estimating an action of a subject appearing in a plurality of time-series images with high precision. [Solution] Provided is an action-estimating device 1 comprising: an estimating-side obtaining unit 12 for obtaining a plurality of time-series images Y in which a subject Z appears; an estimating-side detecting unit 13 for detecting a plurality of articulations A appearing in each time-series image Y; an estimating-side measuring unit 14 for measuring coordinates of the detected plurality of articulations A in each time-series image Y; an estimating unit 15 for estimating an action of the subject Z based on displacement of the coordinates of the measured plurality of articulations A in the plurality of time-series images Y; and a storing unit 3 for storing a plurality of choices of the action to be estimated. The estimating-side detecting unit 13 further detects a background appearing in each time-series image Y. In order to estimate the action of the subject Z, the estimating unit 15 calculates a probability of each of the plurality of choices based on the displacement of the coordinates of the measured plurality of articulations A in the plurality of time-series images Y, and corrects the calculated probability of each of the plurality of choices based on the detected background.

Description

    TECHNICAL FIELD OF THE INVENTION
  • The present invention relates to an action-estimating device for estimating an action of a subject appearing in a plurality of time-series images.
  • BACKGROUND OF THE INVENTION
  • Conventionally, there is known a device which detects a posture of a human appearing in time-series data based on the articulations of the human, and recognizes an action of the human based on a change of the posture (for example, Patent Document 1).
  • PRIOR ART
  • Patent Document 1: Japanese Patent Application Publication No. 2017-228100.
  • SUMMARY OF INVENTION Problem to be Solved by the Invention
  • Generally, in action estimation, a highly probable choice is selected from among a plurality of choices based on the detected posture. Therefore, precise selection of the choice leads to action estimation with high accuracy.
  • In view of the foregoing, it is an object of the invention to provide an action-estimating device for estimating an action of a subject appearing in a plurality of time-series images with high accuracy.
  • Means for Solving the Problem
  • The present invention provides an action-estimating device including: an estimating-side obtaining unit configured to obtain a plurality of time-series images in which a subject appears; an estimating-side detecting unit configured to detect a plurality of articulations appearing in each time-series image; an estimating-side measuring unit configured to measure coordinates of the detected plurality of articulations in each time-series image; an estimating unit configured to estimate an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images; and a storing unit configured to store a plurality of choices of the action to be estimated. The estimating-side detecting unit further detects a background appearing in each time-series image. In order to estimate the action of the subject, the estimating unit calculates a probability of each of the plurality of choices based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and corrects the calculated probability of each of the plurality of choices based on the detected background.
  • With this configuration, it becomes possible to focus on the actions that have a high probability of occurring, by taking the background into consideration. Therefore, action estimation with high accuracy can be realized. Further, it becomes possible to decrease the probability of an action that is unlikely to occur while increasing the probability of an action that is likely to occur. Therefore, action estimation with higher accuracy is realized.
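  • As a minimal sketch of this correction step (the patent does not prescribe a particular formula), the calculated probabilities might be adjusted with background-dependent factors as follows; the relation levels and multiplicative factors are assumptions.

```python
# Minimal sketch of correcting the per-choice probabilities by the detected
# background. The relation levels and the factors are illustrative assumptions.

RELATION = {   # storing unit: background -> action -> likelihood level
    "hospital room": {"walk": "high", "fall-down": "middle",
                      "run": "low", "pitching": "non"},
}
FACTOR = {"high": 1.2, "middle": 1.1, "low": 0.6, "non": 0.2}   # assumed factors

def correct_probabilities(raw_probs, background):
    """raw_probs: action -> probability from the displacement-based model."""
    levels = RELATION.get(background, {})
    return {action: min(1.0, p * FACTOR[levels.get(action, "middle")])
            for action, p in raw_probs.items()}

# e.g. correct_probabilities({"walk": 0.65, "fall-down": 0.75,
#                             "run": 0.45, "pitching": 0.65}, "hospital room")
```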
  • It is preferable that the estimating unit excludes one or more choices from among the plurality of choices based on the detected background in order to estimate the action of the subject.
  • With this configuration, since the number of actions ultimately presented to the user decreases, it becomes easier for the user to recognize the estimated action. In addition, since one or more choices are excluded before calculating the probability of the choices, only the probabilities of the choices that were not excluded need to be calculated, and the load on the CPU can be reduced.
  • It is preferable that the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the storing unit. When any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the action estimating unit increases the probability of the associated choice which was not excluded or whose probability was not decreased, in order to estimate the action of the subject.
  • With this configuration, in the case of actions such as “pitching” and “fall-down”, for example, which look alike partway through, the probability of the action which was not excluded is increased. Therefore, it is possible to perform action estimation with higher accuracy.
  • Another aspect of the present invention provides an action-estimating program installed on a computer storing a plurality of choices of action to be estimated. The program including: a step for obtaining a plurality of time-series images in which a subject appears; a step for detecting a plurality of articulations appearing in each time-series image; a step for measuring coordinates of the detected plurality of articulations in each time-series image; a step for estimating an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images; and a step for detecting a background appeared in each time-series image. In the estimating step, a probability of each of the plurality of choices is calculated based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and the calculated probability of each of the plurality of choices is corrected based on the detected background.
  • It is preferable that, in the estimating step, one or more choices are excluded from among the plurality of choices based on the detected background.
  • It is preferable that the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer. In the estimating step, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
  • Another aspect of the present invention provides an action-estimating device including: an estimating-side obtaining unit configured to obtain a plurality of time-series images in which a subject appears; an estimating-side detecting unit configured to detect a plurality of articulations appearing in each time-series image; an estimating-side measuring unit configured to measure coordinates of the detected plurality of articulations in each time-series image; an estimating unit configured to estimate an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images; a setting unit configured to set a purpose or application for the estimating of the action of the subject; and a storing unit configured to store a plurality of choices of the action to be estimated. In order to estimate the action of the subject, the estimating unit calculates a probability of each of the plurality of choices based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and corrects the calculated probability of each of the plurality of choices based on the set purpose or application.
  • With this configuration, it becomes possible to focus on the actions that have a high probability of occurring, by taking the application or purpose into consideration. Therefore, action estimation with high accuracy can be realized. Further, it becomes possible to decrease the probability of an action that is unlikely to occur while increasing the probability of an action that is likely to occur. Therefore, action estimation with higher accuracy is realized.
  • It is preferable that the estimating unit excludes one or more choices from among the plurality of choices based on the set purpose or application in order to estimate the action of the subject.
  • It is preferable that the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the storing unit. When any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the action estimating unit increases the probability of the associated choice which was not excluded or whose probability was not decreased, in order to estimate the action of the subject.
  • Another aspect of the present invention provides an action-estimating program installed on a computer storing a plurality of choices of action to be estimated and in which a purpose or application for estimating of action of a subject is set including: a step for obtaining a plurality of time-series images in which the subject appears; a step for detecting a plurality of articulations appearing in each time-series image; a step for measuring coordinates of the detected plurality of articulations in each time-series image; and a step for estimating an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images. In the estimating step, a probability of each of the plurality of choices is calculated based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and the calculated probability of each of the plurality of choices is corrected based on the set purpose or application.
  • It is preferable that, in the estimating step, one or more choices are excluded from among the plurality of choices based on the set purpose or application.
  • It is preferable that the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer. In the estimating step, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
  • Effects of the Invention
  • According to the action-estimating device of the present invention, it becomes possible to estimate an action of a subject appearing in a plurality of time-series images with high accuracy.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is an explanatory view of a usage state of the action-estimating device according to an embodiment of the present invention.
  • FIG. 2 is a block diagram of a learning device and the action-estimating device according to the embodiment of the present invention.
  • FIG. 3 is an explanatory view of an articulation group according to the embodiment of the present invention.
  • FIG. 4 is an explanatory view of a correction of action choices based on a background according to the embodiment of the present invention.
  • FIG. 5 is a flowchart of an action-estimating in the action-estimating device according to the embodiment of the present invention.
  • FIG. 6 is an explanatory view of a usage state of the action-estimating device according to a modification of the present invention.
  • PREFERRED EMBODIMENTS
  • An action-estimating device 1 according to a preferred embodiment of the present invention will be described below, while referring to FIGS. 1 to 5.
  • As shown in FIG. 1, the action-estimating device 1 is used to estimate an action of a subject Z appearing in a plurality of time-series images Y (e.g., the frames constituting a video or the like) photographed by a photographing means X (in this embodiment, for ease of understanding, the subject Z is displayed only as a skeleton). In the action estimation, information learned by a learning device 2 (see FIG. 2) and stored in a storing unit 3 is referred to.
  • First, the configuration of the learning device 2 is described.
  • As shown in FIG. 2, the learning device 2 includes a learning-side identifier 21, a learning-side obtaining unit 22, a learning-side detecting unit 23, a correct-answer obtaining unit 24, a learning-side measuring unit 25, and a learning unit 26.
  • The learning-side identifier 21 is used to identify a plurality of articulations A (in the present embodiment, the neck, right elbow, left elbow, waist, right knee, and left knee) of a subject Z. The learning-side identifier 21 stores articulation-identifying information, such as shape, direction, and size, as references for identifying each articulation A. Further, the learning-side identifier 21 also stores supplemental identifying information as references for various “basic postures” (“walking”, “standing up”, etc.) of a subject Z, the “motion range of each articulation A”, and the “distance between articulations A.”
  • In addition, the learning-side identifier 21 also stores background identifying information (presence, color, and angle of an object; presence of a person, or the like) as the reference for identifying the background (i.e. “hospital room”, “office”, “outside” and the like).
  • The learning-side obtaining unit 22 obtains a plurality of time-series images Y as video images in which the appearing actions are known. The plurality of time-series images Y is inputted by the user of the action-estimating device 1.
  • The learning-side detecting unit 23 detects a plurality of articulations A appearing in each time-series image Y. Specifically, the learning-side detecting unit 23 detects the parts corresponding to the articulation-identifying information stored in the learning-side identifier 21, using an inference model based on a CNN (Convolutional Neural Network). Each of the detected articulations A (A1 to A6 in FIG. 1) is selectably displayed on a display unit (not shown).
  • The learning-side detecting unit 23 also detects the background appearing in each time-series image Y. Specifically, the learning-side detecting unit 23 detects, in each time-series image Y, the parts corresponding to the background identifying information stored in the learning-side identifier 21.
  • The correct-answer obtaining unit 24 obtains a correct action (hereinafter referred to as correct-action) of the subject Z appearing in the plurality of time-series images Y, on each articulation A detected by the learning-side detecting unit 23. The correct-action is inputted by the user of the action-estimating device 1. In particular, as shown in FIG. 1, when the subject Z is falling-down in the plurality of time-series images Y, the user selects each articulation A on the display unit (not shown) and inputs the correct-action “fall-down.”
  • In addition, in the present embodiment, the correct-answer obtaining unit 24 also obtains a correct-background appearing in the plurality of time-series images Y. For example, if the correct-background is a “hospital room”, the user inputs a “hospital room” tag. Note that the choices of the correct-action and the correct-background are stored in the storing unit 3.
  • The learning-side measuring unit 25 measures coordinates and depths of the plurality of articulations A detected by the learning-side detecting unit 23. This measurement is performed on each time-series image Y.
  • For example, the coordinate and the depth of the articulation A1 in the time-series image Y at time t1 can be expressed as XA1(t1), YA1(t1), ZA1(t1). The depth is not necessarily expressed as a coordinate and may be expressed as a relative depth within the plurality of time-series images Y. The depth may be measured by a known method. However, a depth of each articulation A, which has been inputted in advance in the correct-answer obtaining unit 24, may also be used. In this case, for example, the learning unit 26 (described later) learns rules such as “when the articulation has this size and angle, the articulation is at a depth of XX meters.”
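  • For illustration, the measured values can be held as arrays indexed by time and articulation; the NumPy layout below (six articulations, image coordinates x and y plus depth z) is one possible arrangement and is not prescribed by the patent.

```python
import numpy as np

# One possible layout: for each time t and articulation A, store
# (XA(t), YA(t), ZA(t)) -- image coordinates plus depth.
ARTICULATIONS = ["neck", "right elbow", "left elbow",
                 "waist", "right knee", "left knee"]

T = 30                                          # number of time-series images Y
coords = np.zeros((T, len(ARTICULATIONS), 3), dtype=np.float32)

# e.g. articulation A1 (here taken to be the neck) at time t1:
t1, a1 = 0, ARTICULATIONS.index("neck")
coords[t1, a1] = [112.0, 64.0, 2.4]             # XA1(t1), YA1(t1), ZA1(t1)

# Frame-to-frame displacement of every articulation:
displacement = np.diff(coords, axis=0)          # shape (T-1, 6, 3)
```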
  • The learning unit 26 learns the displacement in the plurality of time-series images Y of the coordinate and the depth of the whole of the plurality of articulations A belonging to each subject Z. Specifically, the learning unit 26 specifies the plurality of articulations A belonging to each subject Z selected by the correct-answer obtaining unit 24 as an articulation group B (see FIG. 3), and then, learns the displacement in the plurality of time-series images Y of the coordinate and the depth of the whole of the articulation group B.
  • As the displacement of the coordinate and the depth of the whole articulation group B, it is possible to use the displacement of the coordinate and the depth of the center point of the coordinates of all the detected articulations A, or the displacement of the coordinate and the depth of the center of gravity, which is closely related to body movement. Both may also be used together to increase precision. The displacement of the coordinate and the depth of each individual articulation A may additionally be taken into account to increase precision. Note that the coordinate and the depth of the center of gravity can be calculated based on the coordinate and the depth of each articulation A and the weight of each articulation A (including muscle, fat, etc.). In this case, information on the weight of each articulation A is stored in the learning-side identifier 21 or the like in advance.
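  • Under the array layout sketched above, the two candidates mentioned here, the center point of all detected articulations and a weight-based center of gravity, could be computed as follows; the per-articulation weights are placeholder values.

```python
import numpy as np

def group_center(coords):
    """Center point of the articulation group B: mean over articulations.
    coords has shape (T, n_articulations, 3); returns shape (T, 3)."""
    return coords.mean(axis=1)

def group_center_of_gravity(coords, weights):
    """Weight-based center of gravity of the articulation group B.
    weights: per-articulation body-mass weights (muscle, fat, etc.);
    the values used here would be placeholders stored in the identifier."""
    w = np.asarray(weights, dtype=np.float32)
    w = w / w.sum()
    return (coords * w[None, :, None]).sum(axis=1)        # shape (T, 3)

# Displacement of the whole group between frames, e.g.:
# np.diff(group_center(coords), axis=0)
```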
  • The learning unit 26 learns this displacement of the coordinate and the depth of the whole articulation group B in the plurality of time-series images Y in connection with the correct-action inputted in the correct-answer obtaining unit 24. For example, when the correct-action is “fall forward”, the displacement of the coordinate of the whole articulation group B is learned as “move downward by a first distance”, and the displacement of the depth of the whole articulation group B is learned as “move forward by a second distance.”
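  • In practice, this learning step amounts to building pairs of a whole-group displacement sequence and a correct-action label; one plausible construction under the assumed layout is:

```python
import numpy as np

def make_training_sample(coords, weights, correct_action):
    """Pair the whole-group displacement sequence with the correct-action label.
    coords: (T, n_articulations, 3); correct_action: e.g. "fall forward"."""
    w = np.asarray(weights, dtype=np.float32)
    w = w / w.sum()
    group = (coords * w[None, :, None]).sum(axis=1)   # group position per frame
    displacement = np.diff(group, axis=0)             # shape (T-1, 3)
    # For a "fall forward" sample, the y component would show a downward move
    # by a first distance and the z (depth) component a forward move by a
    # second distance, which is what the model is trained to associate.
    return displacement, correct_action
```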
  • Further, the learning unit 26 learns the background detected by the learning-side detecting unit 23 (the background identifying information) and the correct-background obtained by the correct-answer obtaining unit 24 in association with each other. In this way, it becomes possible to make estimates such as “a background with such background identifying information is expected to be a hospital room” or “in the case of such background identifying information, the probability that the background is a hospital room is 80%.”
  • In addition, in the present embodiment, the learning unit 26 determines the relation between the correct-action and the correct-background, which are obtained by the correct-answer obtaining unit 24. For example, in the case of the background “hospital room”, “walking” is the most common action, the “fall-down” action occurs occasionally, the “run” action rarely occurs, and the “pitching” action never happens. According to these relations, when the background is “hospital room”, for example, the relation is determined as “walk: high”, “fall-down: middle”, “run: low”, and “pitching: non”. The relations determined in this way are stored in the storing unit 3.
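  • One way to realize such a determination is to count how often each correct-action co-occurs with each correct-background and bucket the frequencies into levels; the threshold values below are assumptions.

```python
from collections import Counter, defaultdict

def build_relation_table(samples):
    """samples: iterable of (correct_background, correct_action) pairs.
    Returns background -> action -> level; the frequency thresholds that
    map counts to "high"/"middle"/"low" are illustrative assumptions."""
    counts = defaultdict(Counter)
    totals = Counter()
    for background, action in samples:
        counts[background][action] += 1
        totals[background] += 1

    table = {}
    for background, actions in counts.items():
        table[background] = {}
        for action, n in actions.items():
            freq = n / totals[background]
            if freq >= 0.30:
                table[background][action] = "high"
            elif freq >= 0.05:
                table[background][action] = "middle"
            else:
                table[background][action] = "low"
        # Actions never observed with this background stay absent from the
        # table and can be treated as "non" (e.g. "pitching" in a hospital room).
    return table
```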
  • It is preferable that the learning unit 26 learns from a large number of images having various viewpoints, in addition to the plurality of time-series images Y described above. For example, in the case of “hospital room”, a large number of images such as “hospital rooms photographed at different angles”, “hospital rooms having variously colored interiors”, and “hospital rooms with and without nurses and patients” may be collected and learned by the learning unit 26.
  • In addition to the learning result by the learning unit 26 described above, the storing unit 3 stores the various choices of action and background that the user can select in the correct-answer obtaining unit 24.
  • Next, the configuration of the action-estimating device 1 will be described as below.
  • As shown in FIG. 2, the action-estimating device 1 includes an estimating-side identifier 11, an estimating-side obtaining unit 12, an estimating-side detecting unit 13, an estimating-side measuring unit 14, and an estimating unit 15.
  • The estimating-side identifier 11 is used to identify a plurality of articulations A (in the present embodiment, the neck, right elbow, left elbow, waist, right knee, and left knee) of a subject Z. The estimating-side identifier 11 stores articulation-identifying information, such as shape, direction, and size, as references for identifying each articulation A. Further, the estimating-side identifier 11 also stores supplemental identifying information as references for various “basic postures” (“walking”, “standing up”, etc.) of a subject Z, the “motion range of each articulation A”, and the “distance between articulations A.” In the present embodiment, the same information as in the learning-side identifier 21 is stored.
  • In addition, the estimating-side identifier 11 also stores background identifying information (presence, color, and angle of an object; presence of a person, or the like) as the reference for identifying the background (i.e. “hospital room”, “office”, “outside” and the like). In the present embodiment, the same information as in the learning-side identifier 21 is stored.
  • The estimating-side obtaining unit 12 is connected to the photographing means X and obtains video images (i.e., a plurality of time-series images Y) taken by the photographing means X. In the present embodiment, the plurality of time-series images Y is obtained in real time; however, it may be obtained later depending on the intended purpose of the action-estimating device 1.
  • The estimating-side detecting unit 13 detects a plurality of articulations A appearing in each time-series image Y. Specifically, the estimating-side detecting unit 13 detects the parts corresponding to the articulation-identifying information stored in the estimating-side identifier 11, using an inference model based on a CNN (Convolutional Neural Network). When the estimating-side detecting unit 13 detects an articulation A, it can be considered that a subject Z appears in the time-series image Y.
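  • The patent only specifies a CNN-based inference model for articulation detection; as a stand-in, a pretrained keypoint detector such as torchvision's Keypoint R-CNN (torchvision 0.13 or later) could be used, noting that it predicts 17 COCO keypoints rather than the six articulations of the embodiment.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# Stand-in articulation detector: a pretrained Keypoint R-CNN (torchvision >= 0.13).
# It returns 17 COCO keypoints per detected person, so a mapping step to the six
# articulations of the embodiment would be needed in practice.
model = torchvision.models.detection.keypointrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_articulations(image_pil, score_threshold=0.9):
    """image_pil: one time-series image Y as a PIL image.
    Returns a list of (num_keypoints, 3) arrays: x, y, visibility per person."""
    with torch.no_grad():
        outputs = model([to_tensor(image_pil)])[0]
    people = []
    for score, keypoints in zip(outputs["scores"], outputs["keypoints"]):
        if score >= score_threshold:            # keep confident detections only
            people.append(keypoints.cpu().numpy())
    return people
```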
  • The estimating-side detecting unit 13 also detects the background appearing in each of the time-series images Y. In detail, the estimating-side detecting unit 13 detects, in each of the time-series images Y, the parts corresponding to the background identifying information stored in the estimating-side identifier 11. Then, the estimating-side detecting unit 13 determines the background while referring to the learning result of the learning unit 26 stored in the storing unit 3. For example, since a "bed" and an "infusion" appear in FIG. 1, the background is determined to be a "hospital room."
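  • Likewise, the background determination could be sketched as follows. The object detector and the object-to-background mapping are hypothetical stand-ins for the background identifying information and the learning result referred to above.

    def detect_background(images, object_detector, object_to_background) -> str:
        """Determine the background (e.g. "hospital room") from the objects
        detected in the time-series images Y."""
        votes = {}
        for img in images:
            for obj in object_detector(img):        # e.g. detects "bed", "infusion"
                bg = object_to_background.get(obj)  # relation learned by the learning unit 26
                if bg is not None:
                    votes[bg] = votes.get(bg, 0) + 1
        return max(votes, key=votes.get) if votes else "unknown"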
  • The estimating-side measuring unit 14 measures coordinates and depths of the plurality of articulations A detected by the estimating-side detecting unit 13. This measurement is performed on each time-series image Y.
  • For example, the coordinate and the depth of an articulation A1 at time t1 in the time-series images Y can be expressed as XA1(t1), YA1(t1), and ZA1(t1). The depth is not necessarily expressed as a coordinate and may be expressed as a relative depth within the plurality of time-series images Y. The depth may be measured by a known method. However, it is also possible to specify the depth by referring to the learning unit 26 when the learning unit 26 has already learned the depth.
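  • As a sketch, one frame's measurement for a single articulation, and the displacement between two consecutive frames, could be represented as follows (the data layout is an assumption for illustration):

    from dataclasses import dataclass

    @dataclass
    class JointSample:
        """Coordinate and depth of one articulation A at time t,
        i.e. XA(t), YA(t), ZA(t)."""
        x: float
        y: float
        z: float  # depth; may be absolute or relative within the image sequence

    def displacement(prev: JointSample, curr: JointSample) -> tuple:
        """Displacement of one articulation between consecutive time-series images."""
        return (curr.x - prev.x, curr.y - prev.y, curr.z - prev.z)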
  • The estimating unit 15 estimates the action of the subject Z based on the displacement, over the plurality of time-series images Y, of the coordinates and depths of the articulation group B as a whole. Specifically, the estimating unit 15 selects one or more actions with high probability from among various action choices ("fall-down", "walk", "run", "pitching", etc.), while referring to the learning result of the learning unit 26 stored in the storing unit 3. Thus, in the action-estimating device 1, the coordinates and depths of the whole articulation group B of each subject Z are input into a time-series inference model using an LSTM (Long Short-Term Memory), and an action-identifying label such as "walking" or "standing" is output.
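  • A minimal sketch of such a time-series inference model is shown below, assuming PyTorch; the layer sizes and the number of action choices are illustrative assumptions and not part of the embodiment.

    import torch
    import torch.nn as nn

    class ActionLSTM(nn.Module):
        """Maps a sequence of whole-body joint coordinates and depths to scores
        over action-identifying labels such as "walking" or "standing"."""
        def __init__(self, num_joints: int = 6, num_actions: int = 4, hidden: int = 128):
            super().__init__()
            # Each frame contributes 3 values (x, y, depth) per articulation.
            self.lstm = nn.LSTM(input_size=num_joints * 3,
                                hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, num_actions)

        def forward(self, seq: torch.Tensor) -> torch.Tensor:
            # seq: (batch, time, num_joints * 3)
            _, (h_n, _) = self.lstm(seq)
            return self.head(h_n[-1])  # per-choice scores; softmax gives probabilities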
  • Here, in this embodiment, the estimating unit 15 also considers the background appearing in the time-series images Y in order to estimate the action of the subject Z. In detail, while referring to the relation between the correct-action and the correct-background stored in the storing unit 3, the estimating unit 15 corrects the probabilities of the action choices according to the background detected (determined) by the estimating-side detecting unit 13.
  • For example, consider the case shown in FIG. 4(a), where the probabilities are "walk: 65%", "fall-down: 75%", "run: 45%", and "pitching: 65%" when the action of the subject Z is estimated without taking the background into account, even though the background is actually a "hospital room."
  • In this case, since the "pitching" action is partly similar to "fall-down", the probability of the "pitching" action is estimated to be high. However, "pitching" is an action that is exceedingly rare in a "hospital room."
  • Then, in the present embodiment, when the relations are determined as "walk: high", "fall-down: middle", "run: low", and "pitching: none" for "background: hospital room", the probability of an action that is unlikely to occur in a "hospital room" is corrected downward, e.g., "run: from 45% to 30%" and "pitching: from 65% to 15%", as shown in FIG. 4(b). Conversely, the probability of an action that is likely to occur in a "hospital room" may be corrected upward, e.g., "walk: from 65% to 80%" and "fall-down: from 75% to 85%".
  • Further, when the probability of an action that is unlikely to occur in a "hospital room" is corrected downward and the resulting probability becomes lower than a prescribed value, the action may be excluded from the action choices. For example, under the condition "an action with a probability of less than 30% is excluded", the actions "run" and "pitching" are excluded, as shown in FIG. 4(c).
  • Also, actions having a relation greater than a prescribed value between each other, such as "fall-down" and "pitching", may be associated with each other. Then, when one of the associated actions is excluded or its probability is decreased, the probability of the other may be increased. In the example of FIG. 4(d), since "pitching" is excluded, the probability of "fall-down" is increased.
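  • The correction, exclusion, and association steps described above could be sketched as follows. The correction amounts, the exclusion threshold, and the association table are assumptions chosen only to reproduce the spirit of the FIG. 4 example; the embodiment merely requires that unlikely actions be decreased or excluded and that an associated, non-excluded action have its probability increased.

    LIKELIHOOD_DELTA = {"high": +15, "middle": +10, "low": -20, "none": -50}  # percentage points (assumed)
    ASSOCIATED = {"pitching": "fall-down"}  # choices related beyond a prescribed value (assumed)
    EXCLUDE_BELOW = 30                      # prescribed exclusion threshold, in percent (assumed)

    def correct_by_background(probs: dict, background: str, relations: dict) -> dict:
        """Correct per-choice probabilities (in percent) according to the detected background."""
        corrected = {}
        for action, p in probs.items():
            level = relations[background].get(action, "middle")
            corrected[action] = max(0, min(100, p + LIKELIHOOD_DELTA[level]))
        # Exclude choices whose corrected probability is below the prescribed value.
        excluded = {a for a, p in corrected.items() if p < EXCLUDE_BELOW}
        # If an excluded choice is associated with a remaining choice, increase the latter.
        for a in excluded:
            other = ASSOCIATED.get(a)
            if other is not None and other not in excluded:
                corrected[other] = min(100, corrected[other] + 5)
        return {a: p for a, p in corrected.items() if a not in excluded}

  • For example, applying this sketch together with the BACKGROUND_ACTION_LIKELIHOOD table above to the FIG. 4(a) values {"walk": 65, "fall-down": 75, "run": 45, "pitching": 65} raises "walk" and "fall-down", drops "run" and "pitching" below the threshold so that they are excluded, and then increases "fall-down" as the choice associated with the excluded "pitching", in the spirit of FIG. 4(b) to (d).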
  • Therefore, since the action-estimating device 1 according to the embodiment considers the background appearing in the time-series images Y in order to estimate the action of the subject Z, action estimation can be performed with higher accuracy.
  • Next, referring to the flowchart in FIG. 5, the estimation of the action of the subject Z by the action-estimating device 1 is explained.
  • First, when a plurality of time-series images Y is obtained by the estimating-side obtaining unit 12 (S1), the plurality of articulations A and the background appearing in each of the time-series images Y are detected by the estimating-side detecting unit 13 (S2).
  • Next, the coordinates and the depths of the plurality of articulations A detected in S2 are measured by the estimating-side measuring unit 14 (S3). This measurement is performed for each time-series image Y.
  • Next, the action of subject Z is estimated by the estimating unit 15 based on the displacement in the plurality of time-series images Y of the coordinates and the depths of the plurality of articulations A measured in S3 (S4).
  • Finally, the probability of the estimated action is corrected based on the detected background (S5).
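  • Combining the above sketches, steps S1 to S5 could be orchestrated as follows. The helpers measure_coordinates and calculate_probabilities are hypothetical (they would arrange the per-frame measurements into a sequence and run the time-series inference model, respectively) and are not defined by the embodiment.

    def estimate_action(images, keypoint_cnn, object_detector,
                        object_to_background, relations):
        """S1-S5: detect articulations and background, measure coordinates and
        depths, estimate the action, and correct the probabilities."""
        joints = [detect_articulations(img, keypoint_cnn) for img in images]  # S2
        background = detect_background(images, object_detector,
                                       object_to_background)                  # S2
        sequence = measure_coordinates(joints)           # S3 (hypothetical helper)
        probs = calculate_probabilities(sequence)        # S4 (hypothetical helper, percent per choice)
        return correct_by_background(probs, background, relations)            # S5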
  • The action-estimating device 1 having such a configuration can be used, for example, as follows: in a nursing home, the action-estimating device 1 continuously photographs the inside of a room where care-receivers (subjects Z) are present. Then, when a fall or a similar event of a care-receiver is estimated based on the photographed images, the action-estimating device 1 alerts a caregiver to that fact.
  • As described above, the action-estimating device 1 according to the embodiment considers the backgrounds appearing in the time-series images Y in order to estimate the action of the subject Z.
  • With this configuration, it becomes possible to focus on actions that are highly likely to occur in the detected background. Therefore, action estimation with high accuracy can be realized.
  • Further, in order to estimate the action of the subject Z, the action-estimating device 1 according to the present embodiment calculates the probability of each of the plurality of choices based on the measured displacement of the coordinates of the plurality of articulations A over the plurality of time-series images Y, and corrects the calculated probability of each of the plurality of choices based on the detected background.
  • With this configuration, it becomes possible to decrease the probability of an action that is unlikely to occur while increasing the probability of an action that is likely to occur. Therefore, action estimation with higher accuracy is realized.
  • Further, the action-estimating device 1 according to the embodiment, in order to estimate the action of subject Z, excludes one or more choices from among the plurality of choices, based on the detected background.
  • With this configuration, since the number of actions ultimately presented to the user decreases, it becomes easier for the user to recognize the estimated action. In addition, when one or more choices are excluded before the probabilities of the choices are calculated, only the probabilities of the choices that were not excluded need to be calculated, so that the load on the CPU can be reduced.
  • Further, in the action-estimating device 1 according to the present embodiment, choices whose actions have a relation greater than a prescribed value between each other are associated with each other. When any one of the mutually associated choices is excluded or its probability is decreased, the probability of the associated choice that was not excluded or whose probability was not decreased is increased in order to estimate the action of the subject Z.
  • With this configuration, in the case of "pitching" and "fall-down", for example, whose actions partly resemble each other, the probability of the action that was not excluded is increased. Therefore, action estimation with higher accuracy can be performed.
  • While the action-estimating device of the invention has been described in detail with reference to the preferred embodiment thereof, it would be apparent to those skilled in the art that many modifications and variations may be made therein without departing from the spirit of the invention, the scope of which is defined by the attached claims.
  • For example, in the above-described embodiment, although the background is considered in order to estimate the action of the subject Z, the application or purpose of the action estimation may also be taken into account.
  • For example, in the case where the purpose is to recognize employees' gestures in an office, since the actions "fall-down", "walking", "running", and "pitching" are not relevant to that purpose, the probabilities of those choices are decreased or the choices are excluded. Conversely, the probabilities of actions such as "move of the arms" or "move of the face" may be increased. In this case, as shown in FIG. 6, a setting unit 16 is provided in the action-estimating device 1, and the user sets the purpose or application (crime prevention, medical nursing care, etc.). The relation between the correct-action and the purpose or application is stored in advance in the storing unit 3. When the estimating unit 15 estimates the action of the subject Z, it can correct the probabilities of the action choices based on the purpose or application set in the setting unit 16 while referring to this relation.
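  • A purpose- or application-dependent correction could be sketched in the same way as the background correction above; the table contents below are illustrative assumptions.

    # Hypothetical relation between the set purpose/application and action likelihoods,
    # stored in advance in the storing unit 3.
    PURPOSE_ACTION_LIKELIHOOD = {
        "office gesture recognition": {
            "fall-down": "none", "walking": "none", "running": "none", "pitching": "none",
            "move of the arms": "high", "move of the face": "high",
        },
    }

    def correct_by_purpose(probs: dict, purpose: str) -> dict:
        """Correct per-choice probabilities according to the purpose or application
        set in the setting unit 16, reusing correct_by_background above."""
        return correct_by_background(probs, purpose, PURPOSE_ACTION_LIKELIHOOD)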
  • Further, in the above embodiment, although the relation between the correct-action and the correct-background learned by the learning unit 26 is stored in the storing unit 3, a preset value may instead be stored in advance in the storing unit 3.
  • In the above embodiment, the storing unit 3 is arranged separately from the action-estimating device 1 and the learning device 2. However, the storing unit 3 may be mounted on the action-estimating device 1 side or on the learning device 2 side.
  • In the above embodiment, the displacement of the coordinates and depths of the articulation group B over the plurality of time-series images Y is considered in order to estimate the action of the subject Z. However, the displacement of each individual articulation A over the plurality of time-series images Y may be used instead.
  • Further, in the above embodiment, the case where the subject Z is a human is explained. However, the device can also be used to estimate the action of an animal or a robot. In addition, in the above embodiment, the neck, right elbow, left elbow, waist, right knee, and left knee are employed as the plurality of articulations A. However, it goes without saying that other articulations, or a larger number of articulations A, may also be employed.
  • The present invention also applies to a program that performs the processing of the action-estimating device 1, and to a recording medium storing that program. In the case of a recording medium, the program is installed on a computer or the like from the medium. The recording medium storing the program may be reusable rather than single-use. As a reusable recording medium, a CD-ROM may be employed, for example, but the recording medium is not limited to this. In addition, the plurality of choices of action to be estimated may of course be stored in the computer later. Similarly, the purpose or application of the estimation of the targeted action may also be set in the computer later.
  • DESCRIPTION OF THE REFERENCE NUMERALS
    • 1 Action-estimating device
    • 2 Learning device
    • 3 Storing unit
    • 11 Estimating-side identifier
    • 12 Estimating-side obtaining unit
    • 13 Estimating-side detecting unit
    • 14 Estimating-side measuring unit
    • 15 Estimating unit
    • 16 Setting unit
    • 21 Learning-side identifier
    • 22 Learning-side obtaining unit
    • 23 Learning-side detecting unit
    • 24 Correct-answer obtaining unit
    • 25 Learning-side measuring unit
    • 26 Learning unit

Claims (16)

1. A non-transitory computer-readable medium storing an action-estimating program for a computer storing a plurality of choices of action to be estimated, the action-estimating program comprising:
a step for obtaining a plurality of time-series images in which a subject appears;
a step for detecting a plurality of articulations appearing in each time-series image;
a step for measuring coordinates of the detected plurality of articulations in each time-series image;
a step for estimating an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images; and
a step for detecting a background appearing in each time-series image,
wherein, in the estimating step, a probability of each of the plurality of choices is calculated based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and the calculated probability of each of the plurality of choices is corrected based on the detected background.
2. The non-transitory computer-readable medium according to claim 1, wherein, in the estimating step, one or more choices are excluded from among the plurality of choices based on the detected background.
3. The non-transitory computer-readable medium according to claim 1, wherein the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer,
wherein, in the estimating step, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
4. The non-transitory computer-readable medium according to claim 2, wherein the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer,
wherein, in the estimating step, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
5. An action-estimating device comprising:
an estimating-side obtaining unit configured to obtain a plurality of time-series images in which a subject appears;
an estimating-side detecting unit configured to detect a plurality of articulations appearing in each time-series image;
an estimating-side measuring unit configured to measure coordinates of the detected plurality of articulations in each time-series image;
an estimating unit configured to estimate an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images;
a setting unit configured to set a purpose or application for the estimating of the action of the subject; and
a storing unit configured to store a plurality of choices of the action to be estimated,
wherein, in order to estimate the action of the subject, the estimating unit calculates a probability of each of the plurality of choices based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and corrects the calculated probability of each of the plurality of choices based on the set purpose or application.
6. The action-estimating device according to claim 5, wherein the estimating unit excludes one or more choices from among the plurality of choices based on the set purpose or application in order to estimate the action of the subject.
7. The action-estimating device according to claim 5, wherein the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the storing unit, and
wherein, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the estimating unit increases the probability of the associated choice which was not excluded or whose probability was not decreased, in order to estimate the action of the subject.
8. The action-estimating device according to claim 6, wherein the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the storing unit, and
wherein, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the estimating unit increases the probability of the associated choice which was not excluded or whose probability was not decreased, in order to estimate the action of the subject.
9. A non-transitory computer-readable medium storing an action-estimating program for a computer storing a plurality of choices of action to be estimated and in which a purpose or application for estimating an action of a subject is set, the action-estimating program comprising:
a step for obtaining a plurality of time-series images in which the subject appears;
a step for detecting a plurality of articulations appearing in each time-series image;
a step for measuring coordinates of the detected plurality of articulations in each time-series image; and
a step for estimating an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images,
wherein, in the estimating step, a probability of each of the plurality of choices is calculated based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and the calculated probability of each of the plurality of choices is corrected based on the set purpose or application.
10. The non-transitory computer-readable medium according to claim 9, wherein, in the estimating step, one or more choices are excluded from among the plurality of choices based on the set purpose or application.
11. The non-transitory computer-readable medium according to claim 9, wherein the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer,
wherein, in the estimating step, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
12. The non-transitory computer-readable medium according to claim 10, wherein the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer,
wherein, in the estimating step, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
13. An action-estimating method executed on a computer storing a plurality of choices of action to be estimated and in which a purpose or application for estimating an action of a subject is set, the method comprising:
a step for obtaining a plurality of time-series images in which the subject appears;
a step for detecting a plurality of articulations appearing in each time-series image;
a step for measuring coordinates of the detected plurality of articulations in each time-series image; and
a step for estimating an action of the subject based on displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images,
wherein, in the estimating step, a probability of each of the plurality of choices is calculated based on the displacement of the coordinates of the measured plurality of articulations in the plurality of time-series images, and the calculated probability of each of the plurality of choices is corrected based on the set purpose or application.
14. The action-estimating method according to claim 13, wherein, in the estimating step, one or more choices are excluded from among the plurality of choices based on the set purpose or application.
15. The action-estimating method according to claim 13, wherein the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer,
wherein, in the estimating step, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
16. The action-estimating method according to claim 14, wherein the choices whose actions have a relation greater than a prescribed value between each other are associated with each other in the computer,
wherein, in the estimating step, when any one of the plurality of choices associated with each other was excluded or the probability of any one of the plurality of choices associated with each other was decreased, the probability of the associated choice which was not excluded or whose probability was not decreased is increased.
US17/324,190 2018-05-27 2021-05-19 Action-estimating device Abandoned US20210279452A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/324,190 US20210279452A1 (en) 2018-05-27 2021-05-19 Action-estimating device

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
JP2018101097A JP6525181B1 (en) 2018-05-27 2018-05-27 Behavior estimation device
JP2018-101097 2018-05-27
PCT/JP2019/015403 WO2019230199A1 (en) 2018-05-27 2019-04-09 Action estimation device
US202017057720A 2020-11-23 2020-11-23
US17/324,190 US20210279452A1 (en) 2018-05-27 2021-05-19 Action-estimating device

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
US17/057,720 Continuation US11048924B1 (en) 2018-05-27 2019-04-09 Action-estimating device
PCT/JP2019/015403 Continuation WO2019230199A1 (en) 2018-05-27 2019-04-09 Action estimation device

Publications (1)

Publication Number Publication Date
US20210279452A1 true US20210279452A1 (en) 2021-09-09

Family

ID=66730618

Family Applications (2)

Application Number Title Priority Date Filing Date
US17/057,720 Active US11048924B1 (en) 2018-05-27 2019-04-09 Action-estimating device
US17/324,190 Abandoned US20210279452A1 (en) 2018-05-27 2021-05-19 Action-estimating device

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US17/057,720 Active US11048924B1 (en) 2018-05-27 2019-04-09 Action-estimating device

Country Status (3)

Country Link
US (2) US11048924B1 (en)
JP (1) JP6525181B1 (en)
WO (1) WO2019230199A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6966038B2 (en) * 2020-03-13 2021-11-10 エヌ・ティ・ティ・ビズリンク株式会社 Animal behavior estimation device, animal behavior estimation method and program
JP7012111B2 (en) * 2020-03-13 2022-01-27 エヌ・ティ・ティ・ビズリンク株式会社 Animal behavior estimation system, animal behavior estimation support device, animal behavior estimation method and program

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7200266B2 (en) * 2002-08-27 2007-04-03 Princeton University Method and apparatus for automated video activity analysis
US20070132597A1 (en) * 2005-12-09 2007-06-14 Valence Broadband, Inc. Methods and systems for monitoring patient support exiting and initiating response
US20140157209A1 (en) * 2012-12-03 2014-06-05 Google Inc. System and method for detecting gestures
JP2016099982A (en) 2014-11-26 2016-05-30 日本電信電話株式会社 Behavior recognition device, behaviour learning device, method, and program
JP6166297B2 (en) 2015-03-12 2017-07-19 セコム株式会社 Posture estimation device
JP2017102808A (en) 2015-12-04 2017-06-08 ソニー株式会社 Image processing device and method
WO2019198696A1 (en) 2018-04-11 2019-10-17 株式会社アジラ Action estimation device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100303303A1 (en) * 2009-05-29 2010-12-02 Yuping Shen Methods for recognizing pose and action of articulated objects with collection of planes in motion
US10592812B2 (en) * 2014-01-20 2020-03-17 Sony Corporation Information processing apparatus and information processing method
US20180286206A1 (en) * 2015-10-06 2018-10-04 Konica Minolta, Inc. Action Detection System, Action Detection Device, Action Detection Method, and Action Detection Program
JP2017228100A (en) * 2016-06-23 2017-12-28 コニカミノルタ株式会社 Behavior recognition device and behavior recognition program
JP2018057596A (en) * 2016-10-05 2018-04-12 コニカミノルタ株式会社 Joint position estimation device and joint position estimation program
US20180121939A1 (en) * 2016-10-27 2018-05-03 Conduent Business Services, Llc Method and system for predicting behavioral characteristics of customers in physical stores
US20190147601A1 (en) * 2017-11-10 2019-05-16 Fujitsu Limited Information processing apparatus, background image update method, and non-transitory computer-readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ahmad et al, ("A Depth Video Sensor-Based Life-Logging Human Activity Recognition System for Elderly Care in Smart Indoor Environments", Sensors 2014, 14, 11735-11759; doi:10.3390/s140711735) (Year: 2014) *

Also Published As

Publication number Publication date
JP6525181B1 (en) 2019-06-05
JP2019204464A (en) 2019-11-28
US11048924B1 (en) 2021-06-29
WO2019230199A1 (en) 2019-12-05
US20210201006A1 (en) 2021-07-01

Similar Documents

Publication Publication Date Title
EP3373804B1 (en) Device, system and method for sensor position guidance
US20210279452A1 (en) Action-estimating device
US9974466B2 (en) Method and apparatus for detecting change in health status
US11315275B2 (en) Edge handling methods for associated depth sensing camera devices, systems, and methods
US11298050B2 (en) Posture estimation device, behavior estimation device, storage medium storing posture estimation program, and posture estimation method
US20180005510A1 (en) Situation identification method, situation identification device, and storage medium
US11482046B2 (en) Action-estimating device
Harada et al. Body parts positions and posture estimation system based on pressure distribution image
KR20170054673A (en) Sleeping position verification method using kinect sensors
JP7347577B2 (en) Image processing system, image processing program, and image processing method
US20110288377A1 (en) Biological information measurement apparatus and method thereof
US11598680B2 (en) System for estimating thermal comfort
JP6525179B1 (en) Behavior estimation device
KR20140013662A (en) Device and method for calibration
JP6525180B1 (en) Target number identification device
Chen et al. Gait monitoring using an ankle-worn stereo camera system
KR102411882B1 (en) Untact physical fitness measurement system using images
CN113271848A (en) Body health state image analysis device, method and system
Steele et al. A system to facilitate early and progressive ambulation using fiducial markers
US11164481B2 (en) Method and electronic apparatus for displaying reference locations for locating ECG pads and recording medium using the method
WO2019086363A1 (en) Distance measurement devices, systems and methods
EP3463074A1 (en) Impedance shift detection
WO2022249746A1 (en) Physical-ability estimation system, physical-ability estimation method, and program
Chen Gait Parameter Monitoring Using A Wearable Stereo Camera System
WO2020008995A1 (en) Image recognition program, image recognition device, learning program, and learning device

Legal Events

Date Code Title Description
AS Assignment

Owner name: ASILLA, INC., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KIMURA, DAISUKE;REEL/FRAME:056283/0616

Effective date: 20201119

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION