WO2023188217A1 - Information Processing Program, Information Processing Method, and Information Processing Device
- Publication number
- WO2023188217A1 (application PCT/JP2022/016364)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- time
- information processing
- subject
- series data
- information
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/251—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/34—Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20072—Graph-based image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Definitions
- the present invention relates to an information processing program, an information processing method, and an information processing device.
- a likelihood is calculated based on the likelihood of the result of the first process, the likelihood of the result of the second process, and the likelihood of the result of the third process.
- some of the results of the third process are output as the object skeleton recognition results.
- there is a technique for recognizing a heat map image in which the likelihood of a plurality of joint positions of a subject is projected from a plurality of directions from a distance image of the subject.
- there is a behavior detection technique that uses a recurrent neural network.
- with the conventional technology, it may be difficult to accurately specify the three-dimensional coordinates of each joint of a person.
- for example, the three-dimensional coordinates of a joint of a person's right hand may be mistaken for the three-dimensional coordinates of a joint of the person's left hand.
- likewise, the three-dimensional coordinates of a part of an object other than a person in a multi-view image may be mistaken for the three-dimensional coordinates of a human joint.
- the present invention aims to enable accurate identification of the location of a subject's body part.
- time-series data of skeletal information including the position of each of a plurality of body parts of the subject is acquired, and based on the feature amount of the skeletal information in the acquired time-series data, the type of motion of the subject corresponding to the skeletal information at the first time point in the time-series data is specified.
- FIG. 1 is an explanatory diagram showing an example of an information processing method according to an embodiment.
- FIG. 2 is an explanatory diagram showing an example of the information processing system 200.
- FIG. 3 is a block diagram showing an example of the hardware configuration of the information processing device 100.
- FIG. 4 is a block diagram showing an example of the hardware configuration of the image capturing device 201.
- FIG. 5 is a block diagram showing an example of the functional configuration of the information processing device 100.
- FIG. 6 is an explanatory diagram showing the flow of operations of the information processing device 100.
- FIG. 7 is an explanatory diagram (part 1) showing a specific example of identifying an abnormal joint.
- FIG. 8 is an explanatory diagram (part 2) showing a specific example of identifying an abnormal joint.
- FIG. 9 is an explanatory diagram showing a specific example of generating a Factor Graph.
- FIG. 10 is an explanatory diagram showing a specific example of a Factor Graph template 911 corresponding to "jump".
- FIG. 11 is an explanatory diagram showing a specific example of a Factor Graph template 911 corresponding to "lying down".
- FIG. 12 is an explanatory diagram showing a specific example of adding time series constraints.
- FIG. 13 is an explanatory diagram showing a specific example of modifying the 3D skeleton inference result 602.
- FIG. 14 is an explanatory diagram (part 1) showing a specific example of the flow of data processing in the operation example.
- FIG. 15 is an explanatory diagram (part 2) showing a specific example of the flow of data processing in the operation example.
- FIG. 16 is a flowchart showing an example of the overall processing procedure.
- FIG. 1 is an explanatory diagram showing an example of an information processing method according to an embodiment.
- the information processing device 100 is a computer that enables accurate identification of the location of a subject's body parts.
- the target person is, for example, a person.
- Sites include, for example, the neck, head, right and left shoulders, right and left elbows, right and left hands, right and left knees, right and left feet, and the like.
- the site is, for example, a joint.
- the position is, for example, a three-dimensional coordinate.
- for example, it is conceivable to detect the area of a person in each image, determine the two-dimensional coordinates of each joint of the person based on the detected area, and then specify the three-dimensional coordinates of each joint of the person from the identified two-dimensional coordinates by taking the imaging angle into account. Specifically, a model learned through deep learning is used to identify the three-dimensional coordinates of each joint of a person. For an example of this technique, reference can be made specifically to Reference 1 and Reference 2 below.
- "V2V-PoseNet: Voxel-to-voxel prediction network for accurate 3D hand and human pose estimation from a single depth map." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
- the information processing device 100 acquires time-series data of the skeleton information 101.
- the skeletal information 101 includes, for example, the position of each of a plurality of body parts of the subject. Sites include, for example, the neck, head, right and left shoulders, right and left elbows, right and left hands, right and left knees, right and left feet, and the like.
- the site is, for example, a joint.
- the parts are specifically joint 1, joint 2, joint 3, etc.
- the position is, for example, a three-dimensional coordinate.
- the time series data includes, for example, skeletal information 101 for each time point.
- the time-series data specifically includes skeletal information 101 at time T, skeletal information 101 at time T-1, and the like.
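As a minimal illustration of the time-series data structure described above, the skeletal information at each time point could be represented as follows; the names `SkeletalInfo`, `joint1`, and `joint2`, and the coordinate values, are illustrative assumptions, not part of the embodiment:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Coord3D = Tuple[float, float, float]

@dataclass
class SkeletalInfo:
    time: int                    # time point index (e.g. T-1, T)
    joints: Dict[str, Coord3D]   # part (joint) name -> 3D coordinate

# Time-series data: skeletal information 101 at time T-1, at time T, and so on.
series: List[SkeletalInfo] = [
    SkeletalInfo(time=0, joints={"joint1": (0.0, 1.0, 0.0), "joint2": (0.1, 0.9, 0.0)}),
    SkeletalInfo(time=1, joints={"joint1": (0.0, 1.0, 0.0), "joint2": (0.2, 0.8, 0.0)}),
]
```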
- the information processing device 100 identifies the type of motion of the subject corresponding to the skeletal information 101 at the first point in time in the acquired time-series data, based on the feature amount of the skeletal information 101 in the acquired time-series data.
- the types of motion include, for example, horizontal rotation such as walking, running, jumping, sitting, lying down, turning or spinning, or vertical rotation such as somersault or horizontal bar motion.
- the feature amount may be, for example, the position of each part of the subject indicated by the skeletal information 101.
- the feature amount may be, for example, a deviation in the position of each part of the subject indicated by the skeletal information 101 at different times.
- the feature amount may be, for example, the distance between the positions of different parts of the subject indicated by the skeletal information 101.
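The three kinds of feature amount listed above (per-part positions, the deviation of each part's position between time points, and the distances between different parts) can be sketched as follows; the function names are illustrative assumptions:

```python
import numpy as np

def position_features(frame):
    """Feature amount: the position of each part, as flattened 3D coordinates."""
    return np.concatenate([np.asarray(frame[j], dtype=float) for j in sorted(frame)])

def deviation_features(frame_prev, frame_curr):
    """Feature amount: deviation of each part's position between two time points."""
    return np.concatenate(
        [np.asarray(frame_curr[j], dtype=float) - np.asarray(frame_prev[j], dtype=float)
         for j in sorted(frame_curr)])

def distance_features(frame):
    """Feature amount: distances between the positions of different parts."""
    names = sorted(frame)
    return np.array(
        [np.linalg.norm(np.asarray(frame[a], dtype=float) - np.asarray(frame[b], dtype=float))
         for i, a in enumerate(names) for b in names[i + 1:]])
```

Here a `frame` is one time point's mapping of part names to 3D coordinates.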
- the information processing device 100 has, for example, a first model for specifying the type of movement of the subject.
- the first model has a function that allows the type of motion of the subject to be determined, for example, according to the input of the feature amount of the skeletal information 101.
- the information processing device 100 uses the first model to identify the type of motion of the subject that corresponds to the skeletal information 101 at the first point in time in the acquired time series data.
- the information processing apparatus 100 specifically specifies "lying down" as the type of motion of the subject corresponding to the skeletal information 101 at the first point in time in the acquired time series data.
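As a hedged sketch of such a first model, a nearest-centroid classifier over feature vectors is shown below; the embodiment does not fix the model family (a deep neural network is equally possible), and the class name, motion labels, and two-dimensional feature values are illustrative:

```python
import numpy as np

class FirstModel:
    """Minimal stand-in for the 'first model': maps a feature vector of the
    skeletal information to a motion type via nearest-centroid classification."""
    def __init__(self):
        self.centroids = {}

    def fit(self, features_by_type):
        # One centroid per motion type, averaged over its training features.
        for motion_type, feats in features_by_type.items():
            self.centroids[motion_type] = np.mean(feats, axis=0)

    def predict(self, feature):
        # Return the motion type whose centroid is closest to the feature.
        return min(self.centroids,
                   key=lambda t: np.linalg.norm(feature - self.centroids[t]))

model = FirstModel()
model.fit({
    "lying down": [np.array([0.0, 0.1]), np.array([0.1, 0.0])],  # small vertical extent
    "jump":       [np.array([1.0, 0.9]), np.array([0.9, 1.0])],  # large vertical motion
})
print(model.predict(np.array([0.05, 0.05])))  # -> lying down
```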
- for the skeletal information 101 at the first point in time in the acquired time-series data, the information processing device 100 determines a second model: a probability distribution that constrains the temporal change in the position of one of the plurality of parts according to the movement tendency corresponding to the identified motion type.
- the movement tendency is, for example, a tendency of equipositional movement (remaining in place), uniform velocity movement, or uniform acceleration movement.
- specifically, the information processing device 100 determines, as the second model, a probability distribution that constrains the temporal change in the position of the joint 1 in the skeletal information 101 at time T according to the tendency of equipositional movement corresponding to lying down.
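One concrete, assumed form for such a second model is a zero-mean Gaussian over the joint's displacement between time points: while "lying down", large movement of the joint receives a high penalty (low probability). The function name and `sigma` are illustrative assumptions:

```python
import numpy as np

def second_model_neg_log_prob(pos_prev, pos_curr, motion_type, sigma=0.05):
    """Negative log-probability of a joint's temporal change under the
    movement tendency of the identified motion type (sketch)."""
    displacement = np.asarray(pos_curr, dtype=float) - np.asarray(pos_prev, dtype=float)
    if motion_type == "lying down":
        expected = np.zeros(3)      # equipositional movement: joint should barely move
    else:
        expected = displacement     # other tendencies left unconstrained in this sketch
    residual = displacement - expected
    return float(residual @ residual) / (2 * sigma ** 2)

# A joint that stays put is far more probable under 'lying down' than one
# that moves 0.5 m between frames.
small = second_model_neg_log_prob((0, 1, 0), (0.0, 1.01, 0.0), "lying down")
large = second_model_neg_log_prob((0, 1, 0), (0.0, 1.5, 0.0), "lying down")
print(small < large)  # -> True
```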
- the information processing device 100 generates a graph 110 that includes nodes 111 indicating the position of each part at each time point, first edges 112 connecting the nodes 111 to each other, and second edges 113 connecting the nodes 111 to each other.
- the first edge 112 connects nodes 111 indicating the positions of different biologically connected parts at each time point.
- the second edge 113 connects nodes 111 indicating the positions of any parts at different times.
- when generating the graph 110, the information processing device 100 associates the determined second model with the second edge 113. In the example of FIG. 1, the information processing device 100 specifically generates the graph 110 by associating the determined second model with the second edge 113 that connects the nodes 111 indicating the positions of the joint 1 of the subject at time T-1 and time T.
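A graph of this shape can be sketched as plain Python data: nodes are (time, joint) pairs, first edges connect biologically connected joints within a time point, and each second edge links the same joint across consecutive time points and carries the determined second model. The joint names, bone list, and time indices below are illustrative assumptions:

```python
BONES = [("joint1", "joint2"), ("joint2", "joint3")]  # biological connections
TIMES = [0, 1]                                        # e.g. time T-1 and time T
JOINTS = ["joint1", "joint2", "joint3"]

# Nodes: the position of each part at each time point.
nodes = [(t, j) for t in TIMES for j in JOINTS]

# First edges: biologically connected parts at the same time point.
first_edges = [((t, a), (t, b)) for t in TIMES for a, b in BONES]

# Second edges: the same part at consecutive time points, each carrying
# the second model determined for the identified motion type.
second_edges = {((t - 1, j), (t, j)): "second_model"
                for t in TIMES[1:] for j in JOINTS}

print(len(nodes), len(first_edges), len(second_edges))  # -> 6 4 3
```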
- the information processing device 100 corrects the skeleton information 101 at the first point in time in the time series data based on the generated graph 110. For example, the information processing device 100 corrects the position of the subject's joint 1 included in the skeletal information 101 at time T in the time series data. Thereby, the information processing device 100 can accurately identify the position of each joint of the subject. The information processing device 100 can accurately identify temporal changes in the positions of each joint of the subject.
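Under a Gaussian assumption, correcting one joint position reduces to a precision-weighted average of the observed position at time T and the equipositional constraint from time T-1. This closed form is a simplified stand-in for inference on the full graph 110, and both sigma parameters are assumptions:

```python
import numpy as np

def correct_joint(observed, previous, obs_sigma=0.05, motion_sigma=0.02):
    """Fuse the observed position with the previous position by minimising
    the sum of two Gaussian negative log-likelihoods (closed-form MAP)."""
    w_obs = 1.0 / obs_sigma ** 2      # confidence in the observation
    w_mot = 1.0 / motion_sigma ** 2   # confidence in the equipositional constraint
    return (w_obs * np.asarray(observed, dtype=float)
            + w_mot * np.asarray(previous, dtype=float)) / (w_obs + w_mot)

# A spurious 0.29 m jump of joint 1 while 'lying down' is pulled back
# towards its previous position.
corrected = correct_joint(observed=(0.0, 1.30, 0.0), previous=(0.0, 1.01, 0.0))
print(corrected[1] < 1.30)  # -> True
```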
- the information processing apparatus 100 uses the first model to identify the type of movement of the target person, but the present invention is not limited to this.
- the information processing device 100 may identify the type of movement of the subject without using the first model.
- a plurality of computers may cooperate to realize the functions of the information processing device 100.
- for example, a computer that specifies the type of movement of the subject, a computer that generates the graph 110, and a computer that corrects the skeletal information 101 at the first point in time in the time-series data based on the graph 110 may work together.
- FIG. 2 is an explanatory diagram showing an example of the information processing system 200.
- an information processing system 200 includes an information processing device 100, one or more image capturing devices 201, and one or more client devices 202.
- the information processing device 100 and the image capturing device 201 are connected via a wired or wireless network 210.
- the network 210 is, for example, a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, or the like. Further, in the information processing system 200, the information processing device 100 and the client device 202 are connected via a wired or wireless network 210.
- the information processing device 100 acquires a plurality of images of the subject from different angles at different times from one or more image capturing devices 201.
- the information processing device 100 identifies the distribution of the existence probability of each part of the subject in a three-dimensional space based on the plurality of acquired images at each point in time, and identifies the three-dimensional coordinates of each part of the subject.
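As a minimal sketch of the second half of this step, a part's three-dimensional coordinate can be taken from the peak of its existence-probability distribution over a voxelised space; the construction of that distribution from the multi-view images is omitted, and the grid size, origin, and voxel size are illustrative assumptions:

```python
import numpy as np

def joint_coordinate_from_probability(grid, origin, voxel_size):
    """Identify a 3D joint coordinate from an existence-probability grid:
    take the voxel with the highest probability and convert its index
    to world coordinates."""
    idx = np.unravel_index(np.argmax(grid), grid.shape)
    return origin + voxel_size * np.asarray(idx, dtype=float)

grid = np.zeros((4, 4, 4))
grid[1, 2, 3] = 0.9  # peak of the existence probability for this part
coord = joint_coordinate_from_probability(grid, origin=np.zeros(3), voxel_size=0.1)
print(np.round(coord, 3).tolist())  # -> [0.1, 0.2, 0.3]
```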
- the information processing device 100 identifies the type of movement of the target person at each time point based on the three-dimensional coordinates of each of the identified body parts of the target person.
- the information processing device 100 specifies, at each point in time, one of the parts of the target person that corresponds to the identified type of motion of the target person.
- the information processing device 100 determines a probability distribution model that constrains the temporal change in the position of any of the specified parts based on the specified type of movement of the subject for each time point.
- the information processing device 100 generates a graph including nodes indicating the three-dimensional coordinates of each part of the subject at each specified time point.
- when generating the graph, the information processing device 100 generates it so that it includes, at each time point, first edges connecting the nodes indicating the three-dimensional coordinates of different, biologically connected parts of the subject.
- when generating the graph, the information processing device 100 also generates it so that it includes, for each of the specified parts, second edges connecting the nodes indicating that part's three-dimensional coordinates at a given point in time and at another point in time.
- the other point in time is, for example, a point immediately before the certain point in time.
- the information processing device 100 associates the determined model with the second edge included in the graph.
- the information processing device 100 refers to the graph and corrects the three-dimensional coordinates of each part of the identified subject.
- the information processing device 100 outputs the corrected three-dimensional coordinates of each part of the subject.
- the output format includes, for example, displaying on a display, printing out to a printer, transmitting to another computer, or storing in a storage area.
- the information processing device 100 transmits the corrected three-dimensional coordinates of each part of the subject to the client device 202.
- the information processing device 100 is, for example, a server, a PC (Personal Computer), or the like.
- the image capturing device 201 is a computer that captures an image of a subject.
- the image capturing device 201 includes a camera having a plurality of image sensors, and captures an image of the subject using the camera.
- the image capturing device 201 generates an image of the subject and transmits it to the information processing device 100.
- the image capturing device 201 is, for example, a smartphone.
- the image capturing device 201 may be, for example, a fixed point camera.
- the image capturing device 201 may be, for example, a drone.
- the client device 202 receives the three-dimensional coordinates of each part of the subject from the information processing device 100.
- the client device 202 outputs the received three-dimensional coordinates of each body part of the subject so that the user can refer to them.
- the client device 202 displays, for example, the received three-dimensional coordinates of each part of the subject on a display.
- the client device 202 is, for example, a PC, a tablet terminal, or a smartphone.
- the information processing device 100 is a device different from the image capturing device 201, but the invention is not limited to this.
- the information processing device 100 may have a function as the image capturing device 201 and may also operate as the image capturing device 201.
- the information processing device 100 may have a function as the client device 202 and may also operate as the client device 202.
- FIG. 3 is a block diagram showing an example of the hardware configuration of the information processing device 100.
- the information processing apparatus 100 includes a CPU (Central Processing Unit) 301, a memory 302, a network I/F (Interface) 303, a recording medium I/F 304, and a recording medium 305.
- Information processing device 100 further includes a display 306 and an input device 307. Further, each component is connected to each other by a bus 300.
- the CPU 301 controls the entire information processing device 100.
- the memory 302 includes, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), a flash ROM, and the like. Specifically, for example, a flash ROM or ROM stores various programs, and a RAM is used as a work area for the CPU 301.
- the program stored in the memory 302 is loaded into the CPU 301 and causes the CPU 301 to execute the coded processing.
- the network I/F 303 is connected to a network 210 through a communication line, and is connected to other computers via the network 210.
- the network I/F 303 serves as an internal interface with the network 210, and controls data input/output from other computers.
- the network I/F 303 is, for example, a modem or a LAN adapter.
- the recording medium I/F 304 controls reading/writing of data to/from the recording medium 305 under the control of the CPU 301.
- the recording medium I/F 304 is, for example, a disk drive, an SSD (Solid State Drive), a USB (Universal Serial Bus) port, or the like.
- the recording medium 305 is a nonvolatile memory that stores data written under the control of the recording medium I/F 304.
- the recording medium 305 is, for example, a disk, a semiconductor memory, a USB memory, or the like.
- the recording medium 305 may be removable from the information processing apparatus 100.
- the display 306 displays data such as a cursor, icon, toolbox, document, image, or functional information.
- the display 306 is, for example, a CRT (Cathode Ray Tube), a liquid crystal display, or an organic EL (Electroluminescence) display.
- the input device 307 has keys for inputting characters, numbers, various instructions, etc., and inputs data.
- the input device 307 is, for example, a keyboard or a mouse.
- the input device 307 may be, for example, a touch panel type input pad or a numeric keypad.
- the information processing device 100 may include, for example, a camera. Further, the information processing device 100 may include, for example, a printer, a scanner, a microphone, a speaker, or the like in addition to the components described above. Further, the information processing apparatus 100 may include a plurality of recording medium I/Fs 304 and recording media 305. Furthermore, the information processing apparatus 100 does not need to have the display 306 or the input device 307. Further, the information processing apparatus 100 does not need to have the recording medium I/F 304 and the recording medium 305.
- FIG. 4 is a block diagram showing an example of the hardware configuration of the image capturing device 201.
- the image capturing device 201 includes a CPU 401, a memory 402, a network I/F 403, a recording medium I/F 404, a recording medium 405, and a camera 406. Further, each component is connected to each other by a bus 400.
- Memory 402 includes, for example, ROM, RAM, flash ROM, and the like. Specifically, for example, a flash ROM or ROM stores various programs, and a RAM is used as a work area for the CPU 401. The program stored in the memory 402 is loaded into the CPU 401 and causes the CPU 401 to execute the coded processing.
- the network I/F 403 is connected to a network 210 through a communication line, and is connected to other computers via the network 210.
- the network I/F 403 serves as an internal interface with the network 210, and controls data input/output from other computers.
- the network I/F 403 is, for example, a modem or a LAN adapter.
- the recording medium I/F 404 controls data read/write to the recording medium 405 under the control of the CPU 401.
- the recording medium I/F 404 is, for example, a disk drive, an SSD, a USB port, or the like.
- the recording medium 405 is a nonvolatile memory that stores data written under the control of the recording medium I/F 404.
- the recording medium 405 is, for example, a disk, a semiconductor memory, a USB memory, or the like.
- the recording medium 405 may be removable from the image capturing device 201.
- the camera 406 has a plurality of image sensors, and generates an image of a target object captured by the plurality of image sensors. The camera 406 is, for example, a camera for capturing sports competitions, or a surveillance camera.
- the image capturing device 201 may include, for example, a keyboard, a mouse, a display, a printer, a scanner, a microphone, a speaker, and the like. Further, the image capturing device 201 may include a plurality of recording medium I/Fs 404 and recording media 405. Further, the image capturing device 201 does not need to have the recording medium I/F 404 or the recording medium 405.
- the hardware configuration of the client device 202 is the same as the example hardware configuration of the information processing device 100 shown in FIG. 3, so its description is omitted.
- FIG. 5 is a block diagram showing an example of the functional configuration of the information processing device 100.
- the information processing device 100 includes a storage unit 500, an acquisition unit 501, an analysis unit 502, a learning unit 503, a specification unit 504, a determination unit 505, a generation unit 506, a correction unit 507, and an output unit 508.
- the storage unit 500 is realized, for example, by a storage area such as the memory 302 or the recording medium 305 shown in FIG. 3. Although a case will be described below in which the storage unit 500 is included in the information processing device 100, the present invention is not limited to this. For example, there may be a case where the storage unit 500 is included in a device different from the information processing device 100, and the storage contents of the storage unit 500 can be referenced from the information processing device 100.
- the acquisition unit 501 to the output unit 508 function as an example of a control unit. Specifically, their functions are realized by causing the CPU 301 to execute a program stored in a storage area such as the memory 302 or the recording medium 305 shown in FIG. 3. The processing results of each functional unit are stored, for example, in a storage area such as the memory 302 or the recording medium 305 shown in FIG. 3.
- the storage unit 500 stores various information that is referenced or updated in the processing of each functional unit.
- the storage unit 500 stores, for example, a plurality of images taken of a specific person from different angles at each of a plurality of consecutive time points. The angle indicates the imaging position.
- the image is acquired by the acquisition unit 501, for example.
- the storage unit 500 stores, for example, time-series data of skeletal information.
- the time series data includes skeletal information at each of a plurality of consecutive time points.
- the skeletal information includes the position of each of a plurality of body parts of a specific person.
- the site is, for example, a joint. Sites include, for example, the neck, head, right and left shoulders, right and left elbows, right and left hands, right and left knees, right and left feet, and the like.
- the position is, for example, a three-dimensional coordinate.
- the time series data is acquired by the acquisition unit 501.
- the time series data may be generated by the analysis unit 502, for example.
- the acquisition unit 501 acquires various information used in the processing of each functional unit.
- the acquisition unit 501 stores the acquired various information in the storage unit 500 or outputs it to each functional unit. Further, the acquisition unit 501 may output various information stored in the storage unit 500 to each functional unit.
- the acquisition unit 501 acquires various information based on, for example, a user's operation input.
- the acquisition unit 501 may receive various information from a device different from the information processing device 100, for example.
- the acquisition unit 501 acquires, for example, time-series data of the subject's skeletal information.
- the skeletal information of the subject includes, for example, the position of each of a plurality of body parts of the subject.
- the acquisition unit 501 acquires time-series data of the skeletal information of the subject by accepting input of time-series data of the skeletal information of the subject based on the user's operation input.
- the acquisition unit 501 may acquire time-series data of the subject's skeletal information by receiving it from another computer.
- the acquisition unit 501 may, for example, acquire time-series data of skeletal information of past subjects.
- the past subject may be, for example, the same person as the current subject.
- the subject's skeletal information includes, for example, the position of each of a plurality of body parts of the subject.
- the acquisition unit 501 acquires time-series data of the skeletal information of the subject by accepting input of time-series data of the skeletal information of the subject based on the user's operation input.
- the acquisition unit 501 may acquire time-series data of the subject's skeletal information by receiving it from another computer.
- the acquisition unit 501 may, for example, acquire the type of motion of the subject corresponding to each piece of skeletal information in the time series data of past skeletal information of the subject.
- the types of motion include, for example, walking, running, jumping, sitting, lying down, horizontal rotation such as turning or spinning, and vertical rotation such as a somersault or a horizontal bar motion.
- the acquisition unit 501 acquires the type of motion of the subject corresponding to each piece of skeletal information in the time-series data of the subject's past skeletal information by accepting its input based on the user's operation input.
- the acquisition unit 501 may acquire the type of motion of the subject corresponding to each piece of skeletal information in the time-series data of past skeletal information of the subject by receiving from another computer.
- the acquisition unit 501 acquires, for example, a plurality of images of the subject taken from different angles at each of a plurality of consecutive time points.
- the acquisition unit 501 acquires the plurality of images when, for example, time-series data of the subject's skeletal information is not acquired directly and the analysis unit 502 generates it instead. Thereby, the acquisition unit 501 can enable the analysis unit 502 to generate time-series data of the subject's skeletal information.
- the acquisition unit 501 may acquire a plurality of images of the subject taken from different angles at each of a plurality of consecutive time points.
- the acquisition unit 501 acquires a plurality of images. Thereby, the acquisition unit 501 can enable the analysis unit 502 to generate time-series data of the subject's skeletal information.
- the acquisition unit 501 may receive a start trigger to start processing of any functional unit.
- the start trigger is, for example, a predetermined operation input by the user.
- the start trigger may be, for example, receiving predetermined information from another computer.
- the start trigger may be, for example, that any functional unit outputs predetermined information.
- the acquisition unit 501 may accept the acquisition of a plurality of images as a start trigger for starting the processing of the analysis unit 502.
- the acquisition unit 501 may receive, for example, the acquisition of time-series data of the subject's skeletal information as a start trigger for starting the processing of the learning unit 503.
- the acquisition unit 501 may receive the acquisition of time-series data of the subject's skeletal information as a start trigger for starting the processing of the identification unit 504, the determination unit 505, the generation unit 506, and the modification unit 507.
- the analysis unit 502 generates time-series data of skeletal information of a predetermined person.
- the analysis unit 502 generates, for example, time-series data of the subject's skeletal information. Specifically, the analysis unit 502 estimates the position of each part of the subject at each time point based on a plurality of images taken of the subject from different angles at each of a plurality of time points, and generates skeletal information of the subject that includes the estimated positions. Specifically, the analysis unit 502 generates time-series data of the skeletal information of the subject based on the generated skeletal information. Thereby, the analysis unit 502 can provisionally specify the position of each part of the subject at each point in time, and can obtain the correction targets.
- the analysis unit 502 may generate time-series data of the subject's skeletal information. Specifically, the analysis unit 502 generates skeletal information of the subject at each time point based on a plurality of images taken of the subject from different angles at each of a plurality of time points, and generates time-series data from the skeletal information at each time point. The analysis unit 502 may add noise to the generated time-series data of the subject's skeletal information. The analysis unit 502 sets the subject's skeletal information as teacher information for generating a learning model. Thereby, the analysis unit 502 can obtain teacher information for generating a learning model.
- the learning unit 503 learns the first learning model based on teacher information including the position of each of a plurality of body parts of the subject.
- the first learning model has a function of identifying, according to feature amounts related to the skeletal information in time-series data of a predetermined person's skeletal information, which of the person's plurality of parts is in an abnormal state in terms of position. The first learning model has, for example, a function of making it possible to determine whether or not each body part of the predetermined person is in an abnormal state in terms of position.
- specifically, the first learning model has a function of calculating an index value indicating the probability that each part of the predetermined person is in an abnormal state with respect to position. More specifically, in response to input of the feature amounts related to the skeletal information, the first learning model outputs, for each part of the predetermined person, an index value indicating the probability that the part is in a positionally abnormal state.
- the first learning model is a neural network. Thereby, the learning unit 503 can specify one of the plurality of parts of the subject whose position is abnormal.
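The patent only specifies that the first learning model is a neural network. As a minimal, purely illustrative stand-in, a single linear layer followed by a sigmoid per part already yields index values in [0, 1]; the weights, biases, and part names below are made-up assumptions:

```python
import math

def abnormality_index_values(features, weights, biases):
    """Toy stand-in for the first learning model: for each body part, map the
    feature vector to an index value in [0, 1] (probability of a positional
    abnormality) via a linear layer followed by a sigmoid."""
    probs = {}
    for part, w in weights.items():
        z = sum(wi * fi for wi, fi in zip(w, features)) + biases[part]
        probs[part] = 1.0 / (1.0 + math.exp(-z))  # sigmoid
    return probs

probs = abnormality_index_values(
    features=[0.8, -0.2],  # feature amounts from the time-series data
    weights={"left_hand": [2.0, 0.0], "right_hand": [0.0, 2.0]},
    biases={"left_hand": 0.0, "right_hand": 0.0},
)
print(round(probs["left_hand"], 3))  # 0.832
```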
- the learning unit 503 learns the second learning model based on teacher information including the position of each of the plurality of body parts of the subject.
- the second learning model has a function of identifying, according to the feature amounts related to the skeletal information in the time-series data of a predetermined person's skeletal information, the type of motion of the predetermined person corresponding to each piece of skeletal information in that time-series data.
- for example, in response to input of the feature amounts related to the skeletal information in the time-series data, the second learning model outputs, for each candidate that can be the type of motion of the predetermined person corresponding to any piece of skeletal information in the time-series data, an index value indicating the probability of that candidate.
- the second learning model is a neural network. Thereby, the learning unit 503 can specify the type of movement of the subject.
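Similarly, the output stage of the second learning model can be sketched as a softmax over candidate motion types, each candidate receiving an index value; the candidate list and input logits below are illustrative assumptions, not values from the patent:

```python
import math

MOTION_TYPES = ["walk", "run", "jump", "sit", "lie_down"]  # illustrative candidates

def motion_type_index_values(logits):
    """Softmax over the candidate motion types: one index value per candidate,
    summing to 1, indicating the probability that the candidate is the motion
    type corresponding to the skeletal information."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return {t: e / s for t, e in zip(MOTION_TYPES, exps)}

scores = motion_type_index_values([0.1, 0.2, 2.5, 0.0, -1.0])
identified = max(scores, key=scores.get)  # candidate with the largest index value
print(identified)  # jump
```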
- the identifying unit 504 identifies the type of motion of the subject corresponding to the skeletal information at the first point in time in the acquired time series data, based on the feature amount of the skeletal information in the acquired time series data. For example, the identifying unit 504 uses the learned second learning model to determine the target person corresponding to the skeletal information at the first time point based on the feature amount related to the skeletal information in the acquired time series data of the skeletal information of the target person. Identify the type of behavior.
- specifically, by inputting the feature amounts related to the skeletal information in the time-series data of the target person's skeletal information into the second learning model, the identifying unit 504 calculates, for each candidate that can be the type of motion of the target person corresponding to the skeletal information at the first time point, an index value indicating the probability of that candidate.
- the identifying unit 504 identifies the type of motion of the subject corresponding to the skeletal information at the first time point based on the calculated index value. More specifically, the identifying unit 504 identifies the candidate with the largest calculated index value as the type of movement of the subject. Thereby, the identifying unit 504 can obtain guidelines for correcting the position of each of the plurality of body parts of the subject.
- the specifying unit 504 can make it possible to determine which part of the subject's position is preferable to be corrected.
- the identification unit 504 identifies an abnormal region that is in an abnormal state with respect to position, among a plurality of regions of the subject.
- for example, based on the feature amounts related to the skeletal information in the acquired time-series data of the target person's skeletal information, the identification unit 504 identifies an abnormal part that is in an abnormal state with respect to position in the skeletal information at the first point in time.
- for example, using the learned first learning model, the identification unit 504 identifies, based on the feature amounts related to the skeletal information in the acquired time-series data of the subject's skeletal information, an abnormal part that is in a positionally abnormal state in the skeletal information at the first point in time.
- specifically, by inputting the feature amounts related to the skeletal information in the time-series data of the target person's skeletal information into the first learning model, the specifying unit 504 calculates, for each part of the target person, an index value indicating the probability that the part is in an abnormal state in the skeletal information at the first time point.
- the identification unit 504 identifies an abnormal site that is in an abnormal state with respect to position in the skeletal information at the first time point based on the calculated index value. More specifically, the identification unit 504 identifies, among a plurality of parts of the subject, a part whose calculated index value is equal to or greater than a threshold value as an abnormal part that is in an abnormal state with respect to position. Thereby, the identifying unit 504 can obtain guidelines for correcting the position of each of the plurality of body parts of the subject. The specifying unit 504 can make it possible to determine which part of the subject's position is preferable to be corrected.
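The threshold-based selection described above can be sketched in a few lines; the threshold value 0.5 and the part names are assumed examples, not values from the patent:

```python
def identify_abnormal_parts(index_values, threshold=0.5):
    """Parts whose index value (abnormality probability) is greater than or
    equal to the threshold are identified as abnormal parts."""
    return [part for part, p in index_values.items() if p >= threshold]

abnormal = identify_abnormal_parts(
    {"head": 0.05, "left_hand": 0.91, "right_knee": 0.47}, threshold=0.5)
print(abnormal)  # ['left_hand']
```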
- the determining unit 505 determines a probability distribution model that constrains the temporal change in the position of the part, among the plurality of parts, that corresponds to the specified motion type in the skeletal information at the first point in time in the acquired time-series data.
- a distribution model is, for example, a model for constraining the temporal change in the position of the part corresponding to the specified type of movement, according to the movement tendency of that part.
- the movement tendency is, for example, equipositional motion, uniform velocity motion, or uniformly accelerated motion.
- the determining unit 505 can obtain guidelines for correcting the position of the region identified by the identifying unit 504.
- the generation unit 506 generates a graph including nodes indicating the positions of the respective parts at each time point, a first edge, and a second edge.
- the first edge connects nodes indicating the positions of different biologically connected parts at each time point.
- the second edge connects nodes indicating the positions of any parts corresponding to the specified type of motion at different times.
- the generation unit 506 associates the determined distribution model with the second edge. Thereby, the generation unit 506 can modify the skeletal information at the first point in time in the time series data of the skeletal information of the subject.
- the generation unit 506 may generate the graph so that it further includes a third edge connecting nodes that indicate, at different times, the position of another part, other than the part corresponding to the specified type of motion, among the plurality of parts. For example, if only one first edge is connected to each node indicating the position of the other part, the generation unit 506 generates the graph so that a third edge connecting those nodes at different points in time is included. Thereby, the generation unit 506 can accurately correct the skeletal information at the first point in time in the time-series data of the subject's skeletal information. For example, the generation unit 506 can make it possible to accurately correct the position of the other part.
- the generation unit 506 may also generate the graph so that it further includes a third edge connecting nodes that indicate, at different times, the position of another part identified as an abnormal part, other than the part corresponding to the specified type of motion. For example, if only one first edge is connected to each node indicating the position of the other part, the generation unit 506 generates the graph so that a third edge connecting those nodes at different points in time is included. Thereby, the generation unit 506 can accurately correct the skeletal information at the first point in time in the time-series data of the subject's skeletal information. For example, the generation unit 506 can make it possible to accurately correct the position of the other part determined to be an abnormal part.
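A hedged sketch of the generated graph structure, under assumed part names and bone connections: nodes are (part, time) pairs, first edges connect biologically connected parts at the same time, second edges connect a motion-dependent part across consecutive times, and third edges do the same for an abnormal leaf part:

```python
# assumed bone connections (first-edge structure); not the patent's exact list
BONES = [("head", "neck"), ("neck", "left_shoulder"), ("left_shoulder", "left_hand")]

def build_graph(parts, times, motion_parts, abnormal_leaves=()):
    """Nodes are (part, time) pairs; first edges connect biologically connected
    parts at the same time; second edges connect a motion-dependent part across
    consecutive times; third edges do the same for abnormal leaf parts."""
    nodes = [(p, t) for t in times for p in parts]
    first = [((a, t), (b, t)) for t in times for a, b in BONES]
    second = [((p, t0), (p, t1))
              for p in motion_parts for t0, t1 in zip(times, times[1:])]
    third = [((p, t0), (p, t1))
             for p in abnormal_leaves for t0, t1 in zip(times, times[1:])]
    return nodes, first, second, third

parts = ["head", "neck", "left_shoulder", "left_hand"]
nodes, first, second, third = build_graph(
    parts, times=[0, 1, 2], motion_parts=["neck"], abnormal_leaves=["left_hand"])
print(len(nodes), len(first), len(second), len(third))  # 12 9 2 2
```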
- the modification unit 507 modifies the skeletal information at the first point in time in the time series data of the skeletal information of the subject based on the generated graph.
- the modification unit 507 modifies the skeletal information at the first point in time in the time series data of the skeletal information of the subject, for example, by optimizing the generated graph. Thereby, the modification unit 507 can accurately specify the position of each part of the target person, taking into consideration the type of motion of the target person.
- the modification unit 507 can accurately specify the position of each part of the subject by considering the probability that each part of the subject is in an abnormal state.
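The patent does not detail the graph optimization itself. As one plausible sketch, correcting a single coordinate of one joint can be posed as minimizing data terms (stay near the observations) plus pairwise terms (keep successive displacements near the average velocity times the time step), solved here by plain gradient descent; all parameter values are illustrative:

```python
def optimize_trajectory(obs, v_bar, dt=1.0, lam=5.0, iters=2000, lr=0.01):
    """Minimize sum((x_t - obs_t)^2) + lam * sum((x_t - x_{t-1} - v_bar*dt)^2)
    by gradient descent: data terms keep estimates near the observations, while
    pairwise terms pull successive displacements toward v_bar*dt."""
    x = list(obs)
    for _ in range(iters):
        grad = [2.0 * (xi - oi) for xi, oi in zip(x, obs)]
        for t in range(1, len(x)):
            r = x[t] - x[t - 1] - v_bar * dt  # residual of the pairwise term
            grad[t] += 2.0 * lam * r
            grad[t - 1] -= 2.0 * lam * r
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x

obs = [0.0, 1.0, 5.0, 3.0, 4.0]          # outlier at t=2
corrected = optimize_trajectory(obs, v_bar=1.0)
print([round(v, 2) for v in corrected])  # approximately [0.5, 1.6, 2.81, 3.6, 4.5]
```

The outlier at t=2 is pulled back toward the uniform-motion trajectory while the data terms keep the remaining points near their observations.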
- the output unit 508 outputs the processing result of at least one of the functional units.
- the output format is, for example, displaying on a display, printing out to a printer, transmitting to an external device via network I/F 303, or storing in a storage area such as memory 302 or recording medium 305. Thereby, the output unit 508 can notify the user of the processing results of at least one of the functional units, thereby improving the usability of the information processing apparatus 100.
- the output unit 508 outputs, for example, the skeleton information at the first time point corrected by the correction unit 507. Specifically, the output unit 508 transmits the skeleton information at the first time point modified by the modification unit 507 to the client device 202. Specifically, the output unit 508 displays the skeleton information at the first time point corrected by the correction unit 507 on the display. Thereby, the output unit 508 can make available the position of each part of the subject.
- FIG. 6 is an explanatory diagram showing the flow of operations of the information processing device 100.
- the information processing apparatus 100 acquires a plurality of multi-view images 600 obtained by capturing the subject at different times and from different angles.
- the information processing apparatus 100 detects an area in which a target person appears in each multi-view image 600 by performing a person detection process on each of the plurality of multi-view images 600.
- the information processing device 100 performs 2D (Dimension) pose estimation processing on each multi-view image 600 at each time point.
- the information processing device 100 performs 2D pose estimation processing on each multi-view image 600 at each time point, thereby generating a 2D heat map 601 that indicates the distribution of the existence probability of each joint of the subject in each multi-view image 600.
- the 2D heat map 601 includes, for example, joint likelihoods indicating the probability of existence of any joint of the subject at each point in the 2D space corresponding to the multi-view image 600.
- the information processing device 100 identifies the 2D coordinates of the subject's joints in each multi-view image 600 based on the 2D heat map 601 that indicates the distribution of the existence probability of each joint of the subject in the multi-view image 600 at each point in time.
- the variance of the joint likelihood in the 2D heat map 601, which indicates the probability of the existence of a joint of the subject, can be treated as an index value indicating the accuracy of the specified 2D coordinates.
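As an illustrative sketch of how a 2D coordinate and its accuracy index could be read off a heat map: take the likelihood-weighted mean as the joint's 2D coordinate and the likelihood variance around it as the accuracy index, so a sharper peak yields a smaller (better) variance. The heat maps below are toy values:

```python
def peak_and_spread(heatmap):
    """Likelihood-weighted mean of a 2D heat map as the joint's 2D coordinate,
    and the likelihood variance around it as an accuracy index."""
    total = my = mx = 0.0
    for y, row in enumerate(heatmap):
        for x, v in enumerate(row):
            total += v
            my += y * v
            mx += x * v
    my, mx = my / total, mx / total
    var = sum(v * ((y - my) ** 2 + (x - mx) ** 2)
              for y, row in enumerate(heatmap)
              for x, v in enumerate(row)) / total
    return (mx, my), var

sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
blurred = [[0.1, 0.1, 0.1], [0.1, 0.2, 0.1], [0.1, 0.1, 0.1]]
(c1, v1), (c2, v2) = peak_and_spread(sharp), peak_and_spread(blurred)
print(c1, v1 < v2)  # (1.0, 1.0) True
```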
- the information processing device 100 acquires placement information indicating the angle of each multi-view image 600 at each time point.
- the information processing device 100 performs 3D pose estimation processing based on the placement information and the 2D coordinates of each joint of the subject in each multi-view image 600 at each point in time, thereby identifying the 3D coordinates of each joint of the subject in 3D space.
- the information processing device 100 generates a 3D skeleton inference result 602 including 3D coordinates of each joint of the identified subject at each time point, and generates time series data of the 3D skeleton inference result 602.
- the information processing device 100 corrects the 3D skeleton inference result 602 by performing correction processing on the time series data of the 3D skeleton inference result 602.
- the information processing apparatus 100 outputs the time series data of the corrected 3D skeleton inference result 603 in a usable manner. For example, the information processing apparatus 100 outputs time-series data of the corrected 3D skeleton inference result 603 so that the user can refer to it.
- the user performs a predetermined analysis process based on the time series data of the corrected 3D skeleton inference result 603.
- the analysis process is, for example, scoring participants in an athletic competition.
- the user performs an analysis process to score the participants based on the time series data of the corrected 3D skeleton inference results 603.
- the target person is a medical institution visitor undergoing rehabilitation, or a medical institution visitor receiving a diagnosis of athletic ability such as walking ability.
- the analysis process is, for example, determining the effectiveness of rehabilitation, or diagnosing athletic ability or health status.
- the user, for example, determines the effectiveness of the rehabilitation of the medical institution visitor, or diagnoses the visitor's motor ability or health condition, based on the time-series data of the corrected 3D skeleton inference result 603.
- the information processing device 100 may perform the above-described analysis process based on the time-series data of the corrected 3D skeleton inference result 603.
- the information processing device 100 outputs the results of the analysis process so that the user can refer to them.
- the information processing apparatus 100 may output the time-series data of the corrected 3D skeleton inference result 603 to the analysis unit 502 that performs the above-described analysis process.
- the analysis unit 502 is included in a computer other than the information processing apparatus 100, for example. Thereby, the information processing apparatus 100 can perform analysis processing with high accuracy.
- Next, a specific example of the correction process will be described using FIGS. 7 to 15. Specifically, first, a specific example in which the information processing device 100 identifies an abnormal joint, determined to be in an abnormal state with respect to its 3D coordinates among the plurality of joints of the subject, will be described using FIGS. 7 and 8.
- FIGS. 7 and 8 are explanatory diagrams showing specific examples of identifying abnormal joints.
- the information processing apparatus 100 acquires time-series data of a plurality of original data 700.
- Original data 700 shows the subject's skeletal information.
- Original data 700 shows 3D coordinates of each of a plurality of joints of the subject.
- the 3D coordinates of each joint are indicated, for example, by the marks in the figure.
- the information processing device 100 generates processed data 701 by adding noise to the original data 700.
- the information processing device 100 generates the processed data 701 by changing the 3D coordinates of at least one of the plurality of joints of the subject indicated by the original data 700 to 3D coordinates that are determined to be in an abnormal state.
- the abnormal condition corresponds to, for example, a condition in which the 3D coordinates of a joint are incorrectly estimated.
- the abnormal state is jitter, inversion, swap, or miss.
- the information processing apparatus 100 can acquire time-series data of the processed data 701.
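A hedged sketch of such noise addition, covering three of the named corruption types (jitter, swap, miss); the joint names, the left/right pair chosen for swapping, and the noise magnitude are illustrative assumptions:

```python
import random

def add_noise(frame, rng, jitter=0.05):
    """Corrupt one frame of original skeletal data: 'jitter' perturbs a joint's
    3D coordinates, 'swap' exchanges the left/right hand pair, and 'miss'
    drops a joint entirely. Returns the processed frame and the noise type."""
    noisy = dict(frame)
    kind = rng.choice(["jitter", "swap", "miss"])
    if kind == "jitter":
        j = rng.choice(sorted(noisy))
        x, y, z = noisy[j]
        noisy[j] = (x + rng.uniform(-jitter, jitter),
                    y + rng.uniform(-jitter, jitter),
                    z + rng.uniform(-jitter, jitter))
    elif kind == "swap":
        noisy["left_hand"], noisy["right_hand"] = noisy["right_hand"], noisy["left_hand"]
    else:  # miss
        del noisy[rng.choice(sorted(noisy))]
    return noisy, kind

rng = random.Random(0)
frame = {"left_hand": (0.3, 0.0, 1.0), "right_hand": (-0.3, 0.0, 1.0)}
noisy, kind = add_noise(frame, rng)
print(kind in {"jitter", "swap", "miss"})  # True
```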
- the information processing device 100 learns the abnormality determination DNN 710 using the time-series data of the processed data 701. For example, the abnormality determination DNN 710 has a function of outputting, in accordance with the input of the feature amounts of the 3D skeleton inference results 602 in the time-series data, the abnormality probability of each joint of the subject in at least one of the 3D skeleton inference results 602.
- the abnormality probability indicates the probability that the 3D coordinates of the joints of the subject are in a positionally abnormal state.
- the abnormality determination DNN 710 may have a function of outputting the abnormality probability of each joint of the subject over the entire time-series data in response to input of the feature amounts of the 3D skeleton inference results 602 in the time-series data. Next, the explanation will move on to FIG. 8.
- the information processing device 100 inputs the feature amount of the 3D skeleton inference result 602 in the time series data of the 3D skeleton inference result 602 to the abnormality determination DNN 710.
- the information processing device 100 obtains the abnormality probability of each joint of the subject in each 3D skeleton inference result 602 output by the abnormality determination DNN 710 in response to the input.
- the information processing device 100 identifies abnormal joints based on the acquired abnormality probability of each joint of the subject. For example, the information processing device 100 specifies, as an abnormal joint, any joint for which the obtained abnormality probability is equal to or greater than a threshold value among the plurality of joints of the subject.
- the information processing device 100 uses the abnormality determination DNN 710 to identify an abnormal joint, but the present invention is not limited to this.
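For instance, one conceivable rule-based alternative scores each joint by how far the bone lengths touching it deviate from reference lengths; the reference values, tolerance, and saturating scoring function below are illustrative assumptions, not taken from the patent:

```python
# assumed reference bone lengths in meters (illustrative values)
REFERENCE_BONES = {("neck", "head"): 0.25, ("left_shoulder", "left_elbow"): 0.30}

def rule_based_abnormality(frame, tolerance=0.2):
    """Score each joint by how far the bone lengths touching it deviate from
    the reference lengths; the abnormality probability saturates at 1.0."""
    probs = {}
    for (a, b), ref in REFERENCE_BONES.items():
        length = sum((pa - pb) ** 2 for pa, pb in zip(frame[a], frame[b])) ** 0.5
        deviation = abs(length - ref) / ref
        p = min(1.0, deviation / (2 * tolerance))
        for joint in (a, b):
            probs[joint] = max(probs.get(joint, 0.0), p)
    return probs

frame = {"neck": (0.0, 0.0, 1.5), "head": (0.0, 0.0, 1.75),
         "left_shoulder": (0.2, 0.0, 1.45), "left_elbow": (0.2, 0.0, 0.9)}
probs = rule_based_abnormality(frame)
print(probs["head"], probs["left_elbow"])  # 0.0 1.0
```

Here the neck-head bone matches its reference length exactly, while the stretched shoulder-elbow bone drives its joints' scores to the maximum.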
- the information processing device 100 may identify abnormal joints based on rules. Specifically, the information processing device 100 may store a rule for calculating the abnormality probability of each joint in the 3D skeleton inference result 602 according to the magnitude of the difference between the feature amount for that joint and a threshold value. Specifically, the information processing device 100 may calculate the abnormality probability of each joint with reference to the stored rule, and identify any joint whose calculated abnormality probability is greater than or equal to a threshold value as an abnormal joint. Next, a specific example in which the information processing apparatus 100 generates a Factor Graph will be described using FIG. 9.
- FIG. 9 is an explanatory diagram showing a specific example of generating a Factor Graph.
- the information processing device 100 includes a state estimation DNN 900.
- the state estimation DNN 900 has a function of outputting the type of movement of the subject in at least one of the 3D skeleton inference results 602 according to the input of the feature amount of the 3D skeleton inference result 602 in the time series data of the 3D skeleton inference result 602. have
- the state estimation DNN 900 has a function of outputting the type of movement of the subject in the entire time-series data in response to the input of the feature amount of the 3D skeleton inference result 602 in the time-series data of the 3D skeleton inference result 602. You may do so.
- the information processing device 100 includes a Factor Graph definition DB (DataBase) 910.
- the Factor Graph definition DB 910 stores a Factor Graph template 911 for each type of exercise of the subject.
- the template 911 includes, for example, nodes indicating the positions of the joints of the subject, first edges connecting nodes that indicate the positions of different biologically connected joints, and second edges connecting nodes that indicate the position of the same joint at different times.
- the first edge may be associated with a constraint on the distance between joints.
- the inter-articular distance is, for example, the length of a bone.
- the Factor Graph definition DB 910 stores a template 911 corresponding to the exercise type "jump”, a template 911 corresponding to the exercise type "lying down”, and the like.
- the second edge connects, for example, nodes indicating the position of any joint corresponding to the type of movement of the subject, for each type of movement of the subject. In other words, the second edge connects nodes indicating different joint positions depending on the type of movement of the subject, for example.
- the second edge is associated with a distribution model.
- for example, a second edge connecting nodes that indicate the position of a certain joint is associated with a distribution model representing a probability distribution that constrains the temporal change in the position of that joint, according to the movement tendency of the joint corresponding to the type of movement. For example, if the type of motion is "jump", the tendency corresponds to uniform linear motion. For example, if the type of exercise is "lying down", the tendency corresponds to equipositional motion.
- the information processing device 100 uses the state estimation DNN 900 to identify the type of movement of the subject in each 3D skeleton inference result 602.
- the information processing device 100 refers to the Factor Graph definition DB 910 and selects the template 911 corresponding to the type of movement of the subject in each 3D skeleton inference result 602 as the Factor Graph to be used.
- Next, specific examples of the Factor Graph template 911 will be described using FIGS. 10 and 11.
- FIG. 10 is an explanatory diagram showing a specific example of a Factor Graph template 911 corresponding to "jump".
- the template 911 includes, for example, nodes indicating the positions of each joint of the subject.
- the template 911 includes, for example, nodes indicating the positions of the subject's head, upper cervical vertebrae, lower cervical vertebrae, thoracic vertebrae, lumbar vertebrae, left and right hip joints, left and right knee joints, left and right leg joints, left and right feet, left and right shoulder joints, left and right elbow joints, and left and right hands.
- nodes indicating the positions of the subject's lower cervical vertebrae at different times are connected by a second edge 1001. Furthermore, nodes indicating the positions of the subject's thoracic vertebrae at different times are connected by a second edge 1001. Furthermore, nodes indicating the positions of the subject's lumbar vertebrae at different times are connected by a second edge 1001.
- nodes indicating the positions of the subject's left hip joint at different times are connected by a second edge 1001. Furthermore, nodes indicating the positions of the subject's right hip joint at different times are connected by a second edge 1001.
- Each second edge is associated with a Pairwise Term distribution model that indicates a time series constraint corresponding to uniform linear motion.
- the Pairwise Term is, for example, g_t(x_{j,t-1}, x_{j,t}) ∼ N(x_{j,t} − x_{j,t-1}; v̄_j·Δt, σ_vj²), that is, the displacement of the joint between consecutive time points is assumed to follow a normal distribution with mean v̄_j·Δt and variance σ_vj².
- x_{j,t-1} is the estimated position of joint j at time t-1.
- x_{j,t} is the estimated position of joint j at time t.
- v̄_j is the average velocity of joint j.
- Δt is a unit time width.
- σ_vj² is the velocity variance of joint j.
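Under this reading of the Pairwise Term, its log-density can be evaluated as follows; this is a sketch assuming scalar (one-coordinate) positions, whereas the patent's positions are 3D, and the example values are illustrative:

```python
import math

def pairwise_term_logpdf(x_prev, x_curr, v_bar, dt, sigma_v):
    """Log-density of the uniform-linear-motion Pairwise Term: the residual
    x_{j,t} - x_{j,t-1} - v_bar*dt is scored under N(0, sigma_v^2)."""
    r = x_curr - x_prev - v_bar * dt
    return -0.5 * (r / sigma_v) ** 2 - math.log(sigma_v * math.sqrt(2.0 * math.pi))

# a step consistent with the average velocity scores higher than an outlier
consistent = pairwise_term_logpdf(1.0, 2.0, v_bar=1.0, dt=1.0, sigma_v=0.1)
outlier = pairwise_term_logpdf(1.0, 5.0, v_bar=1.0, dt=1.0, sigma_v=0.1)
print(consistent > outlier)  # True
```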
- when the type of exercise is "jump", it is considered that the positions of the joints in the trunk tend to change regularly over time.
- accordingly, when the type of movement is "jump", the template 911 can constrain the temporal change in the position of the trunk joints, whose positional changes are considered easy to predict, by assuming uniform linear motion. Next, the description will move on to FIG. 11.
- FIG. 11 is an explanatory diagram showing a specific example of a Factor Graph template 911 corresponding to "lay down".
- the template 911 includes, for example, nodes indicating the positions of each joint of the subject.
- the template 911 includes, for example, nodes indicating the positions of the subject's head, upper cervical vertebrae, lower cervical vertebrae, thoracic vertebrae, lumbar vertebrae, left and right hip joints, left and right knee joints, left and right leg joints, left and right feet, left and right shoulder joints, left and right elbow joints, and left and right hands.
- nodes indicating the positions of the subject's head at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's upper cervical vertebrae at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's lower cervical vertebrae at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's thoracic vertebrae at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's lumbar vertebrae at different times are connected by a second edge 1101.
- nodes indicating the positions of the subject's left hip joint at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's right hip joint at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's left knee joint at different times are connected by a second edge 1101. Further, nodes indicating the positions of the subject's right knee joint at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's left leg joints at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's right leg joints at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's left foot at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's right foot at different times are connected by a second edge 1101.
- nodes indicating the positions of the subject's left shoulder joint at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's right shoulder joint at different times are connected by a second edge 1101. Further, nodes indicating the positions of the subject's left elbow joint at different times are connected by a second edge 1101. Furthermore, nodes indicating the positions of the subject's right elbow joint at different times are connected by a second edge 1101. Further, nodes indicating the positions of the subject's left wrist at different times are connected by a second edge 1101. Further, nodes indicating the positions of the subject's right wrist at different times are connected by a second edge 1101.
- nodes indicating the positions of the subject's left hand at different times are connected by a second edge 1101.
- nodes indicating the positions of the subject's right hand at different times are connected by a second edge 1101.
- Some of the second edges 1101 are omitted from the illustration for convenience of drawing.
- Each second edge is associated with a Pairwise Term, a probability distribution model that indicates a time-series constraint corresponding to equipositional motion.
- The Pairwise Term is, for example, g_t(x_{j,t-1}, x_{j,t}) ~ N(x_{j,t} | x_{j,t-1}, σ_{xj}²), where σ_{xj}² is the variance of the joint position.
- The template 911 assumes equipositional movement for the joints of the whole body whose positional changes over time are considered easy to predict when the type of exercise is "lying down", and can thereby constrain the temporal change in the position of those joints.
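The equipositional Pairwise Term above is a Gaussian on the frame-to-frame displacement of a joint. The following is a minimal sketch of the corresponding negative log-likelihood, assuming the per-coordinate form g_t(x_{j,t-1}, x_{j,t}) ~ N(x_{j,t} | x_{j,t-1}, σ²); the function and parameter names are illustrative, not taken from the patent.

```python
import math

def equipositional_pairwise_term(x_prev, x_curr, sigma):
    """Negative log-likelihood of a Gaussian 'same position' constraint.

    Penalizes the displacement of one joint between consecutive frames,
    per coordinate. x_prev / x_curr are 3D coordinates (e.g. in mm);
    sigma is the joint-position standard deviation.
    """
    nll = 0.0
    for a, b in zip(x_prev, x_curr):
        d = b - a
        nll += 0.5 * (d / sigma) ** 2 + math.log(sigma * math.sqrt(2 * math.pi))
    return nll

# A joint that barely moves incurs a smaller penalty than one that jumps.
still = equipositional_pairwise_term((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), sigma=10.0)
jump = equipositional_pairwise_term((0.0, 0.0, 0.0), (50.0, 0.0, 0.0), sigma=10.0)
```

Minimizing the sum of such terms over all second edges keeps "lying down" joints near their previous positions.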
- FIG. 12 is an explanatory diagram showing a specific example of adding time series constraints.
- The information processing device 100 determines whether a leaf node in the selected Factor Graph, that is, a node to which no second edge is connected and exactly one first edge is connected, is a node indicating the position of the identified abnormal joint. If the leaf node indicates the position of the identified abnormal joint, the information processing apparatus 100 connects the leaf nodes at different times with a third edge 1201. Thereby, the information processing device 100 can accurately correct the position of the abnormal joint.
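The leaf-node rule just described can be sketched as follows; the data layout (a node as a `(time, joint)` pair and an edge as a pair of nodes) and all names are assumptions for illustration only.

```python
def add_third_edges(times, joints, first_edges, second_edges, abnormal):
    """If a joint is flagged abnormal and its node at every time touches
    no second edge and exactly one first edge (a leaf), connect its
    nodes at consecutive times with a third edge."""
    def degree(node, edges):
        # Count edges incident to the given (time, joint) node.
        return sum(node in e for e in edges)

    third_edges = []
    for j in joints:
        if j not in abnormal:
            continue
        is_leaf = all(
            degree((t, j), first_edges) == 1 and degree((t, j), second_edges) == 0
            for t in times
        )
        if is_leaf:
            for t_prev, t_curr in zip(times, times[1:]):
                third_edges.append(((t_prev, j), (t_curr, j)))
    return third_edges

# Example: an elbow-wrist chain over two frames; the wrist "w" is a leaf.
times = [0, 1]
first = [((0, "e"), (0, "w")), ((1, "e"), (1, "w"))]
second = [((0, "e"), (1, "e"))]  # elbow is already constrained in time
edges = add_third_edges(times, ["e", "w"], first, second, abnormal={"w"})
```

Only the abnormal leaf joint gains a cross-time edge; already-constrained joints are left alone.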
- Next, a case in which the information processing apparatus 100 modifies the 3D skeleton inference result 602 using the selected Factor Graph 1300 will be described with reference to FIG. 13.
- FIG. 13 is an explanatory diagram showing a specific example of correcting the 3D skeleton inference result 602.
- the information processing apparatus 100 uses the selected Factor Graph 1300 to modify the 3D skeleton inference result 602.
- the Factor Graph 1300 includes a node group 1310 corresponding to time t-1, a node group 1320 corresponding to time t, and so on.
- the node group 1310 includes nodes 1311 to 1313 and the like.
- the node group 1320 includes nodes 1321 to 1323 and the like.
- node 1311 and node 1312 are connected by first edge 1331.
- node 1312 and node 1313 are connected by first edge 1332.
- node 1321 and node 1322 are connected by first edge 1341.
- node 1322 and node 1323 are connected by first edge 1342.
- the first edge 1342 that connects the node 1322 and the node 1323 may be associated with a Pairwise Term that indicates a bone length constraint.
- the node 1312 and the node 1322 are connected by the second edge 1351.
- The second edge 1351 is associated with, for example, a Pairwise Term indicating the above-mentioned time-series constraint corresponding to the type of movement of the subject.
- node 1311 and node 1321 are connected by third edge 1361.
- The third edge 1361 may be associated with, for example, a Pairwise Term indicating the above-described time-series constraint.
- The information processing device 100 may associate a Unary Term with a node indicating the position of at least one joint of the Factor Graph 1300.
- The Unary Term is, for example, f(x_j) ~ N(x_j | x̄_j, σ_{3D,j}²), where x̄_j is a weighted sum based on the joint likelihoods of a 3D heat map that integrates the joint likelihoods of a plurality of 2D heat maps, and σ_{3D,j}² is the variance of the joint likelihood of that 3D heat map.
- The information processing device 100 may also associate, with a node indicating the position of at least one joint of the Factor Graph 1300, a Unary Term indicating an abnormal-joint constraint that acts to constrain the position of the joint according to the abnormality probability of the joint.
- The information processing apparatus 100 may associate, for example, a Unary Term including the abnormality probability of joint 1 with the node 1321 indicating the position of joint 1 in the node group 1320.
- This Unary Term is, for example, of the form f(x_j) ~ N(x_j | …), where p(x_j) is the abnormality probability of the joint.
- the information processing device 100 corrects the position of each joint at each time point based on the Unary Term and Pairwise Term in the Factor Graph 1300.
- the information processing device 100 corrects the position of each joint at each time point, for example, by optimizing the Factor Graph 1300.
- the information processing device 100 can accurately correct the 3D skeleton inference result 602.
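The optimization step can be pictured as minimizing the sum of all Unary and Pairwise Terms. Below is a toy 1-D sketch under that reading, using plain gradient descent in place of a real factor-graph solver; every name and parameter value is illustrative, not from the patent.

```python
def optimize_positions(observed, sigma_obs, sigma_time, iters=500, lr=0.1):
    """Toy 1-D MAP correction: each Unary Term pulls an estimate toward
    its observation, each Pairwise Term penalizes frame-to-frame change
    (an equipositional time-series constraint). Gradient descent stands
    in for the factor-graph optimizer."""
    x = list(observed)
    for _ in range(iters):
        grad = [0.0] * len(x)
        for t in range(len(x)):
            grad[t] += (x[t] - observed[t]) / sigma_obs ** 2      # unary term
            if t > 0:
                grad[t] += (x[t] - x[t - 1]) / sigma_time ** 2    # pairwise, left
            if t < len(x) - 1:
                grad[t] += (x[t] - x[t + 1]) / sigma_time ** 2    # pairwise, right
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x

# An outlier frame is pulled back toward its temporal neighbours.
raw = [0.0, 0.0, 40.0, 0.0, 0.0]   # frame 2 is a spike (e.g. a mis-inferred joint)
fixed = optimize_positions(raw, sigma_obs=5.0, sigma_time=2.0)
```

The smaller sigma_time is relative to sigma_obs, the more strongly the time-series constraint smooths the spike.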
- The information processing device 100 can accurately identify the position of each joint at each point in time. For example, the information processing device 100 can identify the position of each joint of the subject at each point in time with relatively high accuracy even when the subject performs relatively high-speed or relatively complex movements such as gymnastics.
- Here, Comparative Example 1 can be considered, in which the 3D coordinates of the subject's joints are corrected using a Factor Graph that does not include Pairwise Terms indicating time-series constraints.
- In Comparative Example 1, because changes in the positions of the joints over time cannot be constrained, it is difficult to accurately correct the 3D coordinates of each joint of the subject, and it may be difficult to accurately identify temporal changes in those coordinates.
- In contrast, the information processing device 100 can use a Factor Graph 1300 that includes Pairwise Terms indicating time-series constraints. Therefore, the information processing apparatus 100 can appropriately correct the 3D coordinates of each joint of the subject. For example, the information processing device 100 can appropriately modify the 3D coordinates of the subject's joints at each point in time so that the change from the 3D coordinates at one point in time to the 3D coordinates at the next point in time does not intuitively appear wrong to a person.
- Further, Comparative Example 2 can be considered, in which the 3D coordinates of the subject's joints are corrected using a Factor Graph that includes a Pairwise Term indicating a predetermined, fixed time-series constraint.
- In Comparative Example 2, the Pairwise Term indicating the time-series constraint cannot be changed dynamically according to the condition of the subject, such as the type of movement. Consequently, it is difficult to accurately correct the three-dimensional coordinates of each joint of the subject, and it may be difficult to accurately identify temporal changes in those coordinates.
- In contrast, the information processing device 100 can set the Factor Graph 1300 by selectively using a plurality of Factor Graph templates 911 that include Pairwise Terms indicating different time-series constraints, depending on the type of motion of the subject.
- The information processing device 100 can, for example, use Pairwise Terms indicating time-series constraints corresponding to uniform motion, uniform linear motion, uniform acceleration motion, and the like, depending on the type of motion of the subject.
- The information processing device 100 can connect second edges, corresponding to Pairwise Terms indicating time-series constraints, to nodes indicating the 3D coordinates of different joints, depending on the type of motion of the subject.
- Therefore, the information processing device 100 can appropriately correct the 3D coordinates of each joint of the subject. For example, the information processing device 100 can appropriately modify the 3D coordinates of the subject's joints at each point in time so that the change from the 3D coordinates at one point in time to the 3D coordinates at the next point in time does not intuitively appear wrong to a person.
- Next, a specific example of the flow of data processing in the operation example will be described using FIGS. 14 and 15.
- FIGS. 14 and 15 are explanatory diagrams showing a specific example of the flow of data processing in the operation example.
- the information processing apparatus 100 acquires a plurality of camera images 1401 at each time point.
- the information processing device 100 stores a 2D skeleton inference model 1410.
- the information processing device 100 stores, for example, weight parameters that define a neural network that becomes the 2D skeleton inference model 1410.
- The information processing apparatus 100 refers to the 2D skeleton inference model 1410 at each point in time and performs 2D skeleton inference processing on each of the plurality of camera images 1401, thereby generating 2D skeleton inference results 1402.
- the 2D skeleton inference result 1402 includes, for example, 2D coordinates (x[pixel], y[pixel]) indicating the joint position and a likelihood indicating the certainty of the joint position.
- the information processing device 100 stores a 3D skeleton inference model 1420.
- the information processing device 100 stores, for example, weight parameters that define a neural network that becomes the 3D skeleton inference model 1420.
- the information processing device 100 generates 3D skeleton inference results 1403 by referring to the 3D skeleton inference model 1420 and performing 3D skeleton inference processing on a plurality of 2D skeleton inference results 1402 at each time point.
- the 3D skeleton inference result 1403 includes, for example, 3D coordinates (x [mm], y [mm], z [mm]) indicating the positions of joints.
- the information processing apparatus 100 generates time series data 1404 that summarizes the 3D skeleton inference results 1403 for each time point. Next, the explanation will move on to FIG. 15.
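The aggregation step just described, collecting per-frame 3D skeleton inference results into time-series data, can be sketched as follows; the container shapes, field names, and joint names are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Joint3D:
    """One joint of a per-frame 3D skeleton inference result (coordinates in mm)."""
    name: str
    x: float
    y: float
    z: float

def to_time_series(per_frame_results):
    """Summarize per-frame 3D results into time-series data keyed by
    joint name, each entry a (time index, (x, y, z)) pair."""
    series = {}
    for t, joints in enumerate(per_frame_results):
        for j in joints:
            series.setdefault(j.name, []).append((t, (j.x, j.y, j.z)))
    return series

# Two frames of a single tracked joint.
frames = [
    [Joint3D("head", 0.0, 0.0, 1700.0)],
    [Joint3D("head", 1.0, 0.0, 1698.0)],
]
ts = to_time_series(frames)
```

Keying by joint makes the later per-joint time-series constraints straightforward to apply.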
- the information processing device 100 stores a motion state estimation model 1510.
- the information processing device 100 stores, for example, weight parameters that define a neural network serving as the motion state estimation model 1510.
- The information processing device 100 refers to the motion state estimation model 1510 and performs motion state estimation processing on the time-series data 1404, thereby estimating the type of motion of the subject and generating a motion state estimation result 1501 that includes the estimated type of motion.
- the information processing device 100 stores a Factor Graph definition DB 1520.
- The Factor Graph definition DB 1520 stores, for each type of movement, a Factor Graph template corresponding to that type of movement, including Pairwise Terms indicating time-series constraints. A Pairwise Term indicates, for example, that the temporal change in the position of a joint corresponding to a type of exercise is constrained according to the movement tendency of the subject corresponding to that type of exercise.
- The Factor Graph definition DB 1520 associates, for example, the type of movement, the type of joint of the subject, and the movement tendency of the subject's joint corresponding to the type of movement.
- the movement tendency is, for example, uniform motion, uniform linear motion, uniform acceleration motion, etc.
- the information processing device 100 refers to the Factor Graph definition DB 1520 and selects a Factor Graph template corresponding to the estimated type of movement of the subject included in the movement state estimation result 1501 as the Factor Graph to be used.
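A minimal in-memory stand-in for this template lookup might look like the following; the table contents, motion-type keys, and joint names are hypothetical and only illustrate the association of motion type, joint, and movement tendency described above.

```python
# Hypothetical stand-in for the Factor Graph definition DB: per motion
# type, which joints receive which time-series movement tendency.
FACTOR_GRAPH_DEFS = {
    "lying_down": {"tendency": {"head": "equipositional", "wrist": "equipositional"}},
    "walking": {"tendency": {"head": "uniform_velocity", "wrist": "uniform_velocity"}},
}

def select_template(motion_type, default="walking"):
    """Pick the Factor Graph template for the estimated motion type,
    falling back to a default when the type is not in the table."""
    return FACTOR_GRAPH_DEFS.get(motion_type, FACTOR_GRAPH_DEFS[default])

tmpl = select_template("lying_down")
```

The selected template then determines which Pairwise Terms are attached to the second edges.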
- The bone length model 1530 includes parameters that define a Pairwise Term representing bone length constraints. The parameters are, for example, the average and variance of bone lengths.
- the information processing device 100 refers to the bone length model 1530 and adds a Pairwise Term indicating bone length constraints to the selected Factor Graph.
- the information processing device 100 corrects the position of each joint by performing optimization processing on the assigned Factor Graph.
- The information processing apparatus 100 generates a corrected 3D skeleton inference result 1502 that includes the corrected position of each joint. Thereby, the information processing device 100 can accurately specify the position of each joint of the subject at each point in time.
- the overall processing is realized by, for example, the CPU 301 shown in FIG. 3, storage areas such as the memory 302 and the recording medium 305, and the network I/F 303.
- FIG. 16 is a flowchart illustrating an example of the overall processing procedure.
- the information processing apparatus 100 acquires time-series data of the three-dimensional skeleton inference results of the subject (step S1601).
- the information processing apparatus 100 then calculates the likelihood of each part of the subject based on the acquired time series data of the three-dimensional skeleton inference results of the subject (step S1602).
- the information processing device 100 estimates the exercise state of the subject at each time point based on the acquired time-series data of the three-dimensional skeleton inference results of the subject (step S1603).
- the information processing apparatus 100 selects a Factor Graph corresponding to the estimated exercise state of the subject for each time point (step S1604).
- the information processing device 100 corrects the time series data of the three-dimensional skeleton inference result of the subject by optimizing the Factor Graph (step S1607). Then, the information processing apparatus 100 outputs time-series data of the corrected three-dimensional skeleton inference results of the subject (step S1608). After that, the information processing device 100 ends the entire process.
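The flow of the flowchart above (acquire, estimate the motion state, select a Factor Graph, optimize, output) can be sketched as a pipeline with each stage injected as a callable; the callables, their signatures, and the dummy stand-ins are assumptions for illustration only.

```python
def overall_process(time_series, estimate_state, select_graph, optimize):
    """Skeleton of the overall procedure: per-time-point state
    estimation, graph selection, then a single optimization pass that
    corrects the whole time series."""
    states = [estimate_state(frame) for frame in time_series]  # one per time point
    graphs = [select_graph(s) for s in states]
    corrected = optimize(time_series, graphs)
    return corrected

# Trivial stand-ins show the data flow; the clamp is a dummy "correction".
out = overall_process(
    [1.0, 5.0, 1.0],
    estimate_state=lambda f: "walking",
    select_graph=lambda s: {"type": s},
    optimize=lambda ts, gs: [min(v, 2.0) for v in ts],
)
```

A real implementation would replace each lambda with the corresponding model or solver from FIGS. 14 and 15.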
- the information processing device 100 can accurately correct the three-dimensional skeleton inference result of the subject. Therefore, the information processing device 100 can improve the usefulness of the three-dimensional skeleton inference result of the subject.
- the information processing device 100 can, for example, improve the accuracy of analysis processing based on the 3D skeleton inference results of the subject.
- The information processing apparatus 100 may execute some of the steps in FIG. 16 in a different order. For example, the order of the processing in steps S1605 and S1606 can be changed. Further, the information processing apparatus 100 may omit the processing of some steps in FIG. 16; for example, the processing in step S1605 can be omitted.
- According to the information processing device 100, time-series data of skeletal information including the position of each of a plurality of body parts of a subject can be acquired.
- According to the information processing device 100, it is possible to specify the type of motion of the subject corresponding to the skeletal information at the first point in time in the acquired time-series data, based on the feature amount of the skeletal information in the acquired time-series data.
- According to the information processing device 100, it is possible to determine a probability distribution model that constrains the temporal change in the position of any one of the plurality of parts in the skeletal information at the first point in time in the acquired time-series data, according to the movement tendency of that part corresponding to the specified type of motion.
- According to the information processing apparatus 100, it is possible to generate a graph including nodes indicating the positions of the respective parts at each point in time. According to the information processing apparatus 100, it is possible to provide first edges that connect nodes indicating the positions of biologically connected different parts at each point in time in the graph. According to the information processing apparatus 100, it is possible to provide second edges that connect nodes indicating the positions of the one part at different times in the graph. According to the information processing apparatus 100, it is possible to associate the determined model with the second edges in the graph. According to the information processing apparatus 100, the skeletal information at the first point in time in the time-series data can be corrected based on the generated graph. Thereby, the information processing device 100 can accurately correct the skeletal information at the first point in time.
- According to the information processing device 100, it is possible to determine a probability distribution model that constrains the temporal change in the position of the one part in the skeletal information at the first point in time according to the tendency of that part toward equipositional motion, uniform velocity motion, or uniform acceleration motion corresponding to the specified type of motion. Thereby, the information processing apparatus 100 can determine a model that allows the skeletal information at the first point in time to be appropriately modified according to the type of motion.
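The three tendencies can be expressed as finite-difference residuals whose squares act as the time-series pairwise terms: equipositional motion penalizes displacement, uniform velocity penalizes change in velocity, and uniform acceleration penalizes change in acceleration. A toy 1-D sketch follows; the function and key names are illustrative.

```python
def tendency_residual(xs, tendency):
    """Finite-difference residuals over a 1-D position track xs.

    "equipositional" returns first differences (displacement),
    "uniform_velocity" second differences (change in velocity),
    "uniform_acceleration" third differences (change in acceleration).
    """
    d1 = [b - a for a, b in zip(xs, xs[1:])]   # velocity
    d2 = [b - a for a, b in zip(d1, d1[1:])]   # acceleration
    d3 = [b - a for a, b in zip(d2, d2[1:])]   # jerk
    return {
        "equipositional": d1,
        "uniform_velocity": d2,
        "uniform_acceleration": d3,
    }[tendency]

# A constant-velocity track has zero uniform-velocity residuals but
# nonzero equipositional residuals.
track = [0.0, 2.0, 4.0, 6.0]
```

Choosing the residual order per joint is what lets one template suit "lying down" and another suit faster motions.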
- According to the information processing device 100, it is possible to determine, for another part other than the one part among the plurality of parts, whether exactly one first edge is connected to each node indicating the position of the other part at different times. According to the information processing device 100, if exactly one first edge is connected to each such node, a third edge connecting those nodes can be included in the generated graph. Thereby, the information processing apparatus 100 can increase the number of edges connected to a node and can accurately correct the position of the other part indicated by the node.
- According to the information processing device 100, it is possible to identify, among the plurality of parts, another part other than the one part that is in an abnormal state with respect to its position. According to the information processing apparatus 100, for the identified other part, if exactly one first edge is connected to each node indicating the position of the other part at different points in time, a third edge connecting those nodes can be included in the generated graph. Thereby, the information processing apparatus 100 can specify another part that should preferably be corrected, and can accurately correct the position of the specified other part.
- the information processing method described in this embodiment can be realized by executing a program prepared in advance on a computer such as a PC or a workstation.
- the information processing program described in this embodiment is recorded on a computer-readable recording medium, and executed by being read from the recording medium by the computer.
- the recording medium includes a hard disk, a flexible disk, a CD (Compact Disc)-ROM, an MO (Magneto Optical Disc), a DVD (Digital Versatile Disc), and the like.
- the information processing program described in this embodiment may be distributed via a network such as the Internet.
- 100 Information processing device
- 101 Skeletal information
- 110 Graph
- 111, 1311 to 1313, 1321 to 1323 Node
- 112, 1331, 1332, 1341, 1342 First edge
- 113, 1001, 1101, 1351 Second edge
- 200 Information processing system
- 201 Image capturing device
- 202 Client device
- 210 Network
- 300, 400 Bus
- 301, 401 CPU
- 302, 402 Memory
- 303, 403 Network I/F
- 304, 404 Recording medium I/F
- 305, 405 Recording medium
- 306 Display
- 307 Input device
- 406 Camera
- 500 Storage unit
- 501 Acquisition unit
- 502 Analysis unit
- 503 Learning unit
- 504 Specification unit
- 505 Determination unit
- 506 Generation unit
- 507 Correction unit
- 508 Output unit
- 600 Multi-view image
- 601 2D heat map
- 602, 603, 1403, 1502 3D skeleton inference result
- 700 Original data
- 701 Processed data
- 710 Abnormality determination DNN
- 900 State estimation DNN
- 910, 1520 Factor Graph definition DB
- 911 Template
- 1201, 1361 Third edge
- 1300 Factor Graph
- 1310, 1320 Node group
- 1401 Camera image
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Social Psychology (AREA)
- Psychiatry (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
Description
FIG. 1 is an explanatory diagram showing an example of the information processing method according to the embodiment. The information processing device 100 is a computer for enabling the position of a part of a subject to be identified with high accuracy. The subject is, for example, a person. The parts are, for example, the neck, the head, the right and left shoulders, the right and left elbows, the right and left hands, the right and left knees, and the right and left feet. A part is, for example, a joint. The position is, for example, three-dimensional coordinates.
Next, an example of an information processing system 200 to which the information processing device 100 shown in FIG. 1 is applied will be described with reference to FIG. 2.
Next, a hardware configuration example of the information processing device 100 will be described with reference to FIG. 3.
Next, a hardware configuration example of the image capturing device 201 will be described with reference to FIG. 4.
Since the hardware configuration example of the client device 202 is specifically the same as that of the information processing device 100 shown in FIG. 3, its description is omitted.
Next, a functional configuration example of the information processing device 100 will be described with reference to FIG. 5.
Next, an operation example of the information processing device 100 will be described with reference to FIGS. 6 to 15. First, the flow of the operation of the information processing device 100 will be described with reference to FIG. 6.
Next, an example of the overall processing procedure executed by the information processing device 100 will be described with reference to FIG. 16. The overall processing is realized by, for example, the CPU 301 shown in FIG. 3, storage areas such as the memory 302 and the recording medium 305, and the network I/F 303.
101 Skeletal information
110 Graph
111, 1311 to 1313, 1321 to 1323 Node
112, 1331, 1332, 1341, 1342 First edge
113, 1001, 1101, 1351 Second edge
200 Information processing system
201 Image capturing device
202 Client device
210 Network
300, 400 Bus
301, 401 CPU
302, 402 Memory
303, 403 Network I/F
304, 404 Recording medium I/F
305, 405 Recording medium
306 Display
307 Input device
406 Camera
500 Storage unit
501 Acquisition unit
502 Analysis unit
503 Learning unit
504 Specification unit
505 Determination unit
506 Generation unit
507 Correction unit
508 Output unit
600 Multi-view image
601 2D heat map
602, 603, 1403, 1502 3D skeleton inference result
700 Original data
701 Processed data
710 Abnormality determination DNN
900 State estimation DNN
910, 1520 Factor Graph definition DB
911 Template
1201, 1361 Third edge
1300 Factor Graph
1310, 1320 Node group
1401 Camera image
1402 2D skeleton inference result
1404 Time-series data
1410 2D skeleton inference model
1420 3D skeleton inference model
1501 Motion state estimation result
1510 Motion state estimation model
1530 Bone length model
Claims (6)
- An information processing program that causes a computer to execute a process comprising: acquiring time-series data of skeletal information including the position of each of a plurality of parts of a subject; specifying, based on a feature amount of the skeletal information in the acquired time-series data, a type of motion of the subject corresponding to the skeletal information at a first point in time in the acquired time-series data; determining a model of a probability distribution that constrains a temporal change in the position of any one of the plurality of parts in the skeletal information at the first point in time in the acquired time-series data, according to a movement tendency of the one part corresponding to the specified type of motion; generating a graph that includes nodes indicating the positions of the respective parts at each point in time, first edges connecting nodes indicating the positions of biologically connected different parts at each point in time, and second edges connecting nodes indicating the positions of the one part at different points in time, the determined model being associated with the second edges; and correcting the skeletal information at the first point in time in the time-series data based on the generated graph.
- The information processing program according to claim 1, wherein the determining process determines a model of a probability distribution that constrains the temporal change in the position of the one part in the skeletal information at the first point in time in the acquired time-series data according to a tendency of the one part toward equipositional motion, uniform velocity motion, or uniform acceleration motion corresponding to the specified type of motion.
- The information processing program according to claim 1 or 2, wherein the generating process generates the graph such that, for another part other than the one part among the plurality of parts, if exactly one first edge is connected to each node indicating the position of the other part at different points in time, a third edge connecting those nodes is included in the graph.
- The information processing program according to claim 3, causing the computer to further execute a process of identifying, among the plurality of parts, another part other than the one part that is in an abnormal state with respect to its position, wherein the generating process generates the graph such that, for the identified other part, if exactly one first edge is connected to each node indicating the position of the other part at different points in time, a third edge connecting those nodes is included in the graph.
- An information processing method in which a computer executes a process comprising: acquiring time-series data of skeletal information including the position of each of a plurality of parts of a subject; specifying, based on a feature amount of the skeletal information in the acquired time-series data, a type of motion of the subject corresponding to the skeletal information at a first point in time in the acquired time-series data; determining a model of a probability distribution that constrains a temporal change in the position of any one of the plurality of parts in the skeletal information at the first point in time in the acquired time-series data, according to a movement tendency of the one part corresponding to the specified type of motion; generating a graph that includes nodes indicating the positions of the respective parts at each point in time, first edges connecting nodes indicating the positions of biologically connected different parts at each point in time, and second edges connecting nodes indicating the positions of the one part at different points in time, the determined model being associated with the second edges; and correcting the skeletal information at the first point in time in the time-series data based on the generated graph.
- An information processing device comprising a control unit that: acquires time-series data of skeletal information including the position of each of a plurality of parts of a subject; specifies, based on a feature amount of the skeletal information in the acquired time-series data, a type of motion of the subject corresponding to the skeletal information at a first point in time in the acquired time-series data; determines a model of a probability distribution that constrains a temporal change in the position of any one of the plurality of parts in the skeletal information at the first point in time in the acquired time-series data, according to a movement tendency of the one part corresponding to the specified type of motion; generates a graph that includes nodes indicating the positions of the respective parts at each point in time, first edges connecting nodes indicating the positions of biologically connected different parts at each point in time, and second edges connecting nodes indicating the positions of the one part at different points in time, the determined model being associated with the second edges; and corrects the skeletal information at the first point in time in the time-series data based on the generated graph.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2022/016364 WO2023188217A1 (ja) | 2022-03-30 | 2022-03-30 | 情報処理プログラム、情報処理方法、および情報処理装置 |
JP2024510982A JPWO2023188217A1 (ja) | 2022-03-30 | 2022-03-30 | |
EP22935362.8A EP4502925A4 (en) | 2022-03-30 | 2022-03-30 | INFORMATION PROCESSING PROGRAM, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING DEVICE |
CN202280094203.4A CN118974771A (zh) | 2022-03-30 | 2022-03-30 | 信息处理程序、信息处理方法以及信息处理装置 |
US18/885,788 US20250014389A1 (en) | 2022-03-30 | 2024-09-16 | Non-transitory computer-readable recording medium storing information processing program, information processing method, and information processing device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2022/016364 WO2023188217A1 (ja) | 2022-03-30 | 2022-03-30 | 情報処理プログラム、情報処理方法、および情報処理装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/885,788 Continuation US20250014389A1 (en) | 2022-03-30 | 2024-09-16 | Non-transitory computer-readable recording medium storing information processing program, information processing method, and information processing device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023188217A1 true WO2023188217A1 (ja) | 2023-10-05 |
Family
ID=88199827
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/016364 WO2023188217A1 (ja) | 2022-03-30 | 2022-03-30 | 情報処理プログラム、情報処理方法、および情報処理装置 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20250014389A1 (ja) |
EP (1) | EP4502925A4 (ja) |
JP (1) | JPWO2023188217A1 (ja) |
CN (1) | CN118974771A (ja) |
WO (1) | WO2023188217A1 (ja) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170344829A1 (en) | 2016-05-31 | 2017-11-30 | Microsoft Technology Licensing, Llc | Skeleton -based action detection using recurrent neural network |
JP2019016106A (ja) * | 2017-07-05 | 2019-01-31 | 富士通株式会社 | 情報処理プログラム、情報処理装置、情報処理方法、及び情報処理システム |
JP2020042476A (ja) | 2018-09-10 | 2020-03-19 | 国立大学法人 東京大学 | 関節位置の取得方法及び装置、動作の取得方法及び装置 |
WO2021002025A1 (ja) | 2019-07-04 | 2021-01-07 | 富士通株式会社 | 骨格認識方法、骨格認識プログラム、骨格認識システム、学習方法、学習プログラムおよび学習装置 |
WO2021064942A1 (ja) | 2019-10-03 | 2021-04-08 | 富士通株式会社 | 評価方法、評価プログラムおよび情報処理システム |
CN112991656A (zh) * | 2021-02-04 | 2021-06-18 | 北京工业大学 | 基于姿态估计的全景监控下人体异常行为识别报警系统及方法 |
JP2021105887A (ja) * | 2019-12-26 | 2021-07-26 | 国立大学法人 東京大学 | 3dポーズ取得方法及び装置 |
CN113191230A (zh) * | 2021-04-20 | 2021-07-30 | 内蒙古工业大学 | 一种基于步态时空特征分解的步态识别方法 |
JP2021135877A (ja) * | 2020-02-28 | 2021-09-13 | Kddi株式会社 | 骨格追跡方法、装置およびプログラム |
-
2022
- 2022-03-30 EP EP22935362.8A patent/EP4502925A4/en active Pending
- 2022-03-30 WO PCT/JP2022/016364 patent/WO2023188217A1/ja active Application Filing
- 2022-03-30 CN CN202280094203.4A patent/CN118974771A/zh active Pending
- 2022-03-30 JP JP2024510982A patent/JPWO2023188217A1/ja active Pending
-
2024
- 2024-09-16 US US18/885,788 patent/US20250014389A1/en active Pending
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170344829A1 (en) | 2016-05-31 | 2017-11-30 | Microsoft Technology Licensing, Llc | Skeleton -based action detection using recurrent neural network |
JP2019016106A (ja) * | 2017-07-05 | 2019-01-31 | 富士通株式会社 | 情報処理プログラム、情報処理装置、情報処理方法、及び情報処理システム |
JP2020042476A (ja) | 2018-09-10 | 2020-03-19 | 国立大学法人 東京大学 | 関節位置の取得方法及び装置、動作の取得方法及び装置 |
WO2021002025A1 (ja) | 2019-07-04 | 2021-01-07 | 富士通株式会社 | 骨格認識方法、骨格認識プログラム、骨格認識システム、学習方法、学習プログラムおよび学習装置 |
WO2021064942A1 (ja) | 2019-10-03 | 2021-04-08 | 富士通株式会社 | 評価方法、評価プログラムおよび情報処理システム |
JP2021105887A (ja) * | 2019-12-26 | 2021-07-26 | 国立大学法人 東京大学 | 3dポーズ取得方法及び装置 |
JP2021135877A (ja) * | 2020-02-28 | 2021-09-13 | Kddi株式会社 | 骨格追跡方法、装置およびプログラム |
CN112991656A (zh) * | 2021-02-04 | 2021-06-18 | 北京工业大学 | 基于姿态估计的全景监控下人体异常行为识别报警系统及方法 |
CN113191230A (zh) * | 2021-04-20 | 2021-07-30 | 内蒙古工业大学 | 一种基于步态时空特征分解的步态识别方法 |
Non-Patent Citations (3)
Title |
---|
ISKAKOV, KARIM ET AL.: "Learnable triangulation of human pose", PROCEEDINGS OF THE IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, 2019 |
MOON, GYEONGSIK; CHANG, JU YONG; LEE, KYOUNG MU: "V2V-PoseNet: Voxel-to-voxel prediction network for accurate 3D hand and human pose estimation from a single depth map", PROCEEDINGS OF THE IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 2018 |
See also references of EP4502925A4 |
Also Published As
Publication number | Publication date |
---|---|
EP4502925A4 (en) | 2025-04-16 |
EP4502925A1 (en) | 2025-02-05 |
CN118974771A (zh) | 2024-11-15 |
US20250014389A1 (en) | 2025-01-09 |
JPWO2023188217A1 (ja) | 2023-10-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111488824A (zh) | 运动提示方法、装置、电子设备和存储介质 | |
CN104035557B (zh) | 一种基于关节活跃度的Kinect动作识别方法 | |
US11759126B2 (en) | Scoring metric for physical activity performance and tracking | |
US20090042661A1 (en) | Rule based body mechanics calculation | |
JPWO2018087933A1 (ja) | 情報処理装置、情報処理方法、およびプログラム | |
CN110232727A (zh) | 一种连续姿态动作评估智能算法 | |
US12315299B2 (en) | Motion recognition method, non-transitory computer-readable recording medium and information processing apparatus | |
US12067664B2 (en) | System and method for matching a test frame sequence with a reference frame sequence | |
Varshney et al. | RETRACTED ARTICLE: Rule-based multi-view human activity recognition system in real time using skeleton data from RGB-D sensor | |
US20250014215A1 (en) | Non-transitory computer-readable recording medium storing information processing program, information processing method, and information processing device | |
KR20230112636A (ko) | 정보 처리 장치, 정보 처리 방법 및 프로그램 | |
WO2023188217A1 (ja) | 情報処理プログラム、情報処理方法、および情報処理装置 | |
KR20220156062A (ko) | 역기구학에 기반한 관절 회전 추론들 | |
SIMOES et al. | Accuracy assessment of 2D pose estimation with MediaPipe for physiotherapy exercises | |
JPWO2023188216A5 (ja) | ||
Hachaj et al. | Heuristic Method for Calculating the Translation of Human Body Recordings Using Data from an Inertial Motion Capture Costume | |
JP2021099666A (ja) | 学習モデルの生成方法 | |
JP7199931B2 (ja) | 画像生成装置、画像生成方法及びコンピュータープログラム | |
JP7419993B2 (ja) | 信頼度推定プログラム、信頼度推定方法、および信頼度推定装置 | |
Slupczynski et al. | Analyzing Exercise Repetitions: YOLOv8-Enhanced Dynamic Time Warping Approach on InfiniteRep Dataset | |
WO2025054197A1 (en) | Image-to-3d pose estimation via disentangled representations | |
WO2025035128A2 (en) | Approaches to generating semi-synthetic training data for real-time estimation of pose and systems for implementing the same | |
WO2025054192A1 (en) | Unsupervised depth features for three-dimensional pose estimation | |
Aiman et al. | Workout Mentor: Improving posture in real-time using computer vision | |
Rishabh et al. | Enhancing Exercise Form: A Pose Estimation Approach with Body Landmark Detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22935362 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2024510982 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202280094203.4 Country of ref document: CN |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2022935362 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2022935362 Country of ref document: EP Effective date: 20241030 |