WO2021024691A1 - Image processing system, image processing program, and image processing method - Google Patents

Image processing system, image processing program, and image processing method

Info

Publication number
WO2021024691A1
Authority
WO
WIPO (PCT)
Prior art keywords
target person
image
predetermined
behavior
subject
Prior art date
Application number
PCT/JP2020/026880
Other languages
French (fr)
Japanese (ja)
Inventor
希武 田中
池田 直樹
Original Assignee
コニカミノルタ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by コニカミノルタ株式会社 (Konica Minolta, Inc.)
Priority to JP2021537637A (JP7435609B2)
Publication of WO2021024691A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion

Definitions

  • the present invention relates to an image processing system, an image processing program, and an image processing method.
  • The following prior art is disclosed in Patent Document 1 below.
  • The monitoring function of a detection unit that detects a predetermined action of a monitored person and issues a notification or the like is stopped based on information or the like received from a terminal unit. As a result, the monitoring function can be stopped as needed, so that false detections for persons other than the monitored person can be reduced.
  • However, while the prior art of Patent Document 1 can prevent the behavior of a person other than the monitored person from being erroneously detected as the behavior of the monitored person, it cannot improve the detection accuracy for the monitored person's own behavior.
  • The present invention has been made to solve this problem. Its object is to provide an image processing system, an image processing program, and an image processing method capable of improving the accuracy of estimating a person's behavior based on a captured image.
  • (1) An image processing system having: a feature point detection unit that detects feature points related to a target person's body based on an image, captured by an imaging device, that includes the target person; a calculation unit that calculates, based on the image, a direction from the central region of the image toward the target person; a determination unit that determines, based on the arrangement direction of predetermined feature points among the detected feature points and the calculated direction toward the target person, whether the target person's behavior is a behavior included in predetermined behaviors; and an output unit that outputs information on the target person's behavior when the determination unit determines that the target person's behavior is a behavior included in the predetermined behaviors.
  • (2) The determination unit determines that the target person's behavior is a behavior included in the predetermined behaviors when the arrangement direction of the predetermined feature points and the direction toward the target person have a predetermined relationship (the image processing system according to (1) above).
  • (8) The photographing device is a wide-angle camera, and the image is an image, including a predetermined area, taken by the wide-angle camera installed at a position overlooking the predetermined area (the image processing system according to any one of (1) to (7) above).
  • (9) The direction toward the target person is a direction calculated based on a point included in the central region of the image and the predetermined feature points (the image processing system according to any one of (1) to (8) above).
  • (10) An image processing program for causing a computer to execute a process having: a procedure (a) for detecting feature points related to a target person's body based on an image, captured by an imaging device, that includes the target person; a procedure (b) for calculating, based on the image, a direction from the central region of the image toward the target person; a procedure (c) for determining, based on the arrangement direction of predetermined feature points among the detected feature points and the calculated direction toward the target person, whether the target person's behavior is a behavior included in predetermined behaviors; and a procedure (d) for outputting information on the target person's behavior when it is determined in procedure (c) that the target person's behavior is a behavior included in the predetermined behaviors.
  • Feature points related to the subject's body are detected based on the captured image; whether the subject's behavior is included in predetermined behaviors is determined based on the arrangement direction of predetermined feature points and the direction from the central region of the image toward the subject; and, when it is determined to be included, information on the subject's behavior is output. As a result, the accuracy of estimating a person's behavior based on a captured image can be improved.
  • the image processing system, the image processing program, and the image processing method according to the embodiment of the present invention will be described with reference to the drawings.
  • the same elements are designated by the same reference numerals, and duplicate description will be omitted.
  • the dimensional ratios in the drawings are exaggerated for convenience of explanation and may differ from the actual ratios.
  • The angle formed by two directions can be read as either of two angles, for example 60 degrees or 300 degrees (360 degrees minus 60 degrees); in this specification it means the smaller of the two angles.
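  • As a minimal illustrative sketch (not part of the publication; Python and the function name are assumptions), the "smaller of the two angles" convention corresponds to the value returned by the arccosine of the normalized dot product of two direction vectors:

```python
import math

def angle_between(v1, v2):
    """Smaller angle, in degrees (0-180), between two 2-D direction vectors."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(v1[0], v1[1]) * math.hypot(v2[0], v2[1])
    cos_a = max(-1.0, min(1.0, dot / norm))  # clamp for numerical safety
    return math.degrees(math.acos(cos_a))    # acos already yields the smaller angle
```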
  • FIG. 1 is a diagram showing a schematic configuration of the image recognition system 10.
  • the image recognition system 10 includes a detection unit 100, a server 200, a communication network 300, and a mobile terminal 400.
  • the detection unit 100 is communicably connected to the server 200 and the mobile terminal 400 by the communication network 300.
  • the mobile terminal 400 may be connected to the communication network 300 via the access point 310.
  • the detection unit 100 constitutes an image processing system.
  • the detection unit 100 may be one integrated device or a plurality of devices separately arranged.
  • the server 200 may perform a part of the functions of the detection unit 100.
  • FIG. 2 is a block diagram showing the configuration of the detection unit 100.
  • the detection unit 100 includes a control unit 110, a communication unit 120, a camera 130, and a body motion sensor 140, which are connected to each other by a bus.
  • the camera 130 constitutes a photographing device.
  • the control unit 110 is composed of a CPU (Central Processing Unit) and a memory such as a RAM (Random Access Memory) and a ROM (Read Only Memory), and controls and performs arithmetic processing of each part of the detection unit 100 according to a program.
  • the control unit 110 constitutes a feature point detection unit, a calculation unit, and a determination unit.
  • the control unit 110 constitutes an output unit together with the communication unit 120. The details of the operation of the control unit 110 will be described later.
  • the communication unit 120 is an interface circuit (for example, a LAN card or the like) for communicating with the mobile terminal 400 or the like via the communication network 300.
  • the camera 130 is, for example, a wide-angle camera.
  • Since the detection unit 100 is installed on the ceiling or the like of the room of the target person 500, the camera 130 is placed at a position overlooking a predetermined area and captures an image that includes the predetermined area (hereinafter also simply referred to as "image 600").
  • the target person 500 is a person who needs long-term care or nursing by, for example, a staff member.
  • the predetermined area may be a three-dimensional area including the entire floor surface of the living room of the subject 500.
  • the camera 130 may be a standard camera having a narrower angle of view than a wide-angle camera. Hereinafter, for the sake of simplicity, the camera 130 will be described as a wide-angle camera.
  • the image 600 may include the subject 500 as an image.
  • Image 600 includes still images and moving images.
  • The camera 130 is a near-infrared camera: it irradiates the shooting area with near-infrared light from an LED (Light Emitting Diode) and captures the predetermined area by receiving, with a CMOS (Complementary Metal Oxide Semiconductor) sensor, the near-infrared light reflected by objects in the shooting area.
  • the image 600 can be a monochrome image having the reflectance of near infrared rays as each pixel.
  • a visible light camera may be used instead of the near infrared camera, or these may be used in combination.
  • The body movement sensor 140 is a Doppler-shift sensor that transmits and receives microwaves toward the bed 700 and detects the Doppler shift of the microwaves caused by body movement (for example, respiratory movement) of the subject 500.
  • The operation of the control unit 110 will be described.
  • the control unit 110 detects the silhouette of a person's image (hereinafter referred to as "human silhouette") from the image 600.
  • The human silhouette can be detected, for example, by extracting a range of pixels with a relatively large difference using a temporal-difference method that takes the difference between frames captured at successive times.
  • the human silhouette may be detected by the background subtraction method that extracts the difference between the photographed image and the background image.
  • The control unit 110 can detect a predetermined action of the subject 500 based on the human silhouette. Predetermined actions include, for example, falling over and falling off (for example, out of bed).
  • The control unit 110 can detect falling over from, for example, the center of gravity of the detected silhouette suddenly stopping after having been moving over time, or from a change in the aspect ratio of the rectangle corresponding to the human silhouette.
  • The control unit 110 can detect falling off when, for example, the human silhouette suddenly changes from being inside the area of the bed 700 to being outside it, together with a change in the aspect ratio of the rectangle corresponding to the human silhouette.
  • the area of the bed 700 in the image 600 is preset when the detection unit 100 is installed, and can be stored in the memory of the control unit 110 as data.
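  • The following is a minimal sketch of the frame-difference silhouette extraction described above (an illustration only, assuming OpenCV, a grayscale frame sequence, and illustrative function and threshold names):

```python
import cv2
import numpy as np

def human_silhouette(prev_frame, curr_frame, diff_thresh=30):
    """Rough person silhouette and bounding rectangle via temporal (frame) differencing."""
    diff = cv2.absdiff(curr_frame, prev_frame)                 # pixel-wise frame difference
    _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)               # assume the largest blob is the person
    return mask, cv2.boundingRect(largest)                     # silhouette mask and (x, y, w, h)
```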
  • Based on the image 600, the control unit 110 detects a person area 610 as an area including the target person 500, and detects feature points related to the human body (hereinafter simply referred to as "feature points 620") from the person area 610.
  • FIG. 3 is a diagram showing a person area 610 detected in the image 600.
  • the control unit 110 detects an area including the target person 500 who is a person as a person area 610 from the image 600. Specifically, the control unit 110 can detect the person area 610 by detecting the area where the object (object) exists on the image 600 and estimating the category of the object included in the detected area. The region where the object exists can be detected as a rectangle (candidate rectangle) including the object on the image 600. The detection unit 100 detects the person area 610 by detecting the candidate rectangles whose object category is presumed to be a person among the detected candidate rectangles. The person region 610 can be detected using a neural network (hereinafter referred to as "NN").
  • Examples of the method for detecting the person region 610 by the NN include known methods such as Faster R-CNN, Fast R-CNN, and R-CNN.
  • The NN for detecting the person area 610 from the image 600 is trained in advance to detect (estimate) the person area 610 from the image 600, using training data consisting of images 600 paired with the person areas 610 set as their correct answers.
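  • As an illustration of candidate-rectangle-based person detection with one of the known detectors named above, the following sketch uses torchvision's pretrained Faster R-CNN; the library, weights, and score threshold are assumptions and not part of the publication:

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor

model = fasterrcnn_resnet50_fpn(pretrained=True).eval()  # COCO-pretrained detector

def detect_person_area(image_rgb, score_thresh=0.8):
    """Return the highest-scoring person candidate rectangle (x1, y1, x2, y2), or None."""
    with torch.no_grad():
        pred = model([to_tensor(image_rgb)])[0]
    for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
        if label.item() == 1 and score.item() >= score_thresh:  # COCO class 1 = person
            return [round(v, 1) for v in box.tolist()]
    return None
```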
  • FIG. 4 is a diagram showing feature points 620.
  • the control unit 110 detects the feature point 620 based on the person area 610.
  • Feature points 620 may include joint points 621 and a pair of vertices 622 of the head (eg, head rectangle).
  • the feature point 620 may further include, for example, the center point 621c of the two joint points 621a and 621b at the tip of the foot.
  • The center point 621c is calculated based on the two joint points at the tips of the feet.
  • The feature points 620 can be detected by a known technique using an NN such as DeepPose.
  • The feature points 620 can be detected (and calculated) as coordinates in the image 600. Details of DeepPose are described in a publicly known document (Alexander Toshev, et al., "DeepPose: Human Pose Estimation via Deep Neural Networks", in CVPR, 2014).
  • The NN for detecting the feature points 620 from the person area 610 is trained in advance to detect (estimate) the feature points 620 from the person area 610, using training data consisting of person areas 610 paired with the feature points 620 set as their correct answers.
  • The feature points 620 may instead be estimated directly from the image 600 by using an NN for detecting the feature points 620 from the image 600.
  • In that case, the NN for detecting the feature points 620 from the image 600 is trained in advance to detect (estimate) the feature points 620 from the image 600, using training data consisting of images 600 paired with the feature points 620 set as their correct answers.
  • the control unit 110 calculates the direction from the central region of the image 600 toward the target person 500 on the image 600 (hereinafter, also referred to as “target person direction”).
  • the center region of the image 600 includes the center of the image 600.
  • the subject direction is assumed to be the direction from the center of the image 600 toward the center of gravity of the feature point 620 of the subject 500.
  • the target person direction may be a direction from the center of the image 600 toward the center of the person area 610.
  • the subject direction may be a direction from the center of the image 600 toward the joint point 621e at the center of the waist.
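  • A minimal sketch of this calculation (illustrative only; Python and the function name are assumptions), taking the subject direction as the vector from the image center toward the centroid of the detected feature points:

```python
import numpy as np

def subject_direction(image_shape, feature_points):
    """Unit vector from the image center toward the centroid of the feature points.

    feature_points: iterable of (x, y) pixel coordinates.
    """
    h, w = image_shape[:2]
    center = np.array([w / 2.0, h / 2.0])
    centroid = np.mean(np.asarray(feature_points, dtype=float), axis=0)
    v = centroid - center
    return v / np.linalg.norm(v)
```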
  • The control unit 110 determines, based on the arrangement direction of predetermined feature points 620 among the detected feature points 620 (hereinafter also referred to as the "feature point arrangement direction") and the target person direction, whether the behavior of the target person 500 is a behavior included in the predetermined behaviors.
  • The predetermined feature points 620 are a plurality of feature points 620 arranged along the height direction of the subject 500; they can be selected appropriately, by experiment, from the detected feature points 620 from the viewpoint of the accuracy of determining whether the behavior is included in the predetermined behaviors.
  • the feature point arrangement direction may be a direction from one of the two predetermined feature points to the other (hereinafter, also referred to as a "specific direction").
  • the feature point arrangement direction may be a direction parallel to a straight line that minimizes the sum of squares of the distances from three or more predetermined feature points.
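  • A minimal sketch of the two ways of obtaining the feature point arrangement direction described above (illustrative only; Python and the function names are assumptions); for three or more points, the least-squares line direction is the first principal component of the point set:

```python
import numpy as np

def specific_direction(p_from, p_to):
    """Direction from one predetermined feature point toward another (unit vector)."""
    v = np.asarray(p_to, dtype=float) - np.asarray(p_from, dtype=float)
    return v / np.linalg.norm(v)

def arrangement_direction(points):
    """Direction of the least-squares line through three or more feature points."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)  # first right singular vector
    return vt[0] / np.linalg.norm(vt[0])                     # direction minimizing squared distances
```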
  • The predetermined behavior may be a plurality of behaviors or a single behavior. The predetermined behaviors can include falling over and falling off (for example, out of bed).
  • In the present embodiment, when one of the predetermined behaviors (for example, falling over) is detected based on the human silhouette or the like, the control unit 110 determines (re-determines), based on the feature point arrangement direction and the target person direction, whether the detected behavior corresponds to a behavior included in the predetermined behaviors (for example, falling over and falling off). This determination is made for the image 600 in which one of the predetermined behaviors was detected based on the human silhouette.
  • FIG. 5 is a diagram showing the direction in which the feature points are arranged.
  • the feature point arrangement direction is indicated by a solid arrow in FIG.
  • In the example of FIG. 5, the predetermined feature points 620 are the two joint points 620a and 620b at the tips of the feet and the joint point 621d at the center of the shoulders.
  • The direction from the center point 621c of the two foot-tip joint points 620a and 620b toward the shoulder-center joint point 621d is used as the feature point arrangement direction.
  • The feature point arrangement direction may instead be the direction from the center point 621c of the foot-tip joint points toward the joint point 621e at the center of the waist, or the direction from the waist-center joint point 621e toward the shoulder-center joint point 621d.
  • When the feature point arrangement direction and the target person direction have a predetermined relationship, the control unit 110 determines that the behavior of the target person 500 is a behavior included in the predetermined behaviors. For example, the control unit 110 makes this determination when the angle formed by the feature point arrangement direction and the target person direction is equal to or greater than a predetermined threshold value.
  • The predetermined threshold value can be set appropriately, by experiment, from the viewpoint of the accuracy of determining whether the behavior of the subject 500 is included in the predetermined behaviors.
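  • A minimal sketch of this threshold test (illustrative only; it reuses the angle_between helper sketched earlier, and the default threshold is an assumed example):

```python
def is_predetermined_behavior(arrangement_dir, subject_dir, angle_thresh_deg=70.0):
    """Re-judge a silhouette-detected event: a large angle between the body axis
    (feature point arrangement direction) and the subject direction suggests a
    lying posture, i.e. falling over or falling off."""
    return angle_between(arrangement_dir, subject_dir) >= angle_thresh_deg
```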
  • FIG. 6 is a diagram showing the feature point arrangement direction and the target person direction in the image 600 taken by the wide-angle camera.
  • FIG. 7 is a diagram showing the relationship between the feature point arrangement direction and the target person direction when the target person 500 is walking in the image 600 taken by the wide-angle camera.
  • FIG. 8 is a diagram showing the relationship between the feature point arrangement direction and the target person direction when the target person 500 has fallen over or fallen off, in the image 600 taken by the wide-angle camera.
  • the feature point arrangement direction is shown as a solid arrow on the image 600.
  • the target person direction is shown as a broken line arrow.
  • feature points 620 are further shown along with an image of subject 500.
  • When the subject 500 is walking, the subject 500 is in a standing posture; when the subject 500 has fallen over or fallen off, the subject 500 is in a lying posture. Referring to FIGS. 7 and 8, when the subject 500 is in a standing posture, the feature point arrangement direction and the subject direction are close to parallel, and the angle θ formed by the two is relatively small. When the subject 500 is in a lying posture, the feature point arrangement direction and the subject direction are close to orthogonal, and the angle θ formed by the two is relatively large.
  • Even if falling over and falling off are distinguished in the silhouette-based detection of the predetermined behaviors, it is sufficient here if it can be determined that the behavior is at least one of falling over and falling off.
  • By further determining, based on the feature point arrangement direction and the subject direction, that the behavior is at least one of falling over and falling off, the detection accuracy for falling over and falling off can be improved.
  • The predetermined threshold value can be set according to the distance from the camera 130 to the subject 500.
  • The distance from the camera 130 to the subject 500 corresponds to the distance from the center of the image 600 to the subject 500 in the image 600. Therefore, setting the predetermined threshold value according to the distance from the camera 130 to the target person 500 corresponds to setting it according to the distance from the center of the image 600 to the target person 500 in the image 600.
  • For example, the range of relatively short distances from the center of the image 600 is taken as a first range, the range of relatively long distances from the center of the image 600 as a third range, and the range between the first range and the third range as a second range.
  • The threshold angle can then be set so that it becomes smaller in the order of a predetermined first threshold value set for the first range, a predetermined second threshold value set for the second range, and a predetermined third threshold value set for the third range; for example, the first threshold value is set to 80 degrees, the second threshold value to 70 degrees, and the third threshold value to 60 degrees.
  • This is because, for a subject 500 closer to the center of the image 600, the difference in the angle θ between the feature point arrangement direction and the subject direction between the standing posture and the lying posture tends to be smaller, which makes it harder to determine whether the behavior corresponds to a predetermined behavior. This tendency is especially noticeable when, as in the image 600 taken by a wide-angle camera, distortion is relatively small at the center of the image 600 and increases toward the periphery due to the characteristics of the wide-angle lens.
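  • A minimal sketch of such a distance-dependent threshold (the 80/70/60-degree values follow the example above; the range boundaries r1 and r2 are assumptions for illustration):

```python
def threshold_for_distance(dist_from_center, r1=100.0, r2=250.0):
    """Angle threshold (degrees) chosen by the subject's distance from the image center (pixels)."""
    if dist_from_center < r1:      # first range: near the image center
        return 80.0
    elif dist_from_center < r2:    # second range
        return 70.0
    return 60.0                    # third range: far from the center
```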
  • The control unit 110 can also judge whether the behavior is included in the predetermined behaviors using a plurality of specific directions (each being a direction from one of two predetermined feature points toward the other) and the relationship between those specific directions and the target person direction. For example, the direction from the center point 621c of the two foot-tip joint points 620a and 620b toward the waist-center joint point 621e is taken as a first specific direction, and it is judged whether the angle formed by the first specific direction and the subject direction is equal to or greater than a predetermined threshold value.
  • Likewise, it is judged whether the angle formed by a second specific direction (for example, the direction from the waist-center joint point 621e toward the shoulder-center joint point 621d) and the subject direction is equal to or greater than a predetermined threshold value. When either angle is determined to be equal to or greater than the predetermined threshold value, the behavior can be determined to be at least one of the predetermined behaviors (a behavior included in the predetermined behaviors).
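  • A minimal sketch of this multi-direction check (illustrative only; it reuses the angle_between helper sketched earlier):

```python
def multi_direction_check(specific_dirs, subject_dir, angle_thresh_deg=70.0):
    """Judge the event as a predetermined behavior if ANY specific direction forms a
    sufficiently large angle with the subject direction."""
    return any(angle_between(d, subject_dir) >= angle_thresh_deg for d in specific_dirs)
```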
  • When the control unit 110 determines, based on the feature point arrangement direction and the target person direction, that the behavior of the target person 500 is at least one of the predetermined behaviors, it outputs information on the behavior of the target person 500 by transmitting it to the server 200 or the like via the communication unit 120.
  • The information regarding the behavior of the subject 500 can be first information indicating that the behavior of the subject 500 is at least one of the predetermined behaviors, or second information indicating that the certainty (probability) of the predetermined behavior detected based on the human silhouette is high.
  • The first information is, for example, information stating that "the behavior of the subject 500 is at least one of falling over and falling off".
  • the second information is, for example, information that "the probability of being a detected action is high".
  • the control unit 110 may further transmit the behavior specific information indicating the predetermined behavior of the target person 500, which is detected based on the human silhouette, to the server 200 or the like in association with the information regarding the behavior of the target person 500.
  • the first information, the second information, and the action specific information can be associated with each other by including information that identifies the target person 500 such as the ID (number) of the target person 500, and the shooting time of the image 600.
  • In this case, the server 200 can make the final determination, based on the behavior specific information and the information regarding the behavior of the target person 500, that the target person 500 has performed the predetermined behavior detected based on the human silhouette.
  • Alternatively, when the control unit 110 both detects one of the predetermined behaviors of the target person 500 based on the human silhouette and determines, based on the relationship between the specific direction and the target person direction, that the behavior is at least one of the predetermined behaviors, the control unit 110 itself may make the final determination that the target person 500 has performed the predetermined behavior detected based on the human silhouette. In this case, the control unit 110 may transmit (output) third information indicating this final determination to the server 200 or the like as the information regarding the behavior of the target person 500.
  • the action specific information does not need to be transmitted to the server 200 or the like.
  • the third information is, for example, information that "the subject 500 has fallen".
  • the third information includes information that identifies the target person 500, such as the name of the target person 500.
  • FIG. 9 is a block diagram showing the configuration of the server 200.
  • the server 200 includes a control unit 210, a communication unit 220, and a storage unit 230. The components are connected to each other by a bus.
  • The basic configuration of the control unit 210 and the communication unit 220 is the same as that of the control unit 110 and the communication unit 120, the corresponding components of the detection unit 100.
  • the control unit 210 receives information on the behavior of the target person 500 from the detection unit 100 by the communication unit 220.
  • the control unit 210 may further receive the action specific information from the detection unit 100.
  • When the received information regarding the behavior of the target person 500 is the first information, the control unit 210 makes the final determination that the target person 500 has performed the predetermined behavior indicated by the behavior specific information.
  • Likewise, when the received information is the second information, the control unit 210 makes the final determination that the target person 500 has performed the predetermined behavior indicated by the behavior specific information.
  • When the control unit 210 makes the final determination that the predetermined behavior indicated by the behavior specific information has been performed, it can transmit an event notification to the mobile terminal 400 or the like for notifying the staff or the like that the target person 500 has performed the predetermined behavior (for example, falling over).
  • When the information regarding the behavior of the target person 500 is the third information indicating the final determination that the target person 500 has performed the predetermined behavior, the control unit 210 can likewise transmit an event notification to the mobile terminal 400 or the like for notifying the staff or the like that the target person 500 has performed the predetermined behavior.
  • A part of the functions of the detection unit 100 may instead be implemented by the server 200.
  • In that case, the server 200 receives the image 600 from the detection unit 100, detects the human silhouette from the image 600, and detects a predetermined behavior of the target person 500 based on the human silhouette.
  • When a predetermined behavior of the target person 500 is detected, the server 200 detects the person area 610 and detects the feature points 620 based on the person area 610.
  • The server 200 then calculates the target person direction based on the image 600 and the like, and can determine, based on the feature point arrangement direction and the target person direction, whether the predetermined behavior of the target person 500 detected based on the human silhouette is a behavior included in the predetermined behaviors.
  • FIG. 10 is a block diagram showing the configuration of the mobile terminal 400.
  • the mobile terminal 400 includes a control unit 410, a wireless communication unit 420, a display unit 430, an input unit 440, and a voice input / output unit 450.
  • the components are connected to each other by a bus.
  • the mobile terminal 400 may be composed of, for example, a communication terminal device such as a tablet computer, a smartphone, or a mobile phone.
  • the control unit 410 has a basic configuration such as a CPU, RAM, and ROM, similar to the configuration of the control unit 110 of the detection unit 100.
  • the wireless communication unit 420 has a function of performing wireless communication according to standards such as Wi-Fi and Bluetooth (registered trademark), and wirelessly communicates with each device via an access point or directly.
  • the wireless communication unit 420 receives the event notification from the server 200.
  • the display unit 430 and the input unit 440 are touch panels, and a touch sensor as the input unit 440 is provided on the display surface of the display unit 430 composed of a liquid crystal or the like.
  • the event notification is displayed by the display unit 430 and the input unit 440. Then, an input screen for prompting the response to the target person 500 regarding the event notification is displayed, and the staff's intention to respond to the event notification input on the input screen is received.
  • the voice input / output unit 450 is, for example, a speaker and a microphone, and enables voice communication between staff members with another mobile terminal 400 via the wireless communication unit 420. Further, the voice input / output unit 450 may have a function of enabling a voice call with the detection unit 100 via the wireless communication unit 420.
  • FIG. 11 is a flowchart showing the operation of the image recognition system 10. This flowchart is executed by the control unit 110 according to the program.
  • When a predetermined behavior of the target person 500 is detected based on the human silhouette detected from the image 600, the control unit 110 detects the feature points 620 of the target person 500 based on the image 600 (S101).
  • the control unit 110 calculates the feature point arrangement direction based on the detected feature point 620 (S102).
  • the control unit 110 calculates the target person direction based on the image 600 and the feature point 620 (S103).
  • Based on the feature point arrangement direction and the target person direction, the control unit 110 determines whether the behavior of the target person 500 is at least one of falling over and falling off (that is, whether it is a behavior included in falling over and falling off) (S104).
  • When the control unit 110 determines that the behavior of the target person 500 is neither falling over nor falling off (S105: NO), the process ends.
  • When the control unit 110 determines that the behavior of the target person 500 is at least one of the predetermined behaviors, falling over and falling off (S105: YES), it outputs information on the behavior of the target person by transmitting it to the server 200 (S106).
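  • The overall flow of FIG. 11 can be sketched as follows (illustrative only; the helper functions are the sketches given earlier, and detect_feature_points and send_to_server are assumed stubs, not the publication's API):

```python
def process_frame(prev_frame, frame):
    """Sketch of S101-S106: re-judge a silhouette-detected event from feature points."""
    if human_silhouette(prev_frame, frame) is None:   # a predetermined behavior must first be
        return                                        # detected from the silhouette
    pts = detect_feature_points(frame)                # S101: pose estimator, e.g. DeepPose-style (assumed stub)
    arrangement = specific_direction(pts["foot_center"], pts["shoulder_center"])  # S102
    subject = subject_direction(frame.shape, list(pts.values()))                  # S103
    if is_predetermined_behavior(arrangement, subject):                           # S104, S105
        send_to_server({"behavior": "falling over / falling off"})                # S106 (assumed stub)
```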
  • the embodiment has the following effects.
  • Feature points of the subject are detected based on the captured image; whether the subject's behavior is included in the predetermined behaviors is determined based on the feature point arrangement direction and the subject direction; and, when it is determined to be included, information on the subject's behavior is output. As a result, the accuracy of estimating a person's behavior based on the captured image can be improved.
  • Further, when the arrangement direction of the predetermined feature points and the direction toward the target person have a predetermined relationship, the target person's behavior is determined to be a behavior included in the predetermined behaviors. As a result, the accuracy of estimating the person's behavior based on the captured image can be further improved.
  • the above-mentioned predetermined relationship is set according to the distance from the photographing device to the target person.
  • the behavior of the person can be estimated with high accuracy based on the image regardless of the position of the target person.
  • the direction in which the feature points are arranged is set to a specific direction from one of the two feature points to the other. As a result, it is possible to more easily improve the estimation accuracy of the behavior of the person based on the captured image.
  • Further, whether the behavior of the target person is included in the predetermined behaviors is determined by using a plurality of specific directions.
  • As a result, the person's behavior can be estimated with high accuracy based on the image for a variety of poses belonging to the same posture of the subject.
  • the predetermined relationship is such that the angle formed by the feature point arrangement direction and the target person direction is equal to or greater than the predetermined threshold value.
  • Further, the predetermined behaviors are falling over and falling off, and it is determined whether the subject's behavior is at least one of falling over and falling off.
  • As a result, the accuracy of estimating the behavior of the person based on the captured image can be further improved.
  • the image is an image including a predetermined area taken by a wide-angle camera installed at a position overlooking the predetermined area. This makes it possible to more effectively improve the estimation accuracy of the behavior of the person based on the captured image.
  • the target person direction is the direction calculated based on the points included in the central region of the image and the feature points.
  • the calculation result of the feature point can be used for the calculation of the target person direction.
  • The configuration of the image recognition system 10 described above is the main configuration used to explain the features of the above-described embodiment; it is not limited to this configuration and can be variously modified within the scope of the claims. Configurations provided in a general image recognition system are also not excluded.
  • In the above-described embodiment, when the angle formed by the feature point arrangement direction and the target person direction is equal to or greater than a predetermined threshold value, the target person's behavior is determined to be a behavior included in the predetermined behaviors.
  • Alternatively, the sine of the angle formed by the feature point arrangement direction and the target person direction may be calculated, and the target person's behavior may be determined to be a behavior included in the predetermined behaviors when the calculated sine value is equal to or greater than a predetermined threshold value.
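  • A minimal sketch of this sine-based variant (illustrative only; the threshold value is an assumed example, roughly corresponding to a 70-degree angle):

```python
import numpy as np

def sine_based_check(arrangement_dir, subject_dir, sine_thresh=0.94):
    """For unit vectors, the absolute 2-D cross product equals the sine of the angle between them."""
    a = np.asarray(arrangement_dir, dtype=float)
    b = np.asarray(subject_dir, dtype=float)
    a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
    return abs(a[0] * b[1] - a[1] * b[0]) >= sine_thresh
```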
  • the detection unit 100, the server 200, and the mobile terminal 400 may each be configured by a plurality of devices, or any plurality of devices may be configured as a single device.
  • the means and methods for performing various processes in the image recognition system 10 described above can be realized by either a dedicated hardware circuit or a programmed computer.
  • The program may be provided by a computer-readable recording medium such as a USB memory or a DVD (Digital Versatile Disc)-ROM, or may be provided online via a network such as the Internet.
  • the program recorded on the computer-readable recording medium is usually transferred to and stored in a storage unit such as a hard disk.
  • the above program may be provided as a single application software, or may be incorporated into the software of a device such as a detection unit as one function.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

[Problem] To provide an image processing system capable of improving the accuracy of estimating a person's motion based on a photographed image. [Solution] An image processing system has: a feature point detection unit that detects feature points regarding the body of a subject on the basis of a photographed image that includes the subject; a calculation unit that calculates, on the basis of the image, a direction from the center region of the image toward the subject; a determination unit that determines whether the subject's motion is included in prescribed motions on the basis of the arrangement direction of prescribed feature points among the feature points and the direction toward the subject; and an output unit that outputs information about the subject's motion when the subject's motion is determined to be included in the prescribed motions.

Description

Image processing system, image processing program, and image processing method
 The present invention relates to an image processing system, an image processing program, and an image processing method.
 In Japan, life expectancy has increased markedly owing to the rise in living standards, the improved sanitary environment, and advances in medical care that accompanied the postwar period of high economic growth. Combined with the declining birth rate, this has produced an aging society with a high proportion of elderly people. In such an aging society, the number of people requiring care is expected to increase because of illness, injury, and aging.
 In facilities such as hospitals and welfare facilities for the elderly, people requiring care may fall while walking or fall out of bed and injure themselves. Systems are therefore being developed to detect the condition of a care recipient from captured images so that staff such as caregivers and nurses can rush to the care recipient immediately when such a state occurs. To detect the condition of a person requiring care with such a system, the posture and behavior of the person to be detected must be detected from the image with high accuracy.
 The following prior art is disclosed in Patent Document 1 below. The monitoring function of a detection unit that detects a predetermined action of a monitored person and issues a notification or the like is stopped based on information or the like received from a terminal unit. As a result, the monitoring function can be stopped as needed, so that false detections for persons other than the monitored person can be reduced.
International Publication No. 2016/152428
 However, while the prior art disclosed in Patent Document 1 can prevent the behavior of a person other than the monitored person from being erroneously detected as the behavior of the monitored person, it cannot improve the detection accuracy for the monitored person's own behavior.
 The present invention has been made to solve this problem. Its object is to provide an image processing system, an image processing program, and an image processing method capable of improving the accuracy of estimating a person's behavior based on a captured image.
 The above-mentioned problems of the present invention are solved by the following means.
 (1) An image processing system having: a feature point detection unit that detects feature points related to a target person's body based on an image, captured by an imaging device, that includes the target person; a calculation unit that calculates, based on the image, a direction from the central region of the image toward the target person; a determination unit that determines, based on the arrangement direction of predetermined feature points among the detected feature points and the calculated direction toward the target person, whether the target person's behavior is a behavior included in predetermined behaviors; and an output unit that outputs information on the target person's behavior when the determination unit determines that the target person's behavior is a behavior included in the predetermined behaviors.
 (2) The image processing system according to (1) above, wherein the determination unit determines that the target person's behavior is a behavior included in the predetermined behaviors when the arrangement direction of the predetermined feature points and the direction toward the target person have a predetermined relationship.
 (3) The image processing system according to (2) above, wherein the predetermined relationship is set according to the distance from the photographing device to the target person.
 (4) The image processing system according to any one of (1) to (3) above, wherein the arrangement direction of the predetermined feature points is a specific direction from one of two of the feature points toward the other.
 (5) The image processing system according to (4) above, wherein the determination unit determines whether the target person's behavior is a behavior included in the predetermined behaviors by using a plurality of the specific directions.
 (6) The image processing system according to (2) or (3) above, wherein the predetermined relationship is that the angle formed by the arrangement direction of the predetermined feature points and the direction toward the target person is equal to or greater than a predetermined threshold value.
 (7) The image processing system according to any one of (1) to (6) above, wherein the predetermined behaviors are falling over and falling off, and the determination unit determines whether the behavior of the subject is at least one of falling over and falling off.
 (8) The image processing system according to any one of (1) to (7) above, wherein the photographing device is a wide-angle camera, and the image is an image, including a predetermined area, taken by the wide-angle camera installed at a position overlooking the predetermined area.
 (9) The image processing system according to any one of (1) to (8) above, wherein the direction toward the target person is a direction calculated based on a point included in the central region of the image and the predetermined feature points.
 (10) An image processing program for causing a computer to execute a process having: a procedure (a) for detecting feature points related to a target person's body based on an image, captured by an imaging device, that includes the target person; a procedure (b) for calculating, based on the image, a direction from the central region of the image toward the target person; a procedure (c) for determining, based on the arrangement direction of predetermined feature points among the detected feature points and the calculated direction toward the target person, whether the target person's behavior is a behavior included in predetermined behaviors; and a procedure (d) for outputting information on the target person's behavior when it is determined in procedure (c) that the target person's behavior is a behavior included in the predetermined behaviors.
 (11) An image processing method executed by an image processing system, having: a step (a) of detecting feature points related to a target person's body based on an image, captured by an imaging device, that includes the target person; a step (b) of calculating, based on the image, a direction from the central region of the image toward the target person; a step (c) of determining, based on the arrangement direction of predetermined feature points among the detected feature points and the calculated direction toward the target person, whether the target person's behavior is a behavior included in predetermined behaviors; and a step (d) of outputting information on the target person's behavior when it is determined in step (c) that the target person's behavior is a behavior included in the predetermined behaviors.
 Feature points related to the subject's body are detected based on the captured image; whether the subject's behavior is included in the predetermined behaviors is determined based on the arrangement direction of predetermined feature points and the direction from the central region of the image toward the subject; and, when it is determined to be included, information on the subject's behavior is output. As a result, the accuracy of estimating a person's behavior based on a captured image can be improved.
FIG. 1 is a diagram showing a schematic configuration of the image recognition system.
FIG. 2 is a block diagram showing the configuration of the detection unit.
FIG. 3 is a diagram showing a person area detected in an image.
FIG. 4 is a diagram showing feature points.
FIG. 5 is a diagram showing the feature point arrangement direction.
FIG. 6 is a diagram showing the feature point arrangement direction and the subject direction in an image taken by a wide-angle camera.
FIG. 7 is a diagram showing the relationship between the feature point arrangement direction and the subject direction when the subject is walking, in an image taken by a wide-angle camera.
FIG. 8 is a diagram showing the relationship between the feature point arrangement direction and the subject direction when the subject has fallen over or fallen off, in an image taken by a wide-angle camera.
FIG. 9 is a block diagram showing the configuration of the server.
FIG. 10 is a block diagram showing the configuration of the mobile terminal.
FIG. 11 is a flowchart showing the operation of the image recognition system.
 Hereinafter, the image processing system, the image processing program, and the image processing method according to an embodiment of the present invention will be described with reference to the drawings. In the drawings, the same elements are given the same reference numerals, and duplicate description is omitted. The dimensional ratios in the drawings are exaggerated for convenience of explanation and may differ from the actual ratios. The angle formed by two directions can be read as either of two angles, for example 60 degrees or 300 degrees (360 degrees minus 60 degrees); in this specification it means the smaller of the two angles.
 FIG. 1 is a diagram showing a schematic configuration of the image recognition system 10.
 The image recognition system 10 includes a detection unit 100, a server 200, a communication network 300, and a mobile terminal 400. The detection unit 100 is communicably connected to the server 200 and the mobile terminal 400 via the communication network 300. The mobile terminal 400 may be connected to the communication network 300 via an access point 310. The detection unit 100 constitutes an image processing system. The detection unit 100 may be one integrated device or a plurality of separately arranged devices. As will be described later, the server 200 may perform a part of the functions of the detection unit 100.
 (Detection unit 100)
 FIG. 2 is a block diagram showing the configuration of the detection unit 100. As shown in FIG. 2, the detection unit 100 includes a control unit 110, a communication unit 120, a camera 130, and a body motion sensor 140, which are connected to each other by a bus. The camera 130 constitutes a photographing device.
 The control unit 110 is composed of a CPU (Central Processing Unit) and memory such as RAM (Random Access Memory) and ROM (Read Only Memory), and controls each part of the detection unit 100 and performs arithmetic processing according to a program. The control unit 110 constitutes a feature point detection unit, a calculation unit, and a determination unit, and, together with the communication unit 120, constitutes an output unit. Details of the operation of the control unit 110 will be described later.
 The communication unit 120 is an interface circuit (for example, a LAN card) for communicating with the mobile terminal 400 and the like via the communication network 300.
 The camera 130 is, for example, a wide-angle camera. Since the detection unit 100 is installed on the ceiling or the like of the room of the target person 500, the camera 130 is placed at a position overlooking a predetermined area and captures an image including the predetermined area (hereinafter also simply referred to as "image 600"). The target person 500 is a person who needs care or nursing by staff or the like. The predetermined area may be a three-dimensional area including the entire floor surface of the room of the target person 500. The camera 130 may be a standard camera having a narrower angle of view than a wide-angle camera; hereinafter, for simplicity, the camera 130 is described as a wide-angle camera. The image 600 may include the target person 500 as an image. The image 600 includes still images and moving images. The camera 130 is a near-infrared camera: it irradiates the shooting area with near-infrared light from an LED (Light Emitting Diode) and captures the predetermined area by receiving, with a CMOS (Complementary Metal Oxide Semiconductor) sensor, the near-infrared light reflected by objects in the shooting area. The image 600 can be a monochrome image whose pixel values are near-infrared reflectance. A visible light camera may be used instead of the near-infrared camera, or the two may be used in combination.
 The body movement sensor 140 is a Doppler-shift sensor that transmits and receives microwaves toward the bed 700 and detects the Doppler shift of the microwaves caused by body movement (for example, respiratory movement) of the target person 500.
 The operation of the control unit 110 will be described.
 The control unit 110 detects the silhouette of a person's image (hereinafter referred to as a "human silhouette") from the image 600. The human silhouette can be detected, for example, by extracting a range of pixels with a relatively large difference using a temporal-difference method that takes the difference between frames captured at successive times. The human silhouette may instead be detected by a background-subtraction method that takes the difference between the captured image and a background image. Based on the human silhouette, the control unit 110 can detect a predetermined behavior of the target person 500. Predetermined behaviors include, for example, falling over and falling off (for example, out of bed). The control unit 110 can detect falling over from, for example, the center of gravity of the detected silhouette suddenly stopping after having been moving over time, or from a change in the aspect ratio of the rectangle corresponding to the human silhouette. The control unit 110 can detect falling off when, for example, the human silhouette suddenly changes from being inside the area of the bed 700 to being outside it, together with a change in the aspect ratio of the rectangle corresponding to the human silhouette. The area of the bed 700 in the image 600 is set in advance when the detection unit 100 is installed, and can be stored as data in the memory of the control unit 110.
 Based on the image 600, the control unit 110 detects a person area 610 as an area containing the target person 500, and detects, from the person area 610, feature points related to the human body (hereinafter simply referred to as "feature points 620").
 図3は、画像600において検出された人物領域610を示す図である。 FIG. 3 is a diagram showing a person area 610 detected in the image 600.
 The control unit 110 detects, from the image 600, the area containing the target person 500, who is a person, as the person area 610. Specifically, the control unit 110 can detect the person area 610 by detecting areas in which objects exist on the image 600 and estimating the category of the object contained in each detected area. An area in which an object exists can be detected as a rectangle (candidate rectangle) containing the object on the image 600. The detection unit 100 detects the person area 610 by selecting, from among the detected candidate rectangles, a candidate rectangle whose object category is estimated to be a person. The person area 610 can be detected using a neural network (hereinafter referred to as an "NN"). Known methods for detecting the person area 610 with an NN include, for example, Faster R-CNN, Fast R-CNN, and R-CNN. The NN for detecting the person area 610 from the image 600 is trained in advance to detect (estimate) the person area 610 from the image 600, using training data consisting of combinations of an image 600 and the person area 610 set as the correct answer for that image 600.
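 For illustration only, a person area of this kind could be obtained with an off-the-shelf Faster R-CNN, for example the pretrained detector provided by torchvision; the embodiment does not prescribe a particular framework, and the score threshold below is an assumed value.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# COCO-pretrained Faster R-CNN; class id 1 corresponds to "person".
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_person_boxes(image_rgb, score_thresh=0.8):
    """Return candidate rectangles estimated to contain a person."""
    with torch.no_grad():
        pred = model([to_tensor(image_rgb)])[0]
    keep = (pred["labels"] == 1) & (pred["scores"] >= score_thresh)
    return pred["boxes"][keep].tolist()   # [x_min, y_min, x_max, y_max] per detected person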
 図4は、特徴点620を示す図である。 FIG. 4 is a diagram showing feature points 620.
 The control unit 110 detects the feature points 620 based on the person area 610. The feature points 620 can include joint points 621 and the diagonally opposing vertices 622 of the head (for example, of a head rectangle). The feature points 620 can further include, for example, the center point 621c of the two joint points 621a and 621b at the tips of the feet; the center point 621c is calculated from the two foot-tip joint points 621a and 621b. The feature points 620 can be detected by a known technique using an NN such as DeepPose, and are detected (and calculated) as coordinates in the image 600. DeepPose is described in detail in a known publication (Alexander Toshev, et al., "DeepPose: Human Pose Estimation via Deep Neural Networks", in CVPR, 2014). The NN for detecting the feature points 620 from the person area 610 is trained in advance to detect (estimate) the feature points 620 from the person area 610, using training data consisting of combinations of a person area 610 and the feature points 620 set as the correct answer for that person area 610. Note that the feature points 620 may also be estimated directly from the image 600 by using an NN for detecting the feature points 620 from the image 600. In this case, the NN is trained in advance to detect (estimate) the feature points 620 from the image 600, using training data consisting of combinations of an image 600 and the feature points 620 set as the correct answer for that image 600.
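 A minimal sketch of how the detected feature points 620 might be organized as image coordinates, and how the center point 621c could be derived from the two foot-tip joint points; the dictionary key names and coordinate values are illustrative assumptions.

import numpy as np

# Feature points 620 as image coordinates (x, y); names and values are illustrative.
keypoints = {
    "left_foot_tip": np.array([412.0, 655.0]),    # joint point 621a
    "right_foot_tip": np.array([436.0, 662.0]),   # joint point 621b
    "hip_center": np.array([401.0, 540.0]),       # joint point 621e
    "shoulder_center": np.array([388.0, 450.0]),  # joint point 621d
}

def foot_center(kp):
    """Center point 621c of the two foot-tip joint points 621a and 621b."""
    return (kp["left_foot_tip"] + kp["right_foot_tip"]) / 2.0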
 Based on the image 600, the control unit 110 calculates the direction from the central region of the image 600 toward the target person 500 on the image 600 (hereinafter also referred to as the "target person direction"). The central region of the image 600 contains the center of the image 600. In the following, for simplicity of explanation, the target person direction is taken to be the direction from the center of the image 600 toward the center of gravity of the feature points 620 of the target person 500. The target person direction may instead be the direction from the center of the image 600 toward the center of the person area 610, or, for example, the direction from the center of the image 600 toward the joint point 621e at the center of the waist.
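 The target person direction could be computed as sketched below under the simplifying choice described above (image center toward the centroid of the feature points); unit-vector normalization is added here for the later angle comparison, and the keypoint dictionary is the illustrative one assumed in the previous sketch.

import numpy as np

def subject_direction(image_shape, keypoints):
    """Unit vector from the image center toward the centroid of the feature points 620."""
    h, w = image_shape[:2]
    center = np.array([w / 2.0, h / 2.0])
    centroid = np.mean(list(keypoints.values()), axis=0)
    v = centroid - center
    return v / np.linalg.norm(v)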
 The control unit 110 determines whether the behavior of the target person 500 is a behavior included in the predetermined behaviors, based on the arrangement direction of predetermined feature points 620 among the detected feature points 620 (hereinafter also referred to as the "feature point arrangement direction") and the target person direction. The predetermined feature points 620 are a plurality of feature points 620 lined up along the height direction of the target person 500, and can be selected appropriately by experiment from among the detected feature points 620, from the viewpoint of the accuracy of determining whether a behavior is included in the predetermined behaviors. The feature point arrangement direction can be the direction from one of two predetermined feature points toward the other (hereinafter also referred to as a "specific direction"). The feature point arrangement direction may also be the direction parallel to the straight line that minimizes the sum of squared distances from three or more predetermined feature points. The predetermined behavior may be a plurality of behaviors or a single behavior, and can include a fall and a fall from a height. Specifically, when the behavior of the target person 500 has been detected as one of the predetermined behaviors (for example, a fall) based on the human silhouette or the like, the control unit 110 determines (re-judges), based on the feature point arrangement direction and the target person direction, whether the detected behavior corresponds to a behavior included in the predetermined behaviors (for example, a fall or a fall from a height). The determination of whether the behavior of the target person 500 is a behavior included in the predetermined behaviors is performed on the image 600 in which one of the predetermined behaviors has been detected based on the human silhouette.
 図5は、特徴点並び方向を示す図である。特徴点並び方向は、図5において、実線の矢印で示されている。 FIG. 5 is a diagram showing the direction in which the feature points are arranged. The feature point arrangement direction is indicated by a solid arrow in FIG.
 In the example of FIG. 5, the predetermined feature points are the two feature points 620 consisting of the center point 621c of the two foot-tip joint points 621a and 621b and the joint point 621d at the center of the shoulders, and the feature point arrangement direction is the direction from the center point 621c of the two foot-tip joint points 621a and 621b toward the joint point 621d at the center of the shoulders. The feature point arrangement direction may instead be the direction from the center point 621c of the two foot-tip joint points 621a and 621b toward the joint point 621e at the center of the waist, or, for example, the direction from the joint point 621e at the center of the waist toward the joint point 621d at the center of the shoulders.
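 Continuing the earlier keypoint sketch, the feature point arrangement direction of FIG. 5 (foot-tip center point 621c toward the shoulder center 621d) could be formed as follows; the key names remain illustrative assumptions.

import numpy as np

def arrangement_direction(kp):
    """Unit vector from the foot-tip center point 621c toward the shoulder center 621d."""
    p621c = (kp["left_foot_tip"] + kp["right_foot_tip"]) / 2.0
    v = kp["shoulder_center"] - p621c
    return v / np.linalg.norm(v)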
 特徴点並び方向と、対象者方向とに基づく、対象者500の行動が所定の行動に含まれるかどうかの判定方法について、さらに詳細に説明する。 The method of determining whether or not the action of the target person 500 is included in the predetermined action based on the feature point arrangement direction and the target person direction will be described in more detail.
 The control unit 110 determines that the behavior of the target person 500 is a behavior included in the predetermined behaviors when the feature point arrangement direction and the target person direction are in a predetermined relationship. For example, the control unit 110 determines that the behavior of the target person 500 is a behavior included in the predetermined behaviors when the angle formed by the feature point arrangement direction and the target person direction is equal to or greater than a predetermined threshold. The predetermined threshold can be set appropriately by experiment, from the viewpoint of the accuracy of determining whether the behavior of the target person 500 is included in the predetermined behaviors.
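 A sketch of the angle test, assuming the two unit vectors produced by the previous sketches; the 70-degree threshold is only a placeholder value.

import numpy as np

def angle_deg(u, v):
    """Angle between two unit vectors, in degrees."""
    cos_theta = float(np.clip(np.dot(u, v), -1.0, 1.0))
    return float(np.degrees(np.arccos(cos_theta)))

def included_in_predetermined_behavior(arr_dir, subj_dir, thresh_deg=70.0):
    """True when the angle between the two directions meets the predetermined threshold."""
    return angle_deg(arr_dir, subj_dir) >= thresh_deg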
 FIG. 6 is a diagram showing the feature point arrangement direction and the target person direction in an image 600 taken by the wide-angle camera. FIG. 7 is a diagram showing the relationship between the feature point arrangement direction and the target person direction in an image 600 taken by the wide-angle camera when the target person 500 is walking. FIG. 8 is a diagram showing the relationship between the feature point arrangement direction and the target person direction in an image 600 taken by the wide-angle camera when the target person 500 is performing at least one of a fall and a fall from a height. In these figures, the feature point arrangement direction is shown as a solid arrow on the image 600, and the target person direction is shown as a broken-line arrow. In FIG. 6, the feature points 620 are also shown together with the image of the target person 500.
 When the target person 500 is performing the behavior of walking, the target person 500 is in a standing posture. When the target person 500 is performing either a fall or a fall from a height, the target person 500 is in a lying posture. Referring to FIGS. 7 and 8, when the target person 500 is in a standing posture, the feature point arrangement direction and the target person direction are nearly parallel to each other, and the angle θ between them is relatively small. When the target person 500 is in a lying posture, the feature point arrangement direction and the target person direction approach being orthogonal to each other, and the angle θ between them is relatively large. In an image 600 taken by a wide-angle camera, this tendency becomes more pronounced because of the distortion characteristics of the wide-angle lens, but as long as the image 600 is an image taken from a position overlooking a relatively wide area, the same tendency is seen even in an image 600 taken by a standard camera. Therefore, if the angle θ formed by the feature point arrangement direction and the target person direction is equal to or greater than a predetermined threshold, it can be determined that the behavior of the target person 500 is at least one of the predetermined behaviors, namely a fall or a fall from a height. Note that the determination based on the feature point arrangement direction and the target person direction does not distinguish between a fall and a fall from a height. However, since a fall and a fall from a height are distinguished when the predetermined behavior is detected based on the human silhouette, it is sufficient to determine that the behavior is at least one of them. When a fall or a fall from a height has been detected based on the human silhouette, additionally determining, based on the feature point arrangement direction and the target person direction, that the behavior is at least one of a fall and a fall from a height improves the accuracy of detecting falls and falls from a height.
 The predetermined threshold set for the angle θ formed by the feature point arrangement direction and the target person direction, used for determining that the behavior of the target person 500 is at least one of the predetermined behaviors (a fall or a fall from a height) as described above, can be set according to the distance from the camera 130 to the target person 500. The distance from the camera 130 to the target person 500 corresponds, in the image 600, to the distance from the center of the image 600 to the target person 500. Therefore, setting the predetermined threshold according to the distance from the camera 130 to the target person 500 corresponds to setting the predetermined threshold according to the distance from the center of the image 600 to the target person 500 in the image 600. For example, let the first range be the range of relatively short distances from the center of the image 600, the third range be the range of relatively long distances from the center of the image 600, and the second range be the range between the first range and the third range. A predetermined first threshold set for the first range, a predetermined second threshold set for the second range, and a predetermined third threshold set for the third range can then be set as threshold angles that decrease in that order; for example, the first threshold is set to 80 degrees, the second threshold to 70 degrees, and the third threshold to 60 degrees. In this way, the shorter the distance from the center of the image 600, the stricter the criterion for determining that the behavior is included in the predetermined behaviors. This is because, for a target person 500 closer to the center of the image 600, the difference between the angle θ formed by the feature point arrangement direction and the target person direction in the standing posture and that in the lying posture tends to be smaller, which makes it more difficult to determine whether the behavior corresponds to a predetermined behavior. This tendency is particularly pronounced when, as in an image 600 taken by a wide-angle camera, the distortion is relatively small at the center of the image 600 and increases toward the periphery because of the characteristics of the wide-angle lens.
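 A sketch of the distance-dependent threshold described above, using the 80/70/60-degree values from the example; the pixel radii separating the three ranges are assumptions made only for illustration.

import numpy as np

def threshold_for_position(subject_xy, image_shape, r1=150.0, r2=350.0):
    """Threshold angle (degrees) chosen by distance from the image center.

    r1 and r2 are assumed pixel radii separating the first, second, and third ranges.
    """
    h, w = image_shape[:2]
    d = np.linalg.norm(np.asarray(subject_xy) - np.array([w / 2.0, h / 2.0]))
    if d < r1:
        return 80.0   # first range: strictest criterion
    if d < r2:
        return 70.0   # second range
    return 60.0       # third range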
 The control unit 110 can also use a plurality of specific directions (each being the direction from one of two predetermined feature points toward the other) and determine, based on the relationship between the plurality of specific directions and the target person direction, whether the behavior is included in the predetermined behaviors. For example, with the direction from the center point 621c of the two foot-tip joint points 621a and 621b toward the joint point 621e at the center of the waist as a first specific direction, it is judged whether the angle formed by the first specific direction and the target person direction is equal to or greater than a predetermined threshold. With the direction from the joint point 621e at the center of the waist toward the joint point 621d at the center of the shoulders as a second specific direction, it is judged whether the angle formed by the second specific direction and the target person direction is equal to or greater than a predetermined threshold. When either angle is judged to be equal to or greater than the predetermined threshold, it can be determined that the behavior is at least one of the predetermined behaviors (a behavior included in the predetermined behaviors).
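 A sketch of the two-direction variant described above; the keypoint names are the illustrative ones assumed in the earlier sketches.

import numpy as np

def angle_deg(u, v):
    return float(np.degrees(np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))))

def unit(v):
    return v / np.linalg.norm(v)

def included_by_multiple_directions(kp, subj_dir, thresh_deg):
    """True when either specific direction meets the threshold against the target person direction."""
    p621c = (kp["left_foot_tip"] + kp["right_foot_tip"]) / 2.0
    first = unit(kp["hip_center"] - p621c)                     # 621c -> 621e
    second = unit(kp["shoulder_center"] - kp["hip_center"])    # 621e -> 621d
    return (angle_deg(first, subj_dir) >= thresh_deg or
            angle_deg(second, subj_dir) >= thresh_deg)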
 When the control unit 110 determines, based on the feature point arrangement direction and the target person direction, that the behavior of the target person 500 is at least one of the predetermined behaviors, it outputs information regarding the behavior of the target person 500, for example by transmitting the information to the server 200 via the communication unit 120. The information regarding the behavior of the target person 500 can be first information indicating that the behavior of the target person 500 is at least one of the predetermined behaviors, or second information indicating that the certainty (probability) of the predetermined behavior detected based on the human silhouette is high. The first information is, for example, information stating that the behavior of the target person 500 is at least one of a fall and a fall from a height. The second information is, for example, information stating that the probability of the detected behavior being correct is high. The control unit 110 can further transmit, to the server 200 or the like, action specific information indicating the predetermined behavior of the target person 500 detected based on the human silhouette, in association with the information regarding the behavior of the target person 500. The first information, the second information, and the action specific information can be associated with one another by including information that identifies the target person 500, such as an ID (number) of the target person 500, and the capture time of the image 600. As described later, the server 200 can make the final judgment, based on the action specific information and the information regarding the behavior of the target person 500, that the target person 500 has performed the predetermined behavior detected based on the human silhouette.
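 For illustration, the associated items of information could be carried in a small structured payload such as the one below; every field name and value here is a hypothetical choice, not something specified by the embodiment.

notification = {
    "subject_id": "500",                        # identifies the target person 500
    "captured_at": "2020-07-09T10:15:32+09:00", # capture time of the image 600
    "behavior": "fall",                         # action specific information
    "assessment": "first_information",          # or "second_information"
}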
 Alternatively, when one of the predetermined behaviors of the target person 500 has been detected based on the human silhouette, and the behavior has also been determined, based on the relationship between the specific direction and the target person direction, to be at least one of the predetermined behaviors, the control unit 110 itself may make the final judgment that the target person 500 has performed the predetermined behavior detected based on the human silhouette. In this case, the control unit 110 can transmit (output), to the server 200 or the like, third information indicating the final judgment that the target person 500 has performed the predetermined behavior, as the information regarding the behavior of the target person 500; the action specific information does not need to be transmitted to the server 200 or the like. The third information is, for example, information stating that the target person 500 has fallen, and includes information that identifies the target person 500, such as the name of the target person 500.
 (サーバー200)
 図9は、サーバー200の構成を示すブロック図である。サーバー200は、制御部210、通信部220、および記憶部230を備える。各構成要素は、バスによって、相互に接続されている。
(Server 200)
FIG. 9 is a block diagram showing the configuration of the server 200. The server 200 includes a control unit 210, a communication unit 220, and a storage unit 230. The components are connected to each other by a bus.
 制御部210および通信部220の基本構成は、検出部100の対応する構成要素である、制御部110および通信部120と同様である。 The basic configuration of the control unit 210 and the communication unit 220 is the same as that of the control unit 110 and the communication unit 120, which are the corresponding components of the detection unit 100.
 制御部210は、通信部220により、検出部100から対象者500の行動に関する情報を受信する。制御部210は、検出部100から行動特定情報をさらに受信し得る。 The control unit 210 receives information on the behavior of the target person 500 from the detection unit 100 by the communication unit 220. The control unit 210 may further receive the action specific information from the detection unit 100.
 When the information regarding the behavior of the target person 500 is the first information, indicating that the behavior of the target person 500 is at least one of the predetermined behaviors, the control unit 210 makes the final judgment that the target person 500 has performed the predetermined behavior indicated by the action specific information. Likewise, when the information regarding the behavior of the target person 500 is the second information, indicating that the certainty (probability) of the predetermined behavior detected based on the human silhouette is high, the control unit 210 makes the final judgment that the target person 500 has performed the predetermined behavior indicated by the action specific information. When it makes the final judgment that the predetermined behavior indicated by the action specific information has been performed, the control unit 210 can transmit, to the mobile terminal 400 or the like, an event notification for notifying the staff or the like that the target person 500 has performed the predetermined behavior (for example, a fall).
 When the information regarding the behavior of the target person 500 is the third information, indicating the final judgment that the target person 500 has performed the predetermined behavior, the control unit 210 can transmit, to the mobile terminal 400 or the like, an event notification for notifying the staff or the like that the target person 500 has performed the predetermined behavior.
 Note that the server 200 may perform some of the functions of the detection unit 100 in its place. For example, the server 200 receives the image 600 from the detection unit 100, detects the human silhouette from the image 600, and detects a predetermined behavior of the target person 500 based on the human silhouette. When a predetermined behavior of the target person 500 is detected, the server 200 detects the person area 610 and detects the feature points 620 based on the person area 610. The server 200 then calculates the target person direction based on the image 600 and the like, and can determine, based on the feature point arrangement direction and the target person direction, whether the predetermined behavior of the target person 500 detected based on the human silhouette is a behavior included in the predetermined behaviors.
 (携帯端末400)
 図10は、携帯端末400の構成を示すブロック図である。携帯端末400は、制御部410、無線通信部420、表示部430、入力部440、および音声入出力部450を備える。各構成要素は、バスにより相互に接続されている。携帯端末400は、例えば、タブレット型コンピューター、スマートフォン、または携帯電話等の通信端末機器によって構成され得る。
(Mobile terminal 400)
FIG. 10 is a block diagram showing the configuration of the mobile terminal 400. The mobile terminal 400 includes a control unit 410, a wireless communication unit 420, a display unit 430, an input unit 440, and a voice input / output unit 450. The components are connected to each other by a bus. The mobile terminal 400 may be composed of, for example, a communication terminal device such as a tablet computer, a smartphone, or a mobile phone.
 制御部410は、検出部100の制御部110の構成と同様に、CPU、RAM、ROMなどの基本構成を備える。 The control unit 410 has a basic configuration such as a CPU, RAM, and ROM, similar to the configuration of the control unit 110 of the detection unit 100.
 無線通信部420は、Wi-Fi、Bluetooth(登録商標)などの規格による無線通信を行う機能を有し、アクセスポイントを経由して、または直接に各装置と無線通信する。無線通信部420は、イベント通知をサーバー200から受信する。 The wireless communication unit 420 has a function of performing wireless communication according to standards such as Wi-Fi and Bluetooth (registered trademark), and wirelessly communicates with each device via an access point or directly. The wireless communication unit 420 receives the event notification from the server 200.
 表示部430および入力部440は、タッチパネルであり、液晶などで構成される表示部430の表示面に、入力部440としてのタッチセンサーが設けられる。表示部430、入力部440によって、イベント通知を表示する。そして、イベント通知に関する対象者500への対応を促す入力画面を表示するとともに、当該入力画面に入力された、スタッフによるイベント通知への対応の意思を受け付ける。 The display unit 430 and the input unit 440 are touch panels, and a touch sensor as the input unit 440 is provided on the display surface of the display unit 430 composed of a liquid crystal or the like. The event notification is displayed by the display unit 430 and the input unit 440. Then, an input screen for prompting the response to the target person 500 regarding the event notification is displayed, and the staff's intention to respond to the event notification input on the input screen is received.
 音声入出力部450は、例えばスピーカーとマイクであり、無線通信部420を介して他の携帯端末400との間でスタッフ相互間の音声通話を可能にする。また、音声入出力部450は、無線通信部420を介して検出部100との間で音声通話を可能にする機能を備え得る。 The voice input / output unit 450 is, for example, a speaker and a microphone, and enables voice communication between staff members with another mobile terminal 400 via the wireless communication unit 420. Further, the voice input / output unit 450 may have a function of enabling a voice call with the detection unit 100 via the wireless communication unit 420.
 画像認識システム10の動作について説明する。 The operation of the image recognition system 10 will be described.
 図11は、画像認識システム10の動作を示すフローチャートである。本フローチャートは、プログラムに従い、制御部110により実行される。 FIG. 11 is a flowchart showing the operation of the image recognition system 10. This flowchart is executed by the control unit 110 according to the program.
 Triggered by the detection of a predetermined behavior of the target person 500 based on the human silhouette detected from the image 600, the control unit 110 detects the feature points 620 of the target person 500 based on the image 600 (S101).
 制御部110は、検出された特徴点620に基づいて、特徴点並び方向を算出する(S102)。 The control unit 110 calculates the feature point arrangement direction based on the detected feature point 620 (S102).
 制御部110は、画像600および特徴点620に基づいて、対象者方向を算出する(S103)。 The control unit 110 calculates the target person direction based on the image 600 and the feature point 620 (S103).
 Based on the feature point arrangement direction and the target person direction, the control unit 110 determines whether the behavior of the target person 500 is at least one of the predetermined behaviors, a fall and a fall from a height (that is, whether it is included in a fall and a fall from a height) (S104).
 When the control unit 110 determines that the behavior of the target person 500 is neither of the predetermined behaviors, a fall and a fall from a height (S105: NO), the process ends.
 When the control unit 110 determines that the behavior of the target person 500 is at least one of the predetermined behaviors, a fall and a fall from a height (S105: YES), it outputs information regarding the behavior of the target person by transmitting the information to the server 200 (S106).
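 Putting the flowchart of FIG. 11 together, a simplified end-to-end sketch of S101 through S106 might look like the following; the pose-estimation step (S101) and the transmission helper are assumed placeholders rather than actual components of the embodiment.

import numpy as np

def process_frame(image, keypoints, send_to_server, thresh_deg=70.0):
    """S101-S106: judge a silhouette-detected behavior and output the result."""
    h, w = image.shape[:2]

    # S102: feature point arrangement direction (foot-tip center 621c -> shoulder center 621d)
    p621c = (keypoints["left_foot_tip"] + keypoints["right_foot_tip"]) / 2.0
    arr_dir = keypoints["shoulder_center"] - p621c
    arr_dir /= np.linalg.norm(arr_dir)

    # S103: target person direction (image center -> centroid of feature points)
    subj_dir = np.mean(list(keypoints.values()), axis=0) - np.array([w / 2.0, h / 2.0])
    subj_dir /= np.linalg.norm(subj_dir)

    # S104-S105: angle test against the predetermined threshold
    theta = float(np.degrees(np.arccos(np.clip(np.dot(arr_dir, subj_dir), -1.0, 1.0))))
    if theta >= thresh_deg:
        # S106: output information regarding the behavior of the target person
        send_to_server({"behavior": "fall_or_fall_from_height", "angle_deg": theta})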
 実施形態は以下の効果を奏する。 The embodiment has the following effects.
 The feature points of the target person are detected based on the captured image, whether the behavior of the target person is included in the predetermined behaviors is determined based on the feature point arrangement direction and the target person direction, and information regarding the behavior of the target person is output when it is determined to be included. This makes it possible to improve the accuracy of estimating a person's behavior based on a captured image.
 さらに、特徴点並び方向と、対象者方向とが所定の関係にある場合に、対象者の行動が所定の行動であると判定する。これにより、撮影された画像に基づく人物の行動の推定精度をさらに向上できる。 Further, when the feature point arrangement direction and the target person direction have a predetermined relationship, it is determined that the target person's action is a predetermined action. As a result, the accuracy of estimating the behavior of the person based on the captured image can be further improved.
 さらに、上記所定の関係を、撮影装置から対象者までの距離に応じて設定する。これにより、対象者の位置によらず、画像に基づいて人物の行動を高精度に推定できる。 Furthermore, the above-mentioned predetermined relationship is set according to the distance from the photographing device to the target person. As a result, the behavior of the person can be estimated with high accuracy based on the image regardless of the position of the target person.
 さらに、特徴点並び方向を、2つの特徴点の一方から他方へ向かう特定方向とする。これにより、より簡単に、撮影された画像に基づく人物の行動の推定精度を向上できる。 Furthermore, the direction in which the feature points are arranged is set to a specific direction from one of the two feature points to the other. As a result, it is possible to more easily improve the estimation accuracy of the behavior of the person based on the captured image.
 Furthermore, a plurality of specific directions are used to determine whether the behavior of the target person is included in the predetermined behaviors. This makes it possible to estimate a person's behavior with high accuracy based on the image for the various body positions that belong to the same posture of the target person.
 さらに、所定の関係を、特徴点並び方向と、対象者方向とがなす角度が所定の閾値以上とする。これにより、さらに簡単に、撮影された画像に基づく人物の行動の推定精度を向上できる。 Furthermore, the predetermined relationship is such that the angle formed by the feature point arrangement direction and the target person direction is equal to or greater than the predetermined threshold value. As a result, it is possible to more easily improve the estimation accuracy of the behavior of the person based on the captured image.
 Furthermore, the predetermined behaviors are a fall and a fall from a height, and it is determined whether the behavior of the target person is at least one of a fall and a fall from a height. This makes it possible to further improve the accuracy of estimating a person's behavior based on the captured image.
 さらに、画像を、所定の領域を俯瞰する位置に設置された広角カメラにより撮影された所定の領域を含む画像とする。これにより、より効果的に、撮影された画像に基づく人物の行動の推定精度を向上できる。 Further, the image is an image including a predetermined area taken by a wide-angle camera installed at a position overlooking the predetermined area. This makes it possible to more effectively improve the estimation accuracy of the behavior of the person based on the captured image.
 さらに、対象者方向を、画像の中心領域に含まれる点と、特徴点とに基づいて算出される方向とする。これにより、対象者方向の算出に特徴点の算出結果を利用できる。 Furthermore, the target person direction is the direction calculated based on the points included in the central region of the image and the feature points. As a result, the calculation result of the feature point can be used for the calculation of the target person direction.
 The configuration of the image recognition system 10 described above is the main configuration described in explaining the features of the embodiment; it is not limited to the configuration described above and can be modified in various ways within the scope of the claims. Furthermore, configurations provided in a general image recognition system are not excluded.
 For example, in the embodiment, it is determined that the behavior of the target person is included in the predetermined behaviors when the angle formed by the feature point arrangement direction and the target person direction is equal to or greater than a predetermined threshold. However, for example, the sine of the angle formed by the feature point arrangement direction and the target person direction may be calculated, and it may be determined that the behavior of the target person is included in the predetermined behaviors when the calculated sine is equal to or greater than a predetermined threshold.
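 The sine-based variant could be computed directly from the 2D cross product, avoiding the arccos; a minimal sketch, with the threshold value chosen here only as an assumption (roughly the sine of 70 degrees).

import numpy as np

def sine_of_angle(u, v):
    """|sin(theta)| between two unit vectors in the image plane (2D cross product)."""
    return abs(float(u[0] * v[1] - u[1] * v[0]))

def included_by_sine(arr_dir, subj_dir, sine_thresh=0.94):
    return sine_of_angle(arr_dir, subj_dir) >= sine_thresh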
 Further, the detection unit 100, the server 200, and the mobile terminal 400 may each be configured from a plurality of devices, or any two or more of them may be configured as a single device.
 また、上述したフローチャートは、一部のステップを省略してもよく、他のステップが追加されてもよい。また各ステップの一部は同時に実行されてもよく、一つのステップが複数のステップに分割されて実行されてもよい。 Further, in the above-mentioned flowchart, some steps may be omitted or other steps may be added. Further, a part of each step may be executed at the same time, or one step may be divided into a plurality of steps and executed.
 The means and methods for performing the various kinds of processing in the image recognition system 10 described above can be realized either by a dedicated hardware circuit or by a programmed computer. The program may be provided, for example, on a computer-readable recording medium such as a USB memory or a DVD (Digital Versatile Disc)-ROM, or may be provided online via a network such as the Internet. In this case, the program recorded on the computer-readable recording medium is usually transferred to and stored in a storage unit such as a hard disk. The program may be provided as stand-alone application software, or may be incorporated, as one function, into the software of a device such as the detection unit.
 This application is based on Japanese Patent Application No. 2019-145511 filed on August 7, 2019, the disclosure of which is incorporated herein by reference in its entirety.

Claims (11)

  1.  撮影装置により撮影された、対象者を含む画像に基づいて、前記対象者の体に関する特徴点を検出する特徴点検出部と、
     前記画像に基づいて、前記画像の中心領域から前記対象者へ向かう方向を算出する算出部と、
     検出された前記特徴点の中の所定の特徴点の並び方向と、算出された前記対象者へ向かう方向とに基づいて、前記対象者の行動が所定の行動に含まれる行動かどうか判定する判定部と、
     前記判定部により、前記対象者の行動が前記所定の行動に含まれる行動であると判定された場合に、前記対象者の行動に関する情報を出力する出力部と、
     を有する画像処理システム。
    An image processing system comprising:
    a feature point detection unit that detects feature points related to a body of a target person, based on an image containing the target person taken by an imaging device;
    a calculation unit that calculates, based on the image, a direction from a central region of the image toward the target person;
    a determination unit that determines whether a behavior of the target person is a behavior included in a predetermined behavior, based on an arrangement direction of predetermined feature points among the detected feature points and the calculated direction toward the target person; and
    an output unit that outputs information regarding the behavior of the target person when the determination unit determines that the behavior of the target person is a behavior included in the predetermined behavior.
  2.  The image processing system according to claim 1, wherein the determination unit determines that the behavior of the target person is a behavior included in the predetermined behavior when the arrangement direction of the predetermined feature points and the direction toward the target person are in a predetermined relationship.
  3.  前記所定の関係は、前記撮影装置から前記対象者までの距離に応じて設定される、請求項2に記載の画像処理システム。 The image processing system according to claim 2, wherein the predetermined relationship is set according to the distance from the photographing device to the target person.
  4.  前記所定の特徴点の並び方向は、2つの前記特徴点の一方から他方へ向かう特定方向である、請求項1~3のいずれか一項に記載の画像処理システム。 The image processing system according to any one of claims 1 to 3, wherein the arrangement direction of the predetermined feature points is a specific direction from one of the two feature points to the other.
  5.  前記判定部は、複数の前記特定方向を用いて、前記対象者の行動が前記所定の行動に含まれる行動かどうか判定する、請求項4に記載の画像処理システム。 The image processing system according to claim 4, wherein the determination unit determines whether or not the action of the target person is an action included in the predetermined action by using the plurality of the specific directions.
  6.  前記所定の関係は、前記所定の特徴点の並び方向と、前記対象者へ向かう方向とがなす角度が所定の閾値以上である、請求項2または3に記載の画像処理システム。 The image processing system according to claim 2 or 3, wherein the predetermined relationship is such that the angle formed by the arrangement direction of the predetermined feature points and the direction toward the target person is equal to or more than a predetermined threshold value.
  7.  前記所定の行動は転倒および転落であり、
     前記判定部は、前記対象者の行動が、転倒および転落の少なくともいずれかであるかどうか判定する、請求項1~6のいずれか一項に記載の画像処理システム。
     The image processing system according to any one of claims 1 to 6, wherein the predetermined behavior is a fall and a fall from a height, and
     the determination unit determines whether the behavior of the target person is at least one of a fall and a fall from a height.
  8.  The image processing system according to any one of claims 1 to 7, wherein the imaging device is a wide-angle camera, and the image is an image containing a predetermined area taken by the wide-angle camera installed at a position overlooking the predetermined area.
  9.  The image processing system according to any one of claims 1 to 8, wherein the direction toward the target person is a direction calculated based on a point included in the central region of the image and the predetermined feature points.
  10.  撮影装置により撮影された、対象者を含む画像に基づいて、前記対象者の体に関する特徴点を検出する手順(a)と、
     前記画像に基づいて、前記画像の中心領域から前記対象者へ向かう方向を算出する手順(b)と、
     検出された前記特徴点の中の所定の特徴点の並び方向と、算出された前記対象者へ向かう方向とに基づいて、前記対象者の行動が所定の行動に含まれる行動かどうか判定する手順(c)と、
     前記手順(c)において、前記対象者の行動が前記所定の行動に含まれる行動であると判定された場合に、前記対象者の行動に関する情報を出力する手順(d)と、
     を有する処理をコンピューターに実行させるための画像処理プログラム。
    The procedure (a) of detecting the feature points related to the body of the subject based on the image including the subject taken by the photographing device, and
    The procedure (b) of calculating the direction from the central region of the image toward the target person based on the image, and
     The procedure (c) of determining whether the behavior of the target person is a behavior included in a predetermined behavior, based on the arrangement direction of predetermined feature points among the detected feature points and the calculated direction toward the target person, and
     The procedure (d) of outputting information regarding the behavior of the target person when it is determined in the procedure (c) that the behavior of the target person is a behavior included in the predetermined behavior, and
    An image processing program for causing a computer to execute a process having the above.
  11.  画像処理システムに実行させる方法であって、
     撮影装置により撮影された、対象者を含む画像に基づいて、前記対象者の体に関する特徴点を検出する段階(a)と、
     前記画像に基づいて、前記画像の中心領域から前記対象者へ向かう方向を算出する段階(b)と、
     検出された前記特徴点の中の所定の特徴点の並び方向と、算出された前記対象者へ向かう方向とに基づいて、前記対象者の行動が所定の行動に含まれる行動かどうか判定する段階(c)と、
     前記段階(c)において、前記対象者の行動が前記所定の行動に含まれる行動であると判定された場合に、前記対象者の行動に関する情報を出力する段階(d)と、
     を有する画像処理方法。
     A method to be executed by an image processing system, comprising:
    The step (a) of detecting the feature points related to the body of the subject based on the image including the subject taken by the photographing device, and
    A step (b) of calculating the direction from the central region of the image toward the target person based on the image, and
     The step (c) of determining whether the behavior of the target person is a behavior included in a predetermined behavior, based on the arrangement direction of predetermined feature points among the detected feature points and the calculated direction toward the target person, and
     The step (d) of outputting information regarding the behavior of the target person when it is determined in the step (c) that the behavior of the target person is a behavior included in the predetermined behavior,
     An image processing method having the steps (a) to (d) above.
PCT/JP2020/026880 2019-08-07 2020-07-09 Image processing system, image processing program, and image processing method WO2021024691A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2021537637A JP7435609B2 (en) 2019-08-07 2020-07-09 Image processing system, image processing program, and image processing method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-145511 2019-08-07
JP2019145511 2019-08-07

Publications (1)

Publication Number Publication Date
WO2021024691A1 true WO2021024691A1 (en) 2021-02-11

Family

ID=74502961

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/026880 WO2021024691A1 (en) 2019-08-07 2020-07-09 Image processing system, image processing program, and image processing method

Country Status (2)

Country Link
JP (1) JP7435609B2 (en)
WO (1) WO2021024691A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014010203A1 (en) * 2012-07-13 2014-01-16 日本電気株式会社 Fall detection device, fall detection method, fall detection camera, and computer program
WO2016199749A1 (en) * 2015-06-10 2016-12-15 コニカミノルタ株式会社 Image processing system, image processing device, image processing method, and image processing program
JP2019121045A (en) * 2017-12-28 2019-07-22 コニカミノルタ株式会社 Posture estimation system, behavior estimation system, and posture estimation program


Also Published As

Publication number Publication date
JP7435609B2 (en) 2024-02-21
JPWO2021024691A1 (en) 2021-02-11

Similar Documents

Publication Publication Date Title
JP6137425B2 (en) Image processing system, image processing apparatus, image processing method, and image processing program
US10786183B2 (en) Monitoring assistance system, control method thereof, and program
KR102052883B1 (en) Prediction system using a thermal imagery camera and fall prediction method using a thermal imagery camera
JP7271915B2 (en) Image processing program and image processing device
JP6984712B2 (en) Program of monitored person monitoring system and monitored person monitoring system
JP2022165483A (en) Detecting device, detecting system, detecting method, and detecting program
JP6115692B1 (en) Behavior detecting device, method and program, and monitored person monitoring device
WO2016199504A1 (en) Behavior detection device, behavior detection method, and monitored-person monitoring device
US20180322334A1 (en) Person Monitoring Device And Method, And Person Monitoring System
WO2021033453A1 (en) Image processing system, image processing program, and image processing method
JP7347577B2 (en) Image processing system, image processing program, and image processing method
JP7500929B2 (en) IMAGE PROCESSING SYSTEM, IMAGE PROCESSING PROGRAM, AND IMAGE PROCESSING METHOD
WO2021024691A1 (en) Image processing system, image processing program, and image processing method
JP6870514B2 (en) Watching support system and its control method
WO2021033597A1 (en) Image processing system, image processing program, and image processing method
JP2020013185A (en) Watching device and watching device control program
JPWO2020008726A1 (en) Target object detection program and target object detection device
CN116243306A (en) Method and system for monitoring object
JP5207040B2 (en) Fall detection system
JPWO2016181731A1 (en) Fall detection device, fall detection method and monitored person monitoring device
WO2020008995A1 (en) Image recognition program, image recognition device, learning program, and learning device
JP2020077337A (en) Image monitoring system
JP2021065617A (en) Image processing device and image processing program
JP2024093174A (en) Self-injury detection device, self-injury detection system, self-injury detection program, and self-injury detection method
JP2022126071A (en) Image processing method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20849820

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021537637

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20849820

Country of ref document: EP

Kind code of ref document: A1