WO2022239291A1 - Object Detection Device and Method
- Publication number: WO2022239291A1
- Application number: PCT/JP2021/048247
- Authority: WIPO (PCT)
- Prior art keywords: image data, detection, coordinates, control unit, types
Classifications

- G06T7/246 — Image analysis; Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/70 — Image analysis; Determining position or orientation of objects or cameras
- G06T7/20 — Image analysis; Analysis of motion
- G06T2207/20081 — Indexing scheme for image analysis or image enhancement; Training; Learning
- G06T2207/30241 — Indexing scheme for image analysis or image enhancement; Trajectory

Description
- The present disclosure relates to an object detection device and method.
- Patent Literature 1 discloses an object tracking system that includes a plurality of detection units that detect objects from images captured by a plurality of cameras, and an integrated tracking unit that associates current and past object positions based on the detection results.
- The detection result of each detection unit includes information indicating the coordinate values of the lower end of the object (such as the point where the object touches the ground) and the circumscribed rectangle of the object, in the coordinate system of the corresponding camera's captured image.
- Each detection unit uses camera parameters representing the position, orientation, and the like of each camera, obtained in advance by calibration, to convert the coordinate values on the captured image into coordinates in a common coordinate system defined within the shooting space of the plurality of cameras.
- The integrated tracking unit tracks an object by integrating the coordinate values in the common coordinate system obtained from the plurality of detection units.
- The present disclosure provides an object detection device and method capable of accurately detecting the positions of various objects on an imaging plane imaged by a camera.
- An object detection device according to the present disclosure detects the position of an object on an imaging plane imaged by a camera.
- The object detection device includes an acquisition unit, a control unit, and a storage unit.
- The acquisition unit acquires image data generated by an imaging operation of the camera.
- The control unit computes coordinate transformation from first coordinates according to the image indicated by the image data to second coordinates according to the imaging plane.
- The storage unit stores setting information used for the coordinate transformation.
- The setting information includes a setting value indicating the height from the imaging plane for each type of object among a plurality of types of objects.
- Based on the image data acquired by the acquisition unit, the control unit acquires a detection result that associates the position of the object in the first coordinates with the type of the object determined from among the plurality of types.
- The control unit calculates the position of the object in the second coordinates by computing the coordinate transformation while switching the setting value according to the type of the object in the detection result.
- An object detection device according to another aspect of the present disclosure detects the position of an object on an imaging plane imaged by a camera.
- The object detection device includes an acquisition unit, a control unit, a storage unit, and an information input unit.
- The acquisition unit acquires image data generated by an imaging operation of the camera.
- The control unit computes coordinate transformation from first coordinates according to the image indicated by the image data to second coordinates according to the imaging plane.
- The storage unit stores setting information used for the coordinate transformation.
- The information input unit acquires information through a user's operation.
- The setting information includes a setting value indicating the height from the imaging plane for each type of object among a plurality of types of objects.
- The information input unit acquires a setting value for each of the plurality of types through a user operation for inputting setting values. Based on the image data acquired by the acquisition unit, the control unit acquires a detection result that associates the position of the object in the first coordinates with the type of the object determined from among the plurality of types. The control unit calculates the position of the object in the second coordinates by computing the coordinate transformation according to the setting value acquired by the user operation for the type of each object in the detection result.
- According to the object detection device, method, and system of the present disclosure, it is possible to accurately detect the positions of various objects on the imaging plane imaged by the camera.
- FIG. 1 is a diagram for explaining an object detection system according to Embodiment 1.
- FIG. 2 is a block diagram illustrating the configuration of a terminal device according to Embodiment 1.
- FIG. 3 is a block diagram illustrating the configuration of a flow line extraction server according to Embodiment 1.
- FIG. 4 is a diagram for explaining flow line information in the object detection system.
- FIG. 5 is a diagram for explaining problems in the object detection system.
- FIG. 6 is a flowchart illustrating the basic operation of the flow line extraction server in the object detection system.
- FIG. 7 is a flowchart illustrating position calculation processing in the flow line extraction server of the object detection system according to Embodiment 1.
- FIG. 8 is a diagram for explaining the position calculation processing.
- FIG. 9 is a diagram illustrating the data structure of object feature information in the object detection system of Embodiment 1.
- FIG. 10 is a diagram for explaining the effects of the flow line extraction server.
- FIG. 11 is a flowchart illustrating setting processing in the terminal device according to Embodiment 1.
- FIG. 12 is a diagram showing a display example of a setting screen in the terminal device according to Embodiment 1.
- FIG. 13 is a flowchart illustrating learning processing of an object detection model in the flow line extraction server according to Embodiment 1.
- FIG. 14 is a flowchart illustrating position calculation processing in the object detection system of Embodiment 2.
- FIG. 15 is a diagram for explaining the position calculation processing in the object detection system of Embodiment 2.
- FIG. 16 is a flowchart illustrating position calculation processing in the object detection system of Embodiment 3.
- FIG. 17 is a diagram for explaining the position calculation processing in the object detection system of Embodiment 3.
- FIG. 1 is a diagram showing an outline of an object detection system 1 according to this embodiment.
- An object detection system 1 of the present embodiment includes an omnidirectional camera 2, a terminal device 4, and a flow line extraction server 5, as shown in FIG. 1, for example.
- the flow line extraction server 5 is an example of the object detection device in this embodiment.
- The system 1 can be applied, for example, to detecting the positions of a person 11 and a target object 12 such as cargo in a workplace 6 such as a factory, and analyzing flow lines based on the detected positions.
- The terminal device 4 of the system 1 is used by the user 3, such as a manager of the workplace 6 or a person in charge of data analysis, to analyze flow lines and to perform annotation work for setting information about detection targets in advance.
- Hereinafter, the vertical direction in the workplace 6 will be referred to as the Z direction.
- Two directions perpendicular to each other on a horizontal plane perpendicular to the Z direction are called the X direction and the Y direction, respectively.
- The +Z direction may be referred to as upward, and the −Z direction as downward.
- The horizontal plane of the workplace 6 is an example of an imaging plane that is imaged by the omnidirectional camera 2 in this embodiment.
- FIG. 1 shows an example in which various equipment 20 and the like are installed in the workplace 6 separately from the objects to be detected, such as the person 11 and the target object 12.
- the omnidirectional camera 2 is arranged on the ceiling or the like of the workplace 6 so as to overlook the workplace 6 from above.
- The flow line extraction server 5 detects the positions of the person 11, the target object 12, and the like in the images captured by the omnidirectional camera 2, so that the terminal device 4 can display flow lines on a map of the workplace 6, for example.
- The detection result is associated with a position corresponding to the horizontal plane of the workplace 6.
- The present embodiment provides an object detection apparatus and method capable of accurately detecting the positions of various objects, such as the person 11 and the target object 12 in the workplace 6, in such an object detection system 1.
- the configuration of each part in the system 1 will be described below.
- the omnidirectional camera 2 is an example of a camera in this system 1.
- the omnidirectional camera 2 includes, for example, an optical system such as a fisheye lens, and an imaging device such as a CCD or CMOS image sensor.
- The omnidirectional camera 2 performs an imaging operation according to, for example, a stereographic projection method, and generates image data representing a captured image.
- the omnidirectional camera 2 is connected to the flow line extraction server 5 so that image data is transmitted to the flow line extraction server 5, for example.
- the flow line extraction server 5 is composed of an information processing device such as a computer.
- the terminal device 4 is configured by an information processing device such as a PC (personal computer), for example.
- The terminal device 4 is connected to the flow line extraction server 5 so as to be able to communicate with it via a communication network such as the Internet.
- FIG. 2 is a block diagram illustrating the configuration of the terminal device 4.
- The terminal device 4 illustrated in FIG. 2 includes a control unit 40, a storage unit 41, an operation unit 42, a display unit 43, a device interface 44, and a network interface 45.
- Hereinafter, "interface" is abbreviated as "I/F".
- the control unit 40 includes, for example, a CPU or MPU that cooperates with software to realize predetermined functions.
- the control unit 40 controls the overall operation of the terminal device 4, for example.
- the control unit 40 reads out the data and programs stored in the storage unit 41 and performs various arithmetic processing to realize various functions.
- the above program may be provided from a communication network such as the Internet, or may be stored in a portable recording medium.
- The control unit 40 may be composed of various semiconductor integrated circuits such as a GPU.
- The storage unit 41 is a storage medium that stores programs and data necessary for realizing the functions of the terminal device 4.
- the storage unit 41 includes a storage unit 41a and a temporary storage unit 41b, as shown in FIG.
- the storage unit 41a stores parameters, data, control programs, etc. for realizing predetermined functions.
- the storage unit 41a is composed of, for example, an HDD or an SSD.
- the storage unit 41a stores the above program and the like.
- The storage unit 41a may store image data representing a map of the workplace 6.
- the operation unit 42 is a general term for operation members operated by the user.
- the operation unit 42 may constitute a touch panel together with the display unit 43 .
- the operation unit 42 is not limited to a touch panel, and may be, for example, a keyboard, a touch pad, buttons, switches, and the like.
- the operation unit 42 is an example of an information input unit that acquires information through user's operation.
- the display unit 43 is an example of an output unit configured with, for example, a liquid crystal display or an organic EL display.
- The display unit 43 may display various types of information, such as icons for operating the operation unit 42 and information input from the operation unit 42.
- the network I/F 45 is a circuit for connecting the terminal device 4 to a communication network via a wireless or wired communication line.
- the network I/F 45 performs communication conforming to a predetermined communication standard.
- the predetermined communication standards include communication standards such as IEEE802.3, IEEE802.11a/11b/11g/11ac.
- the network I/F 45 may constitute an acquisition unit that receives various information or an output unit that transmits various information in the terminal device 4 via a communication network.
- the network I/F 45 may be connected to the omnidirectional camera 2 and the flow line extraction server 5 via a communication network.
- FIG. 3 is a block diagram illustrating the configuration of the flow line extraction server 5.
- the flow line extraction server 5 illustrated in FIG. 3 includes a control unit 50, a storage unit 51, a device I/F 54, and a network I/F 55.
- the control unit 50 includes, for example, a CPU or MPU that cooperates with software to realize predetermined functions.
- the control unit 50 controls the overall operation of the flow line extraction server 5, for example.
- the control unit 50 reads data and programs stored in the storage unit 51 and performs various arithmetic processing to realize various functions.
- the control unit 50 includes an object detection unit 71, a coordinate conversion unit 72, and a model learning unit 73 as functional configurations.
- The object detection unit 71 detects objects to be processed, which are set in advance, in the image indicated by the image data, and recognizes the region of each object.
- the detection result by the object detection unit 71 may include, for example, information indicating the time when the region to be processed was recognized.
- the object detection unit 71 is realized, for example, by the control unit 50 reading out and executing the object detection model 70 stored in advance in the storage unit 51 or the like.
- the coordinate transformation unit 72 computes coordinate transformation between predetermined coordinate systems with respect to the position of the region recognized in the image.
- A model learning unit 73 executes machine learning of the object detection model 70. Operations of the various functions of the flow line extraction server 5 will be described later.
- The storage unit 51 is a storage medium that stores programs and data necessary for realizing the functions of the flow line extraction server 5.
- the storage unit 51 includes a storage unit 51a and a temporary storage unit 51b, as shown in FIG.
- the storage unit 51a stores parameters, data, control programs, etc. for realizing predetermined functions.
- the storage unit 51a is composed of, for example, an HDD or an SSD.
- the storage unit 51a stores the above program, the map information D0, the object characteristic information D1, the object detection model 70, and the like.
- the map information D0 indicates the arrangement of various facilities 20 in the workplace 6, for example, in a predetermined coordinate system.
- the object feature information D1 indicates the height feature of an object to be processed by the object detection unit 71, which is set for each type of object. Details of the object feature information D1 will be described later.
- the object detection model 70 is a trained model by a neural network such as a convolutional neural network.
- the object detection model 70 includes various parameters such as weight parameters that indicate learning results.
- the temporary storage unit 51b is composed of a RAM such as DRAM or SRAM, and temporarily stores (that is, retains) data.
- the temporary storage unit 51b holds image data received from the omnidirectional camera 2 and the like.
- The temporary storage unit 51b may function as a work area of the control unit 50, or may be configured as a storage area in the internal memory of the control unit 50.
- The device I/F 54 is a circuit for connecting external devices such as the omnidirectional camera 2 to the flow line extraction server 5.
- The device I/F 54 performs communication according to a predetermined communication standard, like the device I/F 44 of the terminal device 4, for example.
- The device I/F 54 is an example of an acquisition unit that receives image data and the like from the omnidirectional camera 2.
- the device I/F 54 may constitute an output unit for transmitting various information to external devices in the flow line extraction server 5 .
- the network I/F 55 is a circuit for connecting the flow line extraction server 5 to a communication network via a wireless or wired communication line.
- the network I/F 55 performs communication conforming to a predetermined communication standard.
- the network I/F 55 may constitute an acquisition unit that receives various information or an output unit that transmits various information in the flow line extraction server 5 via a communication network.
- the network I/F 55 may be connected to the omnidirectional camera 2 and the terminal device 4 via a communication network.
- the configuration of the terminal device 4 and flow line extraction server 5 as described above is an example, and the configuration is not limited to the above example.
- The object detection method of the present embodiment may be performed by distributed computing.
- the acquisition units in the terminal device 4 and the flow line extraction server 5 may be implemented in cooperation with various software in the control units 40 and 50, respectively.
- Each acquisition unit may acquire information by reading various information stored in various storage media (for example, the storage units 41a and 51a) into the work areas (for example, the temporary storage units 41b and 51b) of the control units 40 and 50, respectively.
- the object detection model 70 may be stored in an external information processing device communicably connected to the flow line extraction server 5 .
- the device I/F 54 and/or the network I/F 55 in the flow line extraction server 5 may constitute an information input unit that acquires information by user's operation.
- The omnidirectional camera 2 performs a moving image capturing operation in the workplace 6, where the person 11, the target object 12, and the like are moving, generates image data representing the captured image for each frame period of the moving image, and transmits the image data to the flow line extraction server 5.
- When the flow line extraction server 5 receives the image data from the omnidirectional camera 2, it inputs the received image data to the object detection model 70, for example, and detects the positions of the person 11, the target object 12, and the like.
- The flow line extraction server 5 repeats, for the positions of the person 11, the target object 12, and the like, the calculation of coordinate transformation from coordinates corresponding to the image indicated by the image data to coordinates corresponding to the horizontal plane of the workplace 6, and thereby generates flow line information.
- the flow line information is, for example, information in which flow lines of the person 11, the object 12, and the like are associated with the map information D0.
- the flow line extraction server 5 transmits the generated flow line information to the terminal device 4, for example.
- the terminal device 4 displays the received flow line information on the display unit 43, for example.
- FIG. 4 shows a display example of flow line information generated by the flow line extraction server 5 based on a captured image of the workplace 6 of FIG. 1.
- In the example of FIG. 4, the flow line F1 of the person 11 and the flow line F2 of the target object 12 are displayed on the display unit 43 of the terminal device 4.
- The flow lines F1 and F2 indicate the trajectories of the map positions m1 and m6 of the person 11 and the target object 12 in the map coordinate system, which are calculated by the flow line extraction server 5.
- the map coordinate system is an example of a coordinate system corresponding to the imaging plane of the omnidirectional camera 2, and indicates the position in the workplace 6, for example, based on the map information D0.
- the map coordinate system includes, for example, Xm coordinates for indicating the position of the workplace 6 in the X direction and Ym coordinates for indicating the position in the Y direction.
- the map position indicates the position of the object in the map coordinate system.
- FIG. 5 is a diagram for explaining problems in the object detection system 1.
- FIG. 5 shows the omnidirectional camera 2, the person 11, and the target object 12 in the workplace 6 as seen from the Y direction.
- FIG. 5(A) shows a scene in which the whole body of the person 11 is reflected in the image captured by the omnidirectional camera 2.
- FIG. 5B shows a scene in which only part of the person 11 appears in the captured image.
- FIG. 5C shows a scene in which an object 12 other than the person 11 appears in the captured image.
- In the scene of FIG. 5(A), the object detection model 70 of the flow line extraction server 5 recognizes the detection area A1 of the whole body of the person 11 in the captured image from the omnidirectional camera 2.
- The detection area A1 indicates the detection result of the position of the whole body by the object detection model 70.
- The flow line extraction server 5 calculates the map position m1 from the detection position indicating the center of the detection area A1 on the captured image.
- The map position m1 is calculated as, for example, the position of the intersection of the horizontal plane 60 and a perpendicular drawn to the horizontal plane 60 of the workplace 6 from the target position c1 corresponding to the detection position of the detection area A1.
- The target position indicates the spatial position in the workplace 6 corresponding to the detected position on the captured image.
- In the scene of FIG. 5(B), the object detection model 70 recognizes the upper body detection area A2 of the person 11.
- In this scene, part of the body of the person 11 is hidden by the equipment 20 of the workplace 6 and is not captured in the captured image.
- The target position c2 of the detection area A2 is therefore above the target position c1 of the whole-body detection area A1 in FIG. 5(A).
- Consequently, if the position calculation is performed in the same manner as described above, the calculated position m2′ deviates from the map position m2 corresponding to the target position c2, as shown in FIG. 5(B).
- In the scene of FIG. 5(C), the object detection model 70 recognizes the detection area A6 of the target object 12.
- The target position c6 of the detection area A6 is also above the target position c1 in the example of FIG. 5(A). Therefore, in this case as well, when the position calculation of the detection area A6 is performed in the same manner as described above, the calculated position m6′ deviates from the map position m6 corresponding to the target position c6, as shown in FIG. 5(C).
- In this way, when the heights of the target positions corresponding to the detection positions differ, there may be a problem that the calculated positions deviate from the map positions m1 to m6 of the respective detection areas A1 to A6.
- Therefore, in the present embodiment, the coordinate transformation in the position calculation is performed using a reference height that is set in the object feature information D1 for each type of processing target of the object detection unit 71. As a result, the map positions m2 and m6 can be calculated with high accuracy even when a detection area of a part of the body of the person 11 is recognized as shown in FIG. 5(B), or when the detection area of the target object 12 is recognized as shown in FIG. 5(C).
- FIG. 6 is a flowchart illustrating the basic operation of the flow line extraction server 5 in the object detection system 1. Each process shown in the flowchart of FIG. 6 is executed by the control unit 50 of the flow line extraction server 5 functioning as the object detection unit 71 and the coordinate conversion unit 72, for example.
- First, the control unit 50 acquires one frame of image data via, for example, the device I/F 54 (S1).
- the device I/F 54 sequentially receives image data of each frame from the omnidirectional camera 2 .
- Next, the control unit 50 functions as the object detection unit 71 and performs image recognition processing for object detection on the image indicated by the acquired image data. The control unit 50 thereby recognizes the detection areas of the person 11 and the target object 12 (S2), acquires the detection result, and holds it in, for example, the temporary storage unit 51b.
- As the detection result, the object detection unit 71 associates each detection area, which indicates a region in the image containing an object to be processed, with one of a plurality of classes set in advance.
- Classes include objects such as, for example, the full body, upper body and head of a person, and cargo.
- the object to be processed by the object detection unit 71 includes not only the entire object but also parts of the object.
- the detection area is defined, for example, by horizontal and vertical positions on the image, and indicates, for example, a rectangular area surrounding an object to be processed (see FIG. 8A).
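- For illustration, such a detection result might be represented as follows; this is a minimal sketch whose class names follow the embodiment, while the type and field names are hypothetical and not taken from the disclosure:

```python
from dataclasses import dataclass

# Classes set in advance as processing targets of the object detection unit 71.
CLASSES = ["whole_body", "upper_body", "head", "object"]

@dataclass
class Detection:
    """One detection area output by object detection (hypothetical layout)."""
    cls: str      # one of CLASSES
    h_min: int    # horizontal pixel range of the rectangular detection area
    h_max: int
    v_min: int    # vertical pixel range of the rectangular detection area
    v_max: int

    def detection_position(self) -> tuple[float, float]:
        """Center of the detection area in image (H, V) coordinates (step S11)."""
        return ((self.h_min + self.h_max) / 2, (self.v_min + self.v_max) / 2)
```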
- The control unit 50 then functions as the coordinate transformation unit 72 and computes coordinate transformation from the image coordinate system to the map coordinate system for the position of each detected object, thereby calculating the position of the object according to the horizontal plane of the workplace 6 (S3).
- the image coordinate system is a two-dimensional coordinate system according to the arrangement of pixels in the image captured by the omnidirectional camera 2 .
- the image coordinate system is an example of a first coordinate system
- the map coordinate system is an example of a second coordinate system.
- In the position calculation process (S3), for example, the reference height set for each class in the object feature information D1, as shown in FIG. 9, is used to calculate the map position of the object.
- the control unit 50 accumulates the calculated map positions in, for example, the temporary storage unit 51b. Details of the position calculation process (S3) will be described later.
- After executing the position calculation process (S3) for the acquired frame, the control unit 50 determines whether image data of the next frame has been received from the omnidirectional camera 2 at, for example, the device I/F 54 (S4). When the next frame is received (YES in S4), the control unit 50 repeats the processing of steps S1 to S3 for that frame.
- When the next frame is no longer received (NO in S4), the control unit 50 generates flow line information based on the accumulated map positions (S5) and then terminates the processing shown in this flowchart.
- According to the above operation, the map position of an object is calculated (S3) based on the detection area (S2) of the object in the captured image from the omnidirectional camera 2, and flow line information of the object moving in the workplace 6 can be obtained (S5).
- Even when a plurality of detection areas are recognized, a map position based on the detection position of each detection area is calculated.
- The flow line information generating process (S5) may be performed not only after the next frame is no longer received (NO in S4), but also each time the processing of a predetermined number of frames (for example, one frame or several frames) has been executed.
- Image data may be acquired not only through the device I/F 54 but also through the network I/F 55.
- In step S1, one frame of image data may also be acquired by reading, from the storage unit 51a, moving image data recorded in advance by the omnidirectional camera 2. In this case, instead of step S4, it is determined whether all frames in the moving image data have been acquired, and steps S1 to S3 are repeated until all frames have been selected.
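- A rough sketch of this per-frame loop (S1 to S5) follows; `detect_objects` and `calc_map_position` are hypothetical stand-ins for the object detection model and the position calculation process, not names from the disclosure:

```python
def run_flow_line_extraction(frames, detect_objects, calc_map_position):
    """Per-frame loop of FIG. 6: acquire (S1), detect (S2), transform (S3),
    repeat while frames arrive (S4), then build flow line information (S5)."""
    accumulated = []
    for image in frames:                      # S1: acquire one frame of image data
        for det in detect_objects(image):     # S2: recognize detection areas/classes
            map_pos = calc_map_position(det)  # S3: coordinate transformation
            accumulated.append((det.cls, map_pos))
    # S5: flow line information, e.g. the map positions in temporal order
    return accumulated
```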
- FIG. 7 is a flowchart illustrating position calculation processing (S3) in the flow line extraction server 5 of the object detection system 1 according to this embodiment.
- FIG. 8 is a diagram for explaining the position calculation process (S3).
- FIG. 9 is a diagram illustrating the data structure of the object feature information D1 in the object detection system 1 of this embodiment.
- FIG. 10 is a diagram for explaining the effects of the flow line extraction server 5.
- First, the control unit 50 calculates the detection position of each detection area recognized in step S2 of FIG. 6 (S11).
- FIG. 8(A) exemplifies the captured image Im indicated by the image data acquired in step S1 of FIG. 6.
- In the example of FIG. 8(A), the detection area A1 of the whole body of the person 11 is recognized in the captured image Im.
- The control unit 50 calculates the detection position C1 of the detection area A1 in the image coordinate system of the captured image Im.
- the image coordinate system includes, for example, H coordinates indicating the horizontal position of the captured image Im and V coordinates indicating the vertical position.
- Next, the control unit 50 refers to, for example, the temporary storage unit 51b, and determines the class of each object according to the class that the object detection unit 71 output in association with the detection area of the object (S12).
- In the example of FIG. 8(A), the class of the object in the detection area A1 is determined to be the whole body of a person.
- After determining the class of each object (S12), the control unit 50 refers to the object feature information D1 and acquires the reference height of each determined class (S13).
- The object feature information D1 illustrated in FIG. 9 manages each "class" set in advance as a processing target of the object detection unit 71 in association with its "reference height".
- The reference height indicates, for example, the vertical distance from the horizontal plane 60 in the workplace 6 to the target position corresponding to the detection position of a detection area.
- In the example of FIG. 8(A), the reference height "H1" corresponding to the "whole body" class is obtained.
- In addition to the whole body, the object feature information D1 illustrated in FIG. 9 stores the reference heights "H2", "H3", and "H6" corresponding to the classes "upper body", "head", and "object", respectively.
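- The object feature information D1 amounts to a lookup table from class to reference height; a minimal sketch follows, in which the numeric heights are invented placeholders for H1, H2, H3, and H6 (the disclosure does not give concrete values):

```python
# Object feature information D1: class -> reference height from the imaging plane.
# The numeric values are placeholders; the disclosure only names H1, H2, H3, H6.
OBJECT_FEATURE_INFO = {
    "whole_body": 900.0,   # H1: e.g. around the body center of a standing person
    "upper_body": 1200.0,  # H2: higher than H1, since only the upper body is visible
    "head": 1600.0,        # H3
    "object": 500.0,       # H6: e.g. cargo
}

def reference_height(cls: str) -> float:
    """Step S13: look up the reference height for the determined class."""
    return OBJECT_FEATURE_INFO[cls]
```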
- Next, the control unit 50 calculates the map position of each corresponding object from the detection position calculated in step S11 (S14).
- the control unit 50 uses the reference height of the class acquired in step S13 and applies, for example, a predetermined arithmetic expression to calculate the coordinate transformation for calculating the map position from the detected position in the image coordinate system.
- the predetermined arithmetic expression is, for example, a transformation equation including inverse transformation of stereographic projection.
- FIG. 8(B) is a diagram for explaining the process of step S14.
- FIG. 8(B) is a view of the workplace 6 when the captured image Im of FIG. 8(A) was captured, viewed from the Y direction, as in FIG. 5(A).
- a target position c1 in FIG. 8(B) indicates a position in the workplace 6 corresponding to the detection position C1 of the detection area A1 in the captured image Im in FIG. 8(A).
- Hereinafter, a case will be described in which the detection position C1 appears, from the image center 30 of the captured image Im, in the direction corresponding to the X direction of the workplace 6.
- In the stereographic projection method, the position y (mm) from the center of the imaging device at which the detection position C1 appears on the imaging device of the omnidirectional camera 2 is expressed by the following equation (1), using the focal length f (mm) and the angle θ1 of incidence from the target position c1: y = 2f · tan(θ1/2) … (1)
- Equation (2) expresses that the ratio of the position y to the radius L (mm) of the imaging device equals the ratio of the distance p1 (pixels) from the image center 30 of the captured image Im, illustrated in FIG. 8(A), to the radius p0 (pixels) indicating the photographable range corresponding to the radius L: y/L = p1/p0 … (2)
- Combining equations (1) and (2) gives the angle θ1 from the distance p1: θ1 = 2 · arctan(L · p1 / (2f · p0)) … (3)
- The distance R1 from the point directly below the omnidirectional camera 2 to the map position m1 is expressed by the following equation (4), based on the height h of the omnidirectional camera 2 from the horizontal plane 60, the reference height H1 of the whole-body class, and the angle θ1: R1 = (h − H1) · tan(θ1) … (4)
- In step S14 of FIG. 7, the control unit 50 calculates the distance R1 from the detection position C1 in the image coordinate system by arithmetic processing based on, for example, equations (3) and (4) above, and thereby calculates the coordinates corresponding to the map position m1 in a coordinate system of the workplace 6 referenced to the omnidirectional camera 2.
- From these coordinates, the control unit 50 can calculate the coordinates of the map position m1 by, for example, a predetermined calculation including an affine transformation.
- the control unit 50 stores, for example, the calculated map position m1 (S14) in the temporary storage unit 51b, and ends the position calculation process (S3 in FIG. 6). After that, the control unit 50 proceeds to step S4, and repeats the above processing at predetermined intervals, for example (S1 to S4).
- In this way, the map position of each object is calculated using the reference height corresponding to its class (S14).
- Accordingly, the map position can be calculated with high accuracy in the object detection system 1, which detects a plurality of types of objects having different heights.
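- Putting equations (1) to (4) together, one possible sketch of the geometric part of the position calculation (S11 to S14) for a stereographic-projection fisheye is shown below; the final affine transformation into the map coordinate system is calibration-dependent and only noted in a comment, and all function and parameter names are illustrative:

```python
import math

def map_position(p_h, p_v, ref_height, center_h, center_v,
                 p0, f_mm, sensor_radius_mm, camera_height_mm):
    """Compute a camera-referenced position on the horizontal plane from a
    detection position (p_h, p_v) in image coordinates, per equations (1)-(4).

    p0: pixel radius of the photographable range; f_mm: focal length f;
    sensor_radius_mm: radius L of the imaging device; camera_height_mm:
    height h of the omnidirectional camera above the horizontal plane 60.
    """
    # Distance p1 (pixels) from the image center 30 to the detection position.
    dh, dv = p_h - center_h, p_v - center_v
    p1 = math.hypot(dh, dv)
    if p1 == 0.0:
        return (0.0, 0.0)  # target directly below the camera
    # Equation (3), from (1) y = 2f*tan(theta/2) and (2) y/L = p1/p0:
    theta = 2.0 * math.atan((sensor_radius_mm * p1) / (2.0 * f_mm * p0))
    # Equation (4): horizontal distance to the target, switching the
    # class-dependent reference height (H1, H2, ... according to the class).
    r = (camera_height_mm - ref_height) * math.tan(theta)
    # Split R into X/Y components along the direction from the image center;
    # a predetermined affine transformation (calibration-dependent, omitted)
    # would then convert this into the map coordinate system.
    return (r * dh / p1, r * dv / p1)
```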
- FIGS. 10(A) and 10(B) show examples in which the map positions m2 and m6 are calculated in the same scenes as FIGS. 5(B) and 5(C), respectively.
- the map position m2 of the upper body of the person 11 is accurately calculated using the reference height H2 of the upper body class.
- the map position m6 of the object 12 is accurately calculated using the reference height H6 of the object class.
- the map positions m1 to m6 based on the respective detection areas A1 to A6 can be obtained with high accuracy.
- the reference height of the object feature information D1 can be set, for example, when the terminal device 4 performs annotation work for creating correct data for the object detection model 70.
- the correct data is data used as a correct answer in the machine learning of the object detection model 70, and includes, for example, image data associated with a correct answer label that defines an area on an image in which an object of each class is shown as the correct answer.
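- One plausible shape for a single record of such annotation information is sketched below (illustrative only; the disclosure does not fix any serialization format or field names):

```python
# One annotation record associating a class with the region in which an object
# of that class appears, for use as correct data in supervised learning of the
# object detection model 70 (field names are invented for illustration).
annotation = {
    "image": "workplace_0001.png",   # captured image acquired in advance
    "labels": [
        {"class": "upper_body",      # class set in S21, e.g. region B1 of FIG. 12
         "region": {"h_min": 120, "h_max": 260, "v_min": 80, "v_max": 310}},
    ],
}
```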
- FIG. 11 is a flowchart illustrating setting processing in the terminal device 4 of this embodiment.
- FIG. 12 is a diagram showing a display example of the setting screen on the terminal device 4. Each process shown in the flowchart of FIG. 11 is executed by, for example, the control unit 40 of the terminal device 4.
- First, the control unit 40 accepts a user operation of entering a class name in the input field 82, for example, adds the class to the object feature information D1, and sets the entered class name (S21).
- The input field 82 is displayed on the display unit 43 in response to a user operation of pressing the add button 81, for example.
- The control unit 40 also accepts a user operation of entering the reference height of each class, and sets the entered reference height in the object feature information D1 (S22).
- In the example of FIG. 12, the classes "whole body" and "upper body" entered in the input field 82 are added to the object feature information D1, and their respective class names are set.
- The control unit 40 repeats the processing of steps S21 to S23 until a user operation to end the class setting, such as pressing the end button 83, is input (NO in S23).
- the control unit 40 receives a user operation for performing annotation work and acquires annotation information (S24). For example, in the input area 84, the control unit 40 displays the captured image Im based on the image data acquired in advance from the omnidirectional camera 2, and receives a user operation to perform annotation work.
- the captured image Im in the input area 84 of FIG. 12 shows an example in which the upper body of the person 21 is shown.
- a user operation is input to draw a region B1 surrounding the upper body of the person 21 in association with the upper body class.
- In step S24, for example, annotation information that associates each class with the area in which an object of that class appears on the captured image is acquired by repeatedly accepting the above user operation for a predetermined number of captured images acquired in advance for creating the correct data.
- After acquiring the annotation information (S24), the control unit 40 transmits the annotation information and the object feature information D1 to the flow line extraction server 5 via, for example, the network I/F 45 (S25). After that, the control unit 40 terminates the processing shown in this flowchart.
- According to the above setting processing, the class name and the reference height in the object feature information D1 are set (S21, S22) and transmitted to the flow line extraction server 5 (S25) together with the acquired annotation information (S24).
- By making the reference height settable together with the class name, for example, the reference height of each class can easily be managed in the object feature information D1 in association with the class to be detected.
- In step S25, each piece of information may instead be stored in the storage unit 41a.
- In this case, the user 3 or the like may read out each piece of information from the storage unit 41a and input it using an operation device or the like connectable to the device I/F 54 of the flow line extraction server 5.
- The setting of the reference height (S22) may be performed not only after step S21 but also, for example, after the annotation information is acquired (S24).
- In the input field 82 of FIG. 12, a user operation of editing the entered reference height may also be accepted.
- The flow line extraction server 5 executes the learning processing of the object detection model 70 as follows.
- the control unit 50 acquires annotation information and object feature information D1 from, for example, the terminal device 4 via the network I/F 55 (S31).
- the network I/F 55 acquires, as the object feature information D1, reference heights for each of a plurality of classes in user operations in annotation work.
- the control unit 50 holds, for example, the annotation information in the temporary storage unit 51b, and stores the object feature information D1 in the storage unit 51a.
- Next, the control unit 50 generates the object detection model 70 by supervised learning using the correct data based on the annotation information (S32).
- the control unit 50 stores the generated object detection model 70 in, for example, the storage unit 51a (S33), and ends the processing shown in this flowchart.
- According to the above learning processing, the object detection model 70 is generated based on the annotation information associated with the classes in the setting processing (FIG. 11). As a result, an object detection model 70 is obtained that can accurately recognize detection areas of the classes desired by the user 3 or the like in images captured by the omnidirectional camera 2.
- the learning process of the object detection model 70 is not limited to the flow line extraction server 5, and may be executed by the control unit 40 in the terminal device 4, for example.
- In this case, the flow line extraction server 5 may acquire the learned object detection model 70 from the terminal device 4 via the device I/F 54 or the like before starting the operation of FIG. 6.
- the learning process may be executed by an information processing device external to the object detection system 1 , and the learned object detection model 70 may be transmitted to the flow line extraction server 5 .
- As described above, the flow line extraction server 5 in the present embodiment is an example of an object detection device that detects the position of an object on the horizontal plane (an example of the imaging plane) of the workplace 6 imaged by the omnidirectional camera 2 (an example of the camera).
- The flow line extraction server 5 includes a device I/F 54 as an example of an acquisition unit, a control unit 50, and a storage unit 51.
- the device I/F 54 acquires image data generated by the imaging operation of the omnidirectional camera 2 (S1).
- Based on the acquired image data, the control unit 50 acquires a detection result that associates the detection position, as an example of the position of the object in the first coordinates, with the class of the object, as an example of the type of the object determined from among a plurality of types (S2).
- the control unit 50 calculates the map positions m1 to m6 as an example of the position of the object on the second coordinates by performing coordinate transformation so as to switch the reference heights H1 to H6 according to the type of the object in the detection result. (S3, S11 to S14).
- According to the above object detection device, the map positions m1 to m6 are calculated by computing the coordinate transformation while switching the reference heights H1 to H6 according to the type of the object. As a result, the positions of various objects can be accurately detected on the imaging plane imaged by the omnidirectional camera 2.
- The plurality of types includes, for example, a type indicating the whole of one object and a type indicating a part of that object, such as the whole body and the upper body of a person, which are examples of the classes.
- The object feature information D1 includes different reference heights H1 and H2 for the whole type and the partial type, respectively.
- the control unit 50 inputs acquired image data to an object detection model 70 that detects objects of a plurality of classes as an example of a plurality of types, and outputs detection results (S2).
- the object detection model 70 is generated by machine learning using correct data that associates image data based on the omnidirectional camera 2 with labels indicating each of a plurality of classes.
- the result of object detection by the object detection model 70 can be output in association with a preset class, and the type of object can be determined based on the class of the detection result (S12).
- the flow line extraction server 5 includes a network I/F 55 as an example of an information input unit that acquires information through user's operation.
- the network I/F 55 acquires reference heights for each of a plurality of classes in user operations in annotation work for creating correct data for the object detection model 70 (S31).
- the object characteristic information D1 may be set by the terminal device 4 operating as an object detection device.
- the operation unit 42 acquires the reference height for each of the plurality of classes in the user's operation in the annotation work (S22).
- The object detection method in this embodiment is a method of detecting the position of an object on the imaging plane imaged by the omnidirectional camera 2.
- In this method, object feature information D1 is stored, which is used for coordinate transformation from first coordinates corresponding to the image indicated by the image data generated by the imaging operation of the omnidirectional camera 2 to second coordinates corresponding to the imaging plane.
- the object feature information D1 includes a reference height indicating the height from the imaging plane for each class of objects in a plurality of classes (one example of types).
- In the present embodiment, a program for causing a computer to execute the object detection method described above is also provided.
- According to the above object detection method and program, the positions of various objects can be accurately detected on the imaging plane imaged by the omnidirectional camera 2.
- the flow line extraction server 5 in this embodiment is an example of an object detection device that detects the position of an object on a horizontal plane (an example of an imaging plane) of the workplace 6 imaged by the omnidirectional camera 2 (an example of a camera).
- the flow line extraction server 5 includes a device I/F 54 as an example of an acquisition unit, a control unit 50, a storage unit 51, and a network I/F 55 as an example of an information input unit.
- the device I/F 54 acquires image data generated by the imaging operation of the omnidirectional camera 2 (S1).
- The control unit 50 computes coordinate transformation from coordinates indicating the detection position in the image coordinate system, as an example of the first coordinates according to the image indicated by the image data, to coordinates indicating the map positions m1 to m6 in the map coordinate system, as an example of the second coordinates according to the imaging plane (S3).
- the storage unit 51 stores object feature information D1 as an example of setting information used for coordinate transformation.
- the network I/F 55 acquires information by user's operation.
- the object characteristic information D1 includes reference heights H1 to H6 as examples of set values indicating heights from the imaging plane for each type of object among a plurality of types of objects.
- When the flow line extraction server 5 of the present embodiment recognizes overlapping detection areas of a plurality of classes in the captured image, it selects one class according to a predetermined priority and calculates the map position using the reference height of the selected class.
- the object feature information D1 includes information indicating priority associated with each class.
- The predetermined priority indicates an order set in advance for the classes to be detected by the object detection model 70, with higher-priority classes coming earlier. In the following, an example will be described in which the priority is set in the order of the whole body (highest), then the upper body, and then the head.
- After determining the class of each object whose detection area was recognized from the detection result based on the image data of one frame (S1 in FIG. 6) (S12), the control unit 50 determines whether overlapping detection areas of a plurality of classes are recognized (S41). In step S41, the control unit 50 determines whether detection areas of a plurality of classes are recognized at the same time and whether those detection areas overlap.
- FIG. 15 is a diagram for explaining position calculation processing in the object detection system 1 of this embodiment.
- FIG. 15 shows an example in which the detection areas A1, A2, and A3 of the whole body, upper body, and head of the person 11 are recognized in the captured image Im.
- the detection areas A1 to A3 are recognized overlapping on the captured image Im.
- When overlapping detection areas of a plurality of classes are recognized (YES in S41), the control unit 50 selects the class with the highest priority among those classes (S42).
- In the example of FIG. 15, the whole-body class, which has the highest priority among the whole-body, upper-body, and head classes, is selected.
- After selecting the class with the highest priority (S42), the control unit 50 acquires the reference height of the class corresponding to the selection result from the object feature information D1 (S13).
- On the other hand, when overlapping detection areas of a plurality of classes are not recognized (NO in S41), the control unit 50 acquires the reference height of the class corresponding to the determination result of step S12 (S13).
- In the present embodiment, the object feature information D1 includes information indicating the priority, as an example of information indicating a predetermined order set for the plurality of classes.
- When detection areas of two or more classes overlap, the control unit 50 selects one class from the two or more classes according to the priority (S42), and calculates the map position of the object of the selected class, as an example of the position of the object of the selected type in the second coordinates (S13-S14).
- A predetermined condition may be set for the determination (S41) of whether overlapping detection areas are recognized. For example, when 90% or more of one of a plurality of detection regions is included in another region, it may be determined that the detection regions overlap (YES in S41).
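- A sketch of the overlap determination and priority-based selection (S41, S42), reusing the `Detection` sketch from Embodiment 1; the 90% containment rule follows the example above, and the helper names are invented:

```python
def overlap_ratio(a, b):
    """Fraction of detection area a contained in detection area b."""
    w = max(0, min(a.h_max, b.h_max) - max(a.h_min, b.h_min))
    h = max(0, min(a.v_max, b.v_max) - max(a.v_min, b.v_min))
    area_a = (a.h_max - a.h_min) * (a.v_max - a.v_min)
    return (w * h) / area_a if area_a > 0 else 0.0

# Predetermined priority: earlier entries have higher priority (whole body first).
PRIORITY = ["whole_body", "upper_body", "head"]

def select_class_by_priority(detections):
    """S41/S42: if detection areas of plural classes overlap (here: one area at
    least 90% contained in another), select the highest-priority class."""
    overlapping = any(
        overlap_ratio(a, b) >= 0.9
        for a in detections for b in detections if a is not b
    )
    if overlapping:
        return min((d.cls for d in detections), key=PRIORITY.index)
    return None  # NO in S41: keep the class determined in S12 for each area
```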
- When the flow line extraction server 5 of the present embodiment recognizes overlapping detection areas of a plurality of classes in the captured image, it chooses the class whose detection area can be regarded as easier to connect as a flow line, based on the detection result from the image data of the immediately preceding frame.
- FIG. 16 is a flowchart illustrating position calculation processing in the object detection system 1 of this embodiment.
- In the position calculation process of the present embodiment, the control unit 50 executes, in addition to the same processes as steps S11 to S14 and S41 to S42 in the position calculation process of the second embodiment (FIG. 14), the processes of steps S51 to S52.
- When the control unit 50 determines that overlapping detection areas have been recognized (YES in S41), it determines whether a detection area of the same class as a current detection area exists in its vicinity on the captured image in the detection result of the previous image recognition processing (S2 in FIG. 6) (S51).
- Specifically, the control unit 50 refers to the previous detection result held in, for example, the temporary storage unit 51b, and determines whether the previous detection result contains a detection area of the same class whose detection position is within a predetermined distance of the current detection position.
- The predetermined distance is set in advance as a distance small enough to be regarded as a neighborhood on the image. For example, the predetermined distance is set according to the size of the detection area so that its H component and V component are about 1/4 to 1/3 of the width and the height of the rectangular detection area, respectively.
- FIG. 17 is a diagram for explaining position calculation processing in the object detection system 1 of this embodiment.
- FIGS. 17A to 17C exemplify captured images Im indicated by image data of three consecutive frames acquired from the omnidirectional camera 2.
- In FIG. 17(A), a part of the body of the person 11 is hidden by the equipment, and the detection area A2 of the upper body is recognized.
- In FIG. 17(B), the person 11 has moved from FIG. 17(A), and the detection area A1 of the whole body and the detection area A2 of the upper body are recognized.
- In FIG. 17(C), the person 11 has moved further from FIG. 17(B), and the detection area A1 of the whole body and the detection area A2 of the upper body are recognized.
- For the captured image Im of FIG. 17(B), it is determined in step S51 whether a detection area of the same class was recognized, in the vicinity of each current detection area A1, A2, in the previous captured image Im of FIG. 17(A). In the examples of FIGS. 17(A) and 17(B), since there is no whole-body detection area in the previous object detection result, the determination in step S51 is "NO".
- In this case (NO in S51), the control unit 50 selects the class of the current detection area closest to the previous detection area (S52).
- In the example of FIG. 17(B), the distances d1 and d2 between the previous detection position C21 of the detection area A2 and the current detection positions C12 and C22 of the detection areas A1 and A2 are compared. Since the distance d2 is smaller than the distance d1, the upper-body class is selected on the basis that, of the current detection areas A1 and A2, the detection area A2 is closest to the previous detection area A2.
- On the other hand, when a detection area of the same class exists in the vicinity in the previous detection result (YES in S51), the control unit 50 selects the class with the highest predetermined priority (S42).
- FIGS. 17(B) and 17(C) show an example in which, for the whole-body detection area A1, the distance d3 between the previous detection position C12 and the current detection position C13 is smaller than the predetermined distance, and, for the upper-body detection area A2, the distance d4 between the previous and current detection positions C22 and C23 is also smaller than the predetermined distance.
- In this case, "YES" is determined in step S51, and in step S42, for example, the whole-body class having the highest preset priority is selected.
- According to the above processing, when no detection area of the same class exists in the vicinity in the previous detection result, the class of the current detection area closest to a previous detection area on the captured image is selected (S51-S52).
- The map position is thus calculated using the reference height of the class detected closest to the previous detection result, that is, the class that can be regarded as most easily connected as a flow line (S14).
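- A sketch of the neighborhood check and class selection (S51, S52), reusing `Detection` and `PRIORITY` from the earlier sketches; the distance threshold is passed in, since the disclosure sets it relative to the size of the detection area:

```python
import math

def _distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def select_class_with_history(current, previous, threshold):
    """S51/S52 for overlapping current detection areas: if the previous frame
    has a same-class detection area within `threshold` of a current one, fall
    back to the priority rule (S42); otherwise select the class of the current
    area closest to any previous detection area."""
    same_class_nearby = any(
        prev.cls == cur.cls and
        _distance(prev.detection_position(), cur.detection_position()) < threshold
        for cur in current for prev in previous
    )
    if same_class_nearby:  # YES in S51
        return min((d.cls for d in current), key=PRIORITY.index)
    # NO in S51: e.g. d2 < d1 in FIG. 17(B) selects the upper-body class.
    closest = min(
        ((cur, prev) for cur in current for prev in previous),
        key=lambda pair: _distance(pair[0].detection_position(),
                                   pair[1].detection_position()),
    )
    return closest[0].cls
```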
- In step S51 of FIG. 16, it may be determined, for each current detection area, whether a detection area exists in its vicinity on the captured image in the previous detection result, regardless of class.
- In this case, the class of the current detection area closest to the previous detection area may be selected (S52).
- Alternatively, the class with the highest priority may be selected from the current detection result (S42).
- the class may be selected based on information other than the priority. For example, information that associates the layout of various facilities 20 based on the map information of the workplace 6 with the image coordinate system may be used. For example, based on the information, the upper body or full body class may be selected depending on whether the detection position of the detection region in the captured image is within a predetermined range considered to be near the equipment 20 of the workplace 6.
- As described above, in the present embodiment, the control unit 50 calculates the position of the object in the second coordinates for each piece of image data sequentially acquired by the device I/F 54, and generates flow line information including the map positions in order (S1 to S5).
- When detection areas of two or more classes overlap, one class is selected from the two or more classes based on the previous detection result (S51-S52), and the map position of the object of the selected class is calculated, as an example of the position of the object of the selected type in the second coordinates (S13-S14).
- Embodiments 1 to 3 have been described above as examples of the technology disclosed in the present application.
- However, the technology of the present disclosure is not limited thereto and can also be applied to embodiments in which changes, substitutions, additions, omissions, and the like are made as appropriate.
- In the above embodiments, the detection targets of the object detection model 70 are the whole body and upper body of a person and objects such as cargo; however, the detection targets are not limited to these.
- For example, the detection targets of the object detection model 70 may include persons and vehicles.
- In this case, the priority may be set such that the person ranks next after the vehicle.
- When a person and a vehicle are detected overlapping each other, the map position is calculated using the reference height of the vehicle class. In this way, the position can be calculated accurately based on the detection result, according to a priority suited to the application of the object detection system 1.
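- A minimal sketch of this priority-based selection (step S42), assuming the vehicle class is set above the person class; in the embodiments the order is a preset parameter, so this ordering is an example only.

```python
PRIORITY = ["vehicle", "person"]  # assumed order: vehicle ranks above person

def select_by_priority(detected_classes):
    """Return the highest-priority class among overlapping detections (S42)."""
    for cls in PRIORITY:
        if cls in detected_classes:
            return cls
    return None

print(select_by_priority({"person", "vehicle"}))  # -> "vehicle"
```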
- Furthermore, the current detection result may be compared with detection results based on the image data of the immediately preceding and following frames, and a class that can be regarded as likely to connect to the flow line may be selected.
- In this case, image data of a plurality of consecutive frames may be acquired in step S1 of FIG. 6.
- The number of omnidirectional cameras 2 in the object detection system 1 is not limited to one and may be plural.
- In this case, the flow line extraction server 5 may perform the operation of FIG. 6 on the image data from each omnidirectional camera 2.
- In step S3 of FIG. 6, an example of calculating the map position as a position corresponding to the horizontal plane 60 of the workplace 6 based on the detection result has been described.
- However, a coordinate system other than the map coordinate system may be used. For example, before conversion into the map coordinate system, the position based on the detection result may be calculated using a coordinate system indicating the position on the horizontal plane 60 according to the omnidirectional camera 2. In this case, the calculated position may be converted into the map coordinate system, for example, in step S5 of FIG. 6.
- When a plurality of omnidirectional cameras are used, in step S3, for example, the positions of the detection results based on the respective omnidirectional cameras may be calculated so as to be aligned by a coordinate transformation corresponding to each omnidirectional camera.
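- As one illustration, such a per-camera alignment could be a 2D rigid transform, as sketched below; the rotation angle and translation are assumed calibration values for each camera, not values from the disclosure.

```python
import numpy as np

def align_to_common_frame(xy_cam, theta, tx, ty):
    """Map a floor position from one camera's frame into the shared frame
    using a per-camera rotation (theta) and translation (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, -s], [s, c]])
    return rotation @ np.asarray(xy_cam, dtype=float) + np.array([tx, ty])

# Example: a position from one camera, rotated 90 degrees and shifted.
print(align_to_common_frame((1.0, 0.0), np.pi / 2, 5.0, 3.0))  # ~ [5. 4.]
```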
- The position of the detection region is not limited to the detection position; for example, the midpoint of one side of the detection region may be used.
- Alternatively, the position of the detection region may be the positions of a plurality of points, or the centroid of a non-rectangular region.
- Moreover, the setting of the reference heights is not limited to the above.
- For example, in the flow line extraction server 5, the reference heights may be set together with the various parameters related to the coordinate conversion from the image coordinate system to the map coordinate system, after the generation of the object detection model 70 and before the start of the basic operation (FIG. 6).
- In this case, the flow line extraction server 5 of the present embodiment sets the reference heights according to a user operation of inputting the reference height for each class from the terminal device 4, or from an external operation device via the device I/F 54, for example.
- In the above embodiments, the detection targets of the object detection model 70 include classes corresponding to parts of an object, such as the upper body of a person; however, the detection targets need not include such part classes.
- For example, the flow line extraction server 5 may include, in addition to the object detection model 70, a detection model for detecting the upper body and a detection model for detecting the head, and the upper-body and head detection models may be applied to the region detected by the object detection model 70. In this case, based on the detection results of the respective detection models, the type of the object, such as whole body, upper body, or head, can be determined in place of the class determination in step S12, and the position can be calculated.
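- A hedged sketch of this cascaded arrangement, assuming detector callables that return lists of (x0, y0, x1, y1) boxes on a numpy-style image; the labelling rule is one plausible reading of the passage above, not the embodiments' exact logic.

```python
def classify_by_part_detectors(image, whole_body_det, upper_body_det, head_det):
    """Run the upper-body and head detection models inside each whole-body
    region and label the region by the parts actually found."""
    results = []
    for (x0, y0, x1, y1) in whole_body_det(image):
        crop = image[y0:y1, x0:x1]            # region detected as a person
        upper_found = len(upper_body_det(crop)) > 0
        head_found = len(head_det(crop)) > 0
        if upper_found and head_found:
            cls = "whole_body"                # all parts visible
        elif upper_found:
            cls = "upper_body"
        elif head_found:
            cls = "head"
        else:
            cls = None                        # inconclusive detection
        results.append(((x0, y0, x1, y1), cls))
    return results
```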
- Even in this case, each part can be determined based on the captured image of the workplace 6 by the processing in step S3, and the position can be calculated accurately.
- In the above, the flow line extraction server 5 using the upper-body and head detection models as the targets for calculating the map position has been described; however, other methods may be used.
- For example, any method that can determine the type of the object reflected in the captured image, such as whole body, upper body, or head, may be used.
- In this case, the control unit 50 recognizes the whole-body region of a person as an example of a region in which the entirety of one object is detected in the image indicated by the acquired image data.
- The control unit 50 then recognizes the upper-body and head regions as examples of regions in which one or more parts of the one object are detected within the recognized entire region, and determines the class, as an example of the type of the object, based on the recognition results for the one or more parts.
- Alternatively, a technique of skeleton detection or posture estimation may be applied to the captured image to detect the person, and each part of the body may be determined as the object type.
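- As an illustration only, the following sketch assumes a generic pose estimator returning per-keypoint confidences keyed by COCO-style names; the disclosure names no specific estimator, so the keypoint names and threshold are assumptions.

```python
def visible_extent(keypoint_conf, thresh=0.5):
    """Decide which extent of the body is visible from keypoint confidences."""
    def seen(*names):
        return all(keypoint_conf.get(n, 0.0) >= thresh for n in names)
    if seen("nose", "left_ankle", "right_ankle"):
        return "whole_body"   # visible from head through feet
    if seen("nose", "left_hip", "right_hip"):
        return "upper_body"   # visible down to the hips
    if seen("nose"):
        return "head"
    return None
```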
- In the above embodiments, the object detection unit 71 outputs the detection result by associating the detection region with the class; however, the detection result is not limited to this.
- For example, a detection region defined by its position and size on the image may be output as the detection result, regardless of the class.
- In this case, the type of the object may be determined based on the position and size of the detection region instead of the class.
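- One conceivable heuristic for such class-agnostic output is sketched below; the aspect-ratio threshold is an illustrative guess and does not come from the disclosure.

```python
def type_from_box(x0, y0, x1, y1):
    """Infer the object type from the detection region's geometry alone."""
    width, height = max(x1 - x0, 1), max(y1 - y0, 1)
    # Tall, narrow regions suggest a standing whole body; squat regions
    # suggest only the upper body is visible (assumed rule).
    return "whole_body" if height / width > 1.5 else "upper_body"
```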
- In the above embodiments, the flow line extraction server 5 has been described as an example of the object detection device.
- Alternatively, the terminal device 4 may be configured as the object detection device, and the various operations of the object detection device may be executed by the control unit 40.
- In the above embodiments, the omnidirectional camera 2 has been described as an example of the camera in the object detection system 1.
- However, the object detection system 1 is not limited to the omnidirectional camera 2 and may include various cameras.
- For example, the camera of the system 1 may be any of various imaging devices employing various projection methods, such as orthographic projection, equidistant projection, or equisolid-angle projection.
- Moreover, the site to which the object detection system 1 and the flow line extraction server 5 are applied is not limited to the workplace 6 and may be any of various sites, such as a distribution warehouse or the sales floor of a store.
- The present disclosure is applicable to various object detection devices that detect the positions of multiple types of objects using a camera, such as flow line detection devices, monitoring devices, and tracking devices.
Description
An object detection system according to Embodiment 1 will be described with reference to FIG. 1. FIG. 1 is a diagram showing an overview of the object detection system 1 according to the present embodiment.
As shown in FIG. 1, for example, the object detection system 1 of the present embodiment includes an omnidirectional camera 2, a terminal device 4, and a flow line extraction server 5. The flow line extraction server 5 is an example of the object detection device in the present embodiment. The system 1 is applicable, for example, in a workplace 6 such as a factory, to detecting the positions of a person 11 and a target object 12 such as cargo, and to analyzing flow lines based on the detected positions. The terminal device 4 of the system 1 is used by a user 3, such as a manager of the workplace 6 or a person in charge of data analysis, to analyze flow lines and to perform annotation work for setting information on detection targets in advance.
FIG. 2 is a block diagram illustrating the configuration of the terminal device 4. The terminal device 4 illustrated in FIG. 2 includes a control unit 40, a storage unit 41, an operation unit 42, a display unit 43, a device interface 44, and a network interface 45. Hereinafter, "interface" is abbreviated as "I/F".
FIG. 3 is a block diagram illustrating the configuration of the flow line extraction server 5. The flow line extraction server 5 illustrated in FIG. 3 includes a control unit 50, a storage unit 51, a device I/F 54, and a network I/F 55.
The operations of the object detection system 1, the flow line extraction server 5, and the terminal device 4 configured as described above will be explained below.
A scene that poses a problem when extracting flow lines such as F1 and F2 described above will be explained with reference to FIG. 5.
The basic operation of the flow line extraction server 5 in the system 1 will be explained below with reference to FIG. 6.
Details of the position calculation processing in step S3 of FIG. 6 will be explained with reference to FIGS. 7 to 10.
Setting processing related to the setting of the reference height for each class as described above will be explained with reference to FIGS. 11 and 12.
Learning processing for generating the object detection model 70 based on the annotation information acquired as described above will be explained with reference to FIG. 13. In the object detection system 1 of the present embodiment, the learning processing of the object detection model 70 is executed, for example, in the flow line extraction server 5.
As described above, the flow line extraction server 5 in the present embodiment is an example of an object detection device that detects the position of an object on the horizontal plane (an example of the imaging plane) of the workplace 6 imaged by the omnidirectional camera 2 (an example of the camera). The flow line extraction server 5 includes the device I/F 54 as an example of the acquisition unit, the control unit 50, and the storage unit 51. The device I/F 54 acquires image data generated by the imaging operation of the omnidirectional camera 2 (S1). With respect to the position of the object, the control unit 50 computes a coordinate transformation from coordinates indicating the detection position in the image coordinate system, as an example of first coordinates corresponding to the image indicated by the image data, to coordinates indicating the map positions m1 to m6 in the map coordinate system, as an example of second coordinates corresponding to the imaging plane (S3). The storage unit 51 stores object feature information D1 as an example of the setting information used for the coordinate transformation. The object feature information D1 includes reference heights H1 to H6 as examples of setting values indicating heights from the imaging plane for each of a plurality of types of objects. Based on the image data acquired by the device I/F 54, the control unit 50 acquires a detection result that associates the detection position, as an example of the position of the object in the first coordinates, with the class of the object, as an example of the type of the object determined from among the plurality of types (S2). The control unit 50 computes the coordinate transformation while switching among the reference heights H1 to H6 according to the type of the object in the detection result, to calculate the map positions m1 to m6 as examples of the position of the object in the second coordinates (S3, S11 to S14).
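As an illustration of this class-dependent coordinate conversion, the following minimal sketch assumes a ceiling-mounted camera pointing straight down and a simple pinhole model; the camera height, focal scale, and per-class reference heights below are placeholder assumptions standing in for the calibrated parameters and the object feature information D1, not values from the disclosure.

```python
CAMERA_HEIGHT_M = 3.0       # assumed camera mounting height above the floor
FOCAL_SCALE_PX = 500.0      # assumed pixels per unit for the pinhole model
REFERENCE_HEIGHT_M = {"whole_body": 0.8, "upper_body": 1.2, "cargo": 0.5}

def detection_to_map(det_x, det_y, cls, image_center=(960.0, 960.0)):
    """Convert a detection position in image coordinates (S2) to a map
    position on the floor plane, switching the reference height by the
    detected class (S3, S11-S14)."""
    h = REFERENCE_HEIGHT_M[cls]
    dx, dy = det_x - image_center[0], det_y - image_center[1]
    # Similar triangles: a point at height h and horizontal offset d from the
    # camera axis images at d * FOCAL_SCALE_PX / (CAMERA_HEIGHT_M - h) pixels,
    # so one pixel corresponds to (CAMERA_HEIGHT_M - h) / FOCAL_SCALE_PX metres.
    metres_per_px = (CAMERA_HEIGHT_M - h) / FOCAL_SCALE_PX
    return dx * metres_per_px, dy * metres_per_px

# The same pixel position maps to different floor positions per class:
print(detection_to_map(1460.0, 960.0, "whole_body"))  # ~ (2.2, 0.0)
print(detection_to_map(1460.0, 960.0, "upper_body"))  # ~ (1.8, 0.0)
```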
In Embodiment 1, the flow line extraction server 5 that calculates the map position using the reference height of the class determined according to the object detection result was described. In Embodiment 2, the flow line extraction server 5 in the object detection system 1 that, when detection regions of a plurality of classes are recognized overlapping one another, calculates the map position using the reference height of the class according to a predetermined priority will be described.
In Embodiment 2, the flow line extraction server 5 that calculates the map position according to a preset priority when a plurality of overlapping detection regions are recognized was described. In Embodiment 3, the flow line extraction server 5 in the object detection system 1 that, when a plurality of overlapping detection regions are recognized, calculates the map position based on the relationship with the flow line of the object corresponding to the detection regions will be described.
Embodiments 1 to 3 have been described above as examples of the technology disclosed in the present application. However, the technology of the present disclosure is not limited thereto and is also applicable to embodiments in which changes, substitutions, additions, omissions, and the like are made as appropriate. It is also possible to combine the constituent elements described in the above embodiments to form a new embodiment. Accordingly, other embodiments are exemplified below.
Claims (10)
- An object detection device that detects a position of an object on an imaging plane imaged by a camera, the object detection device comprising: an acquisition unit that acquires image data generated by an imaging operation of the camera; a control unit that computes, with respect to the position of the object, a coordinate transformation from first coordinates corresponding to an image indicated by the image data to second coordinates corresponding to the imaging plane; and a storage unit that stores setting information used for the coordinate transformation, wherein the setting information includes, for each of a plurality of types of objects, a setting value indicating a height from the imaging plane, and the control unit acquires, based on the image data acquired by the acquisition unit, a detection result that associates the position of the object in the first coordinates with the type of the object determined from among the plurality of types, and computes the coordinate transformation while switching the setting value according to the type of the object in the detection result, to calculate the position of the object in the second coordinates.
- The object detection device according to claim 1, wherein the plurality of types include a type indicating an entirety of one object and a type indicating a part of the object, and the setting information includes a different setting value for each of the entire type and the partial type.
- The object detection device according to claim 1 or 2, wherein the control unit inputs the acquired image data to an object detection model that detects the plurality of types of objects and outputs the detection result, and the object detection model is generated by machine learning using ground-truth data that associates image data based on the camera with labels indicating each of the plurality of types.
- The object detection device according to claim 3, further comprising an information input unit that acquires information through a user operation, wherein the information input unit acquires the setting value for each of the plurality of types through a user operation during annotation work for creating the ground-truth data.
- The object detection device according to any one of claims 1 to 4, wherein the setting information includes information indicating a predetermined order set for the plurality of types, and the control unit, when two or more types of objects among the plurality of types are detected overlapping one another in an image indicated by acquired image data, selects one type from the two or more types according to the predetermined order and calculates the position of the object of the selected type in the second coordinates.
- The object detection device according to any one of claims 1 to 5, wherein the control unit generates, based on image data sequentially acquired by the acquisition unit, flow line information that includes in order the position of the object in the second coordinates for each piece of image data, and the control unit, when two or more types of objects among the plurality of types are detected overlapping one another in an image indicated by newly acquired image data, selects one type from the two or more types of objects based on the positions included in the flow line information and calculates the position of the object of the selected type in the second coordinates.
- The object detection device according to claim 2, wherein the control unit recognizes a region in which the entirety of the one object is detected in the image indicated by the acquired image data, recognizes a region in which one or more parts of the one object are detected within the recognized entire region, and determines the type of the object based on a recognition result for the region of the one or more parts.
- An object detection method for detecting a position of an object on an imaging plane imaged by a camera, wherein a storage unit of a computer stores setting information used for a coordinate transformation, with respect to the position of the object, from first coordinates corresponding to an image indicated by image data generated by an imaging operation of the camera to second coordinates corresponding to the imaging plane, the setting information including, for each of a plurality of types of objects, a setting value indicating a height from the imaging plane, and a control unit of the computer performs: a step of acquiring the image data; a step of acquiring, based on the acquired image data, a detection result that associates the position of the object in the first coordinates with the type of the object determined from among the plurality of types; and a step of computing the coordinate transformation while switching the setting value according to the type of the object in the detection result, to calculate the position of the object in the second coordinates.
- A program for causing a computer to execute the object detection method according to claim 8.
- An object detection device that detects a position of an object on an imaging plane imaged by a camera, the object detection device comprising: an acquisition unit that acquires image data generated by an imaging operation of the camera; a control unit that computes, with respect to the position of the object, a coordinate transformation from first coordinates corresponding to an image indicated by the image data to second coordinates corresponding to the imaging plane; a storage unit that stores setting information used for the coordinate transformation; and an information input unit that acquires information through a user operation, wherein the setting information includes, for each of a plurality of types of objects, a setting value indicating a height from the imaging plane, the information input unit acquires the setting value for each of the plurality of types through a user operation of inputting the setting values, and the control unit acquires, based on the image data acquired by the acquisition unit, a detection result that associates the position of the object in the first coordinates with the type of the object determined from among the plurality of types, and computes, for each type of the object in the detection result, the coordinate transformation according to the setting value acquired through the user operation, to calculate the position of the object in the second coordinates.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023520761A JPWO2022239291A1 (ja) | 2021-05-13 | 2021-12-24 | |
CN202180098118.0A CN117296079A (zh) | 2021-05-13 | 2021-12-24 | 物体探测装置以及方法 |
US18/383,518 US20240070894A1 (en) | 2021-05-13 | 2023-10-25 | Object detection device and method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021081787 | 2021-05-13 | ||
JP2021-081787 | 2021-05-13 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/383,518 Continuation US20240070894A1 (en) | 2021-05-13 | 2023-10-25 | Object detection device and method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022239291A1 (ja) | 2022-11-17 |
Family
ID=84028106
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/048247 WO2022239291A1 (ja) | 2021-05-13 | 2021-12-24 | 物体検知装置及び方法 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240070894A1 (ja) |
JP (1) | JPWO2022239291A1 (ja) |
CN (1) | CN117296079A (ja) |
WO (1) | WO2022239291A1 (ja) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013162329A (ja) * | 2012-02-06 | 2013-08-19 | Sony Corp | 画像処理装置、画像処理方法、プログラム、及び記録媒体 |
JP2017117244A (ja) * | 2015-12-24 | 2017-06-29 | Kddi株式会社 | カメラの撮影画像に映る人物を検出する画像解析装置、プログラム及び方法 |
JP2020149111A (ja) * | 2019-03-11 | 2020-09-17 | オムロン株式会社 | 物体追跡装置および物体追跡方法 |
JP2020173504A (ja) * | 2019-04-08 | 2020-10-22 | 清水建設株式会社 | 位置推定システム、位置推定装置、位置推定方法、及びプログラム |
Also Published As
Publication number | Publication date |
---|---|
CN117296079A (zh) | 2023-12-26 |
US20240070894A1 (en) | 2024-02-29 |
JPWO2022239291A1 (ja) | 2022-11-17 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21942019; Country of ref document: EP; Kind code of ref document: A1
| WWE | Wipo information: entry into national phase | Ref document number: 2023520761; Country of ref document: JP
| WWE | Wipo information: entry into national phase | Ref document number: 202180098118.0; Country of ref document: CN
| NENP | Non-entry into the national phase | Ref country code: DE
| 122 | Ep: pct application non-entry in european phase | Ref document number: 21942019; Country of ref document: EP; Kind code of ref document: A1