WO2019138834A1 - Information processing device, information processing method, program, and system - Google Patents

Information processing device, information processing method, program, and system

Info

Publication number
WO2019138834A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
unit
agv
orientation
imaging unit
Prior art date
Application number
PCT/JP2018/047022
Other languages
French (fr)
Japanese (ja)
Inventor
誠 冨岡
鈴木 雅博
小林 俊広
片山 昭宏
藤木 真和
小林 一彦
小竹 大輔
修一 三瓶
智行 上野
知弥子 中島
聡美 永島
Original Assignee
キヤノン株式会社 (Canon Inc.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by キヤノン株式会社 (Canon Inc.)
Publication of WO2019138834A1



Classifications

    • G01B 11/10: Measuring arrangements characterised by the use of optical techniques for measuring diameters of objects while moving
    • G01B 11/24: Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures
    • G01B 11/245: Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures using a plurality of fixed, simultaneously operating transducers
    • G01B 11/25: Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures by projecting a pattern, e.g. one or more lines, moiré fringes on the object
    • G01B 11/26: Measuring arrangements characterised by the use of optical techniques for measuring angles or tapers; for testing the alignment of axes
    • G01C 11/14: Photogrammetry or videogrammetry; interpretation of pictures by comparison of two or more pictures of the same area, the pictures being supported in the same relative position as when they were taken, with optical projection
    • G05D 1/02: Control of position, course or altitude of land, water, air, or space vehicles; control of position or course in two dimensions
    • G06T 7/70: Image analysis; determining position or orientation of objects or cameras
    • G08G 1/16: Traffic control systems for road vehicles; anti-collision systems

Definitions

  • The present invention relates to technology for performing movement control of a moving body.
  • It concerns moving bodies such as automated guided vehicles (AGVs) and autonomous mobile robots (AMRs).
  • Conventionally, as in Patent Document 1, a tape is attached to the floor, and the moving body travels while detecting the tape with a sensor mounted on the moving body.
  • In Patent Document 1, however, the tape must be re-applied every time the travel route is changed due to a layout change of objects in the environment in which the moving body travels, which takes time and effort. It is therefore required to reduce this effort and to run the moving body stably.
  • The present invention has been made in view of the above problems, and an object thereof is to provide an information processing apparatus that stably performs movement control of a moving body. It also aims to provide a corresponding method and program.
  • An information processing apparatus has the following configuration.
  • Input means for receiving an input of image information acquired by an imaging unit that is mounted on a moving body and in which each light receiving unit on the imaging element is configured by two or more light receiving elements; holding means for holding map information; acquisition means for acquiring the position and orientation of the imaging unit based on the image information and the map information; and control means for obtaining a control value for controlling the movement of the moving body based on the position and orientation acquired by the acquisition means.
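The configuration above can be summarized in a short sketch. The following Python skeleton is purely illustrative (class and method names are assumptions, not taken from the patent); it only shows how the input, holding, acquisition/calculation, and control means described in the embodiments fit together.

```python
# Illustrative skeleton only; names and interfaces are assumptions.
class InformationProcessingApparatus:
    def __init__(self, holding_unit, calculation_unit, control_unit):
        self.holding = holding_unit      # holds map information and target pose
        self.calc = calculation_unit     # estimates pose from image info + map
        self.control = control_unit      # turns pose into actuator commands

    def step(self, image_info):
        """One cycle: image information in, control value out."""
        pose = self.calc.estimate_pose(image_info, self.holding.map_info)
        return self.control.compute(pose, self.holding.target_pose)
```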
  • According to the present invention, movement control of the moving body can be performed stably.
  • FIG. 2 is a diagram for explaining a system configuration in the first embodiment.
  • FIG. 2 is a diagram for explaining a functional configuration in the first embodiment.
  • FIG. 7 is a diagram for explaining an imaging element D150 included in the imaging unit 110.
  • FIG. 7 is a diagram for explaining an imaging element D150 included in the imaging unit 110.
  • FIG. 7 is a diagram for explaining an imaging element D150 included in the imaging unit 110.
  • FIG. 6 is a view showing an example of images D154a to D154d captured by the imaging unit 110.
  • 3 is a flowchart showing the flow of processing of the device of the first embodiment.
  • FIG. 2 is a diagram showing a hardware configuration of the device of Embodiment 1.
  • 10 is a flowchart showing a procedure of correction processing of visual information using motion stereo in the second embodiment.
  • FIG. 7 is a diagram for explaining a functional configuration in a third embodiment.
  • FIG. 13 is a diagram for explaining a functional configuration in a fourth embodiment.
  • 16 is a flowchart illustrating a procedure of correction processing of visual information using a measurement result of the three-dimensional measurement device in the fourth embodiment.
  • 16 is a flowchart showing a processing procedure of object detection and calculation of position and orientation in the fifth embodiment.
  • 16 is a flowchart showing a processing procedure of semantic area division of visual information in the sixth embodiment.
  • FIG. 18 is a diagram for explaining a functional configuration in an eighth embodiment.
  • A flowchart showing the flow of processing of the device of the eighth embodiment.
  • In Embodiment 1, movement control of a moving body referred to as an automated guided vehicle (AGV) or an autonomous mobile robot (AMR) will be described.
  • FIG. 1 shows a system configuration diagram in the present embodiment.
  • The information processing system 1 in the present embodiment includes a plurality of mobile bodies 12 (12-1, 12-2, ...), a process management system 14, and a mobile management system 13.
  • The information processing system 1 is, for example, a distribution system or a production system.
  • The plurality of mobile bodies 12 (12-1, 12-2, ...) are transport vehicles (AGVs) that transport objects in accordance with the process schedule determined by the process management system.
  • a plurality of mobile units move (run) within the environment.
  • The process management system 14 manages the processes performed by the information processing system; for example, it is an MES (Manufacturing Execution System) that manages processes in a factory or a distribution warehouse. It communicates with the mobile management system 13.
  • The mobile management system 13 is a system that manages the mobile bodies. It communicates with the process management system 14. In addition, it communicates with the mobile bodies (for example, via Wi-Fi), and operation information is transmitted and received bidirectionally.
  • FIG. 2 is a diagram showing an example of a hardware configuration of the mobile unit 12 including the information processing apparatus 10 in the present embodiment.
  • the information processing apparatus 10 includes an input unit 1110, a calculation unit 1120, a holding unit 1130, and a control unit 1140.
  • the input unit 1110 is connected to the imaging unit 110 mounted on the moving body 12.
  • the controller 1140 is connected to the actuator 120.
  • A communication device (not shown) communicates information bidirectionally with the mobile management system 13, and inputs and outputs data to and from the various units of the information processing apparatus 10.
  • FIG. 2 is an example of a device configuration.
  • FIG. 3 is a diagram for explaining the imaging device D150 provided in the imaging unit 110.
  • the imaging unit 110 internally includes an imaging device D150.
  • a large number of light receiving units D151 are arranged in a lattice shape inside the imaging device D150.
  • FIG. 3A shows four light receiving units.
  • A microlens D153 is provided on the upper surface of each light receiving unit D151 so that light can be collected efficiently.
  • A conventional imaging device includes one light receiving element per light receiving unit D151.
  • In contrast, in the imaging device D150, each light receiving unit D151 includes a plurality of light receiving elements D152.
  • FIG. 3B shows one light receiving unit D151 as viewed from the side.
  • Two light receiving elements D152a and D152b are provided in one light receiving unit D151.
  • The individual light receiving elements D152a and D152b are independent of each other: the charge accumulated in the light receiving element D152a does not move to the light receiving element D152b, and conversely, the charge accumulated in the light receiving element D152b does not move to the light receiving element D152a. Therefore, in FIG. 3B, the light receiving element D152a receives the light flux incident from the right side of the microlens D153, while the light receiving element D152b receives the light flux incident from the left side of the microlens D153.
  • The imaging unit 110 can select only the charge accumulated in the light receiving element D152a to generate the image D154a. At the same time, the imaging unit 110 can select only the charge accumulated in the light receiving element D152b to generate the image D154b.
  • The image D154a is generated by selecting only the light from the right side of the microlens D153, and the image D154b is generated by selecting only the light from the left side of the microlens D153. Therefore, as shown in the figure, they are images captured from different viewpoints.
  • Furthermore, the imaging unit 110 forms an image from each light receiving unit D151 using the charges accumulated in both of the light receiving elements D152a and D152b.
  • As a result, an image D154e (not shown), which is an image captured from a certain viewpoint, is obtained.
  • In other words, the imaging unit 110 can simultaneously capture the images D154a and D154b, which have different shooting viewpoints, and the conventional image D154e according to the principle described above.
  • each light receiving unit D151 may include more light receiving elements D152, and an arbitrary number of light receiving elements D152 can be set.
  • FIG. 3C shows an example in which four light receiving elements D152a to D152d are provided inside the light receiving part D151.
  • The imaging unit 110 can perform a corresponding point search on the pair of images D154a and D154b to calculate a parallax image D154f (not illustrated), and can further calculate the three-dimensional shape of an object by a stereo method based on the parallax image.
  • Corresponding point search and stereo methods are known techniques, and various methods can be applied.
  • For example, a template matching method that searches for similar regions, using several pixels around each pixel of the image as a template, may be used; alternatively, feature points such as edges or corners may be extracted from the gradient of the luminance information of the image, and similar points searched for.
  • In the stereo method, the relationship between the coordinate systems of the two images is derived, a projective transformation matrix is obtained, and the three-dimensional shape is calculated.
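For illustration, the parallax-to-depth computation described above can be sketched as follows. This is not the patent's implementation: it assumes the two sub-pixel images D154a and D154b behave like a rectified stereo pair with a small baseline, uses OpenCV block matching for the corresponding point search, and uses illustrative parameter values.

```python
# Illustrative sketch only; parameter values and the use of StereoBM are assumptions.
import cv2
import numpy as np

def depth_from_dual_pixel_pair(img_a, img_b, focal_length_px, baseline_m):
    """img_a, img_b: 8-bit grayscale images (stand-ins for D154a, D154b)."""
    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    # StereoBM returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(img_a, img_b).astype(np.float32) / 16.0
    disparity[disparity <= 0] = np.nan          # mark invalid matches
    # Classic stereo relation: depth = f * B / d.
    depth = focal_length_px * baseline_m / disparity
    return disparity, depth
```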
  • the imaging unit 110 has a function of outputting an image D154a, an image D154b, a parallax image D154f, a depth map D154d obtained by the stereo method, and a three-dimensional point group D154c in addition to the image D154e.
  • Here, the depth map refers to an image that holds, for each pixel, a value correlated with the distance to the measurement object.
  • The value correlated with the distance to the measurement object is an integer value that can be stored as a normal image, and it can be converted into the physical distance to the object (for example, in millimeters) by multiplying it by a predetermined coefficient determined from the focal length. The focal length is included in the unique information of the imaging unit 110 described below.
  • Since the imaging unit 110 can obtain the pair of images D154a and D154b with different viewpoints using a single imaging device D150, it can realize three-dimensional measurement with a more compact configuration than the conventional stereo method, which requires two or more imaging units.
  • The imaging unit 110 further includes an autofocus mechanism that controls the focal length of the optical system and a zoom mechanism that controls the angle of view.
  • the auto focus mechanism can be switched on or off, and the set focal length can be fixed.
  • The imaging unit 110 can read a control value, defined by a drive amount such as the rotation angle or movement amount of the optical system control motor provided to control the focus and the angle of view, and can calculate and output the focal length by referring to a lookup table (not shown).
  • Furthermore, the imaging unit 110 can read, from the mounted lens, unique information of the lens such as the focal length range, aperture, distortion coefficients, and optical center.
  • The read unique information is used for correcting the lens distortion of the parallax image D154f and the depth map D154d described later, and for calculating the three-dimensional point group D154c.
  • The imaging unit 110 has a function of correcting the lens distortion of the images D154a and D154b, the parallax image D154f, and the depth map D154d, and of outputting the image coordinates of the principal point position (hereinafter referred to as the image center) and the baseline length of the images D154a and D154b. It also has a function of outputting the generated images, optical system data such as the focal length and the image center, and three-dimensional measurement data such as the parallax image D154f, the baseline length, the depth map D154d, and the three-dimensional point group D154c. In the present embodiment, these data are collectively referred to as image information (hereinafter also referred to as "visual information").
  • the imaging unit 110 selectively outputs all or part of the image information in accordance with a parameter set in a storage area (not shown) provided inside the imaging unit 110 or an instruction given from the outside of the imaging unit 110.
  • The movement control in the present embodiment controls the motor, which is an actuator included in the moving body, and the steering that changes the direction of the wheels. By controlling these, the moving body is moved to a predetermined destination. The control value is a command value for controlling the moving body.
  • The position and orientation of the imaging unit in the present embodiment are represented by six parameters: three parameters indicating the position of the imaging unit 110 in an arbitrary world coordinate system defined in the real space, and three parameters indicating its orientation. Note that the mounting position and orientation of the imaging unit with respect to the center of gravity of the moving body are measured at the design stage of a moving body such as an AGV, and a matrix representing this mounting position and orientation is stored in the external memory H14. The center-of-gravity position of the AGV can be calculated by multiplying the position and orientation of the imaging unit by the matrix representing the mounting position and orientation. For this reason, in the present embodiment, the position and orientation of the imaging unit are treated as synonymous with the position and orientation of the AGV.
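For illustration, the composition of the imaging unit's pose with the stored mounting matrix amounts to a single matrix product. This sketch assumes 4x4 homogeneous transforms; the function name is hypothetical.

```python
import numpy as np

def agv_pose_from_camera_pose(T_world_camera, T_camera_agv):
    """Both arguments are 4x4 homogeneous transforms (rotation + translation).
    T_camera_agv is the fixed mounting matrix measured at design time and
    stored in the external memory H14; the product gives the AGV pose in the
    world coordinate system."""
    return T_world_camera @ T_camera_agv
```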
  • a three-dimensional coordinate system defined on the imaging unit with the optical axis of the imaging unit 110 as the Z axis, the horizontal direction of the image as the X axis, and the vertical direction as the Y axis is called an imaging unit coordinate system.
  • The input unit 1110 inputs, in time series (for example, 60 frames per second), a depth map in which a depth value is stored for each pixel of an image of the scene, as the image information (visual information) acquired by the imaging unit 110, and outputs it to the calculation unit 1120.
  • the depth value is the distance between the imaging unit 110 and an object in the scene.
  • the calculation unit 1120 calculates and acquires the position and orientation of the imaging unit using the depth map input by the input unit 1110 and map information serving as an index of position and orientation calculation held by the holding unit 1130. The map information will be described later.
  • the calculation unit 1120 further outputs the calculated position and orientation to the control unit 1140.
  • the calculation unit may obtain information necessary for outputting the position and orientation from the input unit and may simply compare the information with the map information held by the holding unit 1130.
  • the holding unit 1130 holds a point cloud as map information.
  • the point cloud is three-dimensional point cloud data of a scene.
  • the point cloud is held by the holding unit 1130 as a data list storing three values of three-dimensional coordinates (X, Y, Z) in an arbitrary world coordinate system.
  • Three-dimensional point cloud data indicates three-dimensional position information.
  • In addition, the holding unit 1130 holds a target position and orientation, consisting of the three-dimensional coordinates that are the destination of the AGV and the orientation to be taken there.
  • the target position and orientation may be one or more, but for the sake of simplicity, an example in which the target position and orientation is one point will be described.
  • the holding unit 1130 outputs map information to the calculation unit 1120 as needed. Furthermore, the target position and orientation are output to the control unit 1140.
  • The control unit 1140 calculates a control value for controlling the AGV based on the position and orientation of the imaging unit 110 calculated by the calculation unit 1120, the map information held by the holding unit 1130, and the operation information input by the communication device (not shown).
  • the calculated control value is output to the actuator 120.
  • FIG. 6 is a diagram showing a hardware configuration of the information processing apparatus 1.
  • a CPU H11 controls various devices connected to the system bus H21.
  • H12 is a ROM, which stores a BIOS program and a boot program.
  • H13 is a RAM, which is used as a main storage device of the CPU H11.
  • An external memory H14 stores a program processed by the information processing apparatus 1.
  • the input unit H15 is a keyboard, a mouse, or a robot controller, and performs processing related to input of information and the like.
  • the display unit H16 outputs the calculation result of the information processing device 1 to the display device according to the instruction from H11.
  • the display device may be of any type such as a liquid crystal display device, a projector, or an LED indicator.
  • the display unit H16 included in the information processing apparatus may play a role as a display device.
  • a communication interface H17 performs information communication via a network.
  • the communication interface may be Ethernet (registered trademark), and may be of any type such as USB, serial communication, or wireless communication.
  • Information is exchanged with the mobile object management system 13 described above via the communication interface H17.
  • H18 is I / O, and inputs image information (visual information) from the imaging device H19.
  • the imaging device H19 is the imaging unit 110 described above.
  • H20 is the actuator 120 described above.
  • FIG. 5 is a flowchart showing the processing procedure of the information processing apparatus 10 in the present embodiment.
  • the processing steps include initialization S110, visual information acquisition S120, visual information input S130, position and orientation calculation S140, control value calculation S150, control of AGV S160, and system termination determination S170.
  • In step S110, the system is initialized. That is, the program is read from the external memory H14, and the information processing apparatus 10 is made operable.
  • The parameters of each device connected to the information processing apparatus 10 (such as the internal parameters and focal length of the imaging unit 110) and the initial position and orientation of the imaging unit 110 are read into the RAM H13, the latter as the previous-time position and orientation.
  • Each device of the AGV is started and put into a state in which it can be operated and controlled.
  • Operation information is received from the mobile management system through the communication I/F (H17), the three-dimensional coordinates of the destination to which the AGV should head are received, and they are held in the holding unit 1130.
  • In step S120, the imaging unit 110 acquires visual information and inputs it to the input unit 1110.
  • In the present embodiment, the visual information is a depth map, and it is assumed that the imaging unit 110 has acquired the depth map (D154d) by the method described above.
  • In step S130, the input unit 1110 acquires the depth map acquired by the imaging unit 110.
  • the depth map is a two-dimensional array list storing the depth value of each pixel.
  • In step S140, the calculation unit 1120 calculates the position and orientation of the imaging unit 110 using the depth map input by the input unit 1110 and the map information held by the holding unit 1130. Specifically, first, a three-dimensional point group defined in the imaging unit coordinate system is calculated from the depth map: a three-dimensional point (X_t, Y_t, Z_t) is calculated by Equation 1 from the image coordinates (u_t, v_t), the internal parameters (f_x, f_y, c_x, c_y) of the imaging unit 110, and the depth value D of each pixel of the depth map.
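Equation 1 itself is not reproduced in this text. For a pinhole camera with the internal parameters listed above, the standard back-projection it presumably refers to would take the form:

$$X_t = \frac{(u_t - c_x)\,D}{f_x}, \qquad Y_t = \frac{(v_t - c_y)\,D}{f_y}, \qquad Z_t = D$$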
  • Next, using the previous-time position and orientation of the imaging unit 110, the three-dimensional point group is transformed into the coordinate system of the previous-time position and orientation. That is, the three-dimensional point group is multiplied by the matrix of the previous-time position and orientation.
  • the position and orientation are calculated such that the sum of the distances between the nearest three-dimensional points of the calculated three-dimensional point group and the point cloud of the map information held by the holding unit 1130 is reduced.
  • the position and orientation of the imaging unit 110 with respect to the previous time position and orientation are calculated using an ICP (Iterative Closest Point) algorithm.
  • Finally, the result is converted into the world coordinate system, and the position and orientation in the world coordinate system are output to the control unit 1140.
  • The calculated position and orientation are also stored in the RAM H13, overwriting the previous-time position and orientation.
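The S140 alignment can be sketched as follows. This is a simplified, illustrative stand-in for the ICP-based calculation described above (point-to-point ICP with a closed-form rigid update); function names, the use of a k-d tree, and the iteration count are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map into camera-frame 3D points (Equation 1 form)."""
    v, u = np.indices(depth.shape)
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]                      # drop invalid pixels

def icp(src, map_points, T_init, iterations=20):
    """src: Nx3 camera-frame points, map_points: Mx3 world points (the map),
    T_init: 4x4 previous-time pose. Returns a refined 4x4 pose."""
    T = T_init.copy()
    tree = cKDTree(map_points)
    for _ in range(iterations):
        src_w = (T[:3, :3] @ src.T).T + T[:3, 3]   # current world-frame points
        _, idx = tree.query(src_w)                 # nearest map points
        tgt = map_points[idx]
        # Closed-form rigid alignment (Kabsch) of src_w onto tgt.
        mu_s, mu_t = src_w.mean(0), tgt.mean(0)
        H = (src_w - mu_s).T @ (tgt - mu_t)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                   # fix reflection case
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        dT = np.eye(4)
        dT[:3, :3] = R
        dT[:3, 3] = t
        T = dT @ T                                  # accumulate the increment
    return T
```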
  • In step S150, the control unit 1140 calculates a control value for controlling the AGV. Specifically, the control value is calculated so that the Euclidean distance between the destination coordinates held by the holding unit 1130 and the position and orientation of the imaging unit 110 calculated by the calculation unit 1120 decreases. The control value calculated by the control unit 1140 is output to the actuator 120.
  • In step S160, the actuator 120 controls the AGV using the control value calculated by the control unit 1140.
  • In step S170, it is determined whether to end the system. Specifically, if the Euclidean distance between the destination coordinates held by the holding unit 1130 and the position and orientation of the imaging unit 110 calculated by the calculation unit 1120 is equal to or less than a predetermined threshold, the processing ends because the AGV has arrived at the destination. If not, the process returns to step S120 and processing continues.
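A toy version of the control loop in steps S150 to S170 might look like the following. The differential-drive velocity model, gains, and thresholds are assumptions for illustration; the text above only specifies that the control value reduces the Euclidean distance to the destination and that processing stops below a threshold.

```python
import math

def compute_control(pose_xyyaw, goal_xy, k_v=0.5, k_w=1.5, stop_dist=0.05):
    """Return (forward velocity, turn rate, arrived?) for one control cycle."""
    x, y, yaw = pose_xyyaw
    dx, dy = goal_xy[0] - x, goal_xy[1] - y
    dist = math.hypot(dx, dy)
    if dist <= stop_dist:                     # S170: arrived, stop
        return 0.0, 0.0, True
    heading_err = math.atan2(dy, dx) - yaw
    heading_err = math.atan2(math.sin(heading_err), math.cos(heading_err))
    v = k_v * dist                            # forward command shrinks the distance
    w = k_w * heading_err                     # steering command points at the goal
    return v, w, False
```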
  • As described above, in the first embodiment, each of the light receiving units on the imaging device includes two or more light receiving elements, and the three-dimensional points obtained from the depth map acquired by such an imaging unit are used together with the three-dimensional points of the point cloud serving as map information.
  • the position and orientation of the imaging unit are calculated so as to minimize the distance between those three-dimensional points.
  • The imaging unit 110 calculates the depth map D154d, and the input unit 1110 in the information processing apparatus inputs the depth map.
  • the input unit 1110 can input the point cloud calculated by the imaging unit 110.
  • the calculation unit 1120 can perform position and orientation calculation using the point cloud input by the input unit 1110.
  • The point cloud calculated by the imaging unit 110 is the three-dimensional point group D154c in FIG. 4.
  • Alternatively, the calculation unit 1120 may obtain the depth map by the corresponding point search and the stereo method. Furthermore, in addition to these, the input unit 1110 may also input an RGB image or a gray image acquired by the imaging unit 110 as visual information. That is, the calculation unit 1120 may perform the depth map calculation in place of the imaging unit 110.
  • the imaging unit 110 may further include a focus control mechanism that controls the focal length of the optical system, and the information processing apparatus may control the focus control.
  • the control unit 1140 of the information processing apparatus may calculate a control value (focus value) for adjusting the focus.
  • For example, a control value for adjusting the focus of the imaging unit 110 in accordance with the average or median depth of the depth map is calculated.
  • Instead of the present information processing apparatus adjusting the focus, the autofocus mechanism built into the imaging unit 110 may perform the adjustment. By adjusting the focus, more sharply focused visual information can be obtained, and the position and orientation can be calculated with high accuracy.
  • the imaging unit 110 may have a configuration (focus fixed) without the focus control function. In this case, the imaging unit 110 can be downsized because it is not necessary to mount the focus control mechanism.
  • the imaging unit 110 may further include a zoom control mechanism that controls the zoom of the optical system, and the information processing apparatus may perform this zoom control.
  • Specifically, the control unit 1140 calculates a control value (adjustment value) that adjusts the zoom toward the wide-angle side so that visual information with a wide field of view is acquired, or a control value (adjustment value) that adjusts the zoom toward the telephoto side so that visual information with a narrow field of view is acquired at high resolution.
  • Although the imaging unit 110 has been described on the assumption that its optical system fits the pinhole camera model, any optical system (lens) may be used as long as it can acquire the position and orientation of the imaging unit 110 and visual information for controlling the moving body. Specifically, an omnidirectional lens, a fisheye lens, or a hyperboloid mirror may be used.
  • A macro lens may also be used. For example, if an omnidirectional lens or a fisheye lens is used, depth values over a wide field of view can be acquired, which improves the robustness of position and orientation estimation. A detailed position and orientation can be calculated by using a macro lens.
  • the user can freely change (exchange and the like) the lens in accordance with the scene to be used, and the position and orientation of the imaging unit 110 can be stably calculated with high accuracy.
  • the moving body can be controlled stably and with high accuracy.
  • In addition, the imaging unit 110 reads a control value defined by the rotation angle or movement amount of the optical system control motor provided for controlling the focus and the angle of view, and then calculates the focal length with reference to a lookup table (not shown).
  • the imaging unit 110 reads the focal length value recorded in the lens through the electronic contact given to the lens.
  • a person can also input a focal length to the imaging unit 110 using a UI (not shown).
  • The imaging unit 110 calculates the depth map using the focal length value acquired in this manner. The input unit 1110 of the information processing apparatus then inputs the focal length value from the imaging unit 110 together with the visual information.
  • the calculation unit 1120 calculates the position and orientation using the depth map input by the input unit 1110 and the focal length value.
  • the imaging unit 110 can also calculate a point cloud in the imaging unit 110 coordinate system using the calculated focal length.
  • The input unit 1110 of the information processing apparatus inputs the point cloud calculated by the imaging unit 110, and the calculation unit 1120 calculates the position and orientation using the point cloud input by the input unit 1110.
  • the map information in the present embodiment is a point cloud.
  • any information may be used as long as it is an index for calculating the position and orientation of the imaging unit 110.
  • it may be a point cloud with color information in which three values, which are color information, are added to each point of the point cloud.
  • the depth map may be associated with the position and orientation to form a key frame, and a plurality of key frames may be held. At this time, the position and orientation are calculated so as to minimize the distance between the depth map of the key frame and the depth map acquired by the imaging unit 110.
  • the calculation unit 1120 may store the input image in association with the key frame.
  • a configuration may be adopted in which a 2D map in which an area through which the AGV can pass and an impassable place such as a wall are associated is held. The usage of the 2D map will be described later.
  • Although the position and orientation calculation in the present embodiment has been described using the ICP algorithm, any method may be used as long as the position and orientation can be calculated. That is, instead of using the point cloud directly as described in the present embodiment, the calculation unit 1120 may compute a mesh model from the data and calculate the position and orientation so as to minimize the distance between surfaces. Alternatively, three-dimensional edges, which are discontinuity points, may be calculated from the depth map and the point cloud, and the position and orientation may be calculated so that the distance between the three-dimensional edges is minimized. In addition, if the input unit 1110 is configured to input an image, the calculation unit 1120 can also calculate the position and orientation by further using the input image.
  • Furthermore, the input unit 1110 may also input sensor values from a sensor mounted on the moving body.
  • the calculating unit 1120 can also calculate the position and orientation of the imaging unit 110 by using the sensor values. Specifically, related techniques are known as Kalman Filter and Visual Inertial SLAM, and these can be used. As described above, by using the visual information and the sensor information of the imaging unit 110 in combination, the position and orientation can be calculated robustly with high accuracy.
  • an inertial sensor such as a gyro or an IMU can be used to reduce blurring of visual information captured by the imaging unit 110.
  • The control unit 1140 in the present embodiment calculates the control value so that the distance between the target position and orientation and the position and orientation calculated by the calculation unit 1120 decreases. However, as long as the control unit 1140 calculates a control value for reaching the destination, it may calculate or use any control value. Specifically, when the depth value of the depth map, which is the input geometric information, is less than a predetermined distance, the control unit 1140 calculates a control value such as turning to the right, for example. In addition, the calculation unit 1120 may generate a route by dynamic programming, treating the parts of the map information held by the holding unit 1130 where the point cloud exists as impassable and the remaining space as passable, and the control unit 1140 may calculate control values so as to follow this route.
  • Specifically, the calculation unit 1120 projects the point cloud serving as map information onto the ground plane in advance to create a 2D map.
  • A point onto which the point cloud is projected is treated as impassable, such as a wall or an obstacle, and a point onto which nothing is projected is treated as passable free space.
  • Dynamic programming can then generate a route to the destination.
  • the calculation unit 1120 calculates a cost map that stores values that decrease as it approaches the destination, and the control unit 1140 receives this as input to output a control value.
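The 2D-map and cost-map idea can be illustrated as follows. Breadth-first search is used here as a simple stand-in for the dynamic programming mentioned above; the grid size, cell resolution, and function names are assumptions for illustration.

```python
from collections import deque
import numpy as np

def occupancy_grid(points_xyz, cell=0.1, size=200):
    """Project map points onto the ground plane into a boolean occupancy grid."""
    grid = np.zeros((size, size), dtype=bool)
    ij = np.floor(points_xyz[:, :2] / cell).astype(int) + size // 2
    ok = (ij >= 0).all(1) & (ij < size).all(1)
    grid[ij[ok, 0], ij[ok, 1]] = True          # occupied: wall or obstacle
    return grid

def cost_map(grid, goal_ij):
    """Values decrease toward the goal; the control unit can descend this gradient."""
    cost = np.full(grid.shape, np.inf)
    cost[goal_ij] = 0.0
    q = deque([goal_ij])
    while q:
        i, j = q.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if (0 <= ni < grid.shape[0] and 0 <= nj < grid.shape[1]
                    and not grid[ni, nj] and cost[ni, nj] == np.inf):
                cost[ni, nj] = cost[i, j] + 1
                q.append((ni, nj))
    return cost
```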
  • Alternatively, a controller may be used to calculate the control value. By calculating control values that move the AGV while avoiding obstacles such as walls in this manner, the AGV can be operated stably and safely.
  • Note that the holding unit 1130 need not hold map information. Specifically, based on the visual information acquired by the imaging unit 110 at time t and at time t' one time step earlier, the calculation unit 1120 calculates the position and orientation at time t relative to time t'.
  • The position and orientation of the imaging unit 110 can then be calculated without map information by successively multiplying the position and orientation change matrices calculated by the calculation unit 1120 at each time step as described above. With such a configuration, the position and orientation can be calculated, and the moving body controlled, even on a computer with limited computational resources.
  • the holding unit 1130 holds the map information created in advance.
  • Alternatively, a SLAM (Simultaneous Localization and Mapping) configuration may be used, in which position and orientation estimation is performed while map information is created based on the visual information acquired by the imaging unit 110 and the position and orientation calculated by the calculation unit 1120.
  • Many SLAM methods have been proposed and can be used. For example, it is possible to use a Point-Based Fusion algorithm that integrates the point clouds acquired by the imaging unit 110 at multiple times in time series. It is also possible to use a KinectFusion algorithm that integrates the measured boundaries between objects and space into voxel data in time series.
  • In addition, RGB-D SLAM algorithms that generate a map while tracking feature points detected from the image together with the depth values of the depth sensor are known, and these can also be used.
  • the maps are not limited to those generated in the same time zone. For example, time zones may be changed to generate multiple maps, and these may be synthesized.
  • The map information is not limited to being generated from data acquired by the imaging unit 110 mounted on the moving body.
  • For example, the holding unit 1130 may hold a CAD drawing or a map image of the environment as it is, or after converting its data format.
  • the holding unit 1130 may hold a map based on a CAD drawing or a map image as an initial map, and update the map using the above-described SLAM technology.
  • The control unit 1140 may also hold the map update times and calculate control values for controlling the AGV so as to update the map at points where a predetermined time has elapsed since the last update.
  • the map may be updated by overwriting, or the initial map may be held and the difference may be stored as update information.
  • the map can be managed in layers and checked on the display unit H16 or can be returned to the initial map. Convenience is improved by performing the operation while looking at the display screen.
  • the moving object is operating based on the destination coordinates set by the moving object management system 13.
  • The position and orientation and the control value calculated by the information processing apparatus can be transmitted to the mobile management system through the communication I/F (H17).
  • The mobile management system 13 and the process management system 14 can refer to the positions and orientations and the control values calculated on the basis of the visual information acquired by the imaging unit 110, and can thereby perform process management and mobile body management more efficiently.
  • Alternatively, the holding unit 1130 may be configured not to hold the destination coordinates but to receive them at any time via the communication I/F.
  • In the present embodiment, the process management system 14 manages the entire factory process, the mobile management system 13 manages the operation information of the mobile bodies in accordance with the management status, and the mobile body 12 moves in accordance with the operation information.
  • any configuration may be employed as long as the moving body moves based on the visual information acquired by the imaging unit 110.
  • For example, the process management system and the mobile management system may be omitted if two predetermined points are held in advance in the holding unit 1130 and the moving body simply travels back and forth between them.
  • the moving body 12 is not limited to the carrier vehicle (AGV).
  • the mobile unit 12 may be an autonomous vehicle or an autonomous mobile robot, and the movement control described in the present embodiment may be applied to them.
  • If the above-described information processing apparatus is mounted on a car, it can also be used as a car that realizes automated driving.
  • the vehicle is moved using the control value calculated by the control unit 1140.
  • the position and orientation may be calculated based on the visual information acquired by the imaging unit 110 instead of controlling the moving body.
  • The method of the present embodiment can also be applied to aligning real space and a virtual object in a mixed reality system, that is, to measuring the position and orientation of the imaging unit 110 in real space for use in drawing the virtual object.
  • For example, a 3DCG model is aligned with and synthesized onto the image D154a captured by the imaging unit 110, on the display of a mobile terminal such as a smartphone or tablet.
  • In this case, the input unit 1110 inputs the image D154a in addition to the depth map D154d acquired by the imaging unit 110.
  • the holding unit 1130 further holds the 3DCG model of the virtual object and the three-dimensional position at which the 3DCG model is installed in the map coordinate system.
  • The calculation unit 1120 combines the 3DCG model with the image D154a using the position and orientation of the imaging unit 110 calculated as described in the first embodiment. By doing this, a user experiencing mixed reality can hold the mobile terminal and stably observe, through its display, the real space on which the virtual object is superimposed based on the position and orientation calculated by the information processing apparatus.
  • the position and orientation of the imaging unit are calculated using the depth map acquired by the imaging unit.
  • An imaging unit based on dual pixel autofocus (DAF) can measure a specific distance range from the imaging unit with high accuracy. Therefore, in the second embodiment, even when the distance from the imaging unit is outside this range, depth values are calculated by motion stereo to further increase the accuracy of the depth map acquired by the imaging unit, so that the position and orientation can be calculated stably and with high accuracy.
  • the configuration of the device in the second embodiment is the same as that of FIG. 2 showing the configuration of the information processing device 10 described in the first embodiment, and thus the description thereof is omitted.
  • the input unit 1110 inputs visual information to the holding unit 1130, and the holding unit 1130 holds the visual information, which is different from the first embodiment.
  • the second embodiment differs from the first embodiment in that the calculating unit 1120 corrects the depth map using the visual information held by the holding unit 1130 and calculates the position and orientation. Further, it is assumed that the holding unit 1130 holds a list in which the reliability of the depth value of the depth map acquired by the imaging unit 110 is associated in advance as the characteristic information of the imaging unit 110.
  • The reliability of a depth value is obtained by photographing a flat panel placed at a predetermined distance from the imaging unit 110 in advance, taking the reciprocal of the error between the actual distance and the measured distance, and clipping the result to the range 0 to 1. It is assumed that the reliability has been calculated in advance for various distances. A point where measurement could not be made is given a reliability of 0.
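The reliability calibration described above can be sketched as follows, with hypothetical names: for each calibrated panel distance, take the reciprocal of the absolute measurement error, clip it to [0, 1], and assign 0 where no measurement was obtained.

```python
import numpy as np

def reliability_table(true_dists_mm, measured_dists_mm):
    """measured_dists_mm uses NaN where the point could not be measured."""
    true_d = np.asarray(true_dists_mm, dtype=float)
    meas_d = np.asarray(measured_dists_mm, dtype=float)
    err = np.abs(true_d - meas_d)
    rel = np.clip(1.0 / np.maximum(err, 1e-6), 0.0, 1.0)   # reciprocal error, clipped
    rel[np.isnan(meas_d)] = 0.0                             # unmeasured -> reliability 0
    return dict(zip(true_d.tolist(), rel.tolist()))
```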
  • visual information acquired by the imaging unit 110 and input by the input unit 1110 is an image and a depth map.
  • the procedure of the entire processing in the second embodiment is the same as that in FIG. 4 showing the processing procedure of the information processing apparatus 10 described in the first embodiment, and thus the description will be omitted.
  • the second embodiment differs from the first embodiment in that the depth map correction step is added before the position and orientation calculation step S140.
  • FIG. 7 is a flowchart showing details of the processing procedure in the depth map correction step.
  • In step S2110, the calculation unit 1120 reads the characteristic information of the imaging unit 110 from the holding unit 1130.
  • In step S2120, the calculation unit 1120 calculates depth values by motion stereo, using the image held by the holding unit 1130 (visual information acquired at an arbitrary time t' before the time t at which the imaging unit 110 acquired the current visual information) together with the input image and depth map.
  • Hereinafter, an image that is visual information acquired at an arbitrary time t' before time t is also referred to as a past image.
  • Similarly, a depth map acquired at an arbitrary time t' before time t is also referred to as a past depth map.
  • Motion stereo is a known technique and various methods can be applied.
  • Although a scale ambiguity remains in the depth values obtained by motion stereo from two images, the scale can be determined based on the ratio between the past depth map and the depth values calculated by motion stereo.
  • In step S2130, the calculation unit 1120 updates the depth map with a weighted sum using the reliability associated with the depth values, which is the characteristic information read in step S2110, and the depth values calculated by motion stereo in step S2120. Specifically, letting the reliability value in the vicinity of each depth value d of the depth map be the weight α, each depth value is corrected with the depth value m calculated by motion stereo using the weighted sum of Equation 2.
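Equation 2 is not reproduced in this text; a weighted sum consistent with the description (reliability α weighting the sensor depth d against the motion-stereo depth m) would be:

$$d' = \alpha\, d + (1 - \alpha)\, m$$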
  • After that, the depth map correction step ends, and the processing from step S150 described in the first embodiment is continued.
  • In this way, when the imaging unit 110 can acquire depth values with high accuracy, the weight of the depth values acquired by the imaging unit 110 is large; otherwise, the weight of the depth values calculated by motion stereo is increased. As a result, even if the measurement accuracy of the imaging unit 110 is low, the depth map can be corrected by motion stereo and calculated with high accuracy.
  • the reliability in correction of the depth map is calculated from the measurement error of the depth value of the depth map calculated by the imaging unit 110, and is used as the weight ⁇ .
  • However, any method may be used as long as the depth map acquired by the imaging unit 110 and the depth values calculated by motion stereo are integrated with weight values that improve the depth map.
  • For example, a value obtained by multiplying the reciprocal of the depth value of the depth map by a predetermined coefficient β may be used as the weight.
  • the gradient of the input image may be calculated, and the inner product of the gradient direction and the arrangement direction of the elements in the imaging unit 110 may be used as a weight.
  • Alternatively, the input unit 1110 may further receive from the imaging unit 110 the baseline length of the two images D154a and D154b and of the parallax image D154f, and the ratio between this baseline length and the baseline length of the motion stereo may be used as the weight.
  • In addition, only specific pixels may be integrated, or the weighted sum may be calculated by applying the same weight to some or all of the pixels.
  • motion stereo may be performed using images and depth maps at a plurality of times in addition to a certain past one time.
  • Furthermore, the AGV can be controlled so as to obtain visual information with which the position and orientation can be calculated more accurately and robustly.
  • Specifically, the control unit 1140 calculates control values that move the AGV so that the baseline length of the motion stereo becomes large. One example is a control value that makes the vehicle travel in a serpentine manner while the imaging unit 110 keeps capturing a predetermined distant point. As a result, the baseline length of the motion stereo becomes long, and depth values farther away can be calculated accurately.
  • The control unit 1140 can also calculate control values so that the imaging unit 110 obtains visual information over a wider field of view. Specifically, the control value is calculated so as to perform a look-around motion centered on the optical center of the imaging unit 110. As a result, visual information with a wider field of view can be acquired, so the position and orientation can be calculated with less divergence and error in the optimization.
  • the input unit 1110 receives an image and position and orientation from another AGV through the communication I / F, and calculates the depth value by performing motion stereo using the received image and position and orientation and the image acquired by the imaging unit 110. It can also be done. Moreover, what is received may be anything as long as it is visual information, and may be a depth map, parallax image, or three-dimensional point group acquired by an imaging unit of another AGV.
  • the position and orientation and the control value are calculated based on visual information obtained by photographing the scene acquired by the imaging unit 110.
  • However, depth accuracy may decrease for textureless walls and columns. Therefore, in the third embodiment, depth accuracy is improved by projecting a predetermined pattern light onto the scene and having the imaging unit 110 capture the projected pattern.
  • the configuration of the information processing apparatus 30 in the present embodiment is shown in FIG. The difference is that the control unit 1140 in the information processing apparatus 10 described in the first embodiment further calculates a control value of the projection apparatus 310 and outputs the calculated control value.
  • The projection device in this embodiment is a projector, and it is attached so that the optical axis of the imaging unit 110 and the optical axis of the projection device coincide.
  • the pattern projected by the projection device 310 is a random pattern generated so that projected and non-projected regions exist at random.
  • the visual information is the image D 154 e and the depth map D 154 d acquired by the imaging unit 110, and the input unit 1110 inputs the image from the imaging unit 110.
  • In step S150, the calculation unit 1120 calculates a texture degree value indicating whether the input visual information is poor in texture, and the control unit 1140 controls the pattern projection ON/OFF based on the texture degree value.
  • This point differs from the first embodiment.
  • Specifically, the calculation unit 1120 convolves the input image with a Sobel filter and takes the absolute values of the result to calculate a gradient image.
  • the Sobel filter is a type of filter for calculating the first derivative of an image and is known in various documents.
  • the ratio of pixels equal to or greater than a predetermined gradient value threshold in the calculated gradient image is taken as the texture degree.
  • The control unit 1140 calculates a control value that turns the projection device on when the texture degree is less than a predetermined threshold, and turns it off when the texture degree is equal to or greater than the threshold.
  • In this way, when the scene is poor in texture, random pattern light is projected. As a result, a random pattern is added to the scene, so that even if the scene itself is poor in texture, the imaging unit can acquire the depth map more accurately. Therefore, the position and orientation can be calculated with high accuracy.
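The texture check and projector switching can be sketched as follows. The Sobel-based gradient and the ratio-of-strong-gradient-pixels definition follow the description above; the specific threshold values and function names are assumptions.

```python
import cv2
import numpy as np

def texture_degree(gray, grad_thresh=30.0):
    """Fraction of pixels whose gradient magnitude exceeds grad_thresh."""
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    grad = np.abs(gx) + np.abs(gy)
    return float(np.mean(grad >= grad_thresh))

def projector_should_be_on(gray, texture_thresh=0.05):
    """Project the pattern only when the scene is texture-poor."""
    return texture_degree(gray) < texture_thresh
```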
  • the pattern light is a random pattern.
  • any pattern may be used as long as it gives a texture to an area poor in texture.
  • For example, a random dot pattern or a fringe pattern (such as a stripe or lattice pattern) may be projected.
  • With a stripe pattern, there is an ambiguity in that distances inside and outside one modulation wavelength cannot be distinguished, but this can be eliminated by using a Gray code method that obtains depth values from input images acquired at multiple times with different frequencies.
  • In the present embodiment, the control unit 1140 outputs a control value for turning projection ON or OFF, and the projection device 310 switches the presence or absence of projection accordingly.
  • the configuration is not limited to this as long as the projection device 310 can project pattern light.
  • the projector 310 may be configured to start projection when the power is turned on in the initialization step S110.
  • the projection device 310 may be configured to project an arbitrary part of the scene.
  • the control unit 1140 can also switch the projection pattern of the projection device 310 so that the projection device 310 projects only in a region where the gradient value of the gradient image is less than a predetermined threshold.
  • In combination with the object detection described in the fifth embodiment, it is also possible to detect human eyes and calculate a control value so that the pattern is projected while avoiding them.
  • Not only the ON/OFF state but also the brightness of the pattern may be changed. That is, the control unit 1140 can calculate a control value such that the projection device 310 projects more brightly onto areas with larger (brighter) values in the depth map, or a control value such that dark parts of the input image are projected more brightly.
  • Alternatively, the pattern may be changed when the residual error of the iterative calculation performed by the calculation unit 1120 to compute the position and orientation is equal to or greater than a predetermined threshold.
  • In the present embodiment, the texture degree value is based on the gradient image obtained with a Sobel filter. Alternatively, the texture degree value may be calculated using a gradient image or an edge image computed with a filter such as a Prewitt filter, a Scharr filter, or a Canny edge detector. A high-frequency component obtained by applying a DFT (discrete Fourier transform) to the image may also be used as the texture degree value, or feature points such as corners in the image may be detected and their number used as the texture degree.
  • the position and orientation and the control value are calculated based on visual information obtained by photographing the scene acquired by the imaging unit.
  • the pattern light is projected to improve the accuracy for a scene with poor texture.
  • a method will be described in which three-dimensional information representing the position of a scene measured by another three-dimensional sensor is additionally used.
  • the configuration of the information processing apparatus 40 in the present embodiment is shown in FIG. This embodiment differs from the first embodiment in that the input unit 1110 in the information processing apparatus 10 described in the first embodiment further inputs three-dimensional information from the three-dimensional measurement apparatus 410.
  • the three-dimensional measurement device 410 in the present embodiment is a 3D LiDAR (light detection and ranging), which is a device that measures the distance based on the round trip time of the laser pulse.
  • the input unit 1110 inputs a measurement value acquired by the three-dimensional device as a point cloud.
  • The holding unit 1130 holds a list in which the reliability of the depth values of the depth map acquired by the imaging unit 110 is associated in advance, and a list in which the reliability of the depth values of the three-dimensional measurement device 410 is associated. It is assumed that these reliabilities are calculated in advance for both the imaging unit 110 and the three-dimensional measurement device 410 by the method described in the second embodiment.
  • the procedure of the entire processing in the fourth embodiment is the same as that of FIG. 4 showing the processing procedure of the information processing apparatus 10 described in the first embodiment, and thus the description will be omitted.
  • The fourth embodiment differs from the first embodiment in that a depth map correction step is added before the position and orientation calculation step S140.
  • FIG. 10 is a flowchart showing details of the processing procedure in the depth map correction step.
  • In step S4110, the calculation unit 1120 reads the characteristic information of the imaging unit 110 and the three-dimensional measurement apparatus 410 from the holding unit 1130.
  • In step S4120, the calculation unit 1120 integrates the depth map calculated by the imaging unit 110 with the point cloud measured by the three-dimensional measurement device 410, using the reliabilities associated with the depth values, which are the characteristic information read in step S4110. Specifically, the depth map can be updated by replacing the value m in Equation 2 with the depth value measured by the three-dimensional measurement device 410.
  • The weight w is calculated by Equation 3, where σ_D denotes the reliability of the depth map and σ_L denotes the reliability of the point-cloud point corresponding to the same location.
  • the depth map is updated by equation 2 using the calculated weights.
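  • Since Equations 2 and 3 are not reproduced here, the following sketch shows one commonly used reliability-weighted fusion, under the assumption that higher reliability values indicate more trustworthy depth; the weight form and the NaN handling are assumptions, not necessarily the embodiment's exact equations.

```python
import numpy as np


def fuse_depth(d_cam: np.ndarray, d_lidar: np.ndarray,
               sigma_d: np.ndarray, sigma_l: np.ndarray) -> np.ndarray:
    """Reliability-weighted per-pixel fusion of two depth estimates.

    d_cam / sigma_d  : depth and reliability from the imaging unit
    d_lidar / sigma_l: depth and reliability from the 3D measurement device
    Pixels the 3D sensor does not observe are marked NaN in d_lidar.
    """
    w_lidar = sigma_l / (sigma_d + sigma_l)              # assumed form of Equation 3
    fused = (1.0 - w_lidar) * d_cam + w_lidar * d_lidar  # assumed form of Equation 2
    return np.where(np.isnan(d_lidar), d_cam, fused)
```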
  • The depth map correction step then ends, and the processing from step S150 described in the first embodiment continues.
  • As described above, when the imaging unit can acquire the depth value with high accuracy, the weight of the depth value acquired by the imaging unit is increased, and when the three-dimensional measurement apparatus can acquire the depth value with high accuracy, the weight of the depth value acquired by the three-dimensional measurement device is increased.
  • In this way, the depth map is built from whichever of the imaging unit and the three-dimensional measurement apparatus can measure each depth value more accurately, and the position and orientation can be calculated with high accuracy.
  • the three-dimensional measurement device 410 is not limited to this, as long as it can measure three-dimensional information that can increase the accuracy of visual information acquired by the imaging unit 110.
  • it may be a TOF (Time Of Flight) distance measurement camera, or may be a stereo camera provided with two cameras.
  • Alternatively, a stereo configuration may be adopted in which a monocular camera different from the imaging unit 110 (which obtains depth by DAF) is arranged in alignment with the optical axis of the imaging unit 110.
  • An imaging unit 110 having different reliability characteristics may be further mounted, and this may be regarded as the three-dimensional measuring device 410 to similarly update the depth map.
  • the position and orientation and the control value are calculated based on visual information obtained by photographing the scene acquired by the imaging unit 110.
  • predetermined pattern light is projected onto the scene.
  • the three-dimensional shape measured by the three-dimensional measurement apparatus is used together.
  • an object is detected from visual information and used to control a moving object.
  • A case will be described in which the AGV loads and carries goods and, on reaching its destination, must stop precisely at a predetermined position with respect to a shelf or a belt conveyor.
  • A method of controlling the AGV precisely by calculating the position and orientation of an object such as a shelf or a belt conveyor imaged by the imaging unit 110 will be described.
  • the feature information of an object is the position and orientation of the object.
  • the configuration of the device according to the fifth embodiment is the same as that of FIG.
  • the calculating unit 1120 further detects an object from visual information, and the control unit 1140 controls the moving body so that the detected object appears at a predetermined position in the visual information.
  • The holding unit 1130 holds an object model for object detection, and also holds a target position and orientation with respect to the object, that is, the position and orientation the AGV should take with respect to the object when it arrives at its destination.
  • the above points differ from the first embodiment.
  • The object model consists of a CAD model representing the shape of the object and a list storing PPF (Point Pair Feature) information, in which the relative position and normals of pairs of three-dimensional points on the object are used as three-dimensional features.
  • FIG. 11 is a flowchart illustrating the details of the object detection step.
  • step S5110 the calculation unit 1120 reads the object model held by the holding unit 1130.
  • In step S5120, the calculation unit 1120 detects where in the visual information (the depth map) an object that fits the object model appears. Specifically, PPF features are first calculated from the depth map. Then, by matching the PPFs detected from the depth map with the PPFs of the object model, an initial value of the object's position and orientation with respect to the imaging unit 110 is calculated.
  • In step S5130, using the position and orientation of the object with respect to the imaging unit 110 calculated by the calculation unit 1120 as the initial value, the position and orientation of the object with respect to the imaging unit 110 are refined by the ICP algorithm. At the same time, the residual between the object's pose and the target position and orientation held by the holding unit 1130 is calculated. The calculation unit 1120 inputs the calculated residual to the control unit 1140, and the object detection step ends.
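  • As an illustration of the ICP refinement in step S5130, the following is a minimal point-to-point ICP sketch, not the embodiment's implementation; the use of scipy's cKDTree, the fixed iteration count, and the function names are assumptions, and the initial pose T_init is assumed to come from the PPF matching of step S5120.

```python
import numpy as np
from scipy.spatial import cKDTree


def icp_refine(src: np.ndarray, dst: np.ndarray,
               T_init: np.ndarray, iters: int = 30) -> np.ndarray:
    """Point-to-point ICP refining an initial 4x4 pose T_init that maps the
    object model points `src` (N, 3) onto observed depth-map points `dst` (M, 3)."""
    T = T_init.copy()
    tree = cKDTree(dst)
    for _ in range(iters):
        src_t = (T[:3, :3] @ src.T).T + T[:3, 3]
        _, idx = tree.query(src_t)            # nearest observed point per model point
        matched = dst[idx]
        # Kabsch: best rigid transform for the current correspondences
        mu_s, mu_m = src_t.mean(axis=0), matched.mean(axis=0)
        H = (src_t - mu_s).T @ (matched - mu_m)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:              # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_m - R @ mu_s
        dT = np.eye(4)
        dT[:3, :3], dT[:3, 3] = R, t
        T = dT @ T                            # accumulate the incremental update
    return T
```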
  • In step S150 in FIG. 5, the control unit 1140 calculates the control value of the actuator 120 so that the AGV moves in the direction in which the residual of the object's position and orientation calculated by the calculation unit 1120 decreases.
  • As described above, an object shown in the depth map acquired by the imaging unit, in which each light receiving unit on the image sensor includes two or more light receiving elements, is detected, and its position and orientation are calculated by model fitting. Then, the AGV is controlled so that the difference between the position and orientation given in advance with respect to the object and the detected position and orientation of the object becomes small. That is, the AGV is controlled so that it is precisely aligned with the object.
  • Because the shape of the object is known in advance, its position and orientation can be calculated with high accuracy, and the AGV can therefore be controlled with high accuracy.
  • the PPF feature is used to detect an object.
  • any method capable of detecting an object may be used.
  • As the feature quantity, a SHOT feature may be used, which describes a histogram of the inner products between the normal of a three-dimensional point and the normals of the three-dimensional points located around it.
  • Alternatively, a Spin Image feature, in which the surrounding three-dimensional points are projected onto a cylindrical surface whose axis is the normal vector of a given three-dimensional point, may be used.
  • a learning model by machine learning can also be used as a method of detecting an object without using a feature amount.
  • a neural network learned so that an object area is 1 and non-object areas are 0 can be used as a learning model.
  • the position and orientation of the object may be calculated by combining steps S5110 to S5130.
  • a shelf or a belt conveyor is used as an example of the object.
  • it may be any object as long as the imaging unit 110 can observe when the AGV is stopped and the relative position and orientation (relative position and relative orientation) are uniquely determined.
  • For example, a three-dimensional marker, specifically an object of arbitrary shape with arbitrary asperities printed by a 3D printer, may be used.
  • A depth map captured in advance while stopped at the target position and orientation may also be used as the object model. In that case, during AGV operation, the AGV may be controlled so that the position and orientation error between the held depth map and the depth map input by the input unit 1110 decreases. In this way, an object model can be obtained without the trouble of creating a CAD model.
  • a method of detecting an object and performing model fitting for position and orientation calculation for exact positioning of the AGV is illustrated.
  • it may be used not only for the purpose of exact position and orientation calculation but also for collision avoidance and position and orientation detection of other AGVs.
  • For example, the control unit 1140 can calculate control values so as to avoid the coordinates of other AGVs and thus avoid colliding with them.
  • an alert may be presented, and another AGV may be instructed to clear its own traveling route.
  • The control unit 1140 may also calculate control values so that the AGV moves to and connects with a charging station. In addition, when wiring is laid in a passage in the factory, the calculation unit 1120 may detect the wiring and the control unit 1140 may calculate a control value that bypasses it. If the ground is uneven, the control value may be calculated so as to avoid the unevenness. Further, if labels such as entry prohibited or recommended route are associated with each object model, whether or not the AGV may pass can be set easily simply by arranging the corresponding objects in the scene.
  • the object model is a CAD model.
  • any model may be used as long as the position and orientation of the object can be calculated.
  • it may be a mesh model generated by three-dimensional reconstruction of a target object from stereo images taken at a plurality of viewpoints by the Structure From Motion algorithm.
  • it may be a polygon model created by integrating depth maps captured from a plurality of viewpoints with an RGB-D sensor.
  • A learning model such as a CNN (Convolutional Neural Network) may also be used.
  • The imaging unit 110 may also image the object carried by the AGV, the calculation unit 1120 may recognize it, and the control unit 1140 may calculate the control value according to the type of the mounted object. Specifically, if the mounted object is fragile, the control value is calculated so that the AGV moves at low speed. In addition, if a list associating a target position and orientation with each object is held in the holding unit 1130 in advance, the control value may be calculated so as to move the AGV to the target position associated with the mounted object.
  • the control unit 1140 may calculate the control value of the robot arm such that the robot arm acquires the object.
  • the method of calculating the control value of the moving object by stably calculating the position and orientation with high accuracy based on the visual information acquired by the imaging unit 110 has been described.
  • the fifth embodiment has described the method of detecting an object from visual information and using it to control a moving object.
  • a method of stably performing control of AGV and generation of map information with high accuracy using the result of dividing input visual information into regions will be described.
  • the present embodiment exemplifies a method of adapting upon generation of map information.
  • In the map information, static objects whose position and orientation do not change over time are registered, and using these to calculate the position and orientation improves robustness to changes of the scene. Therefore, the visual information is divided into semantic regions, and the kind of object each pixel belongs to is determined. A method of generating hierarchical map information using per-object-type "stationary object likelihood" information calculated in advance, and a position and orientation estimation method using it, will be described.
  • the feature information of an object is the type of the object unless otherwise noted.
  • the configuration of the apparatus in the sixth embodiment is the same as that of FIG. 2 showing the configuration of the information processing apparatus 10 described in the first embodiment, and thus the description thereof is omitted.
  • the calculating unit 1120 further divides visual information into semantic regions, and generates map information hierarchically using them.
  • The hierarchical map information in this embodiment is a point cloud composed of four layers: (1) the layout CAD model of the factory, (2) a stationary object map, (3) a fixture map, and (4) a moving object map.
  • the holding unit 1130 holds (1) the layout CAD model of the factory in the external memory H14. Also, the position and orientation are calculated using hierarchically created map information. The position and orientation calculation method will be described later.
  • visual information acquired by the imaging unit 110 and input by the input unit 1110 is an image and a depth map.
  • the holding unit 1130 also holds a CNN, which is a learning model learned so as to output, for each object type, a mask image indicating whether each pixel is a corresponding object when an image is input.
  • In addition, a look-up table is held that indicates to which of the layers (2) to (4) each object type belongs, so that when an object type is specified, the layer to which it belongs is known.
  • the diagram of the procedure of the entire process in the sixth embodiment is the same as FIG. 4 showing the procedure of the information processing apparatus 10 described in the first embodiment, and therefore the description thereof is omitted.
  • The present embodiment differs from the first embodiment in that, when calculating the position and orientation, the calculation unit 1120 takes the layers of the map information held by the holding unit 1130 into account. It further differs in that an area division / map generation step is added after the position and orientation calculation step S140. Details of these processes will be described later.
  • In step S140, the calculation unit 1120 assigns, to the point cloud of each layer of the map information held by the holding unit 1130, a weight that serves as its degree of contribution to the position and orientation calculation, and then calculates the position and orientation. Specifically, when the layers (1) to (4) are held as in the example of the present embodiment, the weights are successively reduced from layer (1) toward layer (4), whose map information is less static. A weighting sketch is given below.
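  • As an illustration of assigning per-layer weights, the following is a minimal sketch; the layer-to-weight mapping, the numeric values, and the weighted-ICP cost mentioned in the comment are assumptions, not values from the embodiment.

```python
import numpy as np

# Hypothetical per-layer weights: (1) factory CAD, (2) stationary objects,
# (3) fixtures, (4) moving objects. The numeric values are assumptions.
LAYER_WEIGHTS = {1: 1.0, 2: 0.8, 3: 0.5, 4: 0.1}


def point_weights(point_layers: np.ndarray) -> np.ndarray:
    """Per-point contribution weights from the layer index stored with each
    map point; usable, e.g., in a weighted ICP cost sum_i w_i * ||T p_i - q_i||^2.
    Unknown layers fall back to the moving-object weight (an assumption)."""
    return np.array([LAYER_WEIGHTS.get(int(l), 0.1) for l in point_layers])
```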
  • FIG. 12 is a flowchart illustrating the details of the area division / map generation step. This area division / map generation step is added and executed immediately after the position and orientation calculation step S140 in FIG.
  • step S6110 the calculation unit 1120 divides the input image into semantic regions.
  • a number of approaches have been proposed for semantic domain segmentation, which can be incorporated.
  • The method is not limited to the above as long as the image is divided into semantic regions. These methods yield, for each object type, a mask image in which each pixel is labeled as belonging to that object or not.
  • In step S6120, the depth map is divided into areas. Specifically, a normal is first calculated for each pixel of the depth map, and normal edges are detected at pixels whose inner product with the surrounding normals is equal to or less than a predetermined value. Then, with these normal edges as boundaries, different labels are assigned to the respective areas, dividing the depth map into areas and producing an area-divided image.
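  • The following is a minimal sketch of such normal-based area division of a depth map; the pinhole back-projection with the principal point at the image center, the intrinsics fx and fy, the inner-product threshold, and the use of scipy's connected-component labeling are all assumptions for illustration.

```python
import numpy as np
from scipy import ndimage


def normals_from_depth(depth: np.ndarray, fx: float, fy: float) -> np.ndarray:
    """Per-pixel normals from a depth map via the cross product of the
    horizontal and vertical 3D gradients (intrinsics fx, fy assumed)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - w / 2) * depth / fx
    y = (v - h / 2) * depth / fy
    pts = np.dstack([x, y, depth])
    du = np.gradient(pts, axis=1)
    dv = np.gradient(pts, axis=0)
    n = np.cross(du, dv)
    n /= np.linalg.norm(n, axis=2, keepdims=True) + 1e-9
    return n


def segment_by_normals(depth: np.ndarray, fx: float, fy: float,
                       dot_thresh: float = 0.95) -> np.ndarray:
    """Label regions separated by normal edges (inner product below threshold)."""
    n = normals_from_depth(depth, fx, fy)
    dot_r = (n[:, :-1] * n[:, 1:]).sum(axis=2)   # neighbour to the right
    dot_d = (n[:-1, :] * n[1:, :]).sum(axis=2)   # neighbour below
    edge = np.zeros(depth.shape, dtype=bool)
    edge[:, :-1] |= dot_r < dot_thresh
    edge[:-1, :] |= dot_d < dot_thresh
    labels, _ = ndimage.label(~edge)             # connected non-edge regions
    return labels
```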
  • In step S6130, the calculation unit 1120 performs semantic area division of the point cloud based on the mask images obtained by dividing the input image into semantic regions and the area-divided image obtained by dividing the depth map. Specifically, the inclusion ratio N_i,j between the area S_Dj of the depth map and the object area S_Mi of the mask for object type i is calculated by Equation 4, where i is an object type and j is a label of the depth map area division.
  • The object type i is assigned to every depth map area S_Dj for which N_i,j is equal to or greater than a predetermined threshold.
  • background labels are assigned to pixels to which no object type has been assigned.
  • the object type i is assigned to each pixel of the depth map.
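  • A minimal sketch of this assignment by inclusion ratio is given below; since Equation 4 is not reproduced here, the ratio N_i,j = |S_Dj ∩ S_Mi| / |S_Dj| and the threshold value are assumed forms.

```python
import numpy as np


def assign_object_types(region_labels: np.ndarray, masks: dict,
                        threshold: float = 0.5) -> dict:
    """Assign an object type to each depth-map region by overlap ratio.

    region_labels : integer label image from the depth-map area division
    masks         : {object_type: boolean mask image} from semantic segmentation
    Regions without any ratio above the threshold stay background.
    """
    assignment = {}
    for j in np.unique(region_labels):
        if j == 0:                      # 0 treated as background/edges here
            continue
        region = region_labels == j
        area = region.sum()
        for obj_type, mask in masks.items():
            ratio = np.logical_and(region, mask).sum() / max(area, 1)
            if ratio >= threshold:
                assignment[j] = obj_type
                break
    return assignment
```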
  • In step S6140, the calculation unit 1120 hierarchically generates map information based on the object type labels assigned to the depth map in step S6130. Specifically, the look-up table is consulted for each object type label of the depth map, and the three-dimensional points obtained from the depth map are stored in the corresponding layer of the map information held by the holding unit 1130. When the storage is completed, the area division / map generation step ends.
  • As described above, non-moving objects suitable for position and orientation calculation and moving objects unsuitable for it can be registered separately in the map information. Using the layered map information, weights are then assigned so that moving objects contribute less, and the position and orientation are calculated according to these weights. In this way, the position and orientation can be calculated more stably and robustly.
  • the layers (1) to (4) are used.
  • the configuration is sufficient as long as the configuration has a plurality of layers according to the movement of the object, and the configuration may be such that the holding unit 1130 holds only an arbitrary number of layers (1) to (4).
  • Layers for specific objects (for example, an AGV layer and a human layer) may also be provided.
  • In the present embodiment, map information generation and position and orientation calculation are performed using the semantically segmented depth map.
  • the control unit 1140 may calculate the control value using the depth map divided into semantic regions. Specifically, when a person or another AGV is detected when the semantic region is divided, the control unit 1140 can calculate the control value so as to avoid them. By doing this, AGV can be operated safely. Also, the control unit 1140 may calculate a control value that follows a person or another AGV. By doing this, the AGV can operate even without map information. Furthermore, the calculation unit 1120 may recognize a human gesture based on the semantic region division result, and the control unit 1140 may calculate the control value.
  • For example, regions of the image are labeled by body parts such as the arms, fingers, head, torso, and legs, and gestures are recognized from their mutual positional relationship. If a beckoning gesture is recognized, a control value is calculated so as to approach the person; if a pointing gesture is recognized, a control value is calculated so as to move in the pointed direction. By recognizing human gestures in this way, the user can move the AGV without directly controlling it with a controller or the like, so the AGV can be operated with little effort.
  • The control unit 1140 may also calculate the control value according to the object type detected by the method of the present embodiment. Specifically, control is performed so as to stop if the object type is a person, and to avoid if the object type is another AGV. This makes it possible to operate the AGV efficiently: it never hits people, with whom collisions must absolutely be avoided, and it safely avoids non-human obstacles.
  • In the present embodiment, the AGV passively segments objects.
  • Alternatively, the AGV may ask people to move so that moving objects are excluded.
  • For example, the control unit 1140 calculates a control value for outputting, from a speaker (not shown), a voice message asking people to move. In this way, map information excluding moving objects can be generated.
  • semantic region division is performed to specify an object type.
  • the configuration may be such that the calculation unit 1120 calculates map information, position and orientation, and the control unit 1140 calculates control values without specifying the object type. That is, S6110 and S6120 in FIG. 12 can be removed.
  • For example, the depth map may be divided into areas by height from the ground, and the control value calculated while ignoring pixels at heights greater than the height of the AGV.
  • That is, point clouds at heights at which the AGV cannot collide are not used for route generation; a sketch is given below.
  • As a result, the number of points to be processed decreases, and the control value can be calculated at high speed.
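  • A minimal sketch of such height filtering, assuming an (N, 3) point array whose Y axis is the height above the ground:

```python
import numpy as np


def filter_points_by_height(points: np.ndarray, agv_height: float) -> np.ndarray:
    """Drop points above the AGV height before route generation.

    `points` is an (N, 3) array in a frame whose Y axis is the height from
    the ground (an assumption matching the embodiment's X-Z moving plane).
    """
    return points[points[:, 1] <= agv_height]
```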
  • Alternatively, the areas may be divided based on planarity. In this way, three-dimensional edges, which contribute strongly to the position and orientation calculation, can be used preferentially (and planes, in which ambiguity in the position and orientation remains, can be excluded from the processing), which improves robustness.
  • the weight of the moving object in the map information is reduced to reduce the degree of contribution of the position and orientation calculation.
  • the calculation unit 1120 divides the depth map into semantic regions according to the processing procedure of S6110 to S6130. Then, the weight is determined by referring to the look-up table based on the object type label of each pixel. Thereafter, in step S140, the position and orientation are calculated in consideration of the weight. As described above, the influence of moving objects in position and orientation calculation can be reduced without making the map into a layer structure. This can reduce the capacity of the map.
  • The imaging unit 110 in the present embodiment is not limited to an imaging unit in which each light receiving unit on the image sensor is constituted by two or more light receiving elements; anything that can acquire three-dimensional depth information, such as a TOF camera or 3D LiDAR, may be used.
  • In the present embodiment, the holding unit 1130 holds map information for each layer. These layers can be checked on the display unit H16 or reset to the initial map. By checking the layers on the display screen and instructing the AGV to regenerate the map if a moving object has been registered in it, the AGV can be operated easily and stably.
  • The map information created by the individual AGVs can be integrated by using the ICP algorithm to align the point clouds that refer to the same locations so that they coincide.
  • integration may be performed so as to leave newer map information by referring to the map creation time.
  • the control unit 1140 may move an AGV that has not been worked on so as to generate a map of an area for which the map information has not been updated for a while.
  • the calculation of the control value calculated by the control unit 1140 is not limited to the method described in the present embodiment as long as it is a method of calculating so as to approach the target position and orientation using map information.
  • the control value can be determined using a learning model for route generation.
  • For example, a DQN (Deep Q-Network) may be used as the learning model.
  • This can be realized by training a reinforcement learning model in advance so that the reward increases when the AGV approaches the target position and orientation, decreases when it moves away from the target position and orientation, and decreases when it approaches an obstacle; a reward-shaping sketch is given below.
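  • The following is a minimal reward-shaping sketch consistent with the description above; the coefficients and the obstacle radius are assumptions for illustration, not the embodiment's actual reward.

```python
import numpy as np


def reward(position: np.ndarray, goal: np.ndarray,
           obstacle_dist: float, prev_goal_dist: float) -> float:
    """Reward shaping for route-generation reinforcement learning:
    positive when the AGV gets closer to the target position, negative when
    it moves away, with an extra penalty when it comes near an obstacle."""
    goal_dist = float(np.linalg.norm(goal - position))
    r = 1.0 * (prev_goal_dist - goal_dist)   # progress toward the goal
    if obstacle_dist < 0.5:                  # assumed 0.5 m safety radius
        r -= 1.0                             # penalty for approaching an obstacle
    return r
```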
  • The use of the map information is not limited to this.
  • AGV transport simulation may be performed, and the process management system may generate processes so that the AGV can be transported efficiently.
  • the mobile management system may generate a route that avoids the AGV operation timing and congestion based on the map information.
  • the learning model described above may be learned along with the delivery simulation with the created map.
  • By reproducing and learning situations such as the placement of an obstacle or a collision with a person or another AGV in simulation, the control unit 1140 can stably calculate the control value using the learning model even when a similar situation actually occurs.
  • the learning model can be configured to learn the control method efficiently in a short time.
  • a UI that can be commonly applied to the first to sixth embodiments will be described. It will be described that the user confirms the visual information acquired by the imaging unit, the position / posture calculated by the calculation unit, the detection result of the object, the map information, and the like.
  • Although the AGV moves by automatic control, a case in which it is also controlled by user input will be described.
  • For example, a GUI is displayed on a display device so that the user can check the status of the AGV and control it, and operations from the user are input using an input device such as a mouse or a touch panel.
  • the display is mounted on the AGV, but the present invention is not limited to such a configuration.
  • The display of a mobile terminal owned by the user may be used as the display device via the communication I/F (H17), or a liquid crystal display connected to the mobile management system may be used as the display device.
  • display information can be generated by the information processing apparatus.
  • Alternatively, a computer attached to the display device may acquire the information required for generating the display information and generate the display information itself.
  • the configuration of the device according to the seventh embodiment is the same as that of FIG.
  • The present embodiment differs from the first embodiment in that the calculation unit 1120 generates display information based on the visual information acquired by the imaging unit 110, the position and orientation calculated by the calculation unit 1120, the detected objects, and the control value calculated by the control unit 1140, and presents it on a touch panel display or the like. The details of the display information will be described later. Further, in the present embodiment, the holding unit 1130 holds 2D map information and 3D map information.
  • FIG. 13 shows a GUI 100 which is an example of display information presented by the display device according to the present embodiment.
  • G110 is a window for presenting 2D map information.
  • G120 is a window for presenting 3D map information.
  • G130 is a window for presenting the image D154e acquired by the imaging unit 110.
  • G140 is a window for presenting the depth map D154d acquired by the imaging unit 110.
  • G150 is a window for presenting display information based on the position and orientation calculated by the calculation unit 1120 as described in the first embodiment, the objects detected as described in the fifth and sixth embodiments, and the control value calculated by the control unit 1140 as described in the first embodiment.
  • G110 shows an example of presentation of a 2D map held by the holding unit 1130.
  • G111 is an AGV on which the imaging unit 110 is mounted.
  • The calculation unit 1120 composites it onto the 2D map based on the position and orientation of the imaging unit (that is, the position and orientation of the AGV).
  • G112 is an example in which an alert is presented as a balloon when there is a possibility of a collision, based on the position and orientation of the object detected by the calculation unit 1120 according to the methods of the fifth and sixth embodiments.
  • G113 is an example in which an AGV planned route is presented as an arrow based on the control value calculated by the control unit 1140. In FIG. 13, the AGV is heading to the destination presented on G114.
  • the user can easily grasp the AGV operation status by presenting the 2D map, the position of the AGV, the detection result of the object, and the route.
  • G111 to G114 may allow the user to more easily understand the operation status by changing the color, the thickness of the line, and the shape.
  • G120 shows an example of presentation of the 3D map held by the holding unit 1130.
  • G121 is an example of visualizing the result of updating the 3D map held by the holding unit 1130 using the result of the calculation unit 1120 dividing the depth map into meaningful areas as described in the sixth embodiment.
  • Non-moving objects obtained from the factory CAD data are presented dark, while moving objects such as other AGVs and people are presented lighter.
  • In addition, the label of the object detected by the calculation unit 1120 is presented at G122.
  • the user can comprehend the operation status in consideration of the height direction in comparison with the 2D map.
  • Moreover, since object types found while the AGV is traveling are presented, they can be checked without going to the site.
  • G130 shows an example of presentation of the image acquired by the imaging unit 110.
  • A bounding box is superimposed as a dotted line around the outline of objects detected by the calculation unit 1120, such as another AGV or a person.
  • The line may instead be solid or double, or the object may be emphasized by changing its color.
  • the user can confirm the object detected by the calculation unit 1120 without any trouble.
  • G140 shows an example of presentation of the depth map acquired by the imaging unit 110.
  • G141 is an example in which the CAD model of the object held by the holding unit 1130 described in the fifth embodiment is superimposed as a wire frame using the position and orientation of the object calculated by the calculation unit 1120.
  • G142 is an example in which an AGV CAD model is superimposed as a wire frame.
  • G143 is an example in which a CAD model of a three-dimensional marker is superimposed.
  • G150 shows a GUI for manually operating the AGV, the values calculated by the calculation unit 1120 and the control unit 1140, and an example of presentation of the operation information of the AGV.
  • G151 is an emergency stop button, and the user can stop the movement of the AGV by touching the button with a finger.
  • G152 is a mouse cursor, which can move the cursor according to a user's touch operation through a mouse, a controller, and a touch panel (not shown), and can operate buttons and radio buttons in the GUI by pressing a button.
  • G153 is an example showing a controller of AGV. By moving the circle inside the controller up, down, left, and right, the user can perform the front, rear, left, and right movement of the AGV according to those inputs.
  • G154 is an example showing the internal state of AGV.
  • the AGV is illustrated as an example in which it is traveling automatically and operating at a speed of 0.5 m / s.
  • operational information such as the time since the AGV started to travel, the remaining time to the destination, and the difference in the estimated arrival time with respect to the schedule are also presented.
  • G156 is a GUI for setting the operation and display information of the AGV. The user can perform operations such as whether to generate map information and whether to present a detected object.
  • G157 is an example of presenting AGV operation information. In this example, the position and orientation calculated by the calculation unit 1120, the destination coordinates received from the mobile management system 13, and the name of the article being transported by the AGV are presented. By presenting the operation information together with the GUI that receives input from the user in this way, the AGV can be operated more intuitively.
  • The processing procedure of the information processing apparatus in the seventh embodiment differs from that of FIG. 5, which describes the processing procedure of the information processing apparatus 10 in the first embodiment, in that a display information generation step (not shown), in which the calculation unit 1120 generates display information, is newly added after step S160.
  • In the display information generation step, the display information is rendered based on the visual information captured by the imaging unit 110, the position and orientation calculated by the calculation unit 1120, the detected objects, and the control value calculated by the control unit 1140, and is output to the display device.
  • As described above, the calculation unit generates display information based on the visual information acquired by the imaging unit, the position and orientation calculated by the calculation unit, the detected objects, and the control value calculated by the control unit, and presents it on the display.
  • the user can easily check the state of the information processing apparatus.
  • The user can also input an AGV control value, various parameters, a display mode, and the like. This makes it possible to easily change the various settings of the AGV or to move it.
  • by presenting the GUI it becomes possible to easily operate the AGV.
  • The display device is not limited to the display. If a projector is mounted on the AGV, the display information can also be presented with the projector. In addition, if a display device is connected to the mobile management system 13, the display information may be transmitted to the mobile management system 13 via the communication I/F (H17) and presented there. It is also possible to transmit only the information necessary for generating the display information and have a computer inside the mobile management system 13 generate it. In this way, the user can check the operation status of the AGV and operate it without looking at the display device mounted on the AGV.
  • the display information in the present embodiment may be anything as long as it presents information handled by the present information processing.
  • For example, the time and frame rate of the position and orientation calculation, the remaining battery level of the AGV, and the like may be displayed.
  • The GUI described in the present embodiment is an example; any GUI may be used as long as it allows the user to grasp the operating status of the AGV and to perform operations (input) on it.
  • the display information can be changed such as changing color, switching line thickness, solid line, broken line, double line, scaling, and hiding unnecessary information.
  • the object model may display a contour instead of a wire frame, or a transparent polygon model may be superimposed. By changing the method of visualizing display information in this manner, the user can more intuitively understand the display information.
  • the GUI described in the present embodiment can also be connected to a server (not shown) via the Internet.
  • In this way, a person in charge at the AGV manufacturer can check the state of the AGV by acquiring the display information via the server, without going to the site.
  • the input device is exemplified by a touch panel, but any input device that receives an input from the user may be used. It may be a keyboard, a mouse, or a gesture (for example, it may be recognized from visual information acquired by the imaging unit 110). Furthermore, the mobile management system may be an input device via the network. In addition, if a smartphone or a tablet terminal is connected via the communication I / F (H17), they can also be used as a display device / input device.
  • the input device is not limited to the one described in the present embodiment, and anything may be used as long as it changes the parameters of the information processing apparatus.
  • the user's input may be accepted to change the upper limit (the upper limit of the speed) of the control value of the moving object, or the destination point clicked by the user may be input on G110.
  • The user's selection of which models to use for object detection and which not to use may also be input. Alternatively, the user may enclose, on G130, an object that could not be detected, and a learning device (not shown) may then train the learning model so that the object is detected from the visual information of the imaging unit 110.
  • In the sixth embodiment, the visual information obtained by the imaging unit 110 is divided into semantic regions, the type of object is determined for each pixel, a map is generated, and the map and the determined object types are used for control.
  • Embodiment 8 further describes a method of recognizing different semantic information depending on the situation even in the same object type, and controlling the AGV based on the recognition result.
  • the stacking degree of objects such as stacked products in a factory is recognized as semantic information. That is, it recognizes the semantic information of the object that is in the field of view of the imaging unit 110. Then, a method of controlling the AGV according to the stacking degree of the objects will be described. In other words, the AGV is controlled to more safely avoid objects that are stacked.
  • the stacking degree of objects in the present embodiment is the number of stacked objects or the height.
  • An AGV control value calculation uses an occupancy map indicating whether space is occupied by an object.
  • the occupancy map a two-dimensional occupancy grid map is used in which a scene is divided into grids and the probability that an obstacle exists in each grid is held.
  • Each cell of the occupancy map holds a value representing the degree of approach rejection for the AGV (the closer to 0, the more passage is permitted; the closer to 1, the more passage is rejected).
  • The AGV is controlled toward the destination so that it does not pass through areas (grid cells in the present embodiment) whose approach rejection value in the occupancy map is equal to or greater than a predetermined value.
  • the destination is a two-dimensional coordinate which is the destination of the AGV, which is included in the operation information acquired from the process management system 12.
  • the information processing system according to the present embodiment is the same as the system configuration described in FIG.
  • FIG. 14 is a diagram showing a module configuration of the mobile unit 12 including the information processing apparatus 80 according to the eighth embodiment.
  • the information processing apparatus 80 includes an input unit 1110, a position and orientation calculation unit 8110, a semantic information recognition unit 8120, and a control unit 8130.
  • the input unit 1110 is connected to the imaging unit 110 mounted on the moving body 12.
  • the controller 8130 is connected to the actuator 120.
  • A communication device (not shown) exchanges information bidirectionally with the mobile management system 13 and performs input and output with the various units of the information processing device 80.
  • the imaging unit 110, the actuator 120, and the input unit 1110 in the present embodiment are the same as in the first embodiment, and thus the detailed description will be omitted.
  • the position and orientation calculation unit 8110, the semantic information recognition unit 8120, and the control unit 8130 will be sequentially described below.
  • the position and orientation calculation unit 8110 calculates the position and orientation of the imaging unit 110 based on the depth map input by the input unit 1110. Also, a three-dimensional map of the scene is created based on the calculated position and orientation. The calculated position and orientation and the three-dimensional map are input to the semantic information recognition unit 8120 and the control unit 8130.
  • The semantic information recognition unit 8120 uses the depth map input by the input unit 1110, the position and orientation calculated by the position and orientation calculation unit 8110, and the three-dimensional map to estimate, as semantic information, the number of stacked objects and their heights.
  • the estimated number and height values are input to the control unit 8130.
  • the control unit 8130 inputs the position and orientation calculated by the position and orientation calculation unit 8110 and the three-dimensional map. Further, the value of the number and height of stacked objects as semantic information estimated by the semantic information recognition unit 8120 is input.
  • The control unit 8130 calculates approach rejection values for objects in the scene based on the input values, and calculates a control value for controlling the AGV so that it does not pass through cells of the occupancy grid whose approach rejection value is equal to or greater than the predetermined value. The control unit 8130 outputs the calculated control value to the actuator 120.
  • FIG. 15 is a flowchart showing the processing procedure of the information processing apparatus 80 in the present embodiment.
  • the processing steps include initialization S110, visual information acquisition S120, visual information input S130, position and orientation calculation S810, semantic information estimation S820, control value calculation S830, control S160, and system termination determination S170. Note that the initialization S110, the visual information acquisition S120, the visual information input S130, the control S160, and the system termination determination S170 are the same as those in FIG.
  • the steps of position and orientation calculation S810, semantic information estimation S820, and control value calculation S830 will be described in order below.
  • In step S810, the position and orientation calculation unit 8110 calculates the position and orientation of the imaging unit 110 and creates a three-dimensional map.
  • This is realized by an SLAM (Simultaneous Localization and Mapping) algorithm that performs position and orientation estimation while creating a map based on the position and orientation.
  • the position and orientation are calculated by the ICP algorithm such that the difference in depth of the depth map acquired by the imaging unit 110 at a plurality of times is minimized.
  • a three-dimensional map is created using a Point-Based Fusion algorithm that integrates depth maps in time series based on the calculated position and orientation.
  • In step S820, the semantic information recognition unit 8120 divides the depth map and the three-dimensional map into areas, and calculates the stacking number (n) and height (h) of objects for each area. The specific processing procedure is described below.
  • First, the normal direction is calculated from the depth value of each pixel of the depth map and its surrounding pixels.
  • Connected pixels whose normals are similar are assigned a common, unique area identification label as the same object area; in this way, the depth map is segmented. The area identification labels are then propagated to the points of the three-dimensional map referenced by the pixels of the area-divided depth map, so that the three-dimensional map is also divided into areas.
  • Next, bounding boxes are created by dividing the three-dimensional map at equal intervals in the X-Z directions (the movement plane of the AGV). Each bounding box is scanned from the ground upward in the vertical (Y-axis) direction, and the number of distinct area labels among the points it contains is counted. In addition, the maximum height of those points above the ground (X-Z plane) is calculated. The calculated label count n and maximum height h are stored in the three-dimensional map for each point; a sketch is given below.
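  • A minimal sketch of counting the label number n and maximum height h per X-Z grid cell; the grid pitch, the data layout, and treating Y as the height axis are assumptions.

```python
import numpy as np


def stack_stats_per_cell(points: np.ndarray, labels: np.ndarray,
                         cell_size: float = 0.5) -> dict:
    """For each X-Z grid cell, count the distinct area labels (n) and find
    the maximum height above the ground (h).

    points : (N, 3) array with Y as the height axis (assumption)
    labels : (N,) area identification label per point
    """
    cells = {}
    ij = np.floor(points[:, [0, 2]] / cell_size).astype(int)
    for (i, j), label, y in zip(map(tuple, ij), labels, points[:, 1]):
        n_set, h = cells.setdefault((i, j), (set(), 0.0))
        n_set.add(int(label))
        cells[(i, j)] = (n_set, max(h, float(y)))
    return {c: (len(n_set), h) for c, (n_set, h) in cells.items()}
```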
  • In step S830, the control unit 8130 creates an occupancy map based on the three-dimensional map. The approach rejection values of the occupancy map are then updated from the stacking number (n) and height (h) of the objects, and the AGV is controlled based on the updated occupancy map.
  • the three-dimensional map created in step S810 is projected onto the XZ plane, which is a floor surface corresponding to the movement plane of the AGV, to obtain a 2D occupancy map.
  • The approach rejection value of each grid cell of the occupancy map is updated from the distance between that cell and the points obtained by projecting the three-dimensional map onto the X-Z plane, and from the stacking number (n) and height (h) stored for those points.
  • Let p_i be the X-Z coordinates obtained by projecting the i-th point P_i of the point cloud onto the X-Z plane, and let q_j be the coordinates of the j-th cell Q_j of the occupancy map.
  • The approach rejection value of a cell is set larger as h and n become larger, and smaller as the distance d_ij becomes larger, where d_ij is the Euclidean distance between p_i and q_j; a sketch of one such function is given below.
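  • Since the exact update formula is not reproduced here, the following sketch uses one assumed function that grows with n and h and decays with distance, clipped to [0, 1].

```python
import numpy as np


def approach_rejection(p: np.ndarray, n: np.ndarray, h: np.ndarray,
                       q: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Approach rejection values for occupancy-map cells q (M, 2), computed
    from projected object points p (N, 2) with stack counts n (N,) and
    heights h (N,); value_j = max_i (n_i * h_i) * exp(-d_ij / sigma)."""
    d = np.linalg.norm(q[:, None, :] - p[None, :, :], axis=2)   # d_ij, shape (M, N)
    raw = (n * h)[None, :] * np.exp(-d / sigma)
    return np.clip(raw.max(axis=1), 0.0, 1.0)
```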
  • A control value is then calculated so that the difference between the AGV's pose and the target position and orientation is minimized while the AGV avoids grid cells with high approach rejection values in the occupancy map.
  • The control value calculated by the control unit 8130 is output to the actuator 120.
  • As described above, the stacking number and height of objects around the AGV are estimated as the semantic information, and the AGV is controlled to keep a greater distance from objects as these values increase. This allows the AGV to travel farther away from, for example, shelves or pallets loaded with many articles, as in a distribution warehouse, so that the AGV can be operated more safely.
  • the imaging unit 110 in the present embodiment may be anything as long as it can acquire an image and a depth map, such as a TOF camera or a stereo camera. Furthermore, an RGB camera that acquires only an image, or a monocular camera such as a monochrome camera may be used. When a single-eye camera is used, depth is required for position and orientation calculation and occupancy map generation processing, but the present embodiment is realized by calculating the depth value from the movement of the camera.
  • the imaging unit 110 described in the following embodiments is also configured in the same manner as the present embodiment.
  • the value of the approach rejection degree of the occupancy map is not limited to the method described in the present embodiment as long as it is a function whose value is larger as the height of the object is higher and the stacking number is larger and smaller as the distance is larger.
  • it may be a function proportional to the height or stacking number of the object, or may be a function inversely proportional to the distance.
  • The function may consider only one of the height of the object or the number of stacks, or the value may be determined with reference to a list that stores occupancy values according to the distance, the height of the object, and the number of stacks. Note that this list may be stored in advance in the external memory (H14), or it may be held by the mobile management system 13 and downloaded to the information processing apparatus 80 via the communication I/F (H17) as necessary.
  • The occupancy map is not limited to the form described in the present embodiment; anything that can determine the presence or absence of an object in the space may be used. For example, objects may be represented as point clouds of a predetermined radius or approximated by some function. Not only a two-dimensional but also a three-dimensional occupancy map may be used; for example, the map may be held as a signed distance field, that is, a TSDF (Truncated Signed Distance Function), stored in a 3D voxel space (X, Y, Z).
  • In the present embodiment, the control value is calculated using an occupancy map whose approach rejection degree varies with the height and stacking number of objects, but the method is not limited to this; the control value may be changed in any way based on the semantic information of the objects.
  • the control value may be determined with reference to a list describing a control method according to the height and stacking number of objects.
  • The list describing the control method defines operations, such as turning left or decelerating, to be performed when conditions on the height and stacking number of objects are satisfied.
  • the AGV may be controlled based on a predetermined rule, such as calculating a control value that rotates so as not to appear in the field of view when objects of a predetermined height or stacking number are found.
  • the AGV may be controlled by applying a function having a measured value as a variable, such as calculating a control value that reduces the speed as the height and the number of stacks increase.
  • the imaging unit 110 is mounted on the AGV.
  • the imaging unit 110 need not be mounted on the AGV as long as it can capture the traveling direction of the AGV.
  • a surveillance camera attached to a ceiling may be used as the imaging device 110.
  • the imaging device 110 can capture an AGV, and the position and orientation with respect to the imaging device 110 can be determined by, for example, an ICP algorithm.
  • a marker may be attached to the upper part of the AGV, and the position and orientation may be obtained by the imaging device 110 detecting the marker.
  • the imaging device 110 may detect an object on the traveling route of the AGV.
  • the imaging device 110 may be one or more.
  • the position and orientation calculation unit 8110, the semantic information recognition unit 8120, and the control unit 8130 do not have to be mounted on the AGV.
  • For example, the control unit 8130 may be mounted on the mobile management system 13. In this case, this can be realized by transmitting and receiving the necessary information through the communication I/F (H17). In this way, a large computer does not need to be mounted on the mobile AGV and its weight can be reduced, so the AGV can be operated efficiently.
  • the semantic information is the stacking degree of objects.
  • the semantic information recognition unit 8120 may recognize any semantic information as long as the AGV can calculate the control value for operating safely and efficiently.
  • the control unit 8130 may calculate the control value using the semantic information.
  • the position of the structure may be recognized as semantic information.
  • the degree of opening of a "door” which is a structure in a factory can also be used as semantic information.
  • the AGV runs slower when the door is open or opening as compared to when the door is closed. Also, it recognizes that an object is suspended by a crane, and controls so as not to get into the lower part of the object. By doing this, the AGV can be operated more safely.
  • Although the stacking degree is recognized in the present embodiment, it is also possible to recognize that objects are lined up close to each other. For example, a plurality of carts may be recognized, and if the distances between them are smaller than a predetermined value, a control value is calculated so as to keep at least a predetermined distance from them.
  • Other AGVs and the packages located above them may be recognized so as to determine that the packages are loaded on those AGVs. If another AGV is carrying a package, the AGV itself performs the avoidance; otherwise, it proceeds straight ahead, and a signal requesting the other AGV to avoid may be sent via the mobile management system 13.
  • The size of the package may also be recognized and the control method determined according to it. By determining whether a package is loaded and how large it is, the avoidance operation can be assigned to the AGV without a package or to the AGV carrying the smaller package, reducing the energy and time required for the movement and allowing the AGVs to be operated efficiently.
  • Alternatively, a control value may be calculated so that the AGV itself performs the avoidance.
  • In this way, the AGVs can be operated safely with less damage to the packages.
  • The outer shape of an object shown in the input image may also be used as the semantic information. Specifically, when the detected object is sharp or pointed, the AGV travels at a greater distance from it and can thus be operated safely without damage. For a flat object such as a wall, keeping a fixed distance suppresses fluctuation of the AGV's trajectory and allows stable, efficient operation.
  • the degree of danger or fragility of the object itself may be recognized. For example, when recognizing the letters “danger” and the mark on the cardboard, the AGV is controlled to move away from the cardboard by a predetermined distance or more. By doing this, it is possible to operate the AGV more safely on the basis of the danger or fragility of the object.
  • The lighting state of a stack light indicating the operating state of an automatic machine in the factory may also be recognized, and the control value calculated so that the AGV does not approach within a predetermined distance while the machine is in operation. In this way, the AGV is not detected by the safety sensors of the automatic machine, the machine does not need to stop, and the AGV can be operated efficiently.
  • the control method is not limited to the above method, and any method that can operate AGV efficiently and safely can be used.
  • acceleration and deceleration parameters may be changed.
  • precise control can be performed such as whether to decelerate gently or suddenly decelerate according to the semantic information.
  • the parameters of the avoidance may be changed, or the control may be switched such as whether to avoid near the object, to largely avoid, to change the route and to avoid, or to stop.
  • the frequency of calculation of control value of AGV is increased or decreased. By increasing the frequency, finer control can be achieved, and by decreasing the frequency, slow control can be achieved.
  • AGV is operated more efficiently and safely by changing the control method based on the semantic information.
  • the AGV is controlled based on static semantic information around a certain time, such as the stacking degree and shape of objects existing around the AGV, and the state of the structure.
  • In the present embodiment, AGVs are controlled based on temporal changes in such information.
  • the semantic information in the present embodiment refers to the amount of movement of an object shown in an image.
  • In addition to the movement amount of an object shown in the image, the type of the object is also recognized, and a method of calculating the control value of the AGV based on the result is described. Specifically, other AGVs and the packages placed on them are recognized as the types of surrounding objects, together with the movement amounts of the other AGVs, and the control value of the AGV itself or of another AGV is calculated based on the recognition results.
  • the configuration of the information processing apparatus in the present embodiment is the same as that of FIG. 14 of the information processing apparatus 80 described in the eighth embodiment, and thus the description thereof is omitted.
  • The difference from the eighth embodiment is that the semantic information estimated by the semantic information recognition unit 8120 and input to the control unit 8130 consists of the detected object types (another AGV and the load placed on it) and the movement amount of the other AGV.
  • the diagram of the processing procedure in the present embodiment is the same as FIG. 15 for describing the processing procedure of the information processing apparatus 80 described in the eighth embodiment, and therefore the description thereof is omitted. What differs from the eighth embodiment is the processing contents of the semantic information estimation step S820 and the control value calculation S830.
  • In step S820, the semantic information recognition unit 8120 divides the depth map into areas and estimates the type of object for each area. At the same time, the position and size of each object are estimated. Next, among the detected objects, the current position of another AGV is compared with its past position to calculate its movement amount. In the present embodiment, the movement amount of another AGV is the amount of change in its position and orientation relative to the AGV itself.
  • the depth map is divided into areas based on the image and the depth map, and an object type for each area is specified.
  • Next, the area recognized as an AGV is extracted, and its relative positional relationship with the other areas is calculated. An area whose distance to the AGV area is smaller than a predetermined threshold and which lies in the vertical (Y-axis) direction above the AGV area is determined to be the cargo area loaded on that AGV. In addition, the sizes of the AGV and of the loaded cargo area are obtained, where the size is the length of the long side of the bounding box enclosing the area.
  • the regions recognized as AGV at time t-1 and time t are extracted respectively, and their relative positional relationship is calculated using an ICP algorithm.
  • The calculated change in the relative positional relationship is the amount of change in the position and orientation of the other AGV relative to the AGV itself; this is hereinafter referred to as the movement amount of the other AGV. A computation sketch is given below.
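  • A minimal sketch of computing such a movement amount from two relative poses; representing the change as a translation norm and a rotation angle is an illustrative choice, not necessarily the embodiment's representation.

```python
import numpy as np


def movement_amount(T_prev: np.ndarray, T_curr: np.ndarray):
    """Change in the relative position and orientation of another AGV.

    T_prev, T_curr are 4x4 poses of the other AGV relative to the AGV itself
    at times t-1 and t (e.g. obtained by ICP). Returns the translation norm
    and the rotation angle of the relative change.
    """
    delta = np.linalg.inv(T_prev) @ T_curr
    trans = float(np.linalg.norm(delta[:3, 3]))
    cos_theta = np.clip((np.trace(delta[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    return trans, float(np.arccos(cos_theta))
```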
  • in the control value calculation S830, the control unit 8130 decides the action of its own AGV based on the movement amount of the other AGV calculated by the semantic information recognition unit 8120 in step S820 and on the sizes of the other AGV and of the package mounted on it.
  • the control value is not changed when the other AGV moves away.
  • otherwise, a new control value is calculated based on the size of the package. Specifically, the size of one's own AGV, stored in advance in the RAM (H13) through input means (not shown), is compared with the size of the other AGV and of its package. If one's own AGV is smaller, it performs route planning to avoid the other AGV. If one's own AGV is larger, it decelerates and sends a signal to the mobile management system 13 through the communication interface H17 so that the mobile management system 13 makes the other AGV perform the avoidance operation. A minimal sketch of this decision logic is given after this bullet list.
  • in this way, the control value is calculated based on the result of determining the types of objects around the AGV as the semantic information and, in the case of another AGV, further estimating its movement amount and the size of the loaded luggage.
  • when the other AGV or its package is larger than one's own AGV, control is performed such that one's own AGV performs the avoidance; conversely, when it is smaller, the other AGV is made to avoid.
  • in the present embodiment, another AGV is detected as the surrounding mobile body.
  • any object may be detected as long as at least the position and the posture change and the control of the AGV can be changed accordingly.
  • a forklift or a mobile robot may be detected as the mobile body.
  • the amount of change in the position or posture of a part of an apparatus may also be recognized as the semantic information, and the control of the AGV may be changed accordingly. For example, if the movable unit of a machine such as an automatic machine, a robot arm, or a belt conveyor moves faster than a predetermined operation speed, the AGV may be controlled so as to keep at least a predetermined distance from it.
  • the control value is calculated such that one of the AGV and the other AGV avoids, but any control method may be used as long as the control value is changed according to the movement of the object.
  • the value of the approach rejection degree of the occupancy map described in the eighth embodiment may be dynamically updated according to the magnitude of the movement amount, and the control value of the AGV may be calculated using this.
  • a control value may be calculated so that one's own AGV follows the other AGV when the other AGV is moving in the same direction as one's own AGV. When a crossroad is reached and another AGV has already entered it from the lateral direction, a control value may be calculated to wait until that AGV has finished passing, or, if the user gives priority at the crossroad via the mobile management system 13, a control value may be calculated that makes the other AGV stand by. Furthermore, when another AGV is observed to vibrate left and right with respect to its traveling direction, or when the load mounted on another AGV is observed to vibrate with respect to that AGV, a control value may be calculated that selects a route keeping at least a certain distance from it.
  • the work process may be further recognized as semantic information from the movement of the object. For example, it may be recognized that the robot is in the process of loading another AGV. At this time, the control value may be calculated such that oneself (AGV) searches for another route.
  • as described above, the movement of an object is recognized as the semantic information, and moving bodies such as one's own AGV or a forklift are controlled accordingly so as to operate more efficiently.
  • a method for safely operating an AGV will be described based on the result of recognizing the work and role of a person.
  • the work type of a person is estimated from the person and the type of object held by the person as the semantic information, and the AGV is controlled according to the work type.
  • for example, when a person and a hand lift pushed by the person are detected, transport work is recognized as the work type and the AGV is controlled to avoid the person; when a person and a welder held by the person are detected, welding work is recognized as the work type and control such as changing the AGV route is performed.
  • the approach rejection degree parameter that determines how the AGV is controlled for each combination of a person and an object possessed by the person is given manually in advance.
  • the parameter is, for example, 0.4 when a person is carrying a large package, 0.6 when a person is pushing a cart, and 0.9 when a person is holding a welding machine.
  • the mobile unit management system 13 holds a list of these parameters.
  • the list can be downloaded from the mobile management system 13 to the information processing apparatus 80 via the communication I/F (H17), and can be stored in and referenced from the external memory (H14).
  • the configuration of the information processing apparatus in the present embodiment is the same as that of FIG. 14 of the information processing apparatus 80 described in the eighth embodiment, and thus the description thereof is omitted.
  • the difference from the eighth embodiment is that the semantic information that the semantic information recognition unit 8120 estimates and inputs to the control unit 8130 is different.
  • the diagram of the processing procedure in the present embodiment is the same as FIG. 15 for describing the processing procedure of the information processing apparatus 80 described in the eighth embodiment, and therefore the description thereof is omitted. What differs from the eighth embodiment is the processing contents of the semantic information estimation step S820 and the control value calculation S830.
  • the semantic information recognition unit 8120 recognizes a person and an object type held by the person from the input image. Then, the AGV is controlled based on the parameter list in which the control rules of the AGV corresponding to the person and the object held by the person stored in advance in the external memory H14 are recorded.
  • the part of the body corresponding to a human hand is detected from the visual information.
  • a method is used that recognizes each human body part and their connections and estimates the human skeleton. Image coordinates corresponding to the position of the human hand are then acquired.
  • an object type held by a person is detected.
  • the neural network described in the sixth embodiment and trained to divide an image into regions according to object types is used.
  • an area within a predetermined distance from the image coordinates of the position of the human hand is recognized as an object area held by a person, and an object type assigned to the area is acquired.
  • the object type mentioned here is uniquely associated with the object ID held by the above list.
  • the parameter of the approach rejection degree is acquired.
  • the acquired information is input to the control unit 8130 by the semantic information recognition unit 8120.
  • control unit 8130 determines the action of itself (AGV) based on the parameter of the approach rejection degree of the object calculated by the semantic information recognition unit 8120 in step S820.
  • the control value is calculated by updating the approach rejection degree of the occupancy map described in the eighth embodiment, where Score_j, the value of the j-th grid, is computed from parameters s_i, with s_i representing the approach rejection degree of the i-th object detected in step S820. The update function increases as s_i increases and decreases as the distance from the object increases (see the sketch after this bullet list).
  • the travel route of the AGV is determined as described in the eighth embodiment using the occupancy map defined as described above.
  • a control value is also calculated that limits the maximum velocity v_max of the AGV based on the approach rejection degree of the occupancy-map grid the AGV is currently passing through. In the corresponding expression, one term is an adjustment parameter between the approach rejection degree of the occupancy map and the speed, and α is the approach rejection degree of the grid the AGV is currently passing.
  • v_max is calculated so that it approaches 0 as the approach rejection degree of the occupancy map increases (approaches 1). The control value calculated by the control unit 8130 in this manner is output to the actuator 130.
  • as described above, the work type of a person is determined from the combination of the person and the object held by the person, and a parameter representing the approach rejection degree is determined accordingly. The control value is then calculated so that the larger the approach rejection degree, the greater the distance the AGV keeps from the person and the slower it moves. This keeps the AGV at an appropriate distance according to the person's work, so the AGV can be controlled more safely.
  • a person's clothing may be recognized as semantic information. For example, in a factory, a person wearing work clothes can be assumed to be a worker and a person wearing a suit to be a visitor. Using this recognition result, the AGV is controlled more safely by moving more slowly when passing near a visitor, who is less accustomed to the movement of the AGV than a worker.
  • a person's age may be recognized as semantic information. For example, for an AGV that performs in-hospital delivery, when a child or an elderly person is recognized, the AGV can be operated more safely by passing slowly at a predetermined distance.
  • a person's movement may be recognized as semantic information. For example, for an AGV carrying luggage in a hotel, when it is recognized that a person repeatedly sways back and forth and left and right, as when walking unsteadily, a control value is calculated to pass at a predetermined distance, so the AGV can be operated safely.
  • a control value may be calculated so that the AGV approaches a worker slowly and stops until the loading of the package is finished. In this way, the worker does not need to walk to the stop position of the AGV to load the package, and the work can be performed efficiently.
  • the number of people may be recognized as semantic information. Specifically, the route is changed when more people than a predetermined number are recognized on the planned route of the AGV. This avoids having to thread through a crowd and the associated risk of contact, so the AGV can be operated more safely.
  • the configuration of the device in the eleventh embodiment is the same as that of FIG. 2 showing the configuration of the information processing device 80 described in the eighth embodiment, and thus the description thereof is omitted.
  • the configuration of the device for display is the same as the configuration described in the seventh embodiment, and is thus omitted.
  • FIG. 13 shows a GUI 200 which is an example of display information presented by the display device according to the present embodiment.
  • G210 is a window for presenting the visual information acquired by the imaging unit 110 and the semantic information recognized by the semantic information recognition unit 8120.
  • G220 is a window for presenting the approach rejection for navigation of the AGV described in the eighth embodiment.
  • G230 is a window for presenting a 2D occupancy map.
  • G240 is a window that provides a GUI for manually operating the AGV and presents the values calculated by the position and orientation calculation unit 8110, the semantic information recognition unit 8120, and the control unit 8130, together with AGV operation information.
  • G210 shows an example of presentation of a plurality of objects, their relative distances, and values of the approach rejection degree as the semantic information detected by the semantic information recognition unit 8120.
  • G211 is a bounding box of a detected object. In this embodiment, bounding boxes that surround the detected other AGVs and their packages are indicated by dotted lines. Although a single bounding box is presented here for the integrated set of objects, a bounding box may instead be drawn for each detected object. The bounding box may be drawn in any way as long as the position of the detected object can be seen, for example with a dotted or solid line, or by superimposing a semitransparent mask.
  • G212 is a pop-up that presents the detected semantic information: the detected object types, their distances, and the values of the approach rejection degree. By superimposing the recognized semantic information on the visual information in this way, the user can intuitively associate the visual information with the semantic information and grasp it.
  • G220 is an example in which the proximity rejection degree of the AGV calculated by the control unit 8130 is superimposed on the visual information acquired by the imaging unit 110.
  • G221 superimposes darker colors as the degree of approach rejection is higher. By presenting the approach rejection degree superimposed on the visual information in this manner, the user can intuitively associate the visual information with the approach rejection degree and grasp the information. Note that G221 may allow the user to more easily understand the approach rejection degree by changing the color, density, or shape.
  • G230 is an example of presenting the occupancy map calculated by the control unit 8130 and the semantic information recognized by the semantic information recognition unit 8120.
  • G231 visualizes the approach rejection degree of the occupancy map so that the displayed intensity becomes stronger as the value becomes larger and weaker as the value becomes smaller.
  • G232 further presents the position of a structure as semantic information recognized by the semantic information recognition unit 8120. In the present embodiment, the result of recognizing that a factory door is open is presented as an example.
  • G233 further presents the movement amounts of surrounding objects as semantic information recognized by the semantic information recognition unit 8120. In the present embodiment, the moving direction and the speed of the object are presented.
  • by presenting the information in this way, the user can easily associate these items and grasp the internal state of the AGV. Presenting the occupancy map in this manner also lets the user easily follow the route generation process of the control unit 8130.
  • G240 thus provides a GUI for manually operating the AGV and an example presentation of the values calculated by the position and orientation calculation unit 8110, the semantic information recognition unit 8120, and the control unit 8130, together with the operation information of the AGV.
  • G241 is a GUI for setting the semantic information recognized by the semantic information recognition unit 8120 and whether or not to display the recognition result, and is, for example, a radio button for switching on / off of an item.
  • G 242 is a GUI for adjusting the proximity rejection distance calculated by the control unit 8130 and parameters for calculating the control value, and corresponds to, for example, a slide bar or a number input form.
  • the GUI described in the present embodiment is an example; any visualization method may be used as long as the GUI presents the semantic information calculated by the semantic information recognition unit 8120, the approach rejection degree of the occupancy map calculated by the control unit 8130, and so on, and lets the user grasp the internal state of the AGV.
  • the way the display information is drawn can also be changed, for example by changing colors, switching between thick and thin, solid, broken, or double lines, scaling, or hiding unnecessary information. By changing the visualization of the display information in this manner, the user can understand it more intuitively.
  • the present invention can also be realized by supplying a program that implements one or more functions of the above-described embodiments to a system or apparatus via a network or a storage medium and having one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more of those functions.
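As a rough illustration of the ninth embodiment's decision logic above (movement amount of another AGV and size-based avoidance), a minimal sketch follows. It is not the patented implementation: the function names, the centroid-based stand-in for the ICP alignment, and the "moving away" test on the relative depth are all assumptions made for brevity.

```python
import numpy as np

def movement_of_other_agv(region_t, region_t_minus_1):
    """Approximate the change in relative position of another AGV between two
    frames. The text uses an ICP alignment of the two regions; a centroid
    difference is used here as a simplified stand-in."""
    return np.mean(np.asarray(region_t), axis=0) - np.mean(np.asarray(region_t_minus_1), axis=0)

def bounding_box_size(points):
    """Size of a region = length of the long side of its bounding box."""
    pts = np.asarray(points, dtype=float)
    return float(np.max(pts.max(axis=0) - pts.min(axis=0)))

def decide_action(own_size, other_agv_points, other_load_points, movement):
    """Return a control decision based on the other AGV's movement and size:
    keep the current control if it moves away; otherwise the smaller vehicle
    replans to avoid, the larger one decelerates and asks the mobile
    management system to make the other AGV avoid."""
    if movement[2] > 0:  # hypothetical test: relative depth increasing = moving away
        return "keep_current_control"
    other_size = max(bounding_box_size(other_agv_points),
                     bounding_box_size(other_load_points))
    if own_size < other_size:
        return "replan_route_to_avoid"
    return "decelerate_and_request_other_to_avoid"
```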
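For the tenth embodiment, the occupancy-map update and speed-limit equations referred to in the text are not reproduced here, so the sketch below only illustrates one possible pair of functions with the stated properties: the grid score Score_j grows with each detected object's approach rejection degree s_i and decays with distance, and the speed limit v_max approaches 0 as the grid value α approaches 1. The exponential and linear forms are assumptions, not the patent's formulas.

```python
import math

def grid_score(grid_xy, detections):
    """Score of one occupancy-map grid cell.

    detections: list of (s_i, (x_i, y_i)) pairs, where s_i is the approach
    rejection degree of the i-th object detected in step S820. The score
    increases with s_i and decreases with distance to the object; the
    exponential fall-off is an assumption."""
    score = 0.0
    for s_i, (x_i, y_i) in detections:
        d = math.hypot(grid_xy[0] - x_i, grid_xy[1] - y_i)
        score += s_i * math.exp(-d)
    return min(score, 1.0)  # keep the value in [0, 1]

def limited_max_speed(v_nominal, alpha, k=1.0):
    """Upper speed limit while passing a grid with approach rejection alpha.
    k plays the role of the adjustment parameter between rejection degree and
    speed; the linear form only reproduces the stated behaviour that v_max
    approaches 0 as alpha approaches 1."""
    return max(0.0, v_nominal * (1.0 - k * alpha))
```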

Abstract

This information processing device: accepts input of image information acquired by an image capturing unit which is mounted on a moving body and in which each light receiving unit on an image capturing element consists of at least two light receiving elements; holds map information; acquires a position and attitude of the image capturing unit on the basis of the image information and the map information; and obtains a control value for controlling the movement of the moving body on the basis of the position and attitude, acquired by an acquiring means.

Description

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, PROGRAM, AND SYSTEM
 本発明は、移動体の移動制御を行う技術に関する。 The present invention relates to technology for performing movement control of a mobile.
 For example, there are moving bodies such as automated guided vehicles (AGV (Automated Guided Vehicle)) and autonomous mobile robots (AMR (Autonomous Mobile Robot)). When such a body is run in an environment such as a factory or a distribution warehouse, in order to control its movement stably, a tape has conventionally been attached to the floor as in Patent Document 1 and the body has been run while detecting the tape with a sensor mounted on it.
Japanese Patent Application Publication No. 2010-33434
 しかし、特許文献1の技術では、移動体を走行させる環境内で、物のレイアウト変更を行って動線が変わる度に、テープを貼り直す必要があったため、手間がかかっていた。そのような手間を減らし、安定して移動体を走行させることが求められている。 However, in the technology of Patent Document 1, since it is necessary to restick the tape every time the flow line is changed by changing the layout of an object in the environment in which the moving body travels, it takes time and effort. It is required to reduce the time and effort and stably run the moving body.
 本発明は、上記の課題に鑑みてなされたものであり、移動体の移動制御を、安定して行う情報処理装置を提供することを目的とする。また、その方法、及びプログラムを提供することを目的とする。 The present invention has been made in view of the above problems, and an object thereof is to provide an information processing apparatus which stably performs movement control of a mobile body. Moreover, it aims at providing the method and program.
 本発明に係る情報処理装置は以下の構成を備える。 An information processing apparatus according to the present invention has the following configuration.
 移動体に搭載された、撮像素子上の各々の受光部が2以上の受光素子によって構成される撮像部が取得した画像情報の入力を受け付ける入力手段と、
 マップ情報を保持する保持手段と、
 前記画像情報と前記マップ情報とに基づいて前記撮像部の位置姿勢を取得する取得手段と、
 前記取得手段が取得した位置姿勢に基づいて前記移動体の移動を制御する制御値を得る制御手段。
An input unit that receives an input of image information acquired by an imaging unit that is mounted on a moving body and in which each light receiving unit on the imaging element is configured by two or more light receiving elements;
Holding means for holding map information;
Acquisition means for acquiring the position and orientation of the imaging unit based on the image information and the map information;
Control means for obtaining a control value for controlling the movement of the movable body based on the position and orientation acquired by the acquisition means.
 本発明によれば、移動体の移動制御を安定して行うことが出来る。 According to the present invention, movement control of the moving body can be stably performed.
 添付図面は明細書に含まれ、その一部を構成し、本発明の実施の形態を示し、その記述と共に本発明の原理を説明するために用いられる。 The accompanying drawings are included in the specification, constitute a part thereof, show embodiments of the present invention, and are used together with the description to explain the principle of the present invention.
 The drawings are as follows: a diagram illustrating the system configuration in the first embodiment; a diagram illustrating the functional configuration in the first embodiment; three diagrams illustrating the imaging element D150 included in the imaging unit 110; a diagram showing an example of the images 152a to 154d captured by the imaging unit 110; a flowchart showing the flow of processing of the apparatus of the first embodiment; a diagram showing the hardware configuration of the apparatus of the first embodiment; a flowchart showing the procedure of correction processing of visual information using motion stereo in the second embodiment; a diagram illustrating the functional configuration in the third embodiment; a diagram illustrating the functional configuration in the fourth embodiment; a flowchart showing the procedure of correction processing of visual information using the measurement results of the three-dimensional measurement device in the fourth embodiment; a flowchart showing the processing procedure of object detection and position and orientation calculation in the fifth embodiment; a flowchart showing the processing procedure of semantic area division of visual information in the sixth embodiment; a diagram showing an example of a GUI presenting display information; a diagram illustrating the functional configuration in the eighth embodiment; a flowchart showing the flow of processing of the apparatus of the eighth embodiment; and a diagram showing an example of a GUI presenting display information.
 以下、図面を参照しながら実施形態を説明する。なお、以下の実施形態において示す構成は一例に過ぎず、本発明は図示された構成に限定されるものではない。 Hereinafter, embodiments will be described with reference to the drawings. In addition, the structure shown in the following embodiment is only an example, and this invention is not limited to the illustrated structure.
 [実施形態1]
 本実施形態では、搬送車(AGV(Automated Guided Vehicle))または、自律移動ロボット(AMR(Autonomous Mobile Robot))等と称する移動体の移動制御について説明する。以下、移動体としてAGVを例に説明するが、移動体はAMRであっても良い。
Embodiment 1
In this embodiment, movement control of a mobile unit referred to as a guided vehicle (AGV (Automated Guided Vehicle)) or an autonomous mobile robot (AMR (Autonomous Mobile Robot)) will be described. Hereinafter, although AGV is demonstrated to an example as a mobile, a mobile may be AMR.
 図1に、本実施形態におけるシステム構成図を示す。本実施形態における情報処理システム1は、複数の移動体12(12-1、12-2、・・・)、工程管理システム14、移動体管理システム13から構成される。情報処理システム1は、物流システムや生産システムなどである。 FIG. 1 shows a system configuration diagram in the present embodiment. The information processing system 1 in the present embodiment includes a plurality of mobile units 12 (12-1, 12-2,...), A process management system 14 and a mobile unit management system 13. The information processing system 1 is a distribution system, a production system, and the like.
 複数の移動体12(12-1、12-2、・・・)は、工程管理システムで決められた工程のスケジュールに合わせて物体を搬送する搬送車(AGV(Automated Guided Vehicle))である。移動体は環境内で複数台が移動(走行)している。 The plurality of mobile bodies 12 (12-1, 12-2,...) Are transportation vehicles (AGV (Automated Guided Vehicle)) that transport objects in accordance with the schedule of processes determined by the process management system. A plurality of mobile units move (run) within the environment.
 工程管理システム14は、情報処理システムが実行する工程を管理する。例えば、工場や物流倉庫内の工程を管理するMES(Manufacturing Execution System)である。移動体管理システム3と通信を行っている。 The process management system 14 manages the process performed by the information processing system. For example, it is MES (Manufacturing Execution System) which manages the process in a factory or a distribution warehouse. It communicates with the mobile management system 3.
 移動体管理システム13は、移動体を管理するシステムである。工程管理システム12と通信を行っている。また、移動体とも通信(例えば、Wi-Fi通信)を行い、運行情報を双方向に送受信している。 The mobile management system 13 is a system that manages mobiles. It communicates with the process control system 12. In addition, communication (for example, Wi-Fi communication) is also performed with mobiles, and operation information is bidirectionally transmitted and received.
 図2は、本実施形態における情報処理装置10を備える移動体12のハードウェア構成例を示す図である。情報処理装置10は、入力部1110、算出部1120、保持部1130、制御部1140から構成されている。入力部1110は、移動体12に搭載された撮像部110と接続されている。制御部1140は、アクチュエータ120と接続されている。また、これらに加え、不図示の通信装置が移動体管理システム3と情報を双方向に通信を行っており、情報処理装置10の各種手段に入出力している。但し、図2は、機器構成の一例である。 FIG. 2 is a diagram showing an example of a hardware configuration of the mobile unit 12 including the information processing apparatus 10 in the present embodiment. The information processing apparatus 10 includes an input unit 1110, a calculation unit 1120, a holding unit 1130, and a control unit 1140. The input unit 1110 is connected to the imaging unit 110 mounted on the moving body 12. The controller 1140 is connected to the actuator 120. In addition to these, a communication device (not shown) communicates information with the mobile management system 3 in a bi-directional manner, and inputs / outputs to / from various means of the information processing apparatus 10. However, FIG. 2 is an example of a device configuration.
 FIG. 3 is a diagram for explaining the imaging element D150 provided in the imaging unit 110. In the present embodiment, the imaging unit 110 includes the imaging element D150 inside. As shown in FIG. 3A, a large number of light receiving units D151 are arranged in a lattice inside the imaging element D150; FIG. 3A shows four of them. A microlens D153 is provided on the upper surface of each light receiving unit D151 so that light can be collected efficiently. A conventional imaging element has one light receiving element per light receiving unit D151, but in the imaging element D150 of the imaging unit 110 in the present embodiment, each light receiving unit D151 includes a plurality of light receiving elements D152 inside.
 図3Bは、1つの受光部D151に着目し、側面から見た様子を示すものである。図3Bに示すように、1つの受光部D151の内部に2つの受光素子D152aおよび152bが備えられている。個々の受光素子D152a、D152bは互いに独立しており、受光素子D152aに蓄積された電荷が受光素子D152bに移動することはなく、また逆に受光素子D152bに蓄積された電荷が受光素子D152aに移動することはない。そのため、図3Bにおいて、受光素子D152aはマイクロレンズD153の右側から入射する光束を受光することになる。また逆に、受光素子D152bはマイクロレンズD153の左側から入射する光束を受光することになる。 FIG. 3B shows one light receiving unit D 151 as viewed from the side. As shown in FIG. 3B, two light receiving elements D 152 a and 152 b are provided in one light receiving unit D 151. The individual light receiving elements D152a and D152b are independent of each other, and the charge accumulated in the light receiving element D152a does not move to the light receiving element D152b, and conversely, the charge accumulated in the light receiving element D152b moves to the light receiving element D152a There is nothing to do. Therefore, in FIG. 3B, the light receiving element D 152 a receives the light flux incident from the right side of the microlens D 153. On the other hand, the light receiving element D 152 b receives the light flux incident from the left side of the microlens D 153.
 The imaging unit 110 can generate the image D154a by selecting only the charge accumulated in the light receiving elements D152a. At the same time, the imaging unit 110 can generate the image D154b by selecting only the charge accumulated in the light receiving elements D152b. Since the image D154a is generated only from the light flux entering from the right side of the microlens D153 and the image D154b only from the light flux entering from the left side, the images D154a and D154b are, as shown in FIG. 4, images captured from mutually different viewpoints.
 また、撮像部110が各受光部D151から、受光素子D152a、D152bの両方に蓄積されている電荷を用いて画像を形成する。従来の撮像素子を用いた場合と同じようにある視点から撮影した画像である画像D154e(不図示)が得られることになる。撮像部110は、以上説明した原理によって、撮影視点の異なる画像D154a、D154bと、従来の画像154eを同時に撮像することができる。 Further, the imaging unit 110 forms an image from each light receiving unit D 151 using the charges accumulated in both of the light receiving elements D 152 a and D 152 b. As in the case of using a conventional imaging device, an image D154e (not shown) which is an image captured from a certain viewpoint is obtained. The imaging unit 110 can simultaneously capture the images D154a and D154b having different shooting viewpoints and the conventional image 154e according to the principle described above.
 なお、各受光部D151は、より多くの受光素子D152を備えてもよく、任意の数の受光素子D152を設定することができる。例えば、図3Cは、受光部D151の内部に4つの受光素子D152a~D152dを設けた例を示している。 Note that each light receiving unit D151 may include more light receiving elements D152, and an arbitrary number of light receiving elements D152 can be set. For example, FIG. 3C shows an example in which four light receiving elements D152a to D152d are provided inside the light receiving part D151.
 The imaging unit 110 can perform a corresponding point search on the pair of images D154a and D154b to calculate a parallax image D154f (not illustrated), and can further calculate the three-dimensional shape of the target by the stereo method based on that parallax image. Corresponding point search and the stereo method are known techniques, and various methods can be applied. For the corresponding point search, for example, a template matching method that searches for a similar template using the several pixels around each image pixel as a template can be used, or a method that extracts edge and corner feature points from the gradient of the image brightness and searches for points with similar features. In the stereo method, the relationship between the coordinate systems of the two images is derived, a projective transformation matrix is obtained, and the three-dimensional shape is calculated. In addition to the image D154e, the imaging unit 110 has a function of outputting the image D154a, the image D154b, the parallax image D154f, the depth map D154d obtained by the stereo method, and the three-dimensional point group D154c.
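As a rough sketch of the correspondence search and stereo computation described above: a disparity found by template matching along the image row can be converted to depth with the standard pinhole relation Z = f·B/d. The block-matching details and parameter names below are assumptions, not the imaging unit's actual implementation.

```python
import numpy as np

def block_match_disparity(left, right, u, v, window=3, max_disp=64):
    """Find the disparity of pixel (u, v) by template matching along the same
    image row, using the sum of absolute differences. Assumes (u, v) is far
    enough from the image border for the window to fit."""
    template = left[v - window:v + window + 1, u - window:u + window + 1].astype(float)
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp):
        if u - d - window < 0:
            break
        candidate = right[v - window:v + window + 1,
                          u - d - window:u - d + window + 1].astype(float)
        if candidate.shape != template.shape:
            break
        cost = np.abs(template - candidate).sum()
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

def depth_from_disparity(disparity, focal_length_px, baseline):
    """Standard pinhole relation Z = f * B / d (undefined for d == 0)."""
    return focal_length_px * baseline / disparity if disparity > 0 else float("inf")
```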
 The depth map referred to here is an image that holds, for each pixel constituting the image 154c, a value correlated with the distance (depth) to the measurement target. Usually, the value correlated with the distance is an integer value that can be stored as an ordinary image, and it can be converted into the physical distance to the target (for example, in millimetres) by multiplying it by a predetermined coefficient determined from the focal length. This focal length is included in the unique information of the imaging unit 110 as described above.
 The three-dimensional point group D154c will also be described. It is a set of coordinates in which the physical distances to the measurement target converted from the depth map D154d as described above are expressed as values along each axis (X, Y, Z) of an orthogonal coordinate system set separately in three-dimensional space, whose origin is the optical centre of the imaging unit.
 Since the imaging unit 110 can obtain the pair of images D154a and D154b with different viewpoints from a single imaging element D150, three-dimensional measurement can be realized with a more compact configuration than the conventional stereo method, which requires two or more imaging units.
 撮像部D110は、さらに光学系の焦点距離を制御するオートフォーカス機構および画角を制御するズーム機構を備える。オートフォーカス機構は有効あるいは無効を切り替え可能であり、設定した焦点距離を固定することができる。撮像部D110は、焦点および画角を制御するために設けられた光学系制御モータの回転角あるいは移動量といった駆動量によって規定される制御値を読み取り、不図示のルックアップテーブルを参照して焦点距離を算出し、出力することができる。また撮像部D110は、装着されたレンズから、焦点距離範囲、口径、ディストーションの係数、光学中心などのレンズの固有情報を読み取ることができる。読み取った固有情報を、後述する視差画像D154f及びデプスマップD154dのレンズ歪みの補正や、三次元点群D154cの算出に用いる。 The imaging unit D110 further includes an autofocus mechanism that controls the focal length of the optical system and a zoom mechanism that controls the angle of view. The auto focus mechanism can be switched on or off, and the set focal length can be fixed. The imaging unit D110 reads a control value defined by a drive amount such as a rotation angle or movement amount of an optical system control motor provided to control a focus and an angle of view, and refers to a lookup table (not shown). The distance can be calculated and output. Further, the imaging unit D110 can read, from the mounted lens, unique information of the lens such as a focal length range, an aperture, a distortion coefficient, and an optical center. The read inherent information is used for correction of lens distortion of a parallax image D 154 f and a depth map D 154 d described later, and calculation of a three-dimensional point group D 154 c.
 The imaging unit 110 has a function of correcting the lens distortion of the images D154a and D154b, the parallax image D154f, and the depth map D154d, and of outputting the image coordinates of the principal point position (hereinafter referred to as the image centre) and the baseline length between the images D154a and D154b. It also has a function of outputting the generated images 154a to 154c, optical system data such as the focal length and the image centre, and three-dimensional measurement data such as the parallax image D154f, the baseline length, the depth map D154d, and the three-dimensional point group D154c. In the present embodiment, these data are collectively referred to as image information (hereinafter also referred to as "visual information"). The imaging unit 110 selectively outputs all or part of the image information in accordance with parameters set in a storage area (not shown) inside the imaging unit 110 or with instructions given from outside the imaging unit 110.
 本実施形態における移動制御とは、移動体が備えるアクチュエータであるモータ、および車輪の向きを変更するステアリングを制御することである。これらを制御することで、移動体を所定の目的地まで移動させる。また、制御値とは移動体を制御するための指令値のことである。 The movement control in the present embodiment is to control a motor that is an actuator included in the moving body and a steering that changes the direction of the wheel. By controlling these, the mobile unit is moved to a predetermined destination. Also, the control value is a command value for controlling the moving body.
 The position and orientation of the imaging unit in the present embodiment are the six parameters consisting of three parameters representing the position of the imaging unit 110 in an arbitrary world coordinate system defined in the real space and three parameters representing the orientation of the imaging unit 110. The mounting position of the imaging device relative to the centre of gravity of the moving body is measured at the design stage of the moving body such as the AGV, and a matrix representing this mounting position and orientation is stored in the external memory H14. The position of the centre of gravity of the AGV can be calculated by multiplying the position and orientation of the imaging unit by the matrix representing the mounting position and orientation. For this reason, in the present embodiment the position and orientation of the imaging unit are treated as synonymous with the position and orientation of the AGV. A three-dimensional coordinate system defined on the imaging unit, with the optical axis of the imaging unit 110 as the Z axis, the horizontal direction of the image as the X axis, and the vertical direction as the Y axis, is called the imaging unit coordinate system.
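Since the paragraph above describes obtaining the AGV pose by multiplying the imaging unit's pose by the stored mounting transform, here is a minimal sketch using 4x4 homogeneous matrices; the multiplication order and matrix convention are assumptions made for illustration.

```python
import numpy as np

def pose_matrix(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# world_T_camera: imaging-unit pose in the world frame (from the calculation unit)
# camera_T_agv:   mounting transform measured at design time and stored in memory
def agv_pose_in_world(world_T_camera, camera_T_agv):
    """AGV (centre-of-gravity) pose = camera pose composed with the mounting transform."""
    return world_T_camera @ camera_T_agv
```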
 The input unit 1110 receives, as the image information (visual information) acquired by the imaging unit 110, a depth map in which a depth value is stored for each pixel of an image of the scene, input in time series (for example, 60 frames per second), and outputs it to the calculation unit 1120. The depth value is the distance between the imaging unit 110 and an object in the scene.
 算出部1120は、入力部1110が入力したデプスマップ、保持部1130が保持する位置姿勢算出の指標となるマップ情報を用いて撮像部の位置姿勢を算出し、取得する。なお、マップ情報については後述する。算出部1120はさらに、算出した位置姿勢を制御部1140に出力する。尚、算出部では、位置姿勢を出力するために必要な情報を入力部から取得して、保持部1130で保持しているマップ情報と比較するだけでも良い。 The calculation unit 1120 calculates and acquires the position and orientation of the imaging unit using the depth map input by the input unit 1110 and map information serving as an index of position and orientation calculation held by the holding unit 1130. The map information will be described later. The calculation unit 1120 further outputs the calculated position and orientation to the control unit 1140. The calculation unit may obtain information necessary for outputting the position and orientation from the input unit and may simply compare the information with the map information held by the holding unit 1130.
 保持部1130は、マップ情報としてポイントクラウドを保持する。ポイントクラウドとはシーンの三次元点群データのことである。本実施形態では、ポイントクラウドは任意の世界座標系における三次元座標(X,Y,Z)の三値を格納したデータリストとして保持部1130が保持しているものとする。三次元点群データは、三次元位置情報を示している。また、これらに加え、AGVの目的地である三次元座標と姿勢を表す目的位置姿勢を保持する。目標位置姿勢は1つでも複数あってもよいが、ここでは簡単のため目標位置姿勢が1地点である例を説明する。また、保持部1130はマップ情報を必要に応じて算出部1120に出力する。さらに、目標位置姿勢を制御部1140に出力する。 The holding unit 1130 holds a point cloud as map information. The point cloud is three-dimensional point cloud data of a scene. In this embodiment, the point cloud is held by the holding unit 1130 as a data list storing three values of three-dimensional coordinates (X, Y, Z) in an arbitrary world coordinate system. Three-dimensional point cloud data indicates three-dimensional position information. Also, in addition to these, the three-dimensional coordinates that are the destination of the AGV and the target position and posture representing the posture are held. The target position and orientation may be one or more, but for the sake of simplicity, an example in which the target position and orientation is one point will be described. Also, the holding unit 1130 outputs map information to the calculation unit 1120 as needed. Furthermore, the target position and orientation are output to the control unit 1140.
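A minimal sketch of the data the holding unit is described as keeping (a point cloud stored as a list of (X, Y, Z) values plus a target position and orientation); the class and field names are illustrative only, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MapInformation:
    # Point cloud: list of (X, Y, Z) coordinates in an arbitrary world coordinate system
    points: List[Tuple[float, float, float]] = field(default_factory=list)
    # Destination of the AGV: target position (x, y, z) and orientation (e.g. roll, pitch, yaw)
    target_position: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    target_orientation: Tuple[float, float, float] = (0.0, 0.0, 0.0)
```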
 The control unit 1140 calculates a control value for controlling the AGV based on the position and orientation of the imaging unit 110 calculated by the calculation unit 1120, the map information held by the holding unit 1130, and the operation information input by the communication device (not shown). The calculated control value is output to the actuator 120.
 図6は、情報処理装置1のハードウェア構成を示す図である。H11はCPUであり、システムバスH21に接続された各種デバイスの制御を行う。H12はROMであり、BIOSのプログラムやブートプログラムを記憶する。H13はRAMであり、CPUであるH11の主記憶装置として使用される。H14は外部メモリであり、情報処理装置1が処理するプログラムを格納する。入力部H15はキーボードやマウス、ロボットコントローラーであり、情報等の入力に係る処理を行う。表示部H16はH11からの指示に従って情報処理装置1の演算結果を表示装置に出力する。なお、表示装置は液晶表示装置やプロジェクタ、LEDインジケーターなど、種類は問わない。また、情報処理装置が備える表示部H16が表示装置としての役割であってもよい。H17は通信インターフェイスであり、ネットワークを介して情報通信を行うものであり、通信インターフェイスはイーサネット(登録商標)でもよく、USBやシリアル通信、無線通信等種類は問わない。なお、前述した移動体管理システム13とは通信インターフェイスH17を介して情報のやり取りを行う。H18はI/Oであり、撮像装置H19から画像情報(視覚情報)を入力する。なお、撮像装置H19とは前述した撮像部110のことである。H20は前述したアクチュエータ120のことである。 FIG. 6 is a diagram showing a hardware configuration of the information processing apparatus 1. A CPU H11 controls various devices connected to the system bus H21. H12 is a ROM, which stores a BIOS program and a boot program. H13 is a RAM, which is used as a main storage device of the CPU H11. An external memory H14 stores a program processed by the information processing apparatus 1. The input unit H15 is a keyboard, a mouse, or a robot controller, and performs processing related to input of information and the like. The display unit H16 outputs the calculation result of the information processing device 1 to the display device according to the instruction from H11. The display device may be of any type such as a liquid crystal display device, a projector, or an LED indicator. Further, the display unit H16 included in the information processing apparatus may play a role as a display device. A communication interface H17 performs information communication via a network. The communication interface may be Ethernet (registered trademark), and may be of any type such as USB, serial communication, or wireless communication. Information is exchanged with the mobile object management system 13 described above via the communication interface H17. H18 is I / O, and inputs image information (visual information) from the imaging device H19. The imaging device H19 is the imaging unit 110 described above. H20 is the actuator 120 described above.
 次に、本実施形態における処理手順について説明する。図5は、本実施形態における情報処理装置10の処理手順を示すフローチャートである。以下、フローチャートは、CPUが制御プログラムを実行することにより実現されるものとする。処理ステップは、初期化S110、視覚情報取得S120、視覚情報入力S130、位置姿勢算出S140、制御値算出S150、AGVの制御S160、システム終了判定S170から構成されている。 Next, the processing procedure in the present embodiment will be described. FIG. 5 is a flowchart showing the processing procedure of the information processing apparatus 10 in the present embodiment. Hereinafter, the flowchart is realized by the CPU executing the control program. The processing steps include initialization S110, visual information acquisition S120, visual information input S130, position and orientation calculation S140, control value calculation S150, control of AGV S160, and system termination determination S170.
 ステップS110では、システムの初期化を行う。すなわち、外部メモリH14からプログラムを読み込み、情報処理装置10を動作可能な状態にする。また、情報処理装置10に接続された各機器のパラメータ(撮像部110の内部パラメータや焦点距離)や、撮像部110の初期位置姿勢を前時刻位置姿勢としてRAMであるH13に読み込む。また、AGVの各デバイスを起動し、動作・制御可能な状態とする。これらに加え、通信I/F(H17)を通して移動体管理システムから運行情報を受信し、AGVが向かうべき目的地の三次元座標を受信し、保持部1130に保持する。 In step S110, the system is initialized. That is, the program is read from the external memory H14, and the information processing apparatus 10 is made operable. In addition, the parameters (internal parameters and focal length of the imaging unit 110) of each device connected to the information processing apparatus 10, and the initial position and orientation of the imaging unit 110 are read as a previous time position and orientation in H13 which is a RAM. In addition, each device of AGV is started, and it is put in the state where it can operate and control. In addition to these, the operation information is received from the mobile management system through the communication I / F (H17), the three-dimensional coordinates of the destination to which the AGV should head is received, and held in the holding unit 1130.
 ステップS120では、撮像部110が視覚情報を取得し、入力部1110に入力する。本実施形態において視覚情報とはデプスマップのことであり、前述の方法で撮像部110がデプスマップを取得してあるものとする。つまり、デプスマップとは図4におけるD154dのことである。 In step S120, the imaging unit 110 acquires visual information and inputs the visual information to the input unit 1110. In the present embodiment, visual information is a depth map, and it is assumed that the imaging unit 110 has acquired the depth map by the method described above. That is, the depth map is D154 d in FIG.
 ステップS130では、入力部1110が、撮像部110が取得したデプスマップを取得する。なお、本実施形態においては、デプスマップは各画素の奥行き値を格納した二次元配列リストのことである。 In step S130, the input unit 1110 acquires the depth map acquired by the imaging unit 110. In the present embodiment, the depth map is a two-dimensional array list storing the depth value of each pixel.
 In step S140, the calculation unit 1120 calculates the position and orientation of the imaging unit 110 using the depth map input by the input unit 1110 and the map information held by the holding unit 1130. Specifically, a three-dimensional point group defined in the imaging coordinate system is first calculated from the depth map. The three-dimensional point (X_t, Y_t, Z_t) is calculated by Equation 1 from the image coordinates (u_t, v_t), the internal parameters (f_x, f_y, c_x, c_y) of the imaging unit 110, and the depth value D of the corresponding pixel of the depth map.
[Equation 1]
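The equation image itself is not reproduced in this text. The standard pinhole back-projection consistent with the variables named above would read as follows; this is a reconstruction under that assumption, not a verbatim copy of Equation 1.

```latex
X_t = \frac{(u_t - c_x)\,D}{f_x}, \qquad
Y_t = \frac{(v_t - c_y)\,D}{f_y}, \qquad
Z_t = D
```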
 次に、撮像部110の前時刻位置姿勢を用いて三次元点群を前時刻位置姿勢座標系に座標変換する。つまり三次元点群に前時刻位置姿勢の行列を掛け合わせる。算出した三次元点群と保持部1130が保持するマップ情報のポイントクラウドの各三次元点の最近傍の点同士の距離の和が小さくなるように位置姿勢を算出する。具体的には、ICP(Iterative Closest Point)アルゴリズムを用いて前時刻位置姿勢に対する撮像部110の位置姿勢を算出する。最後に、世界座標系に変換して、世界座標系における位置姿勢を制御部1140に出力する。なお、算出した位置姿勢はRAMであるH13に前時刻位置姿勢として上書きして保持する。 Next, using the previous time position and orientation of the imaging unit 110, coordinate conversion of the three-dimensional point group is performed to the previous time position and orientation coordinate system. That is, the three-dimensional point group is multiplied by the matrix of the previous time position and orientation. The position and orientation are calculated such that the sum of the distances between the nearest three-dimensional points of the calculated three-dimensional point group and the point cloud of the map information held by the holding unit 1130 is reduced. Specifically, the position and orientation of the imaging unit 110 with respect to the previous time position and orientation are calculated using an ICP (Iterative Closest Point) algorithm. Finally, it is converted into the world coordinate system, and the position and orientation in the world coordinate system are output to the control unit 1140. The calculated position and orientation are stored over the H13, which is the RAM, as the previous time position and orientation.
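A highly simplified sketch of the alignment in step S140: the depth-map points (already moved into the previous-pose frame) are aligned to the map point cloud by iterating nearest-neighbour matching and a least-squares rigid fit. This is a textbook point-to-point ICP written with NumPy/SciPy, not the patented implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_fit_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src points onto dst points."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # avoid reflections
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(source, target, iterations=20):
    """Align 'source' (points from the current depth map) to 'target'
    (the map point cloud). Returns a 4x4 incremental pose update."""
    src = np.asarray(source, dtype=float).copy()
    dst = np.asarray(target, dtype=float)
    tree = cKDTree(dst)
    T_total = np.eye(4)
    for _ in range(iterations):
        _, idx = tree.query(src)          # nearest map point for each source point
        R, t = best_fit_transform(src, dst[idx])
        src = src @ R.T + t
        T_step = np.eye(4)
        T_step[:3, :3], T_step[:3, 3] = R, t
        T_total = T_step @ T_total
    return T_total
```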
 ステップS150では、制御部1140が、AGVを制御するための制御値を算出する。具体的には、保持部1130が保持する目的地座標と算出部1120が算出した撮像部110の位置姿勢とのユークリッド距離が小さくなるように制御値を算出する。制御部1140が算出した制御値をアクチュエータ120に出力する。 In step S150, the control unit 1140 calculates a control value for controlling the AGV. Specifically, the control value is calculated such that the Euclidean distance between the destination coordinates held by the holding unit 1130 and the position and orientation of the imaging unit 110 calculated by the calculation unit 1120 is reduced. The control value calculated by the control unit 1140 is output to the actuator 120.
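Step S150 only requires that the control value shrink the Euclidean distance to the destination; one hypothetical realisation for a differential-drive AGV is sketched below. The gains, speed limit, and command format are assumptions, not values taken from the disclosure.

```python
import math

def compute_control(pose_xy_theta, goal_xy, k_v=0.5, k_w=1.5, v_limit=1.0):
    """Return (linear_velocity, angular_velocity) commands that reduce the
    Euclidean distance to the goal. pose_xy_theta = (x, y, heading)."""
    x, y, theta = pose_xy_theta
    dx, dy = goal_xy[0] - x, goal_xy[1] - y
    distance = math.hypot(dx, dy)
    heading_error = math.atan2(dy, dx) - theta
    heading_error = math.atan2(math.sin(heading_error), math.cos(heading_error))  # wrap to [-pi, pi]
    v = min(k_v * distance, v_limit)
    w = k_w * heading_error
    return v, w
```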
 ステップS160では、アクチュエータ120が、制御部1140が算出した制御値を用いてAGVを制御する。 In step S160, the actuator 120 controls the AGV using the control value calculated by the control unit 1140.
 ステップS170では、システムを終了するか否か判定する。具体的には、保持部1130が保持する目的地座標と算出部1120が算出した撮像部110の位置姿勢とのユークリッド距離が所定の閾値以下であれば、目的地に到着したとして終了する。そうでなければステップS120に戻り処理を続ける。 In step S170, it is determined whether to end the system. Specifically, if the Euclidean distance between the destination coordinates held by the holding unit 1130 and the position and orientation of the imaging unit 110 calculated by the calculation unit 1120 is equal to or less than a predetermined threshold, the processing ends as having arrived at the destination. If not, the process returns to step S120 and continues processing.
 In the first embodiment, three-dimensional points obtained from the depth map acquired by the imaging unit, in which each light receiving unit on the imaging element is composed of two or more light receiving elements, are used together with the three-dimensional points of the point cloud serving as the map information. The position and orientation of the imaging unit are calculated so that the distances between these three-dimensional points are minimized. By automatically controlling the AGV so that the distance between the calculated position and orientation of the imaging unit and the destination of the AGV is minimized, the AGV can be operated stably and with less effort.
 <変形例>
 実施形態1では、撮像部110がデプスマップD154dを算出し、本情報処理装置における入力部1110がデプスマップを入力していた。変形例として、撮像部110の位置姿勢を算出できれば、入力部1110が入力するのは撮像部110が算出したデプスマップに限らない。具体的には、撮像部110が内部で撮像部110座標系におけるポイントクラウドを算出していれば、入力部1110が、撮像部110が算出したポイントクラウドを入力することができる。このとき、算出部1120は、入力部1110が入力したポイントクラウドを用いて位置姿勢算出を行うことができる。なお、撮像部110が算出するポイントクラウドとは、図4における三次元点群D154のことである。また、入力部1110が、撮像部110が取得した一対の画像D154a、D154b、および撮像部110が保持する焦点距離を入力し、算出部1120が対応点探索およびステレオ法によってデプスマップを求めてもよい。また、それらに加えて入力部1110が、撮像部110が取得したRGB画像やグレー画像である画像を合わせて視覚情報として入力してもよい。つまり、撮像部110が行うデプスマップ算出を、かわりに算出部1120が行うこともできる。
<Modification>
In the first embodiment, the imaging unit 110 calculates the depth map D154d, and the input unit 1110 of the information processing apparatus receives that depth map. As a modification, as long as the position and orientation of the imaging unit 110 can be calculated, what the input unit 1110 receives is not limited to the depth map calculated by the imaging unit 110. Specifically, if the imaging unit 110 internally calculates a point cloud in the imaging unit 110 coordinate system, the input unit 1110 can receive the point cloud calculated by the imaging unit 110, and the calculation unit 1120 can then calculate the position and orientation using that point cloud. The point cloud calculated by the imaging unit 110 is the three-dimensional point group D154c in FIG. 4. Alternatively, the input unit 1110 may receive the pair of images D154a and D154b acquired by the imaging unit 110 and the focal length held by the imaging unit 110, and the calculation unit 1120 may obtain the depth map by corresponding point search and the stereo method. In addition, the input unit 1110 may also receive an RGB image or a gray image acquired by the imaging unit 110 as visual information. In other words, the depth map calculation performed by the imaging unit 110 may instead be performed by the calculation unit 1120.
 撮像部110は、さらに光学系の焦点距離を制御するフォーカス制御機構を備えることができ、このフォーカス制御を本情報処理装置が制御してもよい。例えば、本情報処理装置の制御部1140がフォーカスを調整する制御値(フォーカス値)を算出してもよい。例えば、移動体が移動し視覚画像の見えが変わった時に、デプスマップの平均値や中央値の奥行きに合わせて撮像部110のフォーカスを調整する制御値を算出する。また、本情報処理装置がフォーカスを調整するのではなく、撮像部110内部に構成されたオートフォーカス機構が調節することもできる。フォーカスを調整することでよりピントの合った視覚情報を取得できるため高精度に位置姿勢を算出することができる。なお、撮像部110はフォーカス制御機能が無い構成(フォーカス固定)であってもよい。この場合には撮像部110はフォーカス制御機構を搭載しなくて済むため小型化できる。 The imaging unit 110 may further include a focus control mechanism that controls the focal length of the optical system, and the information processing apparatus may control the focus control. For example, the control unit 1140 of the information processing apparatus may calculate a control value (focus value) for adjusting the focus. For example, when the moving object moves and the appearance of the visual image changes, a control value for adjusting the focus of the imaging unit 110 in accordance with the depth of the average value or the median value of the depth map is calculated. In addition, the present information processing apparatus can adjust the autofocus mechanism formed inside the imaging unit 110 instead of adjusting the focus. By adjusting the focus, it is possible to obtain more focused visual information, and it is possible to calculate the position and orientation with high accuracy. The imaging unit 110 may have a configuration (focus fixed) without the focus control function. In this case, the imaging unit 110 can be downsized because it is not necessary to mount the focus control mechanism.
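For the focus adjustment described above, one possible (hypothetical) realisation is to drive the focus toward the median of the current depth map, as the paragraph suggests; the clamping range and fallback behaviour are assumptions.

```python
import numpy as np

def focus_control_value(depth_map, min_focus=0.3, max_focus=10.0):
    """Pick a focus distance equal to the median of the valid depth values,
    clamped to the lens range. A deliberately simple stand-in for the
    focus-value calculation mentioned in the text."""
    depths = np.asarray(depth_map, dtype=float)
    valid = depths[np.isfinite(depths) & (depths > 0)]
    if valid.size == 0:
        return None   # keep the current focus if no valid depth is available
    return float(np.clip(np.median(valid), min_focus, max_focus))
```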
 撮像部110は、さらに光学系のズームを制御するズーム制御機構を備えることができ、このズーム制御を本情報処理装置が行ってもよい。具体的には、移動体が高速に移動する場合には、制御部1140がズームを広角にして広い視野の視覚情報合を取得するようにズーム値を調整する制御値(調整値)を算出する。また、移動体を高精度に制御したく、撮像部110の位置姿勢を高精度に算出したい場合には、ズームを狭角にして狭い視野の視覚情報合を高解像度で取得するようにズーム値を調整する制御値(調整値)を算出する。このように必要に応じてズーム値を変えることで、安定して、高精度に撮像部110の位置姿勢を算出することができる。このため、安定して、高精度に移動体を制御することができる。 The imaging unit 110 may further include a zoom control mechanism that controls the zoom of the optical system, and the information processing apparatus may perform this zoom control. Specifically, when the moving object moves at high speed, the control unit 1140 calculates a control value (adjustment value) for adjusting the zoom value so that the zoom is wide angle and the visual information combination of the wide field of view is acquired. . In addition, when it is desired to control the moving object with high accuracy and to calculate the position and orientation of the imaging unit 110 with high accuracy, the zoom value is set so that the zoom is narrow and the visual information combination of the narrow field of view is acquired with high resolution. The control value (adjustment value) to adjust is calculated. Thus, by changing the zoom value as necessary, the position and orientation of the imaging unit 110 can be stably calculated with high accuracy. Therefore, it is possible to control the moving body stably and with high accuracy.
 本実施形態においては、撮像部110はピンホールカメラモデルに当てはまる光学系を想定して説明したが、撮像部110の位置姿勢、移動体の制御を行うための視覚情報を取得することのできる光学系であればどのような光学装置(レンズ)を用いてもよい。具体的には全天周レンズや魚眼レンズでもよいし双曲面ミラーでもよい。マクロレンズを用いてもよい。例えば、全天周レンズや魚眼レンズを用いると広大な視野の奥行き値を取得でき、位置姿勢推定のロバスト性が向上する。マクロレンズを用いると詳細な位置姿勢を算出することができる。このように、使用するシーンに合わせてユーザがレンズを自由に変更(交換など)することができ、安定して、高精度に撮像部110の位置姿勢を算出することができる。また、安定して、高精度に移動体を制御することができる。 In the present embodiment, although the imaging unit 110 has been described on the assumption that the optical system is applicable to the pinhole camera model, the optical system can acquire the position and orientation of the imaging unit 110 and visual information for controlling the moving object. Any optical device (lens) may be used as long as it is a system. Specifically, it may be a full sky lens, a fisheye lens, or a hyperboloid mirror. A macro lens may be used. For example, if an all-sky lens or a fisheye lens is used, it is possible to acquire a depth value of a wide field of view, and the robustness of position and orientation estimation is improved. A detailed position and orientation can be calculated by using a macro lens. As described above, the user can freely change (exchange and the like) the lens in accordance with the scene to be used, and the position and orientation of the imaging unit 110 can be stably calculated with high accuracy. In addition, the moving body can be controlled stably and with high accuracy.
 When the zoom value or the focal length is changed in this way, the imaging unit 110 reads a control value defined by the drive amount, such as the rotation angle or movement amount, of the optical system control motor provided for controlling the focus and the angle of view, and calculates the focal length by referring to a lookup table (not shown). When the lens is changed, the imaging unit 110 reads the focal length value recorded in the lens through the electronic contacts provided on the lens. A person can also input the focal length to the imaging unit 110 using a UI (not shown). The imaging unit 110 calculates the depth map using the focal length value acquired in this way, and the input unit 1110 of the information processing apparatus receives the focal length value from the imaging unit 110 together with the visual information. The calculation unit 1120 calculates the position and orientation using the depth map and the focal length value input by the input unit 1110. The imaging unit 110 can also use the calculated focal length to compute a point cloud in the imaging unit 110 coordinate system; in that case the input unit 1110 receives the point cloud calculated by the imaging unit 110, and the calculation unit 1120 calculates the position and orientation using that point cloud.
In the present embodiment the map information is a point cloud. However, any information that serves as an index for calculating the position and orientation of the imaging unit 110 may be used. Specifically, it may be a colored point cloud in which the three color values are added to each point. Alternatively, a depth map may be associated with a position and orientation to form a keyframe, and a plurality of keyframes may be held; in that case the position and orientation are calculated so as to minimize the distance between a keyframe's depth map and the depth map acquired by the imaging unit 110. Furthermore, if the input unit 1110 also inputs an image, the calculation unit 1120 may hold the input image in association with the keyframe. A 2D map that associates areas the AGV can pass through with impassable places such as walls may also be held; how the 2D map is used will be described later.
Although the position and orientation calculation in the present embodiment has been described using the ICP algorithm, any method that can calculate the position and orientation may be used. For example, instead of using the point clouds described in this embodiment directly, the calculation unit 1120 may compute mesh models from them and calculate the position and orientation so that the distance between corresponding faces is minimized. Alternatively, three-dimensional edges corresponding to discontinuities may be extracted from the depth map and the point cloud, and the position and orientation calculated so that the distance between these three-dimensional edges is minimized. If the input unit 1110 also inputs an image, the calculation unit 1120 can additionally use the input image to calculate the position and orientation.
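For reference, a compact point-to-point ICP of the kind referred to above can be sketched as follows in Python with NumPy/SciPy; it is a simplified illustration without outlier rejection or convergence tests, not the embodiment's exact procedure.

import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iterations=20):
    # Aligns source (N,3) to target (M,3); returns a 4x4 rigid transform.
    T = np.eye(4)
    src = source.copy()
    tree = cKDTree(target)
    for _ in range(iterations):
        _, idx = tree.query(src)              # nearest neighbours in the map
        tgt = target[idx]
        mu_s, mu_t = src.mean(0), tgt.mean(0)
        H = (src - mu_s).T @ (tgt - mu_t)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:              # enforce a proper rotation
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        src = src @ R.T + t
        step = np.eye(4); step[:3, :3] = R; step[:3, 3] = t
        T = step @ T
    return T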
If the AGV is equipped with sensors such as an inertial sensor (a gyro or an IMU) or an encoder that measures wheel rotation, the input unit 1110 inputs those sensor values, and the calculation unit 1120 can use them together with the visual information to calculate the position and orientation of the imaging unit 110. Related techniques such as the Kalman filter and visual-inertial SLAM are publicly known and can be applied. Combining the visual information of the imaging unit 110 with sensor information in this way makes it possible to calculate the position and orientation robustly and with high accuracy. An inertial sensor such as a gyro or IMU can also be used to reduce blur in the visual information captured by the imaging unit 110: when vertical movement or rotation is detected, it is regarded as vibration of the AGV and the visual information is warped so as to cancel it. This allows the position and orientation to be calculated with high accuracy without being affected by shaking while the AGV is travelling.
In Embodiment 1 the control unit 1140 simply calculated a control value so that the distance between the target position and orientation and the position and orientation calculated by the calculation unit 1120 becomes small. However, the control unit 1140 may calculate any control value that allows the AGV to reach the destination. Specifically, when a depth value of the input geometric information (the depth map) falls below a predetermined distance, the control unit 1140 may calculate a control value that, for example, turns the AGV to the right. Alternatively, treating parts of the map information held by the holding unit 1130 where the point cloud exists as impassable and empty space as passable, the calculation unit 1120 may generate a route by dynamic programming, and the control unit 1140 may calculate control values that follow this route; in this way the AGV can move along walls and reach the destination while avoiding collisions with them. The calculation unit 1120 may also project the point cloud of the map information onto the ground plane in advance to create a 2D map: points onto which the point cloud is projected are impassable locations such as walls and obstacles, while points with no projection are free space that can be traversed, and a route to the destination can be generated from this information by dynamic programming. Furthermore, the calculation unit 1120 may compute a cost map whose stored values decrease as the destination is approached, and the control unit 1140 may calculate the control value using a deep reinforcement learning model, that is, a neural network trained to output a control value from this cost map. By calculating control values that avoid obstacles such as walls while moving, the AGV can be operated stably and safely.
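The route-generation idea above (project the point cloud to the ground, mark occupied cells as impassable, search for a path) can be illustrated with the following Python sketch; it uses breadth-first search in place of the dynamic programming mentioned in the text, and the cell size and grid extent are hypothetical.

import numpy as np
from collections import deque

def build_2d_grid(points, cell=0.1, size=200):
    # Project map points (N,3, map frame) onto the ground; occupied cells are impassable.
    grid = np.zeros((size, size), dtype=bool)
    ij = np.floor(points[:, :2] / cell).astype(int) + size // 2
    ok = (ij >= 0).all(1) & (ij < size).all(1)
    grid[ij[ok, 0], ij[ok, 1]] = True
    return grid

def plan_route(grid, start, goal):
    # Breadth-first search over free cells; returns a list of (row, col) cells or None.
    prev = {start: None}
    q = deque([start])
    while q:
        cur = q.popleft()
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        for d in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + d[0], cur[1] + d[1])
            if (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]
                    and not grid[nxt] and nxt not in prev):
                prev[nxt] = cur
                q.append(nxt)
    return None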
The holding unit 1130 may also be configured not to hold map information. Specifically, from the visual information acquired by the imaging unit 110 at time t and at the immediately preceding time t'', the calculation unit 1120 calculates the position and orientation at time t relative to time t''. By multiplying together the matrices of the position and orientation changes that the calculation unit 1120 computes at every time step in this way, the position and orientation of the imaging unit 110 can be calculated even without map information. With this configuration the position and orientation can be calculated, and the moving body controlled, even on a computer with limited computational resources.
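A short Python illustration of chaining these per-frame relative poses follows; each element of the input list is assumed to be a 4x4 homogeneous transform from the previous frame to the current one.

import numpy as np

def accumulate_pose(relative_poses):
    # Multiply per-frame relative transforms to obtain the pose of the latest
    # frame with respect to the first frame, without any map information.
    pose = np.eye(4)
    for dT in relative_poses:
        pose = pose @ dT
    return pose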
In Embodiment 1 the holding unit 1130 held map information created in advance. However, a SLAM (Simultaneous Localization and Mapping) configuration may be used, in which map information is created from the visual information acquired by the imaging unit 110 and the position and orientation calculated by the calculation unit 1120 while the position and orientation are being estimated. Many SLAM methods have been proposed and can be applied. For example, a Point-Based Fusion algorithm that integrates, over time, the point clouds acquired by the imaging unit 110 at multiple times can be used. A Kinect Fusion algorithm that integrates the measured boundary between objects and free space as voxel data over time can also be used. In addition, RGB-D SLAM algorithms are known that generate a map while tracking feature points detected in the image using the depth values of the depth sensor, and these can be applied as well. Furthermore, in the present embodiment the maps are not limited to those generated within a single time period; for example, multiple maps may be generated at different times and then merged.
In the present embodiment, the map information is not limited to being generated from data acquired by the imaging unit 110 mounted on the moving body 11. For example, the holding unit 1130 may hold a CAD drawing or a map image of the environment as it is, or after converting its data format. The holding unit 1130 may also hold a map based on a CAD drawing or a map image as an initial map and update it with the SLAM technique described above. The map update times may be recorded, and the control unit 1140 may calculate control values that steer the AGV so as to refresh the map at locations where a predetermined time has elapsed. The map may be updated by overwriting, or the initial map may be kept and the differences stored as update information. The map can also be managed in layers, checked on the display unit H16, or reverted to the initial map; performing these operations while viewing the display screen improves convenience.
In Embodiment 1 the moving body operated based on destination coordinates set by the moving body management system 13. The position and orientation and the control values calculated by this information processing apparatus can also be transmitted to the moving body management system through the communication I/F (H17). By having the moving body management system 13 and the process management system 14 refer to the positions, orientations, and control values calculated from the visual information acquired by the imaging unit 110, processes and moving bodies can be managed more efficiently. If the destination coordinates are always obtained online from the movement management system, the holding unit 1130 need not hold them and may instead receive them as needed via the communication I/F.
In Embodiment 1, the information processing system 1 was configured so that the process management system 14 manages the entire factory process, the moving body management system 13 manages the operation information of the moving bodies according to the management status, and the moving body 12 moves according to that operation information. However, any configuration in which the moving body moves based on the visual information acquired by the imaging unit 110 is acceptable. For example, if two predetermined points are held in advance in the holding unit 1130 and the AGV shuttles between them, the process management system and the moving body management system can be omitted.
In the present embodiment, the moving body 12 is not limited to an automated guided vehicle (AGV). For example, the moving body 12 may be a self-driving car or an autonomous mobile robot, and the movement control described in this embodiment may be applied to them.
 特に、前述の情報処理装置を自動車に搭載すれば、自動運転を実現する自動車としても用いることができる。制御部1140が算出した制御値を用いて自動車を移動させる。この場合には、自動車に搭載されるカーナビゲーションシステムからI/O(H18)を通して目的地座標やマップ情報を取得することができる。 In particular, if the above-described information processing apparatus is mounted on a car, it can also be used as a car that realizes automatic driving. The vehicle is moved using the control value calculated by the control unit 1140. In this case, it is possible to acquire destination coordinates and map information from the car navigation system mounted on a car through the I / O (H18).
The apparatus may also be configured not to control a moving body but to calculate a position and orientation based on the visual information acquired by the imaging unit 110. Specifically, the method of this embodiment can be applied to aligning the real space and a virtual object in a mixed reality system, that is, to measuring the position and orientation of the imaging unit 110 in the real space for use in rendering the virtual object. As an example, consider presenting, on the display of a mobile terminal such as a smartphone or tablet, a 3DCG model aligned with and composited onto the image D154a captured by the imaging unit 110. To realize such an application, the input unit 1110 inputs the image D154a in addition to the depth map D152c acquired by the imaging unit 110. The holding unit 1130 further holds the 3DCG model of the virtual object and the three-dimensional position in the map coordinate system at which the 3DCG model is placed. The calculation unit 1120 composites the 3DCG model onto the image D154a using the position and orientation of the imaging unit 110 calculated as described in Embodiment 1. In this way, a user experiencing mixed reality can hold the mobile terminal and, through its display, stably observe the real space with the virtual object superimposed based on the position and orientation calculated by this information processing apparatus.
[Embodiment 2]
 In Embodiment 1, the position and orientation of the imaging unit were calculated using the depth map acquired by the imaging unit. An imaging unit based on DAF (Dual Pixel Auto Focus) can measure a specific distance range from the imaging unit with particularly high accuracy. In Embodiment 2, therefore, depth values are also computed by motion stereo for regions whose distance from the imaging unit lies outside that specific range; this further improves the accuracy of the depth map acquired by the imaging unit, so that the position and orientation can be calculated stably and with high accuracy.
The configuration of the apparatus in Embodiment 2 is the same as that of FIG. 2, which shows the configuration of the information processing apparatus 10 described in Embodiment 1, and is therefore omitted. It differs from Embodiment 1 in that the input unit 1110 inputs visual information to the holding unit 1130 and the holding unit 1130 holds that visual information, and in that the calculation unit 1120 also uses the visual information held by the holding unit 1130 to correct the depth map and calculate the position and orientation. The holding unit 1130 is also assumed to hold in advance, as characteristic information of the imaging unit 110, a list that associates depth values of the depth map acquired by the imaging unit 110 with their reliability. The reliability of a depth value is obtained by imaging a flat panel placed at a predetermined distance from the imaging unit 110 beforehand, taking the reciprocal of the error between the true distance and the measured distance, and clipping the result to a value between 0 and 1. The reliability is assumed to have been calculated in advance for various distances; points that could not be measured are given a reliability of 0. In the present embodiment, the visual information acquired by the imaging unit 110 and input by the input unit 1110 consists of an image and a depth map.
 実施形態2における処理全体の手順は、実施形態1で説明した情報処理装置10の処理手順を示す図4と同一であるため、説明を省略する。実施形態1とは、位置姿勢算出ステップS140前にデプスマップの補正ステップが追加される点が異なる。図7は、デプスマップ補正ステップにおける処理手順の詳細を示すフローチャートである。 The procedure of the entire processing in the second embodiment is the same as that in FIG. 4 showing the processing procedure of the information processing apparatus 10 described in the first embodiment, and thus the description will be omitted. The second embodiment differs from the first embodiment in that the depth map correction step is added before the position and orientation calculation step S140. FIG. 7 is a flowchart showing details of the processing procedure in the depth map correction step.
 ステップS2110では、算出部1120が、保持部1130から撮像部110の特性情報を読み込む。 In step S2110, the calculation unit 1120 reads the characteristic information of the imaging unit 110 from the holding unit 1130.
In step S2120, the calculation unit 1120 calculates depth values by motion stereo, using the image held by the holding unit 1130 that was acquired at an arbitrary time t' before the time t at which the imaging unit 110 acquired the current visual information, together with the input image. An image acquired at an arbitrary time t' before time t is hereinafter also referred to as a past image, and a depth map acquired at such a time as a past depth map. Motion stereo is a known technique and various methods can be applied. Motion stereo from two images leaves an ambiguity in the scale of the depth values, but the scale can be determined from the ratio between the past depth map and the depth values computed by motion stereo.
In step S2130, the calculation unit 1120 updates the depth map by a weighted sum, using the reliability associated with each depth value (the characteristic information read in step S2110) and the depth values computed by motion stereo in step S2120. Specifically, with the reliability value around each depth value d of the depth map taken as the weight α, d is combined with the depth value m computed by motion stereo according to Equation 2:

 d_new = αd + (1 − α)m   (Equation 2)

 The depth map is updated using the calculated d_new. When all pixels of the depth map have been updated, the depth map correction step ends, and the processing from step S150 described in Embodiment 1 continues.
As described above, in Embodiment 2 the weight of the depth value acquired by the imaging unit 110 is made large where the imaging unit 110 can acquire the depth value with high accuracy, and the weight of the depth value computed by motion stereo is made large otherwise. As a result, even where the measurement accuracy of the imaging unit 110 drops, the depth is corrected by motion stereo and the depth map can be computed with high accuracy.
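A per-pixel Python sketch of the weighted sum of Equation 2 follows; the arguments are assumed to be aligned HxW arrays, and the reliability is assumed to already lie in the range 0 to 1.

import numpy as np

def fuse_depth(depth_sensor, depth_motion_stereo, reliability):
    # Equation 2: d_new = alpha * d + (1 - alpha) * m, applied per pixel.
    alpha = np.clip(reliability, 0.0, 1.0)
    return alpha * depth_sensor + (1.0 - alpha) * depth_motion_stereo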
<Modification>
 In the present embodiment the reliability used to correct the depth map was computed from the measurement error of the depth values produced by the imaging unit 110 and used as the weight α. However, any method of computing a weight that integrates the depth map acquired by the imaging unit 110 with the depth values computed by motion stereo so as to improve the accuracy of the depth map may be used. For example, the weight may be the reciprocal of the depth multiplied by a predetermined coefficient β. Alternatively, the gradient of the input image may be computed and the inner product of the gradient direction and the arrangement direction of the elements in the imaging unit 110 used as the weight. The input unit 1110 may also receive from the imaging unit 110 the baseline length between the two images D154a and D154b or the baseline length of the parallax image D154f, and the ratio of that baseline to the motion-stereo baseline may be used as the weight. Rather than computing a weight for every pixel as described in this embodiment, only specific pixels may be integrated, or the same weight may be applied to some or all pixels when computing the weighted sum. Motion stereo may also be performed using images and depth maps from multiple past times rather than a single past time.
The AGV can also be controlled so that visual information is obtained that allows the position and orientation to be calculated more accurately and robustly. For example, the control unit 1140 may calculate control values that move the AGV so that the motion-stereo baseline becomes longer; one example is a control value that makes the AGV travel in a weaving path while keeping a distant point within the view of the imaging unit 110. Because the motion-stereo baseline becomes longer, depth values of more distant points can be computed accurately. The control unit 1140 may also calculate control values so that the imaging unit 110 obtains visual information over a wider field of view, specifically control values that make the AGV perform a look-around motion centered on the optical center of the imaging unit 110. Since visual information with a wider field of view can then be acquired, divergence and error in the optimization are reduced and the position and orientation can be calculated reliably.
The input unit 1110 may also receive an image and a position and orientation from another AGV through the communication I/F and compute depth values by motion stereo using the received image and position and orientation together with the image acquired by the imaging unit 110. What is received may be any visual information, such as a depth map, a parallax image, or a three-dimensional point cloud acquired by the imaging unit of the other AGV.
[Embodiment 3]
 In Embodiments 1 and 2, positions, orientations, and control values were calculated based on visual information in which the imaging unit 110 captured the scene. However, depth accuracy can drop on textureless walls and pillars. In Embodiment 3, therefore, a predetermined light pattern is projected onto the scene and captured by the imaging unit 110, thereby improving depth accuracy.
FIG. 8 shows the configuration of the information processing apparatus 30 in the present embodiment. It differs from the information processing apparatus 10 described in Embodiment 1 in that the control unit 1140 additionally calculates and outputs a control value for the projection device 310. The projection device in this embodiment is a projector, mounted so that its optical axis coincides with the optical axis of the imaging unit 110. The pattern projected by the projection device 310 is a random pattern generated so that projected and non-projected regions occur at random. In the present embodiment, the visual information consists of the image D154e and the depth map D154d acquired by the imaging unit 110, which the input unit 1110 receives from the imaging unit 110.
The diagram of the processing procedure in the present embodiment is the same as FIG. 5, which explains the processing procedure of the information processing apparatus 10 described in Embodiment 1, and its description is therefore omitted. This embodiment differs from Embodiment 1 in that, in step S150, the calculation unit 1120 computes a texture-degree value indicating whether the input visual information is poor in texture, and the control unit 1140 calculates a control value that switches the pattern projection ON or OFF based on that texture-degree value.
The procedure by which the control unit 1140 calculates the pattern-projection control value in step S150 is as follows. First, the calculation unit 1120 convolves the input image with a Sobel filter and takes the absolute values to obtain a gradient image. The Sobel filter is a type of filter for computing the first derivative of an image and is well known in the literature. The ratio of pixels in the gradient image whose values are at or above a predetermined gradient threshold is taken as the texture degree. Next, the control unit 1140 calculates a control value that turns the projection device ON if the texture-degree value is at or above a predetermined threshold and OFF if it is below the threshold.
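A Python sketch of the gradient-based texture degree follows, using OpenCV's Sobel operator; the threshold values are hypothetical, and the ON/OFF rule here follows the summary below, i.e., it is assumed that the projector is switched on when the measured texture is scarce.

import numpy as np
import cv2

def texture_degree(gray, grad_threshold=30.0):
    # Ratio of pixels whose Sobel gradient magnitude reaches the threshold.
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    mag = np.abs(gx) + np.abs(gy)
    return float(np.mean(mag >= grad_threshold))

def projector_should_be_on(gray, texture_threshold=0.1):
    # Assumed rule: project the random pattern when the scene is poor in texture.
    return texture_degree(gray) < texture_threshold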
 以上のように、実施形態3では、シーンがテクスチャに乏しい場合には、ランダムなパターン光を投影する。これにより、シーンにランダムな模様が付加されるため、シーンがテクスチャに乏しい場合であっても撮像部がより精度よくデプスマップが取得できる。このため、精度よく位置姿勢を算出することができるようになる。 As described above, in the third embodiment, when the scene is poor in texture, random pattern light is projected. As a result, a random pattern is added to the scene, so that even if the scene is poor in texture, the imaging unit can acquire the depth map more accurately. Therefore, the position and orientation can be calculated with high accuracy.
<Modification>
 In the present embodiment the pattern light was a random pattern. However, any pattern that adds texture to texture-poor regions may be used. For example, a random dot pattern or a striped pattern (or a lattice pattern, for example) may be projected. A striped pattern has the ambiguity that distances inside and outside the modulation wavelength cannot be distinguished, but this ambiguity can be removed by using a Gray-code scheme in which depth values are obtained from input images acquired at multiple times while changing the frequency.
In the present embodiment the control unit 1140 output an ON/OFF control value for the projection, and the projection device 310 switched the projection accordingly. However, any configuration in which the projection device 310 can project the pattern light may be used. For example, the projection device 310 may be configured to start projecting when power is turned on in the initialization step S110. The projection device 310 may also be configured to project onto an arbitrary part of the scene; specifically, the control unit 1140 may switch the projection pattern of the projection device 310 so that it projects only onto regions where the gradient value of the gradient image is below a predetermined threshold. Using the object detection described in Embodiment 5, human eyes may be detected and the control value calculated so that the pattern is projected while avoiding them. Furthermore, not only the ON/OFF of the pattern but also its brightness may be changed: the control unit 1140 may calculate control values so that the projection device 310 projects more brightly onto regions where the depth values of the depth map are large, or so that dark parts of the input image are illuminated more brightly. The pattern may also be changed when the residual error in the iterative calculation by which the calculation unit 1120 computes the position and orientation is at or above a predetermined threshold.
In the present embodiment the texture-degree value was computed from a gradient image obtained with a Sobel filter. Alternatively, the texture-degree value may be computed from a gradient image or edge image obtained with filters such as a Prewitt filter, a Scharr filter, or a Canny edge detector. The high-frequency components obtained by applying a DFT (discrete Fourier transform) to the image may also be used as the texture-degree value. Feature points such as corners in the image may also be detected and their number used as the texture degree.
[Embodiment 4]
 In Embodiments 1 and 2, positions, orientations, and control values were calculated based on visual information in which the imaging unit captured the scene. Embodiment 3 described improving accuracy for texture-poor scenes by projecting pattern light. Embodiment 4 describes a method that additionally uses three-dimensional information representing scene positions measured by another three-dimensional sensor.
FIG. 9 shows the configuration of the information processing apparatus 40 in the present embodiment. It differs from Embodiment 1 in that the input unit 1110 of the information processing apparatus 10 described in Embodiment 1 additionally inputs three-dimensional information from the three-dimensional measurement device 410. The three-dimensional measurement device 410 in this embodiment is a 3D LiDAR (light detection and ranging) device, which measures distance from the round-trip time of laser pulses. The input unit 1110 inputs the measurement values acquired by the three-dimensional measurement device as a point cloud. The holding unit holds in advance, as characteristic information, a list associating the depth values of the depth map acquired by the imaging unit 110 with their reliability and a list associating the depth values of the three-dimensional measurement device 410 with their reliability. These reliabilities are assumed to have been calculated in advance, for both the imaging unit 110 and the three-dimensional measurement device 410, by the method described in Embodiment 2.
The procedure of the overall processing in Embodiment 4 is the same as that of FIG. 4, which shows the processing procedure of the information processing apparatus 10 described in Embodiment 1, and its description is therefore omitted. It differs from Embodiment 1 in that a depth map correction step is added before the position and orientation calculation step S140. FIG. 10 is a flowchart showing the details of the processing procedure in the depth map correction step.
 ステップS4110では、算出部1120が、保持部1130から撮像部110、および三次元計測装置410の特性情報を読み込む。 In step S4110, the calculation unit 1120 reads the characteristic information of the imaging unit 110 and the three-dimensional measurement apparatus 410 from the holding unit 1130.
In step S4120, the calculation unit 1120 integrates the depth map computed by the imaging unit 110 with the point cloud measured by the three-dimensional measurement device 410, using the reliabilities associated with the depth values (the characteristic information read in step S4110). Specifically, the depth map can be updated by replacing the value m in Equation 2 with the depth value measured by the three-dimensional measurement device 410. The weight α is calculated by Equation 3, where γ_D denotes the reliability of the depth map and γ_L the reliability of the point cloud at the same location.
(Equation 3)
The depth map is updated by Equation 2 using the calculated weight. When all pixels of the depth map have been updated, the depth map correction step ends, and the processing from step S150 described in Embodiment 1 continues.
As described above, in Embodiment 4 the weight of the depth value acquired by the imaging unit is made large where the imaging unit can acquire the depth value with high accuracy, and the weight of the depth value acquired by the three-dimensional measurement device is made large where the three-dimensional measurement device can acquire it with high accuracy. The depth map is thus computed from whichever of the imaging unit and the three-dimensional measurement device measures each depth value more accurately, and the position and orientation can be calculated with high accuracy.
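For illustration only, a per-pixel fusion is sketched in Python below. Since the exact form of Equation 3 appears as an image in the original and is not reproduced in this text, the normalization alpha = γ_D / (γ_D + γ_L) used here is an assumption, not the patent's formula, and the LiDAR depth is assumed to have been re-projected into the camera's depth map beforehand.

import numpy as np

def fuse_with_lidar(depth_cam, depth_lidar, rel_cam, rel_lidar):
    # ASSUMED weighting (illustrative only): alpha = rel_cam / (rel_cam + rel_lidar).
    # Where both reliabilities are zero, the camera depth is kept unchanged.
    denom = rel_cam + rel_lidar
    alpha = np.where(denom > 0, rel_cam / np.maximum(denom, 1e-9), 1.0)
    return alpha * depth_cam + (1.0 - alpha) * depth_lidar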
<Modification>
 In the present embodiment a method using a 3D LiDAR as the three-dimensional measurement device 410 was described. The three-dimensional measurement device 410 is not limited to this, and may be anything that can measure three-dimensional information capable of improving the accuracy of the visual information acquired by the imaging unit 110. For example, it may be a TOF (Time of Flight) range camera or a stereo camera with two cameras. A stereo configuration may also be used in which a monocular camera separate from the DAF-based imaging unit 110 is arranged so as to share the optical axis of the imaging unit 110. Alternatively, a further imaging unit 110 with different reliability characteristics may be mounted and treated as the three-dimensional measurement device 410, with the depth map updated in the same way.
[Embodiment 5]
 In Embodiments 1 and 2, positions, orientations, and control values were calculated based on visual information in which the imaging unit 110 captured the scene. In Embodiment 3, a predetermined light pattern was projected onto the scene. In Embodiment 4, the three-dimensional shape measured by a three-dimensional measurement device was additionally used. In Embodiment 5, an object is detected from the visual information and used to control the moving body. In particular, this embodiment addresses the case where the AGV carries a load and, on arriving at the destination, must stop precisely at a predetermined position relative to a shelf or belt conveyor. A method is described in which the position and orientation of an object such as a shelf or belt conveyor imaged by the imaging unit 110 are calculated, giving a precise position and orientation with which the AGV is controlled. In this embodiment, unless otherwise noted, the feature information of an object means the position and orientation of the object.
The configuration of the apparatus in Embodiment 5 is the same as that of FIG. 2, which shows the configuration of the information processing apparatus 10 described in Embodiment 1, and is therefore omitted. It differs from Embodiment 1 in the following points: the calculation unit 1120 additionally performs object detection from the visual information; the control unit 1140 controls the moving body so that the detected object appears at a predetermined position in the visual information; and the holding unit 1130 holds an object model for object detection, together with a target position and orientation relative to the object that specifies the pose the AGV should take with respect to the object when it arrives at the destination.
The object model consists of a CAD model representing the shape of the object and a list storing PPF (Point Pair Feature) feature information, whose feature value is the relative position of a pair of three-dimensional points, each with a normal, among the object's three-dimensional feature points.
The procedure of the overall processing in Embodiment 5 is the same as that of FIG. 5, which shows the processing procedure of the information processing apparatus 10 described in Embodiment 1, and its description is therefore omitted. However, it differs from Embodiment 1 in that an object detection step is added after the position and orientation calculation step S140. FIG. 11 is a flowchart explaining the details of the object detection step.
 ステップS5110では、算出部1120が、保持部1130が保持する物体モデルを読み込む。 In step S5110, the calculation unit 1120 reads the object model held by the holding unit 1130.
In step S5120, the calculation unit 1120 detects where in the visual information an object matching the object model appears, using the depth map. Specifically, PPF features are first computed from the depth map, and the PPFs detected from the depth map are matched against the PPFs of the object model to compute an initial value of the object's position and orientation relative to the imaging unit 110.
In step S5130, using the object's position and orientation relative to the imaging unit 110 computed by the calculation unit 1120 as the initial value, the position and orientation of the object relative to the imaging unit 110 are refined by the ICP algorithm. At the same time, the residual with respect to the target position and orientation relative to the object held by the holding unit 1130 is calculated. The calculation unit 1120 passes the calculated residual to the control unit 1140, and the object detection step ends.
 図5におけるステップS150においては、制御部1140が、算出部1120が算出した物体の位置姿勢の残差が小さくなる方向にAGVが移動するようにアクチュエータ120の制御値を算出する。 In step S150 in FIG. 5, the control unit 1140 calculates the control value of the actuator 120 so that the AGV moves in the direction in which the residual of the position and orientation of the object calculated by the calculation unit 1120 decreases.
In Embodiment 5, an object appearing in the depth map acquired by the imaging unit, in which each light-receiving portion on the image sensor is composed of two or more light-receiving elements, is detected, and the object's position and orientation are calculated by model fitting. The AGV is then controlled so that the difference between the position and orientation relative to the object given in advance and the detected object's position and orientation becomes small, that is, the AGV is controlled so as to be precisely aligned with the object. By calculating the position and orientation relative to an object whose shape is known in advance in this way, the position and orientation can be calculated with high accuracy, and the AGV can be controlled with high accuracy.
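A small Python sketch of the pose residual driving this alignment follows; both poses are assumed to be 4x4 homogeneous transforms expressed in the camera frame, and the stopping tolerances are hypothetical values.

import numpy as np

def pose_residual(T_target, T_detected):
    # Difference between the stored target pose relative to the object and the
    # detected object pose; returns translation error [m] and yaw error [rad].
    dT = np.linalg.inv(T_target) @ T_detected
    trans_err = float(np.linalg.norm(dT[:3, 3]))
    yaw_err = float(np.arctan2(dT[1, 0], dT[0, 0]))
    return trans_err, yaw_err

def aligned(T_target, T_detected, tol_t=0.01, tol_r=0.01):
    # Hypothetical rule: stop issuing correction commands once both errors are small.
    trans_err, yaw_err = pose_residual(T_target, T_detected)
    return trans_err < tol_t and abs(yaw_err) < tol_r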
<Modification>
 In the present embodiment PPF features were used to detect the object. However, any method capable of detecting the object may be used. For example, a SHOT feature may be used, whose feature value is a histogram of the inner products between the normal of a three-dimensional point and the normals of surrounding three-dimensional points. A Spin Image feature may also be used, in which the surrounding three-dimensional points are projected onto a cylindrical surface whose axis is the normal vector of a given three-dimensional point. As a method of detecting the object without hand-crafted features, a machine-learned model may also be used; specifically, a neural network trained to return 1 for object regions and 0 for non-object regions when given a depth map can be used as the learned model. If a learned model trained to output the six degrees of freedom of the object directly from the depth map is available, steps S5110 to S5130 may be combined and the object's position and orientation calculated at once.
 本実施形態では、物体の例として棚やベルトコンベアとしていた。しかしながらAGVを停止させたときに撮像部110が観測でき、相対位置姿勢(相対位置、相対姿勢)が一意に定まる物体であればなんでもよい。例えば、位置姿勢の指標として工場の天井に張り付けた三次元のマーカ(具体的には、3Dプリンタで印刷した任意の凹凸を持つ任意形形状の物体)でもよい。また、AGVが充電式で充電ステーションに停止する場合には充電ステーションの形状の3DCADモデルでもよい。また、CADモデルでなくとも、目標位置姿勢であらかじめ停止した際のデプスマップを物体モデルとして用いてもよい。このとき、AGV運用時には保持したデプスマップと入力部1110が入力したデプスマップとの間の位置姿勢誤差が小さくなるようにAGVを制御すればよい。このようにするとCADモデルの作成の手間なく物体モデルを生成できる。 In this embodiment, a shelf or a belt conveyor is used as an example of the object. However, it may be any object as long as the imaging unit 110 can observe when the AGV is stopped and the relative position and orientation (relative position and relative orientation) are uniquely determined. For example, a three-dimensional marker (specifically, an arbitrary-shaped object having arbitrary asperities printed by a 3D printer) affixed to a ceiling of a factory as an index of position and orientation may be used. When the AGV is rechargeable and stops at the charging station, it may be a 3D CAD model of the shape of the charging station. Also, instead of using a CAD model, a depth map may be used as an object model when stopping at a target position and orientation in advance. At this time, during AGV operation, AGV may be controlled so that the position and orientation error between the held depth map and the depth map input by input unit 1110 is reduced. In this way, an object model can be generated without the trouble of creating a CAD model.
In the present embodiment, a method of detecting an object and fitting a model to it was illustrated for the purpose of calculating the position and orientation for precise positioning of the AGV. However, this is not limited to precise position and orientation calculation; it may also be used for collision avoidance and for detecting the positions and orientations of other AGVs. Specifically, suppose a CAD model of the AGV's shape is held as an object model and the calculation unit 1120 finds another AGV by object detection. The control unit 1140 can then calculate control values that avoid the other AGV's coordinates and thus avoid colliding with it. When another AGV is detected, an alert may be presented, or the other AGV may be instructed to clear this AGV's route. If the other AGV is stationary, it may be regarded as having stopped with a depleted battery, and the control unit 1140 may calculate control values to approach it, couple with it, and tow it to the charging station. When cables are run along passages in the factory, the calculation unit 1120 may detect the cables and the control unit 1140 may calculate control values that detour around them so they are not run over; if the ground is uneven, control values may be calculated to avoid the uneven areas. Furthermore, if labels such as "no entry" or "recommended route" are associated with each object model, whether the AGV may pass can easily be set simply by placing the corresponding object in the scene.
 本実施形態においては、物体モデルとはCADモデルであった。しかしながら物体の位置姿勢を算出できればモデルは何でもよい。例えば、対象物を複数視点で撮影したステレオ画像からStructure From Motionアルゴリズムによって三次元復元して生成したメッシュモデルであってもよい。また、RGB-Dセンサで複数視点から撮影したデプスマップを統合して作成したポリゴンモデルであってもよい。また、前述のような物体を検出するように学習したニューラルネットワークモデルとして例えばCNN(Convolutional Neural Network)を用いてもよい。 In the present embodiment, the object model is a CAD model. However, any model may be used as long as the position and orientation of the object can be calculated. For example, it may be a mesh model generated by three-dimensional reconstruction of a target object from stereo images taken at a plurality of viewpoints by the Structure From Motion algorithm. In addition, it may be a polygon model created by integrating depth maps captured from a plurality of viewpoints with an RGB-D sensor. Also, a CNN (Convolutional Neural Network) may be used as a neural network model learned to detect an object as described above.
The imaging unit 110 may also image the object the AGV is carrying, the calculation unit 1120 may recognize it, and the control unit 1140 may calculate control values according to the type of the loaded object. Specifically, if the loaded object is fragile, control values are calculated so that the AGV moves at low speed. If a list associating each object with a destination position and orientation is held in advance in the holding unit 1130, control values may be calculated so as to move the AGV to the destination associated with the loaded object.
If the imaging unit 110 detects an object within a predetermined distance range from the moving body, it may be judged that an object that should be carried has fallen, and an alert may be displayed. If a robot arm (not shown) is mounted on the AGV, the control unit 1140 may calculate control values for the robot arm so that it picks up the object.
[Embodiment 6]
 Embodiments 1 to 4 described methods of calculating the position and orientation stably and with high accuracy from the visual information acquired by the imaging unit 110 and of calculating control values for the moving body. Embodiment 5 described a method of detecting an object from the visual information and using it to control the moving body. Embodiment 6 describes, as an additional function of Embodiments 1 to 5, a method of controlling the AGV and generating map information stably and with high accuracy using the result of segmenting the input visual information into regions. In particular, this embodiment illustrates how the method is applied when generating map information. Registering in the map information stationary objects whose positions and orientations do not change over time, and using these to calculate the position and orientation, improves robustness to changes in the scene. The visual information is therefore semantically segmented to determine the object type of each pixel, and a method of generating hierarchical map information using per-object-type "stationariness" information computed in advance, as well as a position and orientation estimation method using it, are described. In this embodiment, unless otherwise noted, the feature information of an object means the type of the object.
The configuration of the apparatus in Embodiment 6 is the same as that of FIG. 2, which shows the configuration of the information processing apparatus 10 described in Embodiment 1, and is therefore omitted. The calculation unit 1120 additionally performs semantic segmentation of the visual information and uses the result to generate map information hierarchically. The hierarchical map information in this embodiment is a point cloud composed of four layers: (1) a factory layout CAD model, (2) a stationary object map, (3) a fixture map, and (4) a moving object map. The holding unit 1130 holds (1), the factory layout CAD model, in the external memory H14. The position and orientation are calculated using the hierarchically created map information; the calculation method is described later. In the present embodiment, the visual information acquired by the imaging unit 110 and input by the input unit 1110 consists of an image and a depth map. The holding unit 1130 also holds a CNN, a learned model trained to output, for each object type, a mask image indicating whether each pixel belongs to that object when an image is input. Together with the learned model, it holds a lookup table specifying which of the layers (2) to (4) each object type belongs to, so that specifying an object type reveals its layer.
The overall processing procedure in the sixth embodiment is the same as that of FIG. 4, which shows the processing procedure of the information processing apparatus 10 described in the first embodiment, and its description is therefore omitted. This embodiment differs from the first embodiment in that, when calculating the position and orientation, the calculation unit 1120 takes into account the layer of the map information held by the holding unit 1130. It also differs in that a region segmentation and map generation step is added after the position and orientation calculation step S140. Details of these processes are described later.
In step S140, the calculation unit 1120 calculates the position and orientation by assigning, for each layer of the map information held by the holding unit 1130, a weight to the point cloud that determines its contribution to the position and orientation calculation. Specifically, when the layers (1) to (4) of the example in this embodiment are held, the weights are made progressively smaller from layer (1), the least movable map information, to layer (4).
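As one illustration of this weighting, the following Python sketch (not part of the original disclosure; the numeric weights and names are assumptions) shows how per-layer weights could scale alignment residuals during pose estimation, so that points from more movable layers contribute less.

    # Minimal sketch: per-layer weights down-weight more movable map layers
    # when estimating the camera pose. Layer indices follow the example in
    # the text: 1 = CAD layout, 2 = stationary, 3 = fixtures, 4 = moving
    # objects; the numeric weights are illustrative only.
    import numpy as np

    LAYER_WEIGHTS = {1: 1.0, 2: 0.8, 3: 0.4, 4: 0.1}  # assumed values

    def weighted_pose_error(residuals, layer_ids):
        """Sum of squared point-to-map residuals, scaled by layer weight.

        residuals : (N,) array of per-point alignment errors
        layer_ids : (N,) array of map-layer indices (1..4) per matched point
        """
        w = np.array([LAYER_WEIGHTS[int(l)] for l in layer_ids])
        return float(np.sum(w * residuals ** 2))

A pose estimator would minimize this weighted error instead of the unweighted one, which is one simple way to realize the layer-dependent contribution described above.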
FIG. 12 is a flowchart illustrating the details of the region segmentation and map generation step. This step is added immediately after the position and orientation calculation step S140 in FIG. 5 and is executed there.
In step S6110, the calculation unit 1120 semantically segments the input image. Many methods for semantic segmentation have been proposed and can be employed here; any method that semantically segments an image may be used. These methods yield, for each object type, a mask image in which each pixel is marked as belonging or not belonging to that object.
In step S6120, the depth map is segmented into regions. Specifically, a normal is first calculated for each pixel of the depth map, and normal edges are detected where the inner product with the surrounding normals falls below a predetermined value. The depth map is then segmented by assigning a different label to each region bounded by these normal edges, yielding a region-segmented image.
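The following Python sketch (an assumption-laden illustration, not the patented implementation) shows one way to realize this step: estimate normals from the back-projected depth points, mark pixels whose normals deviate from their neighborhood, and label the connected smooth regions. The 5x5 window and the 0.95 dot-product threshold are placeholders.

    # Illustrative sketch, assuming a dense depth map already back-projected
    # to one 3D point per pixel (points[y, x] = (X, Y, Z)).
    import numpy as np
    from scipy import ndimage

    def normals_from_points(points):
        # Gradients along image rows/columns, then their cross product.
        dx = np.gradient(points, axis=1)
        dy = np.gradient(points, axis=0)
        n = np.cross(dx, dy)
        n /= (np.linalg.norm(n, axis=2, keepdims=True) + 1e-9)
        return n

    def segment_by_normals(points, dot_threshold=0.95):
        n = normals_from_points(points)
        # A pixel is a "normal edge" if its normal deviates from the local
        # mean normal; smooth pixels form the interior of regions.
        mean_n = ndimage.uniform_filter(n, size=(5, 5, 1))
        mean_n /= (np.linalg.norm(mean_n, axis=2, keepdims=True) + 1e-9)
        smooth = (n * mean_n).sum(axis=2) > dot_threshold
        # Connected components of smooth pixels become region labels; edge
        # pixels receive label 0.
        labels, _ = ndimage.label(smooth)
        return labels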
In step S6130, the calculation unit 1120 semantically segments the point cloud based on the mask images obtained by semantically segmenting the input image and the region-segmented image obtained from the depth map. Specifically, the ratio N_i,j expressing the inclusion relationship between each depth map region S_Dj and each mask object region S_Mi is calculated by Expression 4, where i is the object type and j is the label of the depth map region.
[Expression 4: equation shown as an image in the original publication]
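Since the equation itself is only available as an image, the following is merely a plausible form consistent with the surrounding description (the overlap between the depth map region and the mask region, normalized by the depth map region); it is an assumption, not a reproduction of Expression 4:

    N_{i,j} = \frac{\lvert S_{D_j} \cap S_{M_i} \rvert}{\lvert S_{D_j} \rvert}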
Next, the object type i is assigned to each depth map region S_Dj for which N_i,j is equal to or greater than a predetermined threshold. Pixels to which no object type has been assigned are given a background label. In this way, an object type is assigned to each pixel of the depth map.
In step S6140, the calculation unit 1120 hierarchically generates map information based on the object type labels assigned to the depth map in step S6130. Specifically, the lookup table is consulted for each object type label in the depth map, and the three-dimensional point group obtained from the depth map is stored in the corresponding layer of the map information held by the holding unit 1130. When the storage is completed, the region segmentation and map generation step ends.
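A hedged sketch of this routing, in Python: the lookup table contents and all names are assumptions chosen only to illustrate how labeled 3D points could be appended to the appropriate map layer.

    # Hedged sketch of step S6140: route 3D points into map layers using a
    # lookup table from object-type label to layer index.
    OBJECT_TYPE_TO_LAYER = {          # illustrative lookup table
        "wall": 2, "pillar": 2,       # stationary objects
        "shelf": 3, "desk": 3,        # fixtures
        "person": 4, "agv": 4,        # moving objects
    }

    def add_points_to_layers(points_3d, labels, map_layers):
        """points_3d: list of (x, y, z); labels: object-type label per point;
        map_layers: dict layer_index -> list of points (hierarchical map)."""
        for p, label in zip(points_3d, labels):
            layer = OBJECT_TYPE_TO_LAYER.get(label)   # background -> None
            if layer is not None:
                map_layers.setdefault(layer, []).append(p)
        return map_layers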
As described above, in the sixth embodiment, semantically segmenting the depth map makes it possible to register in the map information non-moving objects, which are suitable for position and orientation calculation, separately from moving objects, which are not. Using the layered map information, smaller weights are assigned to objects that move more, and the position and orientation are calculated according to the assigned weights. This allows the position and orientation to be calculated more stably and robustly.
<Modification>
In this embodiment, the layers (1) to (4) are used. However, any configuration with multiple layers according to how movable the objects are is acceptable; the holding unit 1130 may hold only an arbitrary subset of the layers (1) to (4). In addition, layers for specific objects (an AGV layer, a person layer), a pillar layer, or a landmark layer (3D markers or charging stations) may also be held.
In this embodiment, the semantically segmented depth map was used to generate map information and calculate the position and orientation. The control unit 1140 may also calculate control values using the semantically segmented depth map. Specifically, when people or other AGVs are detected in the semantic segmentation, the control unit 1140 can calculate control values so as to avoid them, allowing the AGV to be operated safely. The control unit 1140 may also calculate control values that make the AGV follow a person or another AGV; in that case the AGV can operate even without map information. Furthermore, the calculation unit 1120 may recognize human gestures based on the semantic segmentation result, and the control unit 1140 may calculate control values accordingly. For example, image regions are labeled for body parts such as arms, fingers, head, torso, and legs, and gestures are recognized from their relative positions. When a beckoning gesture is recognized, a control value is calculated to move closer to the person; when a pointing gesture is recognized, a control value is calculated to move in the pointed direction. Recognizing human gestures in this way lets the user move the AGV without directly operating it with a controller, so the AGV can be operated with little effort.
The control unit 1140 may also calculate control values according to the object type detected by the method of this embodiment. Specifically, the AGV is controlled to stop if the object is a person and to steer around it if the object is another AGV. This allows the AGV to be operated safely, never colliding with people, while efficiently avoiding non-human obstacles.
In this embodiment, the AGV passively segments objects. However, the AGV may also instruct people so that moving objects are excluded. Specifically, if the calculation unit 1120 detects a person while map information is being generated, the control unit 1140 calculates a control value that outputs, through a speaker (not shown), a voice message asking the person to move away. In this way, map information can be generated with moving objects excluded.
In this embodiment, semantic segmentation was performed to identify object types. However, the calculation unit 1120 may generate map information and calculate the position and orientation, and the control unit 1140 may calculate control values, without identifying object types; that is, steps S6110 and S6120 in FIG. 12 may be removed. For example, the depth map may be segmented by height above the ground, and pixels higher than the AGV are ignored when calculating control values; in other words, point clouds at heights where the AGV cannot collide are not used for route generation. This reduces the number of points to be processed and allows control values to be calculated quickly. The segmentation may also be based on planarity. In that case, three-dimensional edges, which contribute strongly to the position and orientation calculation, can be used preferentially (planes that leave ambiguity in the position and orientation are excluded from processing), reducing calculation time and improving robustness.
In this embodiment, the more movable an object in the map information is, the smaller its weight, reducing its contribution to the position and orientation calculation. Alternatively, even if the map information has no layer structure, a weight can be calculated for each pixel of the depth map based on its semantic segmentation result, and the position and orientation can be calculated using these weights. After the input unit 1110 inputs the depth map, the calculation unit 1120 semantically segments it following steps S6110 to S6130, determines a weight for each pixel by consulting the lookup table with its object type label, and then calculates the position and orientation in step S140 taking the weights into account. In this way, the influence of moving objects on the position and orientation calculation can be reduced without giving the map a layer structure, which keeps the map small.
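A minimal sketch of this layer-free variant follows; the label-to-weight table and function names are assumptions, not taken from the publication.

    # Weight each depth-map pixel by how static its semantic label is, then
    # use the weights during pose estimation (e.g. with weighted_pose_error).
    LABEL_WEIGHT = {"background": 1.0, "shelf": 0.5, "person": 0.05, "agv": 0.05}

    def pixel_weights(label_image):
        """label_image: 2D list/array of object-type labels per depth pixel."""
        return [[LABEL_WEIGHT.get(lbl, 1.0) for lbl in row]
                for row in label_image]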
The imaging unit 110 in this embodiment is not limited to an imaging unit in which each light receiving unit on the image sensor is composed of two or more light receiving elements; anything that can acquire three-dimensional depth information, such as a TOF camera or a 3D LiDAR, may be used.
In this embodiment, the holding unit 1130 holds the map information layer by layer. The layers can be checked on the display unit H16 or reverted to the initial map. By checking the layers on the display screen and, if a moving object has been registered in the map, instructing the AGV to generate the map again, the AGV can be operated easily and stably.
In this embodiment, it was assumed that a single AGV creates the map information, but multiple AGVs can also generate map information cooperatively. Specifically, the point clouds of the maps created by the individual AGVs are aligned with the ICP algorithm so that points corresponding to the same location coincide. When integrating individually created map information, the map creation times may be consulted so that the newer map information is retained. The control unit 1140 may also move an AGV that is not currently working so that it generates a map of an area whose map information has not been updated for a while. Generating the map cooperatively with multiple AGVs in this way shortens the time required to generate the map information, making the AGVs easy to operate.
The control values calculated by the control unit 1140 are not limited to the method described in this embodiment, as long as they are calculated using the map information so as to approach the target position and orientation. Specifically, the control values can be determined using a learning model for route generation, for example DQN (Deep Q-Network), a reinforcement learning model. This can be realized by training the reinforcement learning model in advance so that the reward is high when approaching the target position and orientation, low when moving away from it, and low when approaching an obstacle.
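As a small, hedged illustration of the reward shaping described above (not the disclosed training scheme; the coefficients and margin are assumptions), such a reward could be written as:

    def shaped_reward(dist_to_goal, prev_dist_to_goal,
                      dist_to_nearest_obstacle, obstacle_margin=0.5):
        # Reward progress toward the goal; penalize getting close to obstacles.
        reward = prev_dist_to_goal - dist_to_goal
        if dist_to_nearest_obstacle < obstacle_margin:
            reward -= 1.0
        return reward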
The first to sixth embodiments described methods of calculating the position and orientation and the control values using map information, but the map information is not limited to this use. For example, the created map information may be used to simulate AGV transport, and the process management system may generate processes so that transport is carried out efficiently. Similarly, the mobile management system may use the map information to generate AGV operation timings and routes that avoid congestion.
The learning model described above may also be trained together with delivery simulations on the created map. By reproducing and learning situations such as placed obstacles or collisions with people and other AGVs in the simulation, the control unit 1140 can calculate control values stably using the learning model even when similar situations actually occur. In addition, by training in parallel with a method such as A3C (Asynchronous Advantage Actor-Critic), the learning model can be configured to learn the control method efficiently in a short time.
Seventh Embodiment
A UI that can be applied commonly to the first to sixth embodiments is described. It allows the user to check the visual information acquired by the imaging unit, the position and orientation calculated by the calculation unit, object detection results, map information, and so on. Since the AGV moves under automatic control, it can also be controlled by user input. So that the user can check the status of the AGV and also control it, a GUI is displayed on a display device, for example a display, and the user's operations are input through an input device such as a mouse or a touch panel. In this embodiment the display is assumed to be mounted on the AGV, but the configuration is not limited to this: via the communication I/F (H17), the display of the user's mobile terminal or a liquid crystal display connected to the mobile management system can also be used as the display device. Whether or not the display device is mounted on the AGV, the display information can be generated by the information processing apparatus. When a display device not mounted on the AGV is used, a computer attached to the display device may obtain from the information processing apparatus the information needed to generate the display information and generate it there.
The configuration of the apparatus in the seventh embodiment is the same as that of FIG. 2, which shows the configuration of the information processing apparatus 10 described in the first embodiment, and its description is therefore omitted. It differs from the first embodiment in that the calculation unit 1120 generates display information based on the visual information acquired by the imaging unit 110, the position and orientation it calculated, the detected objects, and the control values calculated by the control unit 1140, and presents it on a touch panel display or the like mounted on the AGV. Details of the display information are described later. In this embodiment, the holding unit 1130 is assumed to hold both 2D and 3D map information.
FIG. 13 shows GUI 100, an example of the display information presented by the display device in this embodiment.
G110 is a window for presenting the 2D map information. G120 is a window for presenting the 3D map information. G130 is a window for presenting the image D154e acquired by the imaging unit 110. G140 is a window for presenting the depth map D154d acquired by the imaging unit 110. G150 is a window for presenting display information based on the position and orientation calculated by the calculation unit 1120 as described in the first embodiment, the objects detected as described in the fifth and sixth embodiments, and the control values calculated by the control unit 1140 as described in the first embodiment.
G110 shows an example presentation of the 2D map held by the holding unit 1130. G111 is the AGV on which the imaging unit 110 is mounted; the calculation unit 1120 renders it onto the 2D map based on the position and orientation of the imaging unit (that is, the position and orientation of the AGV). G112 is an example in which an alert is presented as a balloon when there is a possibility of a collision, based on the position and orientation of an object detected by the calculation unit 1120 with the methods of the fifth or sixth embodiment. G113 is an example in which the planned travel route of the AGV is presented as an arrow based on the control values calculated by the control unit 1140. In FIG. 13, the AGV is heading toward the destination shown at G114. By presenting the 2D map, the position of the AGV, object detection results, and the route in this way, the user can easily grasp the operating status of the AGV. The colors, line widths, and shapes of G111 to G114 may be varied so that the user can grasp the operating status even more easily.
G120 shows an example presentation of the 3D map held by the holding unit 1130. G121 is an example visualizing the result of updating the 3D map held by the holding unit 1130 using the semantic segmentation of the depth map by the calculation unit 1120, as described in the sixth embodiment. Specifically, non-moving objects obtained from the factory CAD data are drawn darker, and moving objects such as other AGVs and people are drawn lighter. The presentation is not limited to shading; each layer may instead be drawn in a different color. The label of the object detected by the calculation unit 1120 is also presented at G122. Presenting the 3D map in this way lets the user grasp the operating status with the height direction taken into account, which a 2D map cannot provide. In addition, an object type found while the AGV was traveling can be searched for without going to the site.
G130 shows an example presentation of the image acquired by the imaging unit 110. In G131 and G132, as described in the sixth embodiment, bounding boxes are superimposed as dotted lines around the other AGV and the person detected by the calculation unit 1120. Solid or double lines may be used instead, and the boxes may be emphasized by changing their color. Superimposing the object detection results on the image acquired by the imaging unit 110 in this way lets the user check the objects detected by the calculation unit 1120 without effort.
G140 shows an example presentation of the depth map acquired by the imaging unit 110. G141 is an example in which the CAD model of an object held by the holding unit 1130, described in the fifth embodiment, is superimposed as a wire frame using the position and orientation of the object calculated by the calculation unit 1120. G142 is an example in which the CAD model of an AGV is superimposed as a wire frame, and G143 an example in which the CAD model of a three-dimensional marker is superimposed. Presenting on the depth map the detected objects whose position and orientation have been calculated lets the user easily grasp what has been detected. When the AGV is controlled using the detected position and orientation of an object, a misalignment between the depth map and the CG also reveals errors in the calculated position and orientation of the AGV. The wire frames may additionally be superimposed on G130; the user then only has to compare the live image with the models, making it even easier and more intuitive to check the position and orientation calculation accuracy of the AGV and the object detection accuracy.
G150 shows a GUI for operating the AGV manually, the values calculated by the calculation unit 1120 and the control unit 1140, and an example presentation of the AGV's operation information. G151 is an emergency stop button; the user can stop the movement of the AGV by touching it with a finger. G152 is a mouse cursor, which can be moved with a mouse or controller (not shown) or by the user's touch operations on the touch panel, and buttons and radio buttons in the GUI can be operated by pressing them. G153 is an example of an AGV controller: by moving the circle inside it up, down, left, and right, the user can drive the AGV forward, backward, left, and right according to those inputs. G154 is an example presentation of the internal state of the AGV, illustrated here as traveling automatically at 0.5 m/s. Operation information such as the time since the AGV started traveling, the remaining time to the destination, and the difference between the expected and scheduled arrival times is also presented. G156 is a GUI for setting the AGV's behavior and the display information; the user can choose, for example, whether to generate map information and whether to present detected objects. G157 is an example presentation of the AGV's operation information, here the position and orientation calculated by the calculation unit 1120, the destination coordinates received from the mobile management system 13, and the name of the article the AGV is transporting. Presenting operation information and GUIs for user input in this way makes the AGV more intuitive to operate.
The processing procedure of the information processing apparatus in the seventh embodiment differs from that of the information processing apparatus 10 described in the first embodiment in that a display information generation step (not shown), in which the calculation unit 1120 generates display information, is added after step S160 of FIG. 5. In the display information generation step, display information is rendered based on the visual information captured by the imaging unit 110, the position and orientation calculated by the calculation unit 1120, the detected objects, and the control values calculated by the control unit 1140, and is output to the display device.
In the seventh embodiment, the calculation unit generates display information based on the visual information acquired by the imaging unit, the position and orientation it calculated, the detected objects, and the control values calculated by the control unit, and presents it on the display. This allows the user to easily check the state of the information processing apparatus. The user can also input AGV control values, various parameters, display modes, and so on, making it easy to change various AGV settings or move the AGV. Presenting a GUI in this way makes the AGV easy to operate.
The display device is not limited to a display. If a projector is mounted on the AGV, display information can also be presented with the projector. If a display device is connected to the mobile management system 13, the display information may be transmitted to the mobile management system 13 via the communication I/F (H17) and presented there. Alternatively, only the information needed to generate the display information may be transmitted, and a computer inside the mobile management system 13 may generate the display information. In this way, the user can check the AGV's operating status and operate it easily without looking at a display device mounted on the AGV.
The display information in this embodiment may be anything that presents information handled by this information processing. Besides the display information described in this embodiment, the residual of the position and orientation calculation and the recognition likelihood values at object detection can also be displayed, as well as the time and frame rate of the position and orientation calculation and the remaining battery level of the AGV. Presenting the information handled by the information processing apparatus in this way allows the user to check its internal state.
The GUI described in this embodiment is only an example; any GUI may be used as long as it lets the user grasp the operating status of the AGV and perform operations (input) on it. For example, the display information can be changed by switching colors, line widths, and solid, broken, or double lines, by zooming, or by hiding unnecessary information. An object model may be displayed as a contour instead of a wire frame, or a translucent polygon model may be superimposed. Changing how the display information is visualized in this way lets the user understand it more intuitively.
The GUI described in this embodiment can also be connected to a server (not shown) via the Internet. With such a configuration, if a problem occurs in the AGV, for example, a person in charge at the AGV manufacturer can obtain the display information via the server and check the state of the AGV without going to the site.
A touch panel was given as an example of the input device, but any input device that accepts user input may be used: a keyboard, a mouse, or gestures (recognized, for example, from the visual information acquired by the imaging unit 110). The mobile management system may also serve as the input device via the network. If a smartphone or tablet terminal is connected via the communication I/F (H17), it can likewise be used as the display device and input device.
What the input device inputs in this embodiment is not limited to what has been described; it may be anything that changes the parameters of the information processing apparatus. For example, user input that changes the upper limit of the moving object's control values (a speed limit) may be accepted, or a destination point that the user clicked on G110 may be input. The user's selection of which models to use and not use for object detection may also be input. The system may further be configured so that, when the user outlines an object on G130 that could not be detected, a learning device (not shown) trains the learning model so that the object can be detected from the visual information of the imaging unit 110.
[Eighth Embodiment]
The sixth embodiment described a method of semantically segmenting the visual information acquired by the imaging unit 110, determining the object type of each pixel, and generating a map, as well as methods of controlling the AGV based on the map and the determined object types. The eighth embodiment further describes a method of recognizing semantic information that differs with the situation even for the same object type, and controlling the AGV based on the recognition result.
In the eighth embodiment, the degree to which objects are stacked, such as products piled up in a factory, is recognized as semantic information from the visual information acquired by the imaging unit 110; that is, the semantic information of objects within the field of view of the imaging unit 110 is recognized. A method of controlling the AGV according to the stacking degree of the objects is then described, in which the AGV is controlled so as to keep a safer distance from stacked objects. In this embodiment, the stacking degree of objects refers to the number of stacked objects or their height.
An occupancy map indicating whether space is occupied by objects is used to calculate the AGV's control values. In this embodiment, the occupancy map is a two-dimensional occupancy grid map in which the scene is divided into grid cells and each cell holds the probability that an obstacle exists there. Here, each cell of the occupancy map holds a value representing the degree of approach rejection for the AGV (a continuous variable from 0 to 1, where values closer to 0 permit passage and values closer to 1 reject it). The AGV is controlled toward the destination so that it does not pass through regions (grid cells in this embodiment) whose approach rejection value is equal to or greater than a predetermined value. The destination is the two-dimensional coordinate of the AGV's goal, included in the operation information acquired from the process management system 12.
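The following short Python sketch (names, grid resolution handling, and the threshold value are assumptions) illustrates this kind of occupancy grid with per-cell approach rejection values and a pass/no-pass test.

    # Illustrative occupancy-grid sketch: each cell stores an approach-
    # rejection value in [0, 1]; cells at or above a threshold are treated
    # as not traversable.
    import numpy as np

    class RejectionGrid:
        def __init__(self, width, height, threshold=0.5):
            self.values = np.zeros((height, width))   # 0 = freely passable
            self.threshold = threshold

        def is_blocked(self, ix, iy):
            return self.values[iy, ix] >= self.threshold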
The information processing system in this embodiment is the same as the system configuration described with reference to FIG. 1 of the first embodiment, and its description is therefore omitted.
FIG. 14 shows the module configuration of the mobile body 12 including the information processing apparatus 80 of the eighth embodiment. The information processing apparatus 80 consists of an input unit 1110, a position and orientation calculation unit 8110, a semantic information recognition unit 8120, and a control unit 8130. The input unit 1110 is connected to the imaging unit 110 mounted on the mobile body 12, and the control unit 8130 is connected to the actuator 120. In addition, a communication device (not shown) communicates information bidirectionally with the mobile management system 13 and exchanges inputs and outputs with the various units of the information processing apparatus 80.
The imaging unit 110, the actuator 120, and the input unit 1110 in this embodiment are the same as in the first embodiment, and their detailed description is omitted.
The position and orientation calculation unit 8110, the semantic information recognition unit 8120, and the control unit 8130 are described in order below.
The position and orientation calculation unit 8110 calculates the position and orientation of the imaging unit 110 based on the depth map input by the input unit 1110, and creates a three-dimensional map of the scene based on the calculated position and orientation. The calculated position and orientation and the three-dimensional map are input to the semantic information recognition unit 8120 and the control unit 8130.
The semantic information recognition unit 8120 estimates, as semantic information, the number and height of stacked objects based on the depth map input by the input unit 1110 and the position and orientation and three-dimensional map calculated by the position and orientation calculation unit 8110. The estimated number and height values are input to the control unit 8130.
The control unit 8130 receives the position and orientation and the three-dimensional map calculated by the position and orientation calculation unit 8110, as well as the number and height of stacked objects estimated as semantic information by the semantic information recognition unit 8120. Based on these values, the control unit 8130 calculates approach rejection values for the objects in the scene and calculates control values for the AGV so that it stays clear of occupancy grid cells whose approach rejection values are equal to or greater than the predetermined value. The control unit 8130 outputs the calculated control values to the actuator 120.
Next, the processing procedure in this embodiment is described. FIG. 15 is a flowchart showing the processing procedure of the information processing apparatus 80. The processing steps are initialization S110, visual information acquisition S120, visual information input S130, position and orientation calculation S810, semantic information estimation S820, control value calculation S830, control S160, and system termination determination S170. Initialization S110, visual information acquisition S120, visual information input S130, control S160, and system termination determination S170 are the same as in FIG. 5 of the first embodiment, and their description is omitted. The steps of position and orientation calculation S810, semantic information estimation S820, and control value calculation S830 are described in order below.
In step S810, the position and orientation calculation unit 8110 calculates the position and orientation of the imaging unit 110 and creates a three-dimensional map. This is realized with a SLAM (Simultaneous Localization and Mapping) algorithm, which estimates the position and orientation while creating a map from them. Specifically, the position and orientation are calculated with the ICP algorithm so that the difference in depth between the depth maps acquired by the imaging unit 110 at multiple times is minimized, and a three-dimensional map is created with a point-based fusion algorithm that integrates the depth maps over time based on the calculated positions and orientations.
In step S820, the semantic information recognition unit 8120 segments the depth map and the three-dimensional map into regions and calculates, for each region, the number of stacked objects (n) and their height (h). The specific processing is described in order below.
First, the normal direction is calculated from the depth values of each pixel of the depth map and its surrounding pixels. Next, if the inner product with the normal directions of the surrounding pixels is larger than a predetermined value, the pixels are treated as the same object region and assigned a unique region identification label; the depth map is segmented in this way. The region identification labels are then propagated to the points of the three-dimensional map referenced by each pixel of the segmented depth map, thereby segmenting the three-dimensional map as well.
Next, bounding boxes are created by dividing the three-dimensional map at equal intervals in the X-Z directions (the movement plane of the AGV). Each bounding box is scanned vertically (in the Y-axis direction) starting from the ground, the number of distinct labels among the points it contains is counted, and the maximum height of the points above the ground (the X-Z plane) is calculated. The calculated number of regions n and maximum height h are stored in the three-dimensional map for each point.
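A hedged Python sketch of these per-cell statistics follows; the data layout, the 0.5 m cell size, and the function name are assumptions chosen only to illustrate counting labels and tracking the maximum height per X-Z cell.

    # Split the map into X-Z cells and record, for each cell, the number of
    # distinct region labels (n) and the maximum height above the floor (h).
    from collections import defaultdict

    def per_cell_stack_stats(points, labels, cell_size=0.5):
        """points: list of (x, y, z) with y = height; labels: region id per point."""
        cell_labels = defaultdict(set)
        cell_height = defaultdict(float)
        for (x, y, z), label in zip(points, labels):
            key = (int(x // cell_size), int(z // cell_size))
            cell_labels[key].add(label)
            cell_height[key] = max(cell_height[key], y)
        return {k: (len(cell_labels[k]), cell_height[k]) for k in cell_labels}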
In step S830, the control unit 8130 creates an occupancy map based on the three-dimensional map, updates the approach rejection values of the occupancy map from the number of stacked objects (n) and their height (h), and controls the AGV based on the updated occupancy map.
Specifically, the three-dimensional map created in step S810 is first projected onto the X-Z plane, the floor surface corresponding to the AGV's movement plane, to obtain a 2D occupancy map. Next, the approach rejection value of each grid cell of the occupancy map is updated using the distance between the projection of each point of the three-dimensional map onto the X-Z plane and the cell, together with the number of stacked objects (n) and the height (h) stored for that point. Let p_i be the X-Z coordinates of the i-th point P_i projected onto the X-Z plane, and let q_j be the coordinates of the j-th cell Q_j of the occupancy map. The approach rejection value is given by a function that increases with h and n and decreases with distance, for example as follows, where d_ij is the Euclidean distance between p_i and q_j.
[Equation for the approach rejection value: shown as an image in the original publication]
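Because the equation is only available as an image, the following is one plausible form consistent with the description (increasing in h and n, decaying with distance, bounded by 1), stated as an assumption rather than a reproduction of the published formula; alpha, beta, and sigma are assumed scaling constants.

    r_j \leftarrow \max\Bigl(r_j,\ \bigl(1 - e^{-(\alpha h_i + \beta n_i)}\bigr)\, e^{-d_{ij}^{2}/\sigma^{2}}\Bigr) \quad \text{for each point } i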
From the occupancy map and the target and current position and orientation determined as described above, the travel route of the AGV is decided so as to minimize the distance between the AGV and the target position and orientation while avoiding grid cells with high approach rejection values, and the control values are calculated. The control unit 8130 outputs the calculated control values to the actuator 120.
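As one hedged illustration of route selection consistent with this description (a real system would plan in continuous space and weigh the rejection values more finely), a breadth-first search over grid cells that simply skips blocked cells could look like the following; "grid" is assumed to expose is_blocked(ix, iy) and a values array, like the RejectionGrid sketch shown earlier.

    from collections import deque

    def plan_route(grid, start, goal):
        """start/goal: (ix, iy) grid indices; returns a cell path or None."""
        h, w = grid.values.shape
        prev = {start: None}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            if cur == goal:
                path = []
                while cur is not None:
                    path.append(cur)
                    cur = prev[cur]
                return path[::-1]
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nxt = (cur[0] + dx, cur[1] + dy)
                if (0 <= nxt[0] < w and 0 <= nxt[1] < h
                        and nxt not in prev and not grid.is_blocked(*nxt)):
                    prev[nxt] = cur
                    queue.append(nxt)
        return None  # no traversable route found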
In the eighth embodiment, the number and height of stacked objects around the AGV are estimated as semantic information, and the AGV is controlled so that it travels at a greater distance from those objects as these values increase. As a result, when there are shelves or pallets loaded with many articles, as in a distribution warehouse, the AGV travels at an extra distance from them, allowing it to be operated more safely.
<Modification 8-1>
The imaging unit 110 in this embodiment may be anything that can acquire an image and a depth map, such as a TOF camera or a stereo camera. A monocular camera such as an RGB camera or a monochrome camera, which acquires only images, may also be used. With a monocular camera, depth is needed for calculating the position and orientation and generating the occupancy map, but this embodiment can still be realized by calculating depth values from the camera's motion. The imaging unit 110 described in the following embodiments is configured in the same way as in this embodiment.
<Modification 8-2>
The approach rejection value of the occupancy map is not limited to the function described in this embodiment; any function may be used as long as it increases with the height and stacking number of the objects and decreases with distance. For example, it may be proportional to the height or stacking number, inversely proportional to the distance, or consider only one of the height and the stacking number. It may also be determined by referring to a list that stores occupancy values according to distance, object height, and stacking number. This list may be stored in advance in the external memory (H14), or it may be held by the mobile management system 13 and downloaded to the information processing apparatus 80 via the communication I/F (H17) as needed.
The occupancy map need not be the occupancy map described in this embodiment; any representation that can determine the presence or absence of objects in space may be used. For example, it may be represented as point clouds of a predetermined radius or approximated by some function. A three-dimensional occupancy map may be used instead of a two-dimensional one; it may be held, for example, in a 3D voxel space (X, Y, Z) or as a signed distance field such as a TSDF (Truncated Signed Distance Function).
In this embodiment, the control values were calculated using an occupancy map whose approach rejection values vary with the height and stacking number of the objects, but any approach that changes the control values based on the semantic information of the objects may be used. For example, the control values may be determined by referring to a list of control methods keyed by object height and stacking number; such a list specifies actions like turning left or decelerating when the number of objects and the stacking count satisfy certain conditions. The AGV may also be controlled by predetermined rules, for example calculating control values that rotate the AGV so that objects of a certain height or stacking count leave its field of view when they are found. Alternatively, the AGV may be controlled by fitting the measured values into a function, for example calculating control values that reduce the speed as the height and stacking count increase.
<Modification 8-3>
In this embodiment, the imaging unit 110 is mounted on the AGV, but it need not be, as long as it can capture the AGV's direction of travel. Specifically, a surveillance camera attached to the ceiling may be used as the imaging unit 110. In that case, the imaging unit 110 captures the AGV, and the position and orientation relative to the imaging unit 110 can be obtained, for example, with the ICP algorithm. A marker may also be attached to the top of the AGV so that the imaging unit 110 detects the marker to obtain the position and orientation. The imaging unit 110 may also detect objects on the AGV's travel route. One or more imaging units 110 may be used.
The position and orientation calculation unit 8110, the semantic information recognition unit 8120, and the control unit 8130 likewise need not be mounted on the AGV. For example, the control unit 8130 may be implemented in the mobile management system 13; this can be realized by having the control unit 8130 send and receive the necessary information via the communication I/F (H17). This removes the need to carry a large computer on the AGV, keeping the AGV light and allowing it to be operated efficiently.
<Modification 8-4>
In this embodiment, the semantic information was the stacking degree of objects. However, the semantic information recognition unit 8120 may recognize any semantic information from which control values for operating the AGV safely and efficiently can be calculated, and the control unit 8130 may calculate control values using that semantic information.
For example, the positions of structures may be recognized as semantic information. Specifically, the degree to which a door, a structure in the factory, is open can be used as semantic information: the AGV travels at a lower speed when the door is open or opening than when it is closed. It can also recognize that an object is suspended from a crane and be controlled so as not to pass underneath it. In this way, the AGV can be operated more safely.
Although the stacking degree was recognized in this embodiment, objects lined up close together may be recognized instead of, or in addition to, stacking. For example, multiple carts are recognized, and if the distance between them is smaller than a predetermined value, control values are calculated so that the AGV keeps at least a predetermined distance from them.
 Alternatively, another AGV and the package located above it may be recognized to determine that a package is loaded on that AGV. If the other AGV is carrying a package, the own AGV avoids it; otherwise the own AGV goes straight, and a control value that makes the other AGV yield may be transmitted to it via the mobile body management system 13. When a package is loaded on the other AGV, the size of the package may additionally be recognized and the control method determined according to that size. By determining in this way whether a package is loaded and how large it is, the AGV that carries no package or the smaller package performs the avoidance maneuver, which reduces the energy and time required for movement and allows the AGVs to be operated efficiently. Furthermore, the type of the object loaded on the other AGV may be recognized, and its value or fragility inferred from the type; if the other AGV carries something valuable or fragile, a control value may be calculated so that the own AGV yields. Recognizing the type of package in this way reduces damage to the packages and allows the AGVs to be operated safely.
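 A minimal sketch of the yield decision described above might look as follows; the function name, the returned action strings, and the tie-breaking rule are hypothetical.

```python
# Minimal sketch (assumption): deciding which AGV yields based on whether the
# other AGV carries a package and on the package sizes.

from typing import Optional

def decide_action(other_has_package: bool,
                  other_package_size_m: Optional[float],
                  own_package_size_m: Optional[float]) -> str:
    """Return 'avoid' if the own AGV should yield, or 'request_other_to_avoid'
    if a yield command should be sent via the mobile body management system 13."""
    if not other_has_package:
        # The unloaded AGV is cheaper to divert: ask it to yield.
        return "request_other_to_avoid"
    if own_package_size_m is None:
        # Own AGV is empty, the other is loaded: own AGV yields.
        return "avoid"
    # Both loaded: the AGV with the smaller package performs the avoidance.
    if own_package_size_m <= other_package_size_m:
        return "avoid"
    return "request_other_to_avoid"

if __name__ == "__main__":
    print(decide_action(other_has_package=True,
                        other_package_size_m=0.8,
                        own_package_size_m=0.3))   # -> "avoid"
```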
 The outer shape of an object appearing in the input image can also be used as semantic information. Specifically, if the detected object has sharp edges, the AGV travels at a distance from it so that it can be operated safely without being damaged. If the object is flat like a wall, traveling at a constant distance from it suppresses wobbling of the AGV and allows stable and efficient operation.
 The danger or fragility of the object itself may also be recognized as semantic information. For example, when the word "danger" or a skull mark printed on a cardboard box is recognized, the AGV is controlled to stay at least a predetermined distance away from that box. In this way the AGV can be operated more safely based on the danger or fragility of objects. In addition, the lighting state of a stack light indicating the operating status of an automatic machine in the factory may be recognized, and a control value calculated so that the AGV does not approach within a predetermined distance while the machine is operating. This prevents the AGV from being detected by the machine's safety sensor and stopping the machine, so the AGV can be operated efficiently.
 <Modification 8-5>
 In the present embodiment, a method of decelerating the AGV based on the semantic information has been described. However, the control method is not limited to this, and any method that allows the AGV to be operated efficiently and safely may be used. For example, the acceleration and deceleration parameters may be changed, enabling fine control such as whether to decelerate gently or sharply according to the semantic information. The avoidance parameters may also be changed, switching between avoiding the object closely, avoiding it by a wide margin, changing the route, or stopping. The frequency at which the AGV control value is calculated may also be increased or decreased: increasing the frequency enables finer control, while decreasing it results in more gradual control. By changing the control method based on the semantic information in this way, the AGV is operated more efficiently and safely.
 [Embodiment 9]
 In Embodiment 8, the AGV was controlled based on static semantic information about its surroundings at a single point in time, such as the stacking degree and shape of nearby objects and the state of structures. In Embodiment 9, the AGV is controlled based on temporal changes in such information. The semantic information in the present embodiment refers to the amount of movement of an object appearing in the image. In the present embodiment, a method is described in which, in addition to the movement amount of an object in the image, the type of the object is also recognized, and the control value of the AGV is calculated based on the results. Specifically, other AGVs and the packages loaded on them are recognized as the types of surrounding objects, the movement amounts of the other AGVs are recognized, and the control values of the own AGV or of the other AGVs are calculated based on these recognition results.
 The configuration of the information processing apparatus in the present embodiment is the same as that of the information processing apparatus 80 described in Embodiment 8 and shown in FIG. 14, so its description is omitted. The difference from Embodiment 8 is that the semantic information estimated by the semantic information recognition unit 8120 and input to the control unit 8130 consists of the detected object types, namely other AGVs and the packages loaded on them, and the movement amounts of the other AGVs.
 The processing procedure in the present embodiment is the same as that of the information processing apparatus 80 described in Embodiment 8 and shown in FIG. 15, so its description is omitted. The differences from Embodiment 8 are the processing contents of the semantic information estimation step S820 and the control value calculation step S830.
 In the semantic information estimation step S820, the semantic information recognition unit 8120 divides the depth map into regions and estimates the object type for each region. At the same time, the position and size of each object are estimated. Next, among the detected objects, the position of another AGV is compared with its past position to calculate the movement amount of that AGV. In the present embodiment, the movement amount of another AGV is the amount of change of its position and orientation relative to the own AGV.
 First, as described in Embodiment 6, the depth map is divided into regions based on the image and the depth map, and the object type of each region is identified.
 Next, the regions recognized as AGVs are extracted, and their relative positional relationships with the other regions are calculated. A region whose distance from an AGV is smaller than a predetermined threshold and which lies in the vertical (Y-axis) direction above the region recognized as that AGV is determined to be the package region loaded on that AGV. Furthermore, the size of the AGV and the size of the loaded package region are obtained, where the size is the length of the long side of the bounding box enclosing the region.
 Then, the regions recognized as AGVs at time t-1 and at time t are extracted, and their relative positional relationship is calculated using the ICP algorithm. The calculated relative positional relationship is the amount of change of the other AGV's position and orientation relative to the own AGV, and is hereinafter referred to as the movement amount of the other AGV.
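 The sketch below illustrates the idea of estimating the other AGV's movement amount from its segmented 3D points at times t-1 and t. The embodiment uses the ICP algorithm; for brevity this sketch performs a single Kabsch (SVD) alignment step and assumes point correspondences are already known, which a full ICP implementation would establish iteratively.

```python
# Minimal sketch (assumption): movement amount of another AGV from the 3D points
# of its segmented region at time t-1 and t, using one Kabsch alignment step.

import numpy as np

def rigid_transform(points_prev: np.ndarray, points_curr: np.ndarray):
    """Return rotation R and translation t mapping points_prev onto points_curr.
    Both arrays are (N, 3) with corresponding rows."""
    c_prev = points_prev.mean(axis=0)
    c_curr = points_curr.mean(axis=0)
    H = (points_prev - c_prev).T @ (points_curr - c_curr)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # reflection correction
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = c_curr - R @ c_prev
    return R, t

def movement_amount(points_prev, points_curr):
    """Translation norm and rotation angle of the other AGV relative to the own AGV."""
    R, t = rigid_transform(np.asarray(points_prev), np.asarray(points_curr))
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return float(np.linalg.norm(t)), float(angle)

if __name__ == "__main__":
    prev = np.random.rand(100, 3)
    curr = prev + np.array([0.3, 0.0, 0.0])   # the other AGV moved 0.3 m along X
    print(movement_amount(prev, curr))         # -> (~0.3, ~0.0)
```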
 In the control value calculation step S830, the control unit 8130 determines the action of the own AGV based on the movement amount of the other AGV calculated by the semantic information recognition unit 8120 in step S820 and on the sizes of the other AGV and the package loaded on it.
 First, it is determined from the movement amount whether the other AGV is approaching or moving away from the own AGV. If the other AGV is moving away, the control value is not changed. If it is approaching, a new control value is calculated based on the package sizes. Specifically, the size of the own AGV, which has been stored in the RAM (H13) in advance by input means (not shown), is compared with the sizes of the other AGV and its package. If the own AGV is smaller, the own AGV plans a route that avoids the other AGV. If the own AGV is larger, it decelerates and, via the communication interface H17, sends a signal to the mobile body management system 13 that causes the detected other AGV to perform the avoidance maneuver.
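 A minimal sketch of this step S830 decision, under the assumption that "approaching" is judged from the change in distance to the other AGV and that the control value is represented as a simple dictionary, might be:

```python
# Minimal sketch (assumption): S830 decision of Embodiment 9. The structure of
# the control value and the 50% deceleration factor are hypothetical.

def s830_control(own_size_m: float,
                 other_size_m: float,
                 other_package_size_m: float,
                 dist_prev_m: float,
                 dist_curr_m: float,
                 current_control: dict) -> dict:
    """Return the updated control value; may also flag a yield request to be
    sent to the mobile body management system 13."""
    if dist_curr_m >= dist_prev_m:
        return current_control                    # moving away: keep control value
    other_total = max(other_size_m, other_package_size_m)
    if own_size_m < other_total:
        # Own AGV is smaller: plan a route that avoids the other AGV.
        return {**current_control, "action": "replan_route_to_avoid"}
    # Own AGV is larger: slow down and ask the management system to divert the other AGV.
    return {**current_control,
            "speed": current_control.get("speed", 1.0) * 0.5,
            "send_yield_request_to_system13": True}

if __name__ == "__main__":
    print(s830_control(0.6, 0.9, 1.2, dist_prev_m=5.0, dist_curr_m=4.2,
                       current_control={"speed": 1.0}))
```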
 As described above, in Embodiment 9 the types of objects around the AGV are determined as semantic information, and for other AGVs the movement amount and the size of the loaded package are also estimated; the control value is calculated based on these results. If the other AGV or its package is larger than the own AGV, the own AGV avoids it; conversely, if it is smaller, the other AGV is made to yield. In this way, the smaller AGV carrying the smaller package gives way, so the AGVs can be operated efficiently in terms of time and energy.
 <Modification 9-1>
 In the present embodiment, AGVs were detected as the other mobile bodies. However, the detection is not limited to AGVs; any object may be detected as long as at least its position or orientation changes and the control of the AGV can be changed accordingly. Specifically, a forklift or a mobile robot may be detected as the mobile body. The movement amount of part of a device may also be recognized as semantic information, and the control of the AGV changed accordingly. For example, if the movable part of equipment such as an automatic machine, the arm of an arm robot, or a belt conveyor moves faster than a predetermined operating speed, the AGV may be controlled to keep a predetermined distance from it.
 <Modification 9-2>
 In the present embodiment, a control value was calculated such that either the own AGV or the other AGV performs the avoidance, but any control method that changes the control value according to the movement of the target object may be used. Specifically, the approach-rejection values of the occupancy map described in Embodiment 8 may be updated dynamically according to the magnitude of the movement amount, and the control value of the AGV may be calculated using the updated map.
 A control value may also be calculated such that, if another AGV is moving in the same direction as the own AGV, the own AGV follows it. When the own AGV reaches an intersection, a control value may be calculated that makes it wait until an AGV that arrived first from the crossing direction has passed, or, when the own AGV enters the intersection first, a control value may be calculated that makes the other AGV wait via the mobile body management system 13. Furthermore, when it is observed that another AGV is oscillating left and right with respect to its traveling direction, or that a package loaded on another AGV is vibrating relative to that AGV, a control value may be calculated so that the own AGV takes a route that does not come closer than a certain distance.
 A work process may further be recognized as semantic information from the movement of objects. For example, it may be recognized that a robot is loading a package onto another AGV; in that case, a control value may be calculated so that the own AGV searches for another route. As another example, in a distribution warehouse, the operation of placing packages onto a shipping pallet may be recognized, and a mobile body (a forklift) may be controlled to approach that pallet. In this way, the movement of targets is recognized as semantic information, and mobile bodies such as AGVs and forklifts are controlled accordingly to operate more efficiently.
 [Embodiment 10]
 Embodiment 10 describes a method of operating the AGV more safely based on the result of recognizing a person's work or role. In the present embodiment, a person and the type of object the person holds are recognized as semantic information, the person's work type is estimated from them, and the AGV is controlled according to the work type. As concrete examples, a person and the hand lift the person is pushing are detected, transport work is recognized as the work type, and the AGV is made to avoid the person; or a person and the welding machine the person holds are detected, welding work is recognized as the work type, and the route of the AGV is changed. In the present embodiment, it is assumed that the approach-rejection parameters that determine the control of the AGV for each combination of a person and a held object are given manually in advance. Specifically, the parameter is, for example, 0.4 when the person is carrying a large package, 0.6 when the person is pushing a cart, and 0.9 when the person is holding a welding machine. In the present embodiment, the mobile body management system 13 holds a parameter list containing these parameters. As necessary, the list is downloaded from the mobile body management system 13 to the information processing apparatus 80 via the communication I/F (H17) and held in the external memory (H14) so that it can be referred to.
 The configuration of the information processing apparatus in the present embodiment is the same as that of the information processing apparatus 80 described in Embodiment 8 and shown in FIG. 14, so its description is omitted. The difference from Embodiment 8 is the semantic information that the semantic information recognition unit 8120 estimates and inputs to the control unit 8130.
 The processing procedure in the present embodiment is the same as that of the information processing apparatus 80 described in Embodiment 8 and shown in FIG. 15, so its description is omitted. The differences from Embodiment 8 are the processing contents of the semantic information estimation step S820 and the control value calculation step S830.
 In the semantic information estimation step S820, the semantic information recognition unit 8120 recognizes a person and the type of object held by the person from the input image. The AGV is then controlled based on the parameter list, held in advance in the external memory H14, that records the AGV control rules corresponding to persons and the objects they hold.
 First, the region of a person's hand is detected from the visual information. For this detection, a method is used that recognizes each body part of a person and the connections between them and estimates the person's skeleton. The image coordinates corresponding to the position of the person's hand are then obtained.
 Next, the type of object held by the person is detected. For object detection, the neural network described in Embodiment 6, trained to segment an image into regions by object type, is used. Among the segmented regions, a region within a predetermined distance of the image coordinates of the person's hand is recognized as the object region held by the person, and the object type assigned to that region is obtained. The object type here is uniquely associated with an object ID held in the aforementioned list.
 Finally, the approach-rejection parameter is obtained by referring to the obtained object ID and the aforementioned control parameter list. The semantic information recognition unit 8120 inputs the obtained parameter to the control unit 8130.
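 A minimal sketch of this lookup, assuming the hand position and the segmented regions with object IDs are already available, might look as follows; the parameter values mirror the examples given earlier, while the region centers, IDs, distance threshold, and default value are hypothetical.

```python
# Minimal sketch (assumption): associating a detected hand position with a
# segmented object region and looking up its approach-rejection parameter.

import math

# Hypothetical parameter list downloaded from the mobile body management system 13.
APPROACH_REJECTION = {
    "large_package": 0.4,
    "hand_cart": 0.6,
    "welder": 0.9,
}

def held_object_parameter(hand_xy, regions, max_dist_px=60.0, default=0.2):
    """regions: list of (object_id, (cx, cy)) from semantic segmentation.
    Returns the approach-rejection parameter of the object nearest the hand,
    or a default when nothing lies within max_dist_px."""
    best_id, best_dist = None, float("inf")
    for object_id, (cx, cy) in regions:
        d = math.hypot(cx - hand_xy[0], cy - hand_xy[1])
        if d < best_dist:
            best_id, best_dist = object_id, d
    if best_id is None or best_dist > max_dist_px:
        return default
    return APPROACH_REJECTION.get(best_id, default)

if __name__ == "__main__":
    regions = [("welder", (310, 205)), ("hand_cart", (120, 400))]
    print(held_object_parameter((300, 210), regions))   # -> 0.9
```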
 In the control value calculation step S830, the control unit 8130 determines the action of the own AGV based on the approach-rejection parameters of the objects calculated by the semantic information recognition unit 8120 in step S820. The control value is calculated by updating the approach-rejection values of the occupancy map described in Embodiment 8 as follows; the update is a function that becomes larger as the approach-rejection parameter becomes larger and smaller as the distance increases.
 (Equation 5: published as an image, JPOXMLDOC01-appb-M000005)
 Here, Score_j is the value of the j-th grid cell, and s_i is the parameter representing the approach-rejection degree of the i-th object detected in step S820. Using the occupancy map defined in this way, the traveling route of the AGV is determined as described in Embodiment 8.
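 Equation 5 is reproduced only as an image in the publication; based on the surrounding description (a cell score that grows with the approach-rejection parameters s_i of the detected objects and decays with distance), one plausible form, given here purely as an assumption, is:

```latex
% A plausible reconstruction only -- the exact published formula (Equation 5)
% is an image and may differ. d_{ij} denotes the distance between grid cell j
% and the i-th detected object.
\[
  \mathrm{Score}_j \;=\; \min\!\Bigl(1,\; \sum_i \frac{s_i}{1 + d_{ij}}\Bigr)
\]
```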
 Furthermore, the control value is calculated so that the maximum speed v_max of the AGV is limited as follows, based on the approach-rejection value of the occupancy map cell the AGV is currently traveling through.
 (Equation 6: published as an image, JPOXMLDOC01-appb-M000006)
 Here, α is a parameter that adjusts the relationship between the approach-rejection value of the occupancy map and the speed, and β is the approach-rejection value of the occupancy map cell the AGV is currently passing through. v_max is calculated as a value that approaches 0 as the approach-rejection value of the occupancy map increases (approaches 1). The control unit 8130 outputs the control value calculated in this way to the actuator 130.
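 Equation 6 is likewise available only as an image; a form consistent with the description (v_max approaches 0 as β approaches 1, with α as an adjustment parameter) could, as an assumption, be:

```latex
% A plausible reconstruction only -- the exact published formula (Equation 6)
% is an image and may differ.
\[
  v_{\max} \;=\; \alpha\,(1 - \beta)
\]
```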
 In Embodiment 10, the type of a person's work is determined from the combination of the person and the object the person holds, and a parameter representing the approach-rejection degree is determined. A control value is then calculated so that the larger the approach-rejection degree, the more slowly the AGV moves and the farther it keeps from the person. The AGV is thus controlled at an appropriate distance according to the person's work, and can therefore be controlled more safely.
 <Modification>
 In Embodiment 10, the combination of a person and the object held by the person was recognized as semantic information, but any method that controls the AGV more safely by recognizing a state associated with a person may be used.
 A person's clothing may be recognized as semantic information. For example, in a factory it may be recognized that a person wearing work clothes is a worker and a person wearing a suit is a visitor. Using this recognition result, the AGV proceeds more slowly when passing near visitors, who are less accustomed to the movements of AGVs than workers are, and is thereby controlled more safely.
 A person's age may be recognized as semantic information. For example, an AGV that performs in-hospital delivery can be operated more safely by passing slowly and at a predetermined distance when it recognizes a child or an elderly person.
 A person's movement may be recognized as semantic information. For example, an AGV that carries luggage in a hotel can be operated more safely by calculating a control value that makes it pass at a predetermined distance when it recognizes that a person is repeatedly moving back and forth and side to side, as when staggering.
 Work may also be recognized from a person's movement. Specifically, when a worker in a factory is detected to be about to load a package onto the AGV, a control value may be calculated so that the AGV slowly approaches the worker and stops until the loading is finished. This eliminates the need for the worker to walk to the AGV's stopping position before loading, so the work can be performed efficiently.
 The number of people may be recognized as semantic information. Specifically, when more than a predetermined number of people are recognized on the traveling route of the AGV, the route is changed. This avoids the possibility of contact with a person that could occur if the AGV were to weave its way through a crowd, so the AGV can be operated more safely.
 [Embodiment 11]
 A UI that can be applied to any of Embodiments 8 to 10 will be described. In addition to presenting the visual information acquired by the imaging unit 110 and the position and orientation, map information, and control values calculated by the position and orientation calculation unit 8110, the UI further displays information such as the semantic information described in Embodiments 8 to 10 and the values assigned to the occupancy map.
 The configuration of the apparatus in Embodiment 11 is the same as the configuration of the information processing apparatus 80 described in Embodiment 8 and shown in FIG. 2, so its description is omitted. The configuration of the device for display is the same as that described in Embodiment 7 and is likewise omitted.
 FIG. 13 shows a GUI 200, which is an example of the display information presented by the display device in the present embodiment. G210 is a window for presenting the visual information acquired by the imaging unit 110 and the semantic information recognized by the semantic information recognition unit 8120. G220 is a window for presenting the approach-rejection degree used for the AGV navigation described in Embodiment 8. G230 is a window for presenting the 2D occupancy map. G240 is a window for presenting a GUI for operating the AGV manually, the values calculated by the position and orientation calculation unit 8110, the semantic information recognition unit 8120, and the control unit 8130, and the AGV operation information.
 G210 shows an example of presenting, as the semantic information detected by the semantic information recognition unit 8120, a plurality of objects, their relative distances, and the approach-rejection values. G211 is the bounding box of a detected object. In the present embodiment, another AGV and its package are detected and the bounding box surrounding them is displayed with a dotted line. Although a single bounding box is presented here for the combined objects, a bounding box may instead be drawn for each detected object. The bounding box may be drawn in any way that indicates the position of the detected object, with a dotted line or a solid line, or by superimposing a semi-transparent mask. G212 is a pop-up that presents the detected semantic information: the detected object types, their distances, and the approach-rejection values. By superimposing the recognized semantic information on the visual information in this way, the user can intuitively grasp the relationship between the visual information and the semantic information.
 G220 is an example in which the approach-rejection degree of the AGV calculated by the control unit 8130 is superimposed on the visual information acquired by the imaging unit 110. In G221, a darker color is superimposed where the approach-rejection degree is higher. By presenting the approach-rejection degree superimposed on the visual information in this way, the user can intuitively associate the visual information with the approach-rejection values. The color, density, or shape of G221 may be varied so that the user can grasp the approach-rejection degree more easily.
 G230 is an example of presenting the occupancy map calculated by the control unit 8130 and the semantic information recognized by the semantic information recognition unit 8120. G231 visualizes the approach-rejection values of the occupancy map so that cells with larger values are drawn darker and cells with smaller values lighter. G232 further presents the position of a structure as semantic information recognized by the semantic information recognition unit 8120; in the present embodiment, the result of recognizing that a factory door is open is presented. G233 further presents the movement amounts of surrounding objects as semantic information recognized by the semantic information recognition unit 8120; in the present embodiment, the moving direction and speed of each object are presented. By presenting the occupancy map, its values, and the recognition results of the semantic information in this way, the user can easily relate them to one another and grasp the internal state of the AGV. Presenting the occupancy map in this way also allows the user to easily follow the route generation process of the control unit 8130.
 G240 shows an example of presenting a GUI for operating the AGV manually, the values calculated by the position and orientation calculation unit 8110, the semantic information recognition unit 8120, and the control unit 8130, and the AGV operation information. G241 is a GUI for settings such as which semantic information the semantic information recognition unit 8120 recognizes and whether to display the recognition results, for example radio buttons that toggle each item on and off. G242 is a GUI for adjusting the approach-rejection distance calculated by the control unit 8130 and the parameters used to calculate the control value, for example slide bars or numeric input forms.
 <Modification>
 The GUI described in the present embodiment is an example, and any visualization method may be used as long as it presents information such as the semantic information calculated by the semantic information recognition unit 8120 and the approach-rejection values of the occupancy map calculated by the control unit 8130 and allows the internal state of the AGV to be grasped. For example, the display information can be changed by changing colors, switching line thickness or between solid, broken, and double lines, scaling, or hiding unnecessary information. Changing the way the display information is visualized in this manner allows the user to understand it more intuitively.
 The present invention can also be realized by supplying a program that implements one or more functions of the above-described embodiments to a system or apparatus via a network or a storage medium and having one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.
 The present invention is not limited to the above embodiments, and various changes and modifications can be made without departing from the spirit and scope of the present invention. Accordingly, the following claims are attached to make the scope of the present invention public.
 This application claims priority based on Japanese Patent Application No. 2018-003817 filed on January 12, 2018, the entire contents of which are incorporated herein by reference.

Claims (20)

  1.  An information processing apparatus comprising:
      an input unit that receives input of image information acquired by an imaging unit mounted on a mobile body, each light receiving section on the imaging element of the imaging unit being composed of two or more light receiving elements;
      a holding unit that holds map information;
      an acquisition unit that acquires a position and orientation of the imaging unit based on the image information and the map information; and
      a control unit that obtains a control value for controlling movement of the mobile body based on the position and orientation acquired by the acquisition unit.
  2.  The information processing apparatus according to claim 1, wherein the image information is information captured by the imaging unit, and the image information is a depth map generated by the imaging unit by selectively using the light receiving elements.
  3.  The information processing apparatus according to claim 1, wherein the image information is information captured by the imaging unit, and the image information is a three-dimensional point cloud that holds three-dimensional position information in space and is generated by the imaging unit by selectively using the light receiving elements.
  4.  The information processing apparatus according to claim 2 or 3, wherein the image information further includes an image generated by the imaging unit by selectively using the light receiving elements.
  5.  The information processing apparatus according to any one of claims 1 to 4, wherein the acquisition unit updates the image information using the image information and image information acquired at a second time preceding the first time at which the imaging unit acquired the image information, and acquires the position and orientation of the imaging unit based on the updated image information and the map information.
  6.  The information processing apparatus according to any one of claims 1 to 5, wherein the control unit further calculates a control value for controlling a projection device that projects pattern light.
  7.  The information processing apparatus according to any one of claims 1 to 6, wherein the input unit further receives input of three-dimensional information measured by a three-dimensional measuring device that acquires three-dimensional information representing three-dimensional positions in space, and the acquisition unit further updates the image information based on the image information and the three-dimensional information and acquires the position and orientation of the imaging unit based on the updated image information and the map information.
  8.  The information processing apparatus according to any one of claims 1 to 7, wherein the acquisition unit further acquires feature information of an object from one or both of the image information and the map information, and the control unit calculates a control value for controlling the mobile body based on the feature information of the object.
  9.  The information processing apparatus according to claim 8, wherein the control unit further calculates, based on the feature information of the object, a control value for controlling the mobile body such that a predetermined object is located at a predetermined point in the image information.
  10.  The information processing apparatus according to any one of claims 1 to 9, wherein the acquisition unit further divides the image information into regions by semantic segmentation.
  11.  The information processing apparatus according to claim 10, wherein the acquisition unit further generates and updates the map information based on a result of the region division.
  12.  The information processing apparatus according to claim 10 or 11, wherein the control unit further calculates a control value for controlling the mobile body based on a result of the region division.
  13.  The information processing apparatus according to any one of claims 1 to 12, wherein the control unit further calculates an adjustment value for adjusting a parameter of the imaging unit based on at least one of the image information, the map information, the acquired position and orientation, and the control value.
  14.  The information processing apparatus according to claim 13, wherein the adjustment value is a focus value of the imaging unit.
  15.  The information processing apparatus according to claim 13, wherein the adjustment value is a zoom value of the imaging unit.
  16.  The information processing apparatus according to any one of claims 1 to 15, wherein an optical device of the imaging unit is exchangeable, and the input unit further acquires parameters of the exchanged optical device.
  17.  The information processing apparatus according to any one of claims 1 to 16, further comprising a display information generation unit that generates display information based on at least one of the image information, the map information, the position and orientation, and the control value.
  18.  An information processing method comprising:
      an input step of receiving input of image information acquired by an imaging unit mounted on a mobile body, each light receiving section on the imaging element being composed of two or more light receiving elements;
      a step of holding map information in a holding unit;
      an acquisition step of acquiring a position and orientation of the imaging unit based on the image information and the map information; and
      a control step of calculating a control value for controlling movement of the mobile body based on the acquired position and orientation.
  19.  An information processing system comprising:
      an imaging unit in which each light receiving section on the imaging element is composed of two or more light receiving elements;
      an input unit that receives input of image information acquired by the imaging unit;
      a holding unit that holds map information;
      an acquisition unit that acquires a position and orientation of the imaging unit based on the image information and the map information; and
      a control unit that obtains a control value for controlling movement of the mobile body based on the position and orientation acquired by the acquisition unit.
  20.  A mobile body comprising:
      an imaging unit in which each light receiving section on the imaging element is composed of two or more light receiving elements;
      an input unit that receives input of image information acquired by the imaging unit;
      a holding unit that holds map information;
      an acquisition unit that acquires a position and orientation of the imaging unit based on the image information and the map information;
      a control unit that obtains a control value for controlling movement of the mobile body based on the position and orientation acquired by the acquisition unit; and
      an actuator that controls movement of the mobile body with the control value.
PCT/JP2018/047022 2018-01-12 2018-12-20 Information processing device, information processing method, program, and system WO2019138834A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2018003817 2018-01-12
JP2018-003817 2018-01-12

Publications (1)

Publication Number Publication Date
WO2019138834A1 true WO2019138834A1 (en) 2019-07-18

Family

ID=67218687

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/047022 WO2019138834A1 (en) 2018-01-12 2018-12-20 Information processing device, information processing method, program, and system

Country Status (2)

Country Link
JP (1) JP7341652B2 (en)
WO (1) WO2019138834A1 (en)


Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7221183B2 (en) * 2019-09-20 2023-02-13 株式会社日立製作所 Machine learning method, forklift control method, and machine learning device
CN114556252A (en) * 2019-10-10 2022-05-27 索尼集团公司 Information processing apparatus, information processing method, and program
SG10201913873QA (en) * 2019-12-30 2021-07-29 Singpilot Pte Ltd Sequential Mapping And Localization (SMAL) For Navigation
US20230028976A1 (en) * 2020-01-16 2023-01-26 Sony Group Corporation Display apparatus, image generation method, and program
JP7429143B2 (en) 2020-03-30 2024-02-07 本田技研工業株式会社 Mobile object control device, mobile object control method, and program
JP6849256B1 (en) * 2020-05-08 2021-03-24 シンメトリー・ディメンションズ・インク 3D model construction system and 3D model construction method
GB2598758B (en) * 2020-09-10 2023-03-29 Toshiba Kk Task performing agent systems and methods
US20240118707A1 (en) * 2020-12-15 2024-04-11 Nec Corporation Information processing apparatus, moving body control system, control method, and non-transitory computer-readable medium
US20240054674A1 (en) * 2020-12-25 2024-02-15 Nec Corporation System, information processing apparatus, method, and computer-readable medium
JP2022175900A (en) * 2021-05-14 2022-11-25 ソニーグループ株式会社 Information processing device, information processing method, and program
JP7447060B2 (en) * 2021-07-29 2024-03-11 キヤノン株式会社 Information processing device, information processing method, autonomous robot device, and computer program
JP2024007754A (en) * 2022-07-06 2024-01-19 キヤノン株式会社 Information processing system, information processing device, terminal device and control method of information processing system
WO2024057800A1 (en) * 2022-09-12 2024-03-21 株式会社島津製作所 Method for controlling mobile object, transport device, and work system

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010122904A (en) * 2008-11-19 2010-06-03 Hitachi Ltd Autonomous mobile robot
JP2011137697A (en) * 2009-12-28 2011-07-14 Canon Inc Illumination apparatus, and measuring system using the illumination system
JP2011237215A (en) * 2010-05-07 2011-11-24 Nikon Corp Depth map output device
JP2016170060A (en) * 2015-03-13 2016-09-23 三菱電機株式会社 Facility information display system, mobile terminal, server and facility information display method
JP2016192028A (en) * 2015-03-31 2016-11-10 株式会社デンソー Automatic travel control device and automatic travel control system
WO2017094317A1 (en) * 2015-12-02 2017-06-08 ソニー株式会社 Control apparatus, control method, and program
JP2017122993A (en) * 2016-01-05 2017-07-13 キヤノン株式会社 Image processor, image processing method and program
JP2017157803A (en) * 2016-03-04 2017-09-07 キヤノン株式会社 Imaging apparatus
JP2017156162A (en) * 2016-02-29 2017-09-07 キヤノン株式会社 Information processing device, information processing method, and program
JP2017215525A (en) * 2016-06-01 2017-12-07 キヤノン株式会社 Imaging device and method for controlling the same, program, and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6445808B2 (en) 2014-08-26 2018-12-26 三菱重工業株式会社 Image display system
JP6657034B2 (en) 2015-07-29 2020-03-04 ヤマハ発動機株式会社 Abnormal image detection device, image processing system provided with abnormal image detection device, and vehicle equipped with image processing system


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220282987A1 (en) * 2019-08-08 2022-09-08 Sony Group Corporation Information processing system, information processing device, and information processing method
JP2021060849A (en) * 2019-10-08 2021-04-15 国立大学法人静岡大学 Autonomous mobile robot and control program for autonomous mobile robot
JP7221839B2 (en) 2019-10-08 2023-02-14 国立大学法人静岡大学 Autonomous Mobile Robot and Control Program for Autonomous Mobile Robot
CN110826604A (en) * 2019-10-24 2020-02-21 西南交通大学 Material sorting method based on deep learning
CN110926425A (en) * 2019-11-01 2020-03-27 宁波大学 Navigation logistics transportation system of 3D structured light camera and control method thereof
US20210217194A1 (en) * 2020-01-13 2021-07-15 Samsung Electronics Co., Ltd. Method and apparatus with object information estimation and virtual object generation
WO2022113771A1 (en) * 2020-11-26 2022-06-02 ソニーグループ株式会社 Autonomous moving body, information processing device, information processing method, and program
WO2022254609A1 (en) * 2021-06-02 2022-12-08 国立大学法人東北大学 Information processing device, moving body, information processing method, and program
US20230082486A1 (en) * 2021-09-13 2023-03-16 Southwest Research Institute Obstacle Detection and Avoidance System for Autonomous Aircraft and Other Autonomous Vehicles
EP4261181A1 (en) * 2022-04-11 2023-10-18 STILL GmbH Surroundings monitoring system and industrial truck with surroundings monitoring system

Also Published As

Publication number Publication date
JP7341652B2 (en) 2023-09-11
JP2019125345A (en) 2019-07-25

Similar Documents

Publication Publication Date Title
WO2019138834A1 (en) Information processing device, information processing method, program, and system
US11704812B2 (en) Methods and system for multi-target tracking
US11592845B2 (en) Image space motion planning of an autonomous vehicle
US11861892B2 (en) Object tracking by an unmanned aerial vehicle using visual sensors
US11573574B2 (en) Information processing apparatus, information processing method, information processing system, and storage medium
US10809081B1 (en) User interface and augmented reality for identifying vehicles and persons
US10837788B1 (en) Techniques for identifying vehicles and persons
US20210133996A1 (en) Techniques for motion-based automatic image capture
US20210138654A1 (en) Robot and method for controlling the same
KR102597216B1 (en) Guidance robot for airport and method thereof
WO2017071143A1 (en) Systems and methods for uav path planning and control
JP7479799B2 (en) Information processing device, information processing method, program, and system
CN110609562B (en) Image information acquisition method and device
CN108280853A (en) Vehicle-mounted vision positioning method, device and computer readable storage medium
US11748998B1 (en) Three-dimensional object estimation using two-dimensional annotations
Spitzer et al. Fast and agile vision-based flight with teleoperation and collision avoidance on a multirotor
US11334094B2 (en) Method for maintaining stability of mobile robot and mobile robot thereof
JP6609588B2 (en) Autonomous mobility system and autonomous mobility control method
Sheikh et al. Stereo vision-based optimal path planning with stochastic maps for mobile robot navigation
US20240085916A1 (en) Systems and methods for robotic detection of escalators and moving walkways
US11846514B1 (en) User interface and augmented reality for representing vehicles and persons
JP2021099384A (en) Information processing apparatus, information processing method, and program
JP2021099383A (en) Information processing apparatus, information processing method, and program
EP4024155B1 (en) Method, system and computer program product of control of unmanned aerial vehicles
Gujarathi et al. Design and Development of Autonomous Delivery Robot

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18900349

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18900349

Country of ref document: EP

Kind code of ref document: A1