WO2021140577A1 - Robot control system - Google Patents

Robot control system

Info

Publication number
WO2021140577A1
WO2021140577A1 (PCT/JP2020/000203)
Authority
WO
WIPO (PCT)
Prior art keywords
robot
destination
time
layer
work
Prior art date
Application number
PCT/JP2020/000203
Other languages
French (fr)
Japanese (ja)
Inventor
Toshiyuki Tarui (樽井 俊行)
Original Assignee
Wellvill Inc. (ウェルヴィル株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wellvill Inc. (ウェルヴィル株式会社)
Priority to PCT/JP2020/000203
Publication of WO2021140577A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling

Definitions

  • The present invention relates to a robot control system.
  • However, the system described in Patent Document 1 does not improve efficiency when a plurality of robots operate.
  • The present invention has been made in view of such a background, and an object of the present invention is to provide a technique capable of effectively controlling a plurality of robots.
  • The main invention of the present invention for solving the above problems is a system for controlling a plurality of robots, comprising: a work storage unit that stores a plurality of tasks to be performed by the robots; an allocation processing unit that assigns each of the tasks to a robot; a transmission unit that transmits the assigned work to the control device of the robot; and a status acquisition unit that acquires the operation status of the robot, wherein the allocation processing unit changes the allocation destination of the work according to the operation status.
  • According to the present invention, efficient reinforcement learning can be performed for robot control.
  • The present invention includes, for example, the following configurations.
  • [Item 1] A system for controlling a plurality of robots, comprising: a work storage unit that stores a plurality of tasks to be performed by the robots; an allocation processing unit that assigns each of the tasks to a robot; a transmission unit that transmits the assigned work to the control device of the robot; and a status acquisition unit that acquires the operating status of the robot, wherein the allocation processing unit changes the allocation destination of the work according to the operating status.
  • A robot control system characterized by the above.
  • The allocation processing unit allocates one task to one or a plurality of robots according to a first work amount required for the task and a second work amount that each robot can perform.
  • A robot control system characterized by the above.
  • The allocation processing unit assigns the tasks to the robots so that the cumulative amount of work assigned to each of the plurality of robots over a predetermined period is smoothed.
  • A robot control system characterized by the above.
  • The status acquisition unit acquires information indicating the operation status from the control device of the robot and from a sensor independent of the robot.
  • The present invention can also have the following configurations.
  • [Item 1] A system that controls robots.
  • [Item 2] The robot control system according to Item 1, wherein the robot comprises one or more sensors, the control unit transmits a control signal related to the operation of the robot to the simulator, the simulator simulates the operation of a virtual robot in response to the control signal, simulates measurement by a virtual sensor, and transmits the measurement information from the virtual sensor to the control unit, and the control unit performs the reinforcement learning according to the measurement information.
  • The robot control system according to Item 1, wherein the control unit comprises a request reception layer that accepts instructions to the robot, a work pooling layer that supplies the instructions as input values for the reinforcement learning, and an AI layer that performs the reinforcement learning.
  • FIG. 1 is a diagram showing an overall image of a system configuration according to the robot control system of the present embodiment.
  • The robot control system of this embodiment is configured in five layers.
  • The first layer makes the external connection.
  • The first layer can receive instructions from the user by, for example, natural language processing.
  • The second layer is the management layer, which manages multiple robots together.
  • The second layer can act as a scheduler for overall optimization.
  • The third layer is the control layer of the robot and controls each robot.
  • The third layer can perform individual optimization for one robot, such as route search.
  • The fourth layer is the execution layer, the layer on which the robot operates.
  • The fourth layer can virtually operate the robot by simulation.
  • The fifth layer is an IoT layer.
  • The fifth layer manages measurement data from the various sensors and the like required by the autonomous robots.
  • FIG. 2 is a diagram showing a system configuration example of the robot control system of the present embodiment.
  • the second layer robot scheduler includes an MDM server, an AP server, a DB server, and an ESB server.
  • the robot session control of the third layer includes an ESB server, a DB server, an AP server, a Cache server, and a robot control AI process.
  • the fourth layer robot simulator includes synchronous control, a communication adapter, map information, an ML agent, and Unity (registered trademark).
  • the real robot operating environment of the fifth layer includes the API of the real robot and the SDK of the communication adapter.
  • FIG. 3 is a diagram illustrating a functional outline of the second layer in the robot control system of the present embodiment.
  • The second layer is configured as an independent domain.
  • Request data can be input as text from the second-layer ESB.
  • A robot NO need not be specified in the request; the robot is automatically assigned and determined in the second layer.
  • The content of the request is expanded (queued).
  • The content of the request is to go from the waiting area to the start position and then return to the waiting area via a plurality of destinations. While executing this request, the second layer receives an arrival result report from the third layer each time the robot passes the start position or a destination.
  • For each destination, the second layer issues an instruction to go to the next destination (or back to the waiting area). The robot can be released when it finally arrives at the waiting area.
  • FIG. 4 is a diagram illustrating work queue management.
  • Transport robots are used in warehouses, factories, or on the premises of various buildings. In this embodiment, it is assumed that 100 or more robots act automatically.
  • The premise is that a transfer instruction from an external business system triggers the robot's action. Assuming a business scene, the requirement is that luggage exists on a shelf or the like, and an accumulated amount of luggage is to be moved from that shelf to an appropriate place elsewhere on the premises.
  • Collected luggage is transported from the place of occurrence (start position) to the destination, and the destination may be a single location or multiple locations. Therefore, the second layer is required to satisfy the following requirements in its role of accurately conveying the contents of the work to the robot and smoothly achieving the purpose.
  • The robot master is automatically generated by specifying the number-of-robots parameter. The number of robots should match the number of robots defined in the fourth layer. The robot NO is generated by combining a fixed name part and a variable numeric part. - Generation of the standby position, start position, and destination masters: the coordinate axes of each work location are taken from the map information of the entire building and are used when calculating Euclidean distances. - Registration of robot type settings (allowable weight, volume): robots have different sizes and are divided into several types.
  • One type of robot has a size of 1 m in length and 0.8 m in width, but it will be possible to handle several types of robots with different sizes.
  • FIG. 5 is a diagram illustrating resource allocation to the robot. As shown in FIG. 5, the relationship between the robot assigned in the plan and the order is transferred when an actual result is generated and a robot that has already been released becomes available. By repeating such allocation changes, differences arise between the planned allocation and the actual allocation. The transfer journal in that case will be described later.
  • (1) Robots that can be assigned in the same time zone are assigned at a future time. (2) If the requested luggage is regarded as one continuous unit, it must not be divided and delivered to the destination separately. There is no concept of "unallocated": allocation is always made in a time zone in which it is possible. (3) If, while multiple allocations are being made at robot-allocation time, the same start position or the same destination exists in the work under the most recently allocated order and the transit times are close (within a certain time difference), the destination order must be exchanged. This is calculated from the relationship between the distance and time of the destinations. (4) When robot allocation is performed, it is averaged so that a robot that has just been used is not used continuously, because of charging considerations.
  • The type of robot is selected so that the load can be carried by one robot as far as possible, according to the amount of luggage to be loaded. If a single robot cannot be selected, as in (2), the load can be divided and assigned to a plurality of robots. (For the time being there is only one type, 1 m in length and 0.8 m in width, but allocation will still work even if the number of types increases.)
  • The allocation algorithm selects the robot type with the smallest robot tolerance (weight, volume) that fits the load. If one unit cannot accommodate it, the rank is raised and judged again; see the sketch below.
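  • A minimal Python sketch of this selection rule, assuming hypothetical RobotType records with weight and volume tolerances (the names select_robots, max_weight, and max_volume are illustrative, not from the specification):

```python
import math
from dataclasses import dataclass

# Hypothetical robot-type records; names and tolerances are illustrative.
@dataclass
class RobotType:
    name: str
    max_weight: float  # allowable weight (kg)
    max_volume: float  # allowable volume (m^3)

def select_robots(types, total_weight, total_volume):
    """Pick the robot type with the smallest tolerance that still fits the
    whole load; if none fits, raise the rank and split the load across
    several robots of the largest type."""
    ranked = sorted(types, key=lambda t: (t.max_weight, t.max_volume))
    for t in ranked:
        if total_weight <= t.max_weight and total_volume <= t.max_volume:
            return [t]  # one robot of the smallest sufficient type
    largest = ranked[-1]
    count = max(math.ceil(total_weight / largest.max_weight),
                math.ceil(total_volume / largest.max_volume))
    return [largest] * count
```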
  • When a robot breaks down, it is put into a faulty state from the management screen, and an order is generated to move another standby robot to the failure position by transferring the work. After arriving at the failure position, the order is continued from the middle. (1) The failed robot NO is notified from the session control. (2) After the failed robot's work is transferred to a normal robot, the new robot NO notifies the session control of the destination together with the failed robot NO. (3) Session control moves the new robot up to the coordinate axes of the failed robot.
  • The notification of arrival is returned from the third layer to the second layer. Since the robot control AI recognizes destinations by name, a new function is added for the request to move to a robot NO. (4) During that time, the robot is stopped for a certain period (10 seconds) because there is no next instruction. (5) The second layer then instructs the robot control AI of the original destination. From here, the robot returns to the normal course.
  • Occupancy and release of robots are scheduled by the planned expansion of performance requests.
  • (1) Robot occupancy: a robot is not in an occupied state merely by being assigned. It becomes occupied (active) when the work is actually taken out of the queue and sent to the third layer.
  • (2) Unit of instruction to the third layer: instructions to the third layer are given in units of one section, such as from the standby position to the start position, from the start position to destination 1, and from destination 1 to destination 2.
  • The next instruction is issued at the timing when the robot arrives at a destination, the third layer notifies the second layer, and the second layer sends the next request.
  • Allocation within work is at the planning stage: when a plan request is received, robot allocation is performed at that point. For robots allocated at this stage, it must be calculated and predicted at what point in the future they will be occupied and at what point they will be released. The calculation method is shown below.
  • FIG. 7 is a diagram for explaining the distance from the start point to the arrival point.
  • The distance between the two destinations is C + B, not A. Therefore, the occupancy time is the cumulative total of the travel time according to the number of destinations and the work time at each destination, including the waiting area.
  • N: number of intervals between destinations
  • W: average working time (a master value determined by robot type; the standard is 60 seconds, adjusted by the size ratio)
  • M: speed in m/sec (can be set as a parameter; the default can be 2 m/sec)
  • F: next occupancy interval time in seconds (the interval from the release of a robot to its next occupancy; can be set as a parameter, default 1.2 seconds)
  • h: margin ratio (can be set as a parameter, default 1.2)
  • The occupancy time T is obtained from these parameters; the equation itself appears only as a figure in the original publication.
  • Resource occupancy start time: the occupancy start time is the previous release time of the same robot + F seconds.
  • Resource release time: release time = occupancy start time + occupancy time T; a sketch of an assumed form of T follows below.
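  • Since the exact equation for T is reproduced only as a figure, the sketch below assumes a plausible form consistent with the parameter definitions above: cumulative travel time plus per-destination work time, scaled by the margin h. All function names are illustrative, and the formula is a reconstruction, not a quotation.

```python
def occupancy_time(distances_m, n_intervals, avg_work_s=60.0,
                   speed_mps=2.0, h=1.2):
    """Assumed form of the occupancy time T: cumulative travel time over
    the destination intervals plus the work time at each destination,
    scaled by the margin h."""
    travel_s = sum(d / speed_mps for d in distances_m)
    return h * (travel_s + n_intervals * avg_work_s)

def occupancy_window(prev_release_s, distances_m, n_intervals, f_s=1.2):
    """Occupancy start = previous release of the same robot + F seconds;
    release = start + occupancy time T."""
    start = prev_release_s + f_s
    return start, start + occupancy_time(distances_m, n_intervals)
```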
  • FIG. 8 is a diagram showing an example of a schedule in the robot occupied state and the open state.
  • Considerations: equalization of the robot utilization rate; appropriate allocation of robots based on the development of plans and results; grasp of robot status (occupied, released, queued); the relationship between the amount of work and the number of robots; consideration of charging time; recognition of robot failure status.
  • The premise is that the content of a request from the WMS is first expanded as a plan for picking work and the like, and that transport work arises once results come in. Since the request has already been issued at the planning stage, the allocation is easier to set up in advance.
  • the loadable volume and weight of the robot differ depending on the robot size, but for the robot assigned this time, the loading method is calculated from the total amount of luggage.
  • Within the weight limit, the following are given: 1. the volume (length, width, height) and weight of the individual packages, and the arrangement of their destinations; 2. the total number of packages; 3. the overall volume; 4. the total weight.
  • The result is set in the order. At the same time, it is also registered in the session-control robot object.
  • The order contents of the second layer are saved, but the contents of the robot object are cleared when the robot is released. How the luggage is loaded into each robot, and what is loaded, can be obtained by inquiring of the robot.
  • The inquiry result can be displayed on the Web screen by searching the order contents in the second layer.
  • FIG. 9 is a diagram illustrating the generation of timing for retrieving work from the queue.
  • The timing of removing work from the queue can be controlled at 10-second intervals.
  • The work in the queue is work that has already been expanded. Inside the queue there are two states: one in which a robot cannot yet be assigned, and one in which the assigned work is ready.
  • FIG. 10 is a diagram showing a conceptual model of the robot scheduler.
  • FIG. 11 is a diagram showing a conceptual model of the robot scheduler.
  • The following table explains the account journals used when dealing with failures.
  • The first half is posted under the original order number. If the robot is on its way back, the latter half is not needed. In the latter half, the original order is reused and the order is reassigned to Robot002; thereafter the flow belongs to Robot002. The request number remains the same. After associating the robot with the order, the state is set to active or in transit, and the work continues under the current order NO. Note: as the alternate robot, select a robot that is on standby and has no queue.
  • The robot scheduler of the second layer is provided with a ledger for managing the above accounts, and can manage the queue, the operation of the robots, and the standby status through this ledger.
  • The robot that was operating under the transport order broke down, so an alternative robot is assigned and takes over. The contents of the transport order are copied to a transfer order and a journal is generated. It is assumed that the robot had progressed partway through its destinations and broke down and stopped in the middle of the road. In that case, the journals for the From/To sections already completed are not necessary for the alternative robot; even in the middle of the road, the journal starts from that point. Since this From is the starting point, all the necessary journals are generated from there.
  • The starting point is the position of the failed robot. Therefore, the instruction from the standby position to the start position becomes an instruction from the standby position to the position of the failed robot.
  • FIG. 12 is a diagram illustrating an operation when the robot fails.
  • In failure pattern 1, the robot failed while destination 2 still remained, so the work is transferred to an alternative robot. Robot 2 is therefore moved to the position where Robot 1 failed.
  • In failure pattern 2, although the failure occurred while returning to the standby area, all the work at the destinations had been completed, so no transfer to an alternative robot is performed.
  • FIG. 13 is a diagram illustrating the time-series operation related to a transfer order. As shown in FIG. 13, when an actual result is generated, the plan is changed and the work is transferred to a robot that can be assigned earlier. From the usable times of all robots obtained by the getBalance function, a robot that has no next plan can be selected, and the earliest usable robot among them determined.
  • For example, for Robot1, the completion of order 1 has been delayed, so the work of order 2, planned next, must be transferred to another robot.
  • Robot2 finishes order 3 earlier than planned, but since order 4 is already assigned to Robot2, order 2 of Robot1 cannot be inserted.
  • Robot3 has no order after the end of order 5. Therefore, order 2 can be assigned after the completion of order 5 on Robot3.
  • The following table explains an example of account journal entries between the order status and the robot status.
  • Accounts such as Robot001 are journalized by +1 only for the time zone existing in the queue at allocation time; once occupied, the entry disappears from the queue. After that, processing proceeds only by transfers in the state account. The balance of a robot account being +1 or more can be read as the number of queued items; it becomes zero when the robot's queue runs out. In other words, the balance of the robot account is the number of waiting cases. Everything else is managed by the state account.
  • For a plan request, a robot is allocated in the state at that time, the plan is expanded, and it enters the queue. Plan expansion is performed in the same way as normal processing, but it is not executed; it simply enters the queue. The plan request is left as it is and used only as a provisional display. Provisional means that when a performance request arrives, the plan request is no longer used even if it is displayed.
  • The contents of the order must be able to form an interface to the third layer.
  • The weight and volume of the luggage, the start position, and the multiple destinations are composed as an array.
  • When an order is generated, it is assigned to a robot and a queue entry is generated. As the processing content in that case, an appropriate robot is selected.
  • (1) The size is determined from the robot types based on the weight and volume of the luggage, and a robot suitable for that size must be assigned.
  • The getBalance function narrows the candidates down (condition: balance > zero, and the robot whose last end time is closest to the present).
  • (2) Any of these robots will do, so one is selected; to choose the robot with the least operating time used today in the actual results, a list of waiting robots can be obtained with the getBalance function, aggregated by amount of time used, and a lightly used robot chosen.
  • (3) The one robot whose last end time is closest to the present is selected.
  • (4) The robot is thereby uniquely determined. Finally, the robot and the order are linked to the queue; a minimal sketch of this selection is given below.
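  • A minimal sketch of the narrowing described above, assuming getBalance returns per-robot records with a balance, a last end time, and the operating time used today (all field names are illustrative, not from the specification):

```python
import time

def pick_robot(robots, now=None):
    """robots: a hypothetical getBalance() result, one dict per robot,
    e.g. {"id": "Robot001", "balance": 1, "last_end": 1700000000.0,
    "used_today_s": 1200.0}. Narrow to balance > zero, prefer the robot
    whose last end time is closest to the present, and break ties by the
    least operating time used today (utilization smoothing)."""
    now = time.time() if now is None else now
    candidates = [r for r in robots if r["balance"] > 0]
    if not candidates:
        return None
    return min(candidates,
               key=lambda r: (abs(now - r["last_end"]), r["used_today_s"]))
```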
  • A queue is an order queue; robot allocation has already been completed when the order is generated.
  • Retrieving an order from the queue is the act of issuing an instruction for it to a robot via the third layer. The getBalance function (condition: balance > zero and standby state) is used to search for robots, and instructions are issued to the third layer in order. At that time, the journal entry for the next work place (destination) must not be forgotten, even though it is the first one. The robot account is determined, the order number arranged at the head is taken out from the entry elements, and the order number is confirmed. Once the order is confirmed, an interface from the order details to the third layer can be generated. Finally, the robot is activated, the order is put into the transport state, and the process is complete.
  • A plan request is expanded but is not subject to queue management. In other words, since it will never be taken out, it is left as it is; when the performance request arrives and a new plan expansion for it is performed, the display switches to that side and the plan request is no longer displayed. It is used only for grasping the amount of work.
  • The queue is in a state in which orders hang off each robot. Whenever an order is placed, it is tied to a specific robot (this is called allocation). Robots have already been assigned to all orders in the queue (if there is a shortage of robots, the queue is simply extended). If a robot is currently active, its queue is not retrieved (not applicable). There is no parallel processing. For the waiting state, getBalance() is used to get the order list, and the serial numbers of the orders are in chronological order. Orders are taken out one by one from the order list; at that time, the order number, robot ID, and order elements are expanded on the list. The robot ID is obtained from the entry under the fetched order, and if the robot is in the standby state, the order is taken out of the queue and the instruction processing to the third layer is executed.
  • The retrieval process from the queue is realized by an external batch process, executed at regular intervals; a minimal sketch follows below.
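  • A minimal sketch of such a periodic batch dispatcher, assuming hypothetical queue and robot interfaces (none of these names are defined by the specification):

```python
import sched
import time

def dispatch_cycle(queue, robots, send_to_layer3):
    """One batch pass: for each standby robot with queued orders, take
    the order at the head of its queue and issue it to the third layer.
    `queue`, `robots`, and `send_to_layer3` are assumed interfaces."""
    for robot in robots:
        if robot.state == "standby" and queue.has_orders(robot.id):
            send_to_layer3(robot.id, queue.pop_head(robot.id))
            robot.state = "active"  # occupied once the work is sent out

def run_dispatcher(queue, robots, send_to_layer3, interval_s=10.0):
    """External batch process repeating the cycle at a fixed interval
    (10 s in the embodiment above)."""
    s = sched.scheduler(time.time, time.sleep)
    def tick():
        dispatch_cycle(queue, robots, send_to_layer3)
        s.enter(interval_s, 1, tick)
    s.enter(interval_s, 1, tick)
    s.run()
```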
  • FIG. 14 is a diagram illustrating a queue management function.
  • The waiting state of an order is called a queue.
  • The following five orders can be obtained at one time.
  • The order in which the taken-out orders are transmitted to the third layer is as shown in FIG. In this way, only the head of the queue of each robot in the standby state is acquired in the list.
  • The rest of the queue is not used in the second layer.
  • The next destination is taken out of the order contents and instructed to the third layer.
  • The robot ID is the key to the processing content.
  • The active state of the robot is searched for with the getBalance function, and the latest order number in the active state is captured.
  • The previous destination exists in the entry contents.
  • The next destination and the end judgment can be determined from the order NO.
  • The order number can also be extracted from the request number.
  • The destination following the previous destination is acquired from the contents of the order, and an instruction to move from the current destination to the next destination is issued to the third layer. If the previous destination was the waiting area, the processing of the order is completed with this arrival report.
  • The robot is transferred from occupied to released, and the processing for the robot is completed.
  • This scenario is the failure response process; the processing content differs depending on the location and timing of the stop.
  • If the failure occurred while travelling between transport destinations, it is necessary to transfer to an alternative robot and instruct it to move to the failure point. Since the returned robot ID is out of order, the latest order number in the active state is captured with the getBalance function, as in normal processing. The previous destination also exists in the entry contents. Here, the movement instruction is given to the alternative robot using the position information of the failure point instead of the next destination.
  • Otherwise, the order processing is completed by transferring to the failure state without using an alternative robot.
  • The robot ID of the order contents is set to the robot ID after the transfer, and the contents are copied and generated. For work and movement, only the journal of the previous destination needs to be generated and the next destination determined.
  • This process is a batch process: the queue is periodically monitored and reorganized.
  • Queue reorganization: the first queuing is performed when an order is generated.
  • The processing time of an order (number of destinations and estimated processing time) is at that point a plan; as actual results occur, a gradual deviation arises, and it is conceivable that work is waiting in the queue of one robot while other robots stand idle and time is wasted. By reorganizing this state on a regular basis and applying appropriate compression, a queue with maximum efficiency can be generated.
  • FIG. 16 is a diagram for explaining the overlap of the passing times of the robots.
  • 12-A overlaps; the times of arrival are estimated, and if the error between ROBOT001 and ROBOT002 is within 60 seconds (parameterized), it is determined that they overlap, and the order of 12-A is changed for one of them. If there is only one destination, the order is returned to the queue and delayed.
  • The passing time is not an exact time but a relative relationship between the robots.
  • The overlap of passing times is determined by calculating the relative difference in the straight-line distances from destination to destination.
  • The distance error becomes a time error, so the difference is used for the judgment.
  • The time at a passing point is determined by accumulating the time from position to position. The working time at each destination is taken to be constant and added individually.
  • i and j are relative positional relationships (for example, wait and 15-A can be expressed as 1 and 2), and K is the cumulative total (m).
  • The distance d(i,j) is expressed by an equation that appears only as a figure in the original publication.
  • The distance between two points along the X coordinate axis can be calculated by a further equation, likewise reproduced only as a figure; both are reconstructed below in the standard Euclidean form.
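  • Read as the standard straight-line (Euclidean) distance on the map's coordinate axes, the two relations can be written as follows; this is a reconstruction, since the equations appear only as figures:

```latex
d(i,j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2},
\qquad
d_x(i,j) = \lvert x_i - x_j \rvert
```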
  • <Change of destination order> In the third layer, suppose a robot has arrived at a destination, but a preceding robot for the same destination has already arrived, so the robot would have to wait, and another destination remains. The third layer then compares the waiting time with the travel time to the other destination and, if the other destination is not crowded and unloading can start immediately, requests the second layer to change the order of the destinations. The second layer changes the order of the work expansion results, sets the current destination to an unprocessed state, and instructs the third layer of the next destination. The third layer instructs the fourth layer to move from the current position to the new next destination.
  • FIG. 17 is a diagram illustrating a change in the order of destinations.
  • The third layer indicates the name of the destination to the fourth layer.
  • The fourth layer begins to move the robot from its current position to the newly indicated destination.
  • The following table shows the API specification for session control from the second layer to the third layer (2nd layer → 3rd layer: business instruction data structure to the robot control AI).
  • The business instruction data structure from the second layer to the third-layer robot control AI can be expressed as JSON-format data as follows (placeholder values appear as in the original, and the structure is truncated in the published text):

    {
      "robotId": "robot001",
      "requestNo": "REQNO00001",
      "fromDestination": "waitGate",
      "toDestination": "A",
      "destinationOrderNo": 1,
      "destinationOperation": 1,
      "quantity": 10,
      "weight": 50.0,
      "goodsIdList": ["BOOK001", "ORANGE001", "BEEF001"],
      "actionType": 1,
      "plannedWorkTime": "00000000001500000",
      "destinationList": ["A", "G"],
      "luggageList": [
        {
          "LuggageNO": (luggage NO),
          "seqNo": (order),
          "A": {"x": (coordinate), "y": (coordinate), "z": (coordinate)},
          "b": {"x": (coordinate), "y": (coordinate), "z": …
  • The following table explains the structure of the data returned from the third layer to the second layer.
  • <Reception data of the robot scheduler> The format of the request data that the second-layer robot scheduler receives from the outside is defined as follows. 1. It is instruction data for a robot to transport a specific load from a designated place to a designated place. 2. The specified luggage is assumed to consist of boxes; the number of boxes, the total weight, and the total volume are specified. 3. Robots are assigned with one designation, but it may be necessary to divide the load among multiple robots because of weight and volume restrictions. 4. The destination is specified for each box; there are therefore multiple transport destinations, and the boxes are stacked from the bottom in order of distance. Multiple robots may be required, per requirement 3.
  • One robot corresponds to one order, and work is generated for each destination. Work expansion: (1) movement from the standby position to the start position (loading place); (2) loading work (initially by a human) — the completion reports for (1) and (2) come from the third layer to the second layer at the same time; (3) movement to a destination; (4) unloading at the destination — the completion reports for (3) and (4) come from the third layer at the same time, and if there are multiple destinations, (3) and (4) are repeated; (5) return to the waiting area when the luggage is empty — when (4) is complete, the third layer is requested, and when the robot arrives at the waiting area a response returns from the third layer and the robot is released. For (3) and (4), the order is determined by calculation according to distance.
  • Request data format: 1. start position name (example: 1-A) and loading amount (number of boxes, weight, volume); 2. …
  • The order may be undefined in the data (it is determined by the system).
  • The instruction number is a unique ID specified by the requesting party (WMS, etc.).
  • FIG. 18 is a diagram showing an overall image of the system configuration of the robot control system of the present embodiment. In this embodiment, it is assumed that a general-purpose AI application platform for controlling robots operating inside a building is developed.
  • A human 1 asks robot 2 to do work.
  • Robot 2 receives the packages at a designated place, as a bundle of packages for multiple destinations. Luggage collected at one time is delivered to the multiple destinations in the most efficient order.
  • The human 1 removes only the luggage required at each destination, and when the work at that destination is completed, robot 2 moves to the next location.
  • As for the method of instruction, the human 1 gives the instruction by conversing in Japanese, and the robot understands the content of the instruction and acts on it.
  • Required are: a request reception layer 31 that accepts instructions; a control layer (robot control AI 3) for understanding the instruction contents and acting on them; and
  • an execution layer (simulator 4) that, for example, stops to avoid obstacles on the premises roads, or when a road is closed, and acts in response to instructions from the control layer.
  • The request reception layer 31 may be provided by the robot 2 or by the robot control AI 3; in the present embodiment, it is provided by the robot control AI.
  • The control layer 3 does not need to be aware of the shapes and restrictions of individual robots, and only needs to talk with the robot through a general-purpose instruction interface. Therefore, even if no actual robot exists, demonstration experiments can be performed with a virtual environment and a virtual robot.
  • The physical robot entity itself should be prepared for each purpose and place, and its operation confirmed.
  • The control layer 3 requires reinforcement learning of various abilities, such as understanding the business story, the ability to give travel instructions to the robot, the ability to efficiently derive the shortest distance, and the selection of a detour route when an unexpected obstacle occurs.
  • This robot control AI 3 is not premised on a specific robot.
  • The target functions are premised on movement-related restrictions, such as carrying an object when moving from place to place or guiding a person to a specific place, but various usage scenes are assumed.
  • When picking a package from a specific shelf in a warehouse and moving it to another target shelf, the package is automatically and efficiently transported to the specified location.
  • A tourist who wants to go somewhere but does not know the way is guided to the destination.
  • At a long-term care facility, the robot carries the finished meal to the target room, or goes to the room after meals to pick up the tableware and bring it back.
  • Various usage scenes are possible in warehouses, factories, public facilities, and elsewhere.
  • Robots 2 have different shapes and functions depending on the purpose, but the premise is that each function is controlled autonomously, and the final action is left to the functions of the robot.
  • The robot control AI 3 aims to move from place to place based on map information and to carry objects efficiently by selecting the optimum route; by fusing this with the autonomous functions on the robot 2 side, practical problems can be solved.
  • The greatest feature of this function is that the robot control AI 3 assumes that a plurality of robots 2 operate at the same time in a given usage scene. It therefore plays a role like a control center, able to manage the positions of the robots 2 and the states of the roads for multiple robots based on a map of the entire premises.
  • While a robot performs work such as loading and unloading luggage, the place is occupied from the viewpoint of that robot 2, and the place is locked against the other robots 2.
  • When the robot control AI 3 detects this state, it guides the other robot to another place so that it can perform another task first and then do the work at the original place after the occupation is released.
  • FIG. 19 is a diagram showing an outline of the robot control AI 3 and the simulator 4.
  • The control layer 3 includes the request reception layer.
  • Robot control AI 3 is composed of three layers, each with its own role. In the third layer, the route to the destination is optimized by making full use of artificial intelligence, and the result is given as instructions to the robot simulator 41.
  • The robot simulator 41 executes the traveling operation from route to route and from route to destination while receiving instructions from the third layer.
  • The robot simulator 41 periodically (every second) notifies the third layer of position information and speed indicating where the robot 2 now is. It has learned to perform the actions of right, left, forward, backward, and stop, and can judge these for itself.
  • The sensor is a dedicated function for generating a virtual state, and generates events randomly in chronological order. For example, when it receives an action element, it returns position information (x, y) and velocity to the simulator in chronological order.
  • FIG. 20 is a diagram illustrating a state in which the robot control AI3 shares position information of a robot other than the robot corresponding to itself.
  • The robot control AI 3 needs to share the state of the entire map information at each point in time.
  • The robot simulator 41 requires the same condition; in that case, the sensor 42 collects the individual information of the local sensors and shares it with each simulator 4.
  • The global sensor is deliberately kept invisible to the simulators, so that no changes are needed even if states from actual robots are accepted in the future.
  • The robot control AI shares the states coming up from each robot simulator 41 through a global cache.
  • Each robot control AI 3 operates in a closed world in an independent session, but information other than its own can be acquired as other-session information and shared; the global cache is the means for this.
  • The shared information makes it possible to always recognize the entire state at regular time intervals (1 second), at the same time as its own information.
  • FIG. 21 is a diagram illustrating an example of the behavior of the robot controlled by the robot control AI3.
  • The main contents are (1) the start position, (2) the plurality of destinations to which the luggage is moved, and (3) the number and weight of the pieces of luggage for each destination.
  • The reward is evaluated by how closely the expected result is obtained, and the evaluation content is weighted to calculate the score.
  • FIG. 22 is a diagram illustrating an overall configuration of the robot control system of the present embodiment.
  • Robot control AI: 1. When the package information is received as a request document, the document content is analyzed and converted into standard request information. 2. The request information is understood and the story is dynamically generated: the order of transit through the destinations is determined, along with the number, weight, distance, and time of the luggage for each destination. 3. The next action is instructed from the current state.
  • FIG. 23 is a diagram illustrating a control hierarchy of the robot control AI3 in the robot control system of the present embodiment.
  • The robot control AI 3 layer is the core of this process and performs the actual control.
  • The work of the robot 2 here is to load the cargo at the start place, reach the designated place in the shortest time, and unload the cargo. The loading and unloading work is done by humans at this stage.
  • The feature of the processing here is that the work pooling layer 31 gives an independent instruction for one destination at a time to the input layer of the neural network, so that the work for a plurality of destinations is serialized and input in order.
  • The role of the robot control AI layer is to instruct the autonomous robot so as to optimally execute the story corresponding to one processing purpose. This part is reinforcement learning with DDQN.
  • FIG. 24 is a diagram illustrating map information of the present embodiment.
  • The map information is assumed to comprise the numerical ranges of the places, the composition of the destinations, and the route composition. On this map, multiple robots are assumed to be passing at the same time.
  • The number of passing routes is obtained from the nearest routes of points A and B, and the value is obtained by adding the total distance between the routes to the nearest distances from both routes.
  • FIG. 25 is a diagram illustrating a learning path of the robot control AI3 of the present embodiment.
  • Routes are connected by calculation, but when heading for the next route it is necessary to calculate whether to go straight, up or down, or left or right.
  • The entire destination-occupancy status can be recognized within the independent session of each robot control AI 3. The destination-occupancy states of all robots 2, one's own and others', are therefore obtained every second and shared by all robots 2.
  • Occupancy means the time from when a robot 2 arrives at a destination until it departs for the next destination or the waiting area. The timing for changing the destination sequence numbers is judged immediately before one destination is completed and movement to the next destination starts. However, the sequence numbers are not swapped if the remaining time until release (occupancy release time − current time) is less than 20 seconds.
  • Since the number of angles is even, the total route length is calculated for all candidate destination orders, swapping the destination order to find the most appropriate one.
  • Obstacles are shared so that each robot can recognize where an obstacle is currently occurring while it is running. The maximum number of obstacles, with their positions, that each robot control AI 3 can recognize at one time is 20. During learning, a route that includes an obstacle is made unselectable at the selection stage. Obstacles disappear 10 seconds after they occur; a sketch of such an obstacle registry follows below.
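  • A minimal sketch of an obstacle registry with this behaviour; the class and method names are illustrative, not from the specification:

```python
import time

class ObstacleMap:
    """Shared obstacle registry matching the behaviour described above:
    at most 20 obstacles recognized at a time, each disappearing 10 s
    after it occurs."""
    MAX_OBSTACLES = 20
    TTL_S = 10.0

    def __init__(self):
        self._obstacles = {}  # (x, y) -> time of occurrence

    def report(self, x, y, now=None):
        now = time.time() if now is None else now
        self._prune(now)
        if len(self._obstacles) < self.MAX_OBSTACLES:
            self._obstacles[(x, y)] = now

    def blocks(self, route_cells, now=None):
        """True if any cell of a candidate route overlaps a live
        obstacle, in which case the shortest path is recalculated."""
        now = time.time() if now is None else now
        self._prune(now)
        return any(cell in self._obstacles for cell in route_cells)

    def _prune(self, now):
        self._obstacles = {p: t for p, t in self._obstacles.items()
                           if now - t < self.TTL_S}
```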
  • The other party may be another robot 2: they may cross paths without colliding, or one may be stopped and be overtaken. In such cases the autonomous robot 2 passes by avoiding the obstacle by itself, so the robot control AI 3 does not need to know.
  • An obstacle here means case (2) mentioned above.
  • When selecting the shortest path, if a positional relationship corresponding to the path is detected in the obstacle array, the shortest path is calculated again, so as to find a route that does not contain obstacles.
  • The robot is included in this obstacle array, and is removed at the timing when the commonly held obstacle-array entry (x, y) disappears.
  • FIG. 26 is a diagram illustrating a hierarchical structure of the robot control AI3.
  • FIG. 27 is a diagram illustrating a robot simulator adapter.
  • The stub is a program for testing the robot simulator 41 before it directly interfaces with the actual robot control AI 3 application. It reads text data, generates instructions to the autonomous robot 41, and enables simulation.
  • This stub part corresponds to part of the environment; in operation it plays the role of interfacing to the autonomous robot in response to requests from the environment, without reading text data.
  • The robot simulator 41 executes instructions from the upper hierarchy, having acquired the actual knowledge of the robot 2 by reinforcement learning.
  • Its main role is to derive the next action based on the information acquired from the sensor 42 and to generate actions based on the state. It operates according to instructions from the robot control AI 3.
  • The instruction content is added to the input layer of the robot simulator.
  • Each robot 41 reports its state, and control can therefore be performed by sharing states; the simulator 4 must do the same.
  • The learning method of the simulator 4 is to learn to react to dynamic, sudden events generated periodically on the sensor 42 side.
  • Since the sensor 42 has only its own sensor information, the information given to the simulator 4 is closed within itself. For example, if five robots 2 are moving at the same time, each of the five travels in its own closed state.
  • What unfolds in the entire map information is that the five robots 2 are instructed from the upper level and are running without any dependence on each other.
  • The robot simulator 41 must be able to constantly acquire the states of all running robot simulators 41 other than itself.
  • The local sensor of sensor 42 only needs to generate the state peculiar to each robot; it is the global sensor that collects all the local sensor information. The hierarchical relationship is shown below.
  • FIG. 28 is a diagram illustrating an interface between the robot simulator 41 and Unity (registered trademark) 5 that displays an image.
  • The interface requirement is an image in which the whole of the current moment is taken as a unit and made continuously into a moving image, so the position information of all robots 41 and of all obstacles is handed over as a list.
  • The delivery interval is 100 ms.
  • The position information on the coordinate axes is expressed as a point (x, y).
  • The robots 41 and obstacles are displayed with volume, and the point on the coordinate axes indicates their position.
  • The position of the robot is the black dot at the head. The obstacle is fixed at the front on the left.
  • FIG. 29 is a diagram showing an example of a transfer robot and an obstacle.
  • The robot 41 can pass obstacles 1 and 2, but cannot pass obstacle 3 and must stop. After that, it turns back along the circuit.
  • The sensor 42 has the role of generating its own state and transmitting it to the simulator 4. Originally it would be built into the physical robot 2, but since actual robots 2 have various functions depending on the purpose, it was decided to prepare a simulator 41 that does not depend on those functions.
  • The simulator 41 has the same knowledge as the robot 2 and performs reinforcement learning in advance so that it can act autonomously, while the sensor 42 is dedicated to the role of generating the physical state.
  • The sensor 42 plays the role of calculating, in time series, the current state that can arise from its own running, so reinforcement learning is not necessary for it.
  • The simulator 4 decides what action to take in response to the result, and for this reinforcement learning is required.
  • The sensor 42 must generate its own state at regular intervals (100 ms).
  • The state lives in the two-dimensional coordinate space of the statically defined map information, and the sensor repeatedly calculates how its situation changes in time series according to the actions received from the simulator 4.
  • Its role is to return the result to the simulator 4 at any time.
  • There is also a group sensor that gathers the sensors 42 of all robots; control is organized in two layers, local sensors and a global sensor. In addition to the state of each robot, the global sensor plays the role of scripting the creation and disappearance of obstacles and causing random events.
  • FIG. 30 is a diagram illustrating a global sensor and a local sensor.
  • When passing sensor information to the robot simulators 41, the global sensor acquires the status of each local sensor, gathers all these local-sensor states into an overall state list, and hands this collected state list to each robot simulator 41; a sketch follows below.
  • Each robot simulator 41 grasps all the other events occurring at the present time as state and decides its action.
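  • A minimal sketch of this aggregation step, assuming hypothetical local-sensor and simulator interfaces (none of these names are defined by the specification):

```python
def collect_and_share(local_sensors):
    """Global-sensor pass: gather every local sensor's state into one
    overall state list and hand the same list to each robot simulator.
    `local_sensors`, `state()`, and `simulator.update()` are assumed
    interfaces."""
    state_list = [s.state() for s in local_sensors]
    for s in local_sensors:
        s.simulator.update(state_list)  # every simulator sees all states
    return state_list
```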
  • FIG. 31 is a diagram illustrating learning inside the robot simulator 41.
  • Role of the sensor: 1. Based on the map information, the sensor periodically obtains its own current position while traveling and reports it to the simulator 4 (calculating and handing over its position, mileage, obstacles, and distance to the wall each time).
  • A new state is generated in response to the content of the action a from the agent of the simulator 4.
  • The sensor 42 receives the overall state list and recognizes the position of its own robot 41 and the positions of obstacles relative to the others.
  • FIG. 32 is a diagram illustrating communication performed between the robot simulator 41 and the sensor 42.
  • FIG. 33 is a diagram illustrating the sensor 42.
  • The 32 sensors 42 yield 64 units in total, with (1) distance and (2) angle as the elements for each sensor.
  • FIG. 34 is a diagram illustrating an example of the arrangement of the sensor 42 and the state arrangement for holding the information from the sensor 42.
  • The distance to a wall, an obstacle, or a white line is acquired by a total of 32 sensors 42, 16 at the front and 16 at the rear.
  • The individual sensors 42 independently return their values as states.
  • The state must be recognized from these independent values; a sketch of the resulting state vector follows below.
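  • A minimal sketch of how the 32 readings could be flattened into the 64-unit state suggested above; the interleaving order is an assumption:

```python
NUM_SENSORS = 32  # 16 forward-facing, 16 rearward-facing

def build_state(distances, angles):
    """Flatten the 32 independent sensor readings into one 64-element
    state vector of (distance, angle) pairs."""
    assert len(distances) == NUM_SENSORS and len(angles) == NUM_SENSORS
    state = []
    for d, a in zip(distances, angles):
        state.extend([d, a])
    return state  # 64 values in total
```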
  • FIG. 35 is a diagram for explaining the distance measured by the sensor 42 included in the robots 2 and 41.
  • The dashed arrow is the straight-line distance to the wall.
  • The sensor IDs are fixed; the directions change when the vehicle turns the steering wheel, but straight ahead always corresponds to a predetermined sensor ID. This is used by the virtual robots 2 and 41 when calculating their own position, independently of the DQN.
  • The distance is calculated by trigonometric functions each time.
  • The calculation method is shown separately.
  • FIG. 36 is a diagram illustrating a road model assumed in the present embodiment.
  • The current position is calculated on the two-dimensional coordinate axes of latitude and longitude (x-axis, y-axis), with positional relationships computed within the coordinate axes shown on the left.
  • FIG. 37 is a diagram illustrating the relationship between the moving distance and the position information (latitude / longitude).
  • One accelerator stage is 0.2 m/sec.
  • The speed is determined by the current number of accelerator stages.
  • The elapsed time is calculated as the difference between the previous time and the current time.
  • The moving distance is obtained by multiplying the elapsed time by the speed corresponding to the number of accelerator stages; a sketch follows below.
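  • A minimal dead-reckoning sketch of these relations; the heading parameter is an assumption, as the text states only the speed and elapsed-time relations:

```python
import math

STAGE_MPS = 0.2  # one accelerator stage = 0.2 m/sec (from the text)

def update_position(x, y, heading_rad, stages, t_prev_s, t_now_s):
    """Speed from the current number of accelerator stages, moving
    distance from speed x elapsed time, new position from distance
    and heading."""
    speed = stages * STAGE_MPS            # current speed (m/s)
    dist = speed * (t_now_s - t_prev_s)   # moving distance (m)
    return (x + dist * math.cos(heading_rad),
            y + dist * math.sin(heading_rad))
```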
  • FIG. 38 is a diagram showing a hardware configuration example of a computer used in the robot control system according to the present embodiment.
  • the computer may be a general-purpose computer such as a workstation or a personal computer, or may be logically realized by cloud computing.
  • the illustrated configuration is an example, and may have other configurations.
  • the computer includes at least a processor 20, a memory 21, a storage 22, a transmission / reception unit 23, an input / output unit 24, and the like.
  • the processor 20 is an arithmetic unit that controls the operation of the entire computer, controls the transmission and reception of data between each element, and performs information processing necessary for application execution and authentication processing.
  • the processor 20 is a CPU (Central Processing Unit), and executes each information processing by executing a program or the like stored in the storage 22 and expanded in the memory 21.
  • the memory 21 includes a main memory composed of a volatile storage device such as a DRAM (Dynamic Random Access Memory) and an auxiliary storage composed of a non-volatile storage device such as a flash memory or an HDD (Hard Disc Drive).
  • the memory 21 is used as a work area or the like of the processor 20, and also stores a BIOS (Basic Input / Output System) executed when the computer is started, various setting information, and the like.
  • the storage 22 stores various programs such as application programs.
  • a database storing data used for each process may be built in the storage 22.
  • the transmission / reception unit 23 connects the computer to the network and the blockchain network.
  • the transmission / reception unit 23 may be provided with a short-range communication interface of Bluetooth (registered trademark) and BLE (Bluetooth Low Energy).
  • the input / output unit 24 is an information input device such as a keyboard and a mouse, and an output device such as a display.
  • Each function is realized by the processor 20 included in the computer reading the program stored in the storage 22 into the memory 21 and executing it. The various storage units in the second layer, as well as the learning results (models) of the robot control AI in the third layer, the map information, the route information, and the like, can be stored in, for example, a storage area provided by the memory 21 or the storage 22.

Abstract

[Problem] To provide robot control such that effective reinforcement learning is possible. [Solution] This system for controlling multiple robots is characterized by comprising: a work memory unit for storing multiple tasks to be performed by robots; an assignment unit for assigning individual tasks to the robots; a transmission unit for transmitting assigned tasks to a robot control device; and a status acquisition unit for acquiring robot operating status, and is also characterized in that the assignment unit changes the task assignment priority in accordance with operating conditions.

Description

Robot control system
 The present invention relates to a robot control system.
 Reinforcement learning related to robot control has been conducted (see Patent Document 1).
Japanese Unexamined Patent Publication No. 2019-7891
 However, the system described in Patent Document 1 does not improve efficiency when a plurality of robots operate.
 The present invention has been made in view of such a background, and an object of the present invention is to provide a technique capable of effectively controlling a plurality of robots.
 The main invention of the present invention for solving the above problems is a system for controlling a plurality of robots, comprising: a work storage unit that stores a plurality of tasks to be performed by the robots; an allocation processing unit that assigns each of the tasks to a robot; a transmission unit that transmits the assigned work to the control device of the robot; and a status acquisition unit that acquires the operation status of the robot, wherein the allocation processing unit changes the allocation destination of the work according to the operation status.
 According to the present invention, efficient reinforcement learning can be performed for robot control.
A diagram showing the overall system configuration of the robot control system of this embodiment.
A diagram showing a system configuration example of the robot control system of this embodiment.
A diagram explaining the functional outline of the second layer in the robot control system of this embodiment.
A diagram explaining work queue management.
A diagram explaining resource allocation to robots.
A diagram explaining the allocation of 45 packages to robots for transport.
A diagram explaining the distance from a start point to an arrival point.
A diagram showing an example schedule of robot occupied and released states.
A diagram explaining the generation of timing for taking work out of a queue.
A diagram showing a conceptual model of the robot scheduler.
A diagram showing a conceptual model of the robot scheduler.
A diagram explaining the operation at the time of a robot failure.
A diagram explaining the time-series operation of a transfer order.
A diagram explaining the queue management function.
A diagram showing a method of finding out whether any robot is in the standby state.
A diagram explaining the overlap of robot passing times.
A diagram explaining a change in destination order.
A diagram showing the overall system configuration of the robot control system of this embodiment.
A diagram showing an outline of the robot control AI 3 and the simulator 4.
A diagram explaining a state in which a robot control AI 3 shares the position information of robots other than its own robot.
A diagram explaining an example of the behavior of a robot controlled by the robot control AI 3.
A diagram explaining the overall structure of the robot control system of this embodiment.
A diagram explaining the control hierarchy of the robot control AI 3 in the robot control system of this embodiment.
A diagram explaining the map information of this embodiment.
A diagram explaining the routes learned by the robot control AI 3 of this embodiment.
A diagram explaining the hierarchical structure of the robot control AI 3.
A diagram explaining the robot simulator adapter.
A diagram explaining the interface between the robot simulator 41 and Unity (registered trademark) 5, which displays images.
A diagram showing an example of a transport robot and an obstacle.
A diagram explaining global sensors and local sensors.
A diagram explaining learning inside the robot simulator 41.
A diagram explaining the communication performed between the robot simulator 41 and the sensors 42.
A diagram explaining the sensors 42.
A diagram explaining an example of the arrangement of the sensors 42 and the state array that holds information from the sensors 42.
A diagram explaining the distances measured by the sensors 42 included in the robots 2 and 41.
A diagram explaining the road model assumed in this embodiment.
A diagram explaining the relationship between movement distance and position information (latitude and longitude).
A diagram showing a hardware configuration example of the computer used in the robot control system according to this embodiment.
<Outline of the invention>
The contents of the embodiments of the present invention are listed and described below. The present invention includes, for example, the following configurations.
[Item 1]
A robot control system for controlling a plurality of robots, comprising:
a work storage unit that stores a plurality of tasks to be performed by the robots;
an allocation processing unit that assigns each of the tasks to a robot;
a transmission unit that transmits each assigned task to the control device of its robot; and
a status acquisition unit that acquires the operation status of the robots,
wherein the allocation processing unit changes the allocation destination of a task according to the operation status.
[Item 2]
The robot control system according to Item 1, wherein the allocation processing unit allocates one task to one or more robots according to a first amount of work required for the task and a second amount of work that a robot can perform.
[Item 3]
The robot control system according to Item 1, wherein the allocation processing unit assigns tasks to the robots so that the cumulative amount of work assigned to each of the plurality of robots over a predetermined period is smoothed.
[Item 4]
The robot control system according to Item 1, wherein the status acquisition unit acquires information indicating the operation status from the control device of the robot and from a sensor independent of the robot.
[Item 5]
The robot control system according to Item 1, further comprising a ledger that records as debits and credits at least the occupied time for which a robot is occupied by a task, with at least each robot and the operation as a whole as account items.
The present invention can also have the following configurations.
[Item 1]
A robot control system for controlling a robot, comprising:
a control unit that controls the robot; and
a simulator that simulates the operation of the robot,
wherein the control unit performs reinforcement learning related to the control of the robot according to the operation of a virtual robot simulated by the simulator.
[Item 2]
The robot control system according to Item 1, wherein the robot includes one or more sensors; the control unit transmits to the simulator a control signal related to the operation of the robot; the simulator simulates the operation of the virtual robot in response to the control signal, simulates measurement by virtual sensors, and transmits the measurement information from the virtual sensors to the control unit; and the control unit performs the reinforcement learning according to the measurement information.
[Item 3]
The robot control system according to Item 1, wherein the control unit comprises: a request reception layer that accepts instructions for the robot; a work pooling layer that gives the instructions as input values for the reinforcement learning; and an AI layer that performs the reinforcement learning.
<Purpose>
Hereinafter, the robot control system according to the embodiment of the present invention will be described.
FIG. 1 is a diagram showing the overall system configuration of the robot control system of the present embodiment. The robot control system of this embodiment is configured in five layers.
 The first layer handles external connections. It can, for example, accept instructions from the user through natural language processing.
 The second layer is the management layer, which manages multiple robots collectively. It can serve as a scheduler for overall optimization.
 The third layer is the robot control layer, which controls the robots. It can perform individual optimization, such as route searching, for a single robot.
 The fourth layer is the execution layer, the layer in which the robots operate. In the present embodiment, the fourth layer can operate the robots virtually by simulation.
 The fifth layer is the IoT layer. It manages measurement data from the various sensors and other devices required by autonomous robots.
<System configuration example>
FIG. 2 is a diagram showing a system configuration example of the robot control system of the present embodiment. The second layer robot scheduler includes an MDM server, an AP server, a DB server, and an ESB server. The robot session control of the third layer includes an ESB server, a DB server, an AP server, a Cache server, and a robot control AI process. The fourth layer robot simulator includes synchronous control, a communication adapter, map information, an ML agent, and Unity (registered trademark). The real robot operating environment of the fifth layer includes the API of the real robot and the SDK of the communication adapter.
<Function overview>
FIG. 3 is a diagram illustrating the functional outline of the second layer in the robot control system of the present embodiment. The second layer is configured as an independent domain. Request data can be input as text from the second-layer ESB. The request does not need to specify a robot number; the second layer automatically allocates and determines the robot. The content of the request is expanded into work (queued). In the example of FIG. 3, the request is to go from the waiting area to the start position and then return to the waiting area via several destinations. While this request is being carried out, the second layer receives an arrival result report from the third layer each time the robot passes the start position or a destination. For each destination, the second layer issues an instruction to go to the next destination (or the evacuation area). When the robot finally arrives at the evacuation area, it can be released.
<Queue>
FIG. 4 is a diagram illustrating work queue management.
<Main functions of scheduler>
The main functions of the robot scheduler will be explained.
 Transport robots are used in warehouses, factories, and the premises of various buildings. This embodiment assumes 100 or more robots acting automatically. The premise for a robot to act is that a transport instruction from an external business system triggers the action. In a typical business scene, luggage sits on a shelf or the like, and a requirement arises to move a consolidated amount of that luggage to an appropriate place elsewhere on the premises.
 A consolidated batch of luggage is transported from its place of origin (start position) to its destination; the batch may be bound for a single destination or for multiple destinations. The second layer is therefore required to satisfy the following requirements in its role of accurately conveying the contents of the work to the robots and smoothly achieving the objective.
== 1. Master-related generation ==
・Generation of robot numbers: robot master keys are generated automatically according to a robot-count parameter. The number of robots must match the number defined in the fourth layer. A robot number is generated by combining a fixed name part with a variable numeric part.
・Generation of the standby-position, start-position, and destination masters: the coordinates of each work location are mastered from the map information of the entire building and are used when calculating Euclidean distances.
・Registration of robot type settings (allowable weight, volume): robots have sizes and are divided into several types. At present there is a single robot type, 1 m long and 0.8 m wide, but several types of robots of different sizes should be supported.
・Definition of robot accounts: there are a queue account, activity accounts (normal and emergency), and a failure account.
== 2. Queue ==
A function that receives transport requirements from the outside and queues them. Queuing means placing a request into a queue and preparing to assign a robot. Regardless of whether the robot balance is zero, every order first enters the queue in the second layer and waits for the preceding orders to finish; it is then transferred to the order's waiting account. If a robot breaks down partway through, that robot is transferred to the failure account and another robot is assigned; thereafter, the failed robot is excluded from allocation.
<Robot resource allocation>
 FIG. 5 is a diagram explaining resource allocation to robots. As shown in FIG. 5, the planned relationship between robots and orders is transferred as actual results occur and as robots that have already been released appear. Repeated allocation changes create differences between the planned allocation and the actual allocation; the transfer journal entries for such cases are described later.
== 3. Resource allocation to standby robots ==
3. When a request is received, it is first allocated to waiting robot resources. The queue is created after the robot resources have been allocated.
 (1) When a request (task) is received, robot resources are allocated first. Requests have no priority at this stage. Robots must be kept on standby for emergencies: robots not normally used are reserved for emergency use so that a replacement is reliably available when a robot breaks down.
 (2) Depending on the amount of luggage (volume, weight) that a robot can carry, a request may have to be divided among multiple robots. Since one order = one robot, the request is divided into multiple orders depending on the amount of luggage. If some of the robots corresponding to the divided orders cannot be allocated, robots that can be allocated in the same time slot are allocated at a future time. Since the requested luggage should be regarded as one continuous unit, it is undesirable for it to arrive at the destination split up. There is no concept of "unallocated": allocation is always made in a time slot in which it is possible.
 (3) If, while multiple allocations are in progress, work under the most recently allocated order has the same start position or the same destination, and the passing times are close within a certain time difference, the destination order must be exchanged. This is calculated from the relationship between destination distance and time.
 (4) When allocating robots, because of charging considerations, allocation is averaged so that a robot that has just been used is not used continuously. To this end (see the sketch after this section):
  1) The robot type is selected according to the total requested amount of luggage (within the allowable volume and weight). If the available types are limited and the allowable amount is exceeded, there is no choice, so proceed to (5). Calculation: among robots satisfying luggage volume <= allowable volume and luggage weight <= allowable weight, the smallest such robot is selected; if none exists, proceed to (5).
  2) Robots are allocated in ascending order of accumulated usage time. A robot used for a long time has a recent release time, so its allocation priority falls.
  3) The robot with the oldest release time is allocated preferentially. Interpreting a robot as charging while it is released and waiting in the standby area, the longer it has been waiting, the higher its allocation priority.
 (5) When robot types of several sizes are registered in the master, the robot type is selected, according to the amount to be loaded, so that the luggage can be carried by one robot wherever possible. If it cannot be handled by a single specific robot, as in (2), the load can be divided among multiple robots. (For now there is only one type, 1 m long and 0.8 m wide, but allocation should remain possible as types increase.)
 The allocation algorithm selects the robot type with the smallest tolerances (weight, volume). If one robot cannot accommodate the load, the rank is raised and re-examined; if no single robot can be applied, the load is divided among robots.
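
As a minimal sketch of the type-selection and splitting rules above — assuming hypothetical robot-type records with allowable volume and weight (all names here are illustrative, not part of the system) — the logic could look like this in Python:

from dataclasses import dataclass

@dataclass
class RobotType:
    name: str
    allowable_volume: float   # loadable volume
    allowable_weight: float   # loadable weight

def select_robot_type(types, load_volume, load_weight):
    # Rule 1): the "most minimal" robot satisfying both
    # load_volume <= allowable_volume and load_weight <= allowable_weight.
    fitting = [t for t in types
               if load_volume <= t.allowable_volume and load_weight <= t.allowable_weight]
    return min(fitting, key=lambda t: (t.allowable_volume, t.allowable_weight)) if fitting else None

def plan_allocation(types, load_volume, load_weight):
    # Rule (5): if no single robot fits, divide the load across robots of the
    # largest type, one order per robot.
    chosen = select_robot_type(types, load_volume, load_weight)
    if chosen is not None:
        return [(chosen, load_volume, load_weight)]
    largest = max(types, key=lambda t: (t.allowable_volume, t.allowable_weight))
    parts = []
    while load_volume > 0 or load_weight > 0:
        v = min(load_volume, largest.allowable_volume)
        w = min(load_weight, largest.allowable_weight)
        parts.append((largest, v, w))
        load_volume, load_weight = load_volume - v, load_weight - w
    return parts

Tie-breaking among robots of the chosen type would then follow rules 2) and 3), i.e., least accumulated usage time and oldest release time first.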
== Assignment example ==
FIG. 6 is a diagram illustrating the allocation of 45 packages to robots for transport.
== 4. Response in case of failure ==
4. When a robot breaks down, it is put into the failure state from the management screen, the work is transferred to another standby robot, and an order is generated to move that robot to the failure position. After arriving at the failure position, the interrupted order is continued.
 (1) Session control notifies the second layer of the failed robot number.
 (2) After the failed robot's work has been transferred to a normal robot, session control is notified, using the new robot number, of the failed robot number as the destination.
 (3) Session control instructs the robot control AI to move the new robot to the failed robot's coordinates; on arrival, a notification is returned from the third layer to the second layer. Since the robot control AI recognizes destinations by name, the ability to move to a robot number is added as a new function.
 (4) In the meantime, the robot is stopped for a fixed time (10 seconds) because there is no next instruction.
 (5) The second layer then instructs the robot control AI with the original destination, and from there operation returns to the normal track.
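
A rough sketch of steps (1) to (5), with the second-layer and session-control interfaces stubbed out as hypothetical callables (the key names follow the JSON example given later):

def handle_robot_failure(failed_robot_no, order, pick_standby_robot, send_instruction):
    # (1) session control has already notified us of failed_robot_no.
    # (2) Transfer the order to a normal standby robot ...
    new_robot = pick_standby_robot()          # a standby robot with no queue
    order["robotId"] = new_robot
    # ... and tell session control to move it to the failed robot's position.
    send_instruction(robot=new_robot, to_destination=failed_robot_no)
    # (3)-(4) The third layer later reports arrival at the failed robot's
    # coordinates; meanwhile the robot idles (about 10 s) with no next
    # instruction.
    # (5) The second layer then instructs the original destination, and the
    # order returns to its normal track.
    send_instruction(robot=new_robot, to_destination=order["toDestination"])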
<Occupancy and release of robots>
 Occupancy and release are scheduled by the planned expansion of actual requests (this is not done for plan requests).
(1) Robot occupancy
 A robot is not in the occupied state merely by being allocated. It becomes occupied (= active) when the order is actually taken out of the queue and sent to the third layer.
(2) Unit of instruction to the third layer
 An instruction to the third layer covers a single section. That is, instructions are issued section by section: from the standby position to the start position, from the start position to destination 1, from destination 1 to destination 2, and so on. The next instruction is issued at the timing when the third layer arrives at a destination and notifies the second layer of the next request, and the second layer receives that request.
(3) Releasing an active robot
 When the work at all destinations is complete and the robot arrives at the waiting area, the robot is released. The robot assigned to the order is released at the timing when the final request (the return to the standby position) is received from the third layer. (If the order continues, the third layer requests the next destination; once the robot reaches the waiting area, no further destination comes.)
 When a robot is released and an earlier request for the same robot ID is waiting, queue management fetches the next waiting request; in that case, the next waiting request of that same released robot is the one taken out. However, this processing is not performed immediately upon release but is triggered by the time-monitoring event.
<Calculation of estimated robot occupancy time>
 Allocating a robot does not occupy the robot itself; rather, it is a future occupancy of the time resource by time slot, with the usage time allocated. Occupancy is the timing at which an instruction is actually sent to the third layer, and it stands in an actual-versus-planned relationship to the allocation.
 Allocation of work takes place at the planning stage: when a plan request is received, robot allocation (reservation) is performed at that point. For a robot reserved at this stage, it must be calculated and predicted at what future point the robot will become occupied and at what point it will be released. The calculation method is shown below.
 FIG. 7 is a diagram explaining the distance from the start point to the arrival point.
 The distance between the two destinations is C + B, not A. The occupancy time is therefore the cumulative total of the travel time over the destination intervals (including the waiting area) and the work time at each destination. Let:
 N = number of destination intervals
 W = average work time (a master value determined by the robot type; the standard is 60 seconds, and the value is scaled by size)
 M = speed per second (settable as a parameter; the default can be 2 m per second)
 F = next occupancy interval time in seconds (the interval from the release of a robot to its next occupancy; settable as a parameter, default 1.2 seconds)
 h = magnification factor (settable as a parameter, default 1.2)
 Then the occupancy time T can be obtained by the following equation.
 (Equation for the occupancy time T: the original equation image is not reproduced.)
 Next, the occupancy start time and the release time are obtained.
 Resource occupancy start time: the previous release time of the same robot + F seconds.
 Resource release time: release time = occupancy start time + occupancy time T.
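
Since the original equation image is not reproduced above, the following Python sketch uses only one consistent reading of the stated definitions — travel time over the N intervals plus N work times, scaled by h — together with the start and release times just given; the exact original formula may differ:

import math

def occupancy_time(points, W=60.0, M=2.0, h=1.2):
    # points: ordered (x, y) coordinates of the waiting area, start position,
    # and destinations; N = number of destination intervals.
    n = len(points) - 1
    travel = sum(math.dist(points[i], points[i + 1]) for i in range(n)) / M
    return (travel + n * W) * h          # estimated occupancy time T

def occupancy_schedule(points, previous_release, F=1.2, **kwargs):
    # Occupancy start = previous release time of the same robot + F seconds;
    # release time = occupancy start time + occupancy time T.
    start = previous_release + F
    return start, start + occupancy_time(points, **kwargs)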
<Schedule image>
 FIG. 8 is a diagram showing an example schedule of robot occupied and released states.
・Equalization of robot utilization
・Appropriate robot allocation from the expansion of plans and actual results (robot and task allocation status)
・Grasp of robot states (occupied, released, queued)
・Relationship between the amount of work and the number of robots
・Consideration of charging time
・Recognition of robot failure status
 The premise is a flow in which a request from the WMS is first expanded as a plan for picking work and the like, actual results then come in, and transport work arises. The request is considered to be issued already at the planning stage, which makes advance preparation easier.
<Calculation of loading order on robot>
The loading order and loading position (spatial coordinate axes) are calculated for each robot NO, and the results are returned as an array.
 The loadable volume and weight differ by robot size; for the robots allocated this time, the loading method is calculated from the total amount of luggage.
 For example, the input values are:
 0. Robot size (loadable volume = length, width, height) and weight limit
 1. An array, in destination order, of each package's volume (length, width, height), weight, and destination
 2. Total number of packages
 3. Total volume
 4. Total weight
 The scheduler then outputs, as result data:
 1. Luggage number
 2. Loading order
 3. Spatial coordinates of the luggage (8 points)
 These are output as an array with one entry per package, which makes it possible to draw the result with Plottree or the like.
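
As an illustration of this result data, the sketch below stacks more distant destinations lower and returns the eight corner points of each box; the corner lettering a to h and the simple bottom-up stacking are assumptions for illustration, not the system's actual packing algorithm:

def box_corners(origin, size):
    # Eight corner coordinates of an axis-aligned box; origin is one corner
    # (x, y, z), size is (length, width, height).
    x, y, z = origin
    l, w, h = size
    offsets = [(0, 0, 0), (l, 0, 0), (0, w, 0), (l, w, 0),
               (0, 0, h), (l, 0, h), (0, w, h), (l, w, h)]
    return {name: (x + dx, y + dy, z + dz)
            for name, (dx, dy, dz) in zip("abcdefgh", offsets)}

def loading_plan(packages, distance_to):
    # Sort so the most distant destination is loaded first (bottom of the
    # stack); heights simply accumulate in this toy model.
    ordered = sorted(packages, key=lambda p: distance_to[p["destination"]], reverse=True)
    plan, z = [], 0.0
    for seq, p in enumerate(ordered, start=1):
        plan.append((p["luggageNO"], seq, box_corners((0.0, 0.0, z), p["size"])))
        z += p["size"][2]
    return plan                           # one entry per package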
 The result is set in the order and, at the same time, registered in the session-control robot object. The order contents in the second layer are saved, but the contents of the robot object are cleared when the robot is released. How the luggage is loaded on each robot, and what it contains, can be obtained by querying the robot; the query can be displayed on a Web screen by searching the order contents in the second layer.
 1. Request number
 2. An appropriate ID identifying the package
 3. Assigned robot ID
 4. Start position, destinations, ...
 A loading image of the boxes is displayed for these items.
<Timing generation to retrieve from queue>
FIG. 9 is a diagram illustrating the generation of timing for retrieving work from the queue.
 The timing for taking work out of the queue can be controlled at 10-second intervals. The work in the queue is the first task of an expanded request. Within the queue there are two states: work for which no robot can yet be allocated, and allocated work that is ready.
(1) When executable work is taken out of the queue, only one item is taken out at a time, with priority applied.
(2) All unallocated entries are taken out of the queue at once, and only the work that can now be allocated is returned to the queue. A queue-monitoring process is prepared, which generates and executes a trigger in the ESB at 10-second intervals (the queue fetch trigger), as sketched below.
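
A minimal sketch of the 10-second queue-fetch trigger, with the ESB client stubbed as a hypothetical object:

import time

def queue_monitor(esb, interval=10.0):
    # Fire the queue fetch trigger on the ESB at fixed intervals; per the
    # rules above, each trigger takes out at most one executable item at a
    # time and re-examines unallocated items in bulk.
    while True:
        esb.trigger("queue_fetch")
        time.sleep(interval)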
<Conceptual model>
FIG. 10 is a diagram showing a conceptual model of the robot scheduler. FIG. 11 is a diagram showing a conceptual model of the robot scheduler.
<Scenario type>
The following scenarios exist.
(Tables of scenario types: original images not reproduced.)
<Account journals for order status and robot status>
The following table describes the account journals for order status and robot status.
(Table of account journal entries for order states and robot states: original image not reproduced.)
 The following table explains the account journal entries when dealing with failures.
(Table: original image not reproduced.)
 The first half is attached under the original order number. If the robot is on its way back, the second half is unnecessary. The second half uses the original order, which becomes Robot002's order; from then on the flow belongs to Robot002.
 The request number stays the same. After the robot is linked to the order, the state is set to active or in transit.
 Work continues from the current sequence number within the order.
 Note: the alternate robot is chosen from robots that are on standby and have no queue.
 In the case of a failure, if no standby robot exists at that timing, processing proceeds automatically through the first half and then stops there; the second half is performed manually.
 The second-layer robot scheduler is provided with a ledger managing the accounts described above, and uses it to manage the queue and the operating and standby status of the robots.
 When the robot operating on a transport order breaks down, an alternative robot is allocated to take over. The contents of the transport order are copied to a transfer order and journal entries are generated. It is assumed that the robot had progressed partway through its destinations and then broke down, stopping in the middle of the road. In that case, the From/To journal entries already completed are not needed for the alternative robot; even in the middle of the road, processing starts from the From at that time. Since this From is the starting point, all necessary journal entries are generated.
 In the instruction to the third layer, the start point becomes the failed robot's position. The instruction that would have been "from the standby position to the start position" therefore becomes "from the standby position to the failed robot's position."
<Operation when the robot breaks down>
FIG. 12 is a diagram illustrating the operation at the time of a robot failure. In failure pattern 1, the robot failed while destination 2 still remained, so the work is transferred to an alternative robot: Robot 2 is moved to the position where Robot 1 failed. In failure pattern 2, the failure occurred while returning to the standby area, but all the destination work had already been completed, so no transfer to an alternative robot is performed.
<Account journals for order status and robot status>
The following table describes the account journals for transfer orders.
(Table: original image not reproduced.)
 FIG. 13 is a diagram explaining the time-series operation of a transfer order. As shown in FIG. 13, when actual results occur, the plan is changed and the order is transferred to a robot that can take it earlier. From the available times of all robots, obtained with the getBalance function, the robots that have no next plan are selected, and the earliest available robot among them is chosen.
 For example, for Robot1, completion of order 1 has been delayed, so the work of order 2, planned next, must be transferred to another robot. Robot2 finishes order 3 earlier than planned, but since order 4 is already assigned to Robot2, order 2 cannot be inserted there. Robot3 has no order after order 5 ends, so order 2 can be assigned to Robot3 after order 5 completes.
The following table explains an example of the account journal entries between order states and robot states.
(Table: original image not reproduced.)

 Accounts such as Robot001 are journalized at +1 only for the period the order sits in the queue at allocation time; once occupied, the entry disappears from the queue, and thereafter processing proceeds only by transfers between the state accounts. A robot account balance of +1 or more can be understood as the number of queued orders; it becomes zero when the robot's queue is exhausted. In other words, a robot's balance is its number of waiting orders. Everything else is managed by the state accounts, as illustrated below.
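
A toy illustration of this bookkeeping, in which a robot account's balance equals the number of orders waiting in its queue (the state accounts themselves are not modeled here):

from collections import defaultdict

class RobotLedger:
    # A robot account is journalized +1 while an order waits in its queue and
    # cleared when the order becomes occupied; later progress is tracked by
    # the state accounts, outside this sketch.
    def __init__(self):
        self.balances = defaultdict(int)

    def allocate(self, robot_no):     # order enters the robot's queue
        self.balances[robot_no] += 1

    def occupy(self, robot_no):       # order leaves the queue on occupancy
        self.balances[robot_no] -= 1

    def waiting_count(self, robot_no):
        return self.balances[robot_no]   # balance = number of waiting orders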
<Order queue management function>
Queue management of orders will be described.
1. An external request occurs, the order is expanded, and the order details are confirmed.
 Acceptance of requests at the planning stage (these may or may not exist) applies in the case of plan requests.
 At the acceptance stage, normal work expansion is performed, but the result is generated as a plan. For a plan request, a robot is allocated in its state at that time, the plan is expanded, and the order enters the queue. Plan expansion proceeds in the same way as normal processing, but it is not executed; it simply waits in the queue. The plan request is then left as it is and used only for temporary display; "temporary" means that once the actual request arrives, the plan request is no longer used even for display.
・The contents of an order must be able to form the interface to the third layer: the weight and volume of the luggage, the start location, and an array of multiple destinations.
・When an order is generated, a robot is assigned and a queue entry is created. The processing selects an appropriate robot:
 (1) The robot size is determined from the robot type based on the weight and volume of the luggage, and a robot matching that size must be assigned.
 (2) The getBalance function is called with the condition (balance = zero AND standby state) to narrow down the robots.
 (3) If no robot satisfies (2), getBalance is called with (balance = zero AND active state) to narrow down.
 (4) If no robot satisfies (3), getBalance is narrowed down with (balance > zero AND the robot whose last end time is closest to the present).
 As a result, condition (1) is combined by AND with each of (2) through (4). In case (2) any robot would do, so one is selected; to choose the robot with the least usage time today, the list of waiting orders for standby robots is obtained with the getBalance function, the time amounts are aggregated, and the robot with the smallest total is chosen. In case (3), the robot whose last end time is closest to the present is selected. Case (4) is uniquely determined.
・Finally, the robot and the order are linked in the queue, as sketched below.
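
The narrowing cascade (2) to (4) could be sketched as follows, with getBalance stubbed as a hypothetical query function and today_usage as an assumed map from robot ID to time used today:

def choose_robot(get_balance, today_usage):
    # (2) balance == 0 and standby: among these, pick the robot with the
    # least usage time today.
    candidates = get_balance(balance=0, state="standby")
    if candidates:
        return min(candidates, key=lambda r: today_usage.get(r.id, 0.0))
    # (3) balance == 0 and active: the robot whose last end time is closest
    # to the present.
    candidates = get_balance(balance=0, state="active")
    if candidates:
        return max(candidates, key=lambda r: r.last_end_time)
    # (4) balance > 0: uniquely determined by the most recent last end time.
    candidates = get_balance(balance_gt=0)
    return max(candidates, key=lambda r: r.last_end_time)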
2. Creating the queue and taking orders out of it: the queue is a queue of orders, and robot allocation is already complete when an order is generated.
・Taking an order out of the queue is the act of issuing an instruction to the third layer for a given robot.
・The getBalance function is called with (balance > zero AND standby state) to search for robots, and instructions are issued to the third layer in order. At that time, the journal entry for the next work place (destination) must not be forgotten, even on the first iteration.
・Once the robot account is determined, the order number at the head of the Entry elements is taken out and confirmed. Once the order is confirmed, the interface to the third layer can be generated from the order contents.
・Finally, the robot is put into the active state and the order into the transport state, completing the process.
 (1) A plan request is expanded into work but is not subject to queue management: it is never taken out, so it is left as is. When the actual request arrives and is newly expanded, the display switches to the actual side and the plan request is no longer displayed; it is used only to grasp the amount of work.
(2) When the actual request arrives, robot allocation is executed and the order is registered in the actual queue. Two parameters are prepared for taking orders out of the queue:
・the number of orders processed consecutively at one time
・the interval (in seconds) between taking out individual orders
(3) Extraction conditions
・The queue holds orders hung per robot. Whenever an order is generated, it is always tied to a specific robot (this is called allocation).
・Every order in the queue already has a robot assigned (a shortage of robots merely lengthens the queue).
・If a robot is currently in the active state, its queue is not retrieved (out of scope).
・No parallel processing is performed.
・For the waiting state, getBalance() is used to obtain the order list; since the serial numbers of the orders are chronological, one order is taken out of the list in turn. At that point the order number, robot ID, and sequence elements are expanded on the list.
・The robot ID is obtained from the entry under the fetched order; if that robot is in the standby state, the order is taken out of the queue and the instruction processing to the third layer is executed.
 (4) Retrieval from the queue is realized by an external batch process, executed at regular intervals, as sketched below.
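
Putting the extraction conditions together, one batch pass over the queue might look like the following sketch (getBalance, the robot table, and the third-layer call are all hypothetical stand-ins for the real interfaces):

def fetch_and_dispatch(get_balance, robots, send_to_layer3):
    # One batch pass: waiting orders are visited in serial-number
    # (chronological) order; an order is dispatched only if its assigned
    # robot is on standby, since active robots' queues are not touched.
    for order in sorted(get_balance(state="waiting"), key=lambda o: o.serial_no):
        robot = robots[order.robot_id]    # every queued order already has a robot
        if robot.state != "standby":
            continue
        robot.state = "active"            # robot becomes active ...
        order.state = "in_transit"        # ... and the order enters transport
        send_to_layer3(order)             # issue the instruction to the third layer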
 FIG. 14 is a diagram explaining the queue management function.
 The waiting state of orders is called the queue. In the figure, the following five orders can currently be obtained in one retrieval, and the order in which the taken-out orders are sent to the third layer is as shown in FIG. 14. In this way, only the head of the queue of each standby robot can be acquired into the list.
3. Interface processing between the second and third layers
 ◆ Receiving a next-destination request (return type = normal) from the third layer.
 For destination chains (from the waiting area to the start point, from the start point to a destination, from one destination to the next), the second layer does not use the queue: the next destination is taken from the order contents and instructed to the third layer.
 The robot ID is the key to this processing. Based on the robot ID returned from the third layer, the robot's active state is searched with the getBalance function and the most recent active order number is captured. The previous destination is present in the entry contents; the next destination and the end condition can be judged from the sequence number, and the order number can also be extracted from the request number. The destination following the previous one is obtained from the order contents, and an instruction to move over the range from the current destination to the next destination is issued to the third layer.
 If the previous destination was the waiting area, processing of the order is completed with this arrival report: the robot is transferred from occupied to released, and processing for that robot ends, as sketched below.
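
A sketch of this normal-return handler, with the helpers stubbed out as hypothetical callables (the waiting-area name waitGate follows the JSON example later in this document):

WAITING_AREA = "waitGate"

def on_arrival(robot_id, get_balance, send_to_layer3, release_robot):
    # Handle a next-destination request (return type = normal) from layer 3.
    order = get_balance(robot_id=robot_id, state="active")[0]  # latest active order
    prev = order.current_destination()     # judged from the sequence number
    if prev == WAITING_AREA:
        release_robot(robot_id)            # occupied -> released; order complete
        return
    send_to_layer3(robot_id,
                   from_destination=prev,
                   to_destination=order.next_destination())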
 ◆ Receiving a next-destination request from the third layer (return type = abnormal).
 This is the processing performed when the third layer notifies that the robot currently working has failed. The scenario is failure handling, and the processing differs depending on where and when the robot stopped.
 If destinations still remain when the failure occurs, the work must be transferred to an alternative robot, which is instructed to move to the failure point. Since the returned robot ID is the failed one, the most recent active order number is captured with the getBalance function, as in normal processing, and the previous destination is present in the entry contents. Here, the movement instruction to the alternative robot uses the position information of the failure point instead of the next destination.
 On the other hand, if the only remaining destination is the standby area, the order is completed by transferring the robot to the failure account, without using an alternative robot. When transferring to an alternative robot, the order contents are copied and regenerated with the robot ID replaced by the alternative robot's ID. For work and movement, only the journal entry for the previous destination needs to be generated, and the next destination determined.
4. This processing is a batch process that periodically monitors and reorganizes the queue.
 Queue reorganization works as follows. The initial queuing is performed when an order is generated, and the order's required processing time (number of destinations and predicted processing time) is then only a plan. As actual results come in, deviations gradually accumulate, and an order may be left waiting in one robot's queue while other robots sit idle in the standby state, wasting time. By periodically reorganizing this state and applying appropriate compression, a queue of maximum efficiency can be generated.
◆ Reorganization details
・Find out whether any robot is in the standby state.
・If no robot is in the standby state, move orders so that the final completion times of the queues of the individual robots become even.
 This method is shown in FIG. 15. In the example of FIG. 15, the average order time is (20 + 30 + 10) ÷ 3 = 20.
(1) If a robot's current queue is at or below the average time, it is left as is.
(2) Robot002 is above the average, so other robots that are at or below the average are searched for; the result is Robot001 and Robot003.
(3) One order is taken so that Robot002 falls to the average or below: order 5 is taken and moved to Robot003. If the result for Robot003 would exceed the average time, the reorganization is cancelled. A sketch of this rule follows.
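
The averaging rule (1) to (3) could be sketched as follows, where each queue is a list of (order_id, minutes) entries; generalizing slightly, this version tries the next receiver when one receiver would be pushed over the average:

def rebalance(queues):
    # Move one order from an above-average robot to a robot at or below the
    # average, cancelling a move that would push the receiver over the average.
    total = sum(m for q in queues.values() for _, m in q)
    average = total / len(queues)
    for donor, q in queues.items():
        if sum(m for _, m in q) <= average:        # (1) keep as is
            continue
        for receiver, rq in queues.items():        # (2) candidates at/below average
            if receiver == donor or sum(m for _, m in rq) > average:
                continue
            _, minutes = q[-1]                     # (3) try to move one order
            if sum(m for _, m in rq) + minutes > average:
                continue                           # receiver would exceed: cancel
            rq.append(q.pop())
            break

With the example above (queues of 20, 30, and 10 minutes, average 20), and assuming order 5 takes 10 minutes, this moves order 5 from Robot002 to Robot003, matching FIG. 15.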
<Overlapping transit times>
 FIG. 16 is a diagram explaining the overlap of robot passing times. In the example of FIG. 16, destination 12-A overlaps: the arrival times there are estimated, and if the difference between ROBOT001 and ROBOT0012 is within 60 seconds (parameterized), they are judged to overlap and the order of 12-A is swapped for one of the robots. If there is only one destination, the order is put back in the queue and delayed.
 The passing time is not an exact time but expresses the relative relationship between robots. Overlap of passing times is determined by calculating the relative difference in the straight-line distances from destination to destination. In the above example, the Euclidean distances between the coordinates, from wait to the next position, are calculated and weighted at 1 second per 2 m; a distance difference then becomes a time difference, and the judgment is made on that difference. The time at a passing point is determined by accumulating the position-to-position times, with a constant work time added individually at each destination.
 With i and j denoting relative positions (for example, wait and 15-A can be expressed as 1 and 2) and K the number of accumulated segments, the distance d(ij) can be expressed as the following sum of segment distances (reconstructed from the description; the original equation image is not reproduced):
 d(ij) = Σ_{k=1}^{K} xy_k
 where the distance xy between two points on the coordinate axes is:
 xy = √((x_i − x_j)² + (y_i − y_j)²)
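
Combining the accumulated distances with the 60-second threshold, the overlap check might be sketched as follows (the 1 s per 2 m weighting and the constant work time follow the text; all names are illustrative):

import math

def passing_times(route, coords, speed=2.0, work_time=60.0):
    # Relative passing time at each point of `route` (a list of place names):
    # Euclidean distances accumulated position to position at `speed` m/s,
    # plus a constant work time added at each destination reached.
    t, times = 0.0, {}
    for prev, cur in zip(route, route[1:]):
        t += math.dist(coords[prev], coords[cur]) / speed + work_time
        times[cur] = t
    return times

def overlap(times_a, times_b, place, threshold=60.0):
    # Two robots are judged to overlap at `place` when their estimated passing
    # times there differ by at most the (parameterised) 60-second threshold.
    return abs(times_a[place] - times_b[place]) <= threshold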
<Changing the destination order>
 In the third layer, when a robot arrives at a destination but a preceding robot has already arrived at the same destination, so that the robot must wait behind it, and other destinations remain, the third layer compares the waiting time with the travel time to the other destinations. If another destination is not crowded and unloading is possible there immediately, the third layer requests the second layer to change the destination order. The second layer changes the order of the work expansion results, leaves the current destination in the unprocessed state, and instructs the third layer with the next destination. The third layer then instructs the fourth layer to move from the current position toward the new destination.
 FIG. 17 is a diagram explaining the change of destination order. In the current third-layer robot behavior, when the current destination held inside the robot is swapped, the third layer instructs the fourth layer with the name of the new destination. The fourth layer then begins moving from the current position toward the newly indicated destination.
 On receiving an order-change request from the third layer, the second layer can determine whether the next destination is occupied by another robot that has arrived there (and is unloading). The second layer can therefore issue a refusal even when the third layer requests a change. If the destination is not in such a state, the second layer exchanges it with the next destination; in this example, 9-A is changed to 12-A.
 The following table shows the API specification for the session control performed from the second layer to the third layer.

Work instruction data structure from the second layer to the third-layer robot control AI
Figure JPOXMLDOC01-appb-I000010
 Expressed as JSON data, the work instruction data structure from the second layer to the third-layer robot control AI is as follows (placeholders in parentheses stand for the actual values).
{
 "robotId":"robot001", "requestNo":"REQNO00001",
 "fromDestination":"waitGate", "toDestination":"A",
 "destinationOrderNo":1, "destinationOperation":1,
 "quantity":10, "weight":50.0,
 "goodsIdList":[ "BOOK001", "ORANGE001", "BEEF001" ],
 "actionType":1,
 "plannedWorkTime":"00000000001500000",
 "destinationList":[ "A", "G" ],
 "luggageList":[
  {
   "luggageNO":(luggage number), "seqNo":(order),
   "a":{"x":(coordinate),"y":(coordinate),"z":(coordinate)}, "b":{"x":(coordinate),"y":(coordinate),"z":(coordinate)},
   "c":{"x":(coordinate),"y":(coordinate),"z":(coordinate)}, "d":{"x":(coordinate),"y":(coordinate),"z":(coordinate)},
   "e":{"x":(coordinate),"y":(coordinate),"z":(coordinate)}, "f":{"x":(coordinate),"y":(coordinate),"z":(coordinate)},
   "g":{"x":(coordinate),"y":(coordinate),"z":(coordinate)}, "h":{"x":(coordinate),"y":(coordinate),"z":(coordinate)}
  }, …
 ]
}
 The following table explains the structure of the data returned from the third layer to the second layer.
Figure JPOXMLDOC01-appb-I000011
 Expressed in JSON, the data structure returned from the third layer to the second layer is as follows.
{
  "robotId":"robot001",
  "requestNo":"REQNO00001",
  "returnType":1,
  "destinationOrderNo":1,
  "robotPosition":"xxxxx-yyyyy"
}
<Reception data of the robot scheduler>
 The format of the externally issued request data accepted by the second-layer robot scheduler is defined as follows.
 1. It is instruction data for a robot to transport specific luggage from a designated place to a designated place.
 2. The specified luggage is assumed to be boxes; the number of boxes, the total weight, and the total volume are specified.
 3. A robot is assigned per request, but the request may have to be split across several robots because of weight and volume limits.
 4. A delivery destination is specified for each box. There can therefore be several destinations, and the boxes are stacked from the bottom in order of decreasing distance.
 In that case, several robots may be required under requirement 3.
 In order expansion, one robot corresponds to one order, and a work item is generated for each destination.
 The work expansion is as follows (a sketch of this sequence is shown after the list):
 (1) Move from the standby position to the start position (loading point).
 (2) Loading work (initially performed by a human). The third layer reports completion of (1) and (2) to the second layer at the same time.
 (3) Move to the destination. The third layer reports completion of (3) and (4) to the second layer at the same time. If there are several destinations, (3) and (4) are repeated.
 (4) Unload at the destination.
 (5) When the luggage is empty, return to the standby area; once (4) is complete, this is requested of the third layer.
 When the robot arrives at the standby area, a return comes from the third layer and the robot is released.
 For (3) and (4), the order is determined by calculation according to the distance.
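 As an illustrative sketch only (the step names and data shapes are assumptions, not part of the specification), the work expansion above can be generated as a flat list of work items per order:

def expand_order(start, destinations):
    # Expand one order into the work sequence (1)-(5) described above.
    # start: loading point name; destinations: destination names, already
    # ordered by the distance-based calculation for steps (3)/(4).
    work = [
        ("move", "standby", start),   # (1) standby -> loading point
        ("load", start),              # (2) loading (initially by a human)
    ]
    prev = start
    for dest in destinations:         # (3)/(4) repeated per destination
        work.append(("move", prev, dest))
        work.append(("unload", dest))
        prev = dest
    work.append(("move", prev, "standby"))  # (5) return; robot is released
    return work

print(expand_order("1-A", ["10-A", "12-A"]))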
 Request data format
 1. Start position name (example: 1-A) and loading amount (number of boxes, weight, volume)
 2. First destination (example: 10-A) and unloading amount (number of boxes, weight, volume)
 [Subsequent destinations (example: 12-A) and unloading amounts (number of boxes, weight, volume)] …]
 Note: When there are several destinations, their order may be left undefined in the data (the system determines it).
Example:
 Instruction number: XXXXXX, plan flag: x, (place 1: 1-A, number of boxes: value, weight: value, volume: value),
         (place 2: 10-A, number of boxes: value, weight: value, volume: value),
         (place 3: 12-A, number of boxes: value, weight: value, volume: value)
 Note: The instruction number is a unique ID defined by the requesting side (a WMS, etc.).
 The following tables describe the main record items.
"Request"
Figure JPOXMLDOC01-appb-I000012
"Request, order"
Figure JPOXMLDOC01-appb-I000013
"Work"
Figure JPOXMLDOC01-appb-I000014
"Movement"
Figure JPOXMLDOC01-appb-I000015
<Third layer>
 The third layer will now be described. FIG. 18 shows an overall image of the system configuration of the robot control system of the present embodiment. This embodiment assumes the development of a general-purpose AI application platform that controls robots operating inside a building.
 First, the human 1 asks the robot 2 to do a job. The robot 2 receives luggage at a designated place: a consolidated load addressed to several delivery destinations. The load collected at one time is delivered to the multiple destinations in the most efficient order. At each place, as the "work" there, the human 1 receives only the luggage needed at that destination, and when the work at that destination is complete the robot 2 moves to the next place. In this way, when the human 1 gives an instruction, the robot 2 carries things efficiently based on previously learned knowledge. As for the method of instruction, the human 1 converses in Japanese, and the robot understands the content of the instruction and acts on it.
 Realizing this requires a request reception layer 31 that accepts instructions, a control layer (robot control AI 3) that understands the instructions and decides how to act, and an execution layer (simulator 4) that actually travels the roads on the premises, avoiding obstacles or, if a road is closed, stopping and acting on instructions from the control layer. The request reception layer 31 may be provided in the robot 2 or in the robot control AI 3; in this embodiment it is provided in the robot control AI. By training the knowledge of each layer with artificial intelligence, independent behavior can be realized. The control layer 3 then need not be aware of the shapes or constraints of individual robots and can converse with a robot through a general-purpose instruction interface alone. Consequently, demonstration experiments become possible with a virtual environment and virtual robots even when no actual robot exists; a physical robot entity need only be prepared for the purpose of each site and its operation confirmed.
 Here, the robots and the structure of the premises in which they operate are realized in a virtual simulation environment (simulator 4).
 However, just like a real robot, this virtual simulation actually performs reinforcement learning for traveling the roads on the premises. The control layer 3 likewise requires reinforcement learning to acquire various capabilities: understanding the business story, the driving capability needed to give instructions to robots, efficiently deriving the shortest distance, and selecting detour routes when unexpected obstacles occur.
 In this platform framework, the AI capability of the control layer 3 and the AI capability of the execution layer 4 aim to accomplish the objective while conversing with each other.
<Usage scenes of the robot control AI>
 As described above, the robot control AI 3 does not presuppose a specific robot. The target functions are, by premise, limited to functions related to movement, such as carrying objects from place to place or guiding a person to a specific place, but a variety of usage scenes can be assumed. For example:
 ・In a warehouse, when luggage on a specific shelf is picked and moved to another target shelf, the consolidated luggage is automatically and efficiently carried to the designated place.
 ・At an airport or similar, a traveler who wants to go somewhere but does not know the way is guided to the intended place.
 ・At a nursing-care facility, finished meals are carried to the target rooms, and the tableware is collected from the rooms after the meal and brought back.
 ・Besides warehouses, there are also factories, public facilities, and so on.
Various usage scenes are conceivable.
 The shape and functions of the robot 2 differ depending on the purpose, but each function is assumed to be controlled autonomously, and the terminal actions are left to the robot's own functions.
 The robot control AI 3 aims to carry objects efficiently by selecting the optimal route for movement from place to place based on map information; fused with the autonomous functions on the robot 2 side, it can solve practical problems.
 The greatest feature of this function is that the robot control AI 3 assumes that many robots 2 operate simultaneously in a given usage scene. It is therefore expected to play a role like that of a control center, capable of performing control that handles multiple states, such as the positions of the robots 2 and the state of the roads, based on a map of the entire premises.
 While work such as loading and unloading luggage is in progress, that place is, as seen from the robots 2, occupied, and the other robots 2 are locked out of it. When the robot control AI 3 detects this state, it guides a robot elsewhere so that, after performing other work, it can perform the work at that place once the occupancy has been released.
 The artificial intelligence must be trained on these business stories as well. As a result, the applicability of the usage scenes broadens, and the system is expected to prove its value in settings that pursue efficiency and in sites where the overall time is constrained.
<Robot control AI and robot simulator>
= Explanation of the overall picture =
 FIG. 19 shows an outline of the robot control AI 3 and the simulator 4. In the example of FIG. 19, the control layer 3 includes the request reception layer.
 The robot control AI 3 consists of three layers, each with its own role. Across these three layers, artificial intelligence is used to optimize the route to the destination, and the result is indicated to the robot simulator 41.
 The robot simulator 41 executes the traveling operations from route to route and from route to destination while receiving instructions from layer 3.
 The robot simulator 41 periodically (every second) notifies layer 3 of information such as where the robot 2 currently is and its speed. It has been trained to execute the actions of turning right, turning left, moving forward, moving backward, and stopping, and can make these judgments by itself.
 The sensor is a dedicated function for generating virtual states; it generates events at random in time series. For example, upon receiving an action element, it returns position information (x, y) and speed to the simulator in time series.
<Sharing position information other than one's own>
 FIG. 20 illustrates a state in which a robot control AI 3 shares the position information of robots other than the one it controls.
 Each robot control AI 3 needs to share the state of the entire map at each point in time. The robot simulators 41 require the same condition; in their case, this would take the form of the sensor 42 compiling the individual information of the local sensors and sharing it among the simulators 4. Here, however, no global sensor is assumed, so that no changes arise even if states from actual robots come to be accepted in the future.
 As an alternative, the states coming up from the robot simulators 41 are shared by the robot control AIs through a global cache. Each robot control AI 3 runs in a closed world within an independent session, but if it can obtain information other than its own as other sessions' information, the states have effectively been shared; the global cache is used as the means for this.
 As a result, like a control center, each AI knows the position information of every robot 2, can always grasp the elements obstructing it, and controls the traveling route based on this information.
 The shared information is updated at a fixed time interval (one second) so that the overall information can always be recognized together with the AI's own information.
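 As a minimal sketch of this sharing mechanism (the cache structure and field names are assumptions for illustration), each session writes its own robot state into a shared store every second and reads back the others':

import time

global_cache = {}  # robotId -> latest reported state, shared across sessions

def publish_state(robot_id, x, y, speed):
    # Each robot control AI session writes its own state once per second.
    global_cache[robot_id] = {"x": x, "y": y, "speed": speed, "t": time.time()}

def other_states(robot_id):
    # ...and reads the states of all robots other than itself.
    return {rid: s for rid, s in global_cache.items() if rid != robot_id}

publish_state("robot001", 10.0, 4.0, 0.4)
publish_state("robot002", 22.0, 8.0, 0.2)
print(other_states("robot001"))  # only robot002's state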
<Image of the behavior of the robot control AI>
 FIG. 21 illustrates an example of the behavior of a robot controlled by the robot control AI 3.
 A request is received from the second layer. Its main contents are (1) the start position, (2) the multiple destinations to which the luggage is to be moved, and (3) the number and weight of the pieces of luggage for each destination.
 An advance plan, covering the assignment of waiting robots and the calculation of the optimal order of destinations, is created as a schedule.
 However, once movement actually begins, it is not always possible to proceed as initially planned; unexpected events such as obstacles or occupied destinations arise suddenly. What is learned is therefore the ability to respond to these constantly changing events.
= What does the robot control AI learn? =
 The learning needed to instruct the robot in fixed behavior produces the next action from the external instructions and the state sent from the robot; this is the content that is learned. (An action is produced from the state and the input, and the reward is computed from the result, i.e. the previous state.)
 The reward is evaluated by how well the expected result was obtained, and a score is computed by weighting the evaluation items.
 By the optimal action-value function (the Bellman equation), the next action is learned so as to obtain the maximum value.
<Overall system configuration image>
 FIG. 22 illustrates the overall configuration of the robot control system of the present embodiment.
= Robot control AI =
1. When luggage information is received as free-form text, the content of the text is analyzed and converted into standardized request information.
2. The request information is understood and a story is generated dynamically:
 ・determining the order in which the destinations are visited;
 ・the number of pieces of luggage, weight, distance, and time for each destination.
3. The next action is instructed from the current state.
<Control hierarchy>
 FIG. 23 illustrates the control hierarchy of the robot control AI 3 in the robot control system of the present embodiment.
 Each layer has a role and performs independent processing. The robot control AI layer is the core of this processing and performs the actual control. The robot 2's job here is to load luggage at the start place, reach the designated place in the shortest time, and unload the luggage. The loading and unloading work is, at this stage, performed by humans.
 A feature of the processing here is that the work pooling layer 31 gives an independent instruction for each single destination to the input layer of the neural network, so that the work toward multiple destinations is serialized and fed in, in order.
 The robot control AI layer instructs the autonomous robot so that the story corresponding to one processing objective is carried out optimally. This part is trained by reinforcement learning with DDQN.
<What is map information?>
 FIG. 24 illustrates the map information of the present embodiment.
 In the present embodiment, the map information is a numerical representation of the extent of the site, the arrangement of the destinations, and the route structure. It is assumed that multiple robots travel on this map at the same time.
<What does the robot control AI 3 learn?>
= Overview of route optimization learning =
 If the robot 2 is instructed to receive luggage at the starting point and deliver it to several of the destinations A to L, it learns through which order of those destinations the deliveries can be made most efficiently.
= Agent judgment elements =
 Input values: based on the request content, the agent learns to receive the luggage at a predetermined place and deliver it to multiple destinations accurately, within the time limit.
= Elements to learn =
 ・Moving from the waiting area toward the start position, learning to select the optimal route along the way (details are described later).
 ・After loading the luggage, moving toward the first destination, again learning to select the optimal route (details are described later).
 ・Learning to detour around routes that cannot be passed.
  When several other robots are present at the currently occupied (x, y) positions, the robot does not want to take that route to the destination.
  When passage is blocked by obstacles, other robots, or the route state, as with candidates 2 and 3, the optimal feasible route is selected from the multiple route candidates, lowering the candidate number each time while calculating the number of corners and the distance.
  Premise at each point in time: the robot can recognize, from the state, that robots other than itself are scattered somewhere on the roads (routes) of the map.
 ・Learning that it can move from the first destination to the next destination.
 ・When approaching a destination, if it is occupied by another robot, selecting another optimal destination and diverting there. (In this embodiment, loading and unloading of the luggage is assumed to be performed by humans.)
= Positions are determined as position information on the road's central reference line =
 The travel distance reported from the robot 2 is acquired as an actual distance, but it is assumed to deviate from the reference distance within a certain range.
(4) When an obstacle blocking the passage, including another robot, is detected:
 ・The robot first stops for 3 seconds; if the obstacle has disappeared by then, it simply continues.
 ・If the obstacle still exists after 3 seconds, the robot judges that it cannot proceed and calculates the travel time to the destination via a detour route; call this p. When travel is judged impossible, it is assumed by rule that the obstacle will be removed after a 10-second wait; call this q. (The reason for 10 seconds is that the robot cannot know when an obstacle will disappear; after 10 seconds it checks again whether the obstacle is still present.)
 ・If p > q, the robot waits here without detouring; otherwise it detours (see the sketch after this list).
 ・However, since the standard travel time between adjacent routes is 1 m/min, the decision to wait or detour is made by converting from the distance between routes. The distance is calculated from the map information, since the straight-line distance from route to route is known from the x, y coordinates.
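 A minimal sketch of this wait-or-detour decision (the detour time p would in practice come from the route calculation; the example values are illustrative):

WAIT_TIME_Q = 10.0  # by rule, an obstacle is assumed removed after 10 s

def wait_or_detour(detour_time_p):
    # Chosen behavior after an obstacle persists past the 3-second stop.
    # detour_time_p: estimated travel time (s) to the destination via the
    # detour route, converted from the inter-route distance at 1 m/min.
    if detour_time_p > WAIT_TIME_Q:
        return "wait"    # the detour would take longer than waiting it out
    return "detour"

print(wait_or_detour(25.0))  # wait
print(wait_or_detour(6.0))   # detour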
= Distance calculation =
 The distance from point A(x, y) to point B(x, y) is obtained as follows. For a straight-line distance in the horizontal direction, y is fixed and ||A(x) − B(x)|| = m; in the vertical direction, x is fixed and ||A(y) − B(y)|| = m.
 In practice, the number of routes passed through is obtained from the routes nearest to points A and B, and the value is the sum of the inter-route distances plus the nearest distances from both routes:
 ||route 1(y) − A(y)|| + (number of routes) × (inter-route distance, uniform) + ||route 2(y) − B(y)|| = the effective distance (m) between A and B.
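 A small sketch of this effective-distance computation, under the stated assumption of uniformly spaced routes (the spacing value and the example coordinates are illustrative):

ROUTE_SPACING = 2.0  # assumed uniform distance between adjacent routes (m)

def effective_distance(a_y, b_y, route1_y, route2_y, n_routes):
    # ||route1(y) - A(y)|| + n_routes * spacing + ||route2(y) - B(y)||:
    # the nearest-route offsets plus the uniformly spaced routes crossed.
    return abs(route1_y - a_y) + n_routes * ROUTE_SPACING + abs(route2_y - b_y)

# A is 0.5 m from its nearest route, B is 0.3 m from its nearest route,
# and 3 uniformly spaced routes lie between them: 0.5 + 6.0 + 0.3 = 6.8 m
print(effective_distance(1.5, 9.7, 2.0, 10.0, 3))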
(5) For the selection of a route set from the start position to the destination, the algorithm presents four candidates to choose from. At this stage, only the route-to-route relationships are determined; since several intermediate routes must be traversed to reach the destination, the shortest path among them must be learned by the agent itself.
(6) Reward evaluation is performed in two stages (a sketch follows below).
 ・The number of routes from end (start position) to end (destination) is known from the precomputed shortest-path matrix.
  Difference = shortest-path matrix − execution result
  If the difference < 0, reward = difference × 10 (the difference is negative, so multiplying by 10 gives a negative reward).
  If the difference >= 0, reward = difference × 10 (the difference is positive, so multiplying by 10 gives a positive reward).
 ・Note: the difference calculation here uses only the shortest route count and does not take the number of corners into account.
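 A one-function sketch of this reward rule (since both branches multiply the difference by 10, they collapse to a single expression):

def route_reward(shortest_route_count, executed_route_count):
    # Reward = (shortest-path matrix value - execution result) x 10:
    # negative when the agent used more routes than the known shortest path,
    # positive (or zero) when it matched or beat the matrix value.
    return (shortest_route_count - executed_route_count) * 10

print(route_reward(5, 7))  # -20: two routes worse than the shortest path
print(route_reward(5, 5))  #   0: matched the shortest path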
= How to judge, on a route, whether to go straight, up/down, or left/right =
 FIG. 25 illustrates the routes learned by the robot control AI 3 of the present embodiment.
 Connecting route to route is obtained by calculation, but when heading for the next route, whether to go straight on or to turn in the up/down or left/right direction must also be calculated.
= Rule 2 (optimizing the order of the destination array) =
 The sequence numbers in the destination array are changed along the way.
 If a destination is vacant at the stage when the distance to its coordinates comes within a 2 m range, the robot is considered to have occupied it.
  1. Vacant (this robot has come within a 2 m range of the destination)
  2. Occupied (this robot recognizes the occupied state within 3 m of the destination and is stopped)
  3. Occupancy being released (when the other robot moves 2 m or more away from the destination, the work is in the completed state)
  4. Occupancy starting (another robot's distance to the destination has come within the 2 m range)
 With the above as the destination occupancy states, the overall destination occupancy status is made recognizable within the independent session of each robot control AI 3. To that end, the destination occupancy states of all robots 2, one's own and others', are obtained every second and shared among all robots 2. The estimated time until another robot 2 is released from the occupied state is calculated in proportion to the number of pieces of luggage that robot holds. For example, with destination F = 5 pieces, destination H = 7 pieces, and destination J = 3 pieces, having just arrived near destination H:
  Setting of the base occupancy time (default: 5 seconds per piece of luggage), treated as a parameter:
  occupancy start time + (base occupancy time × 7) = scheduled occupancy release time.
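 A sketch of this release-time estimate as just described (the timestamps are illustrative; the per-piece time is the parameter named above):

BASE_OCCUPANCY_TIME = 5.0  # parameter: seconds per piece of luggage (default)

def scheduled_release_time(occupancy_start, luggage_count):
    # Occupancy start time + base occupancy time x luggage count.
    return occupancy_start + BASE_OCCUPANCY_TIME * luggage_count

# Destination H holds 7 pieces; occupancy started at t = 100 s
print(scheduled_release_time(100.0, 7))  # 135.0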
(1) In the destination array, excluding the initial start position, a destination occupied by another robot 2 is first moved as close to the end as possible; this is synonymous with increasing its sequence number.
 Occupancy means the time from when a robot 2 arrives at a destination until it departs for the next destination or the waiting area. The timing for changing the destination sequence numbers is judged immediately before one destination is completed and movement toward the next destination begins. However, the sequence numbers are not swapped if the remaining time until release (scheduled occupancy release time − current time) is less than 20 seconds.
 Otherwise, the sequence numbers of the remaining destinations, excluding the completed ones, are rearranged based on the occupancy status, and processing proceeds to the route determination of (2). If the destination nevertheless turns out to be occupied by another robot upon arrival, the destination order is rearranged here as well.
(2) Route optimization is computed for one section, from a destination (or the start position) to the next destination. Whenever the section changes, the optimal route is recalculated at that stage. → DDQN (optimal action-value function)
 ・On what basis is a destination array judged appropriate? Judging from the example above, between (A->D->J) and (A->J->E), the comparison is (A->D->J) : (A->J->E) = 7 : 5, so the latter is optimal.
 In the number of corners they are 2 : 2, i.e. even. While swapping the destination order, the most appropriate total route count over all destinations is calculated.
= Rule 3 =
 Obstacles are shared so that a robot can recognize, while traveling, at which positions obstacles currently exist. The number and positions of obstacles each robot control AI 3 can recognize at one time are limited to 20 at most. During learning, routes containing obstacles are made unselectable at the selection stage. An obstacle disappears 10 seconds after it occurs.
1. Recognizing that an obstacle exists
 Obstacles can be broadly classified into two types.
 (1) Obstacles that can be passed even though present at a passing point. In this case the other party may be another robot 2: the two may cross and pass each other so as not to collide, or one may be stopped and be overtaken. Since the autonomous robot 2 avoids and passes such obstacles by itself, the robot control AI 3 does not need to know about them.
 (2) Cases where the road is completely impassable and the autonomous robot 2 has no option but to turn back and select another route. In this case, an obstacle comes into being at the moment another autonomous robot 2 recognizes it with its sensor 42 and judges the road impassable. This information is shared across all robot sessions so that all of them recognize it at the same time; that is, the impassability information is recorded as a state with its coordinates (x, y) attached.
 As above, "obstacle" is used here in the sense of (2).
 When selecting the shortest path, if a positional relationship corresponding to the path is detected in the obstacle array, the shortest-path calculation is redone; as a result, a route containing no obstacles is found.
 Therefore, in the learning process, position information must be deliberately inserted at random into the obstacle array so that shortest-path selection is exercised.
2. Recognizing that a previously existing obstacle has disappeared
 An obstacle entry (x, y), held in common in the state s, disappears when a robot passes through the positional relationship (x, y) contained in the obstacle array: if the passing position information and the obstacle position information match within a certain error (a 2 m range), the obstacle is removed.
 Conversely, while an entry remains in the obstacle array, robots deliberately avoid passing that place, and the obstacle would otherwise persist forever. Here, by rule, an obstacle disappears 10 seconds after it occurs (the 10 seconds is a parameter).
<System structure of the robot control AI layer>
 FIG. 26 illustrates the hierarchical structure of the robot control AI 3.
<Creation of the robot simulator adapter>
 FIG. 27 illustrates the robot simulator adapter.
 The stub is a program for testing the robot simulator 41 before interfacing directly with the actual robot control AI 3 application. It reads text data and generates instructions to the autonomous robot 41 so that simulations can be run.
 In the actual configuration, this stub part corresponds to a part of the environment; instead of reading text data, its role is to interface with the autonomous robot in response to requests from the environment.
Note the following:
・The interface format strictly matches the real one.
・The instruction data format is free, but matters such as the issue sequence number and the interface format (JSON) are to be decided.
<Role of the robot simulator>
 The robot simulator 41 carries out instructions from the upper layer by reinforcement-learning the knowledge of the actual robot 2. Its main role is to derive the next action based on the information acquired from the sensor 42, generating actions from states. It operates on instructions from the robot control AI 3; the instruction content is fed to the input layer of the robot simulator.
 1. Instruction data from the robot control AI 3
 2. The current position information list of all robots 41 and obstacles
 3. The state from the sensor 42
 Reinforcement learning by DDQN is performed based on these three states.
 Of these three, consider the feasibility of the second. Originally, an actual physical robot 2 moves while reacting to a dynamically changing outside world based on its own sensors, so learning relies only on its own sensor information at each moment.
 The simulator 41, however, needs to capture the movement events of multiple robots at the same time. On the robot control AI 3 side, the upper layer, each robot 41 reports its state, so control is possible by sharing those states; the simulator 4 must do the same.
 In practice, the simulator 4's learning method is to learn to react to dynamic and sudden events that the sensor 42 side generates periodically. However, since each sensor 42 holds only its own sensor information, the information given to a simulator 4 is closed within itself. Supposing, for example, that five robots 2 are moving at the same time, each of the five travels in its own closed world; yet what unfolds across the whole map is five robots 2 running on instructions from above, with no dependence on one another.
 Seen from a simulator 4, the others each running on their own looks like random, arbitrary movement. By sharing the states from the five sensors 42, the distances to the other robots and to the obstacles, which periodically appear and disappear, can be acquired. Collision judgment also becomes possible and is used in the reward judgment; a collision ends the episode.
 In summary, each robot simulator 41 must be able to constantly acquire the states of all running robot simulators 41, including its own. However, a sensor 42 (local sensor) need only generate the state specific to its own robot; it is the global sensor that compiles all the local sensor information. The hierarchical relationship is shown below.
<Interface with Unity (registered trademark)>
 FIG. 28 illustrates the interface between the robot simulator 41 and Unity (registered trademark) 5, which performs image display.
 The interface between the robot simulator 41 and Unity 5, which performs real-time image display, is defined as follows. Each whole instant of the current image is treated as one unit, and these units are strung together into a moving image, so the position information of all robots 41 and of all obstacles is handed over as a list. The handover interval is 100 ms.
= Preconditions for the position information =
 The position information on the coordinate axes is expressed as (x, y), which is a point. Since the robots 41 and the obstacles are displayed with volume, the question is which position this coordinate point designates. For a robot, the position is the black dot at its front; for an obstacle, it is fixed at the left front.
 FIG. 29 shows an example of a transport robot and obstacles.
 Since the road width is 2 m, the robot 41 can pass obstacles 1 and 2, but obstacle 3 cannot be passed; the robot must stop, then back up and head for a detour route.
 Table 1
Figure JPOXMLDOC01-appb-I000016
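 As an illustrative sketch of this 100 ms handover (the record layout is an assumption; the actual field set is given in Table 1), one snapshot per tick bundles every robot and obstacle position into a single list:

def build_snapshot(robots, obstacles):
    # One display frame for Unity: the positions of all robots and all
    # obstacles at the current instant, handed over every 100 ms.
    return {
        "robots": [{"id": r["id"], "x": r["x"], "y": r["y"]} for r in robots],
        "obstacles": [{"x": o["x"], "y": o["y"]} for o in obstacles],
    }

robots = [{"id": "robot001", "x": 10.0, "y": 4.0}]
obstacles = [{"x": 6.5, "y": 4.0}]
print(build_snapshot(robots, obstacles))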
<Role of the sensor>
 The sensor 42 has the role of generating its own state and sending it to the simulator 4 side. It would normally be built into the physical robot 2, but since actual robots 2 differ in use by purpose and have a variety of functions, a simulator 41 that does not depend on those functions was prepared.
 The simulator 41 has the same knowledge as the robot 2 and is trained in advance by reinforcement learning to a state in which it can act autonomously, while the sensor 42 is dedicated to the role of generating physical states.
 The sensor 42 plays the role of computing, in time series, the current states that can arise as its robot travels, so it requires no reinforcement learning. The simulator 4 decides what action to take in response to those results, and that is where reinforcement learning is required.
 The sensor 42 must generate its own state periodically (every 100 ms). That state lives in the two-dimensional coordinate space of the statically defined map information; the sensor repeatedly performs the calculations that evolve the robot's situation in time series according to the actions received from the simulator 4, and its role is to return the results to the simulator 4 as needed.
 However, this covers only the robot's own world and does not include the events of the other simulators 4 occurring across the whole map; that is, the robot cannot recognize another robot 41 approaching it or an obstacle that has suddenly appeared. Since the robot control AI 3 knows, among the states received from the multiple robot simulators 41, which states must be shared, these states need to be incorporated into each robot's own state periodically (e.g., every 100 ms). None of this is necessary when a physical robot acts in a physical place, but it is necessary in the simulation space, and this part is the key to the simulation.
 As the method, a group sensor that compiles the sensors 42 of all robots is prepared, with control in two tiers: local sensors and a global sensor. Besides the state of each robot, the global sensor also plays the role of scripting the creation and disappearance of obstacles and raising events at random.
<Global sensor and local sensors>
 FIG. 30 illustrates the global sensor and the local sensors.
 When handing sensor information to the robot simulators 41, the global sensor acquires the state of each local sensor, compiles all of these local sensor states into an overall state list, and hands this compiled state list to the robot simulators 41.
 When learning, a robot simulator 41 grasps all the other events occurring at the current moment as state and decides its action.
<Autonomous robot simulator>
 FIG. 31 illustrates the learning inside the robot simulator 41.
= Role of the sensor =
 1. Based on the map information, periodically determine the robot's own current position while traveling and report it to the simulator 4 (the robot's own position, travel distance, obstacles, and distances to walls are calculated and handed over each time).
 2. Periodically inspect and report the distances to the walls and obstacles ahead and behind.
 3. Generate a new state in reaction to the content of the action a from the simulator 4's agent.
 4. Internally, between the sensor 42 and the simulator 4, the state changes periodically (every 100 ms), but the simulator 4 returns to the robot control AI 3 only once per second.
 5. The robot's own position, travel distance, obstacles, and distances to walls are calculated and handed over each time.
 6. The sensor 42 receives the overall state list and recognizes the positions of the other robots 41 and of the obstacles.
<Communication between the robot simulator and the sensor>
 FIG. 32 illustrates the communication performed between the robot simulator 41 and the sensor 42.
<Sensor recognition (synchronization with the robot side)>
 FIG. 33 illustrates the sensors 42.
 ・The 32 sensors 42, each with the two elements (1) distance and (2) angle, make 64 units. Since each of the 64 units (neurons) further holds the four values (1) wall, (2) obstacle, (3) white line (nearest), and (4) white line (second nearest), this gives 64 × 4 = 256 neurons.
= Recognition of the sensor elements =
 FIG. 34 illustrates an example of the arrangement of the sensors 42 and of the state array that holds the information from the sensors 42.
 The distances to walls, obstacles, and white lines are acquired by a total of 32 sensors 42, 16 at the front and 16 at the rear. Each individual sensor 42 independently returns a value as its state, and the overall state must be recognized from these independent values (a sketch of such a state vector follows below).
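 A sketch of how such a 256-value state vector could be flattened (the ordering is an illustrative assumption; the actual arrangement is the one shown in FIG. 34):

N_SENSORS = 32                          # 16 front + 16 rear
CHANNELS = ["wall", "obstacle", "white_line_1", "white_line_2"]

def flatten_state(readings):
    # readings[sensor_id][channel] -> (distance, angle).
    # 32 sensors x 4 channels x (distance, angle) = 256 state values.
    state = []
    for sensor_id in range(N_SENSORS):
        for channel in CHANNELS:
            distance, angle = readings[sensor_id][channel]
            state.extend([distance, angle])
    return state  # len(state) == 256

# Dummy readings: every channel 5.0 m away at the sensor's fixed angle
dummy = {s: {c: (5.0, 11.25 * s) for c in CHANNELS} for s in range(N_SENSORS)}
assert len(flatten_state(dummy)) == 256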
<States generated by the sensor>
 Table 2
Figure JPOXMLDOC01-appb-I000017
<Robot sensors>
 FIG. 35 illustrates the distances measured by the sensors 42 of the robots 2 and 41.
 The dashed arrows are the straight-ahead distances to the wall. The sensor IDs are fixed: when the vehicle turns its steering, its orientation changes, but straight ahead is always a given sensor ID. This is unrelated to the DQN; the virtual robots 2 and 41 use it when calculating their own position.
<Accumulation of robot and sensor states>
= Capturing changes in actions =
1. Record which of the current speed stages (1, 2, 3, 4, 5) applies.
 ・Speed per stage: 0.55 m/1000 ms.
 ・Calculated as the cumulative speed over accelerator presses (speed = current stage count × 0.2 m/1000 ms).
 ・One brake press is calculated as a decrement (speed = (current stage count − 1) × 0.2 m/1000 ms).
2. Record which of the current right-steering stages (0, +1, +2, +3 … +15) applies: 16 stages to the right of center.
 ・11.25 degrees per stage, up to a maximum of 180 degrees.
 ・Right steering accumulates with each press (heading angle = (current stage count + 1) × 11.25 degrees).
3. Record which of the current left-steering stages (0, −1, −2, −3 … −15) applies: 16 stages to the left of center.
 ・11.25 degrees per stage, up to a maximum of 180 degrees.
 ・Left steering accumulates with each press (heading angle = (current stage count − 1) × 11.25 degrees).
4. Backing is valid only when speed = 0 (stationary); the robot moves 0.4 m backward with the current steering stage unchanged. (A sketch of this action accumulation follows below.)
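 A minimal sketch of this action accumulation (the class shape is an assumption; for simplicity the heading is taken as the signed stage count times 11.25 degrees, a slight simplification of the (stage ± 1) formulas in items 2 and 3):

STEP_SPEED = 0.2     # m per 1000 ms added per accelerator stage
STEER_STEP = 11.25   # degrees per steering stage

class VehicleState:
    def __init__(self):
        self.stage = 0   # speed stage (0..5)
        self.steer = 0   # steering stage (-15..+15, right positive)

    def accelerate(self):
        self.stage = min(self.stage + 1, 5)

    def brake(self):
        self.stage = max(self.stage - 1, 0)

    def steer_right(self):
        self.steer = min(self.steer + 1, 15)

    def steer_left(self):
        self.steer = max(self.steer - 1, -15)

    @property
    def speed(self):     # cumulative speed (m / 1000 ms)
        return self.stage * STEP_SPEED

    @property
    def heading(self):   # heading angle in degrees from center
        return self.steer * STEER_STEP

v = VehicleState()
v.accelerate(); v.accelerate(); v.steer_right()
print(v.speed, v.heading)  # 0.4 11.25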
= Recording the state of the sensors =
 1. The angle is held uniquely per sensor ID.
 2. The distance is held in four kinds: to the wall, to an obstacle, to the nearest line (right or left), and to the second-nearest line (right or left).
 The distance calculation is performed with trigonometric functions each time; the calculation method is shown separately.
= Calculating the vehicle's current position (GPS) information =
 FIG. 36 illustrates the road model assumed in the present embodiment.
 The current position is calculated on the two-dimensional coordinate axes of latitude and longitude (x axis, y axis), computing the positional relationships within those coordinate axes.
<Calculation of position information>
= Calculating the latitude and longitude after the vehicle has traveled some distance in a given direction =
 FIG. 37 illustrates the relationship between the travel distance and the position information (latitude and longitude).
(1) One accelerator press advances the vehicle 0.2 m/s; the speed is determined by the current accelerator stage count.
(2) The elapsed time is calculated as the difference between the previous time and the current time.
(3) Once the elapsed time is determined, multiplying it by the speed for the accelerator stage count gives the travel distance.
Example:
 Current stage count: 2
 Current speed: 0.2 × 2 = 0.4 m/1000 ms
 Elapsed time: previous: 11:35:20.234; this time: 11:35:20.350; difference: 116 ms
 Angle: one steering press: 11.25 degrees
 Travel distance: 116 ms ÷ 1000 ms × 0.4 m = 0.0464 m
 C = 0.0464 m
(1) Calculation of the x axis of latitude and longitude
 cos θ (11.25 degrees) = A ÷ C
 A = 0.0464 m × cos θ (11.25 degrees)
 A = 0.0464 × 0.98078528 = 0.04550844 m
(2) Calculation of the y axis of latitude and longitude
 sin θ (11.25 degrees) = B ÷ C
 B = sin θ (11.25 degrees) × C
 B = 0.195090 × 0.0464 m = 0.00905218 m
(3) Position of the vehicle
 Position (X, Y) = (0.04550844 m, 0.00905218 m) + (P, Q)
        = (0.04550844 m + P, 0.00905218 m + Q)
 That is, this is the distance moved in about 0.1 seconds at a speed of 0.4 m/s.
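 A sketch of this position update as a single function, reproducing the worked example above (the previous position (P, Q) is whatever was computed at the prior tick):

import math

STEP_SPEED = 0.2  # m per 1000 ms per accelerator stage

def update_position(p, q, stage, elapsed_ms, heading_deg):
    # Advance the previous position (P, Q) by the distance traveled during
    # elapsed_ms at the speed for the current stage, along heading_deg.
    speed = stage * STEP_SPEED           # m per 1000 ms
    c = elapsed_ms / 1000.0 * speed      # distance traveled (hypotenuse C)
    theta = math.radians(heading_deg)
    return p + c * math.cos(theta), q + c * math.sin(theta)

# Worked example from the text: stage 2, 116 ms elapsed, heading 11.25 degrees
x, y = update_position(0.0, 0.0, 2, 116, 11.25)
print(round(x, 8), round(y, 8))  # approx. 0.04550844 0.00905219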
<ハードウェア>
 図38は、本実施形態に係るロボット制御システムに用いられるコンピュータのハードウェア構成例を示す図である。コンピュータは、例えばワークステーションやパーソナルコンピュータのような汎用コンピュータとしてもよいし、あるいはクラウド・コンピューティングによって論理的に実現されてもよい。なお、図示された構成は一例であり、これ以外の構成を有していてもよい。
<Hardware>
FIG. 38 is a diagram showing a hardware configuration example of a computer used in the robot control system according to the present embodiment. The computer may be a general-purpose computer such as a workstation or a personal computer, or may be logically realized by cloud computing. The illustrated configuration is an example, and may have other configurations.
 コンピュータは、少なくとも、プロセッサ20、メモリ21、ストレージ22、送受信部23、入出力部24等を備える。プロセッサ20は、コンピュータ全体の動作を制御し、各要素間におけるデータの送受信の制御、ならびにアプリケーションの実行および認証処理に必要な情報処理等を行う演算装置である。たとえばプロセッサ20はCPU(Central Processing Unit)であり、ストレージ22に格納されメモリ21に展開されたプログラム等を実行して各情報処理を実施する。メモリ21は、DRAM(Dynamic Random Access Memory)等の揮発性記憶装置で構成される主記憶と、フラッシュメモリやHDD(Hard Disc Drive)等の不揮発性記憶装置で構成される補助記憶とを含む。メモリ21は、プロセッサ20のワークエリア等として使用され、また、コンピュータの起動時に実行されるBIOS(Basic Input / Output System)、及び各種設定情報等を格納する。ストレージ22は、アプリケーション・プログラム等の各種プログラムを格納する。各処理に用いられるデータを格納したデータベースがストレージ22に構築されていてもよい。送受信部23は、コンピュータをネットワークおよびブロックチェーンネットワークに接続する。なお、送受信部23は、Bluetooth(登録商標)及びBLE(Bluetooth Low Energy)の近距離通信インタフェースを備えていてもよい。入出力部24は、キーボード・マウス類等の情報入力機器、及びディスプレイ等の出力機器である。 The computer includes at least a processor 20, a memory 21, a storage 22, a transmission / reception unit 23, an input / output unit 24, and the like. The processor 20 is an arithmetic unit that controls the operation of the entire computer, controls the transmission and reception of data between each element, and performs information processing necessary for application execution and authentication processing. For example, the processor 20 is a CPU (Central Processing Unit), and executes each information processing by executing a program or the like stored in the storage 22 and expanded in the memory 21. The memory 21 includes a main memory composed of a volatile storage device such as a DRAM (Dynamic Random Access Memory) and an auxiliary storage composed of a non-volatile storage device such as a flash memory or an HDD (Hard Disc Drive). The memory 21 is used as a work area or the like of the processor 20, and also stores a BIOS (Basic Input / Output System) executed when the computer is started, various setting information, and the like. The storage 22 stores various programs such as application programs. A database storing data used for each process may be built in the storage 22. The transmission / reception unit 23 connects the computer to the network and the blockchain network. The transmission / reception unit 23 may be provided with a short-range communication interface of Bluetooth (registered trademark) and BLE (Bluetooth Low Energy). The input / output unit 24 is an information input device such as a keyboard and a mouse, and an output device such as a display.
Each of the request reception layer 31, the control layer (robot control AI 3), and the execution layer (simulator 4) of the robot control system according to the present embodiment is realized by the processor 20 of the computer reading a program stored in the storage 22 into the memory 21 and executing it. The learning results (models) produced by the robot control AI, as well as map information, route information, and the like, can be stored in, for example, a storage area provided by the memory 21 or the storage 22.
<Hardware>
FIG. 99 is a diagram showing a hardware configuration example of a computer used in the robot control system according to the present embodiment. The computer may be a general-purpose computer such as a workstation or a personal computer, or may be logically realized by cloud computing. The illustrated configuration is an example, and other configurations are possible.
The computer includes at least a processor 20, a memory 21, a storage 22, a transmission/reception unit 23, and an input/output unit 24; these components are the same as those described above with reference to FIG. 38.
Each of the scheduler of the second layer and the request reception layer 31, the control layer (robot control AI 3), and the execution layer (simulator 4) of the third-layer robot control system according to the present embodiment is realized by the processor 20 of the computer reading a program stored in the storage 22 into the memory 21 and executing it. The various storage units in the second layer, as well as the learning results (models) produced by the robot control AI in the third layer, map information, route information, and the like, can be stored in, for example, a storage area provided by the memory 21 or the storage 22.
Although the present embodiment has been described above, the above embodiment is intended to facilitate understanding of the present invention, not to limit its interpretation. The present invention may be modified and improved without departing from its spirit, and equivalents thereof are also included in the present invention.
2  Robot
3  Robot control AI
31 Request reception layer
32 Work pooling layer
4  Simulator
41 Robot simulator
42 Sensor
43 Adapter

Claims (5)

  1.  A system for controlling a plurality of robots, comprising:
     a work storage unit that stores a plurality of tasks to be performed by the robots;
     an allocation processing unit that assigns each of the tasks to a robot;
     a transmission unit that transmits the assigned task to a control device of the robot; and
     a status acquisition unit that acquires an operating status of the robot,
     wherein the allocation processing unit changes the allocation destination of a task according to the operating status.
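Claim 1 amounts to a feedback loop: assign a task, transmit it to the robot's controller, observe the robot's status, and reassign when the status calls for it. Below is a minimal Python sketch of such a loop, offered only as an illustration; the class names, status values, and the reassign-on-error policy are assumptions, not taken from the specification.

```python
from dataclasses import dataclass, field

@dataclass
class Robot:
    robot_id: str
    status: str = "idle"   # assumed states: "idle", "busy", "error"

@dataclass
class Allocator:
    robots: list[Robot]
    pending: list[str] = field(default_factory=list)         # work storage unit
    assigned: dict[str, str] = field(default_factory=dict)   # task -> robot_id

    def assign(self, task: str) -> None:
        # Allocation processing unit: pick an idle robot, or queue the task.
        robot = next((r for r in self.robots if r.status == "idle"), None)
        if robot is None:
            self.pending.append(task)
            return
        self.assigned[task] = robot.robot_id
        self.send(robot, task)

    def send(self, robot: Robot, task: str) -> None:
        # Transmission unit: forward the task to the robot's control device.
        print(f"send {task} to controller of {robot.robot_id}")

    def on_status(self, robot: Robot, status: str) -> None:
        # Status acquisition unit: on failure, change the allocation destination.
        robot.status = status
        if status == "error":
            for task, rid in list(self.assigned.items()):
                if rid == robot.robot_id:
                    del self.assigned[task]
                    self.assign(task)
```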
  2.  The robot control system according to claim 1, wherein the allocation processing unit allocates one task to one or a plurality of the robots according to a first amount of work required for the task and a second amount of work that each robot can perform.
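One plain reading of claim 2 is that a task whose required work exceeds any single robot's capacity gets divided among several robots. A minimal sketch of one such division, assuming a simple greedy split (the split strategy and the function name are illustrative, not from the specification):

```python
def split_task(required: float, capacities: dict[str, float]) -> dict[str, float]:
    """Divide a task's first work amount across robots whose second work
    amount (remaining capacity) may each be smaller than the task."""
    shares: dict[str, float] = {}
    remaining = required
    for robot_id, capacity in sorted(capacities.items(), key=lambda kv: -kv[1]):
        if remaining <= 0:
            break
        share = min(capacity, remaining)
        if share > 0:
            shares[robot_id] = share
            remaining -= share
    if remaining > 0:
        raise ValueError("insufficient total capacity")
    return shares

# A 10-unit task split over robots able to take 6 and 5 units respectively.
print(split_task(10.0, {"r1": 6.0, "r2": 5.0}))  # {'r1': 6.0, 'r2': 4.0}
```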
  3.  The robot control system according to claim 1, wherein the allocation processing unit assigns the tasks to the robots such that the cumulative amount of work assigned to each of the plurality of robots over a predetermined period is smoothed.
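One way to smooth cumulative load, sketched below, is to hand each new task to the robot with the smallest running total and reset the totals at each period boundary. This is only an assumed policy consistent with the claim wording; the class name and reset behavior are illustrative.

```python
class SmoothingAllocator:
    """Assign each task to the robot with the smallest cumulative work in the
    current period, so the per-robot totals stay level."""

    def __init__(self, robot_ids: list[str]) -> None:
        self.cumulative: dict[str, float] = {r: 0.0 for r in robot_ids}

    def assign(self, amount: float) -> str:
        # Pick the least-loaded robot and add the task's work amount to it.
        robot_id = min(self.cumulative, key=lambda r: self.cumulative[r])
        self.cumulative[robot_id] += amount
        return robot_id

    def new_period(self) -> None:
        # Reset the totals at the boundary of the predetermined period.
        self.cumulative = {r: 0.0 for r in self.cumulative}

alloc = SmoothingAllocator(["r1", "r2", "r3"])
for amount in [5.0, 3.0, 4.0, 2.0, 6.0]:
    print(alloc.assign(amount), alloc.cumulative)
```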
  4.  The robot control system according to claim 1, wherein the status acquisition unit acquires information indicating the operating status from the control device of the robot and from a sensor independent of the robot.
  5.  The robot control system according to claim 1, further comprising a ledger that records, as debits and credits, at least the occupied time during which each robot is occupied by the tasks, with at least each robot and the operation as a whole as account items.
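Claim 5 describes bookkeeping of robot-occupied time in double-entry style. The sketch below records each occupation as a debit against the robot's account and a matching credit against a fleet-wide operation account; the account naming scheme and the posting convention are assumptions for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class OccupancyLedger:
    """Record robot-occupied time double-entry style: debit the robot's
    account, credit the operation-as-a-whole account."""
    entries: list[tuple[str, str, float]] = field(default_factory=list)

    def post(self, robot_id: str, seconds: float) -> None:
        # One posting per task occupation, balanced across the two accounts.
        self.entries.append(("debit", f"robot:{robot_id}", seconds))
        self.entries.append(("credit", "operation:total", seconds))

    def balance(self, account: str) -> float:
        debits = sum(s for side, a, s in self.entries if a == account and side == "debit")
        credits = sum(s for side, a, s in self.entries if a == account and side == "credit")
        return debits - credits

ledger = OccupancyLedger()
ledger.post("r1", 120.0)   # r1 occupied for 120 s by one task
ledger.post("r2", 45.0)
print(ledger.balance("robot:r1"), ledger.balance("operation:total"))  # 120.0 -165.0
```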
PCT/JP2020/000203 2020-01-07 2020-01-07 Robot control system WO2021140577A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/000203 WO2021140577A1 (en) 2020-01-07 2020-01-07 Robot control system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/000203 WO2021140577A1 (en) 2020-01-07 2020-01-07 Robot control system

Publications (1)

Publication Number Publication Date
WO2021140577A1 true WO2021140577A1 (en) 2021-07-15

Family

ID=76788484

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/000203 WO2021140577A1 (en) 2020-01-07 2020-01-07 Robot control system

Country Status (1)

Country Link
WO (1) WO2021140577A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005324278A (en) * 2004-05-13 2005-11-24 Honda Motor Co Ltd Robot control device
JP2009028831A (en) * 2007-07-26 2009-02-12 Panasonic Electric Works Co Ltd Work robot system
JP2009136932A (en) * 2007-12-04 2009-06-25 Honda Motor Co Ltd Robot and task execution system
JP2018047536A (en) * 2016-09-23 2018-03-29 カシオ計算機株式会社 Robot, fault diagnosis system, fault diagnosis method, and program

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114290329A (en) * 2021-12-13 2022-04-08 库卡机器人制造(上海)有限公司 Calibration control method and system for robot, storage medium and robot assembly
CN114290329B (en) * 2021-12-13 2023-09-05 库卡机器人制造(上海)有限公司 Calibration control method and system for robot, storage medium and robot assembly
CN114418461A (en) * 2022-03-28 2022-04-29 浙江凯乐士科技集团股份有限公司 Task allocation method and device for shuttle vehicle and electronic equipment
WO2023233977A1 (en) * 2022-05-31 2023-12-07 オムロン株式会社 Information processing device, information processing method, and information processing program
CN117035587A (en) * 2023-10-09 2023-11-10 山东省智能机器人应用技术研究院 Multiple-robot cooperative work management system based on cargo information
CN117035587B (en) * 2023-10-09 2024-01-16 山东省智能机器人应用技术研究院 Multiple-robot cooperative work management system based on cargo information

Similar Documents

Publication Publication Date Title
WO2021140577A1 (en) Robot control system
JP7282850B2 (en) Methods, systems and apparatus for controlling movement of transport devices
Lee et al. Smart robotic mobile fulfillment system with dynamic conflict-free strategies considering cyber-physical integration
US20210078175A1 (en) Method, server and storage medium for robot routing
Vis Survey of research in the design and control of automated guided vehicle systems
US20190152057A1 (en) Robotic load handler coordination system, cell grid system and method of coordinating a robotic load handler
Srivastava et al. Development of an intelligent agent-based AGV controller for a flexible manufacturing system
JP2022533784A (en) Warehousing task processing method and apparatus, warehousing system and storage medium
US20200247611A1 (en) Object handling coordination system and method of relocating a transporting vessel
CN106228302A (en) A kind of method and apparatus for carrying out task scheduling in target area
Gambardella et al. Agent-based planning and simulation of combined rail/road transport
Hartmann Scheduling reefer mechanics at container terminals
López et al. A simulation and control framework for AGV based transport systems
Basile et al. An auction-based approach to control automated warehouses using smart vehicles
Xu et al. Dynamic spare point application based coordination strategy for multi-AGV systems in a WIP warehouse environment
Branisso et al. A multi-agent system using fuzzy logic to increase AGV fleet performance in warehouses
Mahdavi et al. Optimal trajectory and schedule planning for autonomous guided vehicles in flexible manufacturing system
AU2022331927A1 (en) A hybrid method for controlling a railway system and an apparatus therefor
JP2023024414A (en) Method, device, and facility for arranging delivery of articles and recording medium
Singgih et al. Architectural design of terminal operating system for a container terminal based on a new concept
WO2020003988A1 (en) Information processing device, moving device, information processing system, method, and program
Iida et al. Negotiation Algorithm for Multi-agent Pickup and Delivery Tasks
Sin Development of task distribution algorithm for multi-robot coordination system
Evers Real-time hiring of vehicles for container transport
Yu et al. The Research on the Container Truck Scheduling Based on Fuzzy Control and Ant Colony Algorithm

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20912222

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12/10/2022)

NENP Non-entry into the national phase

Ref country code: JP

122 Ep: pct application non-entry in european phase

Ref document number: 20912222

Country of ref document: EP

Kind code of ref document: A1