CN116501042A - Target tracking method and device based on robot vision
- Publication number
- CN116501042A (application number CN202310381740.3A)
- Authority
- CN
- China
- Prior art keywords
- target
- robot
- frame image
- tracking
- predicted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0231—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
- G05D1/0246—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
- G05D1/0251—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means extracting 3D information from a plurality of images taken from different locations, e.g. stereo vision
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0212—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
- G05D1/0221—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
Abstract
The application provides a target tracking method and device based on robot vision, wherein the method comprises the following steps: determining a tracking target according to a first frame image acquired by a camera on the robot; processing the current frame image acquired by the camera by adopting a single-target tracking algorithm, and determining the position information of a predicted target corresponding to the tracking target in the current frame image; calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot; and finally, controlling the robot to move towards the predicted target according to the real-time moving direction information, so as to realize active tracking of the tracked target. According to the method, under the condition that other auxiliary hardware is not available and the three-dimensional coordinates of the tracking target are not obtained, active tracking of the tracking target can be achieved only based on the first frame image and the current frame image obtained by the robot camera.
Description
Technical Field
The invention belongs to the technical field of robot control, and particularly relates to a target tracking method and device based on robot vision.
Background
At present, intelligent quadruped robots have human-like perception and reasoning capabilities, can independently and adaptively execute tasks assigned by people, realize close human-machine-environment interaction, and are widely used in both military and civilian fields. Therefore, achieving active target tracking on a robotic platform is an important task in developing intelligent robots.
Existing active tracking for quadruped robots is designed primarily around ultrasound, radar, GPS and similar sensors, which require additional expense and modification. Moreover, not all quadruped robots are equipped with an ultrasound module, a radar module or a GPS networking module. Active tracking based on an ultrasound module requires the tracked target to carry a transmitter, which limits the usage scenarios of active robot tracking, in particular the tracking of a specific, previously unseen target; active tracking algorithms based on radar and GPS require the three-dimensional coordinates of the tracked target, which are generally not available for an arbitrary target.
Therefore, a method for active robot tracking that needs neither auxiliary hardware nor the three-dimensional coordinates of the tracked target remains to be developed.
Disclosure of Invention
In view of this, the present invention provides a target tracking method and apparatus based on robot vision, and the main purpose of the present invention is to realize active tracking of a tracked target based on robot vision only without other auxiliary hardware and without acquiring three-dimensional coordinates of the tracked target.
According to a first aspect of the present invention, there is provided a target tracking method based on robot vision, comprising:
determining a tracking target according to a first frame image acquired by a camera on the robot;
processing a current frame image acquired by the camera by adopting a single-target tracking algorithm, and determining a predicted target corresponding to the tracking target and position information corresponding to the predicted target in the current frame image;
calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot;
and controlling the robot to move towards the prediction target according to the real-time movement direction information.
According to a second aspect of the present invention, there is provided a robot vision-based object tracking apparatus comprising:
The first determining module is used for determining a tracking target according to a first frame image acquired by a camera on the robot;
the second determining module is used for processing the current frame image acquired by the camera by adopting a single-target tracking algorithm, and determining a predicted target corresponding to the tracking target in the current frame image and the position information corresponding to the predicted target;
the real-time moving direction information calculation module is used for calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot;
and the control module is used for controlling the robot to move towards the prediction target according to the real-time moving direction information.
According to a third aspect of the present invention there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method described above.
According to a fourth aspect of the present invention there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program implementing the steps of the method described above when executed by the processor.
By means of the technical scheme, the technical scheme provided by the embodiment of the invention has at least the following advantages:
compared with the prior art, the target tracking method and device based on robot vision provided by the invention determine a tracking target according to the first frame image acquired by the camera on the robot; process the current frame image acquired by the camera with a single-target tracking algorithm and determine the position information of the predicted target corresponding to the tracking target in the current frame image; calculate the real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot; and finally control the robot to move towards the predicted target according to the real-time moving direction information, so as to realize active tracking of the tracked target. With this method, active tracking of the tracked target can be achieved based only on the first frame image and the current frame image obtained by the robot's camera, without any other auxiliary hardware and without obtaining the three-dimensional coordinates of the tracked target.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
Fig. 1 shows an application scenario schematic diagram of a target tracking method based on robot vision provided by an embodiment of the present invention;
fig. 2 is a schematic flow chart of a target tracking method based on robot vision according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of another object tracking method based on robot vision according to an embodiment of the present invention;
fig. 4 shows a block diagram of a target tracking device based on robot vision according to an embodiment of the present invention;
fig. 5 is a block diagram of an electronic device for implementing a method of an embodiment of the invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the invention. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 illustrates a schematic diagram of an exemplary system 100 in which various methods and apparatus described herein may be implemented, in accordance with an embodiment of the present disclosure. Referring to fig. 1, the system 100 includes one or more client devices 101, 102, 103, 104, 105, and 106, a server 120, and one or more communication networks 110 coupling the one or more client devices to the server 120. Client devices 101, 102, 103, 104, 105, and 106 may be configured to execute one or more applications.
In embodiments of the present disclosure, the server 120 may run one or more services or software applications that enable execution of the robot vision-based target tracking method.
In some embodiments, server 120 may also provide other services or software applications that may include non-virtual environments and virtual environments. In some embodiments, these services may be provided as web-based services or cloud services, for example, provided to users of client devices 101, 102, 103, 104, 105, and/or 106 under a software as a service (SaaS) model.
In the configuration shown in fig. 1, server 120 may include one or more components that implement the functions performed by server 120. These components may include software components, hardware components, or a combination thereof that are executable by one or more processors. A user operating client devices 101, 102, 103, 104, 105, and/or 106 may in turn utilize one or more client applications to interact with server 120 to utilize the services provided by these components. It should be appreciated that a variety of different system configurations are possible, which may differ from system 100. Accordingly, FIG. 1 is one example of a system for implementing the various methods described herein and is not intended to be limiting.
The user may receive the first classification result using client devices 101, 102, 103, 104, 105, and/or 106. The client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via the interface. Although fig. 1 depicts only six client devices, those skilled in the art will appreciate that the present disclosure may support any number of client devices.
Client devices 101, 102, 103, 104, 105, and/or 106 may include various types of computer devices, such as portable handheld devices, general purpose computers (such as personal computers and laptop computers), workstation computers, wearable devices, smart screen devices, self-service terminal devices, service robots, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and the like. These computer devices may run various types and versions of software applications and operating systems, such as MICROSOFT Windows, APPLE iOS, a UNIX-like operating system, Linux, or a Linux-like operating system (e.g., GOOGLE Chrome OS); or include various mobile operating systems such as MICROSOFT Windows Mobile OS, iOS, Windows Phone, Android. Portable handheld devices may include cellular telephones, smart phones, tablet computers, Personal Digital Assistants (PDAs), and the like. Wearable devices may include head-mounted displays (such as smart glasses) and other devices. The gaming system may include various handheld gaming devices, Internet-enabled gaming devices, and the like. The client device is capable of executing a variety of different applications, such as various Internet-related applications, communication applications (e.g., email applications), Short Message Service (SMS) applications, and may use a variety of communication protocols.
Network 110 may be any type of network known to those skilled in the art that may support data communications using any of a number of available protocols, including but not limited to TCP/IP, SNA, IPX, etc. For example only, the one or more networks 110 may be a Local Area Network (LAN), an ethernet-based network, a token ring, a Wide Area Network (WAN), the internet, a virtual network, a Virtual Private Network (VPN), an intranet, an extranet, a Public Switched Telephone Network (PSTN), an infrared network, a wireless network (e.g., bluetooth, WIFI), and/or any combination of these and/or other networks.
The server 120 may include one or more general purpose computers, special purpose server computers (e.g., PC (personal computer) servers, UNIX servers, mid-end servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement and/or combination. The server 120 may include one or more virtual machines running a virtual operating system, or other computing architecture that involves virtualization (e.g., one or more flexible pools of logical storage devices that may be virtualized to maintain virtual storage devices of the server). In various embodiments, server 120 may run one or more services or software applications that provide the functionality described below.
The computing units in server 120 may run one or more operating systems including any of the operating systems described above as well as any commercially available server operating systems. Server 120 may also run any of a variety of additional server applications and/or middle tier applications, including HTTP servers, FTP servers, CGI servers, JAVA servers, database servers, etc.
In some implementations, server 120 may include one or more applications to analyze and consolidate data feeds and/or event updates received from users of client devices 101, 102, 103, 104, 105, and 106. Server 120 may also include one or more applications to display data feeds and/or real-time events via one or more display devices of client devices 101, 102, 103, 104, 105, and 106.
In some implementations, the server 120 may be a server of a distributed system or a server that incorporates a blockchain. The server 120 may also be a cloud server, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology. A cloud server is a host product in a cloud computing service system that overcomes the defects of difficult management and weak service scalability in traditional physical host and Virtual Private Server (VPS) services.
The system 100 may also include one or more databases 130. In some embodiments, these databases may be used to store data and other information. For example, one or more of the databases 130 may be used to store information such as audio files and object files. The databases 130 may reside in a variety of locations. For example, the database used by the server 120 may be local to the server 120, or may be remote from the server 120 and communicate with the server 120 via a network-based or dedicated connection. The databases 130 may be of different types. In some embodiments, the database used by the server 120 may be, for example, a relational database. One or more of these databases may store, update, and retrieve data in response to commands.
In some embodiments, one or more of databases 130 may also be used by applications to store application data. The databases used by the application may be different types of databases, such as key value stores, object stores, or conventional stores supported by the file system.
The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.
Referring to fig. 2, a robot vision-based target tracking method according to some embodiments of the present disclosure, a method 200 includes the steps of:
201. and determining a tracking target according to the first frame image acquired by the camera on the robot.
Here, the robot is preferably a four-legged robot; the tracking target may be selected by the end user based on the image content in the first frame image.
In some embodiments, determining the tracking target from the first frame image acquired by the camera on the robot may include: processing the first frame image to obtain a target frame; and extracting an image corresponding to the target frame according to the position information of the target frame in the first frame image, and taking a target contained in the image as a tracking target.
Specifically, the tracking target is framed in the first frame image by using the cv2.selectROI function of the open-source package OpenCV, so as to obtain the target frame and the position information of the target frame in the first frame image, where the position information of the target frame includes the coordinates, in the first frame image, of each vertex of the target frame. Preferably, the position information of the target frame consists of the x- and y-coordinates of the top-left vertex of the target frame in the first frame image together with the width and height of the target frame, i.e. [x_0, y_0, w_0, h_0], where x_0 and y_0 denote the x- and y-coordinates of the top-left vertex of the target frame, and w_0 and h_0 denote the width and height of the target frame respectively. The image corresponding to the target frame is then extracted from the first frame image, and the target in the extracted image is taken as the tracking target.
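As an illustrative sketch (not part of the claimed method), the first-frame target selection described above might look roughly as follows using OpenCV's cv2.selectROI; the window name and the helper function are assumptions introduced only for illustration.

```python
import cv2

def select_tracking_target(first_frame):
    # Let the user frame the tracking target on the first frame image.
    # selectROI returns (x0, y0, w0, h0): top-left corner plus width and height.
    x0, y0, w0, h0 = cv2.selectROI("select target", first_frame, showCrosshair=True)
    cv2.destroyWindow("select target")
    # Extract the image corresponding to the target frame; the target it
    # contains is taken as the tracking target (template).
    template = first_frame[y0:y0 + h0, x0:x0 + w0]
    return (x0, y0, w0, h0), template
```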
202. And processing the current frame image acquired by the camera by adopting a single-target tracking algorithm, and determining the position information of a predicted target corresponding to the tracked target in the current frame image.
The single-target tracking algorithm is preferably the SiamFC++ algorithm, which takes the image of the target frame in the first frame image as a template image, extracts template features from the template image, and interacts with the current frame image to output the position information corresponding to the predicted target. For example, after the current frame image acquired by the camera is input into the SiamFC++ algorithm, the position information corresponding to the predicted target in the current frame image is output as p_k = [x_k, y_k, w_k, h_k].
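For illustration only, a tracker with the init/update pattern described above can be sketched as below. OpenCV's CSRT tracker is used here merely as a stand-in for SiamFC++ (an actual SiamFC++ implementation exposes its own interface), and the factory name may differ between OpenCV builds (e.g. it may live under cv2.legacy).

```python
import cv2

def create_tracker(first_frame, target_box):
    # target_box = (x0, y0, w0, h0) from the first-frame selection.
    # Stand-in for SiamFC++: any single-target tracker with init/update works here.
    tracker = cv2.TrackerCSRT_create()
    tracker.init(first_frame, target_box)
    return tracker

def track_current_frame(tracker, current_frame):
    # Returns p_k = (x_k, y_k, w_k, h_k) for the predicted target,
    # or None if the tracker loses the target in the current frame.
    ok, box = tracker.update(current_frame)
    return tuple(int(v) for v in box) if ok else None
```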
203. And calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot.
Wherein the current movement direction information of the robot includes a direction in which the robot moves along its longitudinal axis and a direction in which the robot moves along its transverse axis.
Specifically, according to the position information of the predicted target in the current frame image, it can be determined whether the predicted target is located in the left half or the right half of the current frame image, and the real-time moving direction information of the robot moving towards the tracking target is then calculated based on the current moving direction information of the robot. For example, if the predicted target is in the left half of the current frame image, the robot is controlled to turn to the left by adjusting its motion along the longitudinal and transverse axes of its body.
In some embodiments, according to the position information corresponding to the predicted target and the current motion direction information of the robot, real-time motion direction information of the robot moving toward the tracking target is calculated, referring to fig. 3, step 203 includes the steps of:
2031. and determining the target direction of the body rotation of the robot according to the position information of the predicted target in the current frame image and the current movement direction information of the robot.
Here, according to p_k = [x_k, y_k, w_k, h_k] acquired in step 202, the position of the predicted target in the current frame image can be obtained, and the relative position between the predicted target and the robot can be determined. For example, if the predicted target is in the left half of the current image, it can be determined that the predicted target is located on the left side of the robot, and therefore that the target direction of the robot's body rotation is a left turn.
Further, the target direction includes left and right directions, and determining the target direction of the robot body rotation according to the position information of the predicted target in the current frame image and the current movement direction information of the robot may include: acquiring a width value of a current frame image, and determining the position of a symmetry axis of the current frame image; and determining that the robot body rotates leftwards or rightwards according to the position of the symmetrical axis of the current frame image, the position information of the predicted target in the current frame image and the current movement direction information of the robot.
Specifically, determining the position of a predicted target in a current frame image through the position information of the predicted target in the current frame image, determining whether the predicted target is positioned at the left side or the right side of the symmetrical axis according to the relation between the position of the symmetrical axis of the current image and the position of the predicted target, and if the predicted target is positioned at the left side of the symmetrical axis, determining that the robot body rotates leftwards; if the predicted target is positioned right of the symmetry axis, the robot body is determined to rotate rightwards.
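A minimal sketch of this left/right decision, assuming the predicted target's box is given as (x_k, y_k, w_k, h_k) and the current frame width as W:

```python
def body_turn_direction(predicted_box, frame_width):
    # Compare the predicted target's centre with the vertical symmetry axis
    # of the current frame image to decide the body's turning direction.
    x_k, _, w_k, _ = predicted_box
    center_x = x_k + w_k / 2.0
    axis = frame_width / 2.0
    return "left" if center_x < axis else "right"
```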
2032. And determining and obtaining a first prediction offset of the robot along the direction of the transverse axis of the robot body according to the position information of the prediction target in the current frame image.
Specifically, the current frame image is denoted as the k-th frame image, the position information of the predicted target in the k-th frame image is p_k = [x_k, y_k, w_k, h_k], the width and height of the k-th frame image are W and H, u denotes the longitudinal-axis direction of the robot body, and v denotes the transverse-axis direction of the robot body.
Here, the first predicted offset of the robot motion in the v-axis direction is calculated from the centre of the predicted target:
c_k = x_k + w_k / 2
where Δv is the first predicted offset, c_k is the abscissa of the centre position of the predicted target, c_k / W normalizes the abscissa of the centre position of the predicted target to a value between 0 and 1, W is the width of the k-th frame image, x_k is the x-coordinate of the top-left vertex of the predicted target in the k-th frame image, w_k is the width of the predicted target in the k-th frame image, and a is a preset fixed parameter, where a may be set to any positive number from 1 to 10 and is preferably 5.
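The following sketch computes a first predicted offset from the description above. The centre abscissa c_k = x_k + w_k / 2 and the normalization by W follow the text; the proportional form a * (c_k / W - 0.5), which is zero when the target is centred, is an assumption, since the exact expression for Δv is not reproduced here.

```python
def first_predicted_offset(predicted_box, frame_width, a=5.0):
    x_k, _, w_k, _ = predicted_box
    c_k = x_k + w_k / 2.0            # abscissa of the predicted target's centre
    normalized = c_k / frame_width   # normalized to a value between 0 and 1
    # Assumed form: offset proportional to how far the centre is from the
    # image's vertical symmetry axis; a is the preset fixed parameter.
    return a * (normalized - 0.5)
```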
2033. And obtaining corresponding first size information of the predicted target in the current frame image according to the position information of the predicted target in the current frame image.
Here, the first size information includes, but is not limited to, the area of the tracking frame corresponding to the predicted target in the current image. Since the position information of the predicted target in the k-th frame image is p_k = [x_k, y_k, w_k, h_k], the first size information is w_k × h_k.
2034. And obtaining second size information corresponding to the tracking target in the first frame image according to the position information of the tracking target in the first frame image.
Here, the second size information includes, but is not limited to, the area of the target frame corresponding to the tracking target. Since the position information of the tracking target in the first frame image is [x_0, y_0, w_0, h_0], the second size information is w_0 × h_0.
2035. And determining a second predicted offset of the robot along the longitudinal axis direction of the robot body according to the first size information and the second size information.
Here, the second predicted offset Δu of the robot movement in the u-axis direction is calculated from the first size information and the second size information, where Δu is the second predicted offset; b and r are fixed parameters, b being any positive number between 1 and 5 and r any positive number between 0.5 and 2, with b preferably set to 1 and r preferably set to 2; and the ratio between the size of the predicted target in the k-th frame image and the size of the tracking target in the first frame image measures the degree of size change between the two.
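A sketch of the second predicted offset under the same caveat: the areas w_0 × h_0 and w_k × h_k and the parameters b and r follow the text, but the precise way they are combined is not reproduced, so the form below (zero when the target keeps its first-frame size, positive when it shrinks, i.e. moves away) is an assumption.

```python
def second_predicted_offset(first_box, current_box, b=1.0, r=2.0):
    _, _, w0, h0 = first_box          # tracking target in the first frame image
    _, _, wk, hk = current_box        # predicted target in the k-th frame image
    s0 = w0 * h0                      # second size information
    sk = wk * hk                      # first size information
    # Assumed combination of the size ratio with the fixed parameters b and r.
    return b * ((s0 / float(sk)) ** (1.0 / r) - 1.0)
```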
2036. And determining real-time moving direction information of the robot moving towards the tracking target according to the target direction of the robot body rotation, the first predicted offset and the second predicted offset.
In some embodiments, determining real-time movement direction information of the robot moving toward the tracking target according to the target direction of the body rotation of the robot, the first predicted offset amount, and the second predicted offset amount may include: calculating an adjustment rotation angle of the robot according to the first predicted offset and the second predicted offset of the rotation of the robot body; and determining real-time moving direction information of the robot moving towards the tracking target according to the first predicted offset and the second predicted offset of the robot body rotation and the adjustment rotation angle.
Here, the directions in the current movement direction information include a direction in which the robot moves along the longitudinal axis of its body and a direction in which the robot moves along the transverse axis of its body. The adjustment rotation angle of the robot can be obtained through calculation by obtaining the prediction offset corresponding to the movement direction of the robot along the longitudinal axis of the robot body and the movement direction of the transverse axis of the robot body.
For example, denote the first predicted offset of the robot along the transverse axis of its body by Δv, the second predicted offset along the longitudinal axis of its body by Δu, and the adjustment rotation angle by Δθ′; Δθ′ is calculated according to the following formulas:
Δu′ = Δu − u_0
Δv′ = Δv − v_0
where u_0 is a fixed parameter relative to the longitudinal-axis direction of the body and v_0 is a fixed parameter relative to the transverse-axis direction of the body, v_0 being settable according to the tracking effect; here u_0 = 0 and v_0 = 0. Δu′ and Δv′ denote the corrected actual offsets of the body in its longitudinal-axis and transverse-axis directions respectively, and the intermediate quantity and Δθ′ are both angle estimates of the adjustment rotation angle.
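As an illustrative sketch, the corrected offsets Δu′ = Δu − u_0 and Δv′ = Δv − v_0 follow the text (with u_0 = v_0 = 0 by default); the angle estimate atan2(Δv′, Δu′) is an assumption standing in for the expression that is not reproduced here.

```python
import math

def adjustment_rotation_angle(delta_u, delta_v, u0=0.0, v0=0.0):
    du_corr = delta_u - u0   # corrected offset along the body's longitudinal axis
    dv_corr = delta_v - v0   # corrected offset along the body's transverse axis
    # Assumed angle estimate for the adjustment rotation angle.
    delta_theta = math.atan2(dv_corr, du_corr)
    return du_corr, dv_corr, delta_theta
```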
204. And controlling the robot to move towards the prediction target according to the real-time moving direction information.
Here, the real-time moving direction information is transmitted to the motion control module of the robot through ROS (Robot Operating System) communication, so as to control the robot to move forward and backward, move left and right, and turn clockwise or counter-clockwise, etc.
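A minimal sketch of handing the computed command to the robot's motion control module over ROS; the /cmd_vel topic name and the mapping of the three speeds onto linear.x, linear.y and angular.z are assumptions about this particular robot's interface.

```python
import rospy
from geometry_msgs.msg import Twist

rospy.init_node("active_tracking_controller")
cmd_pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)

def send_motion_command(speed_u, speed_v, speed_theta):
    cmd = Twist()
    cmd.linear.x = speed_u       # forward/backward along the body's longitudinal axis
    cmd.linear.y = speed_v       # left/right along the body's transverse axis
    cmd.angular.z = speed_theta  # clockwise / counter-clockwise turning
    cmd_pub.publish(cmd)
```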
In some embodiments, controlling movement of the robot to the predicted target based on the real-time movement direction information includes: obtaining maximum speed information and preset fixed learning rate of the robot in each direction in the current movement direction information; calculating the predicted moving speed of the robot in each direction according to the maximum speed information and the preset fixed learning rate which correspond to each direction in the current moving direction information; and controlling the robot to move towards the prediction target according to the real-time moving direction information and the predicted moving speed of the robot in each direction.
The predicted moving speed of the robot body in each direction when the k-th frame image is obtained is calculated from the corrected offsets, the maximum speeds and the fixed learning rates,
where U, V and θ denote the longitudinal-axis direction of the body, the transverse-axis direction of the body, and the target direction of the body rotation respectively; U_max, V_max and θ_max are the maximum speeds corresponding to the three directions U, V and θ, preferably set to U_max = 0.7, V_max = 0.7 and θ_max = 0.5; lr_u, lr_v and lr_θ are the fixed learning rates corresponding to the directions U, V and θ, set in this application to lr_u = 0.7, lr_v = 0.01 and lr_θ = 0.45; Δu′ and Δv′ denote the corrected actual offsets of the body in its longitudinal-axis and transverse-axis directions respectively, and the intermediate quantity and Δθ′ are both angle estimates of the adjustment rotation angle.
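The sketch below uses the preferred maximum speeds and fixed learning rates listed above; scaling each corrected offset by its learning rate and clipping the result to the corresponding maximum speed is an assumed form, since the exact speed formula is not reproduced here.

```python
def predicted_moving_speeds(du_corr, dv_corr, delta_theta,
                            u_max=0.7, v_max=0.7, theta_max=0.5,
                            lr_u=0.7, lr_v=0.01, lr_theta=0.45):
    def clip(value, limit):
        return max(-limit, min(limit, value))
    speed_u = clip(lr_u * du_corr, u_max)                   # longitudinal-axis speed U
    speed_v = clip(lr_v * dv_corr, v_max)                   # transverse-axis speed V
    speed_theta = clip(lr_theta * delta_theta, theta_max)   # turning speed theta
    return speed_u, speed_v, speed_theta
```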
It should be noted that, after the current frame image is obtained, the predicted moving speed of the robot body in each direction is adjusted in real time according to the position information corresponding to the predicted target in the current frame image, and after the next frame image is obtained, the next frame image is the new current frame image, and then the predicted moving speed of the robot body in each direction is adjusted in real time according to the position information corresponding to the predicted target in the new current frame image, so as to realize controlling the robot to move towards the predicted target.
According to the target tracking method based on robot vision, a tracking target is determined according to a first frame image acquired by a camera on a robot; processing the current frame image acquired by the camera by adopting a single-target tracking algorithm, and determining the position information of a predicted target corresponding to the tracking target in the current frame image; calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot; and finally, controlling the robot to move towards the predicted target according to the real-time moving direction information, so as to realize active tracking of the tracked target. According to the method, under the condition that other auxiliary hardware is not available and the three-dimensional coordinates of the tracking target are not obtained, active tracking of the tracking target can be achieved only based on the first frame image and the current frame image obtained by the robot camera.
Further, as an implementation of the method shown in fig. 2, an embodiment of the present invention provides a target tracking device based on robot vision, as shown in fig. 4, where the device includes:
the first determining module is used for determining a tracking target according to a first frame image acquired by a camera on the robot;
The second determining module is used for processing the current frame image acquired by the camera by adopting a single-target tracking algorithm and determining the position information of a predicted target corresponding to the tracking target in the current frame image;
the real-time moving direction information calculation module is used for calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot;
and the control module is used for controlling the robot to move towards the prediction target according to the real-time moving direction information.
Further, the first determining module includes:
the target frame acquisition unit is used for processing the first frame image to obtain a target frame;
and the tracking target extraction unit is used for extracting an image corresponding to the target frame according to the position information of the target frame in the first frame image, and taking a target contained in the image as a tracking target.
Further, the real-time moving direction information calculating module includes:
a target direction determining unit for determining a target direction of the robot body rotation according to the position information of the predicted target in the current frame image and the current movement direction information of the robot;
the first prediction offset determining unit is used for determining and obtaining a first prediction offset of the robot moving along the transverse axis direction of the robot body according to the position information of the prediction target in the current frame image;
The first size information acquisition unit is used for acquiring first size information corresponding to the predicted target in the current frame image according to the position information of the predicted target in the current frame image;
the second size information acquisition unit is used for acquiring second size information corresponding to the tracking target in the first frame image according to the position information of the tracking target in the first frame image;
the second predicted offset determining unit is used for determining and obtaining a second predicted offset of the robot moving along the longitudinal axis direction of the robot body according to the first size information and the second size information;
and the real-time moving direction information determining unit is used for determining the real-time moving direction information of the robot moving towards the tracking target according to the target direction of the body rotation of the robot, the first predicted offset and the second predicted offset.
Further, the target direction includes left and right directions, and the real-time moving direction information calculating module includes:
the symmetrical axis position determining unit is used for acquiring the width value of the current frame image and determining the position of the symmetrical axis of the current frame image;
and the steering determining unit is used for determining the left or right rotation of the robot body according to the position of the symmetrical axis of the current frame image, the position information of the predicted target in the current frame image and the current movement direction information of the robot.
Further, the real-time moving direction information determining unit includes:
the adjustment rotation angle calculation subunit is used for calculating the adjustment rotation angle of the robot according to the first prediction offset and the second prediction offset of the rotation of the body of the robot;
and the real-time moving direction information determining subunit is used for determining the real-time moving direction information of the robot moving towards the tracking target according to the first predicted offset and the second predicted offset of the body rotation of the robot and the adjustment rotation angle.
Further, the control module includes:
the acquisition unit is used for acquiring maximum speed information and preset fixed learning rate of the robot in each direction in the current movement direction information;
a predicted moving speed calculation unit for calculating a predicted moving speed of the robot in each direction according to maximum speed information and a preset fixed learning rate corresponding to each direction in the current moving direction information;
and the control unit is used for controlling the robot to move towards the prediction target according to the real-time moving direction information and the predicted moving speed of the robot in each direction.
According to the target tracking device based on robot vision, a tracking target is determined according to a first frame image acquired by a camera on a robot; according to the target characteristics corresponding to the tracking targets, determining the predicted targets in the current frame image acquired by the camera and the position information corresponding to the predicted targets; calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot; and finally, controlling the robot to move towards the predicted target according to the real-time moving direction information, so as to realize active tracking of the tracked target. According to the method, under the condition that other auxiliary hardware is not available and the three-dimensional coordinates of the tracking target are not obtained, active tracking of the tracking target can be achieved only based on the first frame image and the current frame image obtained by the robot camera.
It should be noted that: in the object tracking device based on robot vision provided in the above embodiment, when object tracking is implemented, only the division of the above functional modules is used for illustration, in practical application, the above functional allocation may be implemented by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to implement all or part of the functions described above. In addition, the object tracking device based on the robot vision provided in the above embodiment and the object tracking method embodiment based on the robot vision belong to the same concept, and detailed implementation processes of the object tracking device based on the robot vision are detailed in the method embodiment, and are not described in detail herein.
According to another aspect of the present disclosure, there is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method according to the present disclosure.
According to another aspect of the present disclosure there is also provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method according to the present disclosure.
Referring to fig. 5, a block diagram of an electronic device 500 that may be a server or a client of the present disclosure, which is an example of a hardware device that may be applied to aspects of the present disclosure, will now be described. Electronic devices are intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the electronic device 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic device 500 may also be stored. The computing unit 501, ROM 502, and RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
A number of components in electronic device 500 are connected to I/O interface 505, including: an input unit 506, an output unit 507, a storage unit 508, and a communication unit 509. The input unit 506 may be any type of device capable of inputting information to the electronic device 500, the input unit 506 may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a trackpad, a trackball, a joystick, a microphone, and/or a remote control. The output unit 507 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, object/audio output terminals, vibrators, and/or printers. Storage unit 508 may include, but is not limited to, magnetic disks, optical disks. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices over a computer network such as the internet and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as bluetooth (TM) devices, 802.11 devices, wiFi devices, wiMax devices, cellular communication devices, and/or the like.
The computing unit 501 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the various methods and processes described above, such as method 200. For example, in some embodiments, the method 200 may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into RAM 503 and executed by computing unit 501, one or more steps of method 200 described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the method 200 by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, field Programmable Gate Arrays (FPGAs), application Specific Integrated Circuits (ASICs), application Specific Standard Products (ASSPs), systems On Chip (SOCs), complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the foregoing methods, systems, and apparatus are merely exemplary embodiments or examples, and that the scope of the present invention is not limited by these embodiments or examples but only by the claims following the grant and their equivalents. Various elements of the embodiments or examples may be omitted or replaced with equivalent elements thereof. Furthermore, the steps may be performed in a different order than described in the present disclosure. Further, various elements of the embodiments or examples may be combined in various ways. It is important that as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the disclosure.
Claims (10)
1. A target tracking method based on robot vision, comprising:
Determining a tracking target according to a first frame image acquired by a camera on the robot;
processing a current frame image acquired by the camera by adopting a single-target tracking algorithm, and determining the position information of a predicted target corresponding to the tracking target in the current frame image;
calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot;
and controlling the robot to move towards the prediction target according to the real-time movement direction information.
2. The method of claim 1, wherein determining the tracking target from the first frame image acquired by the camera on the robot comprises:
processing the first frame image to obtain a target frame;
and extracting an image corresponding to the target frame according to the position information of the target frame in the first frame image, and taking a target contained in the image as the tracking target.
3. The method according to claim 1, wherein calculating real-time movement direction information of the robot moving toward the tracking target based on the position information corresponding to the predicted target and the current movement direction information of the robot comprises:
determining a target direction of rotation of the robot body according to the position information of the predicted target in the current frame image and the current movement direction information of the robot;
determining a first predicted offset of the robot moving along the transverse axis direction of the robot body according to the position information of the predicted target in the current frame image;
obtaining first size information corresponding to the predicted target in the current frame image according to the position information of the predicted target in the current frame image;
obtaining second size information corresponding to the tracking target in the first frame image according to the position information of the tracking target in the first frame image;
determining a second predicted offset of the robot moving along the longitudinal axis direction of the robot body according to the first size information and the second size information;
and determining real-time moving direction information of the robot moving towards the tracking target according to the target direction of the robot body rotation, the first predicted offset and the second predicted offset.
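A hedged sketch of how the two predicted offsets in claim 3 above might be computed: lateral displacement from the image centre for the transverse axis, and the change in apparent target size between the first frame and the current frame for the longitudinal axis. The specific formulas are illustrative assumptions, not the claimed calculation.

```python
# Illustrative sketch only; the formulas are assumptions, not the claimed method.
def predicted_offsets(pred_box, first_box, frame_width):
    px, _, pw, ph = pred_box     # predicted target position in the current frame
    _, _, fw, fh = first_box     # tracking target position in the first frame

    # First predicted offset (transverse axis): lateral displacement of the
    # predicted target's centre from the image's vertical centre line.
    lateral_offset = (px + pw / 2.0) - frame_width / 2.0

    # Second predicted offset (longitudinal axis): relative change in apparent
    # size; a target smaller than in the first frame has moved away, so the
    # robot should advance.
    first_area = fw * fh
    pred_area = pw * ph
    longitudinal_offset = (first_area - pred_area) / float(first_area)

    return lateral_offset, longitudinal_offset
```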
4. The method according to claim 3, wherein the target direction includes left and right directions, and determining the target direction of the robot body rotation based on the position information of the predicted target in the current frame image and the current movement direction information of the robot comprises:
acquiring a width value of the current frame image, and determining the position of the symmetry axis of the current frame image;
and determining that the robot body rotates leftwards or rightwards according to the position of the symmetry axis of the current frame image, the position information of the predicted target in the current frame image, and the current movement direction information of the robot.
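One possible reading of claim 4 above, sketched below: compare the predicted target's centre with the vertical symmetry axis of the current frame to decide between a left or right body rotation. Falling back to the current movement direction when the target sits exactly on the axis is an assumption.

```python
# Illustrative sketch only: left/right decision relative to the frame's
# vertical symmetry axis; the tie-break behaviour is an assumption.
def rotation_direction(pred_box, frame_width, current_direction):
    x, _, w, _ = pred_box
    symmetry_axis = frame_width / 2.0
    target_centre = x + w / 2.0
    if target_centre < symmetry_axis:
        return "left"
    if target_centre > symmetry_axis:
        return "right"
    return current_direction   # target on the axis: keep the current direction
```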
5. The method of claim 3, wherein determining real-time movement direction information of the robot moving toward the tracking target based on the target direction of the robot body rotation, the first predicted offset, and the second predicted offset comprises:
calculating an adjustment rotation angle of the robot according to the first predicted offset and the second predicted offset of the robot body rotation;
and determining real-time moving direction information of the robot moving towards the tracking target according to the first predicted offset and the second predicted offset of the robot body rotation and the adjustment rotation angle.
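A sketch of the angle computation in claim 5 above, under the assumption that the adjustment rotation angle is derived from the two offsets with atan2; the returned command format is hypothetical.

```python
# Illustrative sketch only: the atan2-based angle and the returned dictionary
# shape are assumptions for illustration.
import math

def movement_direction(target_direction, lateral_offset, longitudinal_offset):
    # Adjustment rotation angle derived from the two predicted offsets.
    adjustment_angle = math.degrees(
        math.atan2(abs(lateral_offset), abs(longitudinal_offset) + 1e-6))
    if target_direction == "left":
        adjustment_angle = -adjustment_angle
    # Real-time movement direction: turn by the adjustment angle while
    # advancing along the longitudinal axis of the robot body.
    return {"turn_deg": adjustment_angle, "advance": longitudinal_offset}
```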
6. The method according to claim 3, wherein controlling the robot to move towards the predicted target according to the real-time movement direction information comprises:
obtaining maximum speed information and a preset fixed learning rate of the robot corresponding to each direction in the current movement direction information;
calculating a predicted moving speed of the robot in each direction according to the maximum speed information and the preset fixed learning rate corresponding to each direction in the current movement direction information;
and controlling the robot to move towards the predicted target according to the real-time movement direction information and the predicted moving speed of the robot in each direction.
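An illustrative sketch of the speed calculation in claim 6 above, assuming the predicted speed in each direction is the offset scaled by a preset fixed learning rate and that direction's maximum speed, then clamped.

```python
# Illustrative sketch only: the scaling rule and the dict-based direction keys
# ('x', 'y', 'yaw') are assumptions, not the claimed calculation.
def predicted_speeds(offsets, max_speeds, learning_rate=0.1):
    speeds = {}
    for direction, offset in offsets.items():
        v_max = max_speeds[direction]
        # Never command more than the maximum speed for that direction.
        speeds[direction] = max(-v_max, min(v_max, learning_rate * offset * v_max))
    return speeds
```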
7. A robot vision-based target tracking device, comprising:
the first determining module is used for determining a tracking target according to a first frame image acquired by a camera on the robot;
the second determining module is used for processing the current frame image acquired by the camera by adopting a single-target tracking algorithm and determining the position information of a predicted target corresponding to the tracking target in the current frame image;
the real-time moving direction information calculation module is used for calculating real-time moving direction information of the robot moving towards the tracking target according to the position information corresponding to the predicted target and the current moving direction information of the robot;
and the control module is used for controlling the robot to move towards the predicted target according to the real-time moving direction information.
8. The apparatus of claim 7, wherein the first determining module comprises:
the target frame acquisition unit is used for processing the first frame image according to a single target tracking algorithm to obtain a target frame;
and the tracking target extraction unit is used for extracting an image corresponding to the target frame according to the position information of the target frame in the first frame image, and taking a target contained in the image as the tracking target.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program when executed by the processor implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310381740.3A CN116501042A (en) | 2023-04-11 | 2023-04-11 | Target tracking method and device based on robot vision |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310381740.3A CN116501042A (en) | 2023-04-11 | 2023-04-11 | Target tracking method and device based on robot vision |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116501042A true CN116501042A (en) | 2023-07-28 |
Family
ID=87315917
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310381740.3A Pending CN116501042A (en) | 2023-04-11 | 2023-04-11 | Target tracking method and device based on robot vision |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116501042A (en) |
- 2023-04-11: CN application CN202310381740.3A (publication CN116501042A, en), status: Pending
Similar Documents
Publication | Title
---|---
CN115147558B (en) | Training method of three-dimensional reconstruction model, three-dimensional reconstruction method and device
CN113313650B (en) | Image quality enhancement method, device, equipment and medium
CN114972958B (en) | Key point detection method, neural network training method, device and equipment
CN115578433B (en) | Image processing method, device, electronic equipment and storage medium
CN115511779B (en) | Image detection method, device, electronic equipment and storage medium
CN115601555A (en) | Image processing method and apparatus, device and medium
CN115482325A (en) | Picture rendering method, device, system, equipment and medium
CN114627268A (en) | Visual map updating method and device, electronic equipment and medium
CN116246026B (en) | Training method of three-dimensional reconstruction model, three-dimensional scene rendering method and device
CN116245998B (en) | Rendering map generation method and device, and model training method and device
CN117274370A (en) | Three-dimensional pose determining method, three-dimensional pose determining device, electronic equipment and medium
CN114913549B (en) | Image processing method, device, equipment and medium
CN116402844A (en) | Pedestrian tracking method and device
CN116501042A (en) | Target tracking method and device based on robot vision
CN114120448B (en) | Image processing method and device
CN116433847A (en) | Gesture migration method and device, electronic equipment and storage medium
CN115222598A (en) | Image processing method, apparatus, device and medium
CN115393514A (en) | Training method of three-dimensional reconstruction model, three-dimensional reconstruction method, device and equipment
CN115578432A (en) | Image processing method, image processing device, electronic equipment and storage medium
CN114049472A (en) | Three-dimensional model adjustment method, device, electronic apparatus, and medium
CN114882587A (en) | Method, apparatus, electronic device, and medium for generating countermeasure sample
CN114327718A (en) | Interface display method and device, equipment and medium
CN114954654B (en) | Calculation method, control method and device for zero offset compensation angle of steering wheel of vehicle
CN115797455B (en) | Target detection method, device, electronic equipment and storage medium
CN117218297B (en) | Human body reconstruction parameter generation method, device, equipment and medium
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination