WO2021008207A1 - Target tracking method and apparatus, intelligent mobile device and storage medium
- Publication number: WO2021008207A1 (application PCT/CN2020/089620)
- Authority: WIPO (PCT)
- Prior art keywords: image, offset, target object, value, target
Classifications
- G06V40/161 — Human faces: detection; localisation; normalisation
- G06V10/20 — Image preprocessing
- G06N3/04 — Neural networks: architecture, e.g. interconnection topology
- G06T7/70 — Image analysis: determining position or orientation of objects or cameras
- G06V10/245 — Aligning, centring, orientation detection or correction of the image by locating a pattern
- G06V10/32 — Normalisation of the pattern dimensions
- G06V10/40 — Extraction of image or video features
- G06V20/47 — Detecting features for summarising video content
- G06V40/172 — Classification, e.g. identification
Definitions
- The embodiments of the present application relate to the field of computer vision technology, and in particular, though not exclusively, to a target tracking method and apparatus, a smart mobile device, and a storage medium.
- Smart mobile devices such as remote-control cars and mobile robots are used in various fields.
- For example, remote-control cars can serve as teaching tools to achieve target tracking.
- The embodiments of the present application propose a target tracking method and apparatus, a smart mobile device, and a storage medium.
- An embodiment of the present application provides a target tracking method, including: acquiring a captured image; determining the position of a target object in the image; and determining, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, wherein the control instruction is used to bring the target object to the center position of the image, and the control instruction includes rotation instructions corresponding to the offset values in an offset sequence constituting the distance, the offset sequence including at least one offset value.
- In an embodiment, before determining the position of the target object in the image, the method further includes performing a preprocessing operation on the image, the preprocessing operation including: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image. Determining the position of the target object in the image then includes: performing target detection processing on the preprocessed image to obtain the position of the target object in the preprocessed image; and determining the position of the target object in the original image based on that position.
- Performing normalization processing on the grayscale image includes: determining the average value and standard deviation of the pixel values of the pixels in the grayscale image; obtaining the difference between the pixel value of each pixel and the average value; and determining the ratio between each pixel's difference and the standard deviation as that pixel's normalized pixel value.
- Determining the location of the target object in the image includes: extracting image features of the image; performing classification processing on the image features to obtain the location area of the target object in the image; and determining the center position of the location area as the position of the target object.
- The target object may include a human face; correspondingly, determining the position of the target object in the image includes determining the position of the human face in the image.
- Determining the control instruction for controlling the rotation of the smart mobile device based on the distance between the position of the target object and the center position of the image includes: determining a target offset based on that distance; generating multiple sets of offset sequences based on the target offset, where the sum of the offset values in each offset sequence is the target offset; and using a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences and determine the control instruction corresponding to it.
- Using a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences includes: for each offset value in the multiple sets of offset sequences, determining the maximum value corresponding to that offset value in a value table, where the value table includes the value corresponding to the offset value under different rotation commands; obtaining the reward value corresponding to the offset value, and determining the final value of the offset value based on its reward value and maximum value, where the reward value is the distance between the position of the target object and the center position of the image when the rotation instruction corresponding to the maximum value of the offset value is not executed; and determining the offset sequence with the largest sum of final values among the multiple sets of offset sequences as the offset sequence that meets the requirements.
- Determining the control instruction corresponding to the offset sequence that meets the requirements includes: determining the control instruction from the rotation instruction corresponding to the maximum value of each offset value in that offset sequence.
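The value-table selection described above can be sketched as follows. This is a hedged illustration, not the patented implementation: the value-table contents, the discount factor, the candidate-sequence generator, and the interpretation of the reward as a negative remaining distance are all assumptions made for demonstration.

```python
GAMMA = 0.9  # discount factor; an assumed hyperparameter

# Value table (assumed contents): offset value -> {rotation command: value}
Q_TABLE = {
    1: {"rotate_left_1": 0.8, "rotate_right_1": 0.1},
    2: {"rotate_left_2": 0.6, "rotate_right_2": 0.2},
    3: {"rotate_left_3": 0.5, "rotate_right_3": 0.3},
}

def candidate_sequences(target_offset, max_step=3):
    """Enumerate offset sequences whose values sum to the target offset."""
    if target_offset == 0:
        return [[]]
    seqs = []
    for step in range(1, min(max_step, target_offset) + 1):
        for tail in candidate_sequences(target_offset - step, max_step):
            seqs.append([step] + tail)
    return seqs

def final_value(offset, remaining_after):
    """Final value = reward + discounted best table value (Q-learning style).

    The reward is taken here as the negative distance that remains after the
    step (closer to the image center -> higher reward); this reading of the
    reward definition is an assumption."""
    reward = -remaining_after
    best_q = max(Q_TABLE[offset].values())
    return reward + GAMMA * best_q

def select_sequence(target_offset):
    """Pick the offset sequence with the largest sum of final values."""
    best_seq, best_score = None, float("-inf")
    for seq in candidate_sequences(target_offset):
        remaining = target_offset
        score = 0.0
        for off in seq:
            remaining -= off
            score += final_value(off, remaining)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq
```

The control instruction would then be assembled from the best-valued rotation command of each offset value in the selected sequence.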
- the method further includes: driving the smart mobile device to perform rotation based on the control instruction.
- In an embodiment, the method further includes: determining a control instruction for controlling the movement of the smart mobile device based on the location area of the target object, wherein, in response to the area of the target object's location area being greater than a first threshold, a control instruction for moving the smart mobile device back is generated, and in response to that area being less than a second threshold, a control instruction for moving the smart mobile device forward is generated, the first threshold being greater than the second threshold.
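The forward/back decision above amounts to a simple threshold rule on the apparent size of the target; a minimal sketch, in which the function and command names are illustrative assumptions:

```python
def movement_instruction(box_area, first_threshold, second_threshold):
    """Decide forward/back movement from the area of the target's location area.

    A target appearing larger than first_threshold means the device is too
    close, so it backs up; smaller than second_threshold means it is too far,
    so it moves forward. first_threshold must exceed second_threshold."""
    assert first_threshold > second_threshold
    if box_area > first_threshold:
        return "move_back"
    if box_area < second_threshold:
        return "move_forward"
    return "hold"  # area within bounds: keep the current distance
```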
- An embodiment of the application provides a target tracking device, which includes: an image acquisition module configured to acquire an image; a target detection module configured to determine the position of a target object in the image; and a control module configured to determine, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of the smart mobile device, wherein the control instruction is used to bring the position of the target object to the center position of the image and includes rotation instructions corresponding to the offset values in an offset sequence constituting the distance, the offset sequence including at least one offset value.
- In an embodiment, the device further includes a preprocessing module configured to perform a preprocessing operation on the image, the preprocessing operation including: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image. The target detection module is further configured to perform target detection processing on the preprocessed image to obtain the position of the target object in the preprocessed image, and to determine the position of the target object in the original image based on that position.
- The step of performing the normalization processing on the grayscale image by the preprocessing module includes: determining the average value and standard deviation of the pixel values of the pixels in the grayscale image; obtaining the difference between the pixel value of each pixel and the average value; and determining the ratio between each pixel's difference and the standard deviation as that pixel's normalized pixel value.
- The target detection module is further configured to: extract image features of the image; perform classification processing on the image features to obtain the location area of the target object in the image; and determine the center position of the location area as the position of the target object.
- the target object includes a human face; correspondingly, the target detection module is further configured to determine the position of the human face in the image.
- The control module is further configured to: determine a target offset based on the distance between the position of the target object in the image and the center position of the image; generate multiple sets of offset sequences based on the target offset, where the sum of the offset values in each offset sequence is the target offset; and use a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences and obtain the control instruction corresponding to it.
- The control module is further configured to: for each offset value in the multiple sets of offset sequences, determine the maximum value corresponding to that offset value in the value table, where the value table includes the value corresponding to the offset value under different rotation commands; obtain the reward value corresponding to the offset value and determine its final value based on the reward value and the maximum value, where the reward value is the distance between the position of the target object and the center of the image when the rotation instruction corresponding to the maximum value is not executed; and determine the offset sequence with the largest sum of final values as the offset sequence that meets the requirements.
- control module is further configured to determine the control instruction based on the rotation instruction corresponding to the maximum value of each offset value in the offset sequence that meets the requirements.
- The target detection module is further configured to determine a control instruction for controlling the movement of the smart mobile device based on the location area of the target object: if the area of the target object's location area is greater than a first threshold, a control instruction for moving the smart mobile device back is generated; if that area is less than a second threshold, a control instruction for moving the smart mobile device forward is generated, the first threshold being greater than the second threshold.
- An embodiment of the present application provides a smart mobile device that includes the target tracking device, where the target detection module of the target tracking device is integrated in the management device of the smart mobile device; the management device performs target detection processing on the image collected by the image acquisition module to obtain the position of the target object, and the control module, connected with the management device, generates the control instruction according to the obtained position of the target object and controls the rotation of the smart mobile device through the control instruction.
- In an embodiment, the preprocessing module of the target tracking device is also integrated in the management device, which performs the preprocessing operations on the images and then performs target detection processing on the preprocessed images to obtain the position of the target object in the image.
- the smart mobile device includes an educational robot.
- An embodiment of the present application provides a smart mobile device, which includes: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to call the instructions stored in the memory to execute the target tracking method described in any one of the foregoing items.
- An embodiment of the present application provides a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the target tracking method described in any one of the first aspect is implemented.
- An embodiment of the present application provides a computer program including computer-readable code; when the code runs on a smart mobile device, the processor in the smart mobile device executes the target tracking method.
- The target tracking method and device, smart mobile device, and storage medium provided by the embodiments of the application can obtain the position of the target object in the collected image and derive the control instruction of the smart mobile device from the distance between the position of the target object and the image center.
- The control instruction is used to control the rotation of the smart mobile device and includes at least one rotation instruction corresponding to an offset value, where the offset sequence formed by the offset values is determined from the distance between the target object and the image center; the obtained control instruction places the target object at the center of the collected image after rotation, so that the target object stays within the tracking range of the smart mobile device.
- the target tracking method and device, smart mobile device, and storage medium provided in the embodiments of the present application can perform target tracking according to the position of the target object in real time, which is more convenient and accurate.
- FIG. 1 is a schematic flowchart of a target tracking method provided by an embodiment of this application
- FIG. 2 is a schematic diagram of a process of performing preprocessing on an image provided by an embodiment of the application
- FIG. 3 is a schematic flowchart of step S20 in a target tracking method provided by an embodiment of this application;
- FIG. 4 is a schematic flowchart of step S30 in a target tracking method provided by an embodiment of this application;
- FIG. 5 is a schematic flowchart of step S303 in a target tracking method provided by an embodiment of this application;
- FIG. 6 is a schematic diagram of another process of a target tracking method provided by an embodiment of the application.
- FIG. 7 is an application example diagram of a target tracking method provided by an embodiment of the application.
- FIG. 8 is a schematic flowchart of a preprocessing process provided by an embodiment of the application.
- FIG. 9 is a schematic diagram of the training process of the target detection network provided by an embodiment of the application.
- FIG. 10 is a schematic diagram of the application process of the target detection network provided by an embodiment of this application.
- FIG. 11 is a schematic flowchart of a path planning algorithm based on reinforcement learning provided by an embodiment of the application.
- FIG. 12 is a schematic structural diagram of a target tracking device provided by an embodiment of this application.
- FIG. 13 is a schematic structural diagram of a smart mobile device provided by an embodiment of this application.
- The embodiments of the application provide a target tracking method that can be applied to any smart mobile device with image processing functions.
- the target tracking method can be applied to devices such as mobile robots, remote-controlled vehicles, and aircraft.
- the target tracking method may be implemented by a processor invoking computer-readable instructions stored in a memory.
- FIG. 1 is a schematic flowchart of a target tracking method provided by an embodiment of the application. As shown in FIG. 1, the target tracking method includes:
- Step S10: Obtain the collected image
- The smart mobile device to which the target tracking method of the embodiments of the present application is applied may include an image acquisition device, such as a camera or video camera.
- images can be directly collected by an image collection device, or video data can be collected by the image collection device, and the video data can be subjected to frame division or frame selection processing to obtain corresponding images.
- Step S20: Determine the position of the target object in the image
- Target detection processing can be performed on the captured image; that is, it is determined whether the target object exists in the captured image, and when it does, the position of the target object is determined.
- the target detection processing can be realized through a neural network.
- the target object detected by the embodiment of the present application may be any type of object, for example, the target object may be a human face, or the target object may be another object to be tracked, which is not specifically limited in the embodiment of the present application.
- The target object may also be an object with a specific known identity; that is, the embodiments of the present application can track a category of objects (such as all face images) or track an object with a specific identity (such as a known specific face image), which can be set according to requirements and is not specifically limited in the embodiments of the present application.
- the neural network that implements target detection processing may be a convolutional neural network. After training, the neural network can accurately detect the position of the target object in the image.
- The form of the neural network is not limited in the embodiments of the present application.
- In the process of performing target detection processing on the image, feature extraction is performed on the image to obtain image features, classification processing is then performed on the image features to obtain the location area of the target object in the image, and the location of the target object is determined based on the location area.
- The classification result obtained by the classification processing may include an identifier indicating whether a pixel belongs to the target object, such as a first identifier or a second identifier, where the first identifier indicates that the pixel at the current position in the image belongs to the target object and the second identifier indicates that it does not. The position of the target object in the image can then be determined from the area formed by the first identifiers; for example, the center position of that area can be taken as the position of the target object.
- the position of the target object in the image can be directly obtained, for example, the position of the target object can be expressed in the form of coordinates.
- In an embodiment, the center position of the location area of the target object in the image may be used as the position of the target object; when no target object is detected, the output position is empty.
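As a minimal sketch of the two position computations described above; the bounding-box format (x1, y1, x2, y2) and the function names are illustrative assumptions:

```python
def target_position(box):
    """Return the center of a detected location area (x1, y1, x2, y2),
    or None (an "empty position") when no target object was detected."""
    if box is None:
        return None
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def offset_from_center(position, image_size):
    """Directed distance (dx, dy) from the image center to the target."""
    w, h = image_size
    px, py = position
    return (px - w / 2.0, py - h / 2.0)
```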
- Step S30: Determine a control instruction for controlling the rotation of the smart mobile device based on the distance between the position of the target object and the center position of the image, wherein the control instruction is used to bring the position of the target object to the center position of the image, and the control instruction includes rotation instructions corresponding to the offset values in an offset sequence constituting the distance, the offset sequence including at least one offset value.
- When the position of the target object in the image is obtained, the smart mobile device can be controlled to move according to that position so that the target object is located at the center of the collected image, thereby realizing tracking of the target object.
- the embodiment of the present application can obtain a control instruction for controlling the rotation of the smart mobile device according to the distance between the position of the target object in the image and the center position of the image, so that the position of the target object can be located at the center of the currently collected image .
- the control instruction may include rotation instructions respectively corresponding to at least one offset value, wherein the distance between the position of the target object and the center position of the image can be determined according to the offset sequence corresponding to the at least one offset value.
- The distance in the embodiments of the present application can be a directed distance (such as a direction vector), and each offset value can also be a direction vector.
- The direction vector corresponding to the distance can be obtained by adding the direction vectors corresponding to the offset values; that is, executing the rotation instruction corresponding to each offset value realizes the offset of that value, finally placing the target object at the center of the currently collected image.
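The additive relationship between the directed distance and the offset vectors can be checked with a small helper (illustrative only; vectors are represented as (dx, dy) tuples):

```python
def verify_offset_sequence(distance, offsets):
    """Check that the direction vectors in the offset sequence add up to the
    directed distance between the target position and the image center."""
    total = (sum(dx for dx, _ in offsets), sum(dy for _, dy in offsets))
    return total == distance
```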
- In this way, from the moment the next image is captured, the target object may remain located at the center of the captured images.
- The embodiments of the application can quickly adjust the rotation of the smart mobile device according to the position of the target object in the previous image so that the target object is at the center of the collected image; even when the target object is moving, it can be tracked and photographed so that it stays in the frame of the collected image.
- the embodiment of the present application may use a reinforcement learning algorithm to execute the planning of the rotation path of the smart mobile device, and obtain a control instruction for positioning the target object in the center of the image.
- the control instruction may be determined based on the reinforcement learning algorithm
- the reinforcement learning algorithm may be a value learning algorithm (Q-learning algorithm).
- The movement path of the smart mobile device is optimized by comprehensively evaluating the movement time, the convenience of the movement path, and the energy consumption of the smart mobile device, and the control instructions corresponding to the optimal movement path are obtained.
- the embodiment of the present application can conveniently and accurately realize real-time tracking of the target object, and control the rotation of the smart mobile device according to the position of the target object, so that the target object is located in the center of the collected image.
- The control instruction of the smart mobile device can be obtained according to the distance between the position of the target object in the image and the center position of the image.
- The control instruction is used to control the rotation of the smart mobile device and includes rotation instructions corresponding to at least one offset value, where the offset sequence formed by the offset values is determined from the distance between the position of the target object and the center of the image.
- The obtained control instruction enables the target object to be at the center of the captured image after rotation, so that the target object remains within the tracking range of the smart mobile device.
- The embodiments of the present application can perform target tracking according to the position of the target object in real time, which is more convenient and accurate and improves the performance of the smart mobile device.
- the embodiment of the present application may perform target detection processing on the image when the image is collected.
- Because the specifications, types, and other parameters of the collected images may differ, a preprocessing operation can be performed on each image before target detection processing to obtain a normalized image.
- the method further includes performing a preprocessing operation on the image.
- FIG. 2 is a schematic diagram of the process of performing preprocessing on the image provided by an embodiment of the application, as shown in FIG. 2 ,
- the preprocessing operation includes:
- Step S11 Adjust the image to a grayscale image of a preset specification.
- the captured image may be a color image or another form of image, and the captured image may be converted into an image of a preset specification, and then the image of the preset specification may be converted into a grayscale image.
- the preset specification may be 640*480, but this is not a specific limitation of the embodiment of the present application. Converting a color image or another form of image into a grayscale image can be based on processing of the pixel values; for example, the pixel value of each pixel can be divided by the maximum pixel value, and the corresponding grayscale value can be obtained based on the result. This is only illustrative, and the embodiment of the present application does not specifically limit the process.
- the embodiment of the present application converts the image directly into a grayscale image and then sends it to the network model for detection, which can reduce resource consumption and increase processing speed.
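As an editorial illustration only (not part of the claimed embodiment), the resizing and grayscale conversion described above can be sketched in NumPy, assuming a nearest-neighbour resize and the common luminance weights:

```python
import numpy as np

def to_grayscale(rgb, width=640, height=480):
    # Resize an (H, W, 3) image to the preset specification by
    # nearest-neighbour sampling, then collapse the color channels
    # with the common luminance weights to obtain a grayscale image.
    h, w = rgb.shape[:2]
    rows = np.arange(height) * h // height
    cols = np.arange(width) * w // width
    resized = rgb[rows[:, None], cols[None, :]]
    return (resized @ np.array([0.299, 0.587, 0.114])).astype(np.float32)

img = np.random.randint(0, 256, (120, 160, 3)).astype(np.float32)
gray = to_grayscale(img)
print(gray.shape)  # (480, 640)
```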
- Step S12 Perform normalization processing on the grayscale image.
- normalization processing can be performed on the grayscale image.
- the pixel values of the image can be normalized to the same scale range.
- the normalization processing may include: determining the average value and standard deviation of the pixel values of the pixels in the grayscale image; determining the difference between the pixel value of each pixel and the average value; and determining the ratio between the difference corresponding to each pixel and the standard deviation as the normalized pixel value of that pixel.
- the images collected in the embodiment of the present application may be multiple or one.
- when one image is collected, the obtained grayscale image is also one image.
- the average value and standard deviation corresponding to the pixel value of each pixel can be obtained.
- the ratio between the difference between each pixel and the average value and the standard deviation can be updated to the pixel value of the pixel.
- the average value and standard deviation of the pixel values of the multiple grayscale images can be determined by the pixel value of each pixel in the multiple grayscale images. That is, the average value and standard deviation of the embodiment of the present application may be for one image or for multiple images.
- the difference between the pixel value of each pixel of each image and the average value can be obtained, and then the ratio between the difference and the standard deviation can be obtained; this ratio is used to update the pixel value of the pixel.
- the pixel value of each pixel in the grayscale image can be unified to the same scale, and the normalization processing of the collected image can be realized.
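A minimal NumPy sketch of this normalization, for illustration only:

```python
import numpy as np

def normalize(gray):
    # Subtract the mean pixel value and divide by the standard
    # deviation, bringing all pixels to the same scale.
    return (gray - gray.mean()) / gray.std()

g = np.array([[10.0, 20.0], [30.0, 40.0]])
n = normalize(g)
# the result has zero mean and unit standard deviation
```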
- the pre-processing may also be performed in other manners. For example, it is possible to only perform conversion of an image into a preset specification, and perform normalization processing on an image of the preset specification. That is, the embodiment of the present application may also perform normalization processing of color images.
- the average value and standard deviation of the feature value of each channel of each pixel in the color image can be obtained; for example, the average value and standard deviation of the feature value (R value) of the red (Red, R) channel of each pixel of the image, the average value and standard deviation of the feature value (G value) of the green (Green, G) channel, and the average value and standard deviation of the feature value (B value) of the blue (Blue, B) channel can be obtained. Then, according to the ratio between the difference between the feature value of the corresponding color channel and the average value, and the standard deviation, the new feature value of the corresponding color channel is obtained. In this way, the updated feature value of the color channel corresponding to each pixel of each image is obtained, and a normalized image is thereby obtained.
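The per-channel variant for color images can be sketched the same way; `axis=(0, 1)` computes one mean and one standard deviation per R, G, and B channel (illustrative only):

```python
import numpy as np

def normalize_color(img):
    # One mean/std per color channel, computed over all pixels.
    mean = img.mean(axis=(0, 1))
    std = img.std(axis=(0, 1))
    return (img - mean) / std

rgb = np.random.rand(8, 8, 3) * 255.0
out = normalize_color(rgb)
```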
- the embodiments of the present application can be applied to different types of images and images of different scales during implementation, thereby improving the applicability of the embodiments of the present application.
- the position of the target object in the image is obtained, that is, the position of the target object in the original collected image can be obtained according to the position of the target object after preprocessing.
- the following only takes the execution of target detection processing on the collected image as an example for description; the process of performing target detection on the preprocessed image is the same, and the description is not repeated here.
- FIG. 3 is a schematic flowchart of step S20 in a target tracking method according to an embodiment of the application. As shown in FIG. 3, the determining the position of the target object in the image includes:
- Step S201 Extract image features of the image
- the image features of the image can be extracted first, for example, the image features can be obtained by convolution processing.
- the target detection processing can be realized by a neural network, where the neural network can include a feature extraction module and a classification module; the feature extraction module may include at least one convolutional layer, and may also include a pooling layer.
- the feature extraction module can extract the features of the image.
- the feature extraction process may also be performed in the structure of the residual network to obtain image features, which is not specifically limited in the embodiment of the present application.
- Step S202 Perform classification processing on the image features to obtain the location area of the target object in the image.
- classification processing can be performed on image features.
- the classification module performing the classification processing can include a fully connected layer, and the detection result of the target object in the image, that is, the location area of the target object, is obtained through the fully connected layer.
- the location area of the target object in the embodiments of the present application can be expressed in the form of coordinates, such as the location coordinates of the two opposite vertices of the detection frame corresponding to the location area of the detected target object, or the location coordinates of one vertex together with the height and width of the detection frame.
- the result of the classification process in the embodiment of the present application may include whether there is an object of the target type in the image, that is, the target object, and the location area of the target object.
- the first identifier and the second identifier can be used to identify whether there is an object of the target type, and to indicate the location area where the target object is located in the form of coordinates.
- the first identifier can be 1, indicating that a target object exists; conversely, the second identifier can be 0, indicating that no target object exists. (x1, x2, y1, y2) are the horizontal and vertical coordinate values corresponding to the two vertices of the detection frame.
- Step S203 Determine the center position of the position area as the position of the target object.
- the center position of the detected position area of the target object may be determined as the position of the target object.
- the average value of the coordinate values of the four vertices of the location area where the target object is located can be taken to obtain the coordinates of the center position, and then the coordinates of the center position are determined as the position of the target object.
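The center computation reduces to averaging the vertex coordinates; a one-line sketch (coordinate values hypothetical, for illustration only):

```python
def box_center(x1, y1, x2, y2):
    # Midpoint of the two opposite vertices of the detection frame.
    return ((x1 + x2) / 2, (y1 + y2) / 2)

print(box_center(40, 20, 120, 100))  # (80.0, 60.0)
```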
- the target object can be a face
- the target detection process can be a face detection process; that is, the location area where the face is located in the image can be detected, and the position of the face can be obtained according to the center of the detected location area. Target tracking is then performed for the face.
- the embodiments of the present application can obtain the position of the target object with high accuracy, and improve the accuracy of target tracking.
- the above-mentioned preprocessing and target detection process can be performed by the management device of the smart mobile device.
- the management device may be a Raspberry Pi chip.
- Raspberry Pi chip has high scalability and high processing speed.
- the obtained information about the location of the target object, etc. may be transmitted to the control terminal of the smart mobile device to obtain the control instruction.
- in the embodiment of the present application, the detection result of the target object may be encapsulated and transmitted according to a preset data format.
- the detection result indicates the position of the target object in the image.
- the data corresponding to the transmitted detection result can be 80 bytes, and it can include a mode flag, detection result information, a cyclic redundancy check (CRC), a retransmission threshold, a control field, and an optional field.
- the mode flag bit can indicate the current working mode of the Raspberry Pi chip
- the detection result information can be the position of the target object
- the CRC check bit is used for security verification
- the retransmission threshold is used to indicate the maximum number of retransmissions of data
- the control field is used to indicate the desired working mode of the smart mobile device.
- the optional field is the information that can be added.
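The exact field widths of the 80-byte packet are not specified here; the sketch below assumes a hypothetical layout (1-byte mode flag, four coordinate floats, 1-byte retransmission threshold, 1-byte control field, CRC32 check, zero-padded optional field) purely for illustration:

```python
import struct
import zlib

def pack_detection(mode, x1, y1, x2, y2, retries, control):
    # Hypothetical layout: mode flag, detection coordinates,
    # retransmission threshold, and control field, followed by a CRC32
    # over the payload; the remainder is the zero-padded optional field.
    payload = struct.pack("<B4fBB", mode, x1, y1, x2, y2, retries, control)
    packet = payload + struct.pack("<I", zlib.crc32(payload))
    return packet.ljust(80, b"\x00")

pkt = pack_detection(1, 40.0, 20.0, 120.0, 100.0, 3, 0)
print(len(pkt))  # 80
```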
- FIG. 4 is a schematic flowchart of step S30 in a target tracking method provided by an embodiment of the application. As shown in FIG. 4, step S30 can be implemented through the following steps:
- Step S301 Determine a target offset based on the distance between the position of the target object in the image and the center position of the image;
- when tracking the target object in the embodiments of the present application, the position of the target object can be maintained at the center of the image, and tracking of the target object can be achieved in this way. Therefore, in the embodiment of the present application, when the position of the target object is obtained, the distance between the position of the target object and the center position of the image can be detected, and this distance is used as the target offset. The Euclidean distance between the coordinates of the position of the target object and the coordinates of the center position of the image can be used as the target offset.
- the distance can also be expressed in the form of a vector, for example, as a directed vector between the center position of the image and the position of the target object; that is, the obtained target offset may include both the distance between the position of the target object and the center position of the image and the direction of the image center relative to the position of the target object.
- Step S302 Generate multiple sets of offset sequences based on the target offset, where each offset sequence includes at least one offset value, and the sum of the offset values in each set of offset sequences is the target offset;
- the embodiment of the present application may generate multiple sets of offset sequences according to the obtained target offset; each offset sequence includes at least one offset value, and the sum of the at least one offset value is the target offset. For example, if the position of the target object is (100, 0) and the position of the image center is (50, 0), the target offset is 50 on the x-axis.
- multiple offset sequences can be generated.
- the offset values of the first offset sequence may be 10, 20, and 20, and the offset values of the second offset sequence may be 10, 25, and 15, where the direction of each offset value can be the positive direction of the x-axis.
- multiple sets of offset sequences corresponding to the target offset can be obtained.
- the number of offset values in the generated multiple sets of offset sequences may be set, for example, it may be 3, but it is not a specific limitation in the embodiment of the present application.
- the method of generating multiple sets of offset sequences may be random generation. In practice, there may be multiple combinations of offset values that can achieve the target offset; the embodiment of the present application may randomly select a preset number of these combinations, that is, a preset number of offset sequences.
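One simple way to generate such random sequences (a sketch, not the claimed method: it cuts the interval at random points so the parts always sum to the target):

```python
import random

def random_offset_sequences(target, n_values=3, n_sequences=4, seed=0):
    # Randomly cut the interval [0, target] at n_values - 1 points;
    # the gaps between consecutive cuts are offset values that, by
    # construction, sum to the target offset.
    rng = random.Random(seed)
    sequences = []
    for _ in range(n_sequences):
        cuts = sorted(rng.randint(0, target) for _ in range(n_values - 1))
        sequences.append([b - a for a, b in zip([0] + cuts, cuts + [target])])
    return sequences

for seq in random_offset_sequences(50):
    print(seq, sum(seq))  # every sequence sums to the target offset 50
```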
- Step S303 Using a reinforcement learning algorithm, select an offset sequence that meets the requirements from the multiple sets of offset sequences, and obtain a control instruction corresponding to the offset sequence that meets the requirements.
- when the generated offset sequences are obtained, a reinforcement learning algorithm may be used to select an offset sequence that meets the requirements.
- the reinforcement learning algorithm can be used to obtain the total value corresponding to the offset sequence, and the offset sequence with the highest total value is determined as the offset sequence that meets the requirements.
- Fig. 5 is a schematic flow chart of step S303 in a target tracking method provided by an embodiment of the application.
- step S303 “using the reinforcement learning algorithm, select an offset sequence that meets the requirements from the multiple sets of offset sequences, and obtain the control instruction corresponding to the offset sequence that meets the requirements” may include:
- Step S3031 For each offset value in the multiple sets of offset sequences, determine the maximum value corresponding to the offset value in the value table, and the value table includes the value corresponding to the offset value under different rotation commands;
- the reinforcement learning algorithm may be a value learning algorithm (Q-learning algorithm), and the corresponding value table (Q-table) may indicate the value corresponding to different offset values under different rotation instructions ( quality).
- Rotation instructions refer to instructions that control the rotation of smart mobile devices, which can include parameters such as motor rotation angle, motor speed, and motor rotation time.
- the value table in the embodiment of the present application may be a value table obtained in advance through reinforcement learning, wherein the parameters of the value table can accurately distinguish and reflect the values corresponding to different rotation commands under different offset values.
- Table 1 shows at least a part of the parameters of the rotation command
- Table 2 shows a schematic table of the value table.
- the horizontal parameters a1, a2, and a3 are different rotation commands, and the vertical parameters s1, s2, and s3 are different offset values.
- the parameter in the table indicates the value of the corresponding offset value and the corresponding rotation command.
- the value represents how valuable the corresponding rotation command is under the corresponding offset value; generally, a larger value indicates that performing target tracking through that command is more valuable.
- each offset sequence may include multiple offset values, and the embodiment of the present application may determine the maximum value corresponding to each offset value in each sequence based on the value table. For example, for the offset value s1, the maximum value is 3, for the offset value s2, the maximum value is 2, and for the offset value s3, the maximum value is 4.
- the foregoing is only an exemplary description, and the obtained value may be different for different value tables, which is not specifically limited in the embodiment of the present application.
- Step S3032 Obtain the reward value corresponding to the offset value, and determine the final value of the offset value based on the reward value corresponding to the offset value and the maximum value, wherein the reward value is determined by the distance between the position of the target object and the center position of the image when the rotation instruction corresponding to the offset value has not been executed;
- the reward value of each offset value in the offset sequence can be obtained, where the reward value is related to the position of the target object when the corresponding offset value is not executed.
- the position of the target object is the initially detected position of the target object in the image.
- the position of the target object may be the assumed position after the rotation instructions corresponding to the maximum values of the preceding offset values have been executed. For example, assuming that the detected position of the target object in the image is (100, 0), the obtained offset sequence that satisfies the condition may be 20, 15, and 15.
- the reward value of the first offset value can be determined by the position (100, 0) of the target object.
- after the first offset value is executed, the position of the target object can be determined to be (120, 0).
- the reward value of the second offset value can be determined based on that position; when the third offset value is executed, the position of the target object is determined to be (135, 0), and the reward value of the third offset value can be determined based on this position.
- the expression for obtaining the reward value can be as shown in formula (1-1):
- R(s,a) = √[(s(x)-b)² + (s(y)-c)²]  (1-1)
- where R(s,a) is the reward value of the rotation instruction a with the maximum value corresponding to the offset value s, that is, the reward value corresponding to the offset value s; s(x) and s(y) are respectively the abscissa and ordinate of the position of the target object when the rotation instruction a with the maximum value corresponding to the offset value has not been executed;
- b and c represent the abscissa and ordinate of the center position of the image, respectively.
- the final value of the offset value can be determined according to the reward value corresponding to the offset value and the maximum value corresponding to the offset value.
- the weighted sum of the reward value and the maximum value can be used to determine the final value.
- the expression for determining the final value of the offset value in the embodiment of the present application may be as shown in formula (1-2):
- Q'(s,a) = R(s,a) + γ·Qmax(s)  (1-2)
- where Q'(s,a) is the final value corresponding to the offset value s, R(s,a) is the reward value of the rotation instruction a with the maximum value corresponding to the offset value s, Qmax(s) is the maximum value corresponding to the offset value s in the value table, and γ is the weighting coefficient.
- Step S3033 Determine the offset sequence with the largest sum of the final value as the offset sequence that meets the requirements.
- the final value of each offset value in the offset sequence may be summed to obtain the total value corresponding to the offset sequence. Then select the offset sequence with the largest total value as the offset sequence that meets the requirements.
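A toy sketch of this selection step (names and numbers hypothetical; `reward` stands in for formula (1-1) and `gamma` for the weighting coefficient of formula (1-2)):

```python
def final_value(offset, q_table, reward, gamma=0.9):
    # Final value = reward + weighted maximum value over rotation commands.
    return reward(offset) + gamma * max(q_table[offset].values())

def best_sequence(sequences, q_table, reward, gamma=0.9):
    # Sum the final values per sequence and pick the largest total.
    totals = [sum(final_value(o, q_table, reward, gamma) for o in seq)
              for seq in sequences]
    return sequences[totals.index(max(totals))]

# Toy value table: value of rotation commands a1..a3 per offset value.
q = {"s1": {"a1": 1, "a2": 3, "a3": 2},
     "s2": {"a1": 2, "a2": 1, "a3": 0},
     "s3": {"a1": 4, "a2": 2, "a3": 1}}
seqs = [["s1", "s2"], ["s1", "s3"]]
print(best_sequence(seqs, q, reward=lambda s: 0.0))  # ['s1', 's3']
```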
- the offset sequence with the largest total value can be obtained, and the maximum total value indicates that the rotation instruction corresponding to the rotation path corresponding to the offset sequence is the optimal choice.
- the control instruction can be generated by combining the rotation instructions that correspond, in the value table, to the maximum value of each offset value in the offset sequence.
- the control instruction can then be transmitted to the smart mobile device, so that the smart mobile device performs a rotation operation according to the control instruction.
- the smart mobile device can be controlled to move according to the generated control instruction.
- the control command may include parameters such as the rotation angle and direction of the motor, or may also include control commands such as the motor speed, the motor rotation time, whether to stop or not.
- the embodiment of the present application may control the movement of the mobile device by means of differential steering.
- the smart mobile device may be a smart mobile vehicle, which may include two left and right drive wheels.
- the embodiment of the present application may control the rotation speed of the left and right drive wheels based on control instructions to realize steering and movement. When the driving wheels rotate at different speeds, the body will turn even if there is no steering wheel or the steering wheel does not move.
- the difference in the rotational speed of the two driving wheels can be realized by operating two separate clutches or braking devices installed on the left and right half shafts.
- the intelligent mobile device can realize different rotation trajectories according to the different rotation speed and rotation angle of the left and right driving wheels. Under different rotation trajectories, the pictures collected by the car are different, and then through continuous optimization, the position of the intelligent mobile car is adjusted to ensure that the target object is in the center of the image to achieve the tracking of the target object.
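The differential-steering behaviour described above follows the standard differential-drive approximation; a sketch with hypothetical wheel speeds and wheel base:

```python
def body_motion(v_left, v_right, wheel_base):
    # Forward speed is the mean of the wheel speeds; the turn rate is
    # the wheel-speed difference divided by the distance between wheels.
    v = (v_left + v_right) / 2
    omega = (v_right - v_left) / wheel_base
    return v, omega

v, w = body_motion(0.2, 0.4, 0.1)  # right wheel faster: body turns left
print(round(v, 3), round(w, 3))  # 0.3 2.0
```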
- FIG. 6 is a schematic diagram of another process of a target tracking method provided by an embodiment of the application. As shown in FIG. 6, the target tracking method further includes:
- Step S41 Determine a control instruction for controlling the movement of the smart mobile device based on the location area of the target object, wherein it can be determined whether the area of the location area of the target object is within the range between the first threshold and the second threshold.
- the location area of the target object in the collected image can be obtained, and the embodiment of the present application can control the moving direction of the smart mobile device according to the area of the location area.
- the area of the location area can be determined according to the obtained location area of the target object, and the area can be compared with the first threshold and the second threshold.
- the first threshold and the second threshold may be preset reference thresholds, the first threshold is greater than the second threshold, and the embodiment of the present application does not limit specific values.
- Step S42 in the case that the area corresponding to the location area of the target object is greater than the first threshold, generate a control instruction for controlling the backing of the smart mobile device;
- a control instruction for controlling the backing of the smart mobile device can be generated until the area of the detected location area of the target object is less than the first threshold and greater than the second threshold.
- Step S43 In a case where the area corresponding to the location area of the target object is smaller than a second threshold, generate a control instruction for controlling the advancement of the smart mobile device, where the first threshold is greater than the second threshold.
- when the area of the detected location area of the target object is smaller than the second threshold, it means that the distance between the target object and the smart mobile device is large, and the smart mobile device can be moved forward at this time.
- a control instruction for controlling the advancement of the smart mobile device can be generated until the area of the detected location area of the target object is less than the first threshold and greater than the second threshold.
- the smart mobile device can perform a forward or backward operation according to the received forward or backward control instruction.
- the movement of the smart mobile device can be controlled according to the size of the target object, and the area corresponding to the location area of the detected target object (such as a human face) can be kept between the second threshold and the first threshold to realize the smart mobile device Control of the moving direction.
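The threshold logic of steps S41 to S43 can be sketched as follows (threshold values hypothetical):

```python
def move_command(area, second_threshold, first_threshold):
    # Keep the detection-frame area between the two thresholds
    # (first_threshold > second_threshold).
    if area > first_threshold:
        return "backward"  # target too close: back up
    if area < second_threshold:
        return "forward"   # target too far: advance
    return "hold"          # within range: no forward/backward motion

print(move_command(5000, 1000, 4000))  # backward
print(move_command(500, 1000, 4000))   # forward
print(move_command(2000, 1000, 4000))  # hold
```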
- the application body of the target tracking method in the embodiment of the present application may be a smart mobile device, or may also be a device installed in the smart mobile device, and the device is used to control the movement of the smart mobile device.
- the intelligent mobile device to which the target tracking method of the embodiment of the present application is applied is an educational robot
- the management device of the educational robot is a Raspberry Pi
- the target object is a human face as an example for description, to clearly embody the embodiments of the present application.
- FIG. 7 is an application example diagram of a target tracking method provided by an embodiment of the application, in which the camera 701 is connected to the Raspberry Pi 702 to transmit the image or video collected by the camera 701 to the Raspberry Pi 702, wherein the camera 701 can be connected to the Raspberry Pi 702 through a Universal Serial Bus (USB) port for data transmission, but the embodiment of the application is not limited to this connection method. The following process can then be performed.
- the application field of the embodiment of the present application may be an intelligent robot in an educational background, and the intelligent robot may realize the functions of face detection and tracking.
- the Raspberry Pi 702 can perform image processing
- the Raspberry Pi 702 in the embodiment of the present application can perform image preprocessing and target detection processing
- the Raspberry Pi can be integrated with a target detection network. Since the types of images collected by the camera 701 are not the same, the Raspberry Pi 702 needs to perform necessary preprocessing on the image data before transmitting the images to the target detection network model.
- Fig. 8 is a schematic flow chart of the preprocessing process provided by an embodiment of the application, as shown in Fig. 8, including:
- Step S51 Receive the collected video data.
- Step S52 Framing the video data into picture data.
- Step S53 unify the picture size.
- Step S54 Convert the picture into a grayscale image.
- Step S55 Normalize the picture.
- Image framing refers to decomposing the collected video data into individual frame images, and then unifying the image size to 640*480. Since color images consume considerable resources during processing but have little impact on the detection effect, the embodiment of the present application ignores color features, directly converts the image to a grayscale image, and sends it to the target detection network for detection. Finally, for the convenience of image processing, the image is normalized: the average value of each dimension is subtracted from the original data of that dimension of the image data, the result replaces the original data, and the data of each dimension is then divided by the standard deviation of that dimension, so that the image data is normalized to the same scale.
- the camera 701 collects the picture.
- Output Face detection coordinate position.
- the target detection network in the Raspberry Pi 702 can perform face recognition and detection in the image; that is, the embodiment of the application can use deep learning technology to realize face detection, where the deep-learning-based face detection technology is divided into two stages: model training and model application.
- FIG. 9 is a schematic diagram of the training process of the target detection network provided in an embodiment of the application. As shown in FIG. 9, the training process includes:
- Step S61 Collect a face data set picture.
- the face data set pictures include face pictures of various ages and various regions, and the face pictures are manually labeled to obtain the coordinate positions of the faces. A face data set is constructed and divided into three parts: a training set, a test set, and a verification set.
- Step S62 construct a neural network model.
- step S62 can be implemented through the following steps:
- step S621 feature extraction is achieved by superimposing the convolutional layer and the pooling layer.
- Step S622 Use a classifier to classify the extracted features.
- classification can be achieved through a fully connected layer (classifier).
- Step S63 training the neural network model.
- Model training is achieved through a series of gradient optimization algorithms. After a large number of iterative training, a trained model can be obtained for model testing.
- step S64 a trained neural network model is obtained.
- the training process of the model is the training process of the target detection network (neural network model).
- FIG. 10 is a schematic diagram of the application process of the target detection network provided by an embodiment of the application. As shown in FIG. 10, the application process includes:
- Step S71 Collect a face picture.
- step S72 the preprocessed picture is sent to the trained model.
- Step S73 obtain the coordinate position of the face.
- the pre-processed picture is sent to the trained model, and the coordinate position of the face in the picture can be output after forward calculation.
- the face coordinate position detection can be completed by the Raspberry Pi 702, and the face coordinate position can then be encapsulated into a data packet according to a defined communication protocol specification. After the data encapsulation is completed, the packet is sent through the serial port to the processor or controller in the smart mobile device 703, where the smart mobile device 703 can be an educational robot EV3; the smart mobile device 703 can then complete subsequent face tracking according to the received face position.
- EV3 performs path planning according to the coordinates of the face position.
- the educational robot EV3 receives and analyzes the data packet sent from the Raspberry Pi 702 side to obtain the coordinate position of the face, and then complete the path planning.
- reinforcement learning algorithms can be used to realize path planning.
- Reinforcement learning mainly includes state, reward and action factors.
- the state is the coordinate position of the face detected each time
- the reward can be defined as the Euclidean distance between the center of the face and the center of the picture
- the action is the motor motion instruction executed each time.
- the motor motion can be controlled as shown in Table 1.
- path planning can be performed.
- the Q function is defined as follows: its input includes a state and an action, and it returns the reward value for performing that action in the given state.
- FIG. 11 is a schematic flowchart of a path planning algorithm based on reinforcement learning provided by an embodiment of the application, as shown in FIG. 11, including:
- Step S81 initialize the Q value table.
- Step S82 selecting a specific motor from the action set to execute the command.
- Step S83 execute a specific motor execution instruction.
- Step S84 Calculate the Q value table of this state.
- Step S85 update the Q value table.
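Steps S81 to S85 correspond to the standard tabular Q-learning update; a toy sketch with hypothetical states, actions, and values:

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    # S84/S85: move Q(s, a) toward reward + gamma * max Q(s', .).
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])

# S81: initialize the Q value table.
states, actions = ["s1", "s2"], ["a1", "a2"]
q = {s: {a: 0.0 for a in actions} for s in states}
# S82/S83: choose and execute a motor instruction, then observe the reward
# (distance between face position and picture center) and the next state.
q_update(q, "s1", "a1", reward=1.0, next_state="s2")
print(q["s1"]["a1"])  # 0.5
```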
- the action set of the educational robot EV3 is shown in Table 1.
- the state set uses the face coordinates to determine the tracking effect, that is, the distance between the face position and the center of the picture is used as the reward function, and the Q value table is updated by measuring the reward function of different actions.
- the smart mobile device 703 implements face tracking according to the motion instructions (same as the control instructions in the above embodiments).
- Smart mobile devices such as educational robots use a differential steering mechanism: the vehicle steers by controlling the speeds of the left and right driving wheels 704 and 705.
- When the driving wheels rotate at different speeds, the body will turn even if there is no steering wheel or the steering wheel does not move.
- the difference in the speed of the driving wheels can be realized by operating two separate clutches or braking devices mounted on the left and right axles.
- the smart mobile device 703 can realize different rotation trajectories according to different rotation speeds and rotation angles of the left and right wheels. The pictures collected by the vehicle differ under different rotation trajectories; by continuously optimizing the action and adjusting the position of the vehicle, the face position is finally kept at the center of the picture, realizing the face tracking function.
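The differential steering described above can be sketched as a simple wheel-speed mixer. The function name, units, and sign convention are assumptions for illustration:

```python
def wheel_speeds(base_speed, turn_rate):
    """Differential steering: the vehicle turns by driving the left and right
    wheels at different speeds, with no steered wheel required.

    base_speed: forward speed shared by both wheels (units assumed).
    turn_rate:  positive turns right, negative turns left (sign assumed).
    """
    left = base_speed + turn_rate   # outer wheel speeds up on a right turn
    right = base_speed - turn_rate  # inner wheel slows down
    return left, right
```

For example, `wheel_speeds(500, 100)` drives the left wheel at 600 and the right at 400, producing a right turn; equal speeds give straight-line motion.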
- the smart mobile device in the embodiment of the present application may also be provided with a sensor 706, such as a distance sensor or a touch sensor, for sensing information about the surrounding environment of the smart mobile device 703; the working mode and movement parameters of the smart mobile device 703 can be controlled based on the sensed environment information.
- the target tracking method can obtain the position of the target object in the collected image, and obtain the control instruction of the smart mobile device according to the distance between the position of the target object and the image center.
- the control instruction is used to adjust the rotation angle of the smart mobile device.
- the obtained control instruction includes at least one rotation instruction, each corresponding to an offset value in an offset sequence that constitutes the distance between the position of the target object and the image center,
- the obtained control instruction can place the target object at the center of the collected image after rotation, so that the target object is within the tracking range of the smart mobile device.
- the embodiment of the present application can perform target tracking according to the position of the target object in real time, which is more convenient and accurate and improves the performance of the smart mobile device.
- the embodiments of the present application can use deep learning technology to complete face detection (using neural networks to achieve target detection), which has significantly improved accuracy and speed compared to traditional target detection methods.
- a reinforcement learning algorithm may also be used to perform path planning through Q-learning technology, and the optimal rotation path may be selected.
- the embodiments of the present application can also be adapted to the requirements of different scenarios and have good scalability.
- the writing order of the steps does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible inner logic.
- the embodiments of the present application also provide target tracking devices, smart mobile devices, computer-readable storage media, and programs, all of which can be used to implement any target tracking method provided in the embodiments of the present application.
- FIG. 12 is a schematic structural diagram of a target tracking device provided by an embodiment of the application. As shown in FIG. 12, the target tracking device includes:
- the image acquisition module 10 is configured to acquire images
- the target detection module 20 is configured to determine the position of the target object in the image
- the control module 30 is configured to determine a control instruction for controlling the rotation of the smart mobile device based on the distance between the position of the target object and the center position of the image, wherein the control instruction is used to make the position of the target object be located at the center of the image, and the control instruction includes a control instruction corresponding to an offset value in an offset sequence constituting the distance, and the offset sequence includes at least one offset value.
- the device further includes a preprocessing module configured to perform a preprocessing operation on the image, and the preprocessing operation includes: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image;
- the target detection module is further configured to perform target detection processing on the image obtained after the preprocessing operation to obtain the position of the target object in the image after the preprocessing operation;
- the step of performing the normalization processing on the grayscale image by the preprocessing module includes:
- determining the average value and standard deviation of the pixel values in the grayscale image, obtaining the difference between each pixel value and the average value, and determining the ratio between the difference and the standard deviation corresponding to each pixel as the normalized pixel value of that pixel.
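The normalization just described (subtract the mean, divide by the standard deviation) can be sketched as follows; population standard deviation is assumed, and a real implementation would likely guard against a zero standard deviation on flat images:

```python
import math

def normalize_gray(pixels):
    """Per-image standardization as described: for each pixel, the normalized
    value is (pixel - mean) / standard deviation, computed over the whole
    grayscale image (given here as a list of rows of pixel values)."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((p - mean) ** 2 for p in flat) / len(flat))
    return [[(p - mean) / std for p in row] for row in pixels]
```

After this step the image has zero mean and unit standard deviation, which typically stabilizes the input range seen by the detection network.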
- the target detection module is further configured to extract image features of the image
- performing classification processing on the image features to obtain the location area of the target object, and determining the center position of the location area as the position of the target object.
- the target object includes a human face
- the target detection module is further configured to determine the position of the human face in the image.
- the control module is further configured to determine the target offset based on the distance between the position of the target object in the image and the center position of the image;
- the control module is further configured to determine, for each offset value in the multiple sets of offset sequences, the maximum value corresponding to the offset value in the value table, where the value table includes the values corresponding to offset values under different rotation commands;
- the reward value corresponding to the offset value is obtained, and the final value of the offset value is determined based on the reward value and the maximum value corresponding to the offset value, where the reward value is the distance between the position of the target object and the image center when the rotation instruction corresponding to the maximum value of the offset value has not been executed;
- the offset sequence with the largest sum of the final values of its offset values among the multiple sets of offset sequences is determined as the offset sequence that meets the requirements.
- the control module is further configured to determine the control instruction based on the rotation instruction corresponding to the maximum value of each offset value in the offset sequence that meets the requirements.
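The selection rule above can be sketched in a few lines. The `final_value` mapping is an assumption for illustration; in the described system it would be computed beforehand from the value table and the reward values:

```python
def best_offset_sequence(sequences, final_value):
    """Among candidate offset sequences (each summing to the same target
    offset), pick the one whose offset values have the largest summed final
    value, as described in the text. `final_value` maps an offset value to
    its final value (hypothetical precomputed lookup)."""
    return max(sequences, key=lambda seq: sum(final_value[v] for v in seq))
```

For instance, with `final_value = {1: 0.5, 2: 0.2, 3: -0.1}`, the candidates `[1, 1, 1]`, `[3]`, and `[1, 2]` (all summing to a target offset of 3) score 1.5, -0.1, and 0.7 respectively, so `[1, 1, 1]` is selected.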
- the target detection module is further configured to determine a control instruction for controlling the movement of the smart mobile device based on the location area of the target object, wherein:
- in response to the area of the location area of the target object being greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; in response to the area being smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, where the first threshold is greater than the second threshold.
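This distance-keeping rule (large detection box means the target is close, small box means it is far) can be sketched as a simple threshold check. The function name, command strings, and threshold values are assumptions for illustration:

```python
def distance_command(box_area, first_threshold, second_threshold):
    """Area-based forward/backward control as described: back up when the
    target's location area exceeds the first threshold, advance when it is
    below the second threshold, otherwise hold position."""
    assert first_threshold > second_threshold  # required by the scheme
    if box_area > first_threshold:
        return "backward"
    if box_area < second_threshold:
        return "forward"
    return "hold"
```

With hypothetical thresholds of 4000 and 1000 pixels, a 5000-pixel face box triggers a backward command, a 500-pixel box triggers forward, and anything in between holds.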
- an embodiment of the present application also provides a smart mobile device that includes the target tracking device described in the above embodiments, where the target detection network in the target tracking device is integrated in the management device of the smart mobile device, and the management device executes the target detection processing on the image collected by the image acquisition module to obtain the position of the target object;
- the control module is connected to the management device, and is used to generate the control instruction according to the position of the target object obtained by the management device, and control the rotation of the smart mobile device according to the control instruction.
- the management device is a Raspberry Pi.
- the smart mobile device includes an educational robot.
- the management device is also integrated with the preprocessing module of the target tracking device, which is configured to perform preprocessing operations on the images; target detection processing is then performed on the preprocessed images to obtain the position of the target object in the image.
- the functions or modules included in the apparatus provided in the embodiments of the present application can be configured to execute the methods described in the above method embodiments, and for specific implementation, refer to the description of the above method embodiments.
- the embodiment of the present application also proposes a computer-readable storage medium on which computer program instructions are stored, and the computer program instructions implement the foregoing method when executed by a processor.
- the computer-readable storage medium may be a non-volatile computer-readable storage medium.
- An embodiment of the present application also proposes an intelligent mobile device, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the above method.
- FIG. 13 is a schematic structural diagram of a smart mobile device provided by an embodiment of this application.
- the smart mobile device 800 may be any device capable of performing image processing or a mobile device capable of performing target tracking.
- the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
- the processing component 802 generally controls the overall operations of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
- the processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the foregoing method.
- the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components.
- the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.
- the memory 804 is configured to store various types of data to support the operation of the device 800. Examples of these data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos, etc.
- the memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
- the power supply component 806 provides power for various components of the device 800.
- the power supply component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
- the multimedia component 808 includes a screen that provides an output interface between the device 800 and the user.
- the screen may include a liquid crystal display (Liquid Crystal Display, LCD) and a touch panel (Touch Pad, TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
- the touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation.
- the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
- the audio component 810 is configured to output and/or input audio signals.
- the audio component 810 includes a microphone (MIC).
- When the device 800 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode, the microphone is configured to receive external audio signals.
- the received audio signal may be further stored in the memory 804 or transmitted via the communication component 816.
- the audio component 810 further includes a speaker for outputting audio signals.
- the I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module.
- the peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include but are not limited to: home button, volume button, start button, and lock button.
- the sensor component 814 includes one or more sensors for providing the device 800 with various aspects of status assessment.
- the sensor component 814 can detect the on/off status of the device 800 and the relative positioning of components, such as the display and keypad of the device 800; the sensor component 814 can also detect a position change of the device 800 or a component of the device 800, the presence or absence of contact between the user and the device 800, the orientation or acceleration/deceleration of the device 800, and temperature changes of the device 800.
- the sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact.
- the sensor component 814 may also include a light sensor, such as a Complementary Metal Oxide Semiconductor (CMOS) or Charge Coupled Device (CCD) image sensor, for use in imaging applications.
- the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
- the communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices.
- the device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
- the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel.
- the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communication.
- the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
- the device 800 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components to implement the above methods.
- a non-volatile computer-readable storage medium is also provided, such as the memory 804 including computer program instructions, which can be executed by the processor 820 of the device 800 to implement the foregoing methods.
- the embodiments of the application may be systems, methods and/or computer program products.
- the computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the embodiments of the present application.
- the computer-readable storage medium may be a tangible device that can hold and store instructions used by the instruction execution device.
- the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
- Computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanical encoding devices such as punch cards or raised structures in grooves on which instructions are stored, and any suitable combination of the above.
- the computer-readable storage medium used here is not to be interpreted as a transient signal itself, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (for example, light pulses through fiber-optic cables), or electrical signals transmitted through wires.
- the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
- the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
- the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in the computer-readable storage medium in each computing/processing device.
- the computer program instructions used to perform the operations of the embodiments of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
- Computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
- the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
- an electronic circuit, such as a programmable logic circuit, field programmable gate array (FPGA), or programmable logic array (PLA), can be customized by using the state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement various aspects of the embodiments of the present application.
- These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that when these instructions are executed by the processor of the computer or other programmable data processing apparatus, an apparatus that implements the functions/actions specified in one or more blocks of the flowchart and/or block diagram is produced. These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions make computers, programmable data processing apparatuses, and/or other devices work in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions for implementing various aspects of the functions/actions specified in one or more blocks of the flowchart and/or block diagram.
- each block in the flowchart or block diagram may represent a module, program segment, or part of an instruction, and the module, program segment, or part of an instruction contains one or more executable instructions for implementing the specified logical function. The blocks may also occur in a different order from that marked in the drawings; for example, two consecutive blocks can actually be executed in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
- each block in the block diagram and/or flowchart, and the combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
- the embodiment of the application discloses a target tracking method and device, a smart mobile device, and a storage medium.
- the method includes: acquiring a captured image; determining the location of a target object in the image; and obtaining, based on the distance between the location of the target object and the center position of the image, a control instruction used to control a smart mobile device, wherein the control instruction is used to make the position of the target object be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to an offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
- the embodiments of the present application can realize real-time tracking of target objects.
Abstract
Description
Action | Value |
Motor speed | 0-1000 |
Motor rotation angle | 0-360 |
Motor rotation time | ~ |
Motor stop action | Hold, interrupt |
| a1 | a2 | a3 |
s1 | 1 | 2 | 3 |
s2 | 1 | 1 | 2 |
s3 | 4 | 2 | 1 |
Claims (25)
- A target tracking method, comprising: obtaining a collected image; determining the position of a target object in the image; and determining, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, wherein the control instruction is used to make the position of the target object be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to an offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
- The method according to claim 1, wherein before determining the position of the target object in the image, the method further comprises performing a preprocessing operation on the image, the preprocessing operation comprising: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image; wherein determining the position of the target object in the image comprises: performing target detection processing on the image obtained after the preprocessing operation to obtain the position of the target object in the preprocessed image; and determining the position of the target object in the image based on the position of the target object in the preprocessed image.
- The method according to claim 2, wherein performing the normalization processing on the grayscale image comprises: determining the average value and standard deviation of the pixel values of the pixels in the grayscale image; obtaining the difference between the pixel value of each pixel and the average value; and determining the ratio between the difference and the standard deviation corresponding to each pixel as the normalized pixel value of that pixel.
- The method according to any one of claims 1 to 3, wherein determining the position of the target object in the image comprises: extracting image features of the image; performing classification processing on the image features to obtain the location area of the target object in the image; and determining the center position of the location area as the position of the target object.
- The method according to any one of claims 1 to 4, wherein the target object includes a human face; correspondingly, determining the position of the target object in the image comprises: determining the position of the human face in the image.
- The method according to any one of claims 1 to 5, wherein determining, based on the distance between the position of the target object and the center position of the image, the control instruction for controlling the rotation of the smart mobile device comprises: determining a target offset based on the distance between the position of the target object in the image and the center position of the image; generating multiple sets of offset sequences based on the target offset, the sum of the offset values in each set of offset sequences being the target offset; and using a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences, and determining the control instruction corresponding to the offset sequence that meets the requirements.
- The method according to claim 6, wherein using a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences comprises: for each offset value in the multiple sets of offset sequences, determining the maximum value corresponding to the offset value in a value table, the value table including the values corresponding to offset values under different rotation instructions; obtaining the reward value corresponding to the offset value, and determining the final value of the offset value based on the reward value and the maximum value corresponding to the offset value, the reward value being the distance between the position of the target object and the center position of the image when the rotation instruction corresponding to the maximum value of the offset value has not been executed; and determining the offset sequence with the largest sum of the final values of its offset values among the multiple sets of offset sequences as the offset sequence that meets the requirements.
- The method according to claim 6 or 7, wherein determining the control instruction corresponding to the offset sequence that meets the requirements comprises: determining the control instruction based on the rotation instruction corresponding to the maximum value of each offset value in the offset sequence that meets the requirements.
- The method according to any one of claims 1 to 8, further comprising: driving the smart mobile device to perform rotation based on the control instruction.
- The method according to claim 4, further comprising: determining, based on the location area of the target object, a control instruction for controlling the movement of the smart mobile device, wherein: in response to the area corresponding to the location area of the target object being greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and in response to the area corresponding to the location area of the target object being smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, the first threshold being greater than the second threshold.
- A target tracking device, comprising: an image acquisition module configured to acquire an image; a target detection module configured to determine the position of a target object in the image; and a control module configured to determine, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, wherein the control instruction is used to make the position of the target object be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to an offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
- The apparatus according to claim 11, further comprising a preprocessing module configured to perform a preprocessing operation on the image, the preprocessing operation comprising: adjusting the image into a grayscale image of a preset size, and performing normalization processing on the grayscale image; the target detection module is further configured to perform target detection processing on the image obtained after the preprocessing operation to obtain the position of the target object in the preprocessed image, and to determine the position of the target object in the original image based on the position of the target object in the preprocessed image.
- The apparatus according to claim 12, wherein the step, performed by the preprocessing module, of performing normalization processing on the grayscale image comprises: determining the average value and the standard deviation of the pixel values of the pixels in the grayscale image; obtaining the difference between the pixel value of each pixel and the average value; and determining the ratio between that difference and the standard deviation as the normalized pixel value of each pixel.
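The normalization described here is a per-image z-score: subtract the mean pixel value, divide by the standard deviation. A dependency-free sketch (a real pipeline would operate on a NumPy array rather than a flat list; `normalize` and its input format are illustrative):

```python
import math

def normalize(gray):
    """Z-score normalize a grayscale image given as a flat list of
    pixel values: (pixel - mean) / std, using the population standard
    deviation. Assumes the image is not constant (std > 0)."""
    n = len(gray)
    mean = sum(gray) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in gray) / n)
    return [(p - mean) / std for p in gray]

out = normalize([0, 50, 100, 150, 200, 250])
print([round(v, 3) for v in out])  # zero-mean, unit-variance values
```

After this step the pixel values have zero mean and unit variance, which keeps the detection network's inputs on a consistent scale regardless of scene brightness.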
- The apparatus according to any one of claims 11 to 13, wherein the target detection module is further configured to: extract image features of the image; perform classification processing on the image features to obtain the location area of the target object in the image; and determine the center position of the location area as the position of the target object.
- The apparatus according to any one of claims 11 to 14, wherein the target object comprises a human face; correspondingly, the target detection module is further configured to determine the position of the human face in the image.
- The apparatus according to any one of claims 11 to 15, wherein the control module is further configured to: determine a target offset based on the distance between the position of the target object in the image and the center position of the image; generate multiple sets of offset sequences based on the target offset, the sum of the offset values in each set of offset sequences being equal to the target offset; and select, using a reinforcement learning algorithm, an offset sequence that meets requirements from the multiple sets of offset sequences, and obtain the control instruction corresponding to the selected offset sequence.
- The apparatus according to claim 16, wherein the control module is further configured to: for each offset value in the multiple sets of offset sequences, determine the maximum value corresponding to the offset value in a value table, the value table including the values of offset values under different rotation instructions; obtain the reward value corresponding to the offset value, and determine the final value of the offset value based on the reward value and the maximum value, the reward value being the distance between the position of the target object and the center position of the image in the case where the rotation instruction corresponding to the maximum value of the offset value has not been executed; and determine the offset sequence, among the multiple sets of offset sequences, whose offset values have the largest sum of final values as the offset sequence that meets requirements.
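The sequence selection in claims 16 and 17 can be sketched as a lookup in a Q-table keyed by offset value and rotation instruction: decompose the target offset into candidate sequences, score each offset value by its maximum table value plus a reward term, and keep the highest-scoring sequence. Everything below is illustrative; the decomposition scheme, table contents, and reward function are assumptions, not the patent's trained values.

```python
from itertools import product

def decompose(target, step_values, max_len=3):
    """Enumerate candidate offset sequences (up to max_len steps)
    whose offset values sum to the target offset. Hypothetical helper;
    the patent does not fix a particular generation scheme."""
    return [combo
            for length in range(1, max_len + 1)
            for combo in product(step_values, repeat=length)
            if sum(combo) == target]

def best_sequence(target, q_table, reward):
    """Score each candidate sequence by summing, per offset value, the
    maximum value over rotation instructions in the value table plus a
    reward term, then keep the highest-scoring sequence."""
    candidates = decompose(target, sorted(q_table))
    def score(seq):
        return sum(max(q_table[o].values()) + reward(o) for o in seq)
    return max(candidates, key=score)

# Toy value table: offset value -> {rotation instruction: value}.
q = {1: {"left_slow": 0.2, "left_fast": 0.1},
     2: {"left_slow": 0.5, "left_fast": 0.7},
     3: {"left_slow": 0.9, "left_fast": 0.4}}
print(best_sequence(3, q, reward=lambda o: 0.0))
```

In the claim, the reward is tied to the residual distance from the image center, so sequences whose rotation instructions leave the target far off-center score poorly; the zero reward above is just a placeholder.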
- The apparatus according to claim 16 or 17, wherein the control module is further configured to determine the control instruction based on the rotation instructions corresponding to the maximum values of the offset values in the offset sequence that meets requirements.
- The apparatus according to claim 14, wherein the target detection module is further configured to determine, based on the location area of the target object, a control instruction for controlling movement of the smart mobile device, wherein: in a case where the area corresponding to the location area of the target object is greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and in a case where the area corresponding to the location area of the target object is smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, the first threshold being greater than the second threshold.
- A smart mobile device, comprising the target tracking apparatus according to any one of claims 11 to 19, wherein the target detection module in the target tracking apparatus is integrated in a management device of the smart mobile device, and the management device performs target detection processing on the images acquired by the image acquisition module to obtain the position of the target object; the control module is connected to the management device and is configured to generate the control instruction according to the position of the target object obtained by the management device, and to control rotation of the smart mobile device according to the control instruction.
- The device according to claim 20, wherein the management device further integrates the preprocessing module of the target tracking apparatus, configured to perform a preprocessing operation on the image and to perform target detection processing on the preprocessed image to obtain the position of the target object in the image.
- The device according to claim 20 or 21, wherein the smart mobile device comprises an educational robot.
- A smart mobile device, comprising: a processor; and a memory configured to store instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 10.
- A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 10.
- A computer program, comprising computer-readable code, wherein when the computer-readable code runs on a smart mobile device, a processor in the smart mobile device executes instructions for implementing the method according to any one of claims 1 to 10.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020217014152A KR20210072808A (en) | 2019-07-17 | 2020-05-11 | Target tracking method and device, smart mobile device and storage medium |
JP2021525569A JP2022507145A (en) | 2019-07-17 | 2020-05-11 | Target tracking methods and equipment, intelligent mobile equipment and storage media |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910646696.8 | 2019-07-17 | ||
CN201910646696.8A CN110348418B (en) | 2019-07-17 | 2019-07-17 | Target tracking method and device, intelligent mobile device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021008207A1 true WO2021008207A1 (en) | 2021-01-21 |
Family
ID=68175655
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/089620 WO2021008207A1 (en) | 2019-07-17 | 2020-05-11 | Target tracking method and apparatus, intelligent mobile device and storage medium |
Country Status (5)
Country | Link |
---|---|
JP (1) | JP2022507145A (en) |
KR (1) | KR20210072808A (en) |
CN (1) | CN110348418B (en) |
TW (2) | TWI755762B (en) |
WO (1) | WO2021008207A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110348418B (en) * | 2019-07-17 | 2022-03-11 | 上海商汤智能科技有限公司 | Target tracking method and device, intelligent mobile device and storage medium |
CN112207821B (en) * | 2020-09-21 | 2021-10-01 | 大连遨游智能科技有限公司 | Target searching method of visual robot and robot |
CN113409220A (en) * | 2021-06-28 | 2021-09-17 | 展讯通信(天津)有限公司 | Face image processing method, device, medium and equipment |
CN115037877A (en) * | 2022-06-08 | 2022-09-09 | 湖南大学重庆研究院 | Automatic following method and device and safety monitoring method and device |
CN117238039B (en) * | 2023-11-16 | 2024-03-19 | 暗物智能科技(广州)有限公司 | Multitasking human behavior analysis method and system based on top view angle |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101888479A (en) * | 2009-05-14 | 2010-11-17 | 汉王科技股份有限公司 | Method and device for detecting and tracking target image |
CN104751486A (en) * | 2015-03-20 | 2015-07-01 | 安徽大学 | Moving object relay tracing algorithm of multiple PTZ (pan/tilt/zoom) cameras |
CN105740644A (en) * | 2016-03-24 | 2016-07-06 | 苏州大学 | Cleaning robot optimal target path planning method based on model learning |
CN107992099A (en) * | 2017-12-13 | 2018-05-04 | 福州大学 | A kind of target sport video tracking and system based on improvement frame difference method |
CN109040574A (en) * | 2017-06-08 | 2018-12-18 | 北京君正集成电路股份有限公司 | A kind of method and device of rotation head-shaking machine tracking target |
CN110348418A (en) * | 2019-07-17 | 2019-10-18 | 上海商汤智能科技有限公司 | Method for tracking target and device, Intelligent mobile equipment and storage medium |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1178467C (en) * | 1998-04-16 | 2004-12-01 | 三星电子株式会社 | Method and apparatus for automatically tracing moving object |
US7430315B2 (en) * | 2004-02-13 | 2008-09-30 | Honda Motor Co. | Face recognition system |
JP3992026B2 (en) * | 2004-07-09 | 2007-10-17 | 船井電機株式会社 | Self-propelled robot |
JP2010176504A (en) * | 2009-01-30 | 2010-08-12 | Canon Inc | Image processor, image processing method, and program |
JP2012191265A (en) * | 2011-03-08 | 2012-10-04 | Nikon Corp | Image processing apparatus and program |
CN102411368B (en) * | 2011-07-22 | 2013-10-09 | 北京大学 | Active vision human face tracking method and tracking system of robot |
CN102307297A (en) * | 2011-09-14 | 2012-01-04 | 镇江江大科茂信息系统有限责任公司 | Intelligent monitoring system for multi-azimuth tracking and detecting on video object |
KR102131477B1 (en) * | 2013-05-02 | 2020-07-07 | 퀄컴 인코포레이티드 | Methods for facilitating computer vision application initialization |
JP6680498B2 (en) * | 2015-09-28 | 2020-04-15 | 株式会社日立システムズ | Autonomous flying vehicle, target tracking method |
WO2017120336A2 (en) * | 2016-01-05 | 2017-07-13 | Mobileye Vision Technologies Ltd. | Trained navigational system with imposed constraints |
CN113589833A (en) * | 2016-02-26 | 2021-11-02 | 深圳市大疆创新科技有限公司 | Method for visual target tracking |
CN108292141B (en) * | 2016-03-01 | 2022-07-01 | 深圳市大疆创新科技有限公司 | Method and system for target tracking |
CN107798723B (en) * | 2016-08-30 | 2021-11-19 | 北京神州泰岳软件股份有限公司 | Target tracking control method and device |
US10140719B2 (en) * | 2016-12-22 | 2018-11-27 | TCL Research America Inc. | System and method for enhancing target tracking via detector and tracker fusion for unmanned aerial vehicles |
JP6856914B2 (en) * | 2017-07-18 | 2021-04-14 | ハンジョウ タロ ポジショニング テクノロジー カンパニー リミテッドHangzhou Taro Positioning Technology Co.,Ltd. | Intelligent object tracking |
CN108549413A (en) * | 2018-04-27 | 2018-09-18 | 全球能源互联网研究院有限公司 | A kind of holder method of controlling rotation, device and unmanned vehicle |
CN108806146A (en) * | 2018-06-06 | 2018-11-13 | 合肥嘉仕诚能源科技有限公司 | A kind of safety monitoring dynamic object track lock method and system |
CN109992000B (en) * | 2019-04-04 | 2020-07-03 | 北京航空航天大学 | Multi-unmanned aerial vehicle path collaborative planning method and device based on hierarchical reinforcement learning |
- 2019-07-17: CN CN201910646696.8A patent/CN110348418B/en, active
- 2020-05-11: WO PCT/CN2020/089620 patent/WO2021008207A1/en, active (Application Filing)
- 2020-05-11: KR KR1020217014152A patent/KR20210072808A/en, not active (Application Discontinuation)
- 2020-05-11: JP JP2021525569A patent/JP2022507145A/en, not active (Ceased)
- 2020-06-19: TW TW109120760A patent/TWI755762B/en, active
- 2020-06-19: TW TW110149350A patent/TW202215364A/en, unknown
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113139655A (en) * | 2021-03-31 | 2021-07-20 | 北京大学 | Target tracking training method and tracking method based on reinforcement learning |
CN113139655B (en) * | 2021-03-31 | 2022-08-19 | 北京大学 | Target tracking training method and tracking method based on reinforcement learning |
CN115250329A (en) * | 2021-04-28 | 2022-10-28 | 深圳市三诺数字科技有限公司 | Camera control method and device, computer equipment and storage medium |
CN115250329B (en) * | 2021-04-28 | 2024-04-19 | 深圳市三诺数字科技有限公司 | Camera control method and device, computer equipment and storage medium |
CN113625658A (en) * | 2021-08-17 | 2021-11-09 | 杭州飞钛航空智能装备有限公司 | Offset information processing method and device, electronic equipment and hole making mechanism |
Also Published As
Publication number | Publication date |
---|---|
CN110348418A (en) | 2019-10-18 |
TW202215364A (en) | 2022-04-16 |
KR20210072808A (en) | 2021-06-17 |
TW202105326A (en) | 2021-02-01 |
TWI755762B (en) | 2022-02-21 |
CN110348418B (en) | 2022-03-11 |
JP2022507145A (en) | 2022-01-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021008207A1 (en) | Target tracking method and apparatus, intelligent mobile device and storage medium | |
US20200387698A1 (en) | Hand key point recognition model training method, hand key point recognition method and device | |
US11468581B2 (en) | Distance measurement method, intelligent control method, electronic device, and storage medium | |
WO2020187153A1 (en) | Target detection method, model training method, device, apparatus and storage medium | |
TWI728621B (en) | Image processing method and device, electronic equipment, computer readable storage medium and computer program | |
WO2021164469A1 (en) | Target object detection method and apparatus, device, and storage medium | |
TWI724736B (en) | Image processing method and device, electronic equipment, storage medium and computer program | |
TWI766286B (en) | Image processing method and image processing device, electronic device and computer-readable storage medium | |
CN111985265B (en) | Image processing method and device | |
WO2021017836A1 (en) | Method for controlling display of large-screen device, and mobile terminal and first system | |
WO2020216054A1 (en) | Sight line tracking model training method, and sight line tracking method and device | |
WO2022127919A1 (en) | Surface defect detection method, apparatus, system, storage medium, and program product | |
CN104156915A (en) | Skin color adjusting method and device | |
CN110443366B (en) | Neural network optimization method and device, and target detection method and device | |
WO2021035812A1 (en) | Image processing method and apparatus, electronic device and storage medium | |
US20190271940A1 (en) | Electronic device, external device capable of being combined with the electronic device, and a display method thereof | |
KR101623642B1 (en) | Control method of robot cleaner and terminal device and robot cleaner control system including the same | |
CN104063865A (en) | Classification model creation method, image segmentation method and related device | |
WO2020220973A1 (en) | Photographing method and mobile terminal | |
CN108156374A (en) | A kind of image processing method, terminal and readable storage medium storing program for executing | |
CN111435422B (en) | Action recognition method, control method and device, electronic equipment and storage medium | |
TWI770531B (en) | Face recognition method, electronic device and storage medium thereof | |
US20230245344A1 (en) | Electronic device and controlling method of electronic device | |
WO2023137923A1 (en) | Person re-identification method and apparatus based on posture guidance, and device and storage medium | |
CN105608469A (en) | Image resolution determination method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20841368 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2021525569 Country of ref document: JP Kind code of ref document: A Ref document number: 20217014152 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20841368 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.09.2022) |
|