WO2021008207A1 - Target tracking method and apparatus, intelligent mobile device and storage medium - Google Patents

Target tracking method and apparatus, intelligent mobile device and storage medium

Info

Publication number
WO2021008207A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
offset
target object
value
target
Prior art date
Application number
PCT/CN2020/089620
Other languages
French (fr)
Chinese (zh)
Inventor
张军伟 (Zhang Junwei)
Original Assignee
上海商汤智能科技有限公司 (Shanghai SenseTime Intelligent Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 (Shanghai SenseTime Intelligent Technology Co., Ltd.)
Priority to KR1020217014152A (published as KR20210072808A)
Priority to JP2021525569A (published as JP2022507145A)
Publication of WO2021008207A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/245 Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/32 Normalisation of the pattern dimensions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/47 Detecting features for summarising video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Definitions

  • The embodiments of the present application relate to the field of computer vision technology, and in particular, but not exclusively, to a target tracking method and apparatus, a smart mobile device, and a storage medium.
  • smart mobile devices such as remote control cars and mobile robots are used in various fields.
  • remote control cars can be used as teaching tools to achieve target tracking.
  • the embodiment of the present application proposes a target tracking method and device, smart mobile equipment and storage medium.
  • An embodiment of the present application provides a target tracking method, including: acquiring a captured image; determining the position of a target object in the image; and determining, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, wherein the control instruction is used to cause the target object to be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to each offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
  • Before determining the position of the target object in the image, the method further includes performing a preprocessing operation on the image, the preprocessing operation including: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image; wherein determining the position of the target object in the image includes: performing target detection processing on the image obtained after the preprocessing operation to obtain the position of the target object in the preprocessed image; and determining the position of the target object in the original image based on the position of the target object in the preprocessed image.
  • Performing normalization processing on the grayscale image includes: determining the average value and standard deviation of the pixel values of the pixels in the grayscale image; obtaining the difference between the pixel value of each pixel and the average value; and determining the ratio between that difference and the standard deviation as the normalized pixel value of each pixel.
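As a minimal sketch of this normalization step (assuming the grayscale image is a NumPy array; the small epsilon guard against a zero standard deviation is an added implementation detail, not part of the original):

```python
import numpy as np

def normalize_grayscale(gray: np.ndarray) -> np.ndarray:
    """Normalize a grayscale image as described above: subtract the mean
    pixel value, then divide by the standard deviation."""
    gray = gray.astype(np.float32)
    mean = gray.mean()   # average of all pixel values
    std = gray.std()     # standard deviation of all pixel values
    return (gray - mean) / (std + 1e-8)  # epsilon avoids division by zero
```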
  • Determining the location of the target object in the image includes: extracting image features of the image; performing classification processing on the image features to obtain the location area of the target object in the image; and determining the center position of the location area as the position of the target object.
  • The target object includes a human face; correspondingly, determining the position of the target object in the image includes determining the position of the human face in the image.
  • Determining the control instruction for controlling the rotation of the smart mobile device based on the distance between the position of the target object and the center position of the image includes: determining a target offset based on the distance between the position of the target object in the image and the center position of the image; generating multiple sets of offset sequences based on the target offset, where the sum of the offset values in each set of offset sequences is the target offset; and using a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences and determine the control instruction corresponding to that offset sequence.
  • Using a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences includes: for each offset value in the multiple sets of offset sequences, determining the maximum value corresponding to the offset value in a value table, where the value table includes the values corresponding to the offset value under different rotation instructions; obtaining the reward value corresponding to the offset value, and determining the final value of the offset value based on the reward value and the maximum value, where the reward value is the distance between the position of the target object and the center position of the image when the rotation instruction corresponding to the maximum value of the offset value has not yet been executed; and determining the offset sequence with the largest sum of final values among the multiple sets of offset sequences as the offset sequence that meets the requirements.
  • Determining the control instruction corresponding to the offset sequence that meets the requirements includes: determining the control instruction based on the rotation instruction corresponding to the maximum value of each offset value in the offset sequence that meets the requirements.
  • the method further includes: driving the smart mobile device to perform rotation based on the control instruction.
  • The method further includes: determining a control instruction for controlling the movement of the smart mobile device based on the location area of the target object, wherein, in response to the area corresponding to the location area of the target object being greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and in response to the area corresponding to the location area of the target object being less than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, where the first threshold is greater than the second threshold.
  • An embodiment of the application provides a target tracking device, which includes: an image acquisition module configured to acquire an image; a target detection module configured to determine the position of a target object in the image; and a control module configured to determine, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of the smart mobile device, wherein the control instruction is used to cause the position of the target object to be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to each offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
  • The device further includes a preprocessing module configured to perform a preprocessing operation on the image, the preprocessing operation including: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image. The target detection module is further configured to perform target detection processing on the image obtained after the preprocessing operation to obtain the position of the target object in the preprocessed image, and to determine the position of the target object in the original image based on the position of the target object in the preprocessed image.
  • The step in which the preprocessing module performs the normalization processing on the grayscale image includes: determining the average value and standard deviation of the pixel values of the pixels in the grayscale image; obtaining the difference between the pixel value of each pixel and the average value; and determining the ratio between that difference and the standard deviation as the normalized pixel value of each pixel.
  • The target detection module is further configured to extract image features of the image, perform classification processing on the image features to obtain the location area of the target object in the image, and determine the center position of the location area as the position of the target object.
  • the target object includes a human face; correspondingly, the target detection module is further configured to determine the position of the human face in the image.
  • The control module is further configured to determine a target offset based on the distance between the position of the target object in the image and the center position of the image; generate multiple sets of offset sequences based on the target offset, where the sum of the offset values in each set of offset sequences is the target offset; and use a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences and obtain the control instruction corresponding to that offset sequence.
  • The control module is further configured to: for each offset value in the multiple sets of offset sequences, determine the maximum value corresponding to the offset value in the value table, where the value table includes the values corresponding to the offset value under different rotation instructions; obtain the reward value corresponding to the offset value, and determine the final value of the offset value based on the reward value and the maximum value, where the reward value is the distance between the position of the target object and the center of the image when the rotation instruction corresponding to the maximum value of the offset value has not yet been executed; and determine the offset sequence with the largest sum of final values among the multiple sets of offset sequences as the offset sequence that meets the requirements.
  • control module is further configured to determine the control instruction based on the rotation instruction corresponding to the maximum value of each offset value in the offset sequence that meets the requirements.
  • The target detection module is further configured to determine a control instruction for controlling the movement of the smart mobile device based on the location area of the target object, wherein, if the area corresponding to the location area of the target object is greater than the first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and if the area corresponding to the location area of the target object is less than the second threshold, a control instruction for controlling the smart mobile device to move forward is generated, where the first threshold is greater than the second threshold.
  • An embodiment of the present application provides a smart mobile device, which includes the target tracking device. The target detection module in the target tracking device is integrated in the management device of the smart mobile device, and the management device performs the target detection processing on the image collected by the image acquisition module to obtain the position of the target object. The control module is connected with the management device and is used to generate the control instruction according to the position of the target object obtained by the management device, and to control the rotation of the smart mobile device through the control instruction.
  • The management device is also integrated with the preprocessing module of the target tracking device, which is used to perform preprocessing operations on the images; target detection processing is then performed on the preprocessed images to obtain the position of the target object in the image.
  • the smart mobile device includes an educational robot.
  • An embodiment of the present application provides a smart mobile device, which includes: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to call the instructions stored in the memory to execute the target tracking method described in any one of the above items.
  • An embodiment of the present application provides a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the target tracking method described in any one of the first aspect is implemented.
  • An embodiment of the present application provides a computer program, including computer-readable code; when the computer-readable code is executed, the processor in the smart mobile device executes the target tracking method.
  • The target tracking method and apparatus, smart mobile device, and storage medium provided by the embodiments of the application can obtain the position of the target object in the collected image, and obtain the control instruction of the smart mobile device according to the distance between the position of the target object and the image center. The control instruction is used to control the rotation of the smart mobile device, and includes a rotation instruction corresponding to each of at least one offset value, where the offset sequence formed by the offset values is determined by the distance between the position of the target object and the image center. The obtained control instruction can cause the target object to be at the center of the collected image after the rotation, so that the target object is within the tracking range of the smart mobile device.
  • the target tracking method and device, smart mobile device, and storage medium provided in the embodiments of the present application can perform target tracking according to the position of the target object in real time, which is more convenient and accurate.
  • FIG. 1 is a schematic flowchart of a target tracking method provided by an embodiment of this application
  • FIG. 2 is a schematic diagram of a process of performing preprocessing on an image provided by an embodiment of the application
  • FIG. 3 is a schematic flowchart of step S20 in a target tracking method provided by an embodiment of this application;
  • FIG. 4 is a schematic flowchart of step S30 in a target tracking method provided by an embodiment of this application;
  • FIG. 5 is a schematic flowchart of step S303 in a target tracking method provided by an embodiment of this application;
  • FIG. 6 is a schematic diagram of another process of a target tracking method provided by an embodiment of the application.
  • FIG. 7 is an application example diagram of a target tracking method provided by an embodiment of the application.
  • FIG. 8 is a schematic flowchart of a preprocessing process provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of the training process of the target detection network provided by an embodiment of the application.
  • FIG. 10 is a schematic diagram of the application process of the target detection network provided by an embodiment of this application.
  • FIG. 11 is a schematic flowchart of a path planning algorithm based on reinforcement learning provided by an embodiment of the application.
  • FIG. 12 is a schematic structural diagram of a target tracking device provided by an embodiment of this application.
  • FIG. 13 is a schematic structural diagram of a smart mobile device provided by an embodiment of this application.
  • the embodiment of the application provides a target tracking method, which can be applied to any smart mobile device with image processing function.
  • the target tracking method can be applied to devices such as mobile robots, remote-controlled vehicles, and aircraft.
  • the target tracking method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • FIG. 1 is a schematic flowchart of a target tracking method provided by an embodiment of the application. As shown in FIG. 1, the target tracking method includes:
  • Step S10 Obtain the collected image
  • The smart mobile device to which the target tracking method of the embodiments of the present application is applied may include an image acquisition device, such as a camera.
  • images can be directly collected by an image collection device, or video data can be collected by the image collection device, and the video data can be subjected to frame division or frame selection processing to obtain corresponding images.
  • Step S20 Determine the position of the target object in the image
  • Target detection processing can be performed on the captured image; that is, it is detected whether the target object exists in the captured image, and when the target object exists, the position of the target object is determined.
  • the target detection processing can be realized through a neural network.
  • the target object detected by the embodiment of the present application may be any type of object, for example, the target object may be a human face, or the target object may be another object to be tracked, which is not specifically limited in the embodiment of the present application.
  • The target object may be an object with a specific known identity; that is, the embodiments of the present application can perform tracking of a corresponding type of object (such as all face images) or tracking of an object with a specific identity (such as a known specific face image), which can be set according to requirements and is not specifically limited in the embodiment of the present application.
  • The neural network that implements the target detection processing may be a convolutional neural network; after training, the neural network can accurately detect the position of the target object in the image. The embodiment of the present application does not limit the form of the neural network.
  • In the process of performing target detection processing on the image, feature extraction is performed on the image to obtain image features, classification processing is then performed on the image features to obtain the location area of the target object in the image, and the location of the target object is determined based on the location area.
  • The classification result obtained by the classification processing may include an identifier of whether the target object exists in the image, such as a first identifier or a second identifier, where the first identifier indicates that the pixel corresponding to the current position in the image belongs to the target object, and the second identifier indicates that it does not. The position of the target object in the image can be determined by the area formed by the first identifiers; for example, the center position of that area can be determined as the position of the target object.
  • the position of the target object in the image can be directly obtained, for example, the position of the target object can be expressed in the form of coordinates.
  • the center position of the position area of the target object in the image may be used as the position of the target object.
  • In the case where no target object is detected, the output position is empty.
  • Step S30 Determine a control instruction for controlling the rotation of the smart mobile device based on the distance between the position of the target object and the center position of the image, wherein the control instruction is used to cause the position of the target object to be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to each offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
  • When the position of the target object in the image is obtained, the smart mobile device can be controlled to move according to the position, so that the target object can be located at the center of the collected image, thereby realizing tracking of the target object.
  • the embodiment of the present application can obtain a control instruction for controlling the rotation of the smart mobile device according to the distance between the position of the target object in the image and the center position of the image, so that the position of the target object can be located at the center of the currently collected image .
  • The control instruction may include rotation instructions respectively corresponding to at least one offset value, where the distance between the position of the target object and the center position of the image is determined according to the offset sequence formed by the at least one offset value. The distance in the embodiment of the present application can be a directed distance (such as a direction vector), and each offset value can also be a direction vector. The direction vector corresponding to the distance can be obtained by adding the direction vectors corresponding to the offset values; that is, by executing the rotation instruction corresponding to each offset value, the offset of each offset value can be realized, finally placing the target object at the center of the currently collected image.
  • In this way, starting from the moment when the image following the current image is captured, the target object may always be located at the center of the captured image.
  • The embodiment of the application can quickly adjust the rotation of the smart mobile device according to the position of the target object in the previous image, so that the target object is at the center of the collected image; even when the target object is moving, it is possible to track and shoot the target object so that the target object remains in the frame of the collected image.
  • the embodiment of the present application may use a reinforcement learning algorithm to execute the planning of the rotation path of the smart mobile device, and obtain a control instruction for positioning the target object in the center of the image.
  • the control instruction may be determined based on the reinforcement learning algorithm
  • the reinforcement learning algorithm may be a value learning algorithm (Q-learning algorithm).
  • The movement path of the smart mobile device is optimized based on a comprehensive evaluation of the movement time, the convenience of the movement path, and the energy consumption of the smart mobile device, and the control instructions corresponding to the optimal movement path are obtained.
  • the embodiment of the present application can conveniently and accurately realize real-time tracking of the target object, and control the rotation of the smart mobile device according to the position of the target object, so that the target object is located in the center of the collected image.
  • The control instruction of the smart mobile device can be obtained according to the distance between the position of the target object in the image and the center position of the image. The control instruction is used to control the rotation of the smart mobile device, and includes a rotation instruction corresponding to each of at least one offset value, where the offset sequence formed by the offset values is determined by the distance between the position of the target object and the center of the image. The obtained control instruction can place the target object at the center of the captured image after the rotation, thereby keeping the target object within the tracking range of the smart mobile device.
  • The embodiment of the present application can perform target tracking according to the position of the target object in real time, which is more convenient and accurate and improves the performance of the smart mobile device.
  • the embodiment of the present application may perform target detection processing on the image when the image is collected.
  • Since the specifications, types, and other parameters of the collected images may differ, preprocessing operations can be performed on the images before target detection processing to obtain normalized images.
  • the method further includes performing a preprocessing operation on the image.
  • FIG. 2 is a schematic diagram of the image preprocessing process provided by an embodiment of the application. As shown in FIG. 2, the preprocessing operation includes:
  • Step S11 Adjust the image to a grayscale image of a preset specification.
  • The captured image may be a color image or an image in another form; the captured image can be converted into an image of a preset specification, and the image of the preset specification can then be converted into a grayscale image.
  • The preset specification may be 640*480, but this is not a specific limitation of the embodiment of the present application. Converting a color image or an image in another form into a grayscale image can be based on processing of the pixel values; for example, the pixel value of each pixel can be divided by the maximum pixel value, and the corresponding grayscale value can be obtained based on the result. This is only illustrative, and the embodiment of the present application does not specifically limit the process.
  • By converting the image into a grayscale image and sending it directly to the network model for detection, the embodiment of the present application can reduce resource consumption and increase processing speed.
  • Step S12 Perform normalization processing on the grayscale image.
  • normalization processing can be performed on the grayscale image.
  • the pixel values of the image can be normalized to the same scale range.
  • The normalization processing may include: determining the average value and standard deviation of the pixel values of the pixels in the grayscale image; determining the difference between the pixel value of each pixel and the average value; and determining the ratio between that difference and the standard deviation as the normalized pixel value of the pixel.
  • The images collected in the embodiment of the present application may be one or multiple. In the case of one collected image, one grayscale image is obtained, and the average value and standard deviation of its pixel values can be computed; the pixel value of each pixel is then updated to the ratio between its difference from the average value and the standard deviation.
  • In the case of multiple grayscale images, the average value and standard deviation of the pixel values can be determined over the pixels of all the grayscale images; that is, the average value and standard deviation in the embodiment of the present application may be for one image or for multiple images. The difference between the pixel value of each pixel of each image and the average value is obtained, the ratio between that difference and the standard deviation is computed, and this ratio is used to update the pixel value of the pixel.
  • the pixel value of each pixel in the grayscale image can be unified to the same scale, and the normalization processing of the collected image can be realized.
  • The preprocessing may also be performed in other manners; for example, it is possible to only convert the image to the preset specification and perform normalization processing on the image of the preset specification. That is, the embodiment of the present application may also perform normalization processing on color images.
  • In that case, the average value and standard deviation of the feature value of each channel over the pixels in the color image can be obtained; for example, the mean and standard deviation of the feature value (R value) of the red (Red, R) channel, the mean and standard deviation of the feature value (G value) of the green (Green, G) channel, and the mean and standard deviation of the feature value (B value) of the blue (Blue, B) channel can be obtained. Then, the new feature value of the corresponding color channel is obtained as the ratio between the difference between the feature value of that channel and its average value, and the standard deviation. In this way, the updated feature value of each color channel of each pixel of each image is obtained, and a normalized image is obtained.
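A hedged sketch of this per-channel color normalization, assuming an H×W×3 NumPy array in R, G, B channel order (the channel order and the epsilon guard are assumptions):

```python
import numpy as np

def normalize_color(image: np.ndarray) -> np.ndarray:
    """Normalize each color channel (R, G, B) independently: for every
    channel, subtract that channel's mean and divide by its standard
    deviation, as described above."""
    image = image.astype(np.float32)
    out = np.empty_like(image)
    for ch in range(image.shape[2]):  # iterate over the R, G, B channels
        mean = image[:, :, ch].mean()
        std = image[:, :, ch].std()
        out[:, :, ch] = (image[:, :, ch] - mean) / (std + 1e-8)
    return out
```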
  • the embodiments of the present application can be applied to different types of images and images of different scales during implementation, thereby improving the applicability of the embodiments of the present application.
  • After the position of the target object in the preprocessed image is obtained, the position of the target object in the originally collected image can be determined according to the position of the target object in the preprocessed image.
  • The following only takes target detection processing on the collected image as an example for description; the process of performing target detection on the preprocessed image is the same, and the description is not repeated here.
  • FIG. 3 is a schematic flowchart of step S20 in a target tracking method according to an embodiment of the application. As shown in FIG. 3, the determining the position of the target object in the image includes:
  • Step S201 Extract image features of the image
  • the image features of the image can be extracted first, for example, the image features can be obtained by convolution processing.
  • The target detection processing can be realized by a neural network, where the neural network can include a feature extraction module and a classification module; the feature extraction module may include at least one convolutional layer, and may also include a pooling layer.
  • the feature extraction module can extract the features of the image.
  • the feature extraction process may also be performed in the structure of the residual network to obtain image features, which is not specifically limited in the embodiment of the present application.
  • Step S202 Perform classification processing on the image features to obtain the location area of the target object in the image.
  • classification processing can be performed on image features.
  • the classification module performing the classification processing can include a fully connected layer, and the detection result of the target object in the image, that is, the location area of the target object, is obtained through the fully connected layer.
  • The location area of the target object in the embodiments of the present application can be expressed in the form of coordinates, such as the location coordinates of two diagonal vertices of the detection frame corresponding to the location area of the detected target object, or the location coordinates of one vertex together with the height or width of the detection frame.
  • the result of the classification process in the embodiment of the present application may include whether there is an object of the target type in the image, that is, the target object, and the location area of the target object.
  • The first identifier and the second identifier can be used to indicate whether an object of the target type exists, and the location area where the target object is located can be indicated in the form of coordinates. For example, the first identifier can be 1, indicating that a target object exists; conversely, the second identifier can be 0, indicating that no target object exists; and (x1, x2, y1, y2) are the horizontal and vertical coordinate values corresponding to two vertices of the detection frame.
  • Step S203 Determine the center position of the position area as the position of the target object.
  • the center position of the detected position area of the target object may be determined as the position of the target object.
  • The average value of the coordinate values of the four vertices of the location area where the target object is located can be taken to obtain the coordinates of the center position, and the coordinates of the center position are then determined as the position of the target object.
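A minimal sketch of turning such a detection result into the target position, assuming the result is given as a presence flag plus two diagonal vertices of the detection frame (the function name and argument layout are illustrative):

```python
from typing import Optional, Tuple

def target_position(flag: int, x1: float, y1: float,
                    x2: float, y2: float) -> Optional[Tuple[float, float]]:
    """flag: 1 if a target object was detected (first identifier),
    0 otherwise (second identifier). (x1, y1) and (x2, y2) are two
    diagonal vertices of the detection frame."""
    if flag == 0:
        return None  # no target object: the output position is empty
    # The center of the detection frame is used as the target's position.
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
```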
  • The target object can be a face, and the target detection processing can be face detection processing; that is, the location area where the face is located in the image can be detected, the position of the face can be obtained according to the center of that location area, and target tracking can then be performed for the face.
  • the embodiments of the present application can obtain the position of the target object with high accuracy, and improve the accuracy of target tracking.
  • the above-mentioned preprocessing and target detection process can be performed by the management device of the smart mobile device.
  • the management device may be a Raspberry Pi chip.
  • Raspberry Pi chip has high scalability and high processing speed.
  • the obtained information about the location of the target object, etc. may be transmitted to the control terminal of the smart mobile device to obtain the control instruction.
  • In the embodiment of the present application, the detection result of the target object may be encapsulated and transmitted according to a preset data format.
  • the detection result indicates the position of the target object in the image.
  • The data corresponding to the transmitted detection result can be 80 bytes, and can include a mode flag, detection result information, a cyclic redundancy check (CRC), a retransmission threshold, a control field, and an optional field.
  • The mode flag bit can indicate the current working mode of the Raspberry Pi chip; the detection result information can be the position of the target object; the CRC check bit is used for security verification; the retransmission threshold is used to indicate the maximum number of retransmissions of the data; the control field is used to indicate the desired working mode of the smart mobile device; and the optional field carries information that can be added as needed.
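The exact byte layout of the 80-byte packet is not given in the text, so the following sketch only illustrates the idea: pack the described fields, append a CRC over the body, and pad the optional field out to 80 bytes. All field widths and their order are assumptions.

```python
import struct
import zlib

# Hypothetical layout: mode flag, 4 bounding-box coordinates,
# retransmission threshold, control field (widths are illustrative).
PACKET_FMT = "<B4iBB"

def build_packet(mode: int, bbox: tuple, retrans: int, control: int) -> bytes:
    body = struct.pack(PACKET_FMT, mode, *bbox, retrans, control)
    crc = struct.pack("<I", zlib.crc32(body))  # CRC computed over the body
    return (body + crc).ljust(80, b"\x00")     # pad optional field to 80 bytes

packet = build_packet(mode=1, bbox=(120, 80, 200, 160), retrans=3, control=0)
assert len(packet) == 80
```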
  • FIG. 4 is a schematic flowchart of step S30 in a target tracking method provided by an embodiment of the application. As shown in FIG. 4, step S30 can be implemented through the following steps:
  • Step S301 Determine a target offset based on the distance between the position of the target object in the image and the center position of the image;
  • When tracking the target object in the embodiments of the present application, the position of the target object can be maintained at the center of the image, and tracking of the target object is achieved in this way. Therefore, when the position of the target object is obtained, the distance between the position of the target object and the center position of the image can be detected, and this distance is used as the target offset. The Euclidean distance between the coordinates of the position of the target object and the coordinates of the center position of the image can be used as the target offset.
  • The distance can also be expressed in the form of a vector; for example, it can be expressed as a directed vector between the center position of the image and the position of the target object. That is, the obtained target offset may include the distance between the position of the target object and the center position of the image, and may also include the direction of the center of the image relative to the position of the target object.
  • Step S302 Generate multiple sets of offset sequences based on the target offset, where each offset sequence includes at least one offset value, and the sum of the offset values in each set of offset sequences is the target offset;
  • The embodiment of the present application may generate multiple sets of offset sequences according to the obtained target offset; each offset sequence includes at least one offset value, and the sum of the at least one offset value is the target offset. For example, if the position of the target object is (100, 0) and the position of the image center is (50, 0), the target offset is 50 on the x-axis.
  • multiple offset sequences can be generated.
  • For example, the offset values of the first offset sequence may be 10, 20, and 20, and the offset values of the second offset sequence may be 10, 25, and 15, where the direction of each offset value can be the positive direction of the x-axis.
  • multiple sets of offset sequences corresponding to the target offset can be obtained.
  • the number of offset values in the generated multiple sets of offset sequences may be set, for example, it may be 3, but it is not a specific limitation in the embodiment of the present application.
  • The method of generating the multiple sets of offset sequences may be random generation. In practice, there may be many combinations of offset values that can achieve the target offset; the embodiment of the present application may randomly select a preset number of combinations from these combinations, that is, a preset number of offset sequences.
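A small sketch of such random generation, assuming non-negative integer offset values that must sum to the target offset (the sequence length and count are illustrative parameters):

```python
import random

def random_offset_sequences(target: int, n_values: int = 3,
                            n_sequences: int = 5) -> list:
    """Randomly generate offset sequences whose values sum to the target."""
    sequences = []
    for _ in range(n_sequences):
        # Draw n_values - 1 cut points in [0, target]; the gaps between
        # consecutive cut points form one offset sequence.
        cuts = sorted(random.randint(0, target) for _ in range(n_values - 1))
        seq = [b - a for a, b in zip([0] + cuts, cuts + [target])]
        sequences.append(seq)
    return sequences

# e.g. a target offset of 50 might yield [10, 20, 20] or [10, 25, 15]
print(random_offset_sequences(50))
```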
  • Step S303 Using a reinforcement learning algorithm, select an offset sequence that meets the requirements from the multiple sets of offset sequences, and obtain a control instruction corresponding to the offset sequence that meets the requirements.
  • a reinforcement learning algorithm when the generated offset sequence is obtained, a reinforcement learning algorithm may be used to select an offset sequence that meets the requirements.
  • the reinforcement learning algorithm can be used to obtain the total value corresponding to the offset sequence, and the offset sequence with the highest total value is determined as the offset sequence that meets the requirements.
  • Fig. 5 is a schematic flowchart of step S303 in a target tracking method provided by an embodiment of the application. As shown in Fig. 5, step S303 ("using a reinforcement learning algorithm, select an offset sequence that meets the requirements from the multiple sets of offset sequences, and obtain the control instruction corresponding to the offset sequence that meets the requirements") may include:
  • Step S3031 For each offset value in the multiple sets of offset sequences, determine the maximum value corresponding to the offset value in the value table, and the value table includes the value corresponding to the offset value under different rotation commands;
  • The reinforcement learning algorithm may be a value learning algorithm (Q-learning algorithm), and the corresponding value table (Q-table) may indicate the value (quality) corresponding to different offset values under different rotation instructions.
  • Rotation instructions refer to instructions that control the rotation of smart mobile devices, which can include parameters such as motor rotation angle, motor speed, and motor rotation time.
  • The value table in the embodiment of the present application may be a value table obtained in advance through reinforcement learning, where the parameters of the value table can accurately distinguish and reflect the values corresponding to different rotation instructions under different offset values.
  • Table 1 shows at least a part of the parameters of the rotation command
  • Table 2 shows a schematic table of the value table.
  • In Table 2, the horizontal parameters a1, a2, and a3 are different rotation instructions, and the vertical parameters s1, s2, and s3 are different offset values.
  • the parameter in the table indicates the value of the corresponding offset value and the corresponding rotation command.
  • The value can represent the worth of the corresponding rotation instruction under the corresponding offset value; generally, a larger number indicates a higher value, meaning that performing target tracking through that instruction is more valuable.
  • Each offset sequence may include multiple offset values, and the embodiment of the present application may determine the maximum value corresponding to each offset value in each sequence based on the value table. For example, for the offset value s1 the maximum value is 3, for the offset value s2 the maximum value is 2, and for the offset value s3 the maximum value is 4.
  • the foregoing is only an exemplary description, and the obtained value may be different for different value tables, which is not specifically limited in the embodiment of the present application.
  • Step S3032 Obtain the reward value corresponding to the offset value, and determine the final value of the offset value based on the reward value corresponding to the offset value and the maximum value, wherein the reward value is the distance between the position of the target object and the center position of the image when the rotation instruction corresponding to the offset value has not yet been executed;
  • The reward value of each offset value in the offset sequence can be obtained, where the reward value is related to the position of the target object before the rotation corresponding to that offset value is executed.
  • For the first offset value, the position of the target object is the initially detected position of the target object in the image; for each subsequent offset value, the position of the target object is the assumed position after the rotation instructions corresponding to the maximum values of the preceding offset values have been executed.
  • For example, assuming that the position of the target object in the detected image is (100, 0) and the obtained offset sequence that satisfies the condition is 20, 15, and 15: the reward value of the first offset value is determined based on the position (100, 0) of the target object; after the first offset is executed, the position of the target object can be determined to be (120, 0), and the reward value of the second offset value is determined based on this position; when the third offset value is executed, the position of the target object is determined to be (135, 0), and the reward value of the third offset value is determined based on this position.
  • The expression for obtaining the reward value can be as shown in formula (1-1):

    R(s, a) = sqrt((s(x) - b)^2 + (s(y) - c)^2)    (1-1)

  • where R(s, a) is the reward value of the rotation instruction a of the maximum value corresponding to the offset value s, that is, the reward value corresponding to the offset value s; s(x) and s(y) are respectively the abscissa and ordinate of the position of the target object when the rotation instruction a corresponding to the maximum value of the offset value has not yet been executed; and b and c represent the abscissa and ordinate of the center position of the image, respectively.
  • The final value of the offset value can be determined according to the reward value corresponding to the offset value and the maximum value corresponding to the offset value.
  • The weighted sum of the reward value and the maximum value can be used to determine the final value.
  • The expression for determining the final value of the offset value in the embodiment of the present application may be as shown in formula (1-2):

    Q'(s, a) = R(s, a) + γ·maxQ(s, a)    (1-2)

  • where Q'(s, a) is the final value corresponding to the offset value s, R(s, a) is the reward value of the rotation instruction a of the maximum value corresponding to the offset value s, maxQ(s, a) is the maximum value corresponding to the offset value s in the value table, and γ is the weighting coefficient.
  • Step S3033 Determine the offset sequence with the largest sum of the final value as the offset sequence that meets the requirements.
  • the final value of each offset value in the offset sequence may be summed to obtain the total value corresponding to the offset sequence. Then select the offset sequence with the largest total value as the offset sequence that meets the requirements.
  • the offset sequence with the largest total value can be obtained, and the maximum total value indicates that the rotation instruction corresponding to the rotation path corresponding to the offset sequence is the optimal choice.
  • The control instruction can be generated by combining the rotation instructions corresponding, in the value table, to the maximum value of each offset value in the selected offset sequence.
  • the control instruction can then be transmitted to the smart mobile device, so that the smart mobile device performs a rotation operation according to the control instruction.
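A sketch of this selection procedure, combining formulas (1-1) and (1-2): for each offset value, look up the best rotation instruction in the value table, compute the reward as the distance from the image center before that rotation, accumulate the final values, and keep the sequence with the largest total. The Q-table layout, the value of the weighting coefficient γ, and the assumption that offsets act along the horizontal axis are all illustrative.

```python
import math

GAMMA = 0.8  # weighting coefficient γ of formula (1-2); value is an assumption

def select_offset_sequence(sequences, q_table, start_pos, center):
    """q_table maps an offset value to {rotation_instruction: value},
    mirroring the value table (Table 2). Returns the best sequence and
    the rotation instructions forming the control instruction."""
    best = None
    for seq in sequences:
        pos = list(start_pos)
        total, instructions = 0.0, []
        for offset in seq:
            # Rotation instruction with the maximum value for this offset.
            action, max_q = max(q_table[offset].items(), key=lambda kv: kv[1])
            # Reward (formula 1-1): distance of the target object from the
            # image center before the rotation instruction is executed.
            reward = math.hypot(pos[0] - center[0], pos[1] - center[1])
            total += reward + GAMMA * max_q  # final value, formula (1-2)
            instructions.append(action)
            pos[0] += offset                 # assumed horizontal offset
        if best is None or total > best[0]:
            best = (total, seq, instructions)
    return best  # (total value, chosen sequence, rotation instructions)
```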
  • the smart mobile device can be controlled to move according to the generated control instruction.
  • the control command may include parameters such as the rotation angle and direction of the motor, or may also include control commands such as the motor speed, the motor rotation time, whether to stop or not.
  • the embodiment of the present application may control the movement of the mobile device by means of differential steering.
  • The smart mobile device may be a smart mobile vehicle, which may include left and right drive wheels. The embodiment of the present application may control the rotation speeds of the left and right drive wheels based on the control instructions to realize steering and movement: when the drive wheels rotate at different speeds, the body will turn even if there is no steering wheel or the steering wheel does not move.
  • the difference in the rotational speed of the two driving wheels can be realized by operating two separate clutches or braking devices installed on the left and right half shafts.
  • The intelligent mobile device can realize different rotation trajectories according to the different rotation speeds and rotation angles of the left and right driving wheels. Under different rotation trajectories, the images collected by the vehicle differ; through continuous optimization, the position of the intelligent mobile vehicle is adjusted to ensure that the target object is at the center of the image, achieving tracking of the target object.
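A minimal sketch of the differential-steering idea, assuming a standard differential-drive model (the wheel-base value and function signature are illustrative, not taken from the original):

```python
def wheel_speeds(linear: float, angular: float, wheel_base: float = 0.12):
    """Convert a desired forward speed and turning rate into left/right
    wheel speeds. A speed difference between the two drive wheels turns
    the body without any steering wheel."""
    left = linear - angular * wheel_base / 2.0
    right = linear + angular * wheel_base / 2.0
    return left, right

# Pure rotation in place: equal and opposite wheel speeds.
print(wheel_speeds(0.0, 1.0))  # -> (-0.06, 0.06)
```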
  • FIG. 6 is a schematic diagram of another process of a target tracking method provided by an embodiment of the application. As shown in FIG. 6, the target tracking method further includes:
  • Step S41 Determine a control instruction for controlling the movement of the smart mobile device based on the location area of the target object, wherein it can be determined whether the area of the location area of the target object is within the range between the second threshold and the first threshold.
  • the location area of the target object in the collected image can be obtained, and the embodiment of the present application can control the moving direction of the smart mobile device according to the area of the location area.
  • the area of the location area can be determined according to the obtained location area of the target object, and the area can be compared with the first threshold and the second threshold.
  • the first threshold and the second threshold may be preset reference thresholds, the first threshold is greater than the second threshold, and the embodiment of the present application does not limit specific values.
  • Step S42 In the case that the area corresponding to the location area of the target object is greater than the first threshold, generate a control instruction for controlling the smart mobile device to move backward;
  • In this case, a control instruction for controlling the smart mobile device to move backward can be generated until the area of the detected location area of the target object is less than the first threshold and greater than the second threshold.
  • Step S43 In the case that the area corresponding to the location area of the target object is smaller than the second threshold, generate a control instruction for controlling the smart mobile device to move forward, where the first threshold is greater than the second threshold.
  • When the area of the detected location area of the target object is smaller than the second threshold, it means that the target object is far from the smart mobile device, and the smart mobile device can be moved forward at this time.
  • a control instruction for controlling the advancement of the smart mobile device can be generated until the area of the detected location area of the target object is less than the first threshold and greater than the second threshold.
  • the smart mobile device can perform a forward or backward operation according to the received forward or backward control instruction.
  • The movement of the smart mobile device can thus be controlled according to the size of the target object: by keeping the area corresponding to the location area of the detected target object (such as a human face) between the second threshold and the first threshold, control of the moving direction of the smart mobile device is realized.
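A minimal sketch of this area-based movement control (the command names are illustrative):

```python
def movement_command(area: float, first_threshold: float,
                     second_threshold: float) -> str:
    """Keep the target's apparent size between the two thresholds:
    back up when it looks too large (too close), advance when it looks
    too small (too far). first_threshold > second_threshold."""
    if area > first_threshold:
        return "backward"
    if area < second_threshold:
        return "forward"
    return "stop"  # area within range: hold position
```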
  • the application body of the target tracking method in the embodiment of the present application may be a smart mobile device, or may also be a device installed in the smart mobile device, and the device is used to control the movement of the smart mobile device.
  • In the following, the intelligent mobile device to which the target tracking method of the embodiment of the present application is applied is an educational robot, the management device of the educational robot is a Raspberry Pi, and the target object is a human face; this example is described to clearly illustrate the embodiments of the present application.
  • FIG. 7 is an application example diagram of a target tracking method provided by an embodiment of the application, in which the camera 701 is connected to the Raspberry Pi 702 to transmit the image or video collected by the camera 701 to the Raspberry Pi 702. The camera 701 can be connected to the Raspberry Pi 702 through a Universal Serial Bus (USB) port for data transmission, but the embodiment of the application is not limited to this connection method. The following process can then be performed.
  • the application field of the embodiment of the present application may be an intelligent robot in an educational background, and the intelligent robot may realize the functions of face detection and tracking.
  • The Raspberry Pi 702 can perform image processing; specifically, in the embodiment of the present application, the Raspberry Pi 702 can perform image preprocessing and target detection processing.
  • The Raspberry Pi can be integrated with a target detection network. Since the types of images collected by the camera 701 vary, the Raspberry Pi 702 needs to perform the necessary preprocessing on the image data before transmitting the images to the target detection network model.
  • Fig. 8 is a schematic flow chart of the preprocessing process provided by an embodiment of the application, as shown in Fig. 8, including:
  • Step S51 Receive the collected video data.
  • Step S52 Framing the video data into picture data.
  • Step S53 unify the picture size.
  • Step S54 Convert the picture into a grayscale image.
  • Step S55 Normalize the picture.
  • Image framing refers to decomposing the collected video data into individual frames; the image size is then unified to 640*480. Since color images consume considerable resources during processing but have little impact on the detection effect, the embodiment of the present application ignores color features, directly converts the image to a grayscale image, and sends it to the target detection network for detection. Finally, for convenience of image processing, the image is normalized: the average value of each dimension is subtracted from the original data of that dimension, the result replaces the original data, and the data of each dimension is then divided by the standard deviation of that dimension, so that the image data is normalized to the same scale.
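A sketch of the whole preprocessing pipeline described above, assuming OpenCV is available for video framing, resizing, and grayscale conversion (the function name and epsilon guard are illustrative):

```python
import cv2
import numpy as np

def preprocess_video(path: str):
    """Split video into frames, resize to 640*480, convert to grayscale,
    and normalize (subtract the mean, divide by the standard deviation)."""
    frames = []
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()  # framing: one image per video frame
        if not ok:
            break
        frame = cv2.resize(frame, (640, 480))            # unify picture size
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # drop color features
        gray = gray.astype(np.float32)
        gray = (gray - gray.mean()) / (gray.std() + 1e-8)  # normalize
        frames.append(gray)
    cap.release()
    return frames
```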
  • The input of this stage is the picture collected by the camera 701, and the output is the face detection coordinate position.
  • The target detection network in the Raspberry Pi 702 can perform face recognition and detection in the image; that is, the embodiment of the application can use deep learning technology to realize face detection, where the deep-learning-based face detection technology is divided into two stages: model training and model application.
  • FIG. 9 is a schematic diagram of the training process of the target detection network provided in an embodiment of the application. As shown in FIG. 9, the training process includes:
  • Step S61: Collect face data set pictures.
  • the face data set pictures include face pictures of various ages and regions, and the face pictures are manually labeled to obtain the coordinate positions of the faces. A face data set is constructed and divided into three parts: a training set, a test set, and a validation set.
  • Step S62: Construct a neural network model.
  • step S62 can be implemented through the following steps:
  • Step S621: Feature extraction is achieved by stacking convolutional layers and pooling layers.
  • Step S622: Use a classifier to classify the extracted features.
  • classification can be achieved through a fully connected layer (classifier).
  • Step S63: Train the neural network model.
  • Model training is achieved through a series of gradient optimization algorithms. After many training iterations, a trained model is obtained for model testing.
  • Step S64: Obtain the trained neural network model.
  • the training process of the model is the training process of the target detection network (neural network model).
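As a concrete illustration of steps S621-S622, the following PyTorch sketch stacks convolution and pooling layers for feature extraction and ends with a fully connected layer. The framework choice, layer sizes, and the 4-value bounding-box output are assumptions; the disclosure only specifies the conv/pool stack plus a classifier.

```python
# A minimal sketch of the network structure in steps S621-S622; the
# layer sizes and output parameterization are illustrative assumptions.
import torch
import torch.nn as nn

class FaceDetector(nn.Module):
    def __init__(self):
        super().__init__()
        # Step S621: feature extraction by stacking conv and pooling layers.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Step S622: a fully connected layer maps the extracted features
        # to a face bounding-box coordinate position.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 120 * 160, 4),  # (x1, y1, x2, y2) for a 480x640 input
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))
```

In the application stage (steps S71-S73), the preprocessed picture would be passed through this model's forward computation to obtain the face coordinate position; training would proceed with a standard gradient optimizer over the labeled face data set.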
  • FIG. 10 is a schematic diagram of the application process of the target detection network provided by an embodiment of the application. As shown in FIG. 10, the application process includes:
  • Step S71: Collect a face picture.
  • Step S72: Send the preprocessed picture to the trained model.
  • Step S73: Obtain the coordinate position of the face.
  • the pre-processed picture is sent to the trained model, and the coordinate position of the face in the picture can be output after forward calculation.
  • the face coordinate position detection can be completed by the Raspberry Pi 702, after which the face coordinate position is encapsulated into a data packet according to a defined communication protocol specification. After the data encapsulation is completed, the packet is sent through the serial port to the processor or controller in the smart mobile device 703, where the smart mobile device 703 may be an educational robot EV3; the smart mobile device 703 then completes subsequent face tracking according to the received face position.
  • EV3 performs path planning according to the coordinates of the face position.
  • the educational robot EV3 receives and parses the data packet sent from the Raspberry Pi 702 side to obtain the coordinate position of the face, and then completes the path planning.
  • reinforcement learning algorithms can be used to realize path planning.
  • Reinforcement learning mainly involves three elements: state, reward, and action.
  • the state is the face coordinate position detected each time;
  • the reward can be defined as the Euclidean distance between the center of the face and the center of the picture;
  • the action is the motor motion instruction executed each time.
  • the motor motion can be controlled as shown in Table 1.
  • on this basis, path planning can be performed. The Q function takes a state and an action as input and returns the value of performing the given action in the given state (see the sketch after the steps below).
  • FIG. 11 is a schematic flowchart of a path planning algorithm based on reinforcement learning provided by an embodiment of the application, as shown in FIG. 11, including:
  • Step S81: Initialize the Q value table.
  • Step S82: Select a specific motor execution command from the action set.
  • Step S83: Execute the specific motor execution instruction.
  • Step S84: Calculate the Q value for this state.
  • Step S85: Update the Q value table.
  • the action set of the educational robot EV3 is shown in Table 1.
  • the state set uses the face coordinates to determine the tracking effect; that is, the distance between the face position and the center of the picture is used as the reward function, and the Q value table is updated by measuring the reward function of different actions.
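The loop in steps S81-S85 corresponds to standard tabular Q-learning. The sketch below is one way to realize it; the state discretization, action count, learning rate, discount factor, and the sign convention of the reward (the negative face-to-center distance, so that smaller distances score higher) are assumptions for illustration.

```python
# A minimal tabular Q-learning sketch of the update loop in Fig. 11;
# sizes, hyperparameters, and the reward sign are assumptions.
import numpy as np

N_STATES, N_ACTIONS = 100, 4               # discretized face positions x motor commands
q_table = np.zeros((N_STATES, N_ACTIONS))  # Step S81: initialize the Q value table
alpha, gamma = 0.1, 0.9                    # learning rate and discount factor

def choose_action(state: int, epsilon: float = 0.1) -> int:
    """Step S82: select a motor command from the action set."""
    if np.random.rand() < epsilon:
        return np.random.randint(N_ACTIONS)  # explore
    return int(q_table[state].argmax())      # exploit the current Q table

def q_update(state: int, action: int, reward: float, next_state: int) -> None:
    """Steps S84-S85: compute and update the Q value for this state."""
    best_next = q_table[next_state].max()    # best value over the action set
    q_table[state, action] += alpha * (reward + gamma * best_next
                                       - q_table[state, action])
```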
  • the smart mobile device 703 implements face tracking according to the motion instructions (same as the control instructions in the above embodiments).
  • Smart mobile devices such as educational robots use a differential steering mechanism: the vehicle steers by controlling the speeds of the left and right driving wheels 704 and 705.
  • When the driving wheels rotate at different speeds, the body turns even if there is no steering wheel or the steering wheel is not moved.
  • the difference in driving-wheel speed can be realized by operating two separate clutches or braking devices mounted on the left and right axles.
  • the smart mobile device 703 can realize different rotation trajectories according to different rotation speeds and rotation angles of the left and right wheels. Under different rotation trajectories, the pictures collected by the vehicle differ; the action is then continuously optimized and the position of the vehicle adjusted, finally ensuring that the face position is at the center of the picture, realizing the face tracking function.
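A differential-steering rule of this kind can be sketched as a simple proportional controller on the horizontal face offset; the base speed and gain below are assumed values, not taken from the disclosure.

```python
# Illustrative differential-steering rule: left/right driving-wheel speeds
# derived from the horizontal face offset; parameters are assumptions.
def wheel_speeds(face_x: float, image_width: int = 640,
                 base_speed: float = 0.5, gain: float = 0.002):
    """Return (left, right) wheel speeds; a speed difference turns the body."""
    offset = face_x - image_width / 2   # signed pixel offset from center
    turn = gain * offset                # proportional steering term
    # Rotating the wheels at different speeds turns the body toward the face.
    return base_speed + turn, base_speed - turn
```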
  • the smart mobile device in the embodiment of the present application may also be provided with a sensor 706, such as a distance sensor or a touch sensor, for sensing information about the surrounding environment of the smart mobile device 703; the working mode and movement parameters of the smart mobile device 703 can be controlled based on the sensed information about the surrounding environment.
  • the target tracking method can obtain the position of the target object in the collected image and obtain a control instruction for the smart mobile device according to the distance between the position of the target object and the image center.
  • the control instruction is used to adjust the rotation angle of the smart mobile device.
  • the obtained control instruction includes a rotation instruction corresponding to at least one offset value, wherein the offset sequence formed by the offset values is determined from the distance between the target object and the image center;
  • the obtained control instruction enables the target object, after the rotation, to be at the center of the collected image, so that the target object is within the tracking range of the smart mobile device.
  • the embodiments of the present application can perform target tracking in real time according to the position of the target object, are more convenient and accurate, and improve the performance of the smart mobile device.
  • the embodiments of the present application can use deep learning technology to complete face detection (using neural networks to achieve target detection), which has significantly improved accuracy and speed compared to traditional target detection methods.
  • a reinforcement learning algorithm may also be used to perform path planning through Q-learning technology, and the optimal rotation path may be selected.
  • the embodiments of the present application can also be adapted to the requirements of different scenarios and have good scalability.
  • the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible inner logic.
  • the embodiments of the present application also provide target tracking devices, smart mobile devices, computer-readable storage media, and programs, all of which can be used to implement any target tracking method provided in the embodiments of the present application.
  • FIG. 12 is a schematic structural diagram of a target tracking device provided by an embodiment of the application. As shown in FIG. 12, the target tracking device includes:
  • the image acquisition module 10 is configured to acquire images
  • the target detection module 20 is configured to determine the position of the target object in the image
  • the control module 30 is configured to determine a control instruction for controlling the rotation of the smart mobile device based on the distance between the position of the target object and the center position of the image, wherein the control instruction is used to make the position of the target object be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to an offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
  • the device further includes a preprocessing module configured to perform a preprocessing operation on the image, the preprocessing operation including: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image;
  • the target detection module is further configured to perform target detection processing on the image obtained after the preprocessing operation to obtain the position of the target object in the image after the preprocessing operation;
  • the step of the preprocessing module performing the normalization processing on the grayscale image includes: determining the mean and standard deviation of the pixel values in the grayscale image; obtaining the difference between the pixel value of each pixel and the mean; and determining the ratio between the difference corresponding to each pixel and the standard deviation as the normalized pixel value of that pixel.
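Written out, this normalization is a standard per-image standardization (a restatement of the steps above, not additional disclosure), where p_i is the pixel value of the i-th of N pixels:

```latex
p_i' = \frac{p_i - \mu}{\sigma}, \qquad
\mu = \frac{1}{N}\sum_{i=1}^{N} p_i, \qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(p_i - \mu\right)^2}
```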
  • the target detection module is further configured to extract image features of the image, perform classification processing on the image features to obtain the location area of the target object in the image, and determine the center position of the location area as the position of the target object.
  • the target object includes a human face
  • the target detection module is further configured to determine the position of the human face in the image.
  • the control module is further configured to determine a target offset based on the distance between the position of the target object in the image and the center position of the image, generate multiple sets of offset sequences based on the target offset such that the offset values in each set sum to the target offset, and use a reinforcement learning algorithm to select an offset sequence that meets the requirements;
  • the control module is further configured to determine, for each offset value in the multiple sets of offset sequences, the maximum value corresponding to the offset value in a value table, the value table including the values corresponding to offset values under different rotation instructions;
  • the reward value corresponding to the offset value is obtained, and the final value of the offset value is determined based on the reward value and the maximum value corresponding to the offset value, the reward value being the distance between the position of the target object and the image center in the case where the rotation instruction corresponding to the maximum value of the offset value has not been executed;
  • the offset sequence with the largest sum of the final values of its offset values among the multiple sets of offset sequences is determined as the offset sequence that meets the requirements.
  • the control module is further configured to determine the control instruction based on the rotation instruction corresponding to the maximum value of each offset value in the offset sequence that meets the requirements.
  • the target detection module is further configured to determine a control instruction for controlling the movement of the smart mobile device based on the location area of the target object, wherein: when the area corresponding to the location area of the target object is greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and when the area corresponding to the location area of the target object is smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, the first threshold being greater than the second threshold.
  • an embodiment of the present application also provides a smart mobile device that includes the target tracking device described in the above embodiment, where the target detection network in the target tracking device is integrated in the management device of the smart mobile device, and the management device performs target detection processing on the image collected by the image acquisition module to obtain the position of the target object;
  • the control module is connected to the management device and is used to generate the control instruction according to the position of the target object obtained by the management device and to control the rotation of the smart mobile device according to the control instruction.
  • the management device is a Raspberry Pi.
  • the smart mobile device includes an educational robot.
  • the management device is also integrated with the preprocessing module of the target tracking device, configured to perform preprocessing operations on the images and to perform target detection processing on the preprocessed images to obtain the position of the target object in the image.
  • the functions or modules included in the apparatus provided in the embodiments of the present application can be configured to execute the methods described in the above method embodiments, and for specific implementation, refer to the description of the above method embodiments.
  • the embodiment of the present application also proposes a computer-readable storage medium on which computer program instructions are stored, and the computer program instructions implement the foregoing method when executed by a processor.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium.
  • An embodiment of the present application also proposes an intelligent mobile device, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the above method.
  • FIG. 13 is a schematic structural diagram of a smart mobile device provided by an embodiment of this application.
  • the smart mobile device 800 may be any device capable of performing image processing or a mobile device capable of performing target tracking.
  • the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
  • the processing component 802 generally controls the overall operations of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • the processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the foregoing method.
  • the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components.
  • the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.
  • the memory 804 is configured to store various types of data to support the operation of the device 800. Examples of these data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos, etc.
  • the memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
  • the power supply component 806 provides power for various components of the device 800.
  • the power supply component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
  • the multimedia component 808 includes a screen that provides an output interface between the device 800 and the user.
  • the screen may include a liquid crystal display (Liquid Crystal Display, LCD) and a touch panel (Touch Pad, TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation.
  • the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 810 is configured to output and/or input audio signals.
  • the audio component 810 includes a microphone (MIC).
  • the microphone When the device 800 is in an operating mode, such as a call mode, a recording mode, and a voice recognition mode, the microphone is configured to receive external audio signals.
  • the received audio signal may be further stored in the memory 804 or transmitted via the communication component 816.
  • the audio component 810 further includes a speaker for outputting audio signals.
  • the I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module.
  • the peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include but are not limited to: home button, volume button, start button, and lock button.
  • the sensor component 814 includes one or more sensors for providing the device 800 with various aspects of status assessment.
  • the sensor component 814 can detect the on/off status of the device 800 and the relative positioning of components, such as the display and keypad of the device 800; the sensor component 814 can also detect a position change of the device 800 or a component of the device 800, the presence or absence of contact between the user and the device 800, the orientation or acceleration/deceleration of the device 800, and temperature changes of the device 800.
  • the sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact.
  • the sensor component 814 may also include a light sensor, such as a Complementary Metal Oxide Semiconductor (CMOS) or Charge Coupled Device (CCD) image sensor, for use in imaging applications.
  • the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
  • the communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices.
  • the device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • the device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to implement the above methods.
  • a non-volatile computer-readable storage medium is also provided, such as the memory 804 including computer program instructions, which can be executed by the processor 820 of the device 800 to implement the foregoing methods.
  • the embodiments of the application may be systems, methods and/or computer program products.
  • the computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the embodiments of the present application.
  • the computer-readable storage medium may be a tangible device that can hold and store instructions used by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanical encoding devices such as punch cards or raised structures in grooves on which instructions are stored, and any suitable combination of the above.
  • the computer-readable storage medium used here is not to be interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
  • the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device.
  • the computer program instructions used to perform the operations of the embodiments of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or wide area network (WAN), or can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • in some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), is customized by using the state information of the computer-readable program instructions, and the electronic circuit can execute computer-readable program instructions to implement various aspects of the embodiments of the present application.
  • These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, thereby producing a machine, such that when these instructions are executed by the processor of the computer or other programmable data processing device, a device that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams is produced. These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions make computers, programmable data processing apparatuses, and/or other devices work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture that includes instructions for implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in the flowcharts or block diagrams may represent a module, program segment, or part of an instruction that contains one or more executable instructions for implementing the specified logical function.
  • the functions marked in the blocks may also occur in an order different from that marked in the drawings; for example, two consecutive blocks can actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
  • the embodiment of the application discloses a target tracking method and device, a smart mobile device, and a storage medium.
  • the method includes: acquiring a captured image; determining the position of a target object in the image; and obtaining, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling a smart mobile device, wherein the control instruction is used to make the position of the target object be located at the center position of the image, the control instruction includes a rotation instruction corresponding to an offset value in an offset sequence constituting the distance, and the offset sequence includes at least one offset value.
  • the embodiments of the present application can realize real-time tracking of target objects.

Abstract

A target tracking method and apparatus, an intelligent mobile device and a storage medium. The method comprises: obtaining an acquired image (S10); determining the position of a target object in the image (S20); and obtaining, on the basis of the distance between the position of the target object and the central position of the image, a control instruction for controlling an intelligent mobile device to rotate (S30), wherein the control instruction is used for positioning the target object at the center of the image, the control instruction comprises a rotation instruction corresponding to an offset value in an offset sequence constituting the distance, and the offset sequence comprises at least one offset value.

Description

Target tracking method and apparatus, intelligent mobile device and storage medium
Cross-reference to related applications
This application is filed based on the Chinese patent application with application number 201910646696.8, filed on July 17, 2019, and claims priority to that Chinese patent application, the entire content of which is hereby incorporated into this application by reference.
Technical field
The embodiments of the present application relate to the field of computer vision technology, and relate to, but are not limited to, a target tracking method and apparatus, an intelligent mobile device, and a storage medium.
Background
At present, smart mobile devices such as remote-control cars and mobile robots are applied in various fields. For example, in the education industry, a remote-control car can be used as a teaching tool to realize target tracking.
Summary of the invention
The embodiments of the present application propose a target tracking method and apparatus, an intelligent mobile device, and a storage medium.
An embodiment of the present application provides a target tracking method, including: acquiring a captured image; determining the position of a target object in the image; and determining, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, wherein the control instruction is used to make the target object be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to an offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
In some embodiments of the present application, before determining the position of the target object in the image, the method further includes performing a preprocessing operation on the image, the preprocessing operation including: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image; wherein determining the position of the target object in the image includes: performing target detection processing on the image obtained after the preprocessing operation to obtain the position of the target object in the preprocessed image; and determining the position of the target object in the image based on the position of the target object in the preprocessed image.
In some embodiments of the present application, performing the normalization processing on the grayscale image includes: determining the mean and standard deviation of the pixel values of the pixels in the grayscale image; obtaining the difference between the pixel value of each pixel and the mean; and determining the ratio between the difference corresponding to each pixel and the standard deviation as the normalized pixel value of that pixel.
In some embodiments of the present application, determining the position of the target object in the image includes: extracting image features of the image; performing classification processing on the image features to obtain the location area of the target object in the image; and determining the center position of the location area as the position of the target object.
In some embodiments of the present application, the target object includes a human face; correspondingly, determining the position of the target object in the image includes: determining the position of the human face in the image.
In some embodiments of the present application, determining the control instruction for controlling the rotation of the smart mobile device based on the distance between the position of the target object and the center position of the image includes: determining a target offset based on the distance between the position of the target object in the image and the center position of the image; generating multiple sets of offset sequences based on the target offset, the sum of the offset values in each set of offset sequences being the target offset; and using a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences, and determining the control instruction corresponding to the offset sequence that meets the requirements.
In some embodiments of the present application, using a reinforcement learning algorithm to select the offset sequence that meets the requirements from the multiple sets of offset sequences includes: for each offset value in the multiple sets of offset sequences, determining the maximum value corresponding to the offset value in a value table, the value table including the values corresponding to offset values under different rotation instructions; obtaining the reward value corresponding to the offset value, and determining the final value of the offset value based on the reward value and the maximum value corresponding to the offset value, the reward value being the distance between the position of the target object and the image center position in the case where the rotation instruction corresponding to the maximum value of the offset value has not been executed; and determining the offset sequence with the largest sum of the final values of its offset values among the multiple sets of offset sequences as the offset sequence that meets the requirements.
In some embodiments of the present application, determining the control instruction corresponding to the offset sequence that meets the requirements includes: determining the control instruction based on the rotation instruction corresponding to the maximum value of each offset value in the offset sequence that meets the requirements.
In some embodiments of the present application, the method further includes: driving the smart mobile device to perform the rotation based on the control instruction.
In some embodiments of the present application, the method further includes: determining, based on the location area of the target object, a control instruction for controlling the movement of the smart mobile device, wherein in response to the area corresponding to the location area of the target object being greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and in response to the area corresponding to the location area of the target object being smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, the first threshold being greater than the second threshold.
An embodiment of the present application provides a target tracking apparatus, including: an image acquisition module configured to acquire an image; a target detection module configured to determine the position of a target object in the image; and a control module configured to determine, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, wherein the control instruction is used to make the position of the target object be located at the center position of the image, and the control instruction includes a rotation instruction corresponding to an offset value in an offset sequence constituting the distance, the offset sequence including at least one offset value.
In some embodiments of the present application, the apparatus further includes a preprocessing module configured to perform a preprocessing operation on the image, the preprocessing operation including: adjusting the image to a grayscale image of a preset specification, and performing normalization processing on the grayscale image; the target detection module is further configured to perform target detection processing on the image obtained after the preprocessing operation to obtain the position of the target object in the preprocessed image, and to determine the position of the target object in the image based on the position of the target object in the preprocessed image.
In some embodiments of the present application, the step of the preprocessing module performing the normalization processing on the grayscale image includes: determining the mean and standard deviation of the pixel values of the pixels in the grayscale image; obtaining the difference between the pixel value of each pixel and the mean; and determining the ratio between the difference corresponding to each pixel and the standard deviation as the normalized pixel value of that pixel.
In some embodiments of the present application, the target detection module is further configured to extract image features of the image, perform classification processing on the image features to obtain the location area of the target object in the image, and determine the center position of the location area as the position of the target object.
In some embodiments of the present application, the target object includes a human face; correspondingly, the target detection module is further configured to determine the position of the human face in the image.
In some embodiments of the present application, the control module is further configured to determine a target offset based on the distance between the position of the target object in the image and the center position of the image, generate multiple sets of offset sequences based on the target offset with the sum of the offset values in each set of offset sequences being the target offset, and use a reinforcement learning algorithm to select an offset sequence that meets the requirements from the multiple sets of offset sequences and obtain the control instruction corresponding to the offset sequence that meets the requirements.
In some embodiments of the present application, the control module is further configured to determine, for each offset value in the multiple sets of offset sequences, the maximum value corresponding to the offset value in a value table, the value table including the values corresponding to offset values under different rotation instructions; obtain the reward value corresponding to the offset value and determine the final value of the offset value based on the reward value and the maximum value corresponding to the offset value, the reward value being the distance between the position of the target object and the image center in the case where the rotation instruction corresponding to the maximum value of the offset value has not been executed; and determine the offset sequence with the largest sum of the final values of its offset values among the multiple sets of offset sequences as the offset sequence that meets the requirements.
In some embodiments of the present application, the control module is further configured to determine the control instruction based on the rotation instruction corresponding to the maximum value of each offset value in the offset sequence that meets the requirements.
In some embodiments of the present application, the target detection module is further configured to determine, based on the location area of the target object, a control instruction for controlling the movement of the smart mobile device, wherein when the area corresponding to the location area of the target object is greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and when the area corresponding to the location area of the target object is smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, the first threshold being greater than the second threshold.
An embodiment of the present application provides a smart mobile device, which includes the target tracking apparatus described above, where the target detection module in the target tracking apparatus is integrated in a management device of the smart mobile device; the management device performs target detection processing on the image collected by the image acquisition module to obtain the position of the target object; and the control module is connected to the management device and is configured to generate the control instruction according to the position of the target object obtained by the management device and to control the rotation of the smart mobile device according to the control instruction.
In some embodiments of the present application, the management device is further integrated with the preprocessing module of the target tracking apparatus, for performing a preprocessing operation on the image and performing target detection processing on the preprocessed image to obtain the position of the target object in the image.
In some embodiments of the present application, the smart mobile device includes an educational robot.
An embodiment of the present application provides a smart mobile device, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to call the instructions stored in the memory to execute any one of the target tracking methods described above.
An embodiment of the present application provides a computer-readable storage medium on which computer program instructions are stored; when the computer program instructions are executed by a processor, the target tracking method of any one of the above aspects is implemented.
An embodiment of the present application provides a computer program including computer-readable code; when the computer-readable code runs in a smart mobile device, a processor in the smart mobile device executes instructions for implementing any one of the target tracking methods described above.
The target tracking method and apparatus, smart mobile device, and storage medium provided by the embodiments of the present application can obtain the position of the target object in a collected image, and obtain, according to the distance between the position of the target object and the image center, a control instruction used to control the rotation of the smart mobile device. The obtained control instruction includes a rotation instruction corresponding to at least one offset value, where the offset sequence formed by the offset values is determined from the distance between the target object and the image center. The obtained control instruction enables the target object, after the rotation, to be at the center of the collected image, so that the target object remains within the tracking range of the smart mobile device. The target tracking method and apparatus, smart mobile device, and storage medium provided by the embodiments of the present application can perform target tracking in real time according to the position of the target object, and are more convenient and accurate.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the embodiments of the present application.
Other features and aspects of the embodiments of the present application will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Description of the drawings
The drawings here are incorporated into and constitute a part of the specification. These drawings show embodiments consistent with the present application and are used, together with the specification, to explain the technical solutions of the embodiments of the present application.
FIG. 1 is a schematic flowchart of a target tracking method provided by an embodiment of the application;
FIG. 2 is a schematic flowchart of performing preprocessing on an image provided by an embodiment of the application;
FIG. 3 is a schematic flowchart of step S20 in a target tracking method provided by an embodiment of the application;
FIG. 4 is a schematic flowchart of step S30 in a target tracking method provided by an embodiment of the application;
FIG. 5 is a schematic flowchart of step S303 in a target tracking method provided by an embodiment of the application;
FIG. 6 is another schematic flowchart of a target tracking method provided by an embodiment of the application;
FIG. 7 is an application example diagram of a target tracking method provided by an embodiment of the application;
FIG. 8 is a schematic flowchart of a preprocessing process provided by an embodiment of the application;
FIG. 9 is a schematic diagram of the training process of a target detection network provided by an embodiment of the application;
FIG. 10 is a schematic diagram of the application process of a target detection network provided by an embodiment of the application;
FIG. 11 is a schematic flowchart of a reinforcement-learning-based path planning algorithm provided by an embodiment of the application;
FIG. 12 is a schematic structural diagram of a target tracking apparatus provided by an embodiment of the application;
FIG. 13 is a schematic structural diagram of a smart mobile device provided by an embodiment of the application.
Detailed description
Various exemplary embodiments, features, and aspects of the embodiments of the present application are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings indicate elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise noted.
The word "exemplary" here means "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" need not be construed as superior to or better than other embodiments.
The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of multiple items; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are given in the following detailed description in order to better illustrate the embodiments of the present application. Those skilled in the art should understand that the embodiments of the present application can also be implemented without certain specific details. In some examples, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the embodiments of the present application.
The embodiments of the present application provide a target tracking method, which can be applied to any smart mobile device with an image processing function. For example, the target tracking method can be applied to devices such as mobile robots, remote-control cars, and aircraft; the above is merely illustrative, and any device capable of moving can adopt the target tracking method provided by the embodiments of the present application. In some possible implementations, the target tracking method can be implemented by a processor calling computer-readable instructions stored in a memory.
FIG. 1 is a schematic flowchart of a target tracking method provided by an embodiment of the application. As shown in FIG. 1, the target tracking method includes:
Step S10: acquiring a captured image;
In some embodiments of the present application, the smart mobile device to which the target tracking method of the embodiments of the present application is applied may include an image acquisition device, such as a camera. In the embodiments of the present application, images can be collected directly by the image acquisition device, or video data can be collected by the image acquisition device and subjected to framing or frame-selection processing to obtain corresponding images.
Step S20: determining the position of a target object in the image;
In some embodiments of the present application, once the captured image is obtained, target detection processing can be performed on it, that is, detecting whether a target object exists in the captured image and, if so, determining the position of the target object.
In some embodiments of the present application, the target detection processing can be implemented by a neural network. The target object detected in the embodiments of the present application can be any type of object; for example, the target object can be a human face, or another object to be tracked, which is not specifically limited in the embodiments of the present application. Alternatively, in some embodiments, the target object can be an object of a specific known identity; that is, the embodiments of the present application can track objects of a corresponding type (such as all face images) or track an object of a specific identity (such as a known, specific face image), which can be set as required and is not specifically limited in the embodiments of the present application.
In some embodiments of the present application, the neural network that implements the target detection processing can be a convolutional neural network. After training, the neural network can accurately detect the position of the target object in the image; the form of the neural network is not limited.
In one example, in the process of performing the target detection processing on the image, feature extraction can be performed on the image to obtain image features, and classification processing can then be performed on the image features to obtain the location area of the target object in the image, based on which the position of the target object can be determined. The classification result obtained by the classification processing can include an identifier of whether the target object exists in the image, such as a first identifier or a second identifier, where the first identifier indicates that the pixel corresponding to the current position in the image belongs to the target object, and the second identifier indicates that it does not. The position of the target object in the image can be determined from the region formed by the first identifiers; for example, the center position of that region can be determined as the position of the target object. Through the above, when the image includes the target object, the position of the target object in the image can be obtained directly, for example expressed in the form of coordinates. In the embodiments of the present application, the center position of the location area of the target object in the image can be used as the position of the target object. In addition, when no target object is detected in the image, the output position is empty.
Step S30: Determine, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of the smart mobile device, where the control instruction is used to bring the position of the target object to the center position of the image, and the control instruction includes rotation instructions corresponding to the offset values in an offset sequence constituting the distance, the offset sequence including at least one offset value.
In some embodiments of the present application, once the position of the target object in the image is obtained, the smart mobile device can be controlled to move according to that position, so that the target object comes to lie at the center of the captured image, thereby achieving tracking of the target object. The embodiments of the present application can derive, from the distance between the position of the target object in the image and the center position of the image, a control instruction that controls the rotation of the smart mobile device so that the position of the target object lies at the center of the currently captured image. The control instruction may include rotation instructions corresponding respectively to at least one offset value, where the offset sequence formed by the at least one offset value determines the distance between the position of the target object and the center position of the image; for example, the sum of the offset values equals the distance value. The distance in the embodiments of the present application may be a directed distance (such as a direction vector), and each offset value may also be a direction vector; the direction vector corresponding to the distance is obtained by summing the direction vectors corresponding to the offset values. By executing the rotation instruction corresponding to each offset value, the corresponding offset is realized, and the target object is finally brought to the center of the currently captured image. If the target object remains stationary, then from the moment the image following the current image is captured, the target object will remain at the center of the captured images. If the target object moves, the embodiments of the present application can quickly adjust the rotation of the smart mobile device with respect to the position of the target object in the previous image so that the target object stays at the center of the captured image; even when the target object is moving, it can still be tracked and photographed so that it stays within the frame of the captured image.
In some embodiments of the present application, a reinforcement learning algorithm may be used to plan the rotation path of the smart mobile device and obtain the control instruction that places the target object at the center of the image; this control instruction may correspond to the optimal movement scheme determined by the reinforcement learning algorithm. In one example, the reinforcement learning algorithm may be a value learning algorithm (Q-learning algorithm).
Through the reinforcement learning algorithm, the movement path of the smart mobile device is optimized, yielding the control instruction corresponding to the movement path that is optimal under a comprehensive evaluation of movement time, convenience of the movement path, and energy consumption of the smart mobile device.
Based on the above configuration, the embodiments of the present application can conveniently and accurately achieve real-time tracking of the target object, controlling the rotation of the smart mobile device according to the position of the target object so that the target object lies at the center of the captured image. The control instruction of the smart mobile device can be obtained from the distance between the position of the target object in the image and the center position of the image; this control instruction is used to control the rotation of the smart mobile device and includes rotation instructions corresponding to at least one offset value, where the offset sequence formed by the offset values is determined by the distance between the target object and the image center. The obtained control instruction enables the target object, after the rotation, to lie at the center of the captured image, keeping the target object within the tracking range of the smart mobile device. The embodiments of the present application can perform target tracking in real time according to the position of the target object, and are more convenient and accurate while improving the efficiency of the smart mobile device.
The embodiments of the present application are described in detail below with reference to the drawings.
As described in the foregoing embodiments, the embodiments of the present application may perform target detection processing on an image as soon as it is captured. Since the captured images may differ in specification, type, and other parameters, a preprocessing operation may be performed on the image before the target detection processing to obtain a normalized image.
Before determining the position of the target object in the image, the method further includes performing a preprocessing operation on the image. FIG. 2 is a schematic flowchart of preprocessing an image according to an embodiment of the present application. As shown in FIG. 2, the preprocessing operation includes:
Step S11: Adjust the image into a grayscale image of a preset specification.
In some embodiments of the present application, the captured image may be a color image or an image in another form. The captured image may be converted into an image of a preset specification, which is then converted into a grayscale image; alternatively, the captured image may first be converted into a grayscale image, which is then converted to the preset specification. The preset specification may be 640*480, but this is not a specific limitation of the embodiments of the present application. Converting a color image or an image in another form into a grayscale image can be based on processing of the pixel values; for example, the pixel value of each pixel may be divided by the maximum pixel value, and the corresponding grayscale value obtained from the result. This is only an exemplary description, and the embodiments of the present application do not specifically limit this process.
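As a minimal sketch of step S11, assuming OpenCV is available, the resizing to a preset specification and the grayscale conversion might look as follows; the function name and default size are illustrative, not part of the original disclosure.

    import cv2

    def to_preset_grayscale(image, size=(640, 480)):
        """Resize a captured frame to the preset specification and convert it
        to grayscale. `image` is an H x W x 3 BGR array such as one returned
        by cv2.VideoCapture.read()."""
        resized = cv2.resize(image, size)                 # preset specification, e.g. 640*480
        gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)  # drop color features
        return gray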
Because processing color pictures or images in other forms may consume a large amount of resources, while the form of the picture has little influence on the detection result, the embodiments of the present application convert the image into a grayscale image and send it directly to the network model for detection, which reduces resource consumption and increases processing speed.
Step S12: Perform normalization processing on the grayscale image.
Once the grayscale image is obtained, normalization processing can be performed on it. Through normalization, the pixel values of the image can be brought onto the same scale. The normalization processing may include: determining the average and the standard deviation of the pixel values of the pixels in the grayscale image; determining the difference between the pixel value of each pixel and the average; and taking, for each pixel, the ratio of that difference to the standard deviation as the normalized pixel value of the pixel.
One or more images may be captured in the embodiments of the present application. When there is a single image, a single grayscale image is obtained. The average and the standard deviation are then computed over the pixel values (grayscale values) of the pixels of that grayscale image, and the pixel value of each pixel is updated to the ratio of the difference between that pixel value and the average to the standard deviation.
In addition, when multiple images are captured, multiple grayscale images are obtained correspondingly. The average and the standard deviation of the pixel values can then be determined over the pixels of all these grayscale images. That is, the average and standard deviation in the embodiments of the present application may be computed for one image or across multiple images. Given the average and standard deviation over the pixels of multiple images, the difference between each pixel value of each image and the average is obtained, the ratio of that difference to the standard deviation is computed, and the pixel value is updated with that ratio.
In this way, the pixel values of the pixels in the grayscale image are unified onto the same scale, achieving normalization of the captured images.
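The normalization just described translates directly into code; the sketch below assumes NumPy and images of equal size, and the small epsilon guard is an added safeguard not in the original description.

    import numpy as np

    def normalize(gray_images):
        """Zero-mean, unit-std normalization over one or more grayscale images.
        `gray_images` is a list of equally sized H x W arrays; the mean and
        standard deviation are computed jointly over all of them, matching
        the multi-image case described above."""
        stack = np.stack([img.astype(np.float32) for img in gray_images])
        mean = stack.mean()
        std = stack.std() + 1e-8   # guard against division by zero (an added safeguard)
        return [(img.astype(np.float32) - mean) / std for img in gray_images]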
The foregoing exemplifies the manner in which the embodiments of the present application perform preprocessing; in other embodiments, the preprocessing may also be performed in other manners. For example, it is possible to only convert the image to the preset specification and perform normalization on the image of the preset specification; that is, the embodiments of the present application may also normalize color images. In that case, the average and the standard deviation of the channel values of each pixel of the color image can be obtained, for example the average and standard deviation of the red (R) channel values, of the green (G) channel values, and of the blue (B) channel values of the pixels of the image. Then, from the ratio of the difference between the channel value and the corresponding average to the corresponding standard deviation, the new value of that color channel is obtained. This yields updated channel values for each pixel of each image, and thus the normalized image.
By preprocessing the images, the embodiments of the present application can be applied to images of different types and different scales, improving the applicability of the embodiments of the present application.
After the image is preprocessed, target detection processing may also be performed on the preprocessed image to obtain the position of the target object in the preprocessed image; then, based on the correspondence between pixel positions in the preprocessed image and in the original image, the position of the target object in the original captured image can be obtained from its position in the preprocessed image. The following description takes performing target detection on the captured image as an example; the process of performing target detection on the preprocessed image is the same and is not repeated here.
FIG. 3 is a schematic flowchart of step S20 in a target tracking method according to an embodiment of the present application. As shown in FIG. 3, determining the position of the target object in the image includes:
Step S201: Extract image features of the image.
In some embodiments of the present application, the image features of the image may first be extracted, for example through convolution processing. As described above, the target detection processing may be implemented by a neural network, which may include a feature extraction module and a classification module. The feature extraction module may include at least one convolutional layer and may also include a pooling layer, and is used to extract the features of the image. In other embodiments, the feature extraction may also be performed with a residual network structure to obtain the image features, which is not specifically limited in the embodiments of the present application.
Step S202: Perform classification processing on the image features to obtain the location area of the target object in the image.
In some embodiments of the present application, classification processing may be performed on the image features. For example, the classification module performing the classification processing may include a fully connected layer, through which the detection result of the target object in the image, that is, the location area of the target object, is obtained. The location area of the target object in the embodiments of the present application may be expressed in the form of coordinates, such as the position coordinates of two diagonal corners of the detection frame corresponding to the detected location area, or the position coordinates of one vertex together with the height or width of the detection frame. From these, the location area of the target object is obtained. In other words, the result of the classification processing in the embodiments of the present application may include whether an object of the target type, that is, the target object, is present in the image, as well as the location area of the target object. A first identifier and a second identifier may be used to indicate whether an object of the target type is present, and the location area of the target object may be expressed in the form of coordinates. For example, the first identifier may be 1, indicating that the target object is present; conversely, the second identifier may be 0, indicating that it is not, and (x1, x2, y1, y2) are the horizontal and vertical coordinate values of the two vertices of the detection frame.
Step S203: Determine the center position of the location area as the position of the target object.
In some embodiments of the present application, the center position of the detected location area of the target object may be determined as the position of the target object. The coordinates of the center position can be obtained by averaging the coordinate values of the four vertices of the location area where the target object is located, and the coordinates of the center position are then determined as the position of the target object.
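As a minimal sketch of steps S202 and S203, the snippet below turns an assumed detection output (the presence flag and the (x1, x2, y1, y2) corner coordinates described above) into the target position; the function and variable names are illustrative.

    def target_position(flag, x1, x2, y1, y2):
        """Return the target position as the center of the detection frame,
        or None when the flag indicates no target object (steps S202-S203)."""
        if flag == 0:  # second identifier: no target object, the output is empty
            return None
        # Averaging the four vertex coordinates reduces to the corner midpoints.
        return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)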
In one example, the target object may be a human face and the target detection processing may be face detection processing; that is, the location area of the face in the image is detected, and the position of the face is further obtained from the center of the detected location area. Target tracking for the face is then performed.
Through the foregoing implementations, the embodiments of the present application can obtain the position of the target object with high accuracy and improve the accuracy of target tracking.
In addition, in some embodiments of the present application, the above preprocessing and target detection processes may be performed by a management apparatus of the smart mobile device. In the embodiments of the present application, the management apparatus may be a Raspberry Pi chip, which offers high extensibility together with high processing speed.
In some embodiments of the present application, the obtained information such as the position of the target object may be transmitted to the control terminal of the smart mobile device so as to obtain the control instruction. The detection result of the target object, which indicates the position of the target object in the image, may be encapsulated and transmitted according to a preset data format. The data corresponding to the transmitted detection result may be 80 bytes and may include a mode flag bit, detection result information, a cyclic redundancy check (CRC), a retransmission threshold, a control field, and an optional field. The mode flag bit may indicate the current working mode of the Raspberry Pi chip; the detection result information may be the position of the target object; the CRC check bits are used for security verification; the retransmission threshold indicates the maximum number of data retransmissions; the control field indicates the desired working mode of the smart mobile device; and the optional field carries information that may be appended.
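A sketch of how such a packet might be assembled is shown below. The exact field widths and ordering are not specified in the text, so the layout and the helper name here are assumptions for illustration only; only the 80-byte total and the field list come from the description above.

    import struct
    import zlib

    def pack_detection(mode, x, y, retry_max, control, optional=b""):
        """Pack a detection result into an assumed 80-byte frame: mode flag,
        target position, CRC32 of the payload, retransmission threshold,
        control field, then optional bytes padded to 80 bytes."""
        payload = struct.pack("<Bii", mode, x, y)          # mode flag + position
        crc = zlib.crc32(payload)                          # CRC for verification
        frame = payload + struct.pack("<IBB", crc, retry_max, control) + optional
        return frame.ljust(80, b"\x00")                    # pad to the 80-byte size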
Once the position of the target object in the image is obtained, path planning for the smart mobile device can be performed to obtain the control instruction for controlling the smart mobile device. FIG. 4 is a schematic flowchart of step S30 in a target tracking method according to an embodiment of the present application. As shown in FIG. 4, step S30 can be implemented through the following steps:
Step S301: Determine a target offset based on the distance between the position of the target object in the image and the center position of the image.
In some embodiments of the present application, when tracking the target object, the position of the target object is kept at the center position of the image, and tracking of the target object is achieved in this way. Therefore, once the position of the target object is obtained, the distance between the position of the target object and the center position of the image can be measured and taken as the target offset. The Euclidean distance between the coordinates of the position of the target object and the coordinates of the center position of the image may be used as the target offset. The distance may also be expressed in vector form, for example as a directed vector between the center position of the image and the position of the target object; that is, the obtained target offset may include the distance between the position of the target object and the center position of the image as well as the direction of the center of the image relative to the position of the target object.
Step S302: Generate multiple groups of offset sequences based on the target offset, each offset sequence including at least one offset value, where the sum of the offset values in each group equals the target offset.
In some embodiments of the present application, multiple groups of offset sequences may be generated according to the obtained target offset. Each offset sequence includes at least one offset value, and the sum of these offset values equals the target offset. For example, if the position of the target object is (100, 0) and the position of the image center is (50, 0), the target offset is 50 along the x-axis. To realize this target offset, multiple offset sequences can be generated: for instance, the offset values of the first offset sequence may be 10, 20 and 20, and those of the second may be 10, 25 and 15, with the direction of each offset value being the positive direction of the x-axis. In the same way, multiple groups of offset sequences corresponding to the target offset can be obtained.
In a possible implementation, the number of offset values in each generated offset sequence may be set in advance, for example to 3, but this is not a specific limitation of the embodiments of the present application. In addition, the multiple groups of offset sequences may be generated randomly. In practice, there may be many combinations of offset values whose sequence realizes the target offset, and the embodiments of the present application may randomly select a preset number of combinations, that is, a preset number of offset sequences, from among them.
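A minimal sketch of step S302 is given below, assuming the fixed-length random decomposition just described; the sampling scheme and names are illustrative, not the specific method of the original disclosure.

    import random

    def generate_offset_sequences(target_offset, n_sequences=5, seq_len=3):
        """Randomly decompose a scalar target offset into `n_sequences`
        sequences of `seq_len` offset values that each sum to the target."""
        sequences = []
        for _ in range(n_sequences):
            # Draw seq_len - 1 cut points, then take the gaps as offset values.
            cuts = sorted(random.uniform(0, target_offset) for _ in range(seq_len - 1))
            bounds = [0.0] + cuts + [target_offset]
            sequences.append([bounds[i + 1] - bounds[i] for i in range(seq_len)])
        return sequences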
Step S303: Use a reinforcement learning algorithm to select, from the multiple groups of offset sequences, an offset sequence that meets the requirement, and obtain the control instruction corresponding to that offset sequence.
In some embodiments of the present application, once the generated offset sequences are obtained, a reinforcement learning algorithm can be used to select the offset sequence that meets the requirement. The reinforcement learning algorithm can be used to obtain the total value corresponding to each offset sequence, and the offset sequence with the highest total value is determined as the one that meets the requirement.
FIG. 5 is a schematic flowchart of step S303 in a target tracking method according to an embodiment of the present application. As shown in FIG. 5, step S303, "use a reinforcement learning algorithm to select, from the multiple groups of offset sequences, an offset sequence that meets the requirement, and obtain the control instruction corresponding to that offset sequence", may include:
Step S3031: For each offset value in the multiple groups of offset sequences, determine the maximum value corresponding to the offset value in a value table, the value table including the values of offset values under different rotation instructions.
In some embodiments of the present application, the reinforcement learning algorithm may be a value learning algorithm (Q-learning algorithm), and the corresponding value table (Q-table) may express the value (quality) of different offset values under different rotation instructions. A rotation instruction is an instruction controlling the rotation of the smart mobile device, and may include parameters such as the motor rotation angle, the motor speed, and the motor rotation time. The value table in the embodiments of the present application may be obtained in advance through reinforcement learning, and its parameters can accurately distinguish and reflect the values of different rotation instructions for different offset values. For example, Table 1 shows at least part of the parameters of the rotation instructions, and Table 2 shows a schematic value table, in which the horizontal entries a1, a2 and a3 are different rotation instructions, the vertical entries s1, s2 and s3 are different offset values, and each cell gives the value of the corresponding rotation instruction for the corresponding offset value. In general, the larger the number, the higher the value, indicating a higher value of achieving target tracking through that instruction.
Table 1. Partial rotation parameters corresponding to rotation instructions

    Action                 Value
    Motor speed            0-1000
    Motor rotation angle   0-360
    Motor rotation time    ~
    Motor stop action      hold, interrupt

Table 2. Value table for rotation parameters

          a1    a2    a3
    s1    1     2     3
    s2    1     1     2
    s3    4     2     1
As described in the foregoing embodiments, each offset sequence may include multiple offset values, and the embodiments of the present application may determine, based on the value table, the maximum value corresponding to each offset value in each sequence. For example, for the offset value s1 the maximum value is 3, for s2 it is 2, and for s3 it is 4. This is only an exemplary description; for a different value table the obtained values may differ, which is not specifically limited in the embodiments of the present application.
Step S3032: Obtain the reward value corresponding to the offset value, and determine the final value of the offset value based on the reward value and the maximum value corresponding to the offset value, where the reward value is the distance between the position of the target object and the center position of the image when the rotation instruction corresponding to the offset value has not been executed.
In some embodiments of the present application, the reward value of each offset value in the offset sequence can be obtained, where the reward value is related to the position of the target object before the corresponding offset is applied. For the first offset value of each offset sequence, when the rotation instruction corresponding to that offset value has not been executed, the position of the target object is the initially detected position of the target object in the image. For the other offset values in the offset sequence, the reward may be based on the position the target object would have after executing the rotation instructions of maximum value corresponding to the preceding offset values. For example, assume the detected position of the target object in the image is (100, 0) and the offset sequence satisfying the condition is 20, 15, 15. For the first offset value, its reward value can be determined from the position (100, 0) of the target object. For the second offset value, the position of the target object can be determined as (120, 0), and its reward value can be determined based on that position; when the third offset value is considered, the position of the target object can be determined as (135, 0), and its reward value can be determined based on that position.
In one example, the expression for obtaining the reward value may be as shown in formula (1-1):

    R(s, a) = (s(x) - b)^2 + (s(y) - c)^2        (1-1)

where R(s, a) is the reward value of the rotation instruction a of maximum value corresponding to the offset value s, that is, the reward value corresponding to the offset value s; s(x) and s(y) are respectively the abscissa and ordinate of the position of the target object before the rotation instruction a of maximum value corresponding to the offset value is executed; and b and c respectively denote the abscissa and ordinate of the center position of the image.
Given the reward value and the maximum value corresponding to an offset value, the final value of that offset value can be determined from them, for example as a weighted sum of the reward value and the maximum value. The expression for determining the final value of an offset value in the embodiments of the present application may be as shown in formula (1-2):

    Q'(s, a) = R(s, a) + r · max{Q(s, a)} · 0.2 · 0.5        (1-2)

where Q'(s, a) is the final value corresponding to the offset value s, R(s, a) is the reward value of the rotation instruction a of maximum value corresponding to the offset value s, and max{Q(s, a)} is the maximum value corresponding to the offset value s.
In this way, the final value corresponding to each offset value can be obtained.
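The two formulas translate directly into code; the sketch below assumes a value-table row (the Q-values of one offset value under the different rotation instructions) is passed in as a list, and treats `r` as the coefficient appearing in formula (1-2). The names and the default for r are assumptions.

    def reward(position, center):
        """Formula (1-1): squared distance between the target position and
        the image center before the chosen rotation instruction is executed."""
        (sx, sy), (b, c) = position, center
        return (sx - b) ** 2 + (sy - c) ** 2

    def final_value(position, center, q_row, r=0.9):
        """Formula (1-2): reward plus the maximum entry of the value-table
        row for this offset value, scaled by r * 0.2 * 0.5."""
        return reward(position, center) + r * max(q_row) * 0.2 * 0.5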
Step S3033: Determine the offset sequence with the largest sum of final values as the offset sequence that meets the requirement.
In some embodiments of the present application, the final values of the offset values in an offset sequence may be summed to obtain the total value corresponding to that offset sequence. The offset sequence with the largest total value is then selected as the offset sequence that meets the requirement.
In this way, the offset sequence with the largest total value can be obtained; the largest total value indicates that the rotation instructions corresponding to the rotation path of that offset sequence are the optimal choice.
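Combining the pieces above, step S3033 might be sketched as follows, reusing the `final_value` helper from the previous snippet. The value-table lookup keyed by offset value and the simulated position update (adding each offset along the x-axis, as in the worked example) are assumptions for illustration.

    def select_best_sequence(sequences, start_pos, center, q_table):
        """Score each offset sequence by summing the final values of its
        offset values, and return the sequence with the largest total."""
        best_seq, best_total = None, float("-inf")
        for seq in sequences:
            pos, total = start_pos, 0.0
            for offset in seq:
                q_row = q_table[offset]            # value-table row for this offset value
                total += final_value(pos, center, q_row)
                pos = (pos[0] + offset, pos[1])    # assumed x-axis update between steps
            if total > best_total:
                best_seq, best_total = seq, total
        return best_seq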
After the offset sequence meeting the requirement is obtained, the control instruction can be generated by combining, for each offset value in that sequence, the rotation instruction corresponding to its maximum value in the value table. The control instruction can then be transmitted to the smart mobile device so that the smart mobile device performs the rotation operation according to it.
In some embodiments of the present application, the smart mobile device can be controlled to move according to the generated control instruction. The control instruction may include parameters such as the rotation angle and rotation direction of the motors, or control commands such as the motor speed, the motor rotation time, and whether to stop.
The embodiments of the present application may control the movement of the mobile device by means of differential steering. For example, the smart mobile device may be a smart mobile vehicle with two drive wheels, left and right, and the embodiments of the present application may control the speeds of the two drive wheels based on the control instruction to achieve steering and movement. When the drive wheels rotate at different speeds, the body turns even if there is no steering wheel or the steering wheel does not act. In the embodiments of the present application, the difference between the speeds of the two drive wheels can be realized by operating two separate clutches or braking devices mounted on the left and right half-shafts.
According to the different speeds and rotation angles of the left and right drive wheels, the smart mobile device can follow different rotation trajectories. Under different rotation trajectories the vehicle captures different pictures; through continuous optimization the position of the smart mobile vehicle is adjusted until the target object is finally guaranteed to be at the center of the image, achieving tracking of the target object.
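A toy sketch of differential steering is given below, assuming a simple kinematic model with wheel separation `track` and commanded linear and angular velocities; none of these names or the specific model appear in the original text.

    def differential_wheel_speeds(v, omega, track=0.15):
        """Map a commanded linear velocity v (m/s) and angular velocity
        omega (rad/s) to left/right wheel speeds for a differential drive:
        a speed difference between the wheels turns the body even without
        a steering wheel."""
        v_left = v - omega * track / 2.0
        v_right = v + omega * track / 2.0
        return v_left, v_right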
In addition, in some embodiments of the present application, the forward or backward movement of the smart mobile device can also be determined according to the size of the detected target object. FIG. 6 is another schematic flowchart of a target tracking method according to an embodiment of the present application. As shown in FIG. 6, the target tracking method further includes:
Step S41: Determine, based on the location area of the target object, a control instruction for controlling the movement of the smart mobile device, where it can be determined whether the area of the location area of the target object lies in the range between a first threshold and a second threshold. In the process of performing step S20 of the embodiments of the present application, the location area of the target object in the captured image can be obtained, and the embodiments of the present application can control the movement direction of the smart mobile device according to the area of that location area.
Specifically, the area of the location area can be determined from the obtained location area of the target object and compared with the first threshold and the second threshold. The first threshold and the second threshold may be preset reference thresholds, with the first threshold greater than the second threshold; the embodiments of the present application do not limit the specific values.
Step S42: When the area corresponding to the location area of the target object is greater than the first threshold, generate a control instruction for controlling the smart mobile device to move backward.
In the embodiments of the present application, when the area of the detected location area of the target object is greater than the first threshold, the target object is relatively close to the smart mobile device, and the smart mobile device can be moved backward. A control instruction for controlling the smart mobile device to move backward can be generated until the area of the detected location area of the target object is smaller than the first threshold and greater than the second threshold.
Step S43: When the area corresponding to the location area of the target object is smaller than the second threshold, generate a control instruction for controlling the smart mobile device to move forward, where the first threshold is greater than the second threshold.
In the embodiments of the present application, when the area of the detected location area of the target object is smaller than the second threshold, the target object is relatively far from the smart mobile device, and the smart mobile device can be moved forward. A control instruction for controlling the smart mobile device to move forward can be generated until the area of the detected location area of the target object is smaller than the first threshold and greater than the second threshold.
Correspondingly, the smart mobile device can perform the forward or backward operation according to the received forward or backward control instruction.
In this way, the movement of the smart mobile device can be controlled according to the size of the target object, keeping the area corresponding to the location area of the detected target object (such as a human face) between the second threshold and the first threshold, thereby controlling the movement direction of the smart mobile device.
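Steps S41 to S43 reduce to a simple threshold check on the detection-frame area; a sketch follows, with the parameter names assumed for illustration.

    def distance_command(x1, x2, y1, y2, first_threshold, second_threshold):
        """Return 'backward', 'forward', or 'hold' from the detection-frame
        area, per steps S41-S43 (first_threshold > second_threshold)."""
        area = abs(x2 - x1) * abs(y2 - y1)
        if area > first_threshold:    # target too close: back away
            return "backward"
        if area < second_threshold:   # target too far: move closer
            return "forward"
        return "hold"                 # area within [second, first]: keep distance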
The subject performing the target tracking method in the embodiments of the present application may be the smart mobile device itself, or an apparatus installed in the smart mobile device that controls its movement. In the following, for clear illustration of the embodiments of the present application, the smart mobile device applying the target tracking method is an educational robot, the management apparatus of the educational robot is a Raspberry Pi, and the target object is a human face. FIG. 7 is an application example diagram of a target tracking method according to an embodiment of the present application, in which a camera 701 is connected to a Raspberry Pi 702 to transmit the images or video captured by the camera 701 to the Raspberry Pi 702; the camera 701 and the Raspberry Pi 702 may be connected through a Universal Serial Bus (USB) port for data transmission, but this connection manner is not a limitation of the embodiments of the present application. The following process can then be performed.
1. Raspberry Pi image acquisition and image preprocessing.
The application field of the embodiments of the present application may be intelligent robots in an educational context, and such an intelligent robot may implement face detection and tracking functions. The Raspberry Pi 702 can perform the image processing: in the embodiments of the present application it performs the image preprocessing and the target detection processing, and may have the target detection network integrated. Since the types of images captured by the camera 701 vary, the Raspberry Pi 702 needs to perform the necessary preprocessing on the image data before passing the images to the target detection network model.
The preprocessing flow includes four parts. FIG. 8 is a schematic flowchart of the preprocessing process according to an embodiment of the present application, which, as shown in FIG. 8, includes:
Step S51: Receive the captured video data.
Step S52: Split the video data into frames of picture data.
Step S53: Unify the picture size.
Step S54: Convert the picture into a grayscale image.
Step S55: Normalize the picture.
Image framing means decomposing the captured video data into individual image frames, whose size is then unified to 640*480. Since color images consume a large amount of resources during processing but have little influence on the detection result, the embodiments of the present application ignore the color features, convert the image directly into a grayscale image, and send it to the target detection network for detection. Finally, for convenience of image processing, the image is normalized: the mean of each dimension of the image data is subtracted from the original data of that dimension, the result replaces the original data, and the data of each dimension is then divided by the standard deviation of that dimension, so that the image data is normalized to the same scale.
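The four parts could be chained as below; this is a sketch assuming OpenCV video capture and reusing the illustrative helpers sketched earlier (`to_preset_grayscale` and `normalize`).

    import cv2

    def preprocess_stream(device=0):
        """Steps S51-S55: capture video, split it into frames, resize,
        convert to grayscale, and normalize each frame."""
        cap = cv2.VideoCapture(device)          # S51: receive video data
        while True:
            ok, frame = cap.read()              # S52: one frame of picture data
            if not ok:
                break
            gray = to_preset_grayscale(frame)   # S53 + S54: resize and grayscale
            yield normalize([gray])[0]          # S55: normalize
        cap.release()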
2. Face detection based on a deep neural network model.
Input: the picture captured by the camera 701.
Output: the coordinate position of the detected face.
In the embodiments of the present application, face recognition and detection in the image can be performed by the target detection network in the Raspberry Pi 702; that is, the embodiments of the present application can implement face detection with deep learning technology, which involves two stages: model training and model application. FIG. 9 is a schematic diagram of the training process of the target detection network according to an embodiment of the present application. As shown in FIG. 9, the training process includes:
Step S61: Collect face data set pictures.
The face data set pictures include face pictures of various ages and regions, and the face pictures are manually annotated to obtain the face coordinate positions. A face data set is constructed and divided into three parts: a training set, a test set, and a validation set.
Step S62: Construct the neural network model.
In actual implementation, step S62 can be realized through the following steps:
Step S621: Implement feature extraction by stacking convolutional layers and pooling layers.
Step S622: Classify the extracted features with a classifier.
In implementation, the classification can be realized through a fully connected layer (the classifier), for example as in the sketch below.
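As an illustration of steps S621 and S622, a minimal model of this shape might look as follows in PyTorch; the layer sizes and the five-way output layout (presence flag plus four box coordinates) are assumptions, not the architecture of the original disclosure.

    import torch
    import torch.nn as nn

    class FaceDetector(nn.Module):
        """Stacked convolution + pooling for feature extraction (S621),
        then a fully connected classifier head (S622)."""
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.classifier = nn.Linear(32 * 120 * 160, 5)  # flag + (x1, x2, y1, y2)

        def forward(self, x):                 # x: (N, 1, 480, 640) grayscale input
            f = self.features(x)
            return self.classifier(f.flatten(1))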
Step S63: Train the neural network model.
Model training is carried out through a series of gradient optimization algorithms; after a large number of training iterations, a trained model is obtained for model testing.
Step S64: Obtain the trained neural network model.
In the embodiments of the present application, the training process of the model is the training process of the target detection network (the neural network model).
FIG. 10 is a schematic diagram of the application process of the target detection network according to an embodiment of the present application. As shown in FIG. 10, the application process includes:
Step S71: Capture a face picture.
Step S72: Feed the preprocessed picture into the trained model.
Step S73: Obtain the coordinate position of the face.
In the embodiments of the present application, the preprocessed picture is fed into the trained model, and after forward computation the coordinate position of the face in the picture is output.
3. Send the detection result to the educational robot EV3 (the intelligent robot of the foregoing embodiments).
Through the foregoing embodiments, the face coordinate position detection can be completed by the Raspberry Pi 702, and the face coordinate position can then be encapsulated into a data packet according to the already defined communication protocol specification. After the data encapsulation is completed, the packet is sent through the serial port to the processor or controller in the smart mobile device 703, where the smart mobile device 703 may be the educational robot EV3; the smart mobile device 703 can then complete the subsequent face tracking according to the received face position.
4. EV3 performs path planning according to the face position coordinates.
The educational robot EV3 receives and parses the data packet sent from the Raspberry Pi 702 side, obtains the face coordinate position, and then completes the path planning, for which a reinforcement learning algorithm can be used. Reinforcement learning mainly involves state, reward, and action factors. Here, the state is the face coordinate position obtained at each detection; the reward can be defined from the Euclidean distance between the face center and the picture center; and the action is the motor motion instruction executed each time. In the educational robot EV3, the motor actions that can be controlled are as in Table 1. Path planning can be performed through the Q-learning algorithm model. The Q function is defined so that its input includes a state and an action, and it returns the reward value of executing that action in that state.
FIG. 11 is a schematic flowchart of the reinforcement-learning-based path planning algorithm according to an embodiment of the present application, which, as shown in FIG. 11, includes:
Step S81: Initialize the Q-value table.
Step S82: Select a specific motor execution instruction from the action set.
Step S83: Execute the specific motor execution instruction.
Step S84: Compute the Q-value table for the current state.
Step S85: Update the Q-value table.
The action set of the educational robot EV3 is shown in Table 1. The state set determines the tracking effect through the face coordinates: the distance between the face position and the picture center serves as the reward function, and the Q-value table is updated by evaluating the reward function of different actions. Finally, the optimal Q-value table is obtained, which contains the best action sequence, that is, the motor execution instructions. A sketch of this loop is given below.
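Steps S81 to S85 form the standard tabular Q-learning loop. The sketch assumes discretized states and actions, a `step` function that executes an action and returns the next state (drawn from the same state set) together with its reward, and conventional learning-rate and discount hyperparameters; these are illustrative, not from the text.

    import random

    def q_learning(states, actions, step, episodes=100, alpha=0.1,
                   gamma=0.9, epsilon=0.1):
        """Tabular Q-learning (steps S81-S85): initialize the Q table, pick a
        motor instruction, execute it, compute the target, update the table."""
        q = {(s, a): 0.0 for s in states for a in actions}         # S81
        for _ in range(episodes):
            s = random.choice(states)
            for _ in range(50):
                a = (random.choice(actions) if random.random() < epsilon
                     else max(actions, key=lambda x: q[(s, x)]))   # S82
                s_next, reward = step(s, a)                        # S83: act, observe
                target = reward + gamma * max(q[(s_next, b)] for b in actions)  # S84
                q[(s, a)] += alpha * (target - q[(s, a)])          # S85
                s = s_next
        return q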
5. The smart mobile device 703 implements face tracking according to the motion instructions (the control instructions of the foregoing embodiments).
Smart mobile devices such as the educational robot use a differential steering mechanism: the vehicle steers by controlling the speeds of the two drive wheels 704 and 705. When the drive wheels rotate at different speeds, the body turns even if there is no steering wheel or the steering wheel does not act. The difference in drive wheel speed can be realized by operating two separate clutches or braking devices mounted on the left and right half-shafts.
The smart mobile device 703 can follow different rotation trajectories according to the different speeds and rotation angles of the left and right wheels. Under different rotation trajectories the vehicle captures different pictures; the actions are then continuously optimized and the vehicle position adjusted until the face position is finally guaranteed to be at the picture center, realizing the face tracking function.
In addition, the smart mobile device in the embodiments of the present application may also be provided with sensors 706, such as distance sensors and touch sensors, for sensing information about the surroundings of the smart mobile device 703; the working mode, movement parameters, and the like of the smart mobile device 703 can be controlled according to the sensed information about the surroundings.
The foregoing is merely an illustrative example and is not a specific limitation of the embodiments of the present application.
In summary, the target tracking method provided by the embodiments of the present application can obtain the position of the target object in the captured image and, from the distance between that position and the image center, obtain the control instruction of the smart mobile device. The control instruction is used to adjust the rotation angle of the smart mobile device and includes rotation instructions corresponding to at least one offset value, where the offset sequence formed by the offset values is determined by the distance between the target object and the image center. The obtained control instruction enables the target object, after the rotation, to lie at the center of the captured image, keeping the target object within the tracking range of the smart mobile device. The embodiments of the present application can perform target tracking in real time according to the position of the target object, and are more convenient and accurate while improving the efficiency of the smart mobile device.
In addition, the embodiments of the present application can complete face detection with deep learning technology (implementing target detection with a neural network), with clearly improved accuracy and speed compared with traditional target detection methods. The embodiments of the present application can also use a reinforcement learning algorithm to perform path planning through Q-learning technology, selecting the best rotation path. The embodiments of the present application can further be adapted to the requirements of different scenarios and have good extensibility.
Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
It can be understood that the various method embodiments mentioned in this application can be combined with one another to form combined embodiments without violating principle or logic.
In addition, the embodiments of the present application further provide a target tracking apparatus, a smart mobile device, a computer-readable storage medium, and a program, all of which can be used to implement any target tracking method provided in the embodiments of the present application; for the corresponding technical solutions and descriptions, refer to the corresponding records in the method section.
FIG. 12 is a schematic structural diagram of a target tracking apparatus provided by an embodiment of the application. As shown in FIG. 12, the target tracking apparatus includes:
an image acquisition module 10, configured to acquire an image;
a target detection module 20, configured to determine the position of a target object in the image; and
a control module 30, configured to determine, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, where the control instruction is used to bring the position of the target object to the center position of the image and includes the control instructions corresponding to the offset values in an offset sequence constituting the distance, the offset sequence including at least one offset value.
In some embodiments of the present application, the apparatus further includes a preprocessing module, configured to perform a preprocessing operation on the image, the preprocessing operation including: adjusting the image into a grayscale image of a preset specification, and performing normalization processing on the grayscale image.
The target detection module is further configured to perform target detection processing on the image obtained after the preprocessing operation, to obtain the position of the target object in the image after the preprocessing operation; and
to determine the position of the target object in the image based on the position of the target object in the image after the preprocessing operation.
In some embodiments of the present application, the step in which the preprocessing module performs the normalization processing on the grayscale image includes:
determining the mean and the standard deviation of the pixel values of the pixels in the grayscale image;
obtaining the difference between the pixel value of each pixel and the mean; and
determining the ratio of the difference corresponding to each pixel to the standard deviation as the normalized pixel value of that pixel.
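A minimal sketch of this preprocessing step follows; the 64x64 preset specification is an assumption, while the mean/standard-deviation normalization mirrors the description above:

```python
# Hypothetical preprocessing sketch: grayscale at a preset size, then
# z-score normalization using the image's own pixel statistics.
import cv2
import numpy as np

PRESET_SIZE = (64, 64)  # assumed preset specification (width, height)


def preprocess(image_bgr: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, PRESET_SIZE).astype(np.float32)
    mean = gray.mean()                      # average pixel value
    std = gray.std()                        # standard deviation of pixel values
    return (gray - mean) / max(std, 1e-6)   # ratio of difference to std; guard zero std
```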
In some embodiments of the present application, the target detection module is further configured to extract image features of the image;
to perform classification processing on the image features to obtain the location area of the target object in the image; and
to determine the center position of the location area as the position of the target object.
In some embodiments of the present application, the target object includes a human face;
correspondingly, the target detection module is further configured to determine the position of the human face in the image.
In some embodiments of the present application, the control module is further configured to determine a target offset based on the distance between the position of the target object in the image and the center position of the image;
to generate multiple groups of offset sequences based on the target offset, where the offset values in each group of offset sequences sum to the target offset; and
to select, using a reinforcement learning algorithm, an offset sequence that meets requirements from the multiple groups of offset sequences, and to obtain the control instruction corresponding to the offset sequence that meets the requirements.
In some embodiments of the present application, the control module is further configured: for each offset value in the multiple groups of offset sequences, to determine the maximum value corresponding to the offset value in a value table, the value table including the values corresponding to offset values under different rotation instructions;
to obtain the reward value corresponding to the offset value and determine the final value of the offset value based on the reward value and the maximum value corresponding to the offset value, where the reward value is the distance between the position of the target object and the image center in the case where the rotation instruction corresponding to the maximum value of the offset value has not been executed; and
to determine, as the offset sequence that meets the requirements, the offset sequence whose offset values have the largest sum of final values among the multiple groups of offset sequences.
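A minimal sketch, assuming a tabular Q-learning setup, of how candidate offset sequences could be scored with a value table and a distance-based reward as described above; the table contents, the reward function, and the candidate sequences are all illustrative assumptions:

```python
# Hypothetical scoring of offset sequences with a Q-table.
# q_table[offset][action] is the assumed learned value of issuing
# rotation `action` when the offset to cover is `offset`.

def sequence_value(sequence, q_table, reward_fn):
    """Sum each offset's reward plus its maximum table value (its
    "final value" in the text) over one candidate sequence."""
    total = 0.0
    for offset in sequence:
        max_q = max(q_table[offset].values())  # best value for this offset
        total += reward_fn(offset) + max_q     # final value of this offset
    return total


def best_sequence(candidates, q_table, reward_fn):
    """Pick the candidate whose summed final value is largest."""
    return max(candidates, key=lambda s: sequence_value(s, q_table, reward_fn))


# Toy usage: three ways to split a 6-pixel target offset into steps.
q_table = {2: {"L": 0.5, "R": 1.0}, 3: {"L": 0.2, "R": 0.9},
           6: {"L": 0.1, "R": 0.4}}
candidates = [[2, 2, 2], [3, 3], [6]]
print(best_sequence(candidates, q_table, lambda off: -abs(off)))
```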
In some embodiments of the present application, the control module is further configured to determine the control instruction based on the rotation instructions corresponding to the maximum values of the offset values in the offset sequence that meets the requirements.
In some embodiments of the present application, the target detection module is further configured to determine, based on the location area of the target object, a control instruction for controlling the movement of the smart mobile device, where:
in the case that the area corresponding to the location area of the target object is greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and
in the case that the area corresponding to the location area of the target object is smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, the first threshold being greater than the second threshold.
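A minimal sketch of this distance-keeping rule; the threshold values are assumptions, and the intuition is that a large detection box means the target is close (so the device backs up) and a small one means it is far (so the device advances):

```python
# Hypothetical area thresholds, in squared pixels (first > second).
from typing import Optional

AREA_FIRST = 1600   # first threshold: target looks large, i.e. too close
AREA_SECOND = 400   # second threshold: target looks small, i.e. too far


def advance_command(box_w: int, box_h: int) -> Optional[str]:
    area = box_w * box_h
    if area > AREA_FIRST:
        return "backward"   # move back from a too-close target
    if area < AREA_SECOND:
        return "forward"    # move toward a too-far target
    return None             # comfortable band: no forward/backward motion


print(advance_command(50, 40))  # area 2000 -> "backward"
```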
In addition, an embodiment of the present application further provides a smart mobile device that includes the target tracking apparatus described in the above embodiments, where the target detection network in the target tracking apparatus is integrated in a management device of the smart mobile device, and the management device performs the target detection processing on the image acquired by the image acquisition module to obtain the position of the target object.
The control module is connected to the management device, and is used to generate the control instruction according to the position of the target object obtained by the management device and to control the rotation of the smart mobile device according to the control instruction.
In some embodiments of the present application, the management device is a Raspberry Pi.
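On a Raspberry Pi, the rotation could plausibly be driven through GPIO-controlled motor PWM; the pin numbers, PWM frequency, and pivot-style turn below are wiring assumptions for illustration only, not a configuration specified by this application:

```python
# Hypothetical wheel driver for a Raspberry Pi (pins and duty cycles assumed).
import RPi.GPIO as GPIO

LEFT_PIN, RIGHT_PIN = 12, 13       # assumed PWM-capable BCM pins

GPIO.setmode(GPIO.BCM)
GPIO.setup(LEFT_PIN, GPIO.OUT)
GPIO.setup(RIGHT_PIN, GPIO.OUT)
left = GPIO.PWM(LEFT_PIN, 100)     # 100 Hz PWM for each wheel
right = GPIO.PWM(RIGHT_PIN, 100)
left.start(0)
right.start(0)


def pivot_right(duty: float) -> None:
    """Pivot to the right by driving only the left wheel."""
    left.ChangeDutyCycle(duty)
    right.ChangeDutyCycle(0)
```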
In some embodiments of the present application, the smart mobile device includes an educational robot.
In some embodiments of the present application, the management device further integrates the preprocessing module of the target tracking apparatus, configured to perform the preprocessing operation on the image and to perform the target detection processing on the image after the preprocessing operation, to obtain the position of the target object in the image.
In some embodiments, the functions of, or the modules included in, the apparatus provided in the embodiments of the present application can be configured to execute the methods described in the above method embodiments; for specific implementations, refer to the descriptions of the above method embodiments.
The embodiments of the present application further propose a computer-readable storage medium on which computer program instructions are stored; when the computer program instructions are executed by a processor, the foregoing method is implemented. The computer-readable storage medium may be a non-volatile computer-readable storage medium.
The embodiments of the present application further propose a smart mobile device, including: a processor; and a memory for storing processor-executable instructions, where the processor is configured to perform the above method.
FIG. 13 is a schematic structural diagram of a smart mobile device provided by an embodiment of this application. For example, the smart mobile device 800 may be any device capable of performing image processing, or any mobile device capable of performing target tracking.
Referring to FIG. 13, the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls the overall operations of the device 800, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 802 may include one or more processors 820 to execute instructions to complete all or some of the steps of the foregoing method. In addition, the processing component 802 may include one or more modules to facilitate interaction between the processing component 802 and other components; for example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation on the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos, and the like. The memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power supply component 806 provides power for the various components of the device 800. The power supply component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel; the touch sensors may not only sense the boundary of a touch or slide action but also detect the duration and pressure related to the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or may have focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC); when the device 800 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode, the microphone is configured to receive external audio signals. The received audio signal may be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules; the peripheral interface modules may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing the device 800 with status assessments of various aspects. For example, the sensor component 814 can detect the on/off state of the device 800 and the relative positioning of components (for example, the display and keypad of the device 800), and can also detect a position change of the device 800 or of one of its components, the presence or absence of contact between the user and the device 800, the orientation or acceleration/deceleration of the device 800, and temperature changes of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a complementary metal-oxide-semiconductor (CMOS) or charge-coupled device (CCD) image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a near-field communication (NFC) module to facilitate short-range communication; for example, the NFC module can be implemented based on radio-frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements, for executing the above method.
In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, for example the memory 804 including computer program instructions, which can be executed by the processor 820 of the device 800 to implement the foregoing method.
The embodiments of the application may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the embodiments of the present application.
The computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer disk, a hard disk, random-access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random-access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punch card or in-groove raised structure on which instructions are stored, and any suitable combination of the above. The computer-readable storage medium used here is not to be interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber-optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions used to perform the operations of the embodiments of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, via the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is customized by using the state information of the computer-readable program instructions; the electronic circuit can execute the computer-readable program instructions to implement various aspects of the embodiments of the present application.
Various aspects of the embodiments of the present application are described here with reference to the flowcharts and/or block diagrams of the methods, apparatuses (systems), and computer program products according to the embodiments of the present application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, thereby producing a machine, such that when these instructions are executed by the processor of the computer or other programmable data processing apparatus, an apparatus is produced that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions cause computers, programmable data processing apparatuses, and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions can also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operation steps is executed on the computer, the other programmable data processing apparatus, or the other device to produce a computer-implemented process, such that the instructions executed on the computer, the other programmable data processing apparatus, or the other device implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings show the possible architectures, functions, and operations of the systems, methods, and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or part of an instruction, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions marked in the blocks may also occur in an order different from that marked in the drawings; for example, two consecutive blocks can actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
The embodiments of the present application have been described above; the above description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and changes are obvious to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical applications, or their technical improvements over the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Industrial Applicability
The embodiments of the application disclose a target tracking method and apparatus, a smart mobile device, and a storage medium. The method includes: acquiring a captured image; determining the position of a target object in the image; and obtaining, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling a smart mobile device, where the control instruction is used to bring the position of the target object to the center position of the image, and the control instruction includes the rotation instructions corresponding to the offset values in an offset sequence constituting the distance, the offset sequence including at least one offset value. The embodiments of the present application can realize real-time tracking of a target object.

Claims (25)

  1. A target tracking method, comprising:
    acquiring a captured image;
    determining the position of a target object in the image; and
    determining, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, wherein the control instruction is used to bring the position of the target object to the center position of the image, and the control instruction comprises rotation instructions corresponding to the offset values in an offset sequence constituting the distance, the offset sequence comprising at least one offset value.
  2. The method according to claim 1, wherein, before the determining of the position of the target object in the image, the method further comprises performing a preprocessing operation on the image, the preprocessing operation comprising: adjusting the image into a grayscale image of a preset specification, and performing normalization processing on the grayscale image;
    wherein the determining of the position of the target object in the image comprises:
    performing target detection processing on the image obtained after the preprocessing operation, to obtain the position of the target object in the image after the preprocessing operation; and
    determining the position of the target object in the image based on the position of the target object in the image after the preprocessing operation.
  3. The method according to claim 2, wherein the performing of the normalization processing on the grayscale image comprises:
    determining the mean and the standard deviation of the pixel values of the pixels in the grayscale image;
    obtaining the difference between the pixel value of each pixel and the mean; and
    determining the ratio of the difference corresponding to each pixel to the standard deviation as the normalized pixel value of that pixel.
  4. The method according to any one of claims 1 to 3, wherein the determining of the position of the target object in the image comprises:
    extracting image features of the image;
    performing classification processing on the image features to obtain a location area of the target object in the image; and
    determining the center position of the location area as the position of the target object.
  5. The method according to any one of claims 1 to 4, wherein the target object comprises a human face;
    correspondingly, the determining of the position of the target object in the image comprises: determining the position of the human face in the image.
  6. The method according to any one of claims 1 to 5, wherein the determining, based on the distance between the position of the target object and the center position of the image, of the control instruction for controlling the rotation of the smart mobile device comprises:
    determining a target offset based on the distance between the position of the target object in the image and the center position of the image;
    generating multiple groups of offset sequences based on the target offset, wherein the offset values in each group of offset sequences sum to the target offset; and
    selecting, using a reinforcement learning algorithm, an offset sequence that meets requirements from the multiple groups of offset sequences, and determining the control instruction corresponding to the offset sequence that meets the requirements.
  7. The method according to claim 6, wherein the selecting, using the reinforcement learning algorithm, of the offset sequence that meets the requirements from the multiple groups of offset sequences comprises:
    for each offset value in the multiple groups of offset sequences, determining the maximum value corresponding to the offset value in a value table, the value table comprising the values corresponding to offset values under different rotation instructions;
    obtaining a reward value corresponding to the offset value, and determining the final value of the offset value based on the reward value and the maximum value corresponding to the offset value, wherein the reward value is the distance between the position of the target object and the center position of the image in the case where the rotation instruction corresponding to the maximum value of the offset value has not been executed; and
    determining, as the offset sequence that meets the requirements, the offset sequence whose offset values have the largest sum of final values among the multiple groups of offset sequences.
  8. The method according to claim 6 or 7, wherein the determining of the control instruction corresponding to the offset sequence that meets the requirements comprises:
    determining the control instruction based on the rotation instructions corresponding to the maximum values of the offset values in the offset sequence that meets the requirements.
  9. The method according to any one of claims 1 to 8, further comprising:
    driving the smart mobile device to rotate based on the control instruction.
  10. The method according to claim 4, further comprising:
    determining, based on the location area of the target object, a control instruction for controlling the movement of the smart mobile device, wherein:
    in response to the area corresponding to the location area of the target object being greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and
    in response to the area corresponding to the location area of the target object being smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, the first threshold being greater than the second threshold.
  11. A target tracking apparatus, comprising:
    an image acquisition module, configured to acquire an image;
    a target detection module, configured to determine the position of a target object in the image; and
    a control module, configured to determine, based on the distance between the position of the target object and the center position of the image, a control instruction for controlling the rotation of a smart mobile device, wherein the control instruction is used to bring the position of the target object to the center position of the image, and the control instruction comprises rotation instructions corresponding to the offset values in an offset sequence constituting the distance, the offset sequence comprising at least one offset value.
  12. The apparatus according to claim 11, further comprising a preprocessing module configured to perform a preprocessing operation on the image, the preprocessing operation comprising: adjusting the image into a grayscale image of a preset specification, and performing normalization processing on the grayscale image;
    wherein the target detection module is further configured to perform target detection processing on the image obtained after the preprocessing operation, to obtain the position of the target object in the image after the preprocessing operation; and
    to determine the position of the target object in the image based on the position of the target object in the image after the preprocessing operation.
  13. The apparatus according to claim 12, wherein the step in which the preprocessing module performs the normalization processing on the grayscale image comprises:
    determining the mean and the standard deviation of the pixel values of the pixels in the grayscale image;
    obtaining the difference between the pixel value of each pixel and the mean; and
    determining the ratio of the difference corresponding to each pixel to the standard deviation as the normalized pixel value of that pixel.
  14. The apparatus according to any one of claims 11 to 13, wherein the target detection module is further configured to extract image features of the image;
    to perform classification processing on the image features to obtain a location area of the target object in the image; and
    to determine the center position of the location area as the position of the target object.
  15. The apparatus according to any one of claims 11 to 14, wherein the target object comprises a human face;
    correspondingly, the target detection module is further configured to determine the position of the human face in the image.
  16. The apparatus according to any one of claims 11 to 15, wherein the control module is further configured to determine a target offset based on the distance between the position of the target object in the image and the center position of the image;
    to generate multiple groups of offset sequences based on the target offset, wherein the offset values in each group of offset sequences sum to the target offset; and
    to select, using a reinforcement learning algorithm, an offset sequence that meets requirements from the multiple groups of offset sequences, and to obtain the control instruction corresponding to the offset sequence that meets the requirements.
  17. The apparatus according to claim 16, wherein the control module is further configured: for each offset value in the multiple groups of offset sequences, to determine the maximum value corresponding to the offset value in a value table, the value table comprising the values corresponding to offset values under different rotation instructions;
    to obtain a reward value corresponding to the offset value, and determine the final value of the offset value based on the reward value and the maximum value corresponding to the offset value, wherein the reward value is the distance between the position of the target object and the center position of the image in the case where the rotation instruction corresponding to the maximum value of the offset value has not been executed; and
    to determine, as the offset sequence that meets the requirements, the offset sequence whose offset values have the largest sum of final values among the multiple groups of offset sequences.
  18. The apparatus according to claim 16 or 17, wherein the control module is further configured to determine the control instruction based on the rotation instructions corresponding to the maximum values of the offset values in the offset sequence that meets the requirements.
  19. The apparatus according to claim 14, wherein the target detection module is further configured to determine, based on the location area of the target object, a control instruction for controlling the movement of the smart mobile device, wherein:
    in the case that the area corresponding to the location area of the target object is greater than a first threshold, a control instruction for controlling the smart mobile device to move backward is generated; and
    in the case that the area corresponding to the location area of the target object is smaller than a second threshold, a control instruction for controlling the smart mobile device to move forward is generated, the first threshold being greater than the second threshold.
  20. A smart mobile device, comprising the target tracking apparatus according to any one of claims 11 to 19,
    wherein the target detection module in the target tracking apparatus is integrated in a management device of the smart mobile device, and the management device performs the target detection processing on the image acquired by the image acquisition module to obtain the position of the target object; and
    the control module is connected to the management device and is configured to generate the control instruction according to the position of the target object obtained by the management device, and to control the rotation of the smart mobile device according to the control instruction.
  21. The device according to claim 20, wherein the management device further integrates the preprocessing module of the target tracking apparatus, configured to perform the preprocessing operation on the image and to perform the target detection processing on the image after the preprocessing operation, to obtain the position of the target object in the image.
  22. The device according to claim 20 or 21, wherein the smart mobile device comprises an educational robot.
  23. A smart mobile device, comprising:
    a processor; and
    a memory configured to store processor-executable instructions,
    wherein the processor is configured to call the instructions stored in the memory to execute the method according to any one of claims 1 to 10.
  24. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 10.
  25. A computer program, comprising computer-readable code, wherein, when the computer-readable code runs in a smart mobile device, a processor in the smart mobile device executes the method according to any one of claims 1 to 10.
PCT/CN2020/089620 2019-07-17 2020-05-11 Target tracking method and apparatus, intelligent mobile device and storage medium WO2021008207A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217014152A KR20210072808A (en) 2019-07-17 2020-05-11 Target tracking method and device, smart mobile device and storage medium
JP2021525569A JP2022507145A (en) 2019-07-17 2020-05-11 Target tracking methods and equipment, intelligent mobile equipment and storage media

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910646696.8 2019-07-17
CN201910646696.8A CN110348418B (en) 2019-07-17 2019-07-17 Target tracking method and device, intelligent mobile device and storage medium

Publications (1)

Publication Number Publication Date
WO2021008207A1 true WO2021008207A1 (en) 2021-01-21

Family

ID=68175655

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/089620 WO2021008207A1 (en) 2019-07-17 2020-05-11 Target tracking method and apparatus, intelligent mobile device and storage medium

Country Status (5)

Country Link
JP (1) JP2022507145A (en)
KR (1) KR20210072808A (en)
CN (1) CN110348418B (en)
TW (2) TWI755762B (en)
WO (1) WO2021008207A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348418B (en) * 2019-07-17 2022-03-11 上海商汤智能科技有限公司 Target tracking method and device, intelligent mobile device and storage medium
CN112207821B (en) * 2020-09-21 2021-10-01 大连遨游智能科技有限公司 Target searching method of visual robot and robot
CN113409220A (en) * 2021-06-28 2021-09-17 展讯通信(天津)有限公司 Face image processing method, device, medium and equipment
CN115037877A (en) * 2022-06-08 2022-09-09 湖南大学重庆研究院 Automatic following method and device and safety monitoring method and device
CN117238039B (en) * 2023-11-16 2024-03-19 暗物智能科技(广州)有限公司 Multitasking human behavior analysis method and system based on top view angle

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1178467C (en) * 1998-04-16 2004-12-01 三星电子株式会社 Method and apparatus for automatically tracing moving object
US7430315B2 (en) * 2004-02-13 2008-09-30 Honda Motor Co. Face recognition system
JP3992026B2 (en) * 2004-07-09 2007-10-17 船井電機株式会社 Self-propelled robot
JP2010176504A (en) * 2009-01-30 2010-08-12 Canon Inc Image processor, image processing method, and program
JP2012191265A (en) * 2011-03-08 2012-10-04 Nikon Corp Image processing apparatus and program
CN102411368B (en) * 2011-07-22 2013-10-09 北京大学 Active vision human face tracking method and tracking system of robot
CN102307297A (en) * 2011-09-14 2012-01-04 镇江江大科茂信息系统有限责任公司 Intelligent monitoring system for multi-azimuth tracking and detecting on video object
KR102131477B1 (en) * 2013-05-02 2020-07-07 퀄컴 인코포레이티드 Methods for facilitating computer vision application initialization
JP6680498B2 (en) * 2015-09-28 2020-04-15 株式会社日立システムズ Autonomous flying vehicle, target tracking method
WO2017120336A2 (en) * 2016-01-05 2017-07-13 Mobileye Vision Technologies Ltd. Trained navigational system with imposed constraints
CN113589833A (en) * 2016-02-26 2021-11-02 深圳市大疆创新科技有限公司 Method for visual target tracking
CN108292141B (en) * 2016-03-01 2022-07-01 深圳市大疆创新科技有限公司 Method and system for target tracking
CN107798723B (en) * 2016-08-30 2021-11-19 北京神州泰岳软件股份有限公司 Target tracking control method and device
US10140719B2 (en) * 2016-12-22 2018-11-27 TCL Research America Inc. System and method for enhancing target tracking via detector and tracker fusion for unmanned aerial vehicles
JP6856914B2 (en) * 2017-07-18 2021-04-14 ハンジョウ タロ ポジショニング テクノロジー カンパニー リミテッドHangzhou Taro Positioning Technology Co.,Ltd. Intelligent object tracking
CN108549413A (en) * 2018-04-27 2018-09-18 全球能源互联网研究院有限公司 A kind of holder method of controlling rotation, device and unmanned vehicle
CN108806146A (en) * 2018-06-06 2018-11-13 合肥嘉仕诚能源科技有限公司 A kind of safety monitoring dynamic object track lock method and system
CN109992000B (en) * 2019-04-04 2020-07-03 北京航空航天大学 Multi-unmanned aerial vehicle path collaborative planning method and device based on hierarchical reinforcement learning

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101888479A (en) * 2009-05-14 2010-11-17 汉王科技股份有限公司 Method and device for detecting and tracking target image
CN104751486A (en) * 2015-03-20 2015-07-01 安徽大学 Moving object relay tracing algorithm of multiple PTZ (pan/tilt/zoom) cameras
CN105740644A (en) * 2016-03-24 2016-07-06 苏州大学 Cleaning robot optimal target path planning method based on model learning
CN109040574A (en) * 2017-06-08 2018-12-18 北京君正集成电路股份有限公司 A kind of method and device of rotation head-shaking machine tracking target
CN107992099A (en) * 2017-12-13 2018-05-04 福州大学 A kind of target sport video tracking and system based on improvement frame difference method
CN110348418A (en) * 2019-07-17 2019-10-18 上海商汤智能科技有限公司 Method for tracking target and device, Intelligent mobile equipment and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139655A (en) * 2021-03-31 2021-07-20 北京大学 Target tracking training method and tracking method based on reinforcement learning
CN113139655B (en) * 2021-03-31 2022-08-19 北京大学 Target tracking training method and tracking method based on reinforcement learning
CN115250329A (en) * 2021-04-28 2022-10-28 深圳市三诺数字科技有限公司 Camera control method and device, computer equipment and storage medium
CN115250329B (en) * 2021-04-28 2024-04-19 深圳市三诺数字科技有限公司 Camera control method and device, computer equipment and storage medium
CN113625658A (en) * 2021-08-17 2021-11-09 杭州飞钛航空智能装备有限公司 Offset information processing method and device, electronic equipment and hole making mechanism

Also Published As

Publication number Publication date
CN110348418A (en) 2019-10-18
TW202215364A (en) 2022-04-16
KR20210072808A (en) 2021-06-17
TW202105326A (en) 2021-02-01
TWI755762B (en) 2022-02-21
CN110348418B (en) 2022-03-11
JP2022507145A (en) 2022-01-18

Similar Documents

Publication Publication Date Title
WO2021008207A1 (en) Target tracking method and apparatus, intelligent mobile device and storage medium
US20200387698A1 (en) Hand key point recognition model training method, hand key point recognition method and device
US11468581B2 (en) Distance measurement method, intelligent control method, electronic device, and storage medium
WO2020187153A1 (en) Target detection method, model training method, device, apparatus and storage medium
TWI728621B (en) Image processing method and device, electronic equipment, computer readable storage medium and computer program
WO2021164469A1 (en) Target object detection method and apparatus, device, and storage medium
TWI724736B (en) Image processing method and device, electronic equipment, storage medium and computer program
TWI766286B (en) Image processing method and image processing device, electronic device and computer-readable storage medium
CN111985265B (en) Image processing method and device
WO2021017836A1 (en) Method for controlling display of large-screen device, and mobile terminal and first system
WO2020216054A1 (en) Sight line tracking model training method, and sight line tracking method and device
WO2022127919A1 (en) Surface defect detection method, apparatus, system, storage medium, and program product
CN104156915A (en) Skin color adjusting method and device
CN110443366B (en) Neural network optimization method and device, and target detection method and device
WO2021035812A1 (en) Image processing method and apparatus, electronic device and storage medium
US20190271940A1 (en) Electronic device, external device capable of being combined with the electronic device, and a display method thereof
KR101623642B1 (en) Control method of robot cleaner and terminal device and robot cleaner control system including the same
CN104063865A (en) Classification model creation method, image segmentation method and related device
WO2020220973A1 (en) Photographing method and mobile terminal
CN108156374A (en) A kind of image processing method, terminal and readable storage medium storing program for executing
CN111435422B (en) Action recognition method, control method and device, electronic equipment and storage medium
TWI770531B (en) Face recognition method, electronic device and storage medium thereof
US20230245344A1 (en) Electronic device and controlling method of electronic device
WO2023137923A1 (en) Person re-identification method and apparatus based on posture guidance, and device and storage medium
CN105608469A (en) Image resolution determination method and device

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20841368; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2021525569; Country of ref document: JP; Kind code of ref document: A. Ref document number: 20217014152; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20841368; Country of ref document: EP; Kind code of ref document: A1)

32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.09.2022))
