CN111462194A - Training method and device of object tracking model and storage medium

Info

Publication number: CN111462194A (granted as CN111462194B)
Application number: CN202010236365.XA
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: image, training, tracking, pixel coordinates, image acquisition
Inventors: 杨大鹏, 罗灿锋, 张祖良, 刘茂, 张俊杰, 黄春华, 过全, 周端继
Current and original assignee: Suzhou Keda Technology Co Ltd
Application filed by Suzhou Keda Technology Co Ltd; priority to CN202010236365.XA
Legal status: Granted; Active

Classifications

    • G06T 7/292: Physics > Computing > Image data processing or generation > Image analysis > Analysis of motion > Multi-camera tracking
    • G06T 2207/10016: Indexing scheme for image analysis or image enhancement > Image acquisition modality > Video; image sequence
    • G06T 2207/20081: Indexing scheme for image analysis or image enhancement > Special algorithmic details > Training; learning
    • Y02T 10/40: Climate change mitigation technologies related to transportation > Road transport of goods or passengers > Internal combustion engine [ICE] based vehicles > Engine management systems

Abstract

The application relates to a training method and device for an object tracking model and a storage medium, belonging to the field of computer technology. The method comprises: controlling a first image acquisition assembly to perform image acquisition on a training image set, obtaining the sample pixel coordinates, in the panoramic image, of the training object of each training image in the set; for each training image in the set, acquiring the expected rotation angle of a second image acquisition assembly relative to that training image; and performing model training with the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model. This solves the problem that random errors, which inevitably arise when object tracking devices of the same model are produced, lower the accuracy of tracking and shooting a target object: because each device can track objects with its own individually trained model, tracking accuracy is improved. Moreover, the training process is fully automatic, which improves model training efficiency.

Description

Training method and device of object tracking model and storage medium
Technical Field
The application relates to a training method and device for an object tracking model and a storage medium, and belongs to the field of computer technology.
Background
In a video conference, the speaker usually needs to be tracked and shot as the main subject, so an intelligent tracking camera must be capable of accurate tracking and positioning.
In a typical object tracking method, a tracking camera captures an image, and if a target object appears in the image, the target object is tracked and positioned.
However, because random errors arise while tracking cameras are produced, tracking accuracy for the target object can be low.
Disclosure of Invention
The application provides a training method and device for an object tracking model and a storage medium, which can automatically and quickly correct the model parameters of object tracking devices after production, effectively solving the problem that random errors, which are inevitable in the production of object tracking devices of the same model, lower tracking accuracy and are difficult to correct on a per-device basis. After the personalized correction, the method further provides a scheme for automatically evaluating the correction effect: the correction is considered complete only when the corrected model meets the requirements, guaranteeing that every device meets its usage requirements. The following technical solutions are provided:
in a first aspect, a method for training an object tracking model is provided, the method including:
controlling a first image acquisition assembly to acquire images of a training image set to obtain sample pixel coordinates of a training object of each training image in the training image set in a panoramic image; each training image in the training image set is located within the acquisition range of the first image acquisition assembly;
for each training image in the training image set, acquiring a desired rotation angle of a second image acquisition assembly relative to the training image, wherein the desired rotation angle enables a training object in the training image to be located in a desired image area of a tracking image acquired by the second image acquisition assembly;
performing model training by using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model; the object tracking model is used for determining the rotation angle of the second image acquisition assembly according to the pixel coordinates of the target object in the panoramic image in the object tracking process, so that the second image acquisition assembly can perform tracking shooting on the target object.
Optionally, the acquiring, for each training image in the set of training images, a desired rotation angle of the second image acquisition assembly relative to the training image comprises:
acquiring a preset angle calculation formula;
inputting the sample pixel coordinates of the training image into the angle calculation formula to obtain the initial rotation angle of the second image acquisition assembly;
controlling the second image acquisition assembly to rotate to the initial rotation angle;
acquiring pixel coordinates of a training object in the training image in a tracking image acquired by the second image acquisition assembly;
calculating an offset angle of the training object in a second image acquisition assembly using pixel coordinates in the tracking image;
determining the desired angle of rotation based on the initial angle of rotation and the offset angle.
Optionally, after controlling the second image capturing assembly to rotate to the initial rotation angle, the method further includes:
performing image recognition on the tracking image acquired by the second image acquisition assembly;
upon identifying a training object, determining whether the identified training object is the same as a training object in the training image;
and when the identified training object is the same as the training object in the training image, triggering and executing the step of acquiring the pixel coordinates of the training object in the training image in the tracking image acquired by the second image acquisition component.
Optionally, the calculating an offset angle of the training object in the second image acquisition assembly using pixel coordinates in the tracking image comprises:
acquiring a scaling parameter of the second image acquisition assembly;
determining the focal length of the second image acquisition assembly according to the scaling parameter;
and inputting the focal length of the second image acquisition assembly and the pixel coordinates in the tracking image into the angle calculation formula to obtain the offset angle.
Optionally, after the model training is performed by using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain the object tracking model, the method further includes:
controlling the first image acquisition assembly to acquire images of the test image set to obtain test pixel coordinates of a test object of each test image in the test image set in the panoramic image; the position of the test image set relative to the first image acquisition component is different from the position of the training image set relative to the first image acquisition component, and each test image in the test image set is located within the acquisition range of the first image acquisition component;
for each test image in the test image set, inputting the test pixel coordinates of the test image into the object tracking model to obtain the test rotation angle of the second image acquisition assembly;
controlling the second image acquisition assembly to rotate to the test rotation angle;
acquiring pixel coordinates of a test object in the test image in a tracking image acquired by the second image acquisition assembly after it has rotated to the test rotation angle;
determining an accuracy of the object tracking model based on a difference between pixel coordinates in the tracking image and the desired image region.
Optionally, the desired image region is a pixel center point of a tracking image, and the determining the accuracy of the object tracking model based on a difference between pixel coordinates in the tracking image and the desired image region includes:
and when the difference value between the pixel coordinate in the tracking image and the pixel center point does not meet the correction condition, determining that the object tracking model is inaccurate, and updating and correcting the object tracking model again until the difference value between the pixel coordinate in the tracking image and the pixel center point meets the correction condition.
Optionally, the correction condition includes:
for the pixel coordinates, in the tracking images, of the test objects of a plurality of test images, an average value of the differences between the horizontal pixel coordinates of the plurality of pixel coordinates and the horizontal pixel coordinate of the pixel center point is smaller than a first preset threshold, and a maximum value of those differences is smaller than a second preset threshold;
and the differences between the vertical pixel coordinates of the plurality of pixel coordinates and the vertical pixel coordinate of the pixel center point are each greater than a third preset threshold and less than a fourth preset threshold.
In a second aspect, there is provided an apparatus for training an object tracking model, the apparatus comprising:
the coordinate acquisition module is used for controlling the first image acquisition assembly to acquire images of the training image set to obtain sample pixel coordinates of a training object of each training image in the training image set in the panoramic image; each training image in the training image set is located within the acquisition range of the first image acquisition assembly;
an angle acquisition module, configured to acquire, for each training image in the training image set, an expected rotation angle of a second image acquisition component with respect to the training image, where the expected rotation angle enables a training object in the training image to be located in an expected image area of a tracking image acquired by the second image acquisition component;
the model training module is used for performing model training by using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model; the object tracking model is used for determining the rotation angle of the second image acquisition assembly according to the pixel coordinates of the target object in the panoramic image in the object tracking process, so that the second image acquisition assembly can perform tracking shooting on the target object.
In a third aspect, an apparatus for training an object tracking model is provided, the apparatus comprising a processor and a memory; the memory has stored therein a program that is loaded and executed by the processor to implement the method of training an object tracking model according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, in which a program is stored, the program being loaded and executed by the processor to implement the method for training an object tracking model according to the first aspect.
The beneficial effects of this application lie in the following: image acquisition is performed on a training image set by controlling a first image acquisition assembly, obtaining the sample pixel coordinates, in the panoramic image, of the training object of each training image in the set; for each training image in the set, the expected rotation angle of the second image acquisition assembly relative to that training image is acquired; and model training is performed with the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model. This solves the problem that random errors, which inevitably arise in the production of object tracking devices of the same model, lower the accuracy of tracking and shooting a target object: because the individually trained object tracking model can be used for object tracking, tracking accuracy is improved. Meanwhile, the training process is fully automatic, which improves model training efficiency.
The foregoing description is only an overview of the technical solutions of the present application. In order to make those solutions clearer and to enable implementation according to the contents of the description, preferred embodiments of the present application are described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a schematic structural diagram of a training system for an object tracking model according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a distribution of target objects in a training image set according to an embodiment of the present application;
FIG. 3 is a flow chart of a method for training an object tracking model provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of determining an angle of a human face relative to a first image acquisition assembly according to one embodiment of the present application;
FIG. 5 is a flow chart of a method for training an object tracking model according to another embodiment of the present application;
FIG. 6 is a block diagram of an apparatus for training an object tracking model provided in one embodiment of the present application;
fig. 7 is a block diagram of a training apparatus for an object tracking model according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below in conjunction with the accompanying drawings and examples. The following examples are intended to illustrate the present application but not to limit its scope.
Fig. 1 is a schematic structural diagram of a training system for an object tracking model according to an embodiment of the present application, and as shown in fig. 1, the system at least includes: a first image acquisition assembly 110, a second image acquisition assembly 120, and a control assembly 130.
The first image capturing component 110 is used to capture panoramic images. The first image capturing component 110 may also be referred to as a panoramic camera, etc., and the name of the first image capturing component is not limited in this embodiment.
The first image acquisition assembly 110 is communicatively coupled to the control assembly 130. The first image acquisition component 110 sends the acquired panoramic image to the control component 130 through a communication connection with the control component 130; alternatively, the first image capturing component 110 may also recognize the target object and send the pixel coordinates of the recognized target object to the control component 130.
Optionally, the control component 130 may be a device such as a mobile phone, a tablet computer, a computer, or a pan-tilt unit; this embodiment does not limit how the control component 130 is implemented.
In this embodiment, the control component 130 is configured to: controlling a first image acquisition component 110 to perform image acquisition on the training image set to obtain sample pixel coordinates of a training object of each training image in the training image set in the panoramic image; for each training image in the set of training images, obtaining a desired rotation angle of the second image acquisition assembly 120 relative to the training image; and carrying out model training by using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model.
The object tracking model is used for determining a rotation angle of the second image capturing component 120 according to pixel coordinates of the target object in the panoramic image in the object tracking process, so that the second image capturing component 120 performs tracking shooting on the target object.
The training image set includes a plurality of training images. A training image is an image that includes a training object. In the present application, the training object and the target object refer to the objects tracked by the second image acquisition component 120; each may be a human face, a vehicle, an animal, and the like. The type of the training object may be the same as or different from the type of the target object; this embodiment does not limit the types of the training object and the target object.
Each training image in the set of training images is located within the acquisition range of the first image acquisition assembly. The training images can be pasted on a wall panel or displayed through a display device (such as a television or a display screen); this embodiment does not limit how the training images are set up. The training images in the set are positioned in different directions relative to the first acquisition assembly, covering the acquisition range of the first image acquisition assembly as fully as possible. Referring to the arrangement of the training images 21 shown in fig. 2, the training images 21 are distributed in both the horizontal direction and the vertical direction.
In this embodiment, the training images are distributed in different directions, so that they cover the acquisition range of the first image acquisition assembly as uniformly as possible, making the training results more reliable. When the coverage of the training image set is large enough, multiple first and second image acquisition assemblies can be trained simultaneously, improving training efficiency. In addition, because the first and second image acquisition assemblies are trained with the training image set, no personnel are needed to move to different positions for simulated training, which shortens the training time of the object tracking model and improves its training efficiency and accuracy.
The sample pixel coordinates of the training object in the panoramic image are determined based on a coordinate system established with the center point of the panoramic image as the origin, the horizontal axis as the x-axis, and the vertical axis as the y-axis; alternatively, they may be determined based on a coordinate system with the lower-left vertex of the panoramic image as the origin, the horizontal bottom edge as the x-axis, and the vertical left edge as the y-axis. Other coordinate systems may also be used; this embodiment does not limit how the coordinate system is determined.
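As an aside (a minimal illustration only; the function name and signature are assumptions, not part of the application), a pixel coordinate in the lower-left-origin convention converts to the center-origin convention by a simple shift:

    def bottom_left_to_center_origin(x, y, width, height):
        # Shift a coordinate with origin at the lower-left vertex (x along the
        # bottom edge, y along the left edge) to an origin at the image center.
        return x - width / 2.0, y - height / 2.0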
The desired rotation angle is such that the training object in the training image is located within the desired image area of the tracking image acquired by the second image acquisition assembly 120. Wherein the desired image area is determined based on a pixel center point of the tracking image. Optionally, the desired rotation angle is such that the center point of the training object is located within a desired image area of the tracking image.
The control assembly 130 is also communicatively coupled to the second image capturing assembly 120. The second image acquisition component 120 is used to shoot the tracked object (including the training object, the target object, and, below, the test object) to obtain a tracking image. The shooting angle of the second image capturing assembly 120 can be rotated, for example 360 degrees in the horizontal direction and 180 degrees in the vertical direction. Optionally, the focal length of the second image acquisition assembly is variable. In practical implementations, the second image capturing component 120 may be a camera mounted on a pan-tilt unit, where the pan-tilt unit is electronic, i.e., it can drive the camera to rotate according to control instructions from the control component 130.
Optionally, the acquisition range of the second image acquisition assembly 120 is smaller than the acquisition range of the first image acquisition assembly 110. Optionally, after the second image capturing component 120 captures the tracking image, the captured tracking image data may be sent to the control component 130.
In one example, the second image capture assembly 120 is coaxial with and adjacent to the first image capture assembly 110, for example directly below or directly above it.
Of course, in other embodiments, the second image capturing component 120 may not be located on the same axis as the first image capturing component 110.
Optionally, the first image capturing assembly 110 and the second image capturing assembly 120 rotate synchronously; alternatively, the first image capturing assembly does not rotate in synchronization with the second image capturing assembly 120.
In this embodiment, only one first image acquisition assembly 110 and one second image acquisition assembly 120 are described as an example. In practical implementations, the control assembly 130 may be communicatively connected to a plurality of first image acquisition assemblies 110 and a plurality of second image acquisition assemblies 120, respectively; this embodiment does not limit the number of first and second image acquisition assemblies.
Alternatively, the control component 130 may be integrated in the second image acquisition component 120, i.e. the control component 130 is implemented as the same device as the second image acquisition component 120.
Fig. 3 is a flowchart of a training method for an object tracking model according to an embodiment of the present application. The method is applied to the training system for the object tracking model shown in fig. 1, and the control component 130 in that system is taken as the example execution subject of each step. The method at least comprises the following steps:
step 301, controlling a first image acquisition component to perform image acquisition on a training image set to obtain sample pixel coordinates of a training object of each training image in the training image set in a panoramic image.
Wherein each training image in the set of training images is located within the acquisition range of the first image acquisition assembly.
Optionally, the sample pixel coordinates, in the panoramic image, of the training object in a training image may be sent by the first image acquisition component (i.e., obtained after the first image acquisition component recognizes the target object in the panoramic image), or obtained by the control component recognizing the panoramic image sent by the first image acquisition component.
It should be added that, if the target object is recognized by the first image acquisition component, the panoramic image may be the image displayed in the viewfinder of the first image acquisition component, which is not accessible; alternatively, an accessible image captured by the first image acquisition component may be used.
Because it is the first image acquisition component or the control component that recognizes the training object in the training image, in the present application the sample pixel coordinates of a training image refer to the sample pixel coordinates, in the panoramic image, of the training object in that training image.
Optionally, the pixel coordinates of the training object in the panoramic image are determined based on a coordinate system established by taking the central point of the panoramic image as an origin, the horizontal axis as an x-axis and the vertical axis as a y-axis; or, the pixel coordinates may be determined based on a coordinate system established with the lower left vertex of the panoramic image as the origin, the horizontal bottom side as the x-axis, and the vertical left side as the y-axis, or, of course, the pixel coordinates may also be determined based on other coordinate systems, and the determination method of the coordinate system is not limited in this embodiment.
Optionally, the sample pixel coordinates of the training object in the panoramic image are the pixel coordinates of the training object's center point (such as the center point of a face); alternatively, the average of the pixel coordinates of all points of the training object.
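As a minimal sketch of this step (Python; the bounding-box format and function name are assumptions standing in for whatever the recognition component actually returns), the sample pixel coordinates can be taken as the center of the detected box:

    def sample_pixel_coordinates(box):
        # Center point of a detection box given as (x_min, y_min, x_max, y_max);
        # the box format is an illustrative assumption.
        x_min, y_min, x_max, y_max = box
        return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0

For example, a face detected at (620, 340, 700, 440) in the panoramic image yields the sample pixel coordinates (660.0, 390.0).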
Step 302, for each training image in the set of training images, a desired rotation angle of the second image capturing assembly relative to the training image is obtained.
The desired rotation angle is such that the training object in the training image is located within the desired image area of the tracking image acquired by the second image acquisition assembly. Wherein the desired image area is determined based on a pixel center point of the tracking image.
The second image acquisition assembly is used for tracking a target object to obtain a tracking image. The shooting angle of the second image acquisition assembly can be rotated.
The desired rotation angle of the second image capturing assembly includes a rotation angle in a horizontal direction and a rotation angle in a vertical direction.
Optionally, the control component sends a control instruction carrying the rotation angle to the second image acquisition component; after receiving the control instruction, the second image acquisition component rotates according to the rotation angle in it. Alternatively, the control component controls the device body (such as a pan-tilt unit) to rotate by the rotation angle, and the body drives the second image acquisition component to rotate.
In one example, acquiring, for each training image in the set of training images, the desired rotation angle of the second image acquisition assembly relative to the training image comprises: acquiring a preset angle calculation formula; inputting the sample pixel coordinates of the training image into the angle calculation formula to obtain the initial rotation angle of the second image acquisition assembly; controlling the second image acquisition assembly to rotate to the initial rotation angle; acquiring the pixel coordinates of the training object in the training image in the tracking image acquired by the second image acquisition assembly; calculating the offset angle of the training object in the second image acquisition assembly using the pixel coordinates in the tracking image; and determining the desired rotation angle based on the initial rotation angle and the offset angle.
In one example, the angle calculation formula is the following equation: tan α = fh / F.
Here α is the angle (horizontal or vertical) of the training object relative to the first image acquisition component, fh is the distance (horizontal or vertical) between the pixel coordinates of the training object and the image center of the first image acquisition component, and F is the focal length of the first image acquisition component.
Referring to the cross-sectional schematic of the training object and the first image acquisition component shown in fig. 4, the description takes the training object to be a face and the sample pixel coordinates of the training object to be the pixel coordinates of the face's center point. As can be seen from fig. 4, the imaging position (i.e., the pixel coordinates) of the face center point P in the first image acquisition assembly and the lens focal length F form a trigonometric relationship, from which the angle calculation formula is obtained.
In another example, in order to improve the accuracy of the angle calculation, the angle calculation formula is the following equation: tan α = fh / F + b, where b is a bias term whose value is initialized to a preset value.
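A minimal sketch of this calculation (Python; the function name, expressing the focal length in pixels, and applying the same formula independently to both axes are illustrative assumptions):

    import math

    def initial_rotation_angles(px, py, cx, cy, focal_px, bias=0.0):
        # Pan and tilt angles (radians) of the training object relative to the
        # first image acquisition component, from tan(alpha) = fh / F + b,
        # where fh is the pixel offset from the image center (cx, cy) and
        # bias is the preset initial value of b.
        pan = math.atan((px - cx) / focal_px + bias)
        tilt = math.atan((py - cy) / focal_px + bias)
        return pan, tilt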
Because the first image acquisition assembly captures multiple training images, a one-to-one correspondence between each training image's sample pixel coordinates and its desired rotation angle must be ensured. Therefore, after the second image acquisition assembly is controlled to rotate to the initial rotation angle, image recognition is performed on the tracking image it acquires; when a training object is recognized, it is determined whether the recognized training object is the same as the training object in the training image; only when the two are the same is the step of acquiring the pixel coordinates, in the tracking image acquired by the second image acquisition component, of the training object in the training image executed.
In one example, the second image capturing component is a tracking camera whose focal length varies, so the current focal length of the second image capturing component must be acquired first and the offset angle then calculated from it. Specifically, calculating the offset angle of the training object in the second image acquisition assembly using the pixel coordinates in the tracking image comprises the following steps: acquiring the scaling parameter of the second image acquisition component; determining the focal length of the second image acquisition assembly according to the scaling parameter; and inputting the focal length of the second image acquisition assembly and the pixel coordinates in the tracking image into the angle calculation formula to obtain the offset angle.
The focal length of the second image capturing component is determined from the scaling parameter by the following formula (devices of other types may have different formulas; this embodiment does not limit the specific formula):
1/F' = [tan(a × zoom + b)] / c
where F' is the focal length of the second image capturing component and zoom is the scaling parameter; a, b and c are numerical values obtained by fitting against the relevant parameters of the second image acquisition assembly, and their values differ between image acquisition assemblies of different models. In this case, fh in the angle calculation formula is the distance (horizontal or vertical) from the pixel coordinates of the target object in the tracking image to the image center of the second acquisition assembly, and F is the focal length F' of the second image capturing component. As one example, the focal-length fit formula for a camera of a certain model is:
1/F' = [tan(-0.0001 × zoom + 0.785398)] / 1350
after the offset angle is obtained, the initial rotation angle is added to the offset angle to obtain the desired rotation angle.
Step 303, performing model training by using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model.
The object tracking model is used for determining the rotation angle of the second image acquisition assembly according to the pixel coordinates of the target object in the panoramic image so that the second image acquisition assembly can perform tracking shooting on the target object.
Optionally, performing model training using the sample pixel coordinates and the expected rotation angles to obtain an object tracking model includes: training the parameters of the angle calculation formula using the sample pixel coordinates of each training image and the corresponding expected rotation angle, for example, training the values of 1/F and b in the angle calculation formula, to obtain the object tracking model.
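Assuming the model keeps the form tan α = k·fh + b with k = 1/F, the training step can be sketched as an ordinary least-squares fit (Python; the names and the choice of least squares are illustrative assumptions, since the application does not prescribe a particular fitting algorithm):

    import numpy as np

    def fit_angle_formula(fh, desired_angles):
        # Fit k = 1/F and the bias b in tan(alpha) = k*fh + b from the sample
        # pixel offsets fh (one axis) and the corresponding desired rotation
        # angles (radians) gathered for the training images.
        A = np.stack([np.asarray(fh, dtype=float), np.ones(len(fh))], axis=1)
        k, b = np.linalg.lstsq(A, np.tan(desired_angles), rcond=None)[0]
        return k, b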
In summary, in the training method for an object tracking model provided by this embodiment, the first image acquisition component is controlled to perform image acquisition on the training image set, obtaining the sample pixel coordinates, in the panoramic image, of the training object of each training image in the set; for each training image in the set, the expected rotation angle of the second image acquisition assembly relative to that training image is acquired; and model training is performed with the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model. This solves the problem that random errors, which inevitably arise in the production of object tracking devices of the same model, lower the accuracy of tracking and shooting a target object: because the individually trained object tracking model can be used for object tracking, tracking accuracy is improved. Meanwhile, the training process is fully automatic, which improves model training efficiency.
In addition, when acquiring the expected rotation angle corresponding to each sample pixel coordinate, an initial rotation angle of the second image acquisition assembly is first determined from the sample pixel coordinates, and the second image acquisition assembly is controlled to rotate to that initial angle; the offset angle of the second image acquisition assembly is then calculated, and the initial rotation angle is fine-tuned so that the training object lies within the expected image area of the tracking image. Once the second image acquisition assembly has rotated to the initial rotation angle, the training image is usually already within the tracking image; this avoids the inefficiency of determining the expected rotation angle by rotating the second image acquisition assembly at random, in which case the training image is usually not within the tracking image. The efficiency of determining the expected rotation angle is therefore improved.
Optionally, after the object tracking model is obtained through the above embodiment, the object tracking model needs to be evaluated to determine the accuracy of the tracking model. At this time, referring to fig. 5, after step 303, at least the following steps are further included:
step 501, controlling a first image acquisition component to acquire images of a test image set to obtain test pixel coordinates of a test object of each test image in the test image set in a panoramic image.
The position of the test image set relative to the first image acquisition assembly is different from the position of the training image set relative to the first image acquisition assembly, and each test image in the test image set is located in the acquisition range of the first image acquisition assembly.
Alternatively, the test object may be the same type of object as the training object and the target object; or objects of different types.
Optionally, the test image set is the same or different from the training image set.
The relevant description of this step is detailed in step 301, and the difference is that the training image set is replaced by the test image set, the training image is replaced by the test image, and the training object is replaced by the test object.
Step 502, for each test image in the test image set, inputting the test pixel coordinates of the test image into the object tracking model to obtain the test rotation angle of the second image acquisition assembly.
The object tracking model reflects the mapping relation between the pixel coordinates of the target object in the panoramic image and the rotation angle of the second image acquisition assembly, so that the test rotation angle corresponding to the test pixel coordinates can be obtained after the test pixel coordinates are input into the object tracking model.
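As a minimal sketch (assuming the fitted one-axis form tan α = k·fh + b from the training sketch above), applying the model at test time is just the same mapping evaluated at the test pixel coordinates:

    import math

    def test_rotation_angle(test_px, center_px, k, b):
        # Test rotation angle (radians) from the trained mapping
        # tan(alpha) = k*fh + b, with fh the offset to the panorama center.
        return math.atan(k * (test_px - center_px) + b)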
Step 503, controlling the second image capturing assembly to rotate to the testing rotation angle.
Optionally, the control component sends a control instruction carrying the test rotation angle to the second image acquisition component; after receiving the control instruction, the second image acquisition component rotates according to the test rotation angle in it. Alternatively, the control component controls the device body (such as a pan-tilt unit) to rotate by the test rotation angle, and the body drives the second image acquisition component to rotate.
And step 504, acquiring the pixel coordinates, in the tracking image acquired by the second image acquisition assembly after it has rotated to the test rotation angle, of the test object in the test image.
Optionally, the pixel coordinates of the test object in the tracking image may be sent by the second image capturing component (i.e. obtained after the second image capturing component identifies the target object in the tracking image); or the control component identifies the tracking image sent by the second image acquisition component.
Step 505, the accuracy of the object tracking model is determined based on the difference between the pixel coordinates in the tracking image and the desired image area.
Optionally, the desired image area is the pixel center point of the tracking image, and determining the accuracy of the object tracking model based on the difference between the pixel coordinates in the tracking image and the desired image area includes: when the difference between the pixel coordinates in the tracking image and the pixel center point does not satisfy the correction condition, determining that the object tracking model is inaccurate, and updating and correcting the object tracking model again (i.e., performing steps 301 to 303 again) until the difference between the pixel coordinates in the tracking image and the pixel center point satisfies the correction condition.
In one example, the correction condition includes: for the pixel coordinates, in the tracking images, of the test objects of the plurality of test images, the average of the differences between the horizontal pixel coordinates and the horizontal pixel coordinate of the pixel center point is smaller than a first preset threshold, and the maximum of those differences is smaller than a second preset threshold; and each difference between the vertical pixel coordinates and the vertical pixel coordinate of the pixel center point is greater than a third preset threshold and less than a fourth preset threshold.
The correction conditions are represented by the following formulas:
( |x_1| + |x_2| + … + |x_n| ) / n < a, and max( |x_1|, …, |x_n| ) < b
c < y_i < d
where x_i is the difference between the i-th of the n horizontal pixel coordinates and the horizontal pixel coordinate of the pixel center point, a is the first preset threshold, and b is the second preset threshold; y_i is the difference between the i-th vertical pixel coordinate and the vertical pixel coordinate of the pixel center point, c is the third preset threshold, and d is the fourth preset threshold.
Of course, the correction condition may also take other forms, for example: for the vertical pixel coordinates of the plurality of pixel coordinates, the average of the differences between the vertical pixel coordinates and the vertical pixel coordinate of the pixel center point is smaller than a first preset threshold, and the maximum of those differences is smaller than a second preset threshold.
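A sketch of the check in this example (Python; taking absolute values of the horizontal differences is an assumption, since the sign convention is not spelled out above):

    import numpy as np

    def correction_satisfied(dx, dy, a, b, c, d):
        # dx, dy: per-test-image differences between the test object's pixel
        # coordinates in the tracking image and the pixel center point;
        # a..d are the first to fourth preset thresholds.
        dx = np.abs(np.asarray(dx, dtype=float))
        dy = np.asarray(dy, dtype=float)
        horizontal_ok = dx.mean() < a and dx.max() < b
        vertical_ok = bool(np.all((dy > c) & (dy < d)))
        return horizontal_ok and vertical_ok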
In summary, the training method for an object tracking model provided by this embodiment verifies the accuracy of the object tracking model and, when the object tracking model does not meet the correction condition, corrects it again until it meets the correction condition; this improves the accuracy of tracking and positioning with the tracking model.
Fig. 6 is a block diagram of an apparatus for training an object tracking model according to an embodiment of the present application, and this embodiment takes the control component 130 of the apparatus applied in the system for training an object tracking model shown in fig. 1 as an example for explanation. The device at least comprises the following modules: coordinate acquisition module 610, angle acquisition module 620, and model training module 630.
The coordinate acquisition module 610 is configured to control a first image acquisition component to perform image acquisition on a training image set, so as to obtain sample pixel coordinates of a training object of each training image in the training image set in a panoramic image; each training image in the set of training images is located within an acquisition range of the first image acquisition assembly.
An angle obtaining module 620, configured to obtain, for each training image in the training image set, a desired rotation angle of the second image capturing component relative to the training image, where the desired rotation angle enables a training object in the training image to be located in a desired image area of the tracking image captured by the second image capturing component.
A model training module 630, configured to perform model training using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model; the object tracking model is used for determining the rotation angle of the second image acquisition assembly according to the pixel coordinates of the target object in the panoramic image in the object tracking process, so that the second image acquisition assembly can perform tracking shooting on the target object.
Optionally, the angle obtaining module 620 includes: an angle formula acquisition unit 621, an initial angle calculation unit 622, a rotation angle control unit 623, a pixel coordinate acquisition unit 624, an offset angle acquisition unit 625, and a rotation angle determination unit 626.
An angle formula obtaining unit 621, configured to obtain a preset angle calculation formula;
an initial angle calculation unit 622, configured to input the sample pixel coordinates of the training image into the angle calculation formula, so as to obtain an initial rotation angle of the second image acquisition component;
a rotation angle control unit 623, configured to control the second image capturing assembly to rotate to the initial rotation angle;
a pixel coordinate obtaining unit 624, configured to obtain pixel coordinates of the training object in the training image in the tracking image collected by the second image collection component;
an offset angle obtaining unit 625, configured to calculate an offset angle of the training object in the second image acquisition component using the pixel coordinates in the tracking image;
a rotation angle determination unit 626 for determining the desired rotation angle based on the initial rotation angle and the offset angle.
Optionally, the angle obtaining module 620 further includes: the object recognition unit 627 is trained. The object identifying unit 627 is configured to: after the second image acquisition assembly is controlled to rotate to the initial rotation angle, carrying out image identification on the tracking image acquired by the second image acquisition assembly; upon identifying a training object, determining whether the identified training object is the same as a training object in the training image; when the identified training object is the same as the training object in the training image, the pixel coordinate obtaining unit 624 is triggered to perform the step of obtaining the pixel coordinates of the training object in the training image in the tracking image collected by the second image collecting component.
Optionally, the offset angle obtaining unit 625 is configured to:
acquiring a scaling parameter of the second image acquisition assembly;
determining the focal length of the second image acquisition assembly according to the scaling parameter;
and inputting the focal length of the second image acquisition assembly and the pixel coordinates in the tracking image into the angle calculation formula to obtain the offset angle.
Optionally, the apparatus further comprises: a test coordinate acquisition module 640, a test angle acquisition module 650, a rotation angle control module 660, a tracking coordinate acquisition module 670, and a tracking model verification module 680.
The test coordinate acquisition module 640 is configured to, after model training has been performed using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain the object tracking model, control the first image acquisition assembly to perform image acquisition on a test image set to obtain the test pixel coordinates, in the panoramic image, of the test object of each test image in the test image set; the position of the test image set relative to the first image acquisition component is different from the position of the training image set relative to the first image acquisition component, and each test image in the test image set is located within the acquisition range of the first image acquisition component;
a test angle obtaining module 650, configured to, for each test image in the test image set, input the test pixel coordinates of the test image into the object tracking model, so as to obtain a test rotation angle of the second image acquisition assembly;
a rotation angle control module 660, configured to control the second image capturing component to rotate to the test rotation angle;
a tracking coordinate obtaining module 670, configured to obtain the pixel coordinates, in the tracking image acquired by the second image acquisition component after it has rotated to the test rotation angle, of the test object in the test image;
a tracking model verification module 680 to determine an accuracy of the object tracking model based on differences between pixel coordinates in the tracking image and the desired image region.
Optionally, the desired image region is a pixel center point of a tracking image, and the tracking model verification module 680 is configured to:
and when the difference value between the pixel coordinate in the tracking image and the pixel center point does not meet the correction condition, determining that the object tracking model is inaccurate, and updating and correcting the object tracking model again until the difference value between the pixel coordinate in the tracking image and the pixel center point meets the correction condition.
Optionally, the correction condition includes:
for pixel coordinates of a test object in a plurality of test images in a tracking image, an average value of differences between horizontal pixel coordinates of the plurality of pixel coordinates and horizontal pixel coordinates of the pixel center point is smaller than a first preset threshold, and a maximum value of differences between the horizontal pixel coordinates of the plurality of pixel coordinates and the horizontal pixel coordinates of the pixel center point is smaller than a second preset threshold;
and the differences between the vertical pixel coordinates of the plurality of pixel coordinates and the vertical pixel coordinate of the pixel center point are each greater than a third preset threshold and less than a fourth preset threshold.
For relevant details reference is made to the above-described method embodiments.
It should be noted that when the training apparatus for an object tracking model provided by the above embodiment performs object tracking, the division into the above functional modules is only used as an example; in practical applications, the functions may be assigned to different functional modules as needed, i.e., the internal structure of the training apparatus for the object tracking model may be divided into different functional modules to complete all or part of the functions described above. In addition, the training apparatus for an object tracking model provided by the above embodiment and the embodiments of the training method for an object tracking model belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
FIG. 7 is a block diagram of an apparatus for training an object tracking model, which may be an apparatus including the control component 130 shown in FIG. 1, according to an embodiment of the present disclosure. The apparatus includes at least a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, for example a 4-core or an 8-core processor. The processor 701 may be implemented in at least one hardware form among a DSP (Digital Signal Processing), an FPGA (Field Programmable Gate Array) and a PLA (Programmable Logic Array).
The processor 701 may also include a main processor and a coprocessor. The main processor, also referred to as a CPU (Central Processing Unit), is a processor for processing data in the awake state; the coprocessor is a low-power processor for processing data in the standby state.
Memory 702 may include one or more computer-readable storage media, which may be non-transitory. Memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 702 is used to store at least one instruction for execution by processor 701 to implement a method of training an object tracking model provided by method embodiments herein.
In some embodiments, the training device for the object tracking model may further include a peripheral interface and at least one peripheral. The processor 701, the memory 702, and the peripheral interface may be connected by buses or signal lines. Each peripheral may be connected to the peripheral interface via a bus, a signal line, or a circuit board. Illustratively, peripherals include, but are not limited to: a radio frequency circuit, a display screen, an audio circuit, a power supply, and the like.
Of course, the training apparatus for the object tracking model may also include fewer or more components, which is not limited by the embodiment.
Optionally, the present application further provides a computer-readable storage medium, in which a program is stored, and the program is loaded and executed by a processor to implement the training method for the object tracking model of the above method embodiment.
Optionally, the present application further provides a computer product, which includes a computer-readable storage medium, in which a program is stored, and the program is loaded and executed by a processor to implement the training method for the object tracking model of the above method embodiment.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A method of training an object tracking model, the method comprising:
controlling a first image acquisition assembly to acquire images of a training image set to obtain sample pixel coordinates of a training object of each training image in the training image set in a panoramic image; each training image in the training image set is located within the acquisition range of the first image acquisition assembly;
for each training image in the training image set, acquiring a desired rotation angle of a second image acquisition assembly relative to the training image, wherein the desired rotation angle enables a training object in the training image to be located in a desired image area of a tracking image acquired by the second image acquisition assembly;
performing model training by using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model; the object tracking model is used for determining the rotation angle of the second image acquisition assembly according to the pixel coordinates of the target object in the panoramic image in the object tracking process, so that the second image acquisition assembly can perform tracking shooting on the target object.
2. The method of claim 1, wherein said obtaining, for each training image in the set of training images, a desired angle of rotation of a second image acquisition assembly relative to the training image comprises:
acquiring a preset angle calculation formula;
inputting the sample pixel coordinates of the training image into the angle calculation formula to obtain the initial rotation angle of the second image acquisition assembly;
controlling the second image acquisition assembly to rotate to the initial rotation angle;
acquiring pixel coordinates of a training object in the training image in a tracking image acquired by the second image acquisition assembly;
calculating an offset angle of the training object in the second image acquisition assembly by using the pixel coordinates in the tracking image;
determining the desired rotation angle based on the initial rotation angle and the offset angle.
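Read procedurally, claim 2 is a measure-and-correct loop. The sketch below assumes hypothetical device helpers rotate_to, capture_object_pixel, and offset_angle (the last corresponding to claim 4); none of these names come from the patent.

def desired_rotation_angle(sample_uv, angle_formula, ptz_camera):
    # Coarse angle from the preset angle calculation formula.
    initial_pan, initial_tilt = angle_formula(sample_uv)
    # Rotate the second image acquisition assembly to the initial angle.
    ptz_camera.rotate_to(initial_pan, initial_tilt)
    # Pixel coordinates where the training object actually landed.
    track_uv = ptz_camera.capture_object_pixel()
    # Residual offset angle computed from that pixel (see claim 4).
    d_pan, d_tilt = ptz_camera.offset_angle(track_uv)
    # Desired rotation angle = initial rotation angle + offset angle.
    return initial_pan + d_pan, initial_tilt + d_tilt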
3. The method of claim 2, wherein after controlling the second image acquisition assembly to rotate to the initial rotation angle, the method further comprises:
performing image recognition on the tracking image acquired by the second image acquisition assembly;
when a training object is identified, determining whether the identified training object is the same as the training object in the training image;
and when the identified training object is the same as the training object in the training image, triggering execution of the step of acquiring the pixel coordinates of the training object in the training image in the tracking image acquired by the second image acquisition assembly.
4. The method of claim 2, wherein the calculating the offset angle of the training object in the second image acquisition assembly by using the pixel coordinates in the tracking image comprises:
acquiring a scaling parameter of the second image acquisition assembly;
determining the focal length of the second image acquisition assembly according to the scaling parameter;
and inputting the focal length of the second image acquisition assembly and the pixel coordinates in the tracking image into the angle calculation formula to obtain the offset angle.
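One plausible reading of claim 4 is a pinhole projection in which the zoom (scaling) parameter fixes the focal length in pixels; the patent does not disclose the exact angle calculation formula, so the following Python sketch rests on that assumption.

import math

def offset_angle(track_uv, image_center, focal_length_px):
    # Pixel offsets of the object from the optical center of the tracking image.
    du = track_uv[0] - image_center[0]
    dv = track_uv[1] - image_center[1]
    # Pinhole model: the angle between the optical axis and the ray through
    # the object's pixel, given the focal length expressed in pixels.
    pan_offset = math.degrees(math.atan2(du, focal_length_px))
    tilt_offset = math.degrees(math.atan2(dv, focal_length_px))
    return pan_offset, tilt_offset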
5. The method according to any one of claims 1 to 4, wherein after performing model training using the sample pixel coordinates and the corresponding desired rotation angle of each training image to obtain the object tracking model, the method further comprises:
controlling the first image acquisition assembly to acquire images of the test image set to obtain test pixel coordinates of a test object of each test image in the test image set in the panoramic image; the position of the test image set relative to the first image acquisition assembly is different from the position of the training image set relative to the first image acquisition assembly, and each test image in the test image set is located within the acquisition range of the first image acquisition assembly;
for each test image in the test image set, inputting the test pixel coordinates of the test image into the object tracking model to obtain the test rotation angle of the second image acquisition assembly;
controlling the second image acquisition assembly to rotate to the test rotation angle;
acquiring pixel coordinates of the test object in the test image in a tracking image acquired by the second image acquisition assembly after it has rotated to the test rotation angle;
determining an accuracy of the object tracking model based on a difference between pixel coordinates in the tracking image and the desired image region.
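As an illustration, the test loop of claim 5 might be driven as below; model_predict, rotate_to, and capture_object_pixel are hypothetical stand-ins for the trained model and the device control calls.

def run_test_set(model_predict, ptz_camera, test_pixel_coords):
    # Rotate the second assembly to each predicted test rotation angle and
    # record where the test object lands in the resulting tracking image.
    landed = []
    for uv in test_pixel_coords:
        pan, tilt = model_predict(uv)
        ptz_camera.rotate_to(pan, tilt)
        landed.append(ptz_camera.capture_object_pixel())
    return landed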
6. The method of claim 5, wherein the desired image region is the pixel center point of the tracking image, and wherein determining the accuracy of the object tracking model based on the difference between the pixel coordinates in the tracking image and the desired image region comprises:
when the difference between the pixel coordinates in the tracking image and the pixel center point does not meet a correction condition, determining that the object tracking model is inaccurate, and updating and correcting the object tracking model until the difference between the pixel coordinates in the tracking image and the pixel center point meets the correction condition.
7. The method of claim 6, wherein the correction condition comprises:
for the pixel coordinates of the test object in a plurality of test images in the tracking images, an average value of the differences between the horizontal pixel coordinates of the plurality of pixel coordinates and the horizontal pixel coordinate of the pixel center point is smaller than a first preset threshold, and a maximum value of those differences is smaller than a second preset threshold;
and each difference between the vertical pixel coordinates of the plurality of pixel coordinates and the vertical pixel coordinate of the pixel center point is greater than a third preset threshold and less than a fourth preset threshold.
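Once the test pixel coordinates are collected, the correction condition of claim 7 reduces to a few aggregate comparisons; a sketch follows, with the four preset thresholds left as parameters because the patent does not specify their values.

import numpy as np

def correction_condition_met(track_uvs, center, t1, t2, t3, t4):
    # track_uvs: pixel coordinates of the test object in several tracking
    # images; center: the pixel center point; t1..t4: the preset thresholds.
    uvs = np.asarray(track_uvs, dtype=float)
    du = np.abs(uvs[:, 0] - center[0])   # horizontal differences
    dv = uvs[:, 1] - center[1]           # vertical differences (signed)
    horizontal_ok = du.mean() < t1 and du.max() < t2
    vertical_ok = bool(np.all((dv > t3) & (dv < t4)))
    return horizontal_ok and vertical_ok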
8. An apparatus for training an object tracking model, the apparatus comprising:
the coordinate acquisition module is used for controlling the first image acquisition assembly to acquire images of the training image set to obtain sample pixel coordinates of a training object of each training image in the training image set in the panoramic image; each training image in the training image set is located within the acquisition range of the first image acquisition assembly;
an angle acquisition module, configured to acquire, for each training image in the training image set, an expected rotation angle of a second image acquisition assembly with respect to the training image, where the expected rotation angle enables a training object in the training image to be located in an expected image area of a tracking image acquired by the second image acquisition assembly;
the model training module is used for performing model training by using the sample pixel coordinates of each training image and the corresponding expected rotation angle to obtain an object tracking model; the object tracking model is used for determining the rotation angle of the second image acquisition assembly according to the pixel coordinates of the target object in the panoramic image in the object tracking process, so that the second image acquisition assembly can perform tracking shooting on the target object.
9. An apparatus for training an object tracking model, the apparatus comprising a processor and a memory; the memory has stored therein a program that is loaded and executed by the processor to implement the training method of the object tracking model according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium has stored therein a program which, when being executed by a processor, is adapted to carry out a method of training an object tracking model according to any one of claims 1 to 7.
CN202010236365.XA 2020-03-30 2020-03-30 Training method, device and storage medium of object tracking model Active CN111462194B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010236365.XA CN111462194B (en) 2020-03-30 2020-03-30 Training method, device and storage medium of object tracking model

Publications (2)

Publication Number Publication Date
CN111462194A (en) 2020-07-28
CN111462194B CN111462194B (en) 2023-08-11

Family

ID=71680242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010236365.XA Active CN111462194B (en) 2020-03-30 2020-03-30 Training method, device and storage medium of object tracking model

Country Status (1)

Country Link
CN (1) CN111462194B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679455A (en) * 2017-08-29 2018-02-09 平安科技(深圳)有限公司 Target tracker, method and computer-readable recording medium
CN109492506A (en) * 2017-09-13 2019-03-19 华为技术有限公司 Image processing method, device and system
CN110232706A (en) * 2019-06-12 2019-09-13 睿魔智能科技(深圳)有限公司 Multi-person follow-shooting method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111462194B (en) 2023-08-11

Similar Documents

Publication Title
CN110278382B (en) Focusing method, device, electronic equipment and storage medium
CN110866480B (en) Object tracking method and device, storage medium and electronic device
CN110850872A (en) Robot inspection method and device, computer readable storage medium and robot
CN110799921A (en) Shooting method and device and unmanned aerial vehicle
CN110491060B (en) Robot, safety monitoring method and device thereof, and storage medium
WO2021129305A1 (en) Calibration rod testing method for optical motion capture system, device, apparatus, and storage medium
CN111325798B (en) Camera model correction method, device, AR implementation equipment and readable storage medium
CN111683204A (en) Unmanned aerial vehicle shooting method and device, computer equipment and storage medium
CN109120854B (en) Image processing method, image processing device, electronic equipment and storage medium
CN110632582A (en) Sound source positioning method, device and storage medium
CN111028205A (en) Eye pupil positioning method and device based on binocular ranging
CN110136207B (en) Fisheye camera calibration system, fisheye camera calibration method, fisheye camera calibration device, electronic equipment and storage medium
CN113194263B (en) Gun and ball linkage control method and device, computer equipment and storage medium
US11514608B2 (en) Fisheye camera calibration system, method and electronic device
CN111815715A (en) Method and device for calibrating zoom pan-tilt camera and storage medium
CN106713740A (en) Positioning and tracking video shooting method and system
CN112887598A (en) Image processing method and device, shooting support, electronic equipment and readable storage medium
CN103581562A (en) Panoramic shooting method and panoramic shooting device
CN111213159A (en) Image processing method, device and system
WO2022141324A1 (en) Camera hardware-in-the-loop calibration and target setting method and system, and related device
US11166005B2 (en) Three-dimensional information acquisition system using pitching practice, and method for calculating camera parameters
CN112102415A (en) Depth camera external parameter calibration method, device and equipment based on calibration ball
CN111462194A (en) Training method and device of object tracking model and storage medium
CN109389645B (en) Camera self-calibration method and system, camera, robot and cloud server
CN111460972B (en) Object tracking method, device and storage medium

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant