CN117351067A - Image processing method and device and electronic equipment


Info

Publication number
CN117351067A
Authority
CN
China
Prior art keywords
image
pose
rotation amount
determining
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210754294.1A
Other languages
Chinese (zh)
Inventor
郭亨凯
朱炎
张永杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd
Priority to CN202210754294.1A
Publication of CN117351067A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1656 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Studio Devices (AREA)

Abstract

The disclosure provides an image processing method, an image processing apparatus, and an electronic device. The method includes: acquiring a plurality of first images of a first object, wherein the plurality of first images are images obtained by a plurality of image capturing devices capturing the first object at the same time; determining a visual marker in each first image, and acquiring the rotation amount of the first object when the plurality of image capturing devices capture the first object; and determining a first pose of the first object according to the visual markers in the first images and the rotation amount. The accuracy of pose estimation is thereby improved.

Description

Image processing method and device and electronic equipment
Technical Field
The disclosure relates to the technical field of computer vision, and in particular relates to an image processing method, an image processing device and electronic equipment.
Background
In the field of computer vision technology, an electronic device may perform pose estimation on an object in an image, and further track the object in the image.
Currently, a plurality of visual markers can be set on an object. After a monocular camera captures the object, it sends an image including the visual markers to the electronic device, and the electronic device estimates the pose of the object in the image according to the visual markers. However, when the object moves quickly, the image captured by the monocular camera is blurred (for example, the visual markers are blurred or are not captured at all), so the electronic device cannot accurately estimate the pose of the object through the visual markers, and the pose estimation accuracy is low.
Disclosure of Invention
The disclosure provides an image processing method, an image processing apparatus, and an electronic device, which are used to solve the technical problem of low pose estimation accuracy in the prior art.
In a first aspect, the present disclosure provides an image processing method, the method comprising:
acquiring a plurality of first images of a first object, wherein the plurality of first images are images obtained by a plurality of image capturing devices capturing the first object at the same time;
determining a visual marker in each first image;
acquiring the rotation amount of the first object when the plurality of image capturing devices capture the first object;
and determining a first pose of the first object according to the visual markers in the plurality of first images and the rotation amount.
In a second aspect, the present disclosure provides an image processing apparatus including a first acquisition module, a first determination module, a second acquisition module, and a second determination module, wherein:
the first acquisition module is configured to acquire a plurality of first images of a first object, wherein the plurality of first images are images obtained by a plurality of image capturing devices capturing the first object at the same time;
the first determination module is configured to determine a visual marker in each first image;
the second acquisition module is configured to acquire the rotation amount of the first object when the plurality of image capturing devices capture the first object;
the second determination module is configured to determine a first pose of the first object according to the visual markers in the plurality of first images and the rotation amount.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: a processor and a memory;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory to perform the image processing method as described above in the first aspect and the various possible aspects of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, implement the image processing method as described in the first aspect and the various possible aspects of the first aspect.
In a fifth aspect, embodiments of the present disclosure provide a computer program product comprising a computer program which, when executed by a processor, implements the image processing method as described above in the first aspect and the various possible aspects of the first aspect.
The disclosure provides an image processing method, an image processing apparatus, and an electronic device. A plurality of first images of a first object are acquired, the first images being images obtained by a plurality of image capturing devices capturing the first object at the same time; the visual marker in each first image is determined; the rotation amount of the first object when the plurality of image capturing devices capture it is acquired; and the first pose of the first object is determined according to the visual markers in the first images and the rotation amount. Because the plurality of first images are captured by different image capturing devices at the same time, even when the first object moves quickly the electronic device can accurately acquire the visual marker information of the first object through the plurality of first images and, combining the rotation amount of the first object, accurately estimate its pose, thereby improving the accuracy of pose estimation.
Drawings
fig. 1 is a schematic view of an application scenario provided in an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of an image processing method according to an embodiment of the present disclosure;
fig. 3 is a schematic illustration of a visual marker provided by an embodiment of the present disclosure;
fig. 4 is a flowchart of a method for determining a first pose according to an embodiment of the present disclosure;
fig. 5 is a schematic diagram of a process for acquiring a pose of a first object according to an embodiment of the present disclosure;
fig. 6 is a schematic diagram of another process for acquiring a pose of a first object according to an embodiment of the present disclosure;
fig. 7 is a process schematic diagram of an image processing method according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure; and
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
For ease of understanding, concepts related to the embodiments of the present disclosure will be first described.
Electronic device: a device with a wireless transceiving function. An electronic device may be deployed on land, including indoors or outdoors, hand-held, wearable, or vehicle-mounted; it may also be deployed on the water surface (for example, on a ship). The electronic device may be a mobile phone, a tablet (Pad), a computer with a wireless transceiving function, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a vehicle-mounted device, a wireless terminal in self-driving, a wireless device in remote medicine, a wireless device in a smart grid, a wireless device in transportation safety, a wireless device in a smart city, a wireless device in a smart home, a wearable device, and so on. The electronic device in the embodiments of the present disclosure may also be referred to as a terminal, user equipment (UE), an access device, a vehicle-mounted terminal, an industrial control terminal, a UE unit, a UE station, a mobile station, a remote device, a mobile device, a UE device, a wireless communication device, a UE agent, a UE apparatus, or the like. The electronic device may be stationary or mobile.
In the related art, a plurality of visual markers can be set on a tracked object. When the tracked object moves, a monocular camera captures it to obtain a tracking image including the tracked object and sends that image to an electronic device. On receiving a single tracking image, the electronic device can identify the tracked object in the image, acquire the visual markers on it, and estimate its pose from those markers. However, when the tracked object moves quickly, the monocular camera cannot capture it clearly, so the electronic device cannot acquire enough visual markers to estimate the pose of the tracked object, and the pose estimation accuracy is low.
To solve this technical problem of low pose estimation accuracy in the related art, an embodiment of the present disclosure provides an image processing method. An electronic device acquires a plurality of first images obtained by a plurality of image capturing devices capturing a first object at the same time, acquires the visual marker in each first image and the rotation amount of the first object at the time of capture, determines M second poses corresponding to the plurality of first images according to the visual markers (where M is an integer greater than or equal to 1), and determines the first pose according to the M second poses and the rotation amount. Because the plurality of first images are captured by different image capturing devices at the same time, even when the first object moves quickly the electronic device can obtain the M second poses from the clear visual markers in the plurality of first images and then accurately estimate the pose of the first object through the M second poses and the rotation amount, thereby improving the pose estimation accuracy.
Next, an application scenario of the embodiment of the present disclosure will be described with reference to fig. 1.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present disclosure. Referring to fig. 1, the scenario includes an electronic device, an image capturing device A, an image capturing device B, and a first object. The first object carries a plurality of preset visual markers (not shown in fig. 1) and an inertial sensor for acquiring the rotation amount. The electronic device is communicatively connected to image capturing device A and image capturing device B, respectively. When the first object moves, image capturing device A captures it to obtain a first image A, and image capturing device B captures it to obtain a first image B. Image capturing device A sends first image A to the electronic device, image capturing device B sends first image B to the electronic device, and the inertial sensor sends the rotation amount of the first object to the electronic device. The electronic device determines the pose of the first object according to the visual markers in first image A, the visual markers in first image B, and the rotation amount. In this way, even when the first object moves quickly, the electronic device can accurately acquire the visual marker information of the first object through first image A and first image B and, combining the rotation amount, accurately estimate the pose of the first object, thereby improving the pose estimation accuracy.
The following describes the technical solutions of the present disclosure and how the technical solutions of the present disclosure solve the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present disclosure will be described below with reference to the accompanying drawings.
Fig. 2 is a flowchart of an image processing method according to an embodiment of the disclosure. Referring to fig. 2, the method may include:
s201, acquiring a plurality of first images of a first object.
The execution subject of the embodiments of the present disclosure may be an electronic device, or may be an image processing apparatus provided in an electronic device. The image processing device may be implemented by software, or the image processing device may be implemented by a combination of software and hardware.
Optionally, the first object is an object whose pose is to be estimated; for example, it may be a movable object such as a user or an airplane. The first object may include a plurality of visual markers used to mark it; for example, a visual marker may be a sticker, a two-dimensional code, a preset pattern, or any other mark clearly distinguishable from the first object. The pose of the first object may be its six degrees of freedom. For example, by estimating the six degrees of freedom of the first object, its motion in space can be simulated and applied to various training simulators (such as a flight simulator, a driving simulator, an earthquake simulator, or a dynamic movie).
Alternatively, a plurality of visual markers may be set on the first object in advance. For example, a plurality of visual markers may be set on the surface of the first object in advance, and the electronic device may estimate the current posture of the first object by detecting the plurality of visual markers. For example, when the first object is a user, a sticker may be provided at a plurality of positions such as an extremity, a trunk, etc. of the user, and the electronic device determines the posture of the user by detecting the sticker.
Next, a visual marker of the first object will be described with reference to fig. 3.
Fig. 3 is a schematic diagram of a visual marker provided in an embodiment of the present disclosure. Referring to fig. 3, the figure includes a first object, a visual marker A, a visual marker B, and a visual marker C. The first object is an automobile, and visual markers A, B, and C are identical smiling-face icons: visual marker A is set at the front of the first object, visual marker B in the middle, and visual marker C at the rear. Optionally, in the embodiment shown in fig. 3, the first object may carry many visual markers; for example, arranging visual markers on all six faces of the first object improves the detection rate of the markers and thus the accuracy of pose estimation.
Alternatively, the first image may be an image of the first object captured by the image capturing apparatus. For example, when the first object is a user, the first image may be an image of the user captured by a camera, and when the first object is an automobile, the first image may be an image of the automobile captured by the camera.
Optionally, the plurality of first images are images obtained by the plurality of image capturing devices capturing the first object at the same time; for example, a plurality of cameras may simultaneously capture the first object from different angles, thereby obtaining a plurality of first images. For instance, if at a preset time the image of the first object captured by camera A is image A, the image captured by camera B is image B, and the image captured by camera C is image C, then the first images corresponding to the first object include image A, image B, and image C, all obtained by the 3 cameras capturing the first object at the same time.
Optionally, before the image capturing apparatuses capture the first object, external parameters between the plurality of image capturing apparatuses are calibrated. For example, in the practical application process, external parameters between a plurality of cameras need to be calibrated first, and when calibration is completed, a first image of a first object can be acquired through the plurality of cameras.
Alternatively, the electronic device may receive the first image captured by the image capturing device. For example, the electronic device may be connected (e.g., communicatively connected, etc.) to a plurality of image capturing devices, and after the plurality of image capturing devices capture a first object, a first image corresponding to the first object may be sent to the electronic device, where the first image includes a timestamp, and the electronic device may determine the first image captured by the plurality of image capturing devices at the same time by using the timestamp, and further determine the pose of the first object at the time by using the first image.
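As a minimal sketch of this grouping step (assuming a simple in-memory frame structure; the disclosure does not specify a data layout, so the field names here are illustrative), the first images belonging to one capture instant can be collected by their timestamps:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any

@dataclass
class CapturedFrame:
    camera_id: str     # which image capturing device produced the frame
    timestamp_ms: int  # capture time stamped by the device
    image: Any         # pixel data, e.g. a numpy array

def group_first_images(frames):
    """Group frames so that each group holds the first images captured
    by the plurality of image capturing devices at the same time."""
    groups = defaultdict(list)
    for frame in frames:
        groups[frame.timestamp_ms].append(frame)
    return groups
```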
S202, determining visual marks in each first image.
Optionally, for any first image, the visual marker in that image may be determined according to the following possible implementation: determine whether the first image is the first frame image of the first object captured by the image capturing device. For example, if the image capturing device starts capturing the first object at time A, a first image acquired at time A is determined to be the first frame image, while a first image acquired after time A is determined not to be the first frame image.
If the first image is the first frame image of the first object captured by the image capturing device, the first image is detected to obtain the visual marker in it. In this case the electronic device has not yet detected any visual markers, so it detects the visual marker in the first image through a preset detection algorithm.
If the first image is not the first frame image of the first object captured by the image capturing device, tracking information of the electronic device for a second image is acquired, and the visual marker is obtained according to the tracking information, where the second image is the previous frame image of the first image. Optionally, the tracking information indicates whether the electronic device successfully tracked the first object in the second image, and it may be obtained from the result of optical flow tracking: if optical flow tracking succeeded on the first object in the second image, the tracking information is determined to indicate that tracking succeeded; otherwise, it is determined to indicate that tracking failed.
Optionally, the visual marker in the first image is determined according to the tracking result of the first object, specifically: if the tracking information indicates that the electronic device failed to track the first object in the second image, the first image is detected to obtain the visual marker in it. In this case the visual markers in the second image cannot be propagated to the first image by tracking, so the electronic device detects the visual marker in the first image through the preset detection algorithm.
If the tracking information indicates that the electronic device successfully tracked the first object in the second image, the visual marker in the first image is obtained according to the tracking result of the first object. For example, if tracking succeeded, the electronic device can propagate the visual markers from the second image directly into the first image through the tracking result, thereby obtaining the visual markers in the first image. In this way, when tracking of the first object in the previous frame succeeds, the electronic device does not need to detect the visual markers frame by frame; this improves detection efficiency, reduces detection failures, and thus improves the stability of pose estimation for the first object.
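The detect-or-track decision above can be sketched as follows. This is an illustrative assumption, not the patent's exact algorithm: Lucas-Kanade optical flow (OpenCV's `cv2.calcOpticalFlowPyrLK`) stands in for the unspecified tracking step, and `detect_markers` stands in for the unspecified preset detection algorithm.

```python
import cv2
import numpy as np

def markers_in_first_image(first_image, prev_image=None, prev_markers=None,
                           detect_markers=None):
    """Return marker positions in `first_image`: detect on the first frame
    or after a tracking failure, otherwise track from the previous frame."""
    if prev_image is None or prev_markers is None:
        return detect_markers(first_image)   # first frame image: full detection
    pts_prev = np.asarray(prev_markers, dtype=np.float32).reshape(-1, 1, 2)
    pts_new, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_image, first_image, pts_prev, None)
    if status is None or not status.all():
        return detect_markers(first_image)   # tracking failed: re-detect
    return pts_new.reshape(-1, 2)            # tracking succeeded: reuse result
```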
S203, acquiring the rotation amount of the first object when the plurality of image capturing devices capture the first object.
Alternatively, the rotation amount of the first object is the rotation angle of the first object with respect to the image capturing apparatus. Optionally, the first object includes an inertial sensor, and the electronic device may acquire the rotation amount of the first object through the inertial sensor. For example, before the rotation amount of the first object is acquired by the inertial sensor, external parameters between the inertial sensor and the first object need to be calibrated in advance. Alternatively, the rotation amount of the first object is an absolute value with respect to the initial value (e.g., 0). For example, the first object may be rotated in any direction, and the inertial sensor may acquire an absolute value of the rotation of the first object.
Optionally, the rotation amount of the first object may be obtained according to the following possible implementation: determine whether the inertial sensor has been calibrated when the image capturing device captures the first image. Calibrating the inertial sensor unifies the rotation amount acquired by the inertial sensor and the visual markers captured by the image capturing device into the same coordinate system; once the inertial sensor is calibrated, the electronic device can represent both in one coordinate system and perform pose estimation by combining the visual markers and the rotation amount.
If the inertial sensor has been calibrated, the rotation amount of the first object is acquired from the inertial sensor; for example, the electronic device can obtain the rotation angle of the first object through the sensor. If the inertial sensor has not been calibrated, the inertial sensor is calibrated, and the rotation amount of the first object is acquired once the accumulated rotation of the inertial sensor is greater than or equal to a preset threshold. For example, an uncalibrated inertial sensor must first accumulate a certain rotation amount before it can be calibrated online through an algorithm such as hand-eye calibration; when calibration is complete, the rotation amount of the first object can be acquired through the inertial sensor.
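A minimal sketch of S203, assuming a hypothetical IMU wrapper (`is_calibrated`, `accumulated_rotation_deg`, `run_hand_eye_calibration`, and the threshold value are all invented names for illustration); the hand-eye calibration routine itself is only named, not specified, in the disclosure:

```python
PRESET_ROTATION_THRESHOLD_DEG = 20.0  # hypothetical preset threshold

def rotation_amount(imu):
    """Return the absolute rotation of the first object, or None while
    the inertial sensor is not yet calibrated."""
    if imu.is_calibrated:
        return imu.absolute_rotation_deg()    # rotation w.r.t. initial value 0
    if imu.accumulated_rotation_deg() >= PRESET_ROTATION_THRESHOLD_DEG:
        imu.run_hand_eye_calibration()        # unify IMU and camera coordinates
        return imu.absolute_rotation_deg()
    return None  # rotation amount stays empty until calibration completes
```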
S204, determining a first pose of the first object according to the visual markers in the plurality of first images and the rotation amount.
Optionally, the first pose is the pose of the first object at the time the image capturing devices capture it. The first pose may be determined according to the following possible implementation: determine M second poses corresponding to the plurality of first images according to the visual markers of the plurality of first images, where a second pose is a pose determined from visual markers and M is an integer greater than or equal to 1 and less than or equal to the number of first images. For example, when a plurality of image capturing devices capture the first object, the first images captured by some of the devices may contain no visual markers or only a few (the capture angle of the device, the moving speed of the first object, and similar factors affect whether the markers are captured), so the number of second poses obtained by the electronic device may be smaller than the number of first images.
The first pose is then determined according to the M second poses and the rotation amount; for example, the M second poses may be fused, and the first pose of the first object determined in combination with the rotation amount.
The embodiment of the present disclosure thus provides an image processing method that acquires a plurality of first images of a first object, acquires the visual marker in each first image and the rotation amount of the first object at the time the plurality of image capturing devices capture it, determines M second poses corresponding to the plurality of first images according to the visual markers (M being an integer greater than or equal to 1), and determines the first pose according to the M second poses and the rotation amount. Because the plurality of first images are captured by different image capturing devices at the same time, even when the first object moves quickly the electronic device can obtain the M second poses from the clear visual markers in the plurality of first images and accurately estimate the pose of the first object through the M second poses and the rotation amount, thereby improving the pose estimation accuracy.
On the basis of the embodiment shown in fig. 2, a procedure of determining the first pose of the first object in the image processing method shown in fig. 2 will be described below with reference to fig. 4.
Fig. 4 is a flowchart of a method for determining a first pose according to an embodiment of the present disclosure. Referring to fig. 4, the method includes:
s401, determining M second positions corresponding to the plurality of first images according to the visual marks of the plurality of first images.
Where M is an integer greater than or equal to 1. Optionally, the M second poses may be determined according to the following possible implementation: obtain the visual markers corresponding to each first image and process them through a preset algorithm to obtain the second pose corresponding to that image. For example, for any first image, the electronic device acquires the visual markers in it and performs six-degree-of-freedom pose estimation through a PnP (Perspective-n-Point) algorithm, thereby obtaining the second pose corresponding to that first image. If the electronic device obtains the visual markers corresponding to 3 first images, it can obtain the second poses corresponding to those 3 first images through the visual markers of each image.
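One way to realize this per-image six-degree-of-freedom estimate is OpenCV's `solvePnP`, shown below as a sketch; the patent names PnP but does not prescribe a particular solver, and the marker coordinates in the object frame are assumed to be known from where the visual markers were placed:

```python
import cv2
import numpy as np

def second_pose(object_points, image_points, camera_matrix, dist_coeffs):
    """Estimate the second pose of the first object from one first image.

    object_points: Nx3 marker coordinates in the object frame;
    image_points:  Nx2 detected marker pixels in the first image;
    camera_matrix, dist_coeffs: intrinsics of the image capturing device.
    """
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(object_points, dtype=np.float64),
        np.asarray(image_points, dtype=np.float64),
        camera_matrix, dist_coeffs)
    if not ok:
        return None  # too few or unclear markers: no second pose for this image
    rotation, _ = cv2.Rodrigues(rvec)  # 3x3 rotation plus 3x1 translation
    return rotation, tvec
```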
S402, determining the first pose according to the M second poses and the rotation amount.
Optionally, in practice some first images may have no corresponding second pose; for example, if the definition of a first image is low or it contains few visual markers, the electronic device cannot perform six-degree-of-freedom pose estimation for it, and that image therefore has no corresponding second pose.
Optionally, determining the first pose according to the M second poses and the rotation amount includes the following two cases:
case 1: m is equal to 1.
If M is equal to 1, the first pose is determined according to the single second pose and the rotation amount. The rotation amount may or may not be null: if the inertial sensor has not been calibrated, the electronic device cannot acquire the rotation amount and the rotation amount is null; if the inertial sensor has been calibrated, the electronic device can acquire the rotation amount and it is non-null.
Optionally, the first pose may be determined according to the following possible implementation: if the rotation amount is null, the second pose is determined to be the first pose. For example, if the rotation amount is null and the electronic device obtains only one second pose corresponding to the first image, it takes that second pose as the first pose of the first object.
If the rotation amount is not null, the second pose and the rotation amount are fused to obtain the first pose. For example, if the rotation amount is not null and the electronic device obtains only one second pose, it can adjust the second pose based on the rotation amount through a fusion algorithm to obtain the first pose of the first object. In this way, the electronic device can flexibly determine the first pose according to the value of the rotation amount, improving the flexibility of pose determination.
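Case 1 can be sketched as below. The disclosure leaves the fusion algorithm open, so the quaternion blend used here (`blend_rotations`) is only one plausible choice, not the disclosed method; quaternions are in (x, y, z, w) order, and the IMU rotation is assumed to be expressed in the same coordinate system after calibration:

```python
import numpy as np

def blend_rotations(q_visual, q_imu, w_imu=0.5):
    """Normalized linear interpolation between two unit quaternions."""
    q_visual = np.asarray(q_visual, dtype=float)
    q_imu = np.asarray(q_imu, dtype=float)
    if np.dot(q_visual, q_imu) < 0.0:
        q_imu = -q_imu                 # keep both in the same hemisphere
    q = (1.0 - w_imu) * q_visual + w_imu * q_imu
    return q / np.linalg.norm(q)

def first_pose_single(second_pose, rotation_q):
    """M == 1: the second pose is the first pose when the rotation amount
    is empty (None); otherwise adjust it by the rotation amount."""
    q_vis, t_vis = second_pose
    if rotation_q is None:
        return second_pose
    return blend_rotations(q_vis, rotation_q), t_vis
```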
Next, a procedure of acquiring the pose of the first object in this case will be described with reference to fig. 5.
Fig. 5 is a schematic diagram of a process for acquiring a pose of a first object according to an embodiment of the present disclosure. Referring to fig. 5, a first image is shown, which includes a first object and a plurality of visual markers (not shown in fig. 5). The PnP algorithm obtains the second pose of the first object from the plurality of visual markers. Because the accuracy of the second pose estimated by the preset algorithm alone is limited, the rotation amount of the first object is also determined. When the rotation amount of the first object is 30 degrees, the second pose is adjusted by that rotation amount to obtain the first pose of the first object.
Case 2: M is greater than 1.
If M is greater than 1, the M second poses are fused to obtain a third pose, and the first pose is determined according to the third pose and the rotation amount. In this case the electronic device has obtained multiple second poses corresponding to the plurality of first images and can fuse them through a fusion algorithm; the resulting third pose combines the information of the multiple second poses, which improves its accuracy.
Optionally, the first pose may be determined according to the following possible implementation: if the rotation amount is null, the third pose is determined to be the first pose. For example, a null rotation amount indicates that the inertial sensor has not yet been calibrated, so the electronic device cannot acquire the rotation amount and takes the third pose as the first pose of the first object.
If the rotation amount is not null, the third pose and the rotation amount are fused to obtain the first pose. For example, the electronic device may adjust the third pose based on the rotation amount through a fusion algorithm to obtain the first pose of the first object. Because the third pose combines the information of multiple second poses, its accuracy is high; the electronic device can flexibly determine the first pose according to the value of the rotation amount, and combining the rotation amount improves the robustness of determining the first pose.
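Case 2 follows the same pattern, with one extra fusion step over the M second poses. Averaging translations and quaternions is again only an illustrative fusion scheme (the disclosure does not fix one); `blend_rotations` is the helper from the case-1 sketch above:

```python
import numpy as np

def third_pose(second_poses):
    """Fuse M > 1 second poses, each (quaternion xyzw, 3-vector), into one."""
    quats = [np.asarray(q, dtype=float) for q, _ in second_poses]
    quats = np.array([q if np.dot(q, quats[0]) >= 0 else -q for q in quats])
    q = quats.mean(axis=0)
    q /= np.linalg.norm(q)
    t = np.mean([np.asarray(t, dtype=float) for _, t in second_poses], axis=0)
    return q, t

def first_pose_multi(second_poses, rotation_q):
    """M > 1: fuse into a third pose, then adjust by the rotation amount
    unless the rotation amount is empty (None)."""
    q3, t3 = third_pose(second_poses)
    if rotation_q is None:
        return q3, t3
    return blend_rotations(q3, rotation_q), t3
```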
Next, a procedure of acquiring the pose of the first object in this case will be described with reference to fig. 6.
Fig. 6 is a schematic diagram of another process for acquiring a pose of a first object according to an embodiment of the present disclosure. Referring to fig. 6, a first image A and a first image B (a top view of the first object) are shown. Both first image A and first image B include the first object and a plurality of visual markers (not shown in fig. 6). The PnP algorithm obtains a second pose A from the visual markers in first image A and a second pose B from the visual markers in first image B.
Referring to fig. 6, second pose A and second pose B are fused to obtain a third pose, and the rotation amount of the first object is determined. When the rotation amount of the first object is 30 degrees, the third pose is adjusted by that rotation amount to obtain the first pose of the first object.
The embodiment of the present disclosure thus provides a method for determining a first pose: M second poses corresponding to the plurality of first images are determined according to the visual markers of the first images; if M is equal to 1, the first pose is determined according to the second pose and the rotation amount; if M is greater than 1, the M second poses are fused to obtain a third pose, and the first pose is determined according to the third pose and the rotation amount. The electronic device can therefore flexibly determine the first pose of the first object according to the value of the rotation amount, which improves flexibility; and because the third pose combines the information of multiple second poses, its accuracy is high, while combining the rotation amount improves the robustness and thus the accuracy of pose determination.
On the basis of any one of the above embodiments, a procedure of the above image processing method will be described below with reference to fig. 7.
Fig. 7 is a process schematic diagram of an image processing method according to an embodiment of the present disclosure. Referring to fig. 7, the process involves an electronic device, an image capturing device A, an image capturing device B, and a first object. The first object carries a plurality of preset visual markers (not shown in fig. 7) and an inertial sensor for acquiring the rotation amount. The electronic device is communicatively connected to image capturing device A and image capturing device B, respectively. Image capturing device A captures the first object to obtain a first image A, and image capturing device B captures the first object to obtain a first image B.
Referring to fig. 7, image capturing device A sends first image A to the electronic device, image capturing device B sends first image B to the electronic device, and the inertial sensor sends the rotation amount of the first object to the electronic device. The electronic device obtains a second pose A of the first object from the visual markers in first image A and a second pose B from the visual markers in first image B, and fuses second pose A and second pose B to obtain a third pose. The electronic device then adjusts the third pose by the rotation amount of the first object (0 degrees in this example) to obtain the first pose. Because the third pose combines the information of multiple second poses, its accuracy is high, and combining the rotation amount improves the robustness of determining the first pose, thereby improving the accuracy of pose determination.
Fig. 8 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure. Referring to fig. 8, the image processing apparatus 80 includes a first acquisition module 81, a first determination module 82, a second acquisition module 83, and a second determination module 84, wherein:
the first acquisition module 81 is configured to acquire a plurality of first images of a first object, where the plurality of first images are images obtained by a plurality of image capturing devices capturing the first object at the same time;
the first determination module 82 is configured to determine a visual marker in each first image;
the second acquisition module 83 is configured to acquire the rotation amount of the first object when the plurality of image capturing devices capture the first object;
the second determination module 84 is configured to determine a first pose of the first object according to the visual markers in the plurality of first images and the rotation amount.
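For orientation only, the four modules of fig. 8 map naturally onto a class skeleton like the following (the method names are direct translations of the module names; the bodies are placeholders, not disclosed code):

```python
class ImageProcessingApparatus:
    """Skeleton of image processing apparatus 80 (fig. 8)."""

    def acquire_first_images(self, first_object):        # first acquisition module 81
        raise NotImplementedError

    def determine_visual_markers(self, first_images):    # first determination module 82
        raise NotImplementedError

    def acquire_rotation_amount(self):                   # second acquisition module 83
        raise NotImplementedError

    def determine_first_pose(self, markers, rotation):   # second determination module 84
        raise NotImplementedError
```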
In one possible implementation, the second determination module 84 is specifically configured to:
determine, according to the visual markers of the plurality of first images, M second poses corresponding to the plurality of first images, where M is an integer greater than or equal to 1;
and determine the first pose according to the M second poses and the rotation amount.
In one possible implementation, the second determination module 84 is specifically configured to:
if M is equal to 1, determine the first pose according to the second pose and the rotation amount;
and if M is greater than 1, fuse the M second poses to obtain a third pose, and determine the first pose according to the third pose and the rotation amount.
In one possible implementation, the second determination module 84 is specifically configured to:
if the rotation amount is null, determine the second pose as the first pose;
and if the rotation amount is not null, fuse the second pose and the rotation amount to obtain the first pose.
In one possible implementation, the second determination module 84 is specifically configured to:
if the rotation amount is null, determine the third pose as the first pose;
and if the rotation amount is not null, fuse the third pose and the rotation amount to obtain the first pose.
In one possible implementation, the first determination module 82 is specifically configured to:
determine whether the first image is the first frame image of the first object captured by the image capturing device;
if the first image is the first frame image of the first object captured by the image capturing device, detect the first image to obtain the visual marker in the first image;
and if the first image is not the first frame image of the first object captured by the image capturing device, acquire tracking information of the electronic device for a second image, and acquire the visual marker according to the tracking information, where the second image is the previous frame image of the first image.
In one possible implementation, the first determination module 82 is specifically configured to:
if the tracking information indicates that the electronic device failed to track the first object in the second image, detect the first image to obtain the visual marker in the first image;
and if the tracking information indicates that the electronic device successfully tracked the first object in the second image, acquire the visual marker in the first image according to the tracking result of the first object.
In one possible implementation, the second acquisition module 83 is specifically configured to:
determine whether the inertial sensor has been calibrated when the image capturing device captures the first image;
if the inertial sensor has been calibrated, acquire the rotation amount of the first object from the inertial sensor;
and if the inertial sensor has not been calibrated, calibrate the inertial sensor and acquire the rotation amount of the first object when the accumulated rotation of the inertial sensor is greater than or equal to a preset threshold.
The image processing device provided in this embodiment may be used to execute the technical solution of the foregoing method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein again.
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. Referring to fig. 9, a schematic diagram of an electronic device 900 suitable for implementing embodiments of the present disclosure is shown, where the electronic device 900 may be a terminal device or a server. The terminal device may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (Personal Digital Assistant, PDA for short), a tablet (Portable Android Device, PAD for short), a portable multimedia player (Portable Media Player, PMP for short), an in-vehicle terminal (e.g., an in-vehicle navigation terminal), and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 9 is merely an example, and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 9, the electronic apparatus 900 may include a processing device (e.g., a central processor, a graphics processor, or the like) 901, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage device 908 into a random access Memory (Random Access Memory, RAM) 903. In the RAM903, various programs and data necessary for the operation of the electronic device 900 are also stored. The processing device 901, the ROM 902, and the RAM903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
In general, the following devices may be connected to the I/O interface 905: input devices 906 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 907 including, for example, a liquid crystal display (Liquid Crystal Display, LCD for short), a speaker, a vibrator, and the like; storage 908 including, for example, magnetic tape, hard disk, etc.; and a communication device 909. The communication means 909 may allow the electronic device 900 to communicate wirelessly or by wire with other devices to exchange data. While fig. 9 shows an electronic device 900 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 909, or installed from the storage device 908, or installed from the ROM 902. When executed by the processing device 901, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the methods shown in the above-described embodiments.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including an object oriented programming language such as Java, smalltalk, C ++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (Local Area Network, LAN for short) or a wide area network (Wide Area Network, WAN for short), or it may be connected to an external computer (e.g., connected via the internet using an internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of the unit does not in any way constitute a limitation of the unit itself, for example the first acquisition unit may also be described as "unit acquiring at least two internet protocol addresses".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a first aspect, one or more embodiments of the present disclosure provide an image processing method, the method including:
acquiring a plurality of first images of a first object, wherein the plurality of first images are images obtained by a plurality of image capturing devices simultaneously photographing the first object;
determining a visual marker in each first image;
acquiring the rotation amount of the first object when the plurality of image capturing devices capture the first object;
and determining a first pose of the first object according to the visual markers in the plurality of first images and the rotation amount.
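For illustration only, the four steps above compose as the following Python sketch; the three callables are hypothetical stand-ins for the detection, sensing, and fusion steps that the embodiments below refine, and nothing here is mandated by the disclosure.

    def estimate_first_pose(first_images, detect_marker, read_rotation_amount,
                            determine_pose):
        """Compose the four claimed steps; all three callables are hypothetical."""
        # The first images are assumed to have been captured simultaneously
        # by multiple image capturing devices.
        markers = [detect_marker(img) for img in first_images]  # one marker per image
        rotation_amount = read_rotation_amount()                # may be None ("empty")
        return determine_pose(markers, rotation_amount)         # the first pose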
According to one or more embodiments of the present disclosure, determining a first pose of the first object from visual markers in the plurality of first images and the rotation amount comprises:
determining, according to the visual markers of the plurality of first images, M second poses corresponding to the plurality of first images, wherein M is an integer greater than or equal to 1;
and determining the first pose according to the M second poses and the rotation amount.
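The disclosure leaves open how a second pose is derived from a visual marker. One conventional possibility, assumed here purely for illustration, is a perspective-n-point (PnP) solve with OpenCV against the known corner geometry of a planar square marker; the marker size, corner ordering, and camera intrinsics are assumptions, not part of the disclosure.

    import cv2
    import numpy as np

    def second_pose_from_marker(corners_2d, marker_size, camera_matrix, dist_coeffs):
        # 3D corners of a square marker of side `marker_size`, in the marker frame.
        s = marker_size / 2.0
        object_points = np.array([[-s,  s, 0.0], [ s,  s, 0.0],
                                  [ s, -s, 0.0], [-s, -s, 0.0]], dtype=np.float32)
        ok, rvec, tvec = cv2.solvePnP(object_points,
                                      np.asarray(corners_2d, dtype=np.float32),
                                      camera_matrix, dist_coeffs)
        # rvec/tvec place the marker in this camera's frame; one such second
        # pose per first image yields the M second poses.
        return (rvec, tvec) if ok else None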
According to one or more embodiments of the present disclosure, determining the first pose from the M second poses and the rotation amount includes:
if M is equal to 1, determining the first pose according to the second pose and the rotation amount;
and if M is greater than 1, fusing the M second poses to obtain a third pose, and determining the first pose according to the third pose and the rotation amount.
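The fusion of the M second poses into the third pose is likewise unspecified; a simple stand-in, shown here under that assumption, averages the translations and takes a chordal mean of the rotations with SciPy.

    import numpy as np
    from scipy.spatial.transform import Rotation as R

    def fuse_second_poses(poses):
        """poses: list of (quaternion_xyzw, translation) pairs, M >= 1."""
        if len(poses) == 1:
            return poses[0]  # M == 1: the single second pose is used directly
        quats = np.array([q for q, _ in poses])
        trans = np.array([t for _, t in poses])
        mean_rotation = R.from_quat(quats).mean()  # chordal L2 mean of rotations
        return mean_rotation.as_quat(), trans.mean(axis=0)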
According to one or more embodiments of the present disclosure, determining the first pose according to the second pose and the rotation amount includes:
if the rotation amount is empty, determining the second pose as the first pose;
and if the rotation amount is not empty, fusing the second pose and the rotation amount to obtain the first pose.
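One plausible reading of this branch, sketched below, is that an empty rotation amount (None) lets the visual pose pass through unchanged, while a non-empty one (taken here as a quaternion, an assumption) is averaged into the rotational component; the same routine applies verbatim to the third pose in the next embodiment.

    import numpy as np
    from scipy.spatial.transform import Rotation as R

    def fuse_pose_with_rotation(pose, rotation_amount):
        quat, trans = pose
        if rotation_amount is None:
            return quat, trans  # empty rotation amount: the pose is the first pose
        # Non-empty: average the visual rotation with the sensed rotation amount;
        # equal weighting is an illustrative choice, and the translation is kept.
        fused = R.from_quat(np.array([quat, rotation_amount])).mean()
        return fused.as_quat(), trans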
According to one or more embodiments of the present disclosure, determining the first pose according to the third pose and the rotation amount includes:
if the rotation amount is empty, determining the third pose as the first pose;
and if the rotation amount is not empty, fusing the third pose and the rotation amount to obtain the first pose.
According to one or more embodiments of the present disclosure, for any one of the first images, acquiring the visual marker in the first image includes:
determining whether the first image is the first frame image of the first object captured by the image capturing device;
if the first image is the first frame image of the first object captured by the image capturing device, detecting the first image to obtain the visual marker in the first image;
and if the first image is not the first frame image of the first object captured by the image capturing device, acquiring tracking information of the electronic device for a second image and acquiring the visual marker according to the tracking information, wherein the second image is the frame immediately preceding the first image.
According to one or more embodiments of the present disclosure, obtaining the visual marker from the tracking information includes:
if the tracking information indicates that the electronic device failed to track the first object in the second image, detecting the first image to obtain the visual marker in the first image;
and if the tracking information indicates that the electronic device successfully tracked the first object in the second image, acquiring the visual marker in the first image according to the tracking result of the first object.
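Together, the two embodiments above amount to a per-device detect-or-track policy: detect on the first frame, track thereafter, and fall back to detection when tracking fails. A minimal sketch follows, with a hypothetical TrackingInfo record and an injected detection routine.

    from dataclasses import dataclass
    from typing import Any, Callable, Optional

    @dataclass
    class TrackingInfo:
        success: bool  # did tracking of the first object succeed in the second image?
        result: Any    # the tracked visual marker, valid when success is True

    def acquire_visual_marker(first_image, is_first_frame: bool,
                              tracking_info: Optional[TrackingInfo],
                              detect: Callable):
        if is_first_frame:
            return detect(first_image)      # first frame: full detection
        if tracking_info is None or not tracking_info.success:
            return detect(first_image)      # tracking failed: re-detect
        return tracking_info.result         # tracking succeeded: reuse the result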
According to one or more embodiments of the present disclosure, the first object includes an inertial sensor, and acquiring the rotation amount of the first object includes:
determining whether the inertial sensor has been calibrated when the image capturing device captures a first image;
if the inertial sensor has been calibrated, acquiring the rotation amount of the first object from the inertial sensor;
and if the inertial sensor has not been calibrated, calibrating the inertial sensor, and acquiring the rotation amount of the first object once the rotation amount of the inertial sensor is greater than or equal to a preset threshold.
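The calibration gating might be read as below; the sensor interface, the threshold value, and the treatment of a sub-threshold rotation as an empty rotation amount are all assumptions, with the rotation amount modeled as a scalar angle for simplicity.

    PRESET_THRESHOLD = 0.05  # assumed value and units (radians); not given by the disclosure

    def acquire_rotation_amount(sensor):
        """sensor: hypothetical interface exposing is_calibrated(), calibrate(),
        and rotation_amount() -> float."""
        if sensor.is_calibrated():
            return sensor.rotation_amount()
        sensor.calibrate()
        amount = sensor.rotation_amount()
        # Report the rotation amount only once it reaches the preset threshold;
        # below it, the rotation amount is treated as empty (None).
        return amount if amount >= PRESET_THRESHOLD else None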
In a second aspect, one or more embodiments of the present disclosure provide an image processing apparatus including a first acquisition module 81, a first determining module 82, a second acquisition module 83, and a second determining module 84, wherein:
the first acquisition module 81 is configured to acquire a plurality of first images of a first object, where the plurality of first images are images obtained by a plurality of image capturing devices simultaneously photographing the first object;
the first determining module 82 is configured to determine a visual marker in each first image;
the second acquisition module 83 is configured to acquire a rotation amount of the first object when the plurality of image capturing devices capture the first object;
the second determining module 84 is configured to determine a first pose of the first object according to visual markers in the plurality of first images and the rotation amount.
In accordance with one or more embodiments of the present disclosure, the second determining module 84 is specifically configured to:
determine, according to the visual markers of the plurality of first images, M second poses corresponding to the plurality of first images, where M is an integer greater than or equal to 1;
and determine the first pose according to the M second poses and the rotation amount.
In accordance with one or more embodiments of the present disclosure, the second determining module 84 is specifically configured to:
if M is equal to 1, determine the first pose according to the second pose and the rotation amount;
and if M is greater than 1, fuse the M second poses to obtain a third pose, and determine the first pose according to the third pose and the rotation amount.
In accordance with one or more embodiments of the present disclosure, the second determining module 84 is specifically configured to:
if the rotation amount is empty, determine the second pose as the first pose;
and if the rotation amount is not empty, fuse the second pose and the rotation amount to obtain the first pose.
In accordance with one or more embodiments of the present disclosure, the second determining module 84 is specifically configured to:
if the rotation amount is empty, determine the third pose as the first pose;
and if the rotation amount is not empty, fuse the third pose and the rotation amount to obtain the first pose.
In accordance with one or more embodiments of the present disclosure, the first determining module 82 is specifically configured to:
determine whether the first image is the first frame image of the first object captured by the image capturing device;
if the first image is the first frame image of the first object captured by the image capturing device, detect the first image to obtain the visual marker in the first image;
and if the first image is not the first frame image of the first object captured by the image capturing device, acquire tracking information of the electronic device for a second image and acquire the visual marker according to the tracking information, where the second image is the frame immediately preceding the first image.
In accordance with one or more embodiments of the present disclosure, the first determining module 82 is specifically configured to:
if the tracking information indicates that the electronic device failed to track the first object in the second image, detect the first image to obtain the visual marker in the first image;
and if the tracking information indicates that the electronic device successfully tracked the first object in the second image, acquire the visual marker in the first image according to the tracking result of the first object.
In accordance with one or more embodiments of the present disclosure, the second acquisition module 83 is specifically configured to:
determine whether the inertial sensor has been calibrated when the image capturing device captures a first image;
if the inertial sensor has been calibrated, acquire the rotation amount of the first object from the inertial sensor;
and if the inertial sensor has not been calibrated, calibrate the inertial sensor, and acquire the rotation amount of the first object once the rotation amount of the inertial sensor is greater than or equal to a preset threshold.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: a processor and a memory;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory, causing the processor to perform the image processing method described in the first aspect and the various possible implementations of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, implement the image processing method described in the first aspect and the various possible implementations of the first aspect.
In a fifth aspect, embodiments of the present disclosure provide a computer program product including a computer program which, when executed by a processor, implements the image processing method described in the first aspect and the various possible implementations of the first aspect.
The foregoing description is merely a description of the preferred embodiments of the present disclosure and of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to technical solutions formed by the specific combinations of the features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in the present disclosure (but not limited thereto).
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (12)

1. An image processing method, comprising:
acquiring a plurality of first images of a first object, wherein the plurality of first images are images obtained by a plurality of image capturing devices simultaneously photographing the first object;
determining a visual marker in each first image;
acquiring the rotation amount of the first object when the plurality of image capturing devices capture the first object;
and determining a first pose of the first object according to the visual markers in the plurality of first images and the rotation amount.
2. The method of claim 1, wherein determining the first pose of the first object according to the visual markers in the plurality of first images and the rotation amount comprises:
determining, according to the visual markers of the plurality of first images, M second poses corresponding to the plurality of first images, wherein M is an integer greater than or equal to 1;
and determining the first pose according to the M second poses and the rotation amount.
3. The method of claim 2, wherein determining the first pose from the M second poses and the rotation amount comprises:
if M is equal to 1, determining the first pose according to the second pose and the rotation amount;
and if M is greater than 1, fusing the M second poses to obtain a third pose, and determining the first pose according to the third pose and the rotation amount.
4. The method of claim 3, wherein determining the first pose according to the second pose and the rotation amount comprises:
if the rotation amount is empty, determining the second pose as the first pose;
and if the rotation amount is not empty, fusing the second pose and the rotation amount to obtain the first pose.
5. The method of claim 3 or 4, wherein determining the first pose according to the third pose and the rotation amount comprises:
if the rotation amount is empty, determining the third pose as the first pose;
and if the rotation amount is not empty, fusing the third pose and the rotation amount to obtain the first pose.
6. The method of any one of claims 1-5, wherein for any one of the first images, obtaining a visual marker in the first image comprises:
determining whether the first image is the first frame image of the first object captured by the image capturing device;
if the first image is the first frame image of the first object captured by the image capturing device, detecting the first image to obtain the visual marker in the first image;
and if the first image is not the first frame image of the first object captured by the image capturing device, acquiring tracking information of the electronic device for a second image and acquiring the visual marker according to the tracking information, wherein the second image is the frame immediately preceding the first image.
7. The method of claim 6, wherein obtaining the visual indicia from the tracking information comprises:
if the tracking information indicates that the electronic device failed to track the first object in the second image, detecting the first image to obtain the visual marker in the first image;
and if the tracking information indicates that the electronic device successfully tracked the first object in the second image, acquiring the visual marker in the first image according to the tracking result of the first object.
8. The method of any one of claims 1-7, wherein the first object includes an inertial sensor, and acquiring the rotation amount of the first object comprises:
determining whether the inertial sensor has been calibrated when the image capturing device captures a first image;
if the inertial sensor has been calibrated, acquiring the rotation amount of the first object from the inertial sensor;
and if the inertial sensor has not been calibrated, calibrating the inertial sensor, and acquiring the rotation amount of the first object once the rotation amount of the inertial sensor is greater than or equal to a preset threshold.
9. An image processing apparatus, comprising a first acquisition module, a first determining module, a second acquisition module, and a second determining module, wherein:
the first acquisition module is configured to acquire a plurality of first images of a first object, wherein the plurality of first images are images obtained by a plurality of image capturing devices simultaneously photographing the first object;
the first determining module is configured to determine a visual marker in each first image;
the second acquisition module is configured to acquire a rotation amount of the first object when the plurality of image capturing devices capture the first object;
the second determining module is configured to determine a first pose of the first object according to the visual markers in the plurality of first images and the rotation amount.
10. An electronic device, comprising: a processor and a memory;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory, causing the processor to perform the image processing method according to any one of claims 1 to 8.
11. A computer-readable storage medium, in which computer-executable instructions are stored, which, when executed by a processor, implement the image processing method according to any one of claims 1 to 8.
12. A computer program product comprising a computer program which, when executed by a processor, implements the image processing method according to any one of claims 1 to 8.
CN202210754294.1A 2022-06-28 2022-06-28 Image processing method and device and electronic equipment Pending CN117351067A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210754294.1A CN117351067A (en) 2022-06-28 2022-06-28 Image processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN117351067A true CN117351067A (en) 2024-01-05

Family

ID=89361936



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination