CN117372512A - Method, device, equipment and medium for determining camera pose

Info

Publication number: CN117372512A
Application number: CN202210771166.8A
Authority: CN
Inventors: 李云龙; 吴涛; 王宝林; 左磊
Original assignee: Beijing Zitiao Network Technology Co Ltd
Current assignee: Beijing Zitiao Network Technology Co Ltd
Priority date: 2022-06-30
Filing date: 2022-06-30
Publication date: 2024-01-09

Abstract

Embodiments of the disclosure relate to a camera pose determination method, apparatus, device, and medium. The method comprises: acquiring a target image captured by a camera; determining an object region in the target image; extracting a plurality of feature points based on a non-object region in the target image; and determining a pose of the camera based on the plurality of feature points. This technical solution can improve the accuracy of the camera pose.

Description

Method, device, equipment and medium for determining camera pose

Technical Field

The disclosure relates to the technical field of image processing, and in particular relates to a method, a device, equipment and a medium for determining a camera pose.

Background

Visual localization is widely used in wearable devices such as virtual reality devices. Such devices are generally equipped with cameras, and the pose of a camera is calculated from the images it captures by means of feature point detection, so the accuracy of the feature points affects the accuracy of the camera pose.

At present, how to improve the precision of the pose of a camera is a technical problem to be solved urgently.

Disclosure of Invention

In order to solve the technical problems, the present disclosure provides a method, a device, equipment and a medium for determining a pose of a camera.

The embodiment of the disclosure provides a camera pose determining method, which comprises the following steps:

acquiring a target image acquired by a camera;

determining an object region in the target image;

extracting a plurality of feature points based on a non-object region in the target image;

determining a pose of the camera based on the plurality of feature points.

The embodiment of the disclosure also provides a camera pose determining device, which comprises:

an acquisition module, configured to acquire a target image captured by a camera;

a region determining module, configured to determine an object region in the target image;

an extraction module, configured to extract a plurality of feature points based on a non-object region in the target image; and

a pose determining module, configured to determine a pose of the camera based on the plurality of feature points.

The embodiment of the disclosure also provides an electronic device, which comprises: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the camera pose determining method according to the embodiment of the present disclosure.

The present disclosure also provides a computer-readable storage medium storing a computer program for executing the camera pose determination method as provided by the embodiments of the present disclosure.

The disclosed embodiments also provide a computer program product comprising computer programs/instructions which, when executed by a processor, implement a camera pose determination method as provided by the disclosed embodiments.

Compared with the prior art, the technical solution provided by the embodiments of the disclosure has the following advantages: a target image captured by a camera is acquired, an object region in the target image is determined, a plurality of feature points are extracted based on the non-object region in the target image, and the pose of the camera is determined based on the plurality of feature points. Following feature points, that is, feature points that move together with the camera, are thereby excluded when the pose of the camera is determined, which improves the accuracy of the camera pose.

Drawings

The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.

Fig. 1 is a flow chart of a method for determining a pose of a camera according to an embodiment of the disclosure;

Fig. 2 is a flowchart of another method for determining a pose of a camera according to an embodiment of the present disclosure;

Fig. 3 is a schematic diagram of an object region in a target image according to an embodiment of the disclosure;

Fig. 4 is a schematic diagram of an object region in another target image according to an embodiment of the disclosure;

Fig. 5 is a schematic diagram of an object region in yet another target image according to an embodiment of the disclosure;

Fig. 6 is a schematic structural diagram of a camera pose determining device according to an embodiment of the present disclosure;

Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.

The embodiment of the disclosure provides a camera pose determining method, and the method is described below with reference to specific embodiments.

Fig. 1 is a flow chart of a method for determining camera pose according to an embodiment of the present disclosure, where the method may be performed by a camera pose determining apparatus, and the apparatus may be implemented by software and/or hardware, and may be generally integrated in an electronic device. As shown in fig. 1, the method includes:

Step 101, acquiring a target image captured by a camera.

The method of the embodiment of the disclosure can be applied to a target device that is provided with a camera; the target device includes a wearable device such as a virtual reality device.

In this embodiment, for visual localization, the camera is controlled to capture the target image, and the pose of the camera is determined by detecting feature points in the target image captured by the camera.

Step 102, determining an object region in a target image.

In this embodiment, the points in the world coordinate system that correspond to the object region follow the motion of the camera. Take a virtual reality device worn by a user as an example, where the object region includes a human body region. When the user moves forward while wearing the device, the camera in the device also moves forward. When the user looks down, the target image captured by the camera may include the user's own body, and the points in the world coordinate system corresponding to the human body region in the target image move along with the camera.

There are various ways to determine the object region in the target image.

In one embodiment of the present disclosure, determining an object region in a target image includes: processing the target image with a pre-trained object detection model to generate the object region in the target image. In this embodiment, the object detection model may be trained on sample images based on a deep neural network and a target detection algorithm, where each sample image is labeled with the region occupied by the object to be detected. The model takes an image as input and outputs the object region in that image.

Taking the case where the object region is a human body region as an example, for a virtual reality device provided with a camera, sample images are captured from the viewpoint of a user looking down toward the feet, and the human body region in each sample image is labeled. A human body detection model is trained on the sample images; its input is the target image captured by the camera, and its output is the human body region in the target image.

In one embodiment of the disclosure, the target device is further provided with an inertial measurement unit. An estimated pose of the inertial measurement unit is acquired, and the inclination angle of the camera in a specified direction is determined according to that estimated pose. The object region is then determined from the target image according to the field angle and the inclination angle of the camera. Optionally, the size of the object region increases as the inclination angle increases. For example, when the specified direction is the vertical direction, a target angle is determined according to half of the field angle and the inclination angle, and the object region is determined according to the ratio between the target angle and the field angle together with a preset correspondence between that ratio and the size of the object region.

Step 103, extracting a plurality of feature points based on the non-object region in the target image.

In this embodiment, the pose of the camera is determined by means of feature point detection. Optionally, the pose of the camera can be determined by a feature-point-based camera pose estimation method from a plurality of feature points extracted from the target image and the corresponding points in the world coordinate system.

In practical applications, the accuracy of the feature points affects the accuracy of the camera pose: compared with the case where no following feature points exist, the pose accuracy of the camera leaves room for improvement when following feature points are present. The following feature points may be the feature points in the object region. Optionally, a plurality of first feature points are extracted from the target image, and the following feature points are removed from the plurality of first feature points to obtain second feature points that do not belong to the object region.
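The removal of following feature points can be illustrated with a minimal sketch. It assumes, purely for illustration, that the object region is an axis-aligned rectangle in pixel coordinates and that the first feature points are already available as (x, y) pairs; the names `in_rect` and `remove_following_points` are hypothetical and do not come from the disclosure.

```python
from typing import List, Tuple

Point = Tuple[float, float]               # (x, y) pixel coordinates
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def in_rect(p: Point, r: Rect) -> bool:
    """Return True if pixel p lies inside rectangle r (bounds inclusive)."""
    x, y = p
    x_min, y_min, x_max, y_max = r
    return x_min <= x <= x_max and y_min <= y <= y_max


def remove_following_points(first_points: List[Point], object_region: Rect) -> List[Point]:
    """Keep only the points outside the object region (the second feature points)."""
    return [p for p in first_points if not in_rect(p, object_region)]


# Hypothetical 640x480 image whose bottom 96 rows form the object region.
first_points = [(100.0, 50.0), (320.0, 400.0), (600.0, 470.0), (10.0, 383.0)]
second_points = remove_following_points(first_points, (0.0, 384.0, 640.0, 480.0))
print(second_points)
```

The two points with y of 400 and 470 fall inside the assumed object region and are dropped; the remaining points would then be passed on to pose estimation.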

Step 104, determining the pose of the camera based on the plurality of feature points.

In this embodiment, a camera pose estimation method is used to determine the pose of the camera based on the plurality of second feature points not belonging to the object region and the points in the world coordinate system corresponding to those second feature points.

According to the technical solution of this embodiment, a target image captured by the camera is acquired, the object region in the target image is determined, a plurality of feature points are extracted based on the non-object region in the target image, and the pose of the camera is determined based on the plurality of feature points. Following feature points are thereby removed when the pose of the camera is determined, which increases the likelihood that the feature points used are valid and improves the accuracy of the camera pose.

Based on the above embodiments, a description will be given below of a camera pose determination method according to an embodiment of the present disclosure, taking an example in which an inertial measurement unit is provided on a target device.

Fig. 2 is a flow chart of another method for determining a pose of a camera according to an embodiment of the present disclosure, as shown in fig. 2, the method for determining a pose of a camera includes:

Step 201, acquiring a target image captured by a camera.

The method of this embodiment may be used for visual localization of a target device. The target device is provided with an inertial measurement unit and a camera, and is a head-mounted device, for example a virtual reality device.

Step 202, acquiring the pose of the inertial measurement unit, and generating the inclination angle of the camera in the specified direction according to the pose of the inertial measurement unit.

In this embodiment, the inertial measurement unit may provide an estimated pose. Because the inertial measurement unit and the camera are both mounted in the target device, the inclination angle of the camera in the specified direction can be determined according to the estimated pose provided by the inertial measurement unit. The inclination angle of the camera in the specified direction may be the included angle between a preset direction in the camera and the specified direction. Optionally, the preset direction may be a direction indicated by the image plane, the main optical axis, or the like of the camera, and the specified direction may be the vertical direction.

There are various ways to generate the inclination angle of the camera in the specified direction according to the pose of the inertial measurement unit.

As one example, an inclination angle of the camera in a specified direction is generated from the pose of the inertial measurement unit and parameters of the camera. In this example, the parameters of the camera include external parameters of the camera, and according to the estimated pose provided by the inertial measurement unit and the external parameters of the camera, the inclination angle of the camera in the specified direction is calculated by a corresponding algorithm.
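The disclosure does not specify the algorithm, so the following is only one possible sketch. It assumes that the estimated pose supplies a rotation matrix from the inertial measurement unit frame to the world frame (`r_world_imu`), that the camera extrinsics supply a rotation from the camera frame to the inertial measurement unit frame (`r_imu_cam`), that the main optical axis is the camera's +Z axis, and that the vertical (downward) direction is the world's -Z axis. All of these conventions and names are assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def angle_between_deg(a: Vector, b: Vector) -> float:
    """Angle between two non-zero vectors, in degrees."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def optical_axis_to_vertical_deg(r_world_imu: Matrix, r_imu_cam: Matrix) -> float:
    """Angle between the camera's optical axis and the downward vertical."""
    axis_cam = [0.0, 0.0, 1.0]   # assumed optical axis in the camera frame
    axis_world = mat_vec(r_world_imu, mat_vec(r_imu_cam, axis_cam))
    down = [0.0, 0.0, -1.0]      # assumed vertical direction in the world frame
    return angle_between_deg(axis_world, down)


def rot_x(deg: float) -> Matrix:
    """Rotation about the X axis by the given angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
angle = optical_axis_to_vertical_deg(rot_x(150.0), identity)
print(round(angle, 6))
```

If the preset direction is taken to be the image plane rather than the main optical axis, the corresponding angle to the vertical would be the complement (90° minus this value), since the optical axis of an ideal pinhole camera is perpendicular to its image plane.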

As another example, a first angle between the target device and the specified direction is determined according to the pose of the inertial measurement unit, and the inclination angle is determined based on the first angle and a second angle between the camera and the target device. In this example, the inclination angle of the camera in the specified direction is determined from the estimated pose provided by the inertial measurement unit and the structural relationship between the hardware components. The first angle can be determined according to the structural relationship between the target device and the inertial measurement unit and the estimated pose of the inertial measurement unit; it may be the included angle between a preset direction in the target device and the specified direction. The second angle can be determined according to the structural relationship between the target device and the camera; it may be the included angle between the preset direction in the target device and the preset direction in the camera. The inclination angle of the camera in the specified direction is then determined as the sum of the first angle and the second angle, for example α = α1 + α2, where α is the inclination angle of the camera in the specified direction, α1 is the first angle, and α2 is the second angle.

Step 203, determining the proportion of the object area in the target image according to the field angle and the inclination angle of the camera.

In this embodiment, the field angle of the camera indicates its shooting range, and the relationship between the shooting range and the specified direction can be determined from the field angle and the inclination angle of the camera in the specified direction. Taking the main optical axis of the camera as the preset direction, for example: if the included angle between the main optical axis and the specified direction is greater than or equal to half of the field angle, the proportion of the object region in the target image is determined to be zero; if the included angle is less than half of the field angle, the proportion is determined from the difference between half of the field angle and the included angle.

As one example, the specified direction is a vertical direction, and the target angle is determined from half of the angle of view and the inclination angle of the camera in the specified direction; and determining the proportion of the object area in the target image according to the ratio between the target angle and the field angle.

In this example, the preset direction in the camera is the direction indicated by the image plane of the camera, and the proportion of the object region in the target image can be determined according to the following formula: P = (FOV/2 - (90° - α)) / FOV, where P is the proportion of the object region in the target image, FOV is the field angle of the camera, and α is the inclination angle of the camera in the vertical direction.
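As a worked instance of this formula, the sketch below evaluates P for given angles in degrees. Clamping negative values to zero reflects the case described for step 203 in which the proportion is determined to be zero; the function name and the sample angles are illustrative assumptions.

```python
def object_region_proportion(fov_deg: float, tilt_deg: float) -> float:
    """Evaluate P = (FOV/2 - (90 - alpha)) / FOV, clamped at zero."""
    if fov_deg <= 0:
        raise ValueError("field angle must be positive")
    p = (fov_deg / 2.0 - (90.0 - tilt_deg)) / fov_deg
    # A negative value means the object region is outside the field of view.
    return max(0.0, p)


print(object_region_proportion(100.0, 60.0))  # (50 - 30) / 100
print(object_region_proportion(100.0, 30.0))  # (50 - 60) / 100 is negative, so clamped
```

With a hypothetical field angle of 100° and an inclination angle of 60°, the object region would occupy one fifth of the image; at 30° it would not appear at all.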

Take a head-mounted virtual reality device worn by a user as an example, where the object region includes a human body region. When the user looks down, the estimated pose provided by the inertial measurement unit changes, and the proportion of the object region in the target image is determined according to the above steps. When the calculated proportion is greater than zero, the target image captured by the camera contains the user's body, and the calculated value is taken as the proportion of the object region in the target image.

It should be noted that using the vertical direction as the specified direction is merely an example; the horizontal direction or another direction may also be used, which is not limited herein.

Step 204, determining the height of the object region according to the proportion and the height of the target image, and determining, from the lower region of the target image, the object region matching that height.

In this embodiment, the product of the proportion and the height of the target image may be taken as the height of the object region. Taking the head-mounted virtual reality device as an example, when the user looks down, the user's body generally appears in the lower part of the target image, so the object region is determined from the lower region of the target image according to the height of the object region. When the method is applied to a head-mounted device, the human body region in the target image is determined based on the inclination angle of the camera in the specified direction. For the scenario in which a user wears the head-mounted device, this provides a way to determine the human body region from the inclination angle that requires no target detection algorithm and can still meet the accuracy requirement.
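A minimal sketch of step 204 follows. It assumes image rows are numbered from the top, so the lower region corresponds to the largest row indices, and it rounds the height to whole pixels; both the rounding and the function name are assumptions rather than details from the disclosure.

```python
from typing import Tuple


def object_region_rows(proportion: float, image_height: int) -> Tuple[int, int]:
    """Return (top_row, bottom_row) of the object region; bottom_row is exclusive."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    region_height = round(proportion * image_height)  # height in pixels (assumed rounding)
    return image_height - region_height, image_height


print(object_region_rows(0.2, 480))
```

For a hypothetical 480-row image and a proportion of 0.2, the object region would span the bottom 96 rows.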

The following exemplifies the object region in the target image.

As an example, referring to Fig. 3, the region labeled 31 is the object region, which here is a rectangular region. In this example, the width of the target image is taken as the width of the object region, and a rectangular region is determined from the lower region of the target image based on the width and the height of the object region and is taken as the object region.

As another example, referring to Fig. 4, the region labeled 41 is the object region, which here is a rectangular region. In this example, the width of the object region is determined according to the height of the object region and a preset first aspect ratio coefficient, and a rectangular region is then determined from the lower region of the target image according to the width and the height of the object region. Optionally, the object region is located in the lower middle part of the target image. The first aspect ratio coefficient indicates the aspect ratio of the rectangular region; for example, it may be set to 1, to 5, to any value between 1 and 5, or as otherwise required, which is not limited here. For a head-mounted virtual reality device worn by a user, the human body region usually does not occupy the entire lower part of the target image when the user looks down. Determining a rectangular object region from the lower region of the target image according to the preset first aspect ratio coefficient therefore locates the object region more accurately, which in turn improves the accuracy of the second feature points.
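The rectangle of Fig. 4 could be computed along the following lines. The sketch assumes the first aspect ratio coefficient is the width divided by the height, that the rectangle is centered horizontally, and that its width is capped at the image width; none of these details is fixed by the disclosure.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def centered_bottom_rect(image_width: int, image_height: int,
                         region_height: int, aspect_coeff: float) -> Rect:
    """Rectangle of the given height at the bottom center of the image."""
    # Width from height and coefficient (assumed to mean width / height).
    width = min(image_width, round(region_height * aspect_coeff))
    x_min = (image_width - width) // 2
    return (x_min, image_height - region_height, x_min + width, image_height)


print(centered_bottom_rect(640, 480, 96, 2.0))
```

With a coefficient large enough to reach the image width, this reduces to the full-width rectangle of Fig. 3.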

As another example, referring to Fig. 5, the region labeled 51 is the object region, which here is a trapezoidal region. In this example, the upper base and the lower base of the object region are determined according to the height of the object region and a preset second aspect ratio coefficient, and a trapezoidal region is then determined from the lower region of the target image according to the height, the upper base, and the lower base of the object region. Optionally, the object region is located in the lower middle part of the target image. The second aspect ratio coefficient indicates the ratio of the upper base to the height and the ratio of the lower base to the height of the trapezoidal region. Because a trapezoid is closer to the actual shape of the object, determining a trapezoidal object region from the lower region of the target image according to the preset second aspect ratio coefficient locates the object region more accurately, which in turn improves the accuracy of the second feature points.
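A possible construction of the trapezoid of Fig. 5, together with a point-membership test for filtering feature points, is sketched here. It assumes the second aspect ratio coefficient is given as two ratios (upper base to height, lower base to height), that the trapezoid is symmetric about the vertical center line of the image, and that its lower base lies on the bottom edge; the function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bottom_trapezoid(image_width: int, image_height: int, region_height: int,
                     upper_ratio: float, lower_ratio: float) -> List[Point]:
    """Vertices (upper-left, upper-right, lower-right, lower-left) of the trapezoid."""
    if region_height <= 0:
        raise ValueError("region height must be positive")
    upper = min(image_width, region_height * upper_ratio)  # upper base length
    lower = min(image_width, region_height * lower_ratio)  # lower base length
    cx = image_width / 2.0
    top_y = image_height - region_height
    return [(cx - upper / 2, top_y), (cx + upper / 2, top_y),
            (cx + lower / 2, image_height), (cx - lower / 2, image_height)]


def in_trapezoid(p: Point, trap: List[Point]) -> bool:
    """True if p lies inside the trapezoid (edges inclusive)."""
    (ul_x, top_y), (ur_x, _), (lr_x, bot_y), (ll_x, _) = trap
    x, y = p
    if not top_y <= y <= bot_y:
        return False
    t = (y - top_y) / (bot_y - top_y)   # 0 at the upper base, 1 at the lower base
    left = ul_x + t * (ll_x - ul_x)     # left edge at this row
    right = ur_x + t * (lr_x - ur_x)    # right edge at this row
    return left <= x <= right


trap = bottom_trapezoid(640, 480, 96, 1.0, 3.0)
print(trap)
print(in_trapezoid((200.0, 390.0), trap), in_trapezoid((200.0, 470.0), trap))
```

The same pixel column (x = 200) is outside the region near the narrow upper base but inside it near the wider lower base, which is the behavior a trapezoid is meant to capture.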

Optionally, the proportion of the object region in the target image may be compared with a preset threshold, and the manner of determining the object region is selected according to the comparison result. As an example, when the proportion is smaller than the preset threshold, the object region is determined from the target image in the manner corresponding to Fig. 3; when the proportion is greater than or equal to the preset threshold, the object region is determined in the manner corresponding to Fig. 4 or Fig. 5.
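The threshold-based selection might look like the sketch below, which chooses between the full-width rectangle of Fig. 3 and the centered rectangle of Fig. 4. The threshold of 0.15 and the coefficient of 2.0 are placeholder values chosen only for illustration, and the coefficient is again assumed to mean width divided by height.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def select_object_region(image_width: int, image_height: int, proportion: float,
                         threshold: float = 0.15, aspect_coeff: float = 2.0) -> Rect:
    """Pick the region shape according to the proportion and a preset threshold."""
    region_height = round(proportion * image_height)
    top = image_height - region_height
    if proportion < threshold:
        # Small region: full-width rectangle, as in Fig. 3.
        return (0, top, image_width, image_height)
    # Larger region: centered rectangle, as in Fig. 4.
    width = min(image_width, round(region_height * aspect_coeff))
    x_min = (image_width - width) // 2
    return (x_min, top, x_min + width, image_height)


print(select_object_region(640, 480, 0.1))
print(select_object_region(640, 480, 0.2))
```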

In one embodiment of the present disclosure, to further improve the accuracy of the object region, after the object region is determined in the above manner, the target image is processed by a pre-trained object detection model to determine a candidate region corresponding to the object region, and the object region is corrected according to the candidate region. Optionally, the object detection model is trained on sample images based on a deep neural network and a target detection algorithm, where each sample image is labeled with the region occupied by the object to be detected; the model takes an image as input, and the region it outputs is used as the candidate region. As an example, the common region between the candidate region and the object region is taken as the updated object region. As another example, if the candidate region and the object region have no common region, the two regions are combined to form the updated object region.
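The two correction cases can be sketched for rectangular regions as follows. Representing the updated object region as a list of rectangles, so that two disjoint regions can be kept side by side, is an assumption made here for simplicity; the disclosure does not say how the combined region is represented.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Common region of two rectangles, or None if they do not overlap."""
    x_min, y_min = max(a[0], b[0]), max(a[1], b[1])
    x_max, y_max = min(a[2], b[2]), min(a[3], b[3])
    if x_min >= x_max or y_min >= y_max:
        return None
    return (x_min, y_min, x_max, y_max)


def correct_object_region(object_region: Rect, candidate_region: Rect) -> List[Rect]:
    """Use the common region if one exists; otherwise keep both regions."""
    common = intersect(object_region, candidate_region)
    if common is not None:
        return [common]
    return [object_region, candidate_region]


print(correct_object_region((0, 384, 640, 480), (200, 350, 450, 480)))
print(correct_object_region((0, 384, 640, 480), (100, 100, 200, 200)))
```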

Step 205, extracting a plurality of feature points based on the non-object region in the target image, and determining the pose of the camera based on the plurality of feature points.

The explanation of steps 103 and 104 in the foregoing embodiment also applies to this step and is not repeated here.

According to the technical solution of this embodiment, the inclination angle of the camera in the specified direction is generated from the pose of the inertial measurement unit, the proportion of the object region in the target image is determined according to the field angle and the inclination angle of the camera, the height of the object region is determined according to the proportion and the height of the target image, and the object region matching that height is determined from the lower region of the target image. The object region is thus determined from the inclination angle without a target detection algorithm.

Fig. 6 is a schematic structural diagram of a camera pose determining apparatus according to an embodiment of the present disclosure. The apparatus may be implemented by software and/or hardware and may generally be integrated in an electronic device for determining a camera pose. As shown in Fig. 6, the camera pose determining apparatus includes an acquiring module 61, a region determining module 62, an extracting module 63, and a pose determining module 64.

The acquiring module 61 is configured to acquire a target image acquired by the camera.

The region determining module 62 is configured to determine an object region in the target image.

An extracting module 63 is configured to extract a plurality of feature points based on a non-object area in the target image.

The pose determining module 64 is configured to determine a pose of the camera based on the plurality of feature points.

The camera pose determining apparatus provided by the embodiment of the disclosure can execute the camera pose determining method provided by any embodiment of the disclosure, and has the functional modules and beneficial effects corresponding to that method.
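One way to picture the four-module structure in software is the sketch below, in which each module is an injected callable and `run` chains them in the order of steps 101 to 104. The class name, the callable signatures, and the stub implementations are all hypothetical; a real apparatus would supply an actual camera interface, region determination, feature extractor, and pose solver.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]


@dataclass
class CameraPoseApparatus:
    acquire: Callable[[], Any]                     # acquiring module 61
    determine_region: Callable[[Any], Rect]        # region determining module 62
    extract: Callable[[Any, Rect], List[Point]]    # extracting module 63
    determine_pose: Callable[[List[Point]], Any]   # pose determining module 64

    def run(self) -> Any:
        """Chain the modules: image, object region, feature points, pose."""
        image = self.acquire()
        region = self.determine_region(image)
        points = self.extract(image, region)
        return self.determine_pose(points)


# Stub modules used only to show the data flow; they do no real processing.
apparatus = CameraPoseApparatus(
    acquire=lambda: "image",
    determine_region=lambda image: (0.0, 384.0, 640.0, 480.0),
    extract=lambda image, region: [(100.0, 50.0), (10.0, 383.0)],
    determine_pose=lambda points: {"num_points": len(points)},
)
result = apparatus.run()
print(result)
```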

To achieve the above embodiments, the present disclosure also proposes a computer program product comprising a computer program/instruction which, when executed by a processor, implements the camera pose determination method in the above embodiments.

Referring now in particular to fig. 7, a schematic diagram of an electronic device 700 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device 700 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 7 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.

As shown in Fig. 7, the electronic device 700 may include a processing means (e.g., a central processor, a graphics processor, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the electronic device 700 are also stored. The processing means 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.

In general, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 shows an electronic device 700 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 709, or installed from storage 708, or installed from ROM 702. When the computer program is executed by the processing device 701, the above-described functions defined in the camera pose determination method of the embodiment of the present disclosure are performed.

It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.

In some implementations, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.

The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a target image acquired by a camera; determine an object region in the target image; extract a plurality of feature points based on a non-object region in the target image; and determine a pose of the camera based on the plurality of feature points.

Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. In some cases, the name of a unit does not constitute a limitation of the unit itself.

The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

According to one or more embodiments of the present disclosure, the present disclosure provides a camera pose determination method, including: acquiring a target image acquired by a camera; determining an object region in the target image; extracting a plurality of feature points based on a non-object region in the target image; based on the plurality of feature points, a pose of the camera is determined.
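
As an illustration of extracting feature points based on the non-object region, the following minimal sketch filters candidate feature points so that only those outside the object region are kept. The function name `points_outside_region`, the rectangular `(x0, y0, x1, y1)` region format, and the sample coordinates are assumptions made for this example; the disclosure does not prescribe a particular detector or data layout, and the subsequent pose solving step is not shown.

```python
def points_outside_region(points, region):
    """Return the candidate points that lie outside the object region.

    points: iterable of (x, y) pixel coordinates (hypothetical detector output).
    region: (x0, y0, x1, y1) rectangle, half-open: x0 <= x < x1 and y0 <= y < y1.
    """
    x0, y0, x1, y1 = region
    return [
        (x, y)
        for (x, y) in points
        if not (x0 <= x < x1 and y0 <= y < y1)
    ]


# Hypothetical candidates in a 640x480 image; the object region is the bottom strip.
candidates = [(10, 20), (300, 400), (620, 470), (50, 100)]
object_region = (0, 360, 640, 480)
kept = points_outside_region(candidates, object_region)
```

Only the points kept by this filter would then be passed to the pose computation, so that points lying on the object do not contribute to the result.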

According to one or more embodiments of the present disclosure, in the method for determining a pose of a camera provided by the present disclosure, the determining an object region in the target image includes: processing the target image according to a pre-trained object detection model to generate the object region in the target image.

According to one or more embodiments of the present disclosure, in a camera pose determining method provided by the present disclosure, the method is applied to a target device, an inertial measurement unit and the camera are disposed on the target device, and the determining an object region in the target image includes: acquiring the pose of the inertial measurement unit, and generating an inclination angle of the camera in a specified direction according to the pose of the inertial measurement unit; and determining an object region from the target image according to the field angle of the camera and the inclination angle, wherein the size of the object region increases with the increase of the inclination angle.

According to one or more embodiments of the present disclosure, in the method for determining a pose of a camera provided by the present disclosure, the generating an inclination angle of the camera in a specified direction according to the pose of the inertial measurement unit includes: generating the inclination angle of the camera in the specified direction according to the pose of the inertial measurement unit and the parameters of the camera; or determining a first angle between the target device and the specified direction according to the pose of the inertial measurement unit, and determining the inclination angle according to the first angle and a second angle between the camera and the target device.
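
The two alternatives above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the world frame is assumed to have its z axis pointing up, each body frame is assumed to look along its +x axis, the inclination angle is taken to be the downward pitch of the camera's viewing axis, and the second alternative is assumed to combine the two angles by simple addition. The helper names `rot_y`, `matmul`, `tilt_from_pose`, and `tilt_from_angles` are hypothetical.

```python
import math


def rot_y(deg):
    """Rotation about the y axis; a positive angle pitches the +x axis downward."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def tilt_from_pose(r_world_imu, r_imu_cam):
    """First alternative: IMU pose combined with camera extrinsics (assumed)."""
    r_world_cam = matmul(r_world_imu, r_imu_cam)
    # The camera's +x viewing axis expressed in the world frame is the first
    # column of r_world_cam; its z component gives the downward pitch.
    z = max(-1.0, min(1.0, r_world_cam[2][0]))
    return math.degrees(math.asin(-z))


def tilt_from_angles(first_angle_deg, second_angle_deg):
    """Second alternative: device angle plus mounting angle (assumed additive)."""
    return first_angle_deg + second_angle_deg


# Hypothetical values: device pitched down 20 degrees, camera mounted 10 degrees lower.
tilt_a = tilt_from_pose(rot_y(20.0), rot_y(10.0))
tilt_b = tilt_from_angles(20.0, 10.0)
```

Under these assumptions both routes agree for rotations about a single axis; for general three-dimensional poses the matrix route is the one that remains valid.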

According to one or more embodiments of the present disclosure, in the camera pose determining method provided by the present disclosure, the specified direction is a vertical direction, and the determining, according to the field angle and the tilt angle of the camera, the object region from the target image includes: determining a target angle according to half of the field angle and the inclination angle; and determining the object area according to the ratio between the target angle and the field angle. Optionally, a proportion of the object region in the target image is determined according to a ratio between the target angle and the field angle, and the object region is determined from the target image according to the proportion.
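
The passage above does not state how half of the field angle and the inclination angle are combined. The sketch below assumes one possible reading: the target angle is their sum, clamped to the range from zero to the full field angle, so that the proportion grows with the inclination angle as required. The function name `object_region_proportion` and the sample angles are hypothetical.

```python
def object_region_proportion(fov_deg, tilt_deg):
    """Proportion of the image assigned to the object region.

    Assumption: target angle = fov / 2 + tilt, clamped to [0, fov].
    """
    if fov_deg <= 0:
        raise ValueError("field angle must be positive")
    target_angle = fov_deg / 2.0 + tilt_deg
    target_angle = max(0.0, min(fov_deg, target_angle))
    return target_angle / fov_deg


# Hypothetical 90-degree vertical field angle at several inclination angles.
proportions = [object_region_proportion(90.0, t) for t in (-60.0, 0.0, 15.0, 60.0)]
```

The clamp keeps the proportion between 0 and 1, so the region never exceeds the image regardless of how far the camera is tilted.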

According to one or more embodiments of the present disclosure, in the method for determining a pose of a camera provided by the present disclosure, the target device is a head-mounted device, and the determining an object region from the target image according to the field angle of the camera and the inclination angle includes: determining a proportion of the object region in the target image according to the field angle of the camera and the inclination angle; determining a height of the object region according to the proportion and the height of the target image; and determining, from a lower region of the target image, an object region matching the height of the object region.

According to one or more embodiments of the present disclosure, in the camera pose determination method provided by the present disclosure, the object region is a rectangular region, and the determining, from a lower region of the target image, an object region matching the height of the object region includes: taking the width of the target image as the width of the object region, and determining the rectangular region from the lower region of the target image based on the width of the target image and the height of the object region.
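
A full-width rectangular object region at the bottom of the image can be sketched as follows. The function name `bottom_rectangle`, the `(x0, y0, x1, y1)` pixel convention with the origin at the top-left corner, and rounding the height to the nearest pixel are assumptions made for this example.

```python
def bottom_rectangle(image_width, image_height, proportion):
    """Return (x0, y0, x1, y1) of a full-width strip at the bottom of the image.

    proportion: fraction of the image height occupied by the object region.
    """
    proportion = max(0.0, min(1.0, proportion))
    region_height = int(round(proportion * image_height))
    top = image_height - region_height
    return (0, top, image_width, image_height)


# Hypothetical 640x480 image with a quarter of its height treated as the object region.
region = bottom_rectangle(640, 480, 0.25)
```

With these sample values the region height is 120 pixels, so the strip spans rows 360 to 480 across the full image width.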

According to one or more embodiments of the present disclosure, in the camera pose determination method provided by the present disclosure, the object region is a rectangular region or a trapezoidal region, and the determining, from a lower region of the target image, an object region matching the height of the object region includes: determining a width of the object region according to the height of the object region and a preset first aspect ratio coefficient, and determining the rectangular region from the lower region of the target image according to the width of the object region and the height of the object region; or determining an upper base and a lower base of the object region according to the height of the object region and a preset second aspect ratio coefficient, and determining the trapezoidal region from the lower region of the target image according to the width of the object region, the upper base, and the lower base.
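
The rectangular and trapezoidal variants can be sketched as follows. Several details are assumptions, since the passage does not fix them: the region is centered horizontally, its bottom edge coincides with the bottom edge of the image, each length is obtained by multiplying the region height by a coefficient, and the second aspect ratio coefficient is modeled as a pair (one ratio for the upper base, one for the lower base). The names `centered_rectangle` and `bottom_trapezoid` are hypothetical.

```python
def centered_rectangle(image_width, image_height, region_height, first_coeff):
    """Rectangle (x0, y0, x1, y1) whose width is region_height * first_coeff."""
    width = min(float(image_width), region_height * first_coeff)
    left = (image_width - width) / 2.0
    top = image_height - region_height
    return (left, top, left + width, image_height)


def bottom_trapezoid(image_width, image_height, region_height, second_coeff):
    """Trapezoid vertices, clockwise from the top-left corner.

    second_coeff: (upper_ratio, lower_ratio), an assumed pair of ratios.
    """
    upper_ratio, lower_ratio = second_coeff
    upper = min(float(image_width), region_height * upper_ratio)
    lower = min(float(image_width), region_height * lower_ratio)
    top = image_height - region_height
    cx = image_width / 2.0
    return [
        (cx - upper / 2.0, top),
        (cx + upper / 2.0, top),
        (cx + lower / 2.0, image_height),
        (cx - lower / 2.0, image_height),
    ]


# Hypothetical 640x480 image with an object region 120 pixels high.
rect = centered_rectangle(640, 480, 120, 4.0)
trap = bottom_trapezoid(640, 480, 120, (2.0, 4.0))
```

A trapezoid that is narrower at the top than at the bottom leaves more of the image available for feature extraction than a rectangle of the same height.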

According to one or more embodiments of the present disclosure, in the camera pose determination method provided by the present disclosure, after determining the object region from the target image, the method further includes: processing the target image according to a pre-trained object detection model, and determining a candidate region corresponding to the object region from the target image; and correcting the object area according to the candidate area.
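
The passage does not specify how the candidate region is used to correct the object region. One possible rule, shown purely as an assumption, is to adopt the candidate region from the detection model when it overlaps the geometrically determined region sufficiently, and otherwise to keep the original region. The names `iou` and `correct_region`, the intersection-over-union criterion, and the 0.3 threshold are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) rectangles."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def correct_region(region, candidate, min_iou=0.3):
    """Adopt the candidate only when it plausibly covers the same object."""
    if candidate is None or iou(region, candidate) < min_iou:
        return region
    return candidate


# Hypothetical regions in a 640x480 image.
imu_region = (0, 360, 640, 480)
detected = (100, 340, 540, 480)
unrelated = (0, 0, 50, 50)
corrected = correct_region(imu_region, detected)
unchanged = correct_region(imu_region, unrelated)
```

The overlap check guards against replacing the object region with a detection that belongs to a different part of the image.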

According to one or more embodiments of the present disclosure, the present disclosure provides a camera pose determination apparatus including: the acquisition module is used for acquiring a target image acquired by the camera; the area determining module is used for determining an object area in the target image; an extraction module for extracting a plurality of feature points based on a non-object region in the target image; and the pose determining module is used for determining the pose of the camera based on the plurality of characteristic points.

According to one or more embodiments of the present disclosure, in the camera pose determining apparatus provided by the present disclosure, the region determining module is specifically configured to: process the target image according to a pre-trained object detection model to generate the object region in the target image.

According to one or more embodiments of the present disclosure, in the camera pose determining apparatus provided by the present disclosure, the camera pose determining apparatus is applied to a target device, an inertial measurement unit and the camera are disposed on the target device, and an area determining module includes: the inclination angle acquisition unit is used for acquiring the pose of the inertial measurement unit and generating an inclination angle of the camera in a specified direction according to the pose of the inertial measurement unit; and an object region determining unit configured to determine an object region from the target image according to the angle of view of the camera and the tilt angle, wherein a size of the object region increases as the tilt angle increases.

According to one or more embodiments of the present disclosure, in the camera pose determining apparatus provided by the present disclosure, the inclination angle acquisition unit is specifically configured to: generate the inclination angle of the camera in the specified direction according to the pose of the inertial measurement unit and the parameters of the camera; or determine a first angle between the target device and the specified direction according to the pose of the inertial measurement unit, and determine the inclination angle according to the first angle and a second angle between the camera and the target device.

According to one or more embodiments of the present disclosure, in the camera pose determining apparatus provided by the present disclosure, the specified direction is a vertical direction, and the object region determining unit is specifically configured to: determining a target angle according to half of the field angle and the inclination angle; and determining the object area according to the ratio between the target angle and the field angle.

According to one or more embodiments of the present disclosure, in the camera pose determination apparatus provided by the present disclosure, the object region determination unit is specifically configured to: determine a proportion of the object region in the target image according to the field angle of the camera and the inclination angle; determine a height of the object region according to the proportion and the height of the target image; and determine, from a lower region of the target image, an object region matching the height of the object region.

According to one or more embodiments of the present disclosure, in the camera pose determining apparatus provided by the present disclosure, the object region is a rectangular region, and the object region determining unit is specifically configured to: take the width of the target image as the width of the object region, and determine the rectangular region from a lower region of the target image based on the width of the target image and the height of the object region.

According to one or more embodiments of the present disclosure, in the camera pose determining apparatus provided by the present disclosure, the object region is a rectangular region or a trapezoidal region, and the object region determining unit is specifically configured to: determine a width of the object region according to the height of the object region and a preset first aspect ratio coefficient, and determine the rectangular region from a lower region of the target image according to the width of the object region and the height of the object region; or determine an upper base and a lower base of the object region according to the height of the object region and a preset second aspect ratio coefficient, and determine the trapezoidal region from the lower region of the target image according to the width of the object region, the upper base, and the lower base.

According to one or more embodiments of the present disclosure, in the camera pose determining apparatus provided by the present disclosure, further includes: the correction module is used for processing the target image according to a pre-trained object detection model and determining a candidate region corresponding to the object region from the target image; and correcting the object area according to the candidate area.

According to one or more embodiments of the present disclosure, the present disclosure provides an electronic device comprising: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement any of the camera pose determination methods provided in the present disclosure.

According to one or more embodiments of the present disclosure, the present disclosure provides a computer-readable storage medium storing a computer program for performing any one of the camera pose determination methods as provided by the present disclosure.

It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "including" and variations thereof as used herein are intended to be open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.

It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units. The references to "a" and "an" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.

The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.

Claims

1. A camera pose determination method, comprising:

acquiring a target image acquired by a camera;

determining an object region in the target image;

extracting a plurality of feature points based on a non-object region in the target image;

determining a pose of the camera based on the plurality of feature points.

2. The method of claim 1, applied to a target device on which an inertial measurement unit and the camera are disposed, the determining the object region in the target image comprising:

acquiring the pose of the inertial measurement unit, and generating an inclination angle of the camera in a specified direction according to the pose of the inertial measurement unit;

and determining an object region from the target image according to the field angle of the camera and the inclination angle, wherein the size of the object region increases with the increase of the inclination angle.

3. The method of claim 2, wherein the generating an inclination angle of the camera in a specified direction from the pose of the inertial measurement unit comprises:

generating an inclination angle of the camera in a specified direction according to the pose of the inertial measurement unit and the parameters of the camera; or,

determining a first angle between the target device and the specified direction according to the pose of the inertial measurement unit;

determining the inclination angle according to the first angle and a second angle between the camera and the target device.

4. The method of claim 2, wherein the specified direction is a vertical direction, the determining the object region from the target image based on the field angle and the tilt angle of the camera comprises:

determining a target angle according to half of the field angle and the inclination angle;

and determining the object area according to the ratio between the target angle and the field angle.

5. The method of claim 2, wherein the target device is a head-mounted device, the determining the object region from the target image based on the field angle and the tilt angle of the camera comprising:

determining the proportion of the object area in the target image according to the field angle and the inclination angle of the camera;

determining the height of the object region according to the proportion and the height of the target image;

an object region matching a height of the object region is determined from a lower region of the target image.

6. The method of claim 5, wherein the object region is a rectangular region, the determining an object region from a lower region of the target image that matches a height of the object region comprising:

taking the width of the target image as the width of the object region, and determining the rectangular region from a lower region of the target image based on the width of the target image and the height of the object region.

7. The method of claim 5, wherein the object region is a rectangular region or a trapezoidal region, the determining an object region from a lower region of the target image that matches a height of the object region comprising:

determining the width of the object region according to the height of the object region and a preset first aspect ratio coefficient;

determining the rectangular region from a lower region of the target image according to the width of the object region and the height of the object region; or,

determining an upper base and a lower base of the object region according to the height of the object region and a preset second aspect ratio coefficient;

determining the trapezoidal region from a lower region of the target image according to the width of the object region, the upper base, and the lower base of the object region.

8. The method of claim 2, further comprising, after determining an object region from the target image:

processing the target image according to a pre-trained object detection model, and determining a candidate region corresponding to the object region from the target image;

and correcting the object area according to the candidate area.

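9. A camera pose determination apparatus, characterized by comprising:

an acquisition module, configured to acquire a target image acquired by a camera;

a region determining module, configured to determine an object region in the target image;

an extraction module, configured to extract a plurality of feature points based on a non-object region in the target image;

a pose determining module, configured to determine a pose of the camera based on the plurality of feature points.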

10. An electronic device, the electronic device comprising:

a processor;

a memory for storing the processor-executable instructions;

the processor is configured to read the executable instructions from the memory and execute the instructions to implement the camera pose determination method according to any of the preceding claims 1-8.

11. A computer readable storage medium, characterized in that the storage medium stores a computer program for executing the camera pose determination method according to any of the preceding claims 1-8.

12. A computer program product, characterized in that the computer program product comprises a computer program/instruction which, when executed by a processor, implements the camera pose determination method according to any of claims 1-8.