CN113409385B - Characteristic image identification and positioning method and system - Google Patents

Characteristic image identification and positioning method and system

Info

Publication number
CN113409385B
Authority
CN
China
Prior art keywords
feature
image
positioning
preset number
feature image
Prior art date
Legal status
Active
Application number
CN202010181937.9A
Other languages
Chinese (zh)
Other versions
CN113409385A (en)
Inventor
罗勇
施波迪
黄之昊
Current Assignee
Shanghai Bilibili Technology Co Ltd
Original Assignee
Shanghai Bilibili Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Bilibili Technology Co Ltd
Priority to CN202010181937.9A
Publication of CN113409385A
Application granted
Publication of CN113409385B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The application discloses a characteristic image identification and positioning method, which comprises the following steps: scanning a picture shot by a camera and identifying a characteristic image in the picture; selecting a preset number of feature points in the area where the characteristic image is located in the picture; obtaining the three-dimensional coordinates of the preset number of feature points through the hitTest method; calculating the distances between the preset number of feature points and the camera according to the three-dimensional coordinates, and judging from the calculated distances whether the positioning of the characteristic image is valid; and when the positioning is judged to be valid, creating an anchor point from the preset number of feature points and taking the coordinates of the anchor point as the three-dimensional coordinates of the characteristic image. The application also discloses a characteristic image recognition and positioning system, an electronic device, and a computer-readable storage medium. The spatial position of the characteristic image can thus be obtained quickly and accurately in the enhanced image function.

Description

Characteristic image identification and positioning method and system
Technical Field
The present application relates to the field of augmented reality technologies, and in particular, to a method, a system, an electronic device, and a computer-readable storage medium for recognizing and positioning a feature image.
Background
The ARCore platform of the Android system and the ARKit platform of the iOS system both provide an enhanced image (Augmented Images) function: a feature image is scanned, the AR (Augmented Reality) camera recognizes it, and the spatial position (coordinate information) and the length and width of the feature image in the AR camera's spatial coordinate system are calculated. Once the spatial position of the feature image is obtained, a picture, a video, or a 3D model can be displayed at that position.
Implementing the enhanced image function involves two parts: first, picture comparison and identification, to determine whether a prefabricated feature image appears in the picture shot by the AR camera; and second, calculating the three-dimensional coordinate position of the feature image in the AR camera's spatial coordinate system.
However, the enhanced image function of the existing ARCore and ARKit platforms has certain defects. Taking the ARCore platform as an example, calculating the spatial position of the feature image takes a long time, and the larger the target image (the feature image), the longer the calculation. In addition, the openness of the Android system leads to uneven hardware, so the ARCore platform may calculate the spatial position of the feature image slowly, and the result may be inaccurate.
It should be noted that the above-mentioned contents are not intended to limit the scope of the claims.
Disclosure of Invention
The present application mainly aims to provide a method, a system, an electronic device, and a computer-readable storage medium for identifying and positioning feature images, so as to solve the problem of how to quickly and accurately obtain the spatial position of a feature image in the enhanced image function.
In order to achieve the above object, an embodiment of the present application provides a feature image identification and location method, where the method includes:
scanning and identifying a characteristic image in a picture through the picture shot by a camera;
selecting a preset number of feature points in the area where the feature image is located in the picture;
respectively obtaining the three-dimensional coordinates of the preset number of feature points by a hitTest method;
respectively calculating the distances between the preset number of feature points and the camera according to the three-dimensional coordinates, and judging whether the feature image is positioned effectively or not according to the calculated distances; and
when the positioning of the characteristic image is judged to be valid according to the calculated distance, creating anchor points according to the preset number of characteristic points, and taking the coordinates of the anchor points as the three-dimensional coordinates of the characteristic image.
Optionally, the method further comprises:
when the positioning of the characteristic image is judged to be invalid according to the calculated distance, prompting a user to adjust the picture.
Optionally, the preset number is five.
Optionally, the selecting a preset number of feature points in the area where the feature image is located in the picture includes:
acquiring a rectangular frame of the characteristic image in the picture through a target detection algorithm of a TensorFlow Lite technology;
selecting five feature points in the range of the rectangular frame according to a preset rule, wherein the preset rule is as follows:
taking the upper left corner of the picture as the origin of the coordinate system, the coordinates of the upper left corner of the rectangular frame are (a, b) and the coordinates of the lower right corner are (c, d); the coordinates of the five selected feature points are then ((a + c) × 0.5, (b + d) × 0.5), ((a + c) × 0.5, (b + d) × 0.375), ((a + c) × 0.5, (b + d) × 0.625), ((a + c) × 0.375, (b + d) × 0.5), and ((a + c) × 0.675, (b + d) × 0.5).
Optionally, the determining whether the positioning of the feature image is valid according to the calculated distance includes:
calculating a mathematical expected value according to the obtained distance, wherein the mathematical expected value is an average value of the distances between the preset number of feature points and the camera;
calculating a standard deviation of the obtained distance and the mathematical expected value;
judging whether the standard deviation exceeds a preset threshold value or not so as to determine whether the positioning of the feature images through the preset number of feature points is effective or not, wherein when the value of the standard deviation is larger than the threshold value, the positioning of the feature images through the preset number of feature points is determined to be ineffective; when the value of the standard deviation is less than or equal to the threshold, determining that the positioning of the feature image by the preset number of feature points is valid.
Optionally, the anchor point is a middle point of the preset number of feature points.
Optionally, after scanning the picture shot by the camera and identifying the feature image, the method further includes:
prompting a user to aim the camera at the feature image and to ensure that the feature image fills at least 50% of the picture.
In addition, to achieve the above object, an embodiment of the present application further provides a feature image recognition and positioning system, where the system includes:
the scanning module is used for scanning and identifying the characteristic image through a picture which is shot by the camera and contains the characteristic image;
the selection module is used for selecting a preset number of feature points in the area where the feature image is located in the picture;
the test module is used for respectively obtaining the three-dimensional coordinates of the preset number of feature points by a hitTest method;
the judging module is used for respectively calculating the distances between the preset number of feature points and the camera according to the three-dimensional coordinates and judging whether the feature image is positioned effectively or not according to the calculated distances;
and the positioning module is used for creating an anchor point according to the preset number of feature points and taking the coordinates of the anchor point as the three-dimensional coordinates of the feature image when the feature image is judged to be effectively positioned according to the calculated distance.
In order to achieve the above object, an embodiment of the present application further provides an electronic device, including: the system comprises a memory, a processor and a characteristic image identification and positioning program which is stored on the memory and can run on the processor, wherein the characteristic image identification and positioning program realizes the characteristic image identification and positioning method when being executed by the processor.
In order to achieve the above object, an embodiment of the present application further provides a computer-readable storage medium, where a feature image recognition and positioning program is stored on the computer-readable storage medium, and when executed by a processor, the feature image recognition and positioning program implements the feature image recognition and positioning method as described above.
The characteristic image identification and positioning method, system, electronic device, and computer-readable storage medium provided by the embodiments of the application select a preset number of feature points in the area where the characteristic image is located, so that the spatial position of the characteristic image can be obtained quickly and accurately in the enhanced image function, thereby improving the enhanced image functions of the ARCore and ARKit platforms.
Drawings
FIG. 1 is a diagram of an application environment architecture in which various embodiments of the present application may be implemented;
fig. 2 is a flowchart of a feature image recognition and positioning method according to a first embodiment of the present application;
FIG. 3 is a detailed flowchart of step S206 in FIG. 2;
fig. 4 is a flowchart of a feature image recognition and positioning method according to a second embodiment of the present application;
fig. 5 is a schematic hardware architecture diagram of an electronic device according to a third embodiment of the present application;
fig. 6 is a schematic block diagram of a feature image recognition and localization system according to a fourth embodiment of the present application;
fig. 7 is a schematic block diagram of a feature image recognition and localization system according to a fifth embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely illustrate the present application and do not limit it. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort fall within the protection scope of the present application.
It should be noted that the descriptions relating to "first", "second", etc. in the embodiments of the present application are only for descriptive purposes and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, provided that a person skilled in the art can realize the combination; when a combination is contradictory or cannot be realized, it should be considered not to exist, and it falls outside the protection scope of the present application.
Referring to fig. 1, fig. 1 is a diagram illustrating the architecture of an application environment for implementing various embodiments of the present application. The present application is applicable to an application environment including, but not limited to, the terminal device 1 and the characteristic image 2. The terminal device 1 comprises a camera 3, a characteristic image identification and positioning device 4, and the like. The terminal device 1 can be an electronic terminal with AR technology, such as a mobile phone, a tablet computer, or a wearable device, and is particularly suitable for a smartphone with the enhanced image function of the ARCore platform or the ARKit platform.
AR technology fuses virtual information with the real world. It draws on multiple technical means such as multimedia, three-dimensional modeling, real-time tracking and registration, intelligent interaction, and sensing: computer-generated virtual information such as text, images, three-dimensional models, music, and video is simulated and then applied to the real world, where the two kinds of information complement each other, thereby enhancing the real world. The ARKit platform is an augmented reality development platform from Apple Inc. that developers can use to create augmented reality applications. The ARCore platform, from Google, is a similar software platform for building augmented reality applications; it uses advances in cloud software and device hardware to place digital objects into the real world.
The enhanced image function is a function of the ARCore platform and the ARKit platform that can identify a specific image (the feature image 2) and locate its coordinates, size, and orientation in the AR camera coordinate system. A developer may build an AR application that responds to that particular image. The feature image 2 is a specific object image in the real world; it is a two-dimensional plane image and can be a portrait standee at an offline exhibition, a picture on a computer screen, a product package, or a movie poster. The camera 3 is used for shooting a picture containing the feature image 2. The feature image identification and positioning device 4 is used for identifying the feature image 2 from the picture shot by the camera 3 and for positioning the spatial coordinates of the feature image 2. The feature image identification and positioning device 4 may be a hardware device in the terminal device 1 or a software system running in the terminal device 1.
With the enhanced image function, the user may trigger an AR experience by aiming the camera 3 of the terminal device 1 at the feature image 2. For example, when the user aims the camera 3 at a movie poster, a character in the poster may pop up and a scene may then be presented.
It is understood that other components for implementing basic functions, such as a display device, may also be included in the terminal device 1, and are not described herein again.
It should be noted that the feature image recognition and positioning device 4 may be separately present in the terminal device 1 from the camera 3, may be integrated with the camera 3 (for example, an AR camera), or may be in other combination forms, which is not limited herein.
Example one
Fig. 2 is a flowchart of a feature image recognition and positioning method according to a first embodiment of the present application. It is to be understood that the flow charts in the embodiments of the present method are not intended to limit the order in which the steps are performed.
The method comprises the following steps:
and S200, identifying the characteristic image 2 by scanning a picture shot by the camera 3 and containing the characteristic image 2.
Start the camera 3, control the camera 3 to shoot a picture containing the feature image 2, and scan and identify the feature image 2 in the picture. At this step the feature image 2 has been recognized but its spatial position information has not yet been acquired, that is, the tracking state is paused.
It should be noted that if the area filled by the feature image 2 in the frame is too small, some of the subsequently selected feature points may fall outside the range of the feature image 2, which results in a large error in the final positioning result. Therefore, to ensure the positioning effect, this step prompts the user to aim the camera 3 at the feature image 2 and to ensure that the feature image 2 fills at least 50% of the frame.
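As an illustrative sketch only (the helper name and Box type are assumptions, not anything defined in this application), the 50% fill check can be computed directly from the detected rectangular frame and the frame size:

```kotlin
// Hypothetical helper: check whether the feature image fills at least half of the
// picture, using the rectangular frame produced by the object-detection step below.
data class Box(val left: Float, val top: Float, val right: Float, val bottom: Float)

fun fillsEnoughOfFrame(box: Box, frameWidth: Float, frameHeight: Float): Boolean {
    val boxArea = (box.right - box.left) * (box.bottom - box.top)
    return boxArea / (frameWidth * frameHeight) >= 0.5f // 50% threshold from this step
}
```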
S202, selecting a preset number of feature points in the area where the feature image 2 is located in the picture.
The core reason the existing enhanced image function is slow at identifying and positioning the feature image 2 is that it automatically identifies and tracks too many feature points. In practical application scenes, the feature image 2 is generally a two-dimensional plane image, such as a portrait standee at an offline exhibition. The user shoots the feature image 2 with the camera 3 (near the feature image 2); once the enhanced image function has identified and positioned information such as the spatial position of the feature image 2, AR content can be displayed at that position, for example a portrait popping out of the standee and performing some actions through a 3D animation. The essence of the enhanced image function is therefore to identify a close-range planar object (e.g. a portrait standee close to the camera 3), so a large number of feature points is not required. To increase the speed of calculating the spatial coordinates of the feature image 2, only a few feature points need to be selected for tracking. In this embodiment, the preset number is five. The more feature points, the more accurate the calculated spatial coordinates of the feature image 2, at the cost of longer execution time. The number five was obtained through multiple comparison tests and balances accuracy against efficiency.
The specific process of selecting the five characteristic points is as follows:
First, the rectangular frame of the feature image 2 in the picture (or screen) is obtained by the Object Detection algorithm of the TensorFlow Lite technology. Then, five feature points are selected within the range of the rectangular frame according to a preset rule. The preset rule is as follows:
Assuming that the upper left corner of the picture (or screen) is the origin of the coordinate system, the coordinates of the upper left corner of the rectangular frame are (a, b), and the coordinates of the lower right corner are (c, d). The five selected feature point coordinates are ((a + c) × 0.5, (b + d) × 0.5), ((a + c) × 0.5, (b + d) × 0.375), ((a + c) × 0.5, (b + d) × 0.625), ((a + c) × 0.375, (b + d) × 0.5) and ((a + c) × 0.675, (b + d) × 0.5).
The five feature points selected in this manner lie in different directions within the region where the feature image 2 is located; while keeping the number of feature points as small as possible (five), they can still position the feature image 2 with sufficient accuracy.
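A minimal code sketch of the preset rule follows; the function name is an assumption, and the coordinates match the description above, including the 0.675 factor for the rightmost point:

```kotlin
// Sketch of the preset rule, with the upper-left corner of the picture as the origin.
// (a, b) is the rectangle's upper-left corner and (c, d) its lower-right corner.
fun selectFeaturePoints(a: Float, b: Float, c: Float, d: Float): List<Pair<Float, Float>> =
    listOf(
        (a + c) * 0.5f   to (b + d) * 0.5f,    // center
        (a + c) * 0.5f   to (b + d) * 0.375f,  // above the center
        (a + c) * 0.5f   to (b + d) * 0.625f,  // below the center
        (a + c) * 0.375f to (b + d) * 0.5f,    // left of the center
        (a + c) * 0.675f to (b + d) * 0.5f     // right of the center
    )
```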
TensorFlow Lite is a neural-network computing framework from Google for mobile and embedded devices; it is a lightweight solution for deploying deep learning on mobile platforms, supports on-device machine learning inference, and has low latency and a small binary size. It supports floating-point operation and quantized models, is optimized for mobile platforms, and can be used to create and run custom models; developers can also add custom operations to a model. The Object Detection algorithm is a computer vision technique for detecting objects such as cars, buildings, and humans, usually in images or video. The algorithm locates the presence of an object in the image and draws a bounding box around it.
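For the detection step itself, a hedged sketch using the TensorFlow Lite Task Library's ObjectDetector is shown below; the model file name is a placeholder, and this particular API is one possible way to realize the Object Detection algorithm rather than anything prescribed by this application:

```kotlin
import android.content.Context
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Hypothetical sketch: obtain the rectangular frame of the feature image with the
// TensorFlow Lite Task Library. "feature_image_detector.tflite" is a placeholder
// for a model trained to detect the prefabricated feature image.
fun detectFeatureImageBox(context: Context, image: TensorImage): android.graphics.RectF? {
    val detector = ObjectDetector.createFromFile(context, "feature_image_detector.tflite")
    val detections = detector.detect(image)
    return detections.firstOrNull()?.boundingBox // pixel coordinates: left, top, right, bottom
}
```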
Of course, in other embodiments, other suitable numbers of feature points may be selected, or the coordinates of the feature points may be determined according to other rules.
And S204, respectively obtaining the three-dimensional coordinates of the preset number of feature points by a hitTest (click test) method.
The hitTest is a functional interface of the ARCore platform and the ARKit platform for finding the real-world point corresponding to a point in the picture: when the user inputs any point (x, y) on the screen, for example by tapping it, the output is the three-dimensional coordinates of the corresponding point in the real world. In this embodiment, after the five feature points are selected, their three-dimensional coordinates in the AR camera coordinate system (with the position of the camera 3 as the origin) can be obtained by the hitTest method, denoted P1 to P5.
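A minimal ARCore-flavored sketch of this step is given below (ARKit offers an analogous hit-test interface); returning null when any point yields no hit is a simplification, and the function name is an assumption:

```kotlin
import com.google.ar.core.Frame
import com.google.ar.core.Pose

// Sketch: project each selected 2D screen point into 3D via hitTest and keep the
// first hit's pose. Returns null when any point yields no hit, which a real
// application must handle (e.g. by prompting the user to adjust the picture).
fun hitTestPoints(frame: Frame, points: List<Pair<Float, Float>>): List<Pose>? {
    val poses = points.map { (x, y) ->
        frame.hitTest(x, y).firstOrNull()?.hitPose ?: return null
    }
    return poses // P1 to P5 in the notation of this embodiment
}
```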
And S206, calculating the distance between the preset number of feature points and the camera 3 according to the three-dimensional coordinates, and judging whether the positioning of the feature image 2 is effective or not according to the calculated distance.
From the three-dimensional coordinates of the five feature points, the distance between each feature point and the camera 3 (the AR camera) can be calculated. If the differences between these distances are small, the five feature points lie on the same two-dimensional plane, i.e. the plane where the feature image 2 is located has been found, and positioning the feature image 2 by the preset number of feature points is valid.
Further referring to fig. 3, a detailed flow chart of the step S206 is shown. In this embodiment, the step S206 specifically includes:
s2060, calculating the distances between the preset number of feature points and the camera 3 according to the three-dimensional coordinates.
Assuming that the preset number is n (n a positive integer), n distance values are obtained, denoted x_1, x_2, …, x_n. For example, the distance between each of the five feature points and the camera 3 is calculated from its three-dimensional coordinates, giving five distance values x_1, x_2, x_3, x_4, x_5.
S2062, a mathematical expected value is calculated from the obtained distances.
In this embodiment, the mathematical expected value is the average of the n distance values:

x̄ = (x_1 + x_2 + … + x_n) / n

For example, the average of the five distance values is

x̄ = (x_1 + x_2 + x_3 + x_4 + x_5) / 5
S2064, calculating the standard deviation of the obtained distances from the mathematical expected value.
The standard deviation is the square root of the arithmetic mean of the squared deviations between each distance value x_i (i = 1, 2, …, n) calculated in the above steps and the mathematical expected value x̄:

σ = √( ((x_1 − x̄)² + (x_2 − x̄)² + … + (x_n − x̄)²) / n )
S2066, determining whether the standard deviation exceeds a preset threshold, thereby determining whether the positioning of the feature image 2 by the preset number of feature points is valid.
The standard deviation measures how much the values fluctuate: the larger the standard deviation, the larger the differences between the distance values. When the standard deviation exceeds (is greater than) the threshold, positioning the feature image 2 by the feature points is invalid (the result is inaccurate). When it does not exceed (is less than or equal to) the threshold, the positioning is valid. In the present embodiment, the threshold is set to 10%.
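A compact sketch of this validity test follows; interpreting the 10% threshold as the standard deviation relative to the mean is an assumption, since the description does not spell out the normalization:

```kotlin
import kotlin.math.sqrt

// Sketch of the validity test: distances x_1..x_n from the feature points to the
// camera, their mean, the population standard deviation, and a threshold check.
fun isLocalizationValid(distances: List<Float>, threshold: Float = 0.10f): Boolean {
    val mean = distances.sum() / distances.size
    val variance = distances.sumOf { ((it - mean) * (it - mean)).toDouble() } / distances.size
    val sigma = sqrt(variance).toFloat()
    return sigma / mean <= threshold // valid when sigma does not exceed the threshold
}
```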
Returning to fig. 2, in step S208, when it is determined that the positioning of the feature image 2 is valid according to the calculated distance, an anchor point is created according to the preset number of feature points, and the coordinates of the anchor point are used as the three-dimensional coordinates of the feature image 2.
If the positioning of the feature image 2 by the feature points is determined to be valid, the middle point of the preset number of feature points is used as the anchor point, and the three-dimensional coordinates of the anchor point are used as the three-dimensional coordinates of the feature image 2; functional display can then be performed according to the position of the feature image 2.
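A minimal ARCore-flavored sketch follows, assuming the middle point is the arithmetic mean of the feature-point positions and ignoring rotation (translation only):

```kotlin
import com.google.ar.core.Anchor
import com.google.ar.core.Pose
import com.google.ar.core.Session

// Sketch: create an anchor at the midpoint of the feature-point poses and use its
// coordinates as the three-dimensional coordinates of the feature image.
fun createAnchorAtMidpoint(session: Session, poses: List<Pose>): Anchor {
    val x = poses.map { it.tx() }.average().toFloat()
    val y = poses.map { it.ty() }.average().toFloat()
    val z = poses.map { it.tz() }.average().toFloat()
    return session.createAnchor(Pose.makeTranslation(x, y, z))
}
```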
The feature image identification and positioning method provided by this embodiment selects five feature points in the area where the feature image 2 is located, which increases the speed of identifying and positioning the feature image 2 and improves the enhanced image functions of the ARCore and ARKit platforms. Experiments show that the method significantly shortens the time for calculating the spatial position of the feature image 2; for example, the response time for identifying and positioning a standee can be reduced from 6 seconds to 3 seconds.
Example two
Fig. 4 is a flowchart of a feature image recognition and positioning method according to a second embodiment of the present application. In the second embodiment, the method for identifying and locating a feature image further includes step S410 on the basis of the first embodiment. It is to be understood that the flow charts in the embodiments of the present method are not intended to limit the order in which the steps are performed.
The method comprises the following steps:
and S400, identifying the characteristic image 2 through scanning of a picture containing the characteristic image 2 shot by the camera 3.
Start the camera 3, control the camera 3 to shoot a picture containing the feature image 2, and scan and identify the feature image 2 in the picture. At this step the feature image 2 has been recognized but its spatial position information has not yet been acquired, that is, the tracking state is paused.
It should be noted that if the area filled by the feature image 2 in the frame is too small, some of the subsequently selected feature points may fall outside the range of the feature image 2, which results in a large error in the final positioning result. Therefore, to ensure the positioning effect, this step prompts the user to aim the camera 3 at the feature image 2 and to ensure that the feature image 2 fills at least 50% of the frame.
S402, selecting a preset number of feature points in the area of the feature image 2 in the picture.
The core reason the existing enhanced image function is slow at identifying and positioning the feature image 2 is that it automatically identifies and tracks too many feature points. In practical application scenes, the feature image 2 is generally a two-dimensional plane image, such as a portrait standee at an offline exhibition. The user shoots the feature image 2 with the camera 3 (near the feature image 2); once the enhanced image function has identified and positioned information such as the spatial position of the feature image 2, AR content can be displayed at that position, for example a portrait popping out of the standee and performing some actions through a 3D animation. The essence of the enhanced image function is therefore to identify a close-range planar object (e.g. a portrait standee close to the camera 3), so a large number of feature points is not required. To increase the speed of calculating the spatial coordinates of the feature image 2, only a few feature points need to be selected for tracking. In this embodiment, the preset number is five. The more feature points, the more accurate the calculated spatial coordinates of the feature image 2, at the cost of longer execution time. The number five was obtained through multiple comparison tests and balances accuracy against efficiency.
The specific process of selecting the five characteristic points is as follows:
First, the rectangular frame of the feature image 2 in the picture (or screen) is obtained by the Object Detection algorithm of the TensorFlow Lite technology. Then, five feature points are selected within the range of the rectangular frame according to a preset rule. The preset rule is as follows:
assuming that the upper left corner of the picture (or screen) is the origin of the coordinate system, the coordinates of the upper left corner of the rectangular frame are (a, b), and the coordinates of the lower right corner of the rectangular frame are (c, d). The five feature point coordinates selected are ((a + c) × 0.5, (b + d) × 0.5), ((a + c) × 0.5, (b + d) × 0.375), ((a + c) × 0.5, (b + d) × 0.625), ((a + c) × 0.375, (b + d) × 0.5) and ((a + c) × 0.675, (b + d) × 0.5).
Of course, in other embodiments, other suitable numbers of feature points may be selected, or the coordinates of the feature points may be determined according to other rules.
And S404, obtaining the three-dimensional coordinates of the characteristic points by a hitTest method.
In this embodiment, after the five feature points are selected, three-dimensional coordinates of the five feature points in an AR camera coordinate system (with the position of the camera 3 as the origin of the coordinate system) can be obtained by a hitTest method, and are respectively denoted as P1 to P5.
And S406, calculating the distance between the preset number of feature points and the camera 3 according to the three-dimensional coordinates, and judging whether the positioning of the feature image 2 is effective or not according to the calculated distance.
From the three-dimensional coordinates of the five feature points, the distance between each feature point and the camera 3 (the AR camera) can be calculated. If the differences between these distances are small, the five feature points lie on the same two-dimensional plane, i.e. the plane where the feature image 2 is located has been found, and positioning the feature image 2 by the preset number of feature points is valid.
The specific implementation process of step S406 is shown in fig. 3 and the corresponding description, and is not repeated here.
S408, when the positioning of the characteristic image 2 is judged to be effective according to the calculated distance, creating an anchor point according to the preset number of characteristic points, and taking the anchor point coordinate as the three-dimensional coordinate of the characteristic image 2.
If the positioning of the feature image 2 by the feature points is determined to be valid, the middle point of the preset number of feature points is used as the anchor point, and the three-dimensional coordinates of the anchor point are used as the three-dimensional coordinates of the feature image 2; functional display can then be performed according to the position of the feature image 2.
And S410, prompting a user to adjust the picture when the positioning of the characteristic image 2 is judged to be invalid according to the calculated distance.
If the positioning of the feature image 2 by the feature points is determined to be invalid, it means that the feature point selection is biased or that the placement of the feature image 2 does not satisfy the conditions, for example its filling area in the frame is insufficient (less than 50%). The user may therefore be prompted to adjust the picture captured by the camera 3; the adjustments include, but are not limited to, changing the distance between the camera 3 and the feature image 2, changing the shooting angle of the camera 3 relative to the feature image 2, changing the placement of the feature image 2, and the like.
In addition, an adjustment suggestion can be given to the user together with the prompt, based on the calculation results of the above steps. For example, if one of the calculated distances between the five feature points and the camera 3 is particularly large, the feature point corresponding to that distance was selected inaccurately and may fall outside the range of the feature image 2, so the user can be advised to move the feature image 2 within the frame toward that feature point, as in the sketch below.
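A small sketch of how such a suggestion could be derived follows; the function name is an assumption, and mapping the returned index to an on-screen direction is left to the application:

```kotlin
// Sketch: find the feature point whose camera distance deviates most from the mean;
// its index can drive a prompt telling the user which direction to move the image.
fun mostDeviantPointIndex(distances: List<Float>): Int {
    val mean = distances.sum() / distances.size
    return distances.indices.maxByOrNull { kotlin.math.abs(distances[it] - mean) } ?: 0
}
```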
The feature image recognition and positioning method provided by this embodiment selects five feature points in the region where the feature image 2 is located, which increases the speed of identifying and positioning the feature image 2 and improves the enhanced image functions of the ARCore and ARKit platforms. Moreover, when positioning the feature image 2 by the feature points is judged to be invalid, the user is prompted to adjust the picture and given an adjustment suggestion, so that an accurate positioning result can be obtained more quickly and the user experience is improved.
EXAMPLE III
As shown in fig. 5, a hardware architecture of an electronic device 20 is provided for a third embodiment of the present application. In the present embodiment, the electronic device 20 may include, but is not limited to, a memory 21, a processor 22, and a network interface 23, which are communicatively connected to each other through a system bus. It is noted that fig. 5 only shows the electronic device 20 with components 21-23, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead. In this embodiment, the electronic device 20 may be the terminal device 1 or the feature image recognition and positioning device 4. When the electronic apparatus 20 is the terminal device 1, the camera 3 is also included in the electronic apparatus 20. When the electronic device 20 is the feature image recognition and positioning device 4, the camera 3 may be included in the electronic device 20 or may be located outside the electronic device 20.
The memory 21 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the storage 21 may be an internal storage unit of the electronic device 20, such as a hard disk or a memory of the electronic device 20. In other embodiments, the memory 21 may also be an external storage device of the electronic apparatus 20, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, provided on the electronic apparatus 20. Of course, the memory 21 may also include both an internal storage unit and an external storage device of the electronic apparatus 20. In this embodiment, the memory 21 is generally used for storing an operating system installed in the electronic device 20 and various types of application software, such as program codes of the feature image recognition and positioning system 60. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 22 is generally used to control the overall operation of the electronic device 20. In this embodiment, the processor 22 is configured to execute the program codes stored in the memory 21 or process data, such as the characteristic image recognition and positioning system 60.
The network interface 23 may include a wireless network interface or a wired network interface, and the network interface 23 is generally used for establishing a communication connection between the electronic apparatus 20 and other electronic devices.
Example four
As shown in fig. 6, a schematic block diagram of a feature image recognition and localization system 60 is provided for the fourth embodiment of the present application. The feature image recognition and localization system 60 may be partitioned into one or more program modules, which are stored in a storage medium and executed by one or more processors to implement the embodiments of the present application. The program modules referred to in the embodiments of the present application are a series of computer program instruction segments capable of performing specific functions; the following description details the function of each program module in this embodiment.
In this embodiment, the feature image recognition and positioning system 60 includes:
and the scanning module 600 is used for scanning and identifying the characteristic image 2 through a picture shot by the camera 3 and containing the characteristic image 2.
Start the camera 3, control the camera 3 to shoot a picture containing the feature image 2, and scan and identify the feature image 2 in the picture. At this point the feature image 2 has been recognized but its spatial position information has not yet been acquired, that is, the tracking state is paused.
It should be noted that if the area filled by the feature image 2 in the frame is too small, some of the subsequently selected feature points may fall outside the range of the feature image 2, which results in a large error in the final positioning result. Therefore, to ensure the positioning effect, the user is prompted to aim the camera 3 at the feature image 2 and to ensure that the feature image 2 fills at least 50% of the frame.
A selecting module 602, configured to select a preset number of feature points in an area where the feature image 2 is located in the screen.
The core reason the existing enhanced image function is slow at identifying and positioning the feature image 2 is that it automatically identifies and tracks too many feature points. In practical application scenes, the feature image 2 is generally a two-dimensional plane image, such as a portrait standee at an offline exhibition. The user shoots the feature image 2 with the camera 3 (near the feature image 2); once the enhanced image function has identified and positioned information such as the spatial position of the feature image 2, AR content can be displayed at that position, for example a portrait popping out of the standee and performing some actions through a 3D animation. The essence of the enhanced image function is therefore to identify a close-range planar object (e.g. a portrait standee close to the camera 3), so a large number of feature points is not required. To increase the speed of calculating the spatial coordinates of the feature image 2, only a few feature points need to be selected for tracking. In this embodiment, the preset number is five. The more feature points, the more accurate the calculated spatial coordinates of the feature image 2, at the cost of longer execution time. The number five was obtained through multiple comparison tests and balances accuracy against efficiency.
The specific process of selecting the five characteristic points is as follows:
First, the rectangular frame of the feature image 2 in the picture (or screen) is obtained by the Object Detection algorithm of the TensorFlow Lite technology. Then, five feature points are selected within the range of the rectangular frame according to a preset rule. The preset rule is as follows:
assuming that the top left corner of the picture (or screen) is the origin of the coordinate system, the coordinates of the top left corner of the rectangular frame are (a, b), and the coordinates of the bottom right corner of the rectangular frame are (c, d). The five feature point coordinates selected are ((a + c) × 0.5, (b + d) × 0.5), ((a + c) × 0.5, (b + d) × 0.375), ((a + c) × 0.5, (b + d) × 0.625), ((a + c) × 0.375, (b + d) × 0.5) and ((a + c) × 0.675, (b + d) × 0.5).
Of course, in other embodiments, other suitable numbers of feature points may be selected, or the coordinates of the feature points may be determined according to other rules.
The testing module 604 is configured to obtain the three-dimensional coordinates of the preset number of feature points by using a hitTest method.
In this embodiment, after the five feature points are selected, three-dimensional coordinates of the five feature points in an AR camera coordinate system (with the position of the camera 3 as the origin of the coordinate system) can be obtained by a hitTest method, and are respectively denoted as P1 to P5.
A determining module 606, configured to calculate distances between the preset number of feature points and the camera 3 according to the three-dimensional coordinates, and determine whether the positioning of the feature image 2 is valid according to the calculated distances.
From the three-dimensional coordinates of the five feature points, the distance between each feature point and the camera 3 (the AR camera) can be calculated. If the differences between these distances are small, the five feature points lie on the same two-dimensional plane, i.e. the plane where the feature image 2 is located has been found, and positioning the feature image 2 by the preset number of feature points is valid.
For a specific implementation process of this portion, refer to fig. 3 and corresponding description, which are not described herein again.
A positioning module 608, configured to create an anchor point according to the preset number of feature points when it is determined that the positioning of the feature image 2 is valid according to the calculated distance, and use coordinates of the anchor point as three-dimensional coordinates of the feature image 2.
If the positioning of the feature image 2 by the feature points is determined to be valid, the middle point of the preset number of feature points is used as the anchor point, and the three-dimensional coordinates of the anchor point are used as the three-dimensional coordinates of the feature image 2; functional display can then be performed according to the position of the feature image 2.
The feature image recognition and positioning system provided by this embodiment selects five feature points in the region where the feature image 2 is located, which increases the speed of identifying and positioning the feature image 2 and improves the enhanced image functions of the ARCore and ARKit platforms. Experiments show that the system significantly shortens the time for calculating the spatial position of the feature image 2; for example, the response time for identifying and positioning a standee can be reduced from 6 seconds to 3 seconds.
EXAMPLE five
Fig. 7 is a block diagram of a feature image recognition and localization system 60 according to a fifth embodiment of the present invention. In this embodiment, the feature image recognition and positioning system 60 further includes a prompt module 610 in addition to the scanning module 600, the selecting module 602, the testing module 604, the determining module 606, and the positioning module 608 in the fourth embodiment.
The prompting module 610 is configured to prompt a user to adjust the screen when it is determined that the positioning of the feature image 2 is invalid according to the calculated distance.
If the positioning of the feature image 2 by the feature points is determined to be invalid, it means that the feature point selection is biased or that the placement of the feature image 2 does not satisfy the conditions, for example its filling area in the frame is insufficient (less than 50%). The user may therefore be prompted to adjust the picture captured by the camera 3; the adjustments include, but are not limited to, changing the distance between the camera 3 and the feature image 2, changing the shooting angle of the camera 3 relative to the feature image 2, changing the placement of the feature image 2, and the like.
In addition, an adjustment suggestion can be provided to the user together with the prompt, based on the calculation results. For example, if one of the calculated distances between the five feature points and the camera 3 is particularly large, the feature point corresponding to that distance was selected inaccurately and may fall outside the range of the feature image 2, so the user can be advised to move the feature image 2 within the frame toward that feature point.
The feature image recognition and positioning system provided by this embodiment selects five feature points in the region where the feature image 2 is located, which increases the speed of identifying and positioning the feature image 2 and improves the enhanced image functions of the ARCore and ARKit platforms. Moreover, when positioning the feature image 2 by the feature points is judged to be invalid, the user is prompted to adjust the picture and given an adjustment suggestion, so that an accurate positioning result can be obtained more quickly and the user experience is improved.
EXAMPLE six
The present application further provides another embodiment, which is to provide a computer readable storage medium, wherein the computer readable storage medium stores a feature image identification and location program, and the feature image identification and location program can be executed by at least one processor, so as to cause the at least one processor to execute the steps of the feature image identification and location method as described above.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It will be apparent to those skilled in the art that the modules or steps of the embodiments described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed across a network of computing devices. Alternatively, they may be implemented by program code executable by a computing device, stored in a storage device, and executed by the computing device; in some cases the steps shown or described may be performed in an order different from that described here. They may also be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present application, and is not intended to limit the scope of the present application, and all equivalent structures or equivalent processes that are directly or indirectly applied to other related technical fields, which are made by using the contents of the specification and the drawings of the present application, are also included in the scope of the present application.

Claims (9)

1. A method for recognizing and positioning a characteristic image is characterized by comprising the following steps:
scanning and identifying a characteristic image in a picture through the picture shot by a camera, wherein the characteristic image is a specific object image in the real world;
selecting a preset number of feature points in the area where the feature image is located in the picture;
respectively obtaining the three-dimensional coordinates of the preset number of feature points by a hitTest method;
respectively calculating the distances between the preset number of feature points and the camera according to the three-dimensional coordinates, and judging whether the feature image is positioned effectively or not according to the calculated distances; and
when the feature image is judged to be positioned validly according to the calculated distance, creating an anchor point according to the preset number of feature points, and taking the coordinates of the anchor point as the three-dimensional coordinates of the feature image, wherein the anchor point is the middle point of the preset number of feature points.
2. The method for recognizing and positioning the feature images as claimed in claim 1, further comprising:
when the positioning of the characteristic image is judged to be invalid according to the calculated distance, prompting a user to adjust the picture.
3. The method for recognizing and positioning the characteristic images as claimed in claim 1 or 2, wherein the preset number is five.
4. The method for recognizing and positioning the feature image according to claim 3, wherein the selecting a preset number of feature points in the area where the feature image is located in the picture comprises:
acquiring a rectangular frame of the characteristic image in the picture through a target detection algorithm of the TensorFlow Lite technology;
selecting five feature points in the range of the rectangular frame according to a preset rule, wherein the preset rule is as follows:
taking the upper left corner of the picture as the origin of the coordinate system, the coordinates of the upper left corner of the rectangular frame are (a, b) and the coordinates of the lower right corner are (c, d); the coordinates of the five selected feature points are then ((a + c) × 0.5, (b + d) × 0.5), ((a + c) × 0.5, (b + d) × 0.375), ((a + c) × 0.5, (b + d) × 0.625), ((a + c) × 0.375, (b + d) × 0.5), and ((a + c) × 0.675, (b + d) × 0.5).
5. The method for recognizing and positioning the feature image according to claim 1 or 2, wherein the determining whether the positioning of the feature image is valid according to the calculated distance comprises:
calculating a mathematical expected value according to the obtained distance, wherein the mathematical expected value is an average value of the distances between the preset number of feature points and the camera;
calculating a standard deviation of the obtained distance and the mathematical expected value;
judging whether the standard deviation exceeds a preset threshold value or not so as to determine whether the positioning of the feature images through the preset number of feature points is effective or not, wherein when the value of the standard deviation is larger than the threshold value, the positioning of the feature images through the preset number of feature points is determined to be ineffective; when the value of the standard deviation is less than or equal to the threshold, determining that the positioning of the feature image by the preset number of feature points is valid.
6. The method for recognizing and positioning the characteristic image according to claim 1 or 2, characterized in that, after the characteristic image is recognized by scanning the picture shot by the camera containing the characteristic image, the method further comprises:
prompting a user to aim the camera at the characteristic image and to ensure that the characteristic image fills at least 50% of the picture.
7. A feature image recognition positioning system, the system comprising:
the scanning module is used for scanning and identifying the characteristic image through a picture which is shot by a camera and contains the characteristic image, wherein the characteristic image is a specific object image in the real world;
the selection module is used for selecting a preset number of feature points in the area where the feature image is located in the picture;
the test module is used for respectively obtaining the three-dimensional coordinates of the preset number of feature points by a hitTest method;
the judging module is used for respectively calculating the distances between the preset number of feature points and the camera according to the three-dimensional coordinates and judging whether the feature image is positioned effectively or not according to the calculated distances;
and the positioning module is used for creating an anchor point according to the preset number of feature points when the feature image is judged to be effectively positioned according to the calculated distance, and taking the coordinates of the anchor point as the three-dimensional coordinates of the feature image, wherein the anchor point is the middle point of the preset number of feature points.
8. An electronic device, comprising: a memory, a processor and a feature image recognition and localization program stored on the memory and executable on the processor, the feature image recognition and localization program, when executed by the processor, implementing the feature image recognition and localization method according to any one of claims 1-6.
9. A computer-readable storage medium, wherein the computer-readable storage medium stores thereon a feature image recognition and positioning program, and the feature image recognition and positioning program, when executed by a processor, implements the feature image recognition and positioning method according to any one of claims 1 to 6.
CN202010181937.9A 2020-03-16 2020-03-16 Characteristic image identification and positioning method and system Active CN113409385B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010181937.9A 2020-03-16 2020-03-16 CN113409385B (en) Characteristic image identification and positioning method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010181937.9A 2020-03-16 2020-03-16 CN113409385B (en) Characteristic image identification and positioning method and system

Publications (2)

Publication Number Publication Date
CN113409385A CN113409385A (en) 2021-09-17
CN113409385B 2023-02-24

Family

ID=77676397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010181937.9A Active CN113409385B (en) 2020-03-16 2020-03-16 Characteristic image identification and positioning method and system

Country Status (1)

Country Link
CN (1) CN113409385B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927243A (en) * 2013-01-15 2014-07-16 株式会社日立制作所 Graphical user interface operation monitoring method and device
CN108682031A (en) * 2018-05-21 2018-10-19 深圳市酷开网络科技有限公司 Measurement method, intelligent terminal based on augmented reality and storage medium
CN108932055A (en) * 2018-06-04 2018-12-04 艾律有限责任公司 A kind of method and apparatus of augmented reality content
CN109448050A (en) * 2018-11-21 2019-03-08 深圳市创梦天地科技有限公司 A kind of method for determining position and terminal of target point
CN110782499A (en) * 2019-10-23 2020-02-11 Oppo广东移动通信有限公司 Calibration method and calibration device for augmented reality equipment and terminal equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6546277B1 (en) * 1998-04-21 2003-04-08 Neutar L.L.C. Instrument guidance system for spinal and other surgery
US10462406B2 (en) * 2014-08-01 2019-10-29 Sony Corporation Information processing apparatus and information processing method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927243A (en) * 2013-01-15 2014-07-16 株式会社日立制作所 Graphical user interface operation monitoring method and device
CN108682031A (en) * 2018-05-21 2018-10-19 深圳市酷开网络科技有限公司 Measurement method, intelligent terminal based on augmented reality and storage medium
CN108932055A (en) * 2018-06-04 2018-12-04 艾律有限责任公司 A kind of method and apparatus of augmented reality content
CN109448050A (en) * 2018-11-21 2019-03-08 深圳市创梦天地科技有限公司 A kind of method for determining position and terminal of target point
CN110782499A (en) * 2019-10-23 2020-02-11 Oppo广东移动通信有限公司 Calibration method and calibration device for augmented reality equipment and terminal equipment

Also Published As

Publication number Publication date
CN113409385A (en) 2021-09-17

Similar Documents

Publication Publication Date Title
CN112767489B (en) Three-dimensional pose determining method and device, electronic equipment and storage medium
CN110568447B (en) Visual positioning method, device and computer readable medium
WO2018107910A1 (en) Method and device for fusing panoramic video images
US20180137644A1 (en) Methods and systems of performing object pose estimation
CN113808253B (en) Method, system, equipment and medium for processing dynamic object of three-dimensional reconstruction of scene
CN105303514A (en) Image processing method and apparatus
CN111583381B (en) Game resource map rendering method and device and electronic equipment
CN110866497B (en) Robot positioning and mapping method and device based on dotted line feature fusion
CN110648363A (en) Camera posture determining method and device, storage medium and electronic equipment
CN108628442B (en) Information prompting method and device and electronic equipment
CN111415420B (en) Spatial information determining method and device and electronic equipment
CN110686676A (en) Robot repositioning method and device and robot
WO2022048468A1 (en) Planar contour recognition method and apparatus, computer device, and storage medium
CN111459269A (en) Augmented reality display method, system and computer readable storage medium
CN113256719A (en) Parking navigation positioning method and device, electronic equipment and storage medium
CN113052907A (en) Positioning method of mobile robot in dynamic environment
CN113947768A (en) Monocular 3D target detection-based data enhancement method and device
US10295403B2 (en) Display a virtual object within an augmented reality influenced by a real-world environmental parameter
CN113223064A (en) Method and device for estimating scale of visual inertial odometer
CN113486941B (en) Live image training sample generation method, model training method and electronic equipment
CN113409385B (en) Characteristic image identification and positioning method and system
CN117132649A (en) Ship video positioning method and device for artificial intelligent Beidou satellite navigation fusion
CN113673288A (en) Idle parking space detection method and device, computer equipment and storage medium
CN113436332A (en) Digital display method and device for fire-fighting plan, server and readable storage medium
CN114972587A (en) Expression driving method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant