CN107516099B - Method and device for detecting marked picture and computer readable storage medium


Info

Publication number
CN107516099B
Authority
CN
China
Prior art keywords
image
detected
value
feature
matching
Prior art date
Legal status
Active
Application number
CN201710719860.4A
Other languages
Chinese (zh)
Other versions
CN107516099A (en)
Inventor
贾琼
赵凌
孙星
余宗桥
郭晓威
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201710719860.4A
Publication of CN107516099A
Application granted
Publication of CN107516099B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components, by matching or filtering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00: Manipulating 3D models or images for computer graphics
    • G06T19/006: Mixed reality

Abstract

The application discloses a method for detecting a marked picture, which comprises the following steps: acquiring an image to be detected, and extracting gradient information of pixel points in the image to be detected by a table lookup method using an arctangent table and a neon function to obtain a local feature set of the image to be detected, wherein the arctangent table comprises a correspondence between lookup indexes and angles, and the lookup index is determined according to the neon function; matching the local feature set with the feature set of the marked picture to determine a matching set, wherein the matching set is the intersection formed by the local feature set and the feature set of the marked picture; and determining that the marked picture is detected when the number of matching features in the matching set is greater than a number threshold. According to the embodiments of the application, the extraction of gradient information for pixel points in the image can be sped up, and the speed of detecting the marked picture is increased.

Description

Method and device for detecting marked picture and computer readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for detecting a marked picture, and a computer-readable storage medium.
Background
Augmented reality (AR), also called mixed reality, applies virtual information to the real world through computer technology, overlaying the real environment and virtual objects onto the same picture or space in real time.
The process of superimposing a virtual object onto a real environment is usually implemented by means of a marked picture (Marker). For example, a marked picture is arranged on a magazine; after the camera captures the magazine, the marked picture can be detected in the image, and a virtual image is then superimposed on the real magazine according to the pose information of the marked picture, namely its position information and posture information, thereby producing the AR effect.
In the process of detecting the Marker, gradient information is extracted for each pixel point. The gradient information includes angle information, and obtaining an angle requires the arctangent trigonometric function; for an image to be detected with megapixels, the arctangent function is likewise called on the order of a million times. Optimizing the computation time of this function is therefore very important for reducing the overall time consumption of Marker detection.
Disclosure of Invention
In order to solve the problem that Marker detection in the prior art consumes a lot of time, embodiments of the present application provide a method for detecting a marked picture in which gradient information can be obtained by a table lookup method, so that the extraction of gradient information for pixel points in an image is sped up and Marker detection is accelerated. The embodiments of the application also provide a corresponding apparatus and a computer-readable storage medium.
A first aspect of the present application provides a method for detecting a marked picture, including:
acquiring an image to be detected, wherein the image to be detected comprises a marked picture with texture;
extracting gradient information of pixel points in the image to be detected by a table look-up method to obtain a local feature set of the image to be detected;
matching the local feature set with the feature set of the marked picture to determine a matching set, wherein the matching set is an intersection formed by the local feature set and the feature set of the marked picture;
determining that the tagged picture is detected when the number of matching features in the matching set is greater than a number threshold.
The second aspect of the present application provides an apparatus for detecting a marked picture, comprising:
the acquisition program module is used for acquiring an image to be detected, wherein the image to be detected comprises a mark picture with texture;
the extraction program module is used for extracting gradient information of pixel points in the image to be detected through a table look-up method so as to obtain a local feature set of the image to be detected;
a matching program module, configured to match the local feature set extracted by the extraction program module with the feature set of the tagged picture to determine a matching set, where the matching set is an intersection formed by the local feature set and the feature set of the tagged picture;
a determining program module for determining that the tagged picture is detected when the number of matching features in the matching set determined by the matching program module is greater than a number threshold.
A third aspect of the present application provides a computer device comprising: an input/output (I/O) interface, a processor, and a memory having stored therein instructions for tagged picture detection as described in the first aspect;
the I/O interface is used for receiving and acquiring an image to be detected;
the processor is configured to execute the instructions for tagged picture detection stored in the memory, to perform the steps of the method for tagged picture detection as described in the first aspect.
Yet another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to perform the method of the above-described aspects.
A further aspect of the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the above-mentioned aspects.
According to the method and the device, the gradient information can be searched through a table look-up method, so that the extraction speed of the gradient information of the pixel points in the image is increased, and the Marker detection speed is increased.
Drawings
FIG. 1 is an example schematic diagram of an AR scene;
FIG. 2 is a schematic flow diagram of the AR principle;
FIG. 3 is an exemplary diagram of an AR scene using a tagged picture;
FIG. 4 is another exemplary diagram of an AR scene using a tagged picture;
FIG. 5 is a schematic diagram of an embodiment of a method for detecting a marked picture in an embodiment of the present application;
FIG. 6 is a schematic diagram of another embodiment of a method for detecting a marked picture in the embodiment of the present application;
FIG. 7 is a schematic diagram of an example of an image pyramid;
FIG. 8 is a schematic diagram of another embodiment of a method for detecting a marked picture in the embodiment of the present application;
FIG. 9 is a schematic diagram of an embodiment of an apparatus for marked picture detection in the embodiment of the present application;
FIG. 10 is a schematic diagram of another embodiment of an apparatus for marked picture detection in the embodiment of the present application;
FIG. 11 is a schematic diagram of an embodiment of a terminal in the embodiment of the present application.
Detailed Description
Embodiments of the present application will now be described with reference to the accompanying drawings. It is to be understood that the described embodiments are merely some, but not all, of the embodiments of the present application. As those skilled in the art will appreciate, with the development of technology and the emergence of new application scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
The method for detecting the marked picture can search the angle information in the gradient information through a table look-up method, so that the extraction speed of the gradient information of the pixel points in the image is improved, and the Marker detection speed is accelerated. The embodiment of the application also provides a corresponding device and a computer readable storage medium. The following are detailed descriptions.
Augmented reality (AR), also called mixed reality, applies virtual information to the real world through computer technology and, in the visual presentation, superimposes the real environment and virtual objects onto the same picture in real time.
As shown in fig. 1, in an AR scenario the user shown in (a) of fig. 1 uses an AR device, for example AR glasses, in a real venue. The venue and the audience are the real scene, while the picture of a whale leaping out of the sea is a superimposed virtual object; the virtual whale is superimposed into the real venue so that both appear in the same picture, and the user sees the picture shown in (b) of fig. 1.
The principle of implementation of the AR picture can be understood with reference to fig. 2.
As shown in fig. 2, when an AR picture is experienced through the AR device 10, the camera 101 captures an object in the real world, which is digitally imaged by the camera to form video data. The sensor 102 may be a position sensor or another type of sensor, and the data it acquires is referred to as sensor data. The processor 103 in the AR device 10 then perceives and understands the three-dimensional (3D) world through the video data and the sensor data, obtaining both 3D interaction understanding and 3D environment understanding. The purpose of 3D interaction understanding is to inform the AR device of what is to be "augmented". For example, in an AR-assisted repair system, if the system recognizes a page-turning gesture by the repairman, the next page of the virtual manual is superimposed into the real image. By contrast, the purpose of 3D environment understanding is to tell the AR device where to "augment". In the above example, it is desirable that the new display page appear in the same spatial position as the previous one, thereby achieving a strong sense of realism; this requires the AR device to maintain an accurate, real-time understanding of the surrounding real 3D world. Once the system knows the content and location to be enhanced, virtual and real content can be joined, which is typically done by the rendering module 104. Finally, the synthesized video is presented to the visual system of the user, achieving the effect of augmented reality.
In AR scenes, marked pictures (Markers) are widely used. Marker detection stores the information of a Marker image in advance, searches for and identifies the Marker image in the current image through image recognition technology, and then superimposes related information on it. For example, a Marker is pre-configured on a magazine and the image content of the magazine is captured through a camera; if the Marker is detected in the image content, a virtual image can be superimposed at the corresponding position according to the pose information of the Marker, namely its position information and posture information, and an AR picture can be displayed once the synthesized image is obtained. As shown in fig. 3, an image of a walking cartoon pet is seen through the AR device; the walking of the cartoon pet is produced by moving the magazine, which changes the pose information of the Marker. The AR device detects the marked picture P1 in the image to be detected; when P1 is detected, the cartoon pet P2 is superimposed at the corresponding position, showing a three-dimensional cartoon pet. When the magazine is moved, the position of the marked picture P1 changes, producing the visual effect that the cartoon pet P2 moves along with the magazine. The process of detecting the marked picture in the scene shown in fig. 3 can be understood in combination with the detection process in the subsequent embodiments. Of course, the Marker shown in fig. 3 is a relatively simple one, similar to a simple two-dimensional code; the Marker in the embodiments of the present application is not limited to this form and may be any picture containing texture.
As shown in fig. 4, the marked picture M1 may be a character picture in a game, and after the marked picture M1 is identified from the picture to be detected by the mobile phone, the game character M2 may be overlaid on the screen of the mobile phone, so that a visual effect of the game character coming to the real world is achieved. The process of detecting the marked picture M1 by the mobile phone can be understood by combining the process of detecting the marked picture in the subsequent embodiments.
The foregoing examples shown in fig. 3 and fig. 4 relate to detection of a marked picture, and a method for detecting a marked picture in the embodiment of the present application is described below with reference to the drawings.
As shown in fig. 5, an embodiment of the method for detecting a marked picture provided in the embodiment of the present application includes:
201. and acquiring an image to be detected, wherein the image to be detected comprises a marked picture with texture.
The image to be detected refers to an image in which a marked picture needs to be detected. For example, an image containing a pre-stored marked picture is captured through a camera; the image captured by the camera is the image to be detected.
202. And extracting gradient information of pixel points in the image to be detected by using a table look-up method so as to obtain a local feature set of the image to be detected.
Optionally, in this step 202, an arc tangent table and a neon function may be used to extract gradient information of a pixel point in the image to be detected, so as to obtain a local feature set of the image to be detected, where the arc tangent table includes a corresponding relationship between a table look-up index and an angle, and the table look-up index is determined according to the neon function.
The neon function may be included in a neon instruction set, that is, the NEON single-instruction-multiple-data (SIMD) extension of ARM processors. The neon instruction set may include a plurality of neon functions, and the types of neon functions may include a min function and a max function, among others; of course, the neon functions included in the neon instruction set are not limited to the min function and the max function, which are mentioned here merely as examples.
The gradient information of a pixel point includes an angle. In the embodiment of the present application, an arctangent table and a neon function are used when determining the angle of a pixel point; the arctangent table is described below.
The arctangent table can be understood with reference to table 1:
Table 1: table of correspondence between lookup indexes and angles
(Table 1 appears as images in the original publication; it pairs lookup indexes Index1, Index2, ... with their corresponding angles angle 1, angle 2, ....)
The arctangent table can be prepared offline, and the number of angles it contains is determined by the degree of discretization. The angle domain of the arctangent function covered by the table is [0, nπ/4), where n is a positive integer less than 8. Taking n = 1 as an example, an arctangent value table is prepared offline: the tangent range [0, 1.0) corresponds to the angle range [0, π/4). Discretizing with a tangent-value step of 0.0001, the lookup table resolves angles to roughly arctan(0.0001), i.e. about 0.0001 rad per entry. The angle corresponding to each discrete tangent value is then calculated and stored in the table; here, std::atan of the C++ standard library is used to compute the angle for each discretized tangent value. Each discrete tangent value, amplified by a factor of 10000, serves as the lookup index into the table.
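As an illustration, a minimal sketch of this offline construction follows (the function name, the use of float, and the exact table layout are assumptions; the patent only fixes std::atan, the 0.0001 step and the 10000-fold amplification of the tangent value as index):

    #include <cmath>
    #include <vector>

    // Offline construction of the n = 1 arctangent table: entry i holds the
    // angle (in radians) whose tangent is i / N, covering the tangent range
    // [0, 1.0) and hence the angle range [0, pi/4).
    std::vector<float> build_atan_table(int N = 10000) {
        std::vector<float> table(N);
        for (int i = 0; i < N; ++i)
            table[i] = std::atan(i / static_cast<float>(N)); // std::atan of the C++ standard library
        return table;
    }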
The neon function is used to determine the lookup index; once the lookup index is determined, the corresponding angle can be found through the arctangent table shown in table 1. For example, if the lookup index determined by the neon function is Index6, the corresponding angle is determined to be angle 6.
The lookup indexes and angles included in the arctangent table are not limited to the few shown in table 1, which is merely an example. For instance, for the tangent domain [0, 1.0), corresponding to the angle domain [0, π/4) and discretized with a tangent-value step of 0.0001, there are 10000 lookup indexes corresponding to 10000 angles.
Because the neon function reduces the amount of computation, the lookup index can be determined quickly by using the neon function, and the angle of the pixel point can then be determined quickly.
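As an illustrative sketch (AArch64 NEON intrinsics are assumed, and the function and variable names are not from the patent), the min and max neon functions can compute the lookup index for four pixels at once:

    #include <arm_neon.h>
    #include <cstdint>

    // Computes entry = floor(N * min(|dx|, |dy|) / max(|dx|, |dy|)) for four
    // pixels at a time; dx and dy are precomputed gray-value differences.
    // A real implementation must also guard against max == 0 (flat pixels).
    void compute_entries(const float* dx, const float* dy, uint32_t* entry) {
        const float32x4_t vN = vdupq_n_f32(10000.0f);       // N, the reciprocal of the 0.0001 step
        float32x4_t adx = vabsq_f32(vld1q_f32(dx));         // |dx| for 4 pixels
        float32x4_t ady = vabsq_f32(vld1q_f32(dy));         // |dy| for 4 pixels
        float32x4_t lo  = vminq_f32(adx, ady);              // min neon function
        float32x4_t hi  = vmaxq_f32(adx, ady);              // max neon function
        float32x4_t r   = vdivq_f32(vmulq_f32(lo, vN), hi); // N * min / max
        vst1q_u32(entry, vcvtq_u32_f32(r));                 // truncation equals floor for r >= 0
    }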
In addition, in the embodiment of the present application, the discretization range of the arctangent table is [0, nπ/4), where n is a positive integer smaller than 8. With the number of discretization steps kept unchanged, narrowing the discretization range improves the precision of the angles obtained by looking up the arctangent table.
According to gradient information of pixel points in the image to be detected, a local feature set of the image to be detected can be obtained, wherein the local feature set comprises at least one local feature.
The units of angle may be degrees (°), minutes ('), seconds (").
203. And matching the local feature set with the feature set of the marked picture to determine a matching set, wherein the matching set is an intersection formed by the local feature set and the feature set of the marked picture.
The features in the feature set of the tagged picture are extracted in advance from the tagged picture using the above arctangent table and neon function.
If the local feature set of the image to be detected is marked by the set A and the feature set of the pre-configured marked picture is marked by the set B, the matching set is the intersection of the set A and the set B.
204. Determining that the tagged picture is detected when the number of matching features in the matching set is greater than a number threshold.
According to the method and the device, gradient information is obtained through a table lookup method, so that the extraction of gradient information for pixel points in the image is sped up and Marker detection is accelerated. In particular, the angle information in the gradient information can be looked up by combining the arctangent table with the neon function, which further speeds up gradient extraction and hence Marker detection.
Optionally, on the basis of the embodiment corresponding to fig. 5 and with reference to fig. 6, step 202 (extracting gradient information of pixel points in the image to be detected by using an arctangent table and a neon function to obtain a local feature set of the image to be detected, wherein the arctangent table comprises the correspondence between lookup indexes and angles, and the lookup index is determined according to the neon function) may include:
2021. and generating an image pyramid according to the image to be detected.
Gaussian blur and downsampling at different scales are applied to the image to be detected to generate an image pyramid, which provides robustness against scaling.
The image pyramid can be understood with reference to fig. 7; it represents one image at multiple resolutions. The image pyramid typically contains multiple layers of images, each containing one or more pixel points. In general, the higher the layer, the fewer the pixel points: the bottom layer has the highest resolution, and resolution decreases toward the top. An image pyramid can thus be understood as a collection of progressively lower-resolution images arranged in a pyramid shape. The number of layers in the image pyramid and the scale ratio between adjacent layers can be set according to the requirements of the scene.
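For illustration, a minimal pyramid construction might look as follows (OpenCV is an assumed choice; the patent does not name a library):

    #include <opencv2/imgproc.hpp>
    #include <vector>

    // Each level applies a Gaussian blur and a 2x downsample to the previous
    // one, so resolution decreases toward the top of the pyramid.
    std::vector<cv::Mat> build_pyramid(const cv::Mat& image, int levels) {
        std::vector<cv::Mat> pyramid{image};   // base layer: highest resolution
        for (int i = 1; i < levels; ++i) {
            cv::Mat next;
            cv::pyrDown(pyramid.back(), next); // Gaussian blur + downsample
            pyramid.push_back(next);
        }
        return pyramid;
    }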
2022. Gradient information is extracted for pixels in each layer of images in the image pyramid using an arctan table and a neon function to form a gradient map.
And extracting gradient information for each pixel of each layer of image of the image pyramid.
The process of extracting gradient information from pixels in each layer of image in the image pyramid using the arctan table and the neon function may be:
determining a target table look-up index of the pixel point in the arc tangent table;
and determining, according to the arctangent table, the target angle of the pixel point corresponding to the target lookup index, wherein the target angle is included in the gradient information of that pixel point.
Determining a target table look-up index of the pixel point in the arc tangent table comprises the following steps:
determining a ratio of a first value to a second value, wherein the first value is the minimum of the absolute value of a first difference and the absolute value of a second difference, and the second value is the maximum of those two absolute values; the first difference is the right gray value minus the left gray value, and the second difference is the lower gray value minus the upper gray value, the right, left, lower and upper gray values being the gray values of the pixels to the right of, to the left of, below and above the currently extracted pixel, respectively;
amplifying the ratio by a factor of N to obtain a calculation result, wherein N is the reciprocal of the discretization step;
and performing rounding-down on the calculation result to obtain the target table look-up index.
Expressed as a formula:

entry = ⌊N · min(|dx|, |dy|) / max(|dx|, |dy|)⌋
where entry denotes the target lookup index, dx = right gray value - left gray value, and dy = lower gray value - upper gray value; the right, left, lower and upper gray values are the gray values of the pixels to the right of, to the left of, below and above the currently extracted pixel, respectively. N is the reciprocal of the discretization step, so that if the discretization step is 1/10000, N = 10000.
In the above lookup-index formula, the functions min and max are both neon functions. The formula is illustrated with a discretization degree of 10000, so the final amplification factor in the formula is 10000; if the discretization degree takes another value, the amplification factor in the formula changes correspondingly.
For example: if the entry = index3 is calculated according to the above formula, the angle of the pixel point can be determined to be the angle 3 according to the mapping relationship in table 1.
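For illustration, the differences dx and dy are plain central differences; a minimal sketch for one interior pixel of an 8-bit grayscale image (row-major storage with width w is an assumption) is:

    #include <cstdint>

    // dx = right gray value - left gray value, dy = lower gray value - upper
    // gray value, as defined above. Border pixels need separate handling.
    inline void central_diff(const uint8_t* gray, int w, int x, int y,
                             float& dx, float& dy) {
        dx = float(gray[y * w + (x + 1)]) - float(gray[y * w + (x - 1)]);
        dy = float(gray[(y + 1) * w + x]) - float(gray[(y - 1) * w + x]);
    }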
Since the above angle-domain range is smaller than [0, 2π), the lookup will encounter cases where the target lookup index entry is not covered directly by the arctangent table. For these cases, the present embodiment provides:
if the target lookup index entry is outside the range of lookup indexes of the arctangent table, the target angle is the largest angle in the angle domain;
if the absolute value of the second difference is greater than the absolute value of the first difference, that is, |dy| > |dx|, the target angle is the complement of the table lookup result corresponding to the target lookup index;
if the first difference is smaller than 0, that is, dx < 0, the target angle is the supplement of the table lookup result corresponding to the target lookup index;
if the second difference is smaller than 0, that is, dy < 0, the target angle is the negative of the table lookup result corresponding to the target lookup index;
and when the target angle involves the negative of the table lookup result corresponding to the target lookup index entry, the result is mapped into [0, 2π) using a neon function.
Here, the neon function for floating-point addition can be used to map the result into [0, 2π).
For example, if the angle domain is [0, π/4) and entry exceeds the range of the lookup table, the target angle is π/4.
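Putting the table lookup and the above case analysis together, a scalar sketch of the full angle reconstruction could read as follows (names are illustrative, and the patent vectorizes the same case analysis with neon functions; the n = 1 table over [0, π/4) from the earlier sketch is assumed):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Reconstructs the gradient angle in [0, 2*pi) from dx and dy using the
    // first-octant arctangent table and the four correction rules above.
    float pixel_angle(float dx, float dy, const std::vector<float>& table) {
        const float kPi = 3.14159265358979f;
        float adx = std::fabs(dx), ady = std::fabs(dy);
        float lo = std::fmin(adx, ady);
        float hi = std::fmax(adx, ady);
        if (hi == 0.0f) return 0.0f;                     // flat pixel: no gradient
        std::size_t entry = std::size_t(lo / hi * table.size());
        float angle = entry < table.size() ? table[entry]
                                           : kPi / 4.0f; // largest angle in the domain
        if (ady > adx) angle = kPi / 2.0f - angle;       // complement when |dy| > |dx|
        if (dx < 0.0f) angle = kPi - angle;              // supplement when dx < 0
        if (dy < 0.0f) angle = -angle;                   // negate when dy < 0
        if (angle < 0.0f) angle += 2.0f * kPi;           // map into [0, 2*pi)
        return angle;
    }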
2023. And determining local extreme points on the gradient map, wherein the local extreme points are pixel points which are not influenced by the environment in the image to be detected.
Local extreme points are relatively stable pixel points in the image: they do not disappear with changes of viewing angle, changes of illumination, or interference from noise. Examples are corner points, edge points, bright points in dark areas and dark points in bright areas. Thus, if two images contain the same scene, these stable pixel points appear in both images at that scene, and matching can be achieved.
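As a sketch of this detection step (the 3×3 neighborhood and the use of a gradient-magnitude map are assumptions for illustration; the patent fixes neither):

    #include <opencv2/core.hpp>
    #include <vector>

    // Returns pixels that are strict local maxima of the gradient magnitude
    // within their 3x3 neighborhood; these serve as candidate stable points.
    std::vector<cv::Point> local_extrema(const cv::Mat& magnitude /* CV_32F */) {
        std::vector<cv::Point> points;
        for (int y = 1; y + 1 < magnitude.rows; ++y)
            for (int x = 1; x + 1 < magnitude.cols; ++x) {
                float v = magnitude.at<float>(y, x);
                bool is_max = true;
                for (int dy = -1; dy <= 1 && is_max; ++dy)
                    for (int dx = -1; dx <= 1; ++dx)
                        if ((dy != 0 || dx != 0) &&
                            magnitude.at<float>(y + dy, x + dx) >= v) {
                            is_max = false;
                            break;
                        }
                if (is_max) points.emplace_back(x, y);
            }
        return points;
    }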
2024. And taking the local extreme point as a center, extracting the texture features of the pixel points adjacent to the local extreme point to obtain a local feature set of the image to be detected, wherein the local feature set comprises the texture features.
A local extreme point is detected on the gradient map, and the binary texture features of its neighborhood are extracted with the local extreme point as the center. Such local features are robust against factors such as rotation, translation, complex background and partial occlusion. The set of texture features obtained from these local extreme points represents the entire image to be identified and is used for feature matching in the next stage.
Alternatively, referring to fig. 8, step 203 in fig. 5 may include:
2031. determining a Hamming distance between a first feature and a second feature, wherein the first feature is a binary feature in the local feature set, and the second feature is a binary feature in the feature set of the marked picture.
The Hamming distance expresses the number of corresponding bits in which two words of the same length differ; d(x, y) denotes the Hamming distance between two words x and y. The two bit strings are XORed and the number of 1 bits in the result is counted; that count is the Hamming distance.
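For illustration, the XOR-and-count computation for binary features might look as follows (a 256-bit descriptor stored as four 64-bit words is an assumption; the patent does not fix the feature length):

    #include <bit>      // std::popcount (C++20)
    #include <cstdint>

    // Hamming distance between two 256-bit binary features: XOR each word,
    // then count the 1 bits in the result.
    int hamming(const uint64_t a[4], const uint64_t b[4]) {
        int d = 0;
        for (int i = 0; i < 4; ++i)
            d += std::popcount(a[i] ^ b[i]);
        return d;
    }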
2032. When the hamming distance is less than a distance threshold, then the first feature and the second feature are determined to be the matching features.
2033. Traversing each matching feature in the local feature set and the feature set of the marked picture according to the distance comparison mode of the first feature and the second feature to determine a matching set.
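Taken together, steps 2031 to 2033 amount to a brute-force traversal. A minimal sketch reusing the hamming() function from the previous sketch (container layout and names are assumptions):

    #include <array>
    #include <cstddef>
    #include <cstdint>
    #include <utility>
    #include <vector>

    int hamming(const uint64_t a[4], const uint64_t b[4]); // from the sketch above

    using Feature = std::array<uint64_t, 4>;   // one 256-bit binary feature

    // Returns the matching set as index pairs (i, j): feature i of the image
    // to be detected matches feature j of the marked picture.
    std::vector<std::pair<std::size_t, std::size_t>> match_features(
            const std::vector<Feature>& local_set,
            const std::vector<Feature>& marker_set,
            int distance_threshold) {
        std::vector<std::pair<std::size_t, std::size_t>> matching_set;
        for (std::size_t i = 0; i < local_set.size(); ++i)
            for (std::size_t j = 0; j < marker_set.size(); ++j)
                if (hamming(local_set[i].data(), marker_set[j].data())
                        < distance_threshold)
                    matching_set.emplace_back(i, j);
        return matching_set;
    }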
Determining that the tagged picture is detected when the number of matching features in the matching set is greater than a number threshold.
Optionally, another embodiment of the method for detecting a marked picture provided in the embodiment of the present application further includes:
and determining pose information of the marked picture according to each matching feature in the matching set, wherein the pose information is used for superposing a virtual image on the image to be detected, and the pose information comprises position information and posture information.
If the size of the matching set exceeds a threshold, the detected image is considered to contain the marked picture. A homography matrix is then computed from the pixel positions of the features in the matching set, such that most matches satisfy the same homography matrix with an error rate below the error threshold. The homography matrix expresses the position information and posture information, i.e. the pose information, of the marked picture in the detected image; the pose information may also be called a position matrix, which can be used for the augmented-reality rendering operation in the subsequent stage.
A homography, put simply, relates two projections of the same scene. For example, rotating the camera lens around an object yields two different photos whose contents need not correspond completely; partial correspondence is enough. The homography can be written as a matrix M such that picture 1 multiplied by M gives picture 2. Typical uses include image rectification, image alignment, and computing the camera motion (rotation and translation) between two images. Once the rotation and translation are extracted from the estimated homography matrix, this information can be used for navigation, or to insert models of 3D objects into an image or video so that they are rendered with the correct perspective and become part of the original scene.
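For illustration, the robust fitting described above can be realized with a RANSAC homography estimator; OpenCV's cv::findHomography is an assumed choice, and the 3.0-pixel reprojection threshold is illustrative:

    #include <opencv2/calib3d.hpp>
    #include <vector>

    // marker_pts / image_pts are the pixel positions of the matching features
    // in the marked picture and in the image to be detected, respectively.
    // The returned 3x3 matrix encodes the Marker's pose in the image and can
    // drive the augmented-reality rendering stage.
    cv::Mat estimate_pose(const std::vector<cv::Point2f>& marker_pts,
                          const std::vector<cv::Point2f>& image_pts) {
        return cv::findHomography(marker_pts, image_pts, cv::RANSAC, 3.0);
    }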
In fact, the scheme of tagged picture detection provided in the embodiment of the present application may also be applied to other scenes including tagged picture detection.
The above is a description of a method for detecting a marked picture, and an apparatus for detecting a marked picture in the embodiment of the present application is described below with reference to the accompanying drawings.
As shown in fig. 9, an embodiment of the apparatus 30 for detecting a marked picture provided in the embodiment of the present application includes:
an obtaining program module 301, configured to obtain an image to be detected, where the image to be detected includes a marked picture with texture;
an extraction program module 302, configured to extract gradient information of a pixel point in the image to be detected, which is obtained by the obtaining program module 301, through a table lookup method, so as to obtain a local feature set of the image to be detected;
a matching program module 303, configured to match the local feature set extracted by the extraction program module 302 with the feature set of the tagged picture to determine a matching set, where the matching set is an intersection formed by the local feature set and the feature set of the tagged picture;
a determining program module 304, configured to determine that the tagged picture is detected when the number of matching features in the matching set determined by the matching program module 303 is greater than a number threshold.
According to the method and the device, the gradient information can be searched through a table look-up method, so that the extraction speed of the gradient information of the pixel points in the image is increased, and the speed of Marker detection is increased.
Optionally, the extraction program module 302 is configured to extract gradient information of a pixel point in the image to be detected by using an arc tangent table and a neon function to obtain a local feature set of the image to be detected, where the arc tangent table includes a correspondence between a table lookup index and an angle, and the table lookup index is determined according to the neon function.
The angle information in the gradient information is searched by using a method of combining the arctan table and the neon function, so that the extraction speed of the gradient information of the pixel points in the image is further improved, and the Marker detection speed is accelerated.
Optionally, referring to fig. 10, in another embodiment of the apparatus 30 for detecting a marked picture provided in the embodiment of the present application, the extracting program module 302 includes:
a generating unit 3021, configured to generate an image pyramid according to the image to be detected;
a first extracting unit 3022, configured to extract gradient information for pixels in each layer of the image pyramid generated by the generating unit 3021, using an arctangent table and a neon function, to form a gradient map, wherein the angle domain of the arctangent function included in the arctangent table is in the range [0, nπ/4) and n is a positive integer smaller than 8;
a determining unit 3023, configured to determine a local extreme point on a gradient map formed by extracting gradient information by the first extracting unit 3022, where the local extreme point is a pixel point that is not affected by an environment in the image to be detected;
a second extracting unit 3024, configured to extract, with the local extreme point determined by the determining unit 3023 as a center, a texture feature of a pixel point adjacent to the local extreme point, so as to obtain a local feature set of the to-be-detected image including the texture feature.
Optionally, the first extracting unit 3022 is configured to:
determining a target table look-up index of the pixel point in the arc tangent table;
and determining, according to the arctangent table, the target angle of the pixel point corresponding to the target lookup index, wherein the target angle is included in the gradient information of that pixel point.
Optionally, the first extracting unit 3022 is configured to:
determining a ratio of a first value to a second value, wherein the first value is the minimum of the absolute value of a first difference and the absolute value of a second difference, and the second value is the maximum of those two absolute values; the first difference is the right gray value minus the left gray value, and the second difference is the lower gray value minus the upper gray value, the right, left, lower and upper gray values being the gray values of the pixels to the right of, to the left of, below and above the currently extracted pixel, respectively;
amplifying the ratio by a factor of N to obtain a calculation result, wherein N is the reciprocal of the discretization step;
and performing rounding-down on the calculation result to obtain the target table look-up index.
This may be done as follows, determining the target lookup index using the formula:

entry = ⌊N · min(|dx|, |dy|) / max(|dx|, |dy|)⌋

where entry denotes the target lookup index, dx = right gray value - left gray value, and dy = lower gray value - upper gray value, the right, left, lower and upper gray values being the gray values of the pixels to the right of, to the left of, below and above the currently extracted pixel, respectively, and N is the reciprocal of the discretization step.
Optionally, the matching program module 303 is configured to:
determining a Hamming distance between a first feature and a second feature, wherein the first feature is a binary feature in the local feature set, and the second feature is a binary feature in the feature set of the marked picture;
when the hamming distance is less than a distance threshold, determining the first feature and the second feature as the matching features;
traversing each matching feature in the local feature set and the feature set of the marked picture according to the distance comparison mode of the first feature and the second feature to determine a matching set.
Optionally, the determining program module 304 is further configured to determine pose information of the marker picture according to each matching feature in the matching set, where the pose information is used to overlay a virtual image on the image to be detected, and the pose information includes position information and pose information.
The embodiment of the present invention further provides another apparatus for detecting a marked picture, and the apparatus may be a terminal. As shown in fig. 11, for convenience of description only the parts related to the embodiment of the present invention are shown; for specific technical details that are not disclosed, please refer to the method part of the embodiment of the present invention. The terminal may be any terminal device, including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, a computer, and the like. The terminal being a mobile phone is taken as an example:
fig. 11 is a block diagram showing a partial structure of a cellular phone related to a terminal provided by an embodiment of the present invention. Referring to fig. 11, the cellular phone includes: radio Frequency (RF) circuit 1110, memory 1120, input unit 1130, display unit 1140, sensor 1150, audio circuit 1160, wireless fidelity (WiFi) module 1170, processor 1180, and camera 1190. Those skilled in the art will appreciate that the handset configuration shown in fig. 11 is not intended to be limiting and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
The following specifically describes each component of the mobile phone with reference to fig. 11:
The RF circuit 1110 may be used for receiving and transmitting signals during a message transmission or call; in particular, it receives downlink information from a base station and passes it to the processor 1180 for processing, and it transmits uplink data to the base station. In general, the RF circuit 1110 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 1110 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
The memory 1120 may be used to store software programs and modules, and the processor 1180 executes various functional applications and data processing of the mobile phone by operating the software programs and modules stored in the memory 1120. The memory 1120 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 1120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 1130 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile phone. Specifically, the input unit 1130 may include a touch panel 1131 and other input devices 1132. The touch panel 1131, also referred to as a touch screen, can collect touch operations of a user on or near it (for example, operations performed on or near the touch panel 1131 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection devices according to a preset program. Optionally, the touch panel 1131 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends them to the processor 1180, and it receives and executes commands sent by the processor 1180. In addition, the touch panel 1131 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 1131, the input unit 1130 may include other input devices 1132. Specifically, the other input devices 1132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 1140 may be used to display information input by the user or information provided to the user and various menus of the cellular phone. The Display unit 1140 may include a Display panel 1141, and optionally, the Display panel 1141 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 1131 can cover the display panel 1141, and when the touch panel 1131 detects a touch operation on or near the touch panel, the touch panel is transmitted to the processor 1180 to determine the type of the touch event, and then the processor 1180 provides a corresponding visual output on the display panel 1141 according to the type of the touch event. Although in fig. 11, the touch panel 1131 and the display panel 1141 are two independent components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 1131 and the display panel 1141 may be integrated to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 1150, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1141 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1141 and/or the backlight when the mobile phone moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the gesture of the mobile phone (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
The audio circuit 1160, a speaker 1161 and a microphone 1162 may provide an audio interface between the user and the mobile phone. The audio circuit 1160 may transmit the electrical signal converted from received audio data to the speaker 1161, which converts it into a sound signal for output; on the other hand, the microphone 1162 converts a collected sound signal into an electrical signal, which is received by the audio circuit 1160 and converted into audio data; the audio data is then output to the processor 1180 for processing and subsequently transmitted, for example, to another mobile phone via the RF circuit 1110, or output to the memory 1120 for further processing.
WiFi is a short-distance wireless transmission technology. Through the WiFi module 1170, the mobile phone can help the user receive and send e-mails, browse web pages, access streaming media and so on, providing wireless broadband internet access for the user. Although fig. 11 shows the WiFi module 1170, it is understood that it is not an essential component of the mobile phone and can be omitted entirely as needed without changing the essence of the invention.
The processor 1180 is a control center of the mobile phone, and is connected to various parts of the whole mobile phone through various interfaces and lines, and executes various functions of the mobile phone and processes data by operating or executing software programs and/or modules stored in the memory 1120 and calling data stored in the memory 1120, thereby performing overall monitoring of the mobile phone. Optionally, processor 1180 may include one or more processing units; preferably, the processor 1180 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, and the like, and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated within processor 1180.
The camera 1190 is used to collect images.
The mobile phone further includes a power supply (e.g., a battery) for supplying power to the components, and preferably, the power supply may be logically connected to the processor 1180 through a power management system, so that the power management system may manage charging, discharging, and power consumption.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which will not be described herein.
In this embodiment of the present invention, the processor 1180 included in the terminal further has the following functions:
acquiring an image to be detected, wherein the image to be detected comprises a marked picture with texture;
extracting gradient information of pixel points in the image to be detected by a table look-up method to obtain a local feature set of the image to be detected;
matching the local feature set with the feature set of the marked picture to determine a matching set, wherein the matching set is an intersection formed by the local feature set and the feature set of the marked picture;
determining that the tagged picture is detected when the number of matching features in the matching set is greater than a number threshold.
Optionally, the extracting gradient information of pixel points in the image to be detected by a table lookup method to obtain a local feature set of the image to be detected includes:
and extracting gradient information of pixel points in the image to be detected by using an arc tangent table and a neon function so as to obtain a local feature set of the image to be detected, wherein the arc tangent table comprises a corresponding relation between a table look-up index and an angle, and the table look-up index is determined according to the neon function.
Optionally, the extracting, by using an arctangent table and a neon function, gradient information of a pixel point in the image to be detected to obtain a local feature set of the image to be detected may include:
generating an image pyramid according to the image to be detected;
extracting gradient information from pixels in each layer of image in the image pyramid by using an arctangent table and a neon function to form a gradient map, wherein the angle domain of the arctangent function included in the arctangent table is in the range [0, nπ/4), and n is a positive integer less than 8;
determining local extreme points on the gradient map, wherein the local extreme points are pixel points which are not influenced by the environment in the image to be detected;
and taking the local extreme point as a center, extracting the texture features of the pixel points adjacent to the local extreme point to obtain a local feature set of the image to be detected, wherein the local feature set comprises the texture features.
Optionally, the extracting gradient information from the pixels in each layer of image in the image pyramid by using an arctan table and a neon function includes:
determining a target table look-up index of the pixel point in the arc tangent table;
and determining, according to the arctangent table, the target angle of the pixel point corresponding to the target lookup index, wherein the target angle is included in the gradient information of that pixel point.
Optionally, the determining a target table look-up index of the pixel point in the arc tangent table may include:
determining a ratio of a first value to a second value, wherein the first value is the minimum of the absolute value of a first difference and the absolute value of a second difference, and the second value is the maximum of those two absolute values; the first difference is the right gray value minus the left gray value, and the second difference is the lower gray value minus the upper gray value, the right, left, lower and upper gray values being the gray values of the pixels to the right of, to the left of, below and above the currently extracted pixel, respectively;
amplifying the ratio by a factor of N to obtain a calculation result, wherein N is the reciprocal of the discretization step;
and performing rounding-down on the calculation result to obtain the target table look-up index.
The target lookup index may also be determined using the following formula:

entry = ⌊N · min(|dx|, |dy|) / max(|dx|, |dy|)⌋

where entry denotes the target lookup index, dx = right gray value - left gray value, and dy = lower gray value - upper gray value, the right, left, lower and upper gray values being the gray values of the pixels to the right of, to the left of, below and above the currently extracted pixel, respectively, and N is the reciprocal of the discretization step.
Optionally, the method further comprises:
if the target lookup index entry is outside the range of lookup indexes of the arctangent table, the target angle is the largest angle in the angle domain;
if the absolute value of the second difference is greater than the absolute value of the first difference, that is, |dy| > |dx|, the target angle is the complement of the table lookup result corresponding to the target lookup index;
if the first difference is smaller than 0, that is, dx < 0, the target angle is the supplement of the table lookup result corresponding to the target lookup index;
if the second difference is smaller than 0, that is, dy < 0, the target angle is the negative of the table lookup result corresponding to the target lookup index;
and when the target angle involves the negative of the table lookup result corresponding to the target lookup index entry, the result is mapped into [0, 2π) using a neon function.
Optionally, the matching the local feature set with the feature set of the marked picture to determine a matching set may include:
determining a Hamming distance between a first feature and a second feature, wherein the first feature is a binary feature in the local feature set, and the second feature is a binary feature in the feature set of the marked picture;
when the hamming distance is less than a distance threshold, determining the first feature and the second feature as the matching features;
traversing each matching feature in the local feature set and the feature set of the marked picture according to the distance comparison mode of the first feature and the second feature to determine a matching set.
Optionally, the method further comprises:
and determining pose information of the marked picture according to each matching feature in the matching set, wherein the pose information is used for superposing a virtual image on the image to be detected, and the pose information comprises position information and posture information.
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be realized wholly or partially in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another via a wired link (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless link (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)).
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include a ROM, a RAM, a magnetic disk, an optical disc, and the like.
The method, apparatus, and computer-readable storage medium for detecting a marked picture provided by the embodiments of the present application are described in detail above. Specific examples are used herein to explain the principles and implementation of the present application, and the description of the above embodiments is intended only to help understand the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present application, vary the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims (8)

1. A method for detecting a marked picture, applied to an augmented reality scene, the method comprising:
acquiring an image to be detected, wherein the image to be detected comprises a marked picture with texture;
extracting gradient information of pixel points in the image to be detected by a table look-up method to obtain a local feature set of the image to be detected, wherein the method comprises the following steps: extracting gradient information of pixel points in the image to be detected by using an arc tangent table and a neon function to obtain a local feature set of the image to be detected, wherein the arc tangent table comprises a corresponding relation between a table look-up index and an angle, and the table look-up index is determined according to the neon function;
matching the local feature set with the feature set of the marked picture to determine a matching set, wherein the matching set is an intersection formed by the local feature set and the feature set of the marked picture;
determining that the marked picture is detected when the number of matching features in the matching set is greater than a number threshold;
wherein the extracting gradient information of pixel points in the image to be detected using an arc tangent table and a neon function to obtain a local feature set of the image to be detected comprises:
generating an image pyramid according to the image to be detected;
extracting gradient information for the pixel points in each layer of the image pyramid using an arc tangent table and a neon function to form a gradient map, the arc tangent table including an arc tangent function whose angle domain is in the range [0, π/4], and N is a positive integer less than 8; wherein the extracting gradient information for the pixel points in each layer of the image pyramid using an arc tangent table and a neon function comprises: determining a target table look-up index of the pixel point in the arc tangent table, including: determining the ratio of a first value to a second value, wherein the first value is the minimum of the absolute value of a first difference and the absolute value of a second difference, and the second value is the maximum of the absolute value of the first difference and the absolute value of the second difference, the first difference being the difference between the right gray value and the left gray value and the second difference being the difference between the lower gray value and the upper gray value, wherein the right gray value is the gray value of the pixel point on the right side of the currently extracted pixel point, the left gray value is the gray value of the pixel point on the left side of the currently extracted pixel point, the lower gray value is the gray value of the pixel point below the currently extracted pixel point, and the upper gray value is the gray value of the pixel point above the currently extracted pixel point; multiplying the ratio by N to obtain a calculation result, wherein N is the reciprocal of the discretization step of the ratio; and rounding the calculation result down to obtain the target table look-up index;
determining the target angle of the pixel point corresponding to the target table look-up index according to the arc tangent table, wherein the target angle of the pixel point is one kind of gradient information of the pixel point;
determining local extreme points on the gradient map, wherein the local extreme points are pixel points in the image to be detected that are not influenced by the environment;
and, taking each local extreme point as a center, extracting the texture features of the pixel points adjacent to the local extreme point to obtain the local feature set of the image to be detected, the local feature set comprising the texture features.
2. The method of claim 1, further comprising:
if the target table look-up index falls outside the index range of the arc tangent table, the target angle is the largest angle in the angle domain;
if the absolute value of the second difference is greater than the absolute value of the first difference, the target angle is the complementary angle of the table look-up result corresponding to the target table look-up index;
if the first difference is less than 0, the target angle is the supplementary angle of the table look-up result corresponding to the target table look-up index;
if the second difference is less than 0, the target angle is the negation of the table look-up result corresponding to the target table look-up index;
and when the target angle is the negation of the table look-up result corresponding to the target table look-up index, a neon function is used to map the result into [0, 2π].
3. The method according to claim 1 or 2, wherein the matching the local feature set with the feature set of the tagged picture to determine a matching set comprises:
determining a Hamming distance between a first feature and a second feature, wherein the first feature is a binary feature in the local feature set, and the second feature is a binary feature in the feature set of the marked picture;
when the hamming distance is less than a distance threshold, determining the first feature and the second feature as the matching features;
traversing the local feature set and the feature set of the marked picture in the same distance-comparison manner as used for the first feature and the second feature, so as to determine each matching feature and obtain the matching set.
4. The method of claim 3, further comprising:
and determining pose information of the marked picture according to each matching feature in the matching set, wherein the pose information is used for superimposing a virtual image on the image to be detected and comprises position information and posture information.
5. An apparatus for detecting a marked picture, applied to an augmented reality scene, the apparatus comprising:
an acquisition program module, configured to acquire an image to be detected, wherein the image to be detected comprises a marked picture with texture;
an extraction program module, configured to extract, by a table look-up method, gradient information of pixel points in the image to be detected acquired by the acquisition program module, to obtain a local feature set of the image to be detected;
a matching program module, configured to match the local feature set extracted by the extraction program module with the feature set of the marked picture to determine a matching set, wherein the matching set is the intersection formed by the local feature set and the feature set of the marked picture;
a determination program module, configured to determine that the marked picture is detected when the number of matching features in the matching set determined by the matching program module is greater than a number threshold;
wherein the extraction program module is configured to extract gradient information of pixel points in the image to be detected using an arc tangent table and a neon function to obtain the local feature set of the image to be detected, the arc tangent table comprising a correspondence between table look-up indexes and angles, the table look-up index being determined according to the neon function;
and wherein the extraction program module comprises:
a generation unit, configured to generate an image pyramid according to the image to be detected;
a first extraction unit, configured to extract gradient information for the pixel points in each layer of the image pyramid generated by the generation unit using an arc tangent table and a neon function to form a gradient map, wherein the angle domain of the arc tangent function included in the arc tangent table is in the range [0, π/4], and N is a positive integer less than 8;
a determination unit, configured to determine local extreme points on the gradient map formed by the first extraction unit, wherein the local extreme points are pixel points in the image to be detected that are not influenced by the environment;
a second extraction unit, configured to extract, with each local extreme point determined by the determination unit as a center, the texture features of the pixel points adjacent to the local extreme point, to obtain the local feature set of the image to be detected comprising the texture features;
the first extraction unit is specifically configured to:
determining a target table look-up index of the pixel point in the arc tangent table, including: determining the ratio of a first value to a second value, wherein the first value is the minimum of the absolute value of a first difference and the absolute value of a second difference, and the second value is the maximum of the absolute value of the first difference and the absolute value of the second difference, the first difference being the difference between the right gray value and the left gray value and the second difference being the difference between the lower gray value and the upper gray value, wherein the right gray value is the gray value of the pixel point on the right side of the currently extracted pixel point, the left gray value is the gray value of the pixel point on the left side of the currently extracted pixel point, the lower gray value is the gray value of the pixel point below the currently extracted pixel point, and the upper gray value is the gray value of the pixel point above the currently extracted pixel point; multiplying the ratio by N to obtain a calculation result, wherein N is the reciprocal of the discretization step of the ratio; rounding the calculation result down to obtain the target table look-up index; and determining, according to the arc tangent table, the target angle of the pixel point corresponding to the target table look-up index, wherein the target angle of the pixel point is one kind of gradient information of the pixel point.
6. The apparatus of claim 5, wherein the matching program module is configured to:
determining a Hamming distance between a first feature and a second feature, wherein the first feature is a binary feature in the local feature set, and the second feature is a binary feature in the feature set of the marked picture;
when the hamming distance is less than a distance threshold, determining the first feature and the second feature as the matching features;
traverse the local feature set and the feature set of the marked picture in the same distance-comparison manner as used for the first feature and the second feature, so as to determine each matching feature and obtain the matching set.
7. A computer device, comprising: an input/output (I/O) interface, a processor, and a memory, the memory storing instructions corresponding to the method for detecting a marked picture according to any one of claims 1 to 4;
the I/O interface is configured to acquire the image to be detected;
the processor is configured to execute the instructions stored in the memory to perform the steps of the method for detecting a marked picture according to any one of claims 1 to 4.
8. A computer-readable storage medium having stored therein instructions for marked picture detection which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 4.