CN110929741A - Image feature descriptor extraction method, device, equipment and storage medium


Info

Publication number
CN110929741A
Authority
CN
China
Prior art keywords
image
processed
feature
edge
filling
Prior art date
Legal status
Pending
Application number
CN201911158816.6A
Other languages
Chinese (zh)
Inventor
任明星 (Ren Mingxing)
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201911158816.6A
Publication of CN110929741A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image feature descriptor extraction method, device, equipment and storage medium, and relates to the field of image recognition. The method comprises: acquiring an image to be processed; extending the edges of the image to be processed outward with solid-color pixel points to obtain a filled image; and extracting the image feature descriptors of the filled image. The box filter can then extract features at the edge pixel points of the image to be processed, so the features in the image to be processed are fully extracted and the boundary effect is effectively avoided. Moreover, for small-size images, sufficient features can be accurately extracted from the processed filled image through the SIFT/SURF algorithm.

Description

Image feature descriptor extraction method, device, equipment and storage medium
Technical Field
The present application relates to the field of image recognition, and in particular, to a method, an apparatus, a device, and a storage medium for extracting an image feature descriptor.
Background
In the field of image recognition, image regions similar to a template image (for example, a button icon) can be identified, based on that template image, in images rendered at different resolutions.
Various feature extraction algorithms implement the above function, such as the Scale-Invariant Feature Transform (SIFT) algorithm and the Speeded-Up Robust Features (SURF) algorithm in the classic cross-platform computer vision library OpenCV. The SIFT/SURF algorithms can quickly and accurately extract scale-invariant feature points from images rendered at different resolutions, and those feature points can then be used to locate the region of the template image within the images.
However, the SIFT/SURF algorithms extract too few features, or none at all, at image edge pixel points.
Disclosure of Invention
The embodiments of the present application provide an image feature descriptor extraction method, device, equipment and storage medium, which can solve the problem that the SIFT/SURF algorithm extracts too few features, or cannot extract features, at image edge pixel points. The technical solution is as follows:
according to an aspect of the present application, there is provided a method for extracting an image feature descriptor, the method including:
acquiring an image to be processed;
extending the edges of the image to be processed outward with solid-color pixel points to obtain a filled image;
and extracting an image feature descriptor of the filled image, wherein the image feature descriptor is obtained based on feature points extracted by a box filter from the filled image.
According to another aspect of the present application, there is provided a method of automatically controlling a user interface, the method including:
acquiring a local area map of a user interface, and acquiring a template image corresponding to a control;
extending the edges of the template image outward with solid-color pixel points to obtain a filled image;
extracting a first feature set of the filled image and extracting a second feature set of the local area map;
performing feature matching on the first feature set and the second feature set to obtain the position of the control in the local area map;
and automatically triggering the control on the user interface based on the position.
According to another aspect of the present application, there is provided an image feature descriptor extraction apparatus, including:
the acquisition module is used for acquiring an image to be processed;
the filling module is used for extending the edges of the image to be processed outward with solid-color pixel points to obtain a filled image;
and the extraction module is used for extracting the image feature descriptors of the filled image, the image feature descriptors being obtained based on feature points extracted by a box filter from the filled image.
According to another aspect of the present application, there is provided an automatic control apparatus of a user interface, the apparatus including:
the acquisition module is used for acquiring a local area map of the user interface, and acquiring a template image corresponding to the control;
the filling module is used for extending the edges of the template image outward with solid-color pixel points to obtain a filled image;
the extraction module is used for extracting a first feature set of the filled image and extracting a second feature set of the local area map;
the matching module is used for performing feature matching on the first feature set and the second feature set to obtain the position of the control in the local area map;
and the control module is used for automatically triggering the control on the user interface based on the position.
According to another aspect of the present application, there is provided a server including:
a memory;
a processor coupled to the memory;
wherein the processor is configured to load and execute executable instructions to implement the method for extracting an image feature descriptor described in the first aspect and its optional embodiments above, or the method for automatically controlling a user interface described in the second aspect and its optional embodiments above.
According to another aspect of the present application, there is provided a terminal including:
a memory;
a processor coupled to the memory;
wherein the processor is configured to load and execute executable instructions to implement the method for extracting an image feature descriptor described in one of the above aspects and its optional embodiments, or the method for automatically controlling a user interface described in another of the above aspects and its optional embodiments.
According to another aspect of the present application, there is provided a computer-readable storage medium having at least one instruction, at least one program, a code set, or an instruction set stored therein, which is loaded and executed by a processor to implement the method for extracting an image feature descriptor described in one of the above aspects and its optional embodiments, or the method for automatically controlling a user interface described in another of the above aspects and its optional embodiments.
The technical solutions provided in the embodiments of the present application bring at least the following beneficial effects:
in the process of extracting image feature descriptors, the edges of the image to be processed are extended outward with solid-color pixel points to obtain a filled image, and the image feature descriptors are extracted from the filled image. The box filter can therefore extract features at the edge pixel points of the image to be processed, so the features in the image to be processed are fully extracted and the boundary effect is effectively avoided. Moreover, since no feature points can be extracted from continuous patches of solid-color pixel points, filling the newly added edge pixel points with solid-color pixels ensures that no redundant feature points are extracted during feature extraction. Therefore, for small-size images, sufficient features can be accurately extracted from the processed filled image through the SIFT/SURF algorithm.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description are only some embodiments of the present application; those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic block diagram of a computer system provided in an exemplary embodiment of the present application;
FIG. 2 is a flowchart of a method for extracting an image feature descriptor provided in an exemplary embodiment of the present application;
FIG. 3 is a schematic diagram of expanding newly added edge pixel points according to an exemplary embodiment of the present application;
FIG. 4 is a flowchart of a method for extracting an image feature descriptor provided in another exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of expanding newly added edge pixel points according to another exemplary embodiment of the present application;
FIG. 6 is a flowchart of a method for extracting an image feature descriptor according to another exemplary embodiment of the present application;
FIG. 7 is a schematic diagram of expanding newly added edge pixel points according to another exemplary embodiment of the present application;
FIG. 8 is a schematic diagram of expanding newly added edge pixel points according to another exemplary embodiment of the present application;
FIG. 9 is a flowchart of a method for extracting an image feature descriptor according to another exemplary embodiment of the present application;
FIG. 10 is a flow chart of a method for automatic control of a user interface provided by an exemplary embodiment of the present application;
FIG. 11 is a schematic diagram of filling newly added edge pixel points according to an exemplary embodiment of the present application;
FIG. 12 is a schematic diagram of filling newly added edge pixel points according to another exemplary embodiment of the present application;
FIG. 13 is an interface schematic of an automated control user interface provided by an exemplary embodiment of the present application;
FIG. 14 is a block diagram of an apparatus for extracting an image feature descriptor provided in an exemplary embodiment of the present application;
FIG. 15 is a block diagram of an automatic control apparatus of a user interface provided in an exemplary embodiment of the present application;
FIG. 16 is a block diagram of a server provided in an exemplary embodiment of the present application;
fig. 17 is a schematic structural diagram of a terminal according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Several terms referred to in this application are explained below:
Boundary effect: the phenomenon of feature loss caused by the features of an image's edge pixel points not being acquirable during image processing. In this application, the boundary effect means that when a box filter is used to collect the feature points of an image to be processed (including a template image), the features of the original edge pixel points of the image to be processed cannot be collected. For example, a box filter of size 3 × 3 collecting feature points from an image to be processed with a pixel size of 5 × 5 yields a feature point matrix of size 3 × 3, and cannot collect feature points at the edges of the image to be processed.
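The shrinkage in this example can be reproduced in a few lines of Python; the following is a minimal sketch whose 5 × 5 matrix and averaging kernel are illustrative assumptions, not values from the patent:

```python
import numpy as np
from scipy.signal import convolve2d

# Toy 5x5 "image" and a 3x3 box (averaging) filter.
image = np.arange(25, dtype=np.float32).reshape(5, 5)
box = np.ones((3, 3), dtype=np.float32) / 9.0

# "valid" mode keeps only positions where the filter fits entirely
# inside the image, so the response shrinks from 5x5 to 3x3: the
# outermost ring of pixels yields no response of its own.
response = convolve2d(image, box, mode="valid")
print(response.shape)  # (3, 3)
```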
Image feature descriptor: a feature point descriptor that can be extracted from an image by the SIFT algorithm or the SURF algorithm. Each image feature descriptor is obtained by assigning an i-dimensional direction parameter to a feature point based on the gradient direction distribution of the pixels in that feature point's neighborhood; for example, the SIFT algorithm assigns a 128-dimensional direction parameter to each feature point, yielding the image feature descriptor of that feature point. The SIFT and SURF algorithms are both feature extraction algorithms in the field of image processing.
The SIFT and SURF algorithms are applied in the field of image recognition to quickly and accurately extract image feature descriptors from a template image and a target image, so that the template image can be recognized within the target image by matching their descriptors. However, when these algorithms extract feature points with a box filter, feature points at image edge pixel points cannot be extracted due to the boundary effect. For an image with a large pixel size, many feature points can be extracted, so the influence on the final image feature descriptors is limited; for an image with a small pixel size, however, few feature points can be extracted, and the boundary effect strongly affects the final image feature descriptors. The present application therefore provides a method for extracting image feature descriptors to solve this problem; see the following embodiments.
Referring to fig. 1, a schematic structural diagram of a computer system provided by an exemplary embodiment of the present application is shown, where the computer system includes a terminal 120 and a server 140. Wherein the terminal 120 is connected to the server 140 through a wired or wireless network.
An application is installed and runs in the terminal 120; optionally, the application may be a terminal-side cloud application. A cloud application is an application program based on cloud computing: during its running, program operations are completed in the cloud server, while the terminal 120 handles the decompression, rendering, and display of the cloud application's user interface. Optionally, the cloud application in the terminal 120 also has the function of executing some of the instructions.
Illustratively, the cloud application may be a cloud game. The cloud game runs in the terminal 120; the terminal 120 receives control instructions on the game interface of the cloud game and sends them to the cloud server through a wired or wireless network. The cloud server processes the control instructions of the cloud game through cloud computing, generating game pictures and sound signals; the terminal receives the game pictures and sound signals sent by the cloud server and presents them on the user interface. The cloud server is a device that provides background services for cloud applications; illustratively, it provides cloud computing services for them.
The server 140 provides background services for the application programs; optionally, the server 140 may include a cloud server. An operating system, such as an Android system or an Apple system, is installed on the cloud server 140, and a server-side cloud application is installed and run in that operating system.
In some embodiments, the terminal 120 independently performs the image feature extraction method provided herein or the automatic control method of the user interface.
In some embodiments, the image feature extraction method provided herein, or the automatic control method of the user interface, is independently performed by the server 140.
In some embodiments, the method for extracting image features or the method for automatically controlling the user interface provided by the present application is cooperatively performed by both the terminal 120 and the server 140.
Schematically, taking the case where the terminal 120 and the server 140 cooperate to realize the method for extracting image feature descriptors provided by the present application: the server 140 obtains an image to be processed from the terminal 120; extends the edges of the image to be processed outward with solid-color pixel points to obtain a filled image; extracts feature points from the filled image with a box filter; constructs the image feature descriptors of the filled image from those feature points; and finally saves the extracted image feature descriptors or feeds them back to the terminal 120.
It should be noted that, when the application program is a cloud application, the method for extracting image features or the method for automatically controlling the user interface provided in the present application need only be executed in the server 140, that is, in the cloud server providing background services for the cloud application. Schematically, taking the automatic control method of the user interface as an example: during the running of the cloud application, when the cloud server detects a generated user interface, it intercepts a local area map of the user interface; the cloud server stores the template image corresponding to the control, acquires that template image, and extends its edges outward with solid-color pixel points to obtain a filled image; the cloud server then extracts a first feature set of the filled image, extracts a second feature set of the local area map, performs feature matching on the two feature sets to obtain the position of the control in the local area map, and automatically triggers the control on the user interface based on that position.
The terminal 120 is provided with a memory and a processor; the memory and processor in the terminal 120 are used to support the execution of the application programs. The server 140 is provided with a memory and a processor, and the memory and the processor in the server 140 are used for providing a background service for the application program.
Optionally, the memory may include, but is not limited to: Random Access Memory (RAM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), and Electrically Erasable Programmable Read-Only Memory (EEPROM).
Alternatively, the processor may be formed of one or more integrated circuit chips. Optionally, the processor may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP).
Alternatively, the terminal 120 may include at least one of a notebook computer, a desktop computer, a smart phone, and a tablet computer.
It should be noted that the above method for extracting image features can be applied to image feature extraction in various fields. For example, in the field of image recognition, the image feature descriptors of two images can be extracted by the above method, and image recognition is performed by matching the descriptors of the two images. As another example, in the automatic control method of the user interface, the image feature descriptors of the template image corresponding to a control and of the local area map of the user interface are extracted separately by the above method; the position of the control's image in the user interface is then determined by matching the descriptors of the two images, so that the control on the user interface can be triggered automatically.
Referring to fig. 2, a flowchart of a method for extracting an image feature descriptor according to an exemplary embodiment of the present application is shown. The method is described by taking its application to a terminal as an example and includes:
step 201, acquiring an image to be processed.
The image to be processed may include: an image stored in the terminal, an image obtained by taking a screenshot of the user interface, or an image captured by a camera; this is not limited in the present application.
Step 202, extending the edges of the image to be processed outward with solid-color pixel points to obtain a filled image.
It should be noted that, because no feature points are extracted from continuous patches of solid-color pixel points, the edges of the image to be processed are extended outward with solid-color pixel points; this avoids adding redundant feature points when extracting feature points from the filled image.
Optionally, the terminal extends each edge of the image to be processed outward by m rows of solid-color pixel points to obtain the filled image. Whether feature points of the original edge pixel points can be extracted from the filled image depends on the size of the box filter used for feature point extraction: if the size of the box filter is (2n + 1) × (2n + 1), then m must be greater than or equal to n to ensure that the box filter can extract the feature points corresponding to the original edge pixel points from the filled image, where m and n are positive integers.
Schematically, as shown in fig. 3, if the size of the box filter is 3 × 3, the terminal expands the edge pixels of the image to be processed 11, with a pixel size of 4 × 4, to obtain the filled image 12 with a pixel size of 5 × 5.
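As a sketch of step 202, the padding can be done with OpenCV's cv2.copyMakeBorder; the black fill value and the choice m = n are illustrative assumptions:

```python
import cv2

def pad_with_solid_color(image, n, value=(0, 0, 0)):
    """Extend every edge of `image` outward by m rows of solid-color
    (black by default) pixels, taking m = n for a (2n + 1) x (2n + 1)
    box filter, the minimum that still covers the original edge pixels."""
    m = n
    return cv2.copyMakeBorder(image, m, m, m, m,
                              borderType=cv2.BORDER_CONSTANT, value=value)
```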
Step 203, extracting the image feature descriptors of the filled image.
The image feature descriptors are obtained based on the feature points extracted by the box filter from the filled image. Schematically, the terminal constructs an image pyramid based on the filled image; detects scale-invariant feature points in the image pyramid with the box filter; calculates the gradient directions of the feature points; and constructs the image feature descriptors from those gradient directions. Optionally, the extraction of image feature descriptors from the filled image may follow the detailed descriptor-extraction steps of the SIFT/SURF algorithms, which are not repeated here.
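Assuming OpenCV 4.4 or later (where SIFT_create lives in the main module; SURF sits in the patent-encumbered contrib package, so SIFT is used here), step 203 might look like the following sketch. detectAndCompute internally builds the scale pyramid, detects the keypoints, and computes one 128-dimensional gradient-orientation descriptor per keypoint:

```python
import cv2

def extract_descriptors(filled_image):
    """Detect scale-invariant keypoints on the filled image and compute
    their image feature descriptors (one 128-dimensional vector each)."""
    gray = cv2.cvtColor(filled_image, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    return keypoints, descriptors
```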
In summary, in the method for extracting image feature descriptors provided in this embodiment, the edges of the image to be processed are extended outward with solid-color pixel points to obtain a filled image, and the image feature descriptors are extracted from the filled image. The box filter can therefore extract features at the edge pixel points of the image to be processed, so the features in the image to be processed are fully extracted and the boundary effect is effectively avoided. Moreover, since no feature points can be extracted from continuous patches of solid-color pixel points, filling the newly added edge pixel points with solid-color pixels ensures that no redundant feature points are extracted during feature extraction. Therefore, for small-size images, sufficient features can be accurately extracted from the processed filled image through the SIFT/SURF algorithm.
Based on fig. 2, the expansion of the edges of the image to be processed includes at least the following two modes:
1) directly copying the image to be processed to the central area of a solid-color image template, as shown in fig. 4;
2) extending newly added edge pixel points at each edge of the image to be processed, and filling in the pixel values of the newly added edge pixel points, as shown in fig. 6.
Referring to fig. 4, which shows a flowchart of an image feature descriptor extraction method provided in another exemplary embodiment of the present application, step 202 in fig. 2 includes step 2021, and the exemplary steps are as follows:
step 2021, copy the image to be processed to the central area of the solid image template to obtain a filled image.
The terminal stores a pure color image template which is used for filling edge pixels of an image to be processed. The edge length of the pure color image template is greater than that of the image to be processed; due to the boundary effect, if the feature points of the edge pixel points of the image to be processed need to be extracted, the difference between the edge length of the pure-color image template and the edge length of the image to be processed is greater than or equal to 2n, wherein the size of the box filter is (2n +1) × (2n + 1).
Optionally, the solid image template comprises a black image template or a white image template. Taking the pure color image template as the black image template as an example, as shown in fig. 5, if the size of the box filter is 3 × 3, the terminal copies the to-be-processed image 21 with the pixel size of 3 × 3 to the central area of the black image template 22 with the pixel size of 5 × 5, and obtains a filled image 23.
According to the filling mode, the preset template is used, the image to be processed is directly copied to the template, and filling of edge pixel points of the image to be processed can be rapidly completed.
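A NumPy sketch of this template-based filling follows; the per-call allocation and the black fill are assumptions for illustration (in practice the template is preset and reused):

```python
import numpy as np

def paste_into_solid_template(image, template_hw, fill=0):
    """Copy `image` into the central area of a solid-color (black by
    default) template whose edge lengths exceed the image's by >= 2n."""
    h, w = image.shape[:2]
    th, tw = template_hw
    template = np.full((th, tw) + image.shape[2:], fill, dtype=image.dtype)
    top, left = (th - h) // 2, (tw - w) // 2
    template[top:top + h, left:left + w] = image
    return template
```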
Referring to fig. 6, which shows a flowchart of an image feature descriptor extraction method provided in another exemplary embodiment of the present application, step 202 in fig. 2 includes step 2022 to step 2023, and exemplary steps are as follows:
Step 2022, extending the edges of the image to be processed outward to obtain newly added edge pixel points.
The terminal may expand the pixel points of the image to be processed in at least one of the following modes:
1) The terminal extends m rows of newly added edge pixel points from inside to outside at each edge of the image to be processed.
Here m is a number of filling rows preset in the terminal. The terminal extends m rows of pixel points from inside to outside at each edge of the image to be processed; these m rows are the newly added edge pixel points, which are blank pixel points, that is, pixel points without pixel values. For example, when m = 2, as shown in fig. 7, one grid represents one pixel: the pixel size of the image to be processed 31 is 5 × 5, and the pixel size of the edge-extended image 32 is 9 × 9, where the blank grids are the newly added edge pixel points.
The value of m is related to the size of the box filter used to extract feature points from the image. If the size of the box filter is (2n + 1) × (2n + 1), then m is greater than or equal to n, where m and n are positive integers and × denotes multiplication. Illustratively, if feature points are extracted from each image layer of the image pyramid, the box filter size used for feature point extraction differs per layer; in that case, (2n + 1) × (2n + 1) may be taken as the maximum of those box filter sizes, and the value of m must be greater than or equal to the n of that maximum size.
2) The terminal copies the image to be processed to the central area of a blank image template, and determines the unoccupied pixel points at the periphery of the blank image template, relative to the image to be processed, as the newly added edge pixel points.
A blank image template is stored in the terminal; the terminal acquires the blank image template, copies the image to be processed to its central area, and determines the unoccupied pixel points at the periphery of the template as the newly added edge pixel points.
For example, as shown in fig. 8, the size of the blank image template 41 stored in the terminal is 6 × 6 and the size of the image to be processed 42 is 4 × 4. The terminal copies the image to be processed 42 to the central area 43 of the blank image template 41 to obtain an edge-extended image 44 of size 6 × 6, where the blank grids around the edge-extended image 44 are the newly added edge pixel points.
The edge length of the blank image template is greater than that of the image to be processed.
Alternatively, the image to be processed may adopt a Red Green Blue (RGB) format or a luminance-chrominance (YUV) format.
Step 2023, filling the newly added edge pixel points with solid-color pixel values to obtain a filled image.
Optionally, the terminal sets the pixel values of the newly added edge pixel points to the pixel value of black pixel points to obtain the filled image; or the terminal sets them to the pixel value of white pixel points to obtain the filled image.
It should be noted that, if the image of the close control in some applications is white, black pixel points may be used when filling the newly added edge pixel points of the edge-extended template image of that close control; conversely, if the image of the close control is black, white pixel points may be used.
Optionally, when the newly added edge pixel points are filled with black pixel points, the terminal sets their RGB pixel values to (0, 0, 0) to obtain the filled image, or sets their YUV pixel values to (0, 128, 128) to obtain the filled image.
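The following sketch fills the m-row blank border in place with the values given above; (0, 128, 128) is black in 8-bit YUV (zero luma, neutral chroma), and the in-place slicing approach is an assumption for illustration:

```python
BLACK_RGB = (0, 0, 0)
BLACK_YUV = (0, 128, 128)  # Y = 0 with neutral U/V in 8-bit YUV

def fill_border(padded, m, value=BLACK_RGB):
    """Paint the m-row border (the newly added edge pixel points) of an
    already edge-extended NumPy image array with a solid color."""
    padded[:m, :] = value   # top rows
    padded[-m:, :] = value  # bottom rows
    padded[:, :m] = value   # left columns
    padded[:, -m:] = value  # right columns
    return padded
```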
In this mode, the pixel points in the blank image template are not assigned values, so the template occupies little space when stored in memory, and this filling mode can also quickly complete the filling of the edge pixel points of the image to be processed.
It should be further noted that a preset size is set in the terminal. Before executing step 202, the terminal also judges the pixel size of the image to be processed: when the pixel size of the image to be processed is smaller than the preset size, steps 202 to 203 are executed; when it is greater than or equal to the preset size, the image feature descriptors are extracted directly from the image to be processed. For example, with a preset size of 80 × 80, the terminal executes steps 202 to 203 when the pixel size of the image to be processed is 60 × 60, and directly extracts the image feature descriptors of the image to be processed when the pixel size is 90 × 90.
Schematically, for the whole process, fig. 9 shows a flowchart of an image feature descriptor extraction method provided in another embodiment of the present application. The method includes: the terminal acquires an image to be processed and judges whether its pixel size is smaller than 60 × 60. When the pixel size is greater than or equal to 60 × 60, the terminal extracts the image feature descriptors from the image to be processed directly through the SIFT algorithm or the SURF algorithm; when the pixel size is smaller than 60 × 60, the terminal expands and fills the edge pixel points of the image to be processed to obtain a filled image of size 80 × 80, and extracts the image feature descriptors from the filled image through the SIFT algorithm or the SURF algorithm. Here the preset size is less than or equal to 80 × 80, that is, less than or equal to the size of the filled image.
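The decision flow of FIG. 9 reduces to a size check; in this sketch, pad_to_size is a hypothetical helper (it could combine the border expansion and solid-color filling sketched earlier), extract_descriptors is the extraction sketched above, and the 60 × 60 and 80 × 80 values are the ones used in this example:

```python
PRESET_H, PRESET_W = 60, 60  # below this size, padding is required
TARGET_H, TARGET_W = 80, 80  # pixel size of the filled image

def descriptors_with_size_check(image):
    h, w = image.shape[:2]
    if h < PRESET_H and w < PRESET_W:
        # Small image: expand and fill the edges before extraction.
        image = pad_to_size(image, (TARGET_H, TARGET_W))  # hypothetical helper
    return extract_descriptors(image)
```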
By judging the pixel size of the image to be processed, the method classifies images to be processed into two types: from images of at least the preset size, enough feature points can be extracted, so edge expansion and edge pixel filling are not needed; from images smaller than the preset size, enough feature points cannot be extracted, so edge expansion and edge pixel filling are required to ensure that a sufficient number of feature points are extracted.
In the field of image recognition, the above method for extracting image feature descriptors can realize the recognition of the image corresponding to a control in a user interface, and thereby an automatic control method for the user interface. Referring to fig. 10, a flowchart of an automatic control method for a user interface according to an exemplary embodiment of the present application is shown. The method is described by taking its application to a server as an example and includes:
step 301, a local area map on a user interface is obtained.
Optionally, the server includes a cloud server. An operating system is installed on the cloud server, an application program is installed and run in the operating system, and that application program is a server-side cloud application; therefore, when detecting that a user interface has been generated, the cloud server intercepts a local area map of the user interface.
Alternatively, the server acquires the local area map of the user interface from the terminal. The local area map includes the image corresponding to the control.
Step 302, acquiring a template image corresponding to the control.
Template images corresponding to j controls are stored in the server, and control identifiers are stored in correspondence with the template images; each control corresponds to one template image, where j is a positive integer. The server acquires the corresponding template image according to the control identifier. It should be noted that, when the automatic control method of the user interface is executed, the task code created by the server includes the control identifier.
Step 303, extending the edges of the template image outward with solid-color pixel points to obtain a filled image.
In some embodiments, a solid-color image template is stored in the server. When extending the edges of the template image outward, the server acquires the solid-color image template and copies the template image to its central area to obtain the filled image; the edge length of the solid-color image template is greater than that of the template image. For details, refer to the description of the edge expansion shown in fig. 5.
Optionally, the solid-color image template includes a black image template or a white image template.
In some embodiments, the server obtains the filled image through the following steps:
1) The server extends the edges of the template image outward to obtain newly added edge pixel points.
The server extends m rows of newly added edge pixel points from inside to outside at each edge of the template image. Alternatively, a blank image template is stored in the server; the server copies the template image to the central area of the blank image template, the edge length of the blank image template being greater than that of the template image, and determines the unoccupied pixel points at the periphery of the blank image template, relative to the template image, as the newly added edge pixel points. For details, refer to the descriptions of the edge expansion shown in fig. 7 and fig. 8.
A box filter is provided in the server for extracting the feature points in images; optionally, if the size of the box filter is (2n + 1) × (2n + 1), the value of m is greater than or equal to n.
2) The server fills the newly added edge pixel points with solid-color pixel values to obtain the filled image.
Optionally, the server sets the pixel values of the newly added edge pixel points to the pixel value of black pixel points to obtain the filled image, or to the pixel value of white pixel points.
For example, the server sets the RGB pixel values of the newly added edge pixel points to (0, 0, 0) to obtain the filled image, or to (255, 255, 255). As shown in fig. 11, when the template image 51 of the close control is edge-extended and filled, and the image of the close control in the template image 51 is white, the RGB pixel values of the newly added edge pixel points are set to (0, 0, 0) to obtain the filled image 52. As shown in fig. 12, when the image of the close control in the template image 61 is black, the RGB pixel values of the newly added edge pixel points are set to (255, 255, 255) to obtain the filled image 62.
Step 304, extracting a first feature set of the filled image.
The first feature set includes k first descriptors, where k is a positive integer. The server constructs an image pyramid based on the filled image; detects scale-invariant feature points in the image pyramid with the box filter; calculates the gradient directions of the feature points; and constructs the k first descriptors from those gradient directions. A first descriptor is an image feature descriptor constructed based on the feature points extracted by the box filter from the filled image. For details, the extraction of image feature descriptors from the filled image may follow the descriptor-extraction steps of the SIFT/SURF algorithms, which are not repeated here.
Step 305, extracting a second feature set of the local area map.
The second feature set includes second descriptors; a second descriptor is an image feature descriptor constructed based on the feature points extracted by the box filter from the local area map. The extraction of the second descriptors may follow the extraction of the first descriptors in step 304, which is not repeated here.
Step 306, performing feature matching on the first feature set and the second feature set to obtain the position of the control in the local area map.
Optionally, the server determines the above position through the following steps:
1) The server calculates the Euclidean distances between the first descriptors and the second descriptors.
2) Based on the Euclidean distances, a feature subset satisfying a preset condition is determined from the second feature set.
The smaller the Euclidean distance between a first descriptor and a second descriptor, the more similar the two descriptors are. Optionally, a distance threshold is set in the server; when the Euclidean distance between a first descriptor and a second descriptor is smaller than the distance threshold, the second descriptor is determined to be a similar descriptor of that first descriptor.
Optionally, a preset proportion is set in the server. The preset condition is that the proportion of similar descriptors to first descriptors in the feature subset is greater than the preset proportion. That is, when the proportion of similar descriptors to first descriptors in a candidate feature subset is greater than the preset proportion, that candidate feature subset is determined to be the feature subset; the second feature set includes at least two candidate feature subsets.
For example, if the preset proportion is 0.7, a candidate feature subset is determined to be the feature subset when the proportion of similar descriptors to first descriptors in it is greater than 0.7.
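Note that this acceptance rule is not the Lowe ratio test commonly paired with SIFT: here 0.7 bounds the proportion of first descriptors that find a similar descriptor in the candidate subset. A sketch under that reading, with a hypothetical distance threshold:

```python
import numpy as np

DIST_THRESHOLD = 300.0  # hypothetical similarity threshold on L2 distance
PRESET_RATIO = 0.7      # the preset proportion from the text

def subset_accepted(first_descriptors, candidate_subset):
    """Accept a candidate feature subset when more than PRESET_RATIO of
    the k first descriptors have a similar descriptor in the subset."""
    similar = sum(
        np.linalg.norm(candidate_subset - d, axis=1).min() < DIST_THRESHOLD
        for d in first_descriptors
    )
    return similar / len(first_descriptors) > PRESET_RATIO
```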
3) The position of the control in the local area map is determined according to the descriptors in the feature subset.
Each image feature descriptor includes location information, which indicates the position of the image feature descriptor in the image.
The feature subset includes the second descriptors corresponding to the control, and the server determines the position of the control in the local area map, that is, the position of the control on the user interface, according to the location information in those second descriptors.
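As a sketch of this last step: each matched second descriptor carries the (x, y) location of its keypoint, so the control's position can be estimated from those locations, here simply as their centroid (an aggregation rule assumed for illustration):

```python
import numpy as np

def control_position(matched_keypoints):
    """Estimate the control's position in the local area map from the
    location information of the matched keypoints."""
    pts = np.array([kp.pt for kp in matched_keypoints])  # (x, y) per keypoint
    return tuple(pts.mean(axis=0))  # centroid as the estimated position
```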
Step 307, automatically triggering the control on the user interface based on the position.
The server simulates a control instruction triggered at the position of the control's image on the user interface and executes the corresponding function according to that instruction. For example, if the control is a close control, the server generates a close instruction for the user interface, and the server or the terminal executes it to close the user interface. As another example, if the control is a return control, the server generates a return instruction, and the server or the terminal executes it to return from the current user interface to the previous one. As yet another example, if the control is a web page link control, the server generates a jump instruction, and the server or the terminal executes it to jump from the user interface to the web page. It should be noted that the type of the control is not limited in this application.
In summary, the automatic control method for a user interface provided in this embodiment acquires a local area map of the user interface and the template image corresponding to a control; extends the edges of the template image outward with solid-color pixel points to obtain a filled image; extracts a first feature set of the filled image and a second feature set of the local area map; performs feature matching on the two feature sets to obtain the position of the control in the local area map; and automatically triggers the control on the user interface based on that position.
In this method, during the extraction of the image feature descriptors, newly added edge pixel points are obtained by extending the edges of the template image outward; the newly added edge pixel points are filled with solid-color pixel values to obtain a filled image; and the image feature descriptors are extracted from the filled image. The box filter can therefore extract features at the edge pixel points of the template image, so the features in the template image are fully extracted and the boundary effect is effectively avoided. Moreover, since no feature points can be extracted from continuous patches of solid-color pixel points, filling the newly added edge pixel points with solid-color pixels ensures that no redundant feature points are extracted during feature extraction. For small-size images, the filled image obtained after this processing allows feature points to be extracted accurately; through those feature points, the position of the control's image in the user interface can be identified accurately and quickly, and the control on the user interface can be triggered accurately and automatically.
An application is installed and runs on a terminal, and technicians update the application to fix bugs in its program code or to enrich the services it provides. For example, cloud games frequently update game scenes, game props, and the like to keep players interested.
After a cloud game is updated, an update announcement is displayed on its user interface to inform the player that the program of the cloud game has been updated. After the user manually closes the update announcement, the user can choose, on one user interface of the cloud game displayed on the terminal, to enter another user interface. In some special scenarios, one user interface must jump to another automatically, and an update announcement may appear during the page jump. At that moment, the position of the close button in the update announcement picture generated by the cloud game must be identified, and a user operation triggered on the close button is then simulated to close the update announcement automatically.
Since the picture corresponding to the close button is very small, the accuracy of identifying the close button while automatically closing the update announcement is reduced; the automatic control method of the user interface shown in fig. 10 can solve this problem. It should be noted that automatically closing the update announcement in a cloud game is realized by the cloud server without the terminal's participation; the terminal only displays the video stream generated by the cloud server, which consists of continuous image frames of the game picture.
Illustratively, as shown in fig. 13, when a preloaded script is run, an update announcement 72 pops up on the user interface 71 of the cloud game; the cloud server executes the automatic control method of the user interface, identifies the position 73 of the close control, automatically closes the update announcement, and jumps directly to the user interface 74 corresponding to the preloaded script.
Referring to fig. 14, a block diagram of an image feature descriptor extraction apparatus provided in an exemplary embodiment of the present application is shown. The apparatus is implemented as part or all of a terminal or a server through software, hardware, or a combination of the two, and includes:
an obtaining module 401, configured to obtain an image to be processed;
a filling module 402, configured to extend the edges of the image to be processed outward with solid-color pixel points to obtain a filled image;
and an extraction module 403, configured to extract the image feature descriptors of the filled image, the image feature descriptors being obtained based on the feature points extracted by the box filter from the filled image.
In some embodiments, the filling module 402 is configured to copy the image to be processed to the central area of a solid-color image template to obtain the filled image; the edge length of the solid-color image template is greater than that of the image to be processed.
In some embodiments, the solid-color image template includes a black image template.
In some embodiments, the filling module 402 includes:
the expansion submodule 4021, configured to extend the edges of the image to be processed outward to obtain newly added edge pixel points;
and the filling submodule 4022, configured to fill the newly added edge pixel points with solid-color pixel values to obtain the filled image.
In some embodiments, the filling submodule 4022 is configured to set the pixel values of the newly added edge pixel points to the pixel value of black pixel points to obtain the filled image.
In some embodiments, the image to be processed is in a Red Green Blue (RGB) format;
and the filling submodule 4022 is configured to set the RGB pixel values of the newly added edge pixel points to (0, 0, 0) to obtain the filled image.
In some embodiments, the expansion submodule 4021 is configured to extend each edge of the image to be processed from inside to outside by m rows of newly added edge pixel points;
or copying the image to be processed to the central area of a blank image template, wherein the edge length of the blank image template is greater than that of the image to be processed; and determining unoccupied pixel points positioned at the periphery in the blank image template as newly-added edge pixel points relative to the image to be processed.
In some embodiments, if the size of the box filter is (2n + 1) × (2n + 1), then m is greater than or equal to n, where m and n are positive integers.
In some embodiments, the extraction module 403 is configured to construct an image pyramid based on the filled image; detect scale-invariant feature points in the image pyramid with the box filter; calculate the gradient directions of the feature points; and construct the image feature descriptors from those gradient directions.
In summary, in the image feature descriptor extraction apparatus provided in this embodiment, the edges of the image to be processed are extended outward with solid-color pixel points to obtain a filled image, and the image feature descriptors are extracted from the filled image. The box filter can therefore extract features at the edge pixel points of the image to be processed, so the features in the image to be processed are fully extracted and the boundary effect is effectively avoided. Moreover, since no feature points can be extracted from continuous patches of solid-color pixel points, filling the newly added edge pixel points with solid-color pixels ensures that no redundant feature points are extracted during feature extraction. Therefore, for small-size images, sufficient features can be accurately extracted from the processed filled image through the SIFT/SURF algorithm.
Referring to fig. 15, a block diagram of an automatic control apparatus of a user interface provided by an exemplary embodiment of the present application is shown. The apparatus is implemented as part or all of a terminal or a server by software, hardware, or a combination of the two, and includes:
an obtaining module 501, configured to obtain a local area map on a user interface, and obtain a template image corresponding to a control;
a filling module 502, configured to extend the edges of the template image outward with solid-color pixel points to obtain a filled image;
an extraction module 503, configured to extract a first feature set of the filled image and a second feature set of the local area map;
a matching module 504, configured to perform feature matching on the first feature set and the second feature set to obtain a position of the control in the local area map;
and a control module 505 for automatically triggering a control on the user interface based on the location.
In some embodiments, the filling module 502 is configured to copy the template image to the central area of a solid-color image template to obtain the filled image; the edge length of the solid-color image template is greater than that of the template image.
In some embodiments, the solid-color image template includes a black image template.
In some embodiments, the filling module 502 includes:
the expansion submodule 5021, configured to extend the edges of the template image outward to obtain newly added edge pixel points;
and the filling submodule 5022, configured to fill the newly added edge pixel points with solid-color pixel values to obtain the filled image.
In some embodiments, the filling submodule 5022 is configured to set the pixel values of the newly added edge pixel points to the pixel value of black pixel points to obtain the filled image.
In some embodiments, the template image is in a Red Green Blue (RGB) format;
and the filling submodule 5022 is configured to set the RGB pixel values of the newly added edge pixel points to (0, 0, 0) to obtain the filled image.
In some embodiments, the expansion submodule 5021 is configured to expand each edge of the template image from inside to outside by m rows of newly-added edge pixel points; or copying the template image to the central area of a blank image template, wherein the edge length of the blank image template is greater than that of the template image; and determining unoccupied pixel points positioned at the periphery in the blank image template as newly-added edge pixel points relative to the template image.
In some embodiments, if the size of the box filter is (2n + 1) × (2n + 1), then m is greater than or equal to n, where m and n are positive integers.
In some embodiments, the first feature set includes first descriptors and the second feature set includes second descriptors;
the matching module 504 is configured to calculate the Euclidean distances between the first descriptors and the second descriptors; determine, based on the Euclidean distances, a feature subset satisfying a preset condition from the second feature set; and determine the position according to the descriptors in the feature subset;
the first descriptors are image feature descriptors constructed based on the feature points extracted by the box filter from the filled image, and the second descriptors are image feature descriptors constructed based on the feature points extracted by the box filter from the local area map.
In some embodiments, the first feature set comprises k first descriptors, k being a positive integer;
an extraction module 503, configured to construct an image pyramid based on the filled image; detect scale-invariant feature points in the image pyramid with the box filter; calculate the gradient directions of the feature points; and construct the k first descriptors from those gradient directions.
In summary, the automatic control apparatus for a user interface provided in this embodiment acquires a local area map on the user interface and a template image corresponding to a control; expands the edge of the template image outwards with pure color pixel points to obtain a filled image; extracts a first feature set from the filled image and a second feature set from the local area map; performs feature matching between the first feature set and the second feature set to obtain the position of the control in the local area map; and automatically triggers the control on the user interface based on that position.
In the process of extracting the image feature descriptor, the apparatus expands the edge of the template image outwards to obtain newly added edge pixel points, fills the newly added edge pixel points with pure color pixel values to obtain the filled image, and extracts the image feature descriptors of the filled image. The box filter can therefore extract features at the edge pixel points of the original image, so the features in the template image are fully extracted and the edge effect is effectively avoided. Because feature points cannot be extracted from continuous patches of pure color pixels, filling the newly added edge pixel points with pure color pixel values introduces no redundant feature points during feature extraction. For small-size images, enough feature points can be accurately extracted from the filled image obtained after this processing; through these feature points the position of the control's image in the user interface can be identified accurately and quickly, and the control on the user interface can be triggered accurately and automatically.
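Putting the pieces together, a sketch of the whole pipeline built from the illustrative helpers above (expand_edges, extract_descriptors, match_by_euclidean); estimating the control position as the mean of matched keypoint locations is one simple choice, not mandated by this embodiment:

    import numpy as np

    def locate_control(template_image, local_area_map, n=2):
        # Pad the template with black pixels, extract both feature sets,
        # match by Euclidean distance, and estimate the control position as
        # the mean location of the matched keypoints in the local area map.
        filled = expand_edges(template_image, n)
        kp1, desc1 = extract_descriptors(filled)
        kp2, desc2 = extract_descriptors(local_area_map)
        if desc1 is None or desc2 is None:
            return None  # too few feature points in one of the images
        matches = match_by_euclidean(desc1, desc2)
        if not matches:
            return None  # control not found in this local area map
        pts = np.array([kp2[j].pt for _, j, _ in matches])
        return tuple(pts.mean(axis=0))  # (x, y) used to trigger the control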
Referring to fig. 16, a schematic structural diagram of a server according to an embodiment of the present application is shown. The server is used to implement part or all of the server side of the image feature descriptor extraction method, or part or all of the server side of the automatic control method for a user interface, provided in the above embodiments.
Specifically:
the server 600 includes a CPU (Central Processing Unit) 601, a system Memory 604 including a RAM (Random Access Memory) 602 and a ROM (Read-Only Memory) 603, and a system bus 605 connecting the system Memory 604 and the Central Processing Unit 601. The server 600 also includes a basic I/O (Input/Output) system 606, which facilitates the transfer of information between devices within the computer, and a mass storage device 607, which stores an operating system 613, application programs 614, and other program modules 615.
The basic input/output system 606 includes a display 608 for displaying information and an input device 609 such as a mouse, keyboard, etc. for a user to input information. Wherein the display 608 and the input device 609 are connected to the central processing unit 601 through an input output controller 610 connected to the system bus 605. The basic input/output system 606 may also include an input/output controller 610 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, input/output controller 610 may also provide output to a display screen, a printer, or other type of output device.
The mass storage device 607 is connected to the central processing unit 601 through a mass storage controller (not shown) connected to the system bus 605. The mass storage device 607 and its associated computer-readable media provide non-volatile storage for the server 600. That is, the mass storage device 607 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM (Compact Disc Read-Only Memory) drive.
Without loss of generality, the computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other solid state memory technology, CD-ROM, DVD (Digital Versatile Disc) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer storage media are not limited to the foregoing. The system memory 604 and the mass storage device 607 described above may be collectively referred to as memory.
The server 600 may also operate in accordance with various embodiments of the present application by connecting to remote computers over a network, such as the internet. That is, the server 600 may be connected to the network 612 through the network interface unit 611 connected to the system bus 605, or may be connected to other types of networks or remote computer systems (not shown) using the network interface unit 611.
Referring to fig. 17, a block diagram of a terminal 700 according to an exemplary embodiment of the present application is shown. The terminal 700 may be: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. The terminal 700 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so on.
In general, terminal 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 701 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the awake state, also called the Central Processing Unit (CPU); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 701 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 702 may include one or more computer-readable storage media, which may be non-transitory. Memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 702 is used to store at least one instruction for execution by processor 701 to implement a method for extracting image feature descriptors, or a method for automatic control of a user interface, provided by method embodiments herein.
In some embodiments, the terminal 700 may further optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Various peripheral devices may be connected to peripheral interface 703 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 704, a display 705, an audio circuit 706, a positioning component 707, and a power source 708.
The peripheral interface 703 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 701 and the memory 702. In some embodiments, processor 701, memory 702, and peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 704 communicates with communication networks and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 704 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol, including but not limited to: metropolitan area networks, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 705 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, it can also capture touch signals on or over its surface; such a touch signal may be input to the processor 701 as a control signal for processing, and the display screen 705 may then also provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, forming the front panel of the terminal 700; in other embodiments, there may be at least two display screens 705, respectively disposed on different surfaces of the terminal 700 or in a folded design; in still other embodiments, the display screen 705 may be a flexible display disposed on a curved or folded surface of the terminal 700. The display screen 705 may even be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The display screen 705 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The audio circuit 706 may include a microphone and a speaker. The microphone collects sound waves from the user and the environment, converts them into electrical signals, and inputs them to the processor 701 for processing, or to the radio frequency circuit 704 to realize voice communication. For stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 700; the microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker converts electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves, and may be a conventional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert an electrical signal into sound waves audible to humans, or into sound waves inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 706 may also include a headphone jack.
The positioning component 707 is used to locate the current geographic position of the terminal 700 to implement navigation or LBS (Location Based Service). The positioning component 707 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 708 is used to power the various components in the terminal 700 and may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 708 includes a rechargeable battery, the battery may support wired or wireless charging, and may also support fast-charge technology.
Those skilled in the art will appreciate that the configuration shown in fig. 17 is not intended to be limiting of terminal 700 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (14)

1. A method for extracting image feature descriptors, the method comprising:
acquiring an image to be processed;
adopting pure color pixel points to expand the edge of the image to be processed outwards to obtain a filled image;
and extracting an image feature descriptor of the filled image, wherein the image feature descriptor is obtained based on feature points extracted by a box filter in the filled image.
2. The method of claim 1, wherein expanding the edge of the image to be processed outwards using the pure color pixel points to obtain a filled image comprises:
copying the image to be processed to a central area of a pure color image template to obtain the filled image; the edge length of the pure color image template is greater than the edge length of the image to be processed.
3. The method of claim 2, wherein the solid image template comprises a black image template.
4. The method of claim 1, wherein expanding the edge of the image to be processed outwards using the pure color pixel points to obtain a filled image comprises:
expanding the edge of the image to be processed outwards to obtain newly added edge pixel points;
and filling the newly added edge pixel points with pure color pixel values to obtain the filled image.
5. The method of claim 4, wherein filling the newly added edge pixel points with the pure color pixel values to obtain the filled image comprises:
setting the pixel values of the newly added edge pixel points to the pixel value of a black pixel point to obtain the filled image.
6. The method according to claim 5, characterized in that the image to be processed is in a red, green, blue (RGB) format;
wherein setting the pixel values of the newly added edge pixel points to the pixel value of the black pixel point to obtain the filled image comprises:
setting the RGB pixel values of the newly added edge pixel points to (0, 0, 0) to obtain the filled image.
7. The method according to any one of claims 3 to 6, wherein expanding the edge of the image to be processed outwards to obtain newly added edge pixel points comprises:
expanding each edge of the image to be processed from inside to outside by m rows of the newly added edge pixel points;
or,
copying the image to be processed to a central area of a blank image template, wherein the edge length of the blank image template is greater than that of the image to be processed; and determining unoccupied pixel points located at the periphery of the blank image template, relative to the image to be processed, as the newly added edge pixel points.
8. The method of claim 7, wherein the box filter has a size of (2n+1) × (2n+1), m is greater than or equal to n, and m and n are positive integers.
9. The method of any one of claims 1 to 6, wherein extracting the image feature descriptor of the filled image comprises:
constructing an image pyramid based on the filled image;
detecting feature points with scale invariance in the image pyramid through the box filter;
and calculating the gradient direction of the feature points, and constructing the image feature descriptor according to the gradient direction of the feature points.
10. A method for automatic control of a user interface, the method comprising:
acquiring a local area map on a user interface, and acquiring a template image corresponding to a control;
adopting pure color pixel points to expand the edge of the template image outwards to obtain a filling image;
extracting a first feature set of the filling image and extracting a second feature set of the local area map;
performing feature matching on the first feature set and the second feature set to obtain the position of the control in the local area map;
automatically triggering the control on the user interface based on the location.
11. An apparatus for extracting an image feature descriptor, the apparatus comprising:
the acquisition module is used for acquiring an image to be processed;
the filling module is used for expanding the edge of the image to be processed outwards by adopting pure color pixel points to obtain a filled image;
and the extraction module is used for extracting the image feature descriptor of the filled image, wherein the image feature descriptor is obtained based on feature points extracted by a box filter in the filled image.
12. An apparatus for automatic control of a user interface, the apparatus comprising:
the acquisition module is used for acquiring a local area map on the user interface and acquiring a template image corresponding to the control;
the filling module is used for expanding the edge of the template image outwards by adopting pure color pixel points to obtain a filled image;
the extraction module is used for extracting a first feature set of the filling image and extracting a second feature set of the local area map;
the matching module is used for performing feature matching on the first feature set and the second feature set to obtain the position of the control in the local area map;
and the control module is used for automatically triggering the control on the user interface based on the position.
13. A server, characterized in that the server comprises:
a memory;
a processor coupled to the memory;
wherein the processor is configured to load and execute executable instructions to implement the method of image feature descriptor extraction according to any one of claims 1 to 9, or the method of automatic control of a user interface according to claim 10.
14. A computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions; the at least one instruction, the at least one program, the set of codes or the set of instructions being loaded and executed by a processor to implement a method of extracting an image feature descriptor according to any of claims 1 to 9, or a method of automatically controlling a user interface according to claim 10.
CN201911158816.6A 2019-11-22 2019-11-22 Image feature descriptor extraction method, device, equipment and storage medium Pending CN110929741A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911158816.6A CN110929741A (en) 2019-11-22 2019-11-22 Image feature descriptor extraction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110929741A true CN110929741A (en) 2020-03-27

Family

ID=69850852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911158816.6A Pending CN110929741A (en) 2019-11-22 2019-11-22 Image feature descriptor extraction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110929741A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750383A (en) * 2012-06-28 2012-10-24 中国科学院软件研究所 Spiral abstract generation method oriented to video content
CN103645890A (en) * 2013-11-29 2014-03-19 北京奇虎科技有限公司 Method and device for positioning control part in graphical user interface
WO2016062159A1 (en) * 2014-10-20 2016-04-28 网易(杭州)网络有限公司 Image matching method and platform for testing of mobile phone applications
CN105405146A (en) * 2015-11-17 2016-03-16 中国海洋大学 Feature density clustering and normal distribution transformation based side-scan sonar registration method
CN106604033A (en) * 2017-02-24 2017-04-26 湖南大学 Method and device for image coding and logical operation
CN110162454A (en) * 2018-11-30 2019-08-23 腾讯科技(深圳)有限公司 Game running method and device, storage medium and electronic device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TONY MA: "Image Processing - Linear Filtering - 1: Basics (correlation operator, convolution operator, edge effect)", https://www.cnblogs.com/pegasus/archive/2011/05/19/2051416.html *
牧野: "SURF Algorithm Feature Point Detection and Matching", https://blog.csdn.net/dcrmg/article/details/52601010 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113625923A (en) * 2020-05-06 2021-11-09 上海达龙信息科技有限公司 Mouse processing method and device based on remote cloud desktop, storage medium and equipment
CN111860272A (en) * 2020-07-13 2020-10-30 敦泰电子(深圳)有限公司 Image processing method, chip and electronic device
CN111860272B (en) * 2020-07-13 2023-10-20 敦泰电子(深圳)有限公司 Image processing method, chip and electronic device

Similar Documents

Publication Publication Date Title
CN110059685B (en) Character area detection method, device and storage medium
CN104869305B (en) Method and apparatus for processing image data
CN111325699B (en) Image restoration method and training method of image restoration model
CN110210219B (en) Virus file identification method, device, equipment and storage medium
CN108776822B (en) Target area detection method, device, terminal and storage medium
CN112581358B (en) Training method of image processing model, image processing method and device
CN111325220B (en) Image generation method, device, equipment and storage medium
WO2021244267A1 (en) Application program transplantation method and apparatus, device, and medium
CN110535890B (en) File uploading method and device
CN110856048A (en) Video repair method, device, equipment and storage medium
CN111625381A (en) Method, device and equipment for reproducing running scene of application program and storage medium
CN110929741A (en) Image feature descriptor extraction method, device, equipment and storage medium
CN110503159B (en) Character recognition method, device, equipment and medium
US20210248710A1 (en) Electronic device and method for saving image
CN112053360B (en) Image segmentation method, device, computer equipment and storage medium
CN111598923B (en) Target tracking method and device, computer equipment and storage medium
CN112235650A (en) Video processing method, device, terminal and storage medium
CN111598924A (en) Target tracking method and device, computer equipment and storage medium
CN110505510B (en) Video picture display method and device in large-screen system and storage medium
US20240031518A1 (en) Method for replacing background in picture, device, storage medium and program product
CN112950535B (en) Video processing method, device, electronic equipment and storage medium
CN109040753B (en) Prediction mode selection method, device and storage medium
CN114283310A (en) Image recognition model acquisition method, image recognition device and medium
CN108829600B (en) Method and device for testing algorithm library, storage medium and electronic equipment
CN108540726B (en) Method and device for processing continuous shooting image, storage medium and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40022520; Country of ref document: HK)