CN114462440A - Object position determination method and device - Google Patents


Info

Publication number
CN114462440A
CN114462440A (application CN202210135545.8A)
Authority
CN
China
Prior art keywords
image
gradient
preset
determining
candidate frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210135545.8A
Other languages
Chinese (zh)
Inventor
王凯
谢世斌
李文慧
李以志
周璐
李铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Huaray Technology Co Ltd
Original Assignee
Zhejiang Huaray Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Huaray Technology Co Ltd filed Critical Zhejiang Huaray Technology Co Ltd
Priority to CN202210135545.8A priority Critical patent/CN114462440A/en
Publication of CN114462440A publication Critical patent/CN114462440A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06KGRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K7/1404Methods for optical code recognition
    • G06K7/1439Methods for optical code recognition including a method step for retrieval of the optical code
    • G06K7/1443Methods for optical code recognition including a method step for retrieval of the optical code locating of the code in an image

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Electromagnetism (AREA)
  • General Health & Medical Sciences (AREA)
  • Toxicology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides a method and an apparatus for determining the position of an object. The method includes: acquiring an image set, where the image set includes an original image and the original image contains a target object; obtaining a candidate frame set according to the gradient of each image in the image set within a preset window; and determining a target positioning frame in the candidate frame set and taking the target positioning frame as the position of the target object on the original image. The method and apparatus solve the problem of low positioning accuracy of dimension codes and improve dimension-code positioning accuracy.

Description

Object position determination method and device
Technical Field
The embodiment of the invention relates to the field of image recognition, in particular to a method and a device for determining the position of an object.
Background
Dimension codes (such as one-dimensional codes and two-dimensional codes) are used in many areas of daily life, such as commodity sales and postal logistics. A dimension code is a bar-code graphic usually composed of a group of regularly arranged bars and spaces; it occupies a small area, records a large amount of information, and is easy to print and attach, which improves industrial efficiency. In the identification of a dimension code, locating the code is a critical step.
Existing dimension-code identification techniques rely heavily on morphological operations. Algorithms that identify dimension codes through morphological operations are difficult to implement and time-consuming, cannot cope with dimension codes in complex scenes, and therefore cannot identify dimension codes accurately.
Aiming at the problem of low positioning accuracy of dimension codes in the prior art, no effective solution exists at present.
Disclosure of Invention
The embodiment of the invention provides a method and a device for determining the position of an object, which are used for at least solving the problem of low positioning accuracy of a dimension code in the related technology.
According to an embodiment of the present invention, there is provided a method of determining a position of an object, including: acquiring an image set, wherein the image set comprises an original image, and the original image comprises a target object; obtaining a candidate frame set according to the gradient of each image in the image set in a preset window; and determining a target positioning frame in the candidate frame set, and determining the target positioning frame as the position of the target object on the original image.
In an exemplary embodiment, obtaining the candidate frame set according to a gradient of each image in the image set in a preset window includes: performing the following operations on each image in the image set, wherein each image when the following operations are performed is a current image: traversing the current image through a preset window, and acquiring the consistency strength of the gradient main direction of the current image on the preset window to obtain a group of gradient main direction consistency strengths of the current image; and obtaining a candidate frame of the current image according to a group of gradient main direction consistency strengths of the current image, wherein the candidate frame set comprises the candidate frame of the current image.
In one exemplary embodiment, obtaining the gradient main direction consistency strength of the current image on a preset window comprises: acquiring the gradient direction of each pixel in a preset window and the gradient strength corresponding to the gradient direction to obtain a group of gradient directions and a group of gradient strengths of the preset window; and determining the consistency strength of the gradient main direction of the current image on the preset window according to the group of gradient directions and the group of gradient strengths of the preset window.
In an exemplary embodiment, determining a gradient main direction consistency strength of the current image on the preset window according to a set of gradient directions and a set of gradient strengths of the preset window includes: mapping a group of gradient directions and corresponding gradient strengths to a plurality of preset direction areas, and determining a preset direction area with the maximum sum of the corresponding gradient strengths in the plurality of preset direction areas as a target preset direction area; and determining the ratio of the sum of the gradient strengths in the target preset direction area to the target sum as the consistency strength of the gradient main direction of the current image on a preset window, wherein the target sum is the sum of a group of gradient strengths.
In an exemplary embodiment, obtaining a candidate frame of the current image according to a set of gradient main direction consistency strengths of the current image includes: carrying out binarization processing on the consistency strength of the gradient main directions to obtain a binarization image of the current image; and analyzing the connected domain of the binary image to obtain a candidate frame of the current image.
In one exemplary embodiment, determining the target positioning frame in the candidate frame set includes: determining a candidate frame in the candidate frame set whose aspect ratio meets a preset condition as the target positioning frame; or determining a candidate frame in the candidate frame set whose foreground proportion meets a preset condition as the target positioning frame; or determining a candidate frame whose pixels meet a preset condition as the target positioning frame.
In an exemplary embodiment, obtaining the candidate frame set according to the gradient of each image in the image set on a preset window includes: when the image set includes only the original image, obtaining the candidate frame set according to the gradient of the original image on the preset window; and when the image set further includes thumbnail images, obtaining the candidate frame set according to the gradients of both the original image and the thumbnail images on the preset window, where a thumbnail image is an image obtained by reducing the original image.
According to another embodiment of the present invention, there is provided an apparatus for determining a position of an object, including: the system comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring an image set, the image set comprises an original image, and the original image comprises a target object; the analysis module is used for obtaining a candidate frame set according to the gradient of each image in the image set in a preset window; and the determining module is used for determining a target positioning frame in the candidate frame set and determining the target positioning frame as the position of the target object on the original image.
According to a further embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the invention, because the gradient directions inside a one-dimensional-code region are highly consistent, candidate frames with consistent gradient directions can be found through the gradients within a preset window in the image, and the position of the one-dimensional code in the image can thus be determined. The problem of low dimension-code positioning accuracy is therefore solved, and positioning accuracy is improved.
Drawings
Fig. 1 is a block diagram of a hardware structure of a mobile terminal of a method of determining a location of an object according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of determining a location of an object according to an embodiment of the invention;
FIG. 3 is a diagram illustrating a preset window traversal in a current image according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a connected component analysis on a binary map according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating the target location boxes determined on images of different sizes being restored to the original image according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a preferred object location determination method according to an embodiment of the present invention;
fig. 7 is a block diagram of an apparatus for determination of a location of an object according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking the example of the mobile terminal, fig. 1 is a hardware block diagram of the mobile terminal according to the method for determining the object position in the embodiment of the present invention. As shown in fig. 1, the mobile terminal may include one or more (only one shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), and a memory 104 for storing data, wherein the mobile terminal may further include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration, and does not limit the structure of the mobile terminal. For example, the mobile terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the object position determination method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the above-mentioned method. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In the present embodiment, a method for determining a position of an object is provided, and fig. 2 is a flowchart of the method for determining a position of an object according to an embodiment of the present invention, as shown in fig. 2, the flowchart includes the following steps:
step S202, acquiring an image set, wherein the image set comprises an original image, and the original image comprises a target object;
the target object is a dimension code, including but not limited to a one-dimensional code, a two-dimensional code, a three-dimensional code, and the like. The original image is an image obtained by shooting an object containing the dimensional code through an image acquisition device.
Step S204, obtaining a candidate frame set according to the gradient of each image in the image set in a preset window;
the size of the preset window may be set according to actual conditions, for example, 0.5 cm by 0.5 cm, 1.5 cm by 1.5 cm, 1 cm by 1 cm, and the like.
Step S206, determining a target positioning frame in the candidate frame set, and determining the target positioning frame as the position of the target object on the original image.
Through the steps, the problem of low positioning accuracy of the dimension code is solved, and the positioning accuracy of the dimension code is improved.
The above steps may be executed by a terminal or an image processing device, or by other processing devices or processing units with similar processing capabilities; this is not limited here. The following description takes the image processing device as the executor (this is only an example; in practice the operations may be performed by other devices or modules).
In this embodiment, taking a one-dimensional code as the target object, to identify the position of the one-dimensional code in a picture, an image set is first obtained. The image set includes the original image and thumbnail images obtained by extracting some pixel points from the original image according to a certain rule. In each image of the image set, candidate frames are obtained according to the gradient characteristics within a preset window, forming a candidate frame set; a target positioning frame is then determined among the candidate frames, and the position of that frame in the image is the position of the one-dimensional code in the picture to be identified.
In an exemplary embodiment, obtaining the candidate frame set according to a gradient of each image in the image set in a preset window includes: performing the following operations on each image in the image set, wherein each image when the following operations are performed is a current image: traversing the current image through a preset window, and acquiring the consistency strength of the gradient main direction of the current image on the preset window to obtain a group of gradient main direction consistency strengths of the current image; and obtaining a candidate frame of the current image according to a group of gradient main direction consistency strengths of the current image, wherein the candidate frame set comprises the candidate frame of the current image.
In this embodiment, a preset window of a preset size is selected and traverses the current image. When the preset window is at different positions on the current image, the consistency strength of the gradient main direction within the window differs; after the preset window has traversed the whole current image, a group of gradient main direction consistency strengths of the current image is obtained, and a plurality of candidate frames are then obtained in the current image according to this group of consistency strengths.
Fig. 3 is a schematic diagram of a preset window traversing the current image according to an embodiment of the present invention. Assume the current image contains 9 × 13 pixel points, where each small box represents one pixel point; a preset window of size 3 × 3 is selected, and traversal of the current image is completed by moving the position of the preset window.
It should be noted that when the preset window is moved, the window after the move may overlap the window before the move; whether they overlap depends on the distance the window is moved each time. However, the pixel point at a given relative position in different windows is necessarily a different pixel point; for example, the central pixel points of the preset window at different positions are necessarily different.
In one exemplary embodiment, obtaining the gradient main direction consistency strength of the current image on a preset window comprises: acquiring the gradient direction of each pixel in a preset window and the gradient strength corresponding to the gradient direction to obtain a group of gradient directions and a group of gradient strengths of the preset window; and determining the consistency strength of the gradient main direction of the current image on the preset window according to the group of gradient directions and the group of gradient strengths of the preset window.
In this embodiment, the preset window contains a plurality of pixel points, each corresponding to one gradient strength and one gradient direction. A group of gradient directions of the preset window comprises the gradient directions of all pixel points in the window, and a group of gradient strengths comprises the corresponding gradient strengths. The gradient main direction consistency strength of the current preset window can be determined from this group of gradient directions and group of gradient strengths.
In an exemplary embodiment, determining a gradient main direction consistency strength of the current image on the preset window according to a set of gradient directions and a set of gradient strengths of the preset window includes: mapping a group of gradient directions and corresponding gradient strengths to a plurality of preset direction areas, and determining a preset direction area with the maximum sum of the corresponding gradient strengths in the plurality of preset direction areas as a target preset direction area; and determining the ratio of the sum of the gradient strengths in the target preset direction area to the target sum as the consistency strength of the gradient main direction of the current image on a preset window, wherein the target sum is the sum of a group of gradient strengths.
In this embodiment, each pixel point in the preset window corresponds to one gradient direction and one gradient strength, and different pixel points may have different gradient directions. All possible directions are divided into a plurality of preset direction regions; the sum of the gradient strengths falling into each preset direction region is computed, along with the sum of all gradient strengths in the preset window. The preset direction region with the largest sum of gradient strengths is the target preset direction region, i.e., the dominant direction region. The ratio of the sum of gradient strengths in the target preset direction region to the sum of all gradient strengths in the preset window is the gradient main direction consistency strength of the preset window.
For example, all gradient directions fall in the range 0 to 180 degrees (directions from 180 to 360 degrees can be folded into 0 to 180 degrees). The 0-to-180-degree range is divided evenly into 18 preset direction regions: 0 to 10 degrees is the first preset direction region, 10 to 20 degrees is the second, and so on, up to 170 to 180 degrees, the eighteenth. The gradient strengths corresponding to a group of gradient directions in the preset window are mapped into these 18 direction regions. For example, if the gradient strength at one pixel is 50 and its gradient direction is 15 degrees, the direction of that pixel maps into the second preset direction region, and the accumulated gradient strength of the second preset direction region is increased by 50.
The sums of gradient strengths in the 18 preset direction regions are then compared, and the region with the largest sum is taken as the dominant direction region. For example, if the largest sum among the 18 regions is 800, corresponding to the third preset direction region (20 to 30 degrees), and the sum of all gradient strengths in the preset window is 1000, then the gradient main direction consistency strength of the preset window is 800 / 1000 = 0.8, and the gradient main direction of the preset window is the third preset direction region.
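The binning-and-ratio computation described above can be sketched as follows. This is a minimal illustration only: the function name, its inputs, and the 18-bin layout are taken from the worked example in the text, not from code in the patent.

```python
def main_direction_consistency(directions_deg, magnitudes, num_bins=18):
    """Return (consistency strength, dominant bin index) for one window.

    directions_deg: gradient direction of each pixel in the window
    magnitudes:     corresponding gradient strength of each pixel
    """
    bins = [0.0] * num_bins
    bin_width = 180.0 / num_bins
    for d, m in zip(directions_deg, magnitudes):
        d = d % 180.0                        # fold 180-360 degrees onto 0-180
        idx = min(int(d // bin_width), num_bins - 1)
        bins[idx] += m                       # accumulate strength per region
    total = sum(bins)
    if total == 0:
        return 0.0, -1                       # empty / flat window
    best = max(range(num_bins), key=lambda i: bins[i])
    return bins[best] / total, best
```

For a window matching the example above (largest region sum 800 out of a total of 1000, dominant region 20 to 30 degrees), the function returns a consistency strength of 0.8 with bin index 2.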
It should be noted that after the preset window has traversed the current image, a group of gradient main direction consistency strengths of the current image is obtained. The consistency strength computed in each preset window is assigned to a target pixel point of that window (for example, the central pixel point of the window), yielding a gradient main direction consistency strength map of the current image. The resulting map is smaller than the current image, because only some pixel points are retained; this also avoids the inconvenience during binarization that would otherwise be caused by the overlap between successive window positions.
It should be noted that before the gradient main direction consistency strength of a preset window is calculated, the sum of the gradient strengths in the window is compared with a first threshold. When the sum is greater than or equal to the first threshold, the consistency strength is calculated as described above; when the sum is less than the first threshold, the consistency strength of the window is set directly to zero.
In an exemplary embodiment, obtaining a candidate frame of the current image according to a set of gradient main direction consistency strengths of the current image includes: carrying out binarization processing on the consistency strength of the gradient main directions to obtain a binarization image of the current image; and analyzing the connected domain of the binary image to obtain a candidate frame of the current image.
In this embodiment, the group of gradient main direction consistency strengths of the current image is binarized: the gray value of a pixel point whose consistency strength is less than a second threshold is set to 0, and the gray value of a pixel point whose consistency strength is greater than or equal to the second threshold is set to 255, producing a binary image of the current image. Connected-domain analysis is then performed on the binary image: adjacent pixel points with gray value 255 are grouped into connected domains, and for each connected domain a minimum bounding box is determined. The resulting minimum bounding boxes are the candidate frames of the current image and represent possible positions of the one-dimensional code.
Fig. 4 is a schematic diagram of connected-domain analysis on a binary map according to an embodiment of the present invention. As shown in fig. 4, adjacent pixel points with gray value 255 are found on the binary map to obtain a plurality of connected domains, and a minimum bounding box is set on each connected domain. In the figure, 401, 402 and 403 are the candidate frames determined on the current image.
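The binarization and connected-domain step can be sketched in plain Python as a toy flood fill. The function name, the box format `(x0, y0, x1, y1)`, and 4-connectivity are assumptions for illustration; a production implementation would use an optimized connected-component routine.

```python
def candidate_boxes(strength_map, threshold):
    """Binarize a consistency-strength map (list of rows) and return the
    minimum bounding box (x0, y0, x1, y1) of each 4-connected foreground
    component."""
    h, w = len(strength_map), len(strength_map[0])
    binary = [[1 if strength_map[y][x] >= threshold else 0
               for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                # flood-fill one connected domain, tracking its extent
                stack = [(y, x)]
                seen[y][x] = True
                x0 = x1 = x
                y0 = y1 = y
                while stack:
                    cy, cx = stack.pop()
                    x0, x1 = min(x0, cx), max(x1, cx)
                    y0, y1 = min(y0, cy), max(y1, cy)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                boxes.append((x0, y0, x1, y1))
    return boxes
```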
In one exemplary embodiment, determining the target positioning frame in the candidate frame set includes: determining a candidate frame in the candidate frame set whose aspect ratio meets a preset condition as the target positioning frame; or determining a candidate frame in the candidate frame set whose foreground proportion meets a preset condition as the target positioning frame; or determining a candidate frame whose pixels meet a preset condition as the target positioning frame.
In this embodiment, every candidate frame in the candidate frame set has the characteristic that the gradient main directions within it are consistent, but a frame with this characteristic is not necessarily the position of the dimension code. For example, if the target image contains a piece of clothing with black-and-white stripes, the clothing will also be selected as a candidate frame; that is, the candidate frames may include frames that do not contain the one-dimensional code. Through the steps above, the target positioning frame corresponding to the one-dimensional code can be singled out from the candidate frames.
It should be noted that the aspect ratio of a one-dimensional code is relatively fixed, so candidate frames whose aspect ratio meets a preset condition can be screened out and determined as target positioning frames; for example, the aspect ratio of candidate frame 402 in fig. 4 is 1, which is obviously not the position of a one-dimensional code. Alternatively:
the gradient directions of the positions of the one-dimensional codes are very consistent, so the foreground proportion in the candidate frame where the one-dimensional codes are located should be higher, wherein the pixel point with the gray value of 255 in the candidate frame is the foreground in the candidate frame, and the ratio of the number of the pixel points with the gray value of 255 in the candidate frame to the number of all the pixel points in the candidate frame is called the foreground proportion. When the foreground proportion in the candidate frame is smaller than a third threshold, the corresponding candidate frame is considered not to be the position where the one-dimensional code is located, and when the foreground proportion in the candidate frame is greater than or equal to the third threshold, the corresponding candidate frame is considered to be the position where the one-dimensional code is located, and the candidate frame is determined to be a target positioning frame, for example, the foreground proportion of the candidate frame 403 in fig. 4 is very small, and obviously not to be the position where the one-dimensional code is located; alternatively, the first and second electrodes may be,
Because the pixel points corresponding to a one-dimensional code in the image have other characteristics, candidate frames that do not meet the requirements can also be screened out by those characteristics, and a candidate frame that meets the preset condition is taken as the target positioning frame. For example, a one-dimensional code consists of alternating black and white bars, so the pixel values of different pixel points differ: pixel points corresponding to black bars are darker, and pixel points corresponding to white bars are brighter. The target positioning frame can also be determined according to this characteristic.
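The aspect-ratio and foreground-proportion screening above can be combined into one filter. The function, the concrete thresholds (`min_ratio`, `min_foreground`), and the box format are illustrative assumptions; the patent only states that "preset conditions" are used.

```python
def filter_candidates(boxes, binary, min_ratio=2.0, min_foreground=0.5):
    """Keep candidate boxes (x0, y0, x1, y1) whose longer/shorter side
    ratio and foreground proportion both pass their thresholds.

    binary: binary image as a list of rows with entries 0 or 1
            (1 corresponds to a gray value of 255, i.e. foreground).
    """
    kept = []
    for (x0, y0, x1, y1) in boxes:
        w, h = x1 - x0 + 1, y1 - y0 + 1
        aspect = max(w, h) / min(w, h)       # elongation of the box
        area = w * h
        fg = sum(binary[y][x]                # foreground pixels in the box
                 for y in range(y0, y1 + 1)
                 for x in range(x0, x1 + 1))
        if aspect >= min_ratio and fg / area >= min_foreground:
            kept.append((x0, y0, x1, y1))
    return kept
```

A square box (aspect ratio 1, like frame 402) or a box that is mostly background (like frame 403) is rejected; an elongated, mostly-foreground box survives.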
In an exemplary embodiment, obtaining the candidate frame set according to the gradient of each image in the image set on a preset window includes: when the image set includes only the original image, obtaining the candidate frame set according to the gradient of the original image on the preset window; and when the image set further includes thumbnail images, obtaining the candidate frame set according to the gradients of both the original image and the thumbnail images on the preset window, where a thumbnail image is an image obtained by reducing the original image.
In this embodiment, an original image is obtained first; a thumbnail image with a size different from that of the original image can be obtained from the original image by skipping pixel points, and a plurality of thumbnail images with different sizes can be obtained by skipping pixel points multiple times. Target positioning frames are then determined in the original image and in all the thumbnail images respectively. Skipping pixel points means selecting pixel points on the original image according to a certain rule: for example, starting from the first pixel point, every other pixel point is selected and the selected pixel points form thumbnail 1; or one pixel point is selected out of every three, and the selected pixel points form thumbnail 2.
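The skip-point construction of thumbnails amounts to subsampling the pixel grid, which can be sketched as follows (the function name is an assumption; `step=2` corresponds to thumbnail 1 in the text and `step=3` to thumbnail 2):

```python
import numpy as np

def skip_point(image, step):
    """Keep every `step`-th pixel point in both the row and column directions."""
    return image[::step, ::step]
```

A target positioning frame found at (r, c) on a thumbnail built with a given `step` maps back to approximately (r * step, c * step) on the original image, which is how frames from different layers can be compared on one canvas.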
In addition, the target positioning frames determined on the different images can be restored onto the original image, so that the target positioning frames determined at different sizes can all be seen on the original image and can be compared, sorted and screened there. If the overlapping area of at least two of the target positioning frames is larger than a fourth threshold, the overlapping target positioning frames are scored according to the characteristics of the pixel points inside them; the target positioning frame with the highest score is retained, and the other target positioning frames overlapping it are removed.
Fig. 5 is a schematic diagram in which the target positioning frames determined on images of different sizes are restored onto the original image according to an embodiment of the present invention. As shown in fig. 5, after restoration the original image shows 3 target positioning frames: the target positioning frame 501 contains pixel points A1-A10, the target positioning frame 502 contains pixel points A3-A12, and the target positioning frame 503 contains pixel points B1-B10. The target positioning frame 501 was determined on the original image, the target positioning frame 502 on thumbnail 1, and the target positioning frame 503 on thumbnail 2; the frame 504 is the overlapping part of the target positioning frames 501 and 502 and contains pixel points A3-A10. Since the target positioning frames 501 and 502 overlap over a large area, the two frames are scored according to the characteristics of the pixel points inside them, and only the higher-scoring one is retained.
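The compare-and-screen step for the restored frames can be sketched as a greedy keep-the-highest-score suppression (a hypothetical illustration: the patent does not specify the scoring function, so scores are passed in, and the box tuples and threshold value below are assumptions):

```python
def intersection_area(a, b):
    """Overlap area of two boxes given as (row0, col0, row1, col1), half-open."""
    r0, c0 = max(a[0], b[0]), max(a[1], b[1])
    r1, c1 = min(a[2], b[2]), min(a[3], b[3])
    return max(0, r1 - r0) * max(0, c1 - c0)

def suppress_overlaps(boxes, scores, fourth_threshold):
    """Keep the highest-scoring frame of each group whose overlap area
    exceeds the fourth threshold; remove the frames overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(intersection_area(boxes[i], boxes[j]) <= fourth_threshold
               for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]
```

In the fig. 5 situation, frames 501 and 502 overlap heavily, so only the better-scoring one survives, while frame 503 is untouched.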
It is to be understood that the above-described embodiments are only a few, but not all, embodiments of the present invention.
The present invention will be described in detail with reference to the following examples:
fig. 6 is a schematic diagram of a preferred object position determination method according to an embodiment of the present invention, as shown in fig. 6, including the following steps:
step 602, computing the gradient strength and direction of the single-channel image: the gradient strength and direction of each pixel point are computed on the basis of the original image. For example, a Sobel operator can be used to compute the horizontal and vertical gradients, from which the gradient strength and direction of the whole image are calculated; alternatively, the gradient strength and direction of the whole image can be calculated with a Canny operator or another suitably designed gradient operator;
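A gradient pass of the kind described in step 602 can be sketched in numpy as follows (a minimal illustration using 3x3 Sobel kernels with edge padding; the function name and border handling are assumptions, and a real system would typically call an optimized library routine):

```python
import numpy as np

def sobel_gradient(gray):
    """Per-pixel gradient strength and direction via 3x3 Sobel kernels.

    gray: 2-D array. Returns (strength, direction), direction in radians
    from atan2(gy, gx). Borders are handled by edge padding.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T  # vertical Sobel kernel
    h, w = gray.shape
    padded = np.pad(gray.astype(float), 1, mode="edge")
    gx = np.zeros((h, w)); gy = np.zeros((h, w))
    for dr in range(3):          # correlate the two kernels with the image
        for dc in range(3):
            window = padded[dr:dr + h, dc:dc + w]
            gx += kx[dr, dc] * window
            gy += ky[dr, dc] * window
    return np.hypot(gx, gy), np.arctan2(gy, gx)
```

On a vertical black-to-white step edge the horizontal response dominates, so the direction comes out near 0 radians, which is the behavior the later direction-binning step relies on.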
step 604, determining a positioning frame corresponding to the one-dimensional code on the first-layer pyramid image, wherein the size of the first-layer pyramid image is kept consistent with that of the original image;
step 606, determining a positioning frame corresponding to the one-dimensional code on the second-layer pyramid image, wherein the second-layer pyramid image is obtained by skipping pixel points in the first-layer pyramid image, for example, selecting every other pixel point in the first-layer pyramid image to obtain the second-layer pyramid image;
step 608, determining a positioning frame corresponding to the one-dimensional code on the third-layer pyramid image, wherein the third-layer pyramid image is obtained by skipping pixel points in the second-layer pyramid image;
step 610, summarizing and sorting the positioning frames: the positioning frames determined on the three pyramid layers are restored onto the original image, where they are compared, merged and sorted.
The following steps are performed on each of the images of step 604, step 606 and step 608 to determine the positioning frame corresponding to the one-dimensional code on the current image:
step 61, counting the main gradient direction in a window and combining it with the gradient strength to obtain a main-direction consistency strength map, which comprises: dividing all possible gradient directions (0-180 degrees) into 36 regions; selecting a suitable window size; accumulating the gradient strengths in the window into the regions of their respective directions; setting a threshold; and calculating the ratio of the strength sum in the neighborhood of the strongest direction to the total strength sum in the window. This ratio is taken as the main-direction consistency strength of the current point, yielding a main-direction consistency strength map for the whole image. The main-direction consistency strength can also be calculated by other methods, such as weighted accumulation.
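For a single window, the consistency computation of step 61 can be sketched as follows (a hypothetical illustration: the 36 regions over 0-180 degrees follow the text, while the one-bin neighborhood width and the function signature are assumptions):

```python
import numpy as np

def direction_consistency(strength, direction, n_bins=36, neighborhood=1):
    """Main-direction consistency strength for one window.

    strength, direction: flat arrays of gradient strengths and directions
    (direction in degrees, folded into [0, 180)). Strengths are accumulated
    into n_bins direction regions; the result is the ratio of the strength
    sum in the strongest region's neighborhood to the window's total.
    """
    idx = ((np.asarray(direction) % 180.0) // (180.0 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, idx, strength)       # accumulate strength per region
    total = hist.sum()
    if total == 0:
        return 0.0
    peak = int(np.argmax(hist))
    near = [(peak + k) % n_bins for k in range(-neighborhood, neighborhood + 1)]
    return hist[near].sum() / total
```

A window filled with one gradient direction (as inside a one-dimensional code) scores 1.0; a window with directions spread over all bins scores close to 3/36.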
Step 62, binarizing the main-direction consistency strength map to segment the foreground: a lowest threshold is set, and the consistency strength map is binarized in combination with a local threshold, which can be calculated by mean filtering, median filtering, maximum filtering or the like;
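The combined global-plus-local thresholding of step 62 might look like the following (a sketch assuming a mean-filter local threshold; the window size, threshold value and function name are illustrative):

```python
import numpy as np

def binarize(consistency, lowest_threshold, win=3):
    """Mark a pixel as foreground (255) only if it exceeds both the global
    lowest threshold and its local mean over a win x win window."""
    h, w = consistency.shape
    pad = win // 2
    padded = np.pad(consistency, pad, mode="edge")
    local = np.zeros((h, w))
    for dr in range(win):                 # box (mean) filter via shifted sums
        for dc in range(win):
            local += padded[dr:dr + h, dc:dc + w]
    local /= win * win
    fg = (consistency >= lowest_threshold) & (consistency >= local)
    return np.where(fg, 255, 0).astype(np.uint8)
```

Swapping the mean for a median or maximum filter, as the text allows, only changes how `local` is computed.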
step 63, performing connected domain analysis on the binarized image to obtain candidate minimum enclosing rectangular frames;
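The connected domain analysis of step 63 can be sketched with a breadth-first flood fill that records each component's minimum enclosing rectangle (a minimal pure-Python/numpy illustration using 4-connectivity; a production system would use an optimized library routine):

```python
import numpy as np
from collections import deque

def bounding_boxes(binary):
    """4-connected component analysis on a 0/255 image; returns the minimum
    enclosing rectangle (r0, c0, r1, c1), half-open, of each component."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for r in range(h):
        for c in range(w):
            if binary[r, c] == 255 and not seen[r, c]:
                q = deque([(r, c)]); seen[r, c] = True
                r0 = r1 = r; c0 = c1 = c
                while q:                      # flood-fill one component
                    y, x = q.popleft()
                    r0, r1 = min(r0, y), max(r1, y)
                    c0, c1 = min(c0, x), max(c1, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] == 255 and not seen[ny, nx]):
                            seen[ny, nx] = True
                            q.append((ny, nx))
                boxes.append((r0, c0, r1 + 1, c1 + 1))
    return boxes
```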
and step 64, filtering out the enclosing frames that do not meet the requirements and adjusting the one-dimensional code positioning frames: all the minimum enclosing frames are checked, non-code frames are filtered out using the width and height, the aspect ratio, the foreground proportion, the bar-space characteristics inside the frame and the like, and the candidate one-dimensional code positioning frames are rotated, expanded outwards, contracted inwards and so on using characteristics such as the main direction inside the frame.
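The width/height and aspect-ratio part of the screening in step 64 can be sketched as follows (the thresholds are illustrative assumptions, not values from the patent; the foreground-proportion and bar-space checks would be additional predicates of the same shape):

```python
def passes_size_checks(box, min_side=8, max_aspect=10.0):
    """Screen a candidate rectangle (r0, c0, r1, c1) by its width/height
    and aspect ratio; frames failing either check are treated as non-code."""
    h = box[2] - box[0]
    w = box[3] - box[1]
    if min(h, w) < min_side:          # too thin to be a readable code
        return False
    return max(h, w) / min(h, w) <= max_aspect
```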
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
There is also provided an apparatus for determining a position of an object in the present embodiment, and fig. 7 is a block diagram of a structure of the apparatus for determining a position of an object according to an embodiment of the present invention, as shown in fig. 7, the apparatus includes:
an obtaining module 72, configured to obtain an image set, where the image set includes an original image, and the original image includes a target object;
the analysis module 74 is configured to obtain a candidate frame set according to a gradient of each image in the image set in a preset window;
and a determining module 76, configured to determine a target positioning frame from the candidate frame set, and determine the target positioning frame as a position of the target object on the original image.
In an exemplary embodiment, the analysis module is further configured to perform the following operations on each image in the image set, where each image when the following operations are performed is a current image: traversing the current image through a preset window, and acquiring the consistency strength of the gradient main direction of the current image on the preset window to obtain a group of gradient main direction consistency strengths of the current image; and obtaining a candidate frame of the current image according to a group of gradient main direction consistency strengths of the current image, wherein the candidate frame set comprises the candidate frame of the current image.
In an exemplary embodiment, the analyzing module is further configured to obtain a gradient direction of each pixel in the preset window and a gradient strength corresponding to the gradient direction, so as to obtain a group of gradient directions and a group of gradient strengths of the preset window; and determining the consistency strength of the gradient main direction of the current image on the preset window according to the group of gradient directions and the group of gradient strengths of the preset window.
In an exemplary embodiment, the analysis module is further configured to map a group of gradient directions and corresponding gradient strengths to a plurality of preset direction regions, and determine a preset direction region with a maximum sum of corresponding gradient strengths in the plurality of preset direction regions as a target preset direction region; and determining the ratio of the sum of the gradient strengths in the target preset direction area to the target sum as the consistency strength of the gradient main direction of the current image on a preset window, wherein the target sum is the sum of a group of gradient strengths.
In an exemplary embodiment, the analysis module is further configured to perform binarization processing on a group of gradient main direction consistency intensities to obtain a binarized image of the current image; and analyzing the connected domain of the binary image to obtain a candidate frame of the current image.
In an exemplary embodiment, the determining module is further configured to determine a target location frame for a candidate frame in the candidate frame set, where an aspect ratio of the candidate frame satisfies a preset condition; or determining the candidate frame with the foreground ratio meeting the preset condition in the candidate frame set as a target positioning frame; or determining the candidate frame with pixels meeting the preset conditions in the candidate frame as the target positioning frame.
In an exemplary embodiment, the analysis module is further configured to: under the condition that the image set only comprises an original image, obtaining a candidate frame set according to the gradient of the original image on a preset window; and under the condition that the image set also comprises the thumbnail images, obtaining a candidate frame set according to the gradients of the original images and the thumbnail images in the image set on a preset window, wherein the thumbnail images are images obtained by reducing the original images.
It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
In the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:
s1, acquiring an image set, wherein the image set comprises an original image, and the original image comprises a target object;
s2, obtaining a candidate frame set according to the gradient of each image in the image set in a preset window;
and S3, determining a target positioning frame in the candidate frame set, and determining the target positioning frame as the position of the target object on the original image.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
In an exemplary embodiment, the processor may be configured to execute the following steps by a computer program:
s1, acquiring an image set, wherein the image set comprises an original image, and the original image comprises a target object;
s2, obtaining a candidate frame set according to the gradient of each image in the image set in a preset window;
and S3, determining a target positioning frame in the candidate frame set, and determining the target positioning frame as the position of the target object on the original image.
For specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations, and details of this embodiment are not repeated herein.
It will be apparent to those skilled in the art that the various modules or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented using program code executable by the computing devices, such that they may be stored in a memory device and executed by the computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for determining a location of an object, comprising:
acquiring an image set, wherein the image set comprises an original image, and the original image comprises a target object;
obtaining a candidate frame set according to the gradient of each image in the image set in a preset window;
and determining a target positioning frame in the candidate frame set, and determining the target positioning frame as the position of the target object on the original image.
2. The method of claim 1, wherein obtaining a candidate frame set according to a gradient of each image in the image set in a preset window comprises:
performing the following operations on each image in the image set, wherein each image when the following operations are performed is a current image:
traversing the current image through a preset window, and acquiring the consistency strength of the gradient main direction of the current image on the preset window to obtain a group of consistency strengths of the gradient main direction of the current image;
obtaining a candidate frame of the current image according to the group of gradient main direction consistency strengths of the current image, wherein the candidate frame set comprises the candidate frame of the current image.
3. The method according to claim 2, wherein the obtaining of the gradient main direction consistency strength of the current image on the preset window comprises:
acquiring the gradient direction of each pixel in the preset window and the gradient strength corresponding to the gradient direction to obtain a group of gradient directions and a group of gradient strengths of the preset window;
and determining the consistency strength of the gradient main direction of the current image on the preset window according to the group of gradient directions and the group of gradient strengths of the preset window.
4. The method according to claim 3, wherein the determining the gradient main direction consistency strength of the current image on the preset window according to a set of gradient directions and a set of gradient strengths of the preset window comprises:
mapping a group of gradient directions and corresponding gradient strengths to a plurality of preset direction areas, and determining a preset direction area with the maximum sum of the corresponding gradient strengths in the plurality of preset direction areas as a target preset direction area;
and determining the ratio of the sum of the gradient strengths in the target preset direction area to the target sum as the gradient main direction consistency strength of the current image on the preset window, wherein the target sum is the sum of the group of gradient strengths.
5. The method of claim 2, wherein the deriving the candidate frame of the current image according to the set of main direction of gradient consistency strengths of the current image comprises:
carrying out binarization processing on the consistency strength of the group of gradient main directions to obtain a binarization image of the current image;
and analyzing the connected domain of the binary image to obtain a candidate frame of the current image.
6. The method of claim 1, wherein determining a target location box in the set of candidate boxes comprises:
determining the target positioning frame for the candidate frame with the aspect ratio meeting the preset condition in the candidate frame set; alternatively,
determining a candidate frame with a foreground ratio meeting a preset condition in the candidate frame set as the target positioning frame; alternatively,
and determining the candidate frame with pixels meeting preset conditions in the candidate frame as the target positioning frame.
7. The method according to any one of claims 1 to 6, wherein the obtaining a candidate frame set according to the gradient of each image in the image set in a preset window comprises:
under the condition that the image set only comprises the original image, obtaining the candidate frame set according to the gradient of the original image on a preset window;
and under the condition that the image set further comprises a thumbnail image, obtaining the candidate frame set according to the gradient of the original image and the thumbnail image in the image set on the preset window, wherein the thumbnail image is an image obtained by reducing the original image.
8. An apparatus for determining a position of an object, comprising:
the system comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring an image set, the image set comprises an original image, and the original image comprises a target object;
the analysis module is used for obtaining a candidate frame set according to the gradient of each image in the image set in a preset window;
and the determining module is used for determining a target positioning frame in the candidate frame set and determining the target positioning frame as the position of the target object on the original image.
9. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, wherein the computer program, when being executed by a processor, carries out the steps of the method as claimed in any one of the claims 1 to 7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method as claimed in any of claims 1 to 7 are implemented when the computer program is executed by the processor.
CN202210135545.8A 2022-02-14 2022-02-14 Object position determination method and device Pending CN114462440A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210135545.8A CN114462440A (en) 2022-02-14 2022-02-14 Object position determination method and device


Publications (1)

Publication Number Publication Date
CN114462440A true CN114462440A (en) 2022-05-10

Family

ID=81414302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210135545.8A Pending CN114462440A (en) 2022-02-14 2022-02-14 Object position determination method and device

Country Status (1)

Country Link
CN (1) CN114462440A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104636749A (en) * 2013-11-14 2015-05-20 中国移动通信集团公司 Target object detection method and device
CN107633192A (en) * 2017-08-22 2018-01-26 电子科技大学 Bar code segmentation and reading method under a kind of complex background based on machine vision
CN111027343A (en) * 2019-12-12 2020-04-17 中科微至智能制造科技江苏有限公司 Bar code area positioning method and device
US20200226402A1 (en) * 2018-06-06 2020-07-16 Weltrend Semiconductor Inc. Barcode Detection Method and Barcode Detection System for Increasing Detection Efficiency
CN112950540A (en) * 2021-02-01 2021-06-11 联宝(合肥)电子科技有限公司 Bar code identification method and equipment
CN113705660A (en) * 2021-08-25 2021-11-26 青岛睿维申信息科技有限公司 Target identification method and related equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination