CN115147623A - Target image acquisition method and related equipment


Info

Publication number
CN115147623A
CN115147623A (application CN202210667286.3A)
Authority
CN
China
Prior art keywords
image frame
image
ith
matching
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210667286.3A
Other languages
Chinese (zh)
Inventor
姜威
白志奇
许彬
林辉
段亦涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Netease Youdao Information Technology Beijing Co Ltd
Original Assignee
Netease Youdao Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Netease Youdao Information Technology Beijing Co Ltd filed Critical Netease Youdao Information Technology Beijing Co Ltd
Priority to CN202210667286.3A
Publication of CN115147623A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/10 Image acquisition
    • G06V 10/16 Image acquisition using multiple overlapping images; Image stitching
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a target image acquisition method and related equipment. The method includes: acquiring a plurality of image frames of the target image frame by frame, wherein each image frame contains a partial image of the target image; a segmentation step: segmenting the current image frame based on the ith of n different preset sizes to obtain an ith size image, where i is an integer greater than or equal to 1 and less than or equal to n, and n is a positive integer; a matching step: performing image matching between the ith size image and the previous image frame of the current image frame to obtain an ith matching degree between the ith size image and the previous image frame and an ith offset of the ith size image in the previous image frame; a fusion step: in response to the ith matching degree being within a preset range, fusing the current image frame with the previous image frame based on the ith offset.

Description

Target image acquisition method and related equipment
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method for acquiring a target image and a related device.
Background
This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
As a common learning tool for contemporary students, the dictionary pen's scan-to-look-up capability simplifies the process of looking up words and sentences and greatly improves learning efficiency. An existing dictionary pen scans the content to be looked up with a camera mounted at the pen tip to obtain multiple scanned image frames, then matches and stitches those frames based on their feature points to finally obtain a target image containing the content. However, matching and stitching based on feature points involves a large amount of computation, so the acquisition delay of the target image is large and the user experience suffers. Meanwhile, owing to the limited field of view of the dictionary pen's camera, enough feature points may not be obtained, causing matching failures or errors and a poor-quality target image.
Disclosure of Invention
In view of the above, there is a need for an improved method that effectively mitigates the high delay and poor quality of target image acquisition.
The exemplary embodiment of the present disclosure provides a method for acquiring a target image, including:
acquiring a plurality of image frames on the target image frame by frame; wherein the image frame comprises a partial image of the target image;
a segmentation step: segmenting a current image frame based on an ith preset size in n different preset sizes to obtain an ith size image corresponding to the current image frame; wherein i is an integer greater than or equal to 1 and less than or equal to n, and n is a positive integer;
matching: performing image matching on the ith size image and a previous image frame of the current image frame to obtain an ith matching degree of the ith size image and the previous image frame and an ith offset of the ith size image in the previous image frame;
a fusion step: in response to the ith matching degree being within a preset range, fusing the current image frame with the previous image frame based on the ith offset.
In some embodiments, the method further comprises:
and executing the segmentation step, the matching step and the fusion step for every two adjacent image frames in the plurality of image frames until the fusion of the plurality of image frames is completed to obtain the target image.
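Taken together, the acquisition, segmentation, matching, and fusion steps form a loop over adjacent frames. The sketch below models frames as character strings (as in the "Abcde"/"defgh" scenario later in this document); the matching-degree measure (fraction of equal characters in the overlap), the prefix-crop "segmentation", and all function names are illustrative stand-ins, not the patent's actual implementation.

```python
def match(size_image, prev_frame):
    """Return (matching degree in [0, 1], offset of size_image in prev_frame)."""
    best_deg, best_off = 0.0, 0
    for off in range(len(prev_frame)):
        overlap = min(len(size_image), len(prev_frame) - off)
        hits = sum(a == b for a, b in zip(size_image[:overlap], prev_frame[off:off + overlap]))
        deg = hits / overlap
        if deg > best_deg:
            best_deg, best_off = deg, off
    return best_deg, best_off

def fuse(mosaic, frame, offset):
    """Stitch frame onto mosaic at offset, overwriting the overlapping tail."""
    return mosaic[:offset] + frame

def acquire_target_image(frames, sizes, lower=0.9):
    mosaic = frames[0]          # the first frame is used directly
    prev = frames[0]
    for cur in frames[1:]:
        for s in sizes:                      # segmentation step, size by size
            deg, off = match(cur[:s], prev)  # matching step
            if deg >= lower:                 # fusion step when within range
                base = len(mosaic) - len(prev)   # prev's origin inside the mosaic
                mosaic = fuse(mosaic, cur, base + off)
                break
        prev = cur
    return mosaic
```

A real implementation would crop 2-D pixel regions instead of string prefixes and use an image-similarity score, but the control flow (try sizes in order, stop at the first acceptable match, then fuse) is the one described above.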
In some embodiments, the method further comprises:
in response to the ith matching degree being smaller than the lower limit of the preset range, determining whether i is equal to n;
in response to i not being equal to n, segmenting the current image frame based on the (i+1)th preset size to obtain an (i+1)th size image corresponding to the current image frame;
performing image matching between the (i+1)th size image and the previous image frame to obtain an (i+1)th matching degree between the (i+1)th size image and the previous image frame and an (i+1)th offset of the (i+1)th size image in the previous image frame;
in response to the (i+1)th matching degree being within the preset range, fusing the current image frame with the previous image frame based on the (i+1)th offset.
In some embodiments, the method further comprises:
in response to i being equal to n, fusing the current image frame with the previous image frame based on the offset corresponding to the maximum of the matching degrees obtained for all n different preset sizes.
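The early-stop-with-fallback selection described in the last two embodiments can be sketched as follows; the `(degree, offset)` pairs are assumed to have been collected in the order the preset sizes were tried:

```python
def select_offset(degrees_offsets, lower):
    """degrees_offsets: list of (matching degree, offset) pairs, one per preset
    size, in the order the sizes were tried. Returns the offset to fuse with."""
    for deg, off in degrees_offsets:
        if deg >= lower:          # within the preset range: stop trying sizes
            return off
    # i == n and nothing reached the range: fall back to the best degree's offset
    return max(degrees_offsets, key=lambda t: t[0])[1]
```

In practice the later pairs would never be computed once an early size matches; the list form here just makes the fallback explicit.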
In some embodiments, the method further comprises:
calculating the speed and the acceleration of each first image frame based on the displacement and the frame rate between every two adjacent first image frames in a plurality of first image frames positioned before the current image frame in the plurality of image frames;
predicting a prediction matching region in the current image frame that matches the previous image frame based on the velocity and acceleration;
determining priorities of the n preset sizes based on the predicted matching region, and segmenting the current image frame in the priority order of the preset sizes.
In some embodiments, fusing the current image frame with the previous image frame based on the ith matching degree and the ith offset includes:
in response to the ith matching degree being less than a first threshold, fusing the current image frame with the previous image frame based on a first fusion method;
in response to the ith matching degree being greater than or equal to a first threshold and less than a second threshold, fusing the current image frame with the previous image frame based on a second fusing method.
In some embodiments, the second fusion method comprises a linear weighted fusion method, wherein the weight of the current image frame in fusion with the previous image frame is different when the ith matching degree is in different intervals between the first threshold and the second threshold.
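One possible reading of this interval-dependent linear weighting is a piecewise-constant weight over equal sub-intervals of [first threshold, second threshold). The number of bands and the direction of the mapping (a higher matching degree gives the current frame a smaller weight) are assumptions for illustration; the patent leaves both open:

```python
def fusion_weight(degree, t1, t2, bands=4):
    """Weight of the current frame when t1 <= degree < t2. The interval is
    split into `bands` equal sub-intervals; each sub-interval gets its own
    constant weight (0.5, 0.375, 0.25, 0.125 for the defaults). Illustrative."""
    assert t1 <= degree < t2
    band = min(int((degree - t1) / (t2 - t1) * bands), bands - 1)
    return 0.5 * (1 - band / bands)

def blend(cur_px, prev_px, w):
    """Linear weighted fusion of one pixel pair."""
    return w * cur_px + (1 - w) * prev_px
```

Applying `blend` over the overlapping region with a per-interval weight is one concrete way to realize "different weights when the ith matching degree is in different intervals".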
In some embodiments, the method further comprises: discarding the current image frame in response to the ith matching degree being greater than or equal to the second threshold.
In some embodiments, the method further comprises: and performing light supplement processing and/or light spot removing processing when each image frame is acquired.
Based on the same inventive concept, the exemplary embodiments of the present disclosure also provide an apparatus for acquiring a target image, including:
an acquisition module for acquiring a plurality of image frames on the target image frame by frame; wherein the image frame comprises a partial image of the target image;
the image segmentation module is used for segmenting a current image frame based on the ith preset size in n different preset sizes to obtain an ith size image corresponding to the current image frame; wherein i is an integer greater than or equal to 1 and less than or equal to n, and n is a positive integer;
the matching module is used for carrying out image matching on the ith size image and a previous image frame of the current image frame to obtain the ith matching degree of the ith size image and the previous image frame and the ith offset of the ith size image in the previous image frame;
and the fusion module is used for fusing the current image frame and the previous image frame based on the ith offset in response to the ith matching degree being in a preset range.
Based on the same inventive concept, the disclosed exemplary embodiments also provide an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the program, the processor implements the method for acquiring the target image as described in any one of the above items.
Based on the same inventive concept, the disclosed exemplary embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the target image acquisition method as described in any one of the above.
Based on the same inventive concept, the disclosed exemplary embodiments also provide a computer program product comprising computer program instructions which, when run on a computer, cause the computer to perform the method of acquiring a target image as described in any of the above.
As can be seen from the above, the target image acquisition method and related device provided by the present disclosure segment each frame acquired frame by frame according to different sizes, match the segmented image against the previous image frame, and fuse the current frame with the previous frame once matching succeeds, thereby obtaining the target image. Compared with the traditional approach, matching a segmented image against the previous image frame requires less computation than feature-point-based matching, which reduces delay; the method is also not limited by the number of feature points contained in an image frame, which improves the quality of the resulting target image.
Drawings
In order to illustrate the technical solutions of the present disclosure or the related art more clearly, the drawings used in describing the embodiments or the related art are briefly introduced below. Obviously, the drawings described below show only embodiments of the present disclosure, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an acquisition architecture of a target image according to an exemplary embodiment of the present disclosure.
Fig. 2 is a schematic application scenario diagram of a target image acquisition method according to an exemplary embodiment of the present disclosure.
Fig. 3 is a flowchart illustrating a target image acquisition method according to an exemplary embodiment of the present disclosure.
Fig. 4 is a schematic diagram of image frame segmentation according to an exemplary embodiment of the present disclosure.
Fig. 5 is a schematic illustration of an offset amount of an exemplary embodiment of the present disclosure.
Fig. 6 is a schematic diagram of a prediction matching region according to an exemplary embodiment of the present disclosure.
Fig. 7 is a schematic structural diagram of an apparatus for acquiring a target image according to an exemplary embodiment of the present disclosure.
Fig. 8 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
The principles and spirit of the present application will be described with reference to a number of exemplary embodiments. It should be understood that these embodiments are presented only to enable those skilled in the art to better understand and to implement the present disclosure, and are not intended to limit the scope of the present application in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
According to the embodiment of the disclosure, a method for acquiring a target image and related equipment are provided.
In this document, it is to be understood that any number of elements in the figures are provided by way of illustration and not limitation, and any nomenclature is used for differentiation only and not in any limiting sense.
The principles and spirit of the present application are explained in detail below with reference to several representative embodiments of the present disclosure.
Summary of the Invention
The scheme of the present disclosure aims to provide a target content acquisition method and a related device, so as to realize an improved target image acquisition scheme.
Currently, when a user scans content to be acquired with a device such as a dictionary pen equipped with an image acquisition apparatus (e.g., a camera), the apparatus captures an image of the content at the current position at a certain frame rate as the device moves, yielding multiple image frames. The device detects key feature points and invariant feature descriptors for each frame using corresponding detection algorithms (for example, key feature points via algorithms such as Harris, invariant feature descriptors via algorithms such as SIFT), matches two frames based on those key points and descriptors to obtain a set of matching point pairs, and removes mismatches. The frames are then affine-transformed based on the matching points, stitched, and their overlapping parts fused, until all frames are merged into a panoramic image of the content to be acquired.
In the course of implementing the present disclosure, the inventors found that the above prior art has significant disadvantages. The traditional target image acquisition approach performs fusion and panorama stitching based on key feature points, which is computationally expensive, introduces high delay, and degrades the user experience. Moreover, because the device's size constrains the field of view of the image acquisition apparatus, enough key feature points to support the algorithm may not be obtained, leading to mismatches and a poor final panoramic image.
Based on the characteristics of the target image acquisition process and aiming at the problems in the prior art, the present disclosure provides a target image acquisition method and related equipment that segment each frame acquired frame by frame according to different sizes, match the segmented image against the previous image frame, and fuse the current frame with the previous frame once matching succeeds, thereby obtaining the target image. Compared with the traditional approach, matching a segmented image against the previous image frame requires less computation than feature-point-based matching, which reduces delay; the method is also not limited by the number of feature points contained in an image frame, which improves the quality of the resulting target image.
Having described the general principles of the present disclosure, various non-limiting embodiments of the present disclosure are described in detail below.
Application scenario overview
Reference is made to fig. 1, which is a schematic diagram illustrating an architecture for acquiring a target image according to an embodiment of the present disclosure. The architecture 100 for acquiring a target image includes a server 110, a terminal 120, and a network 130 providing a communication link. The server 110 and the terminal 120 may be connected via a wired or wireless network 130. The server 110 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, cloud functions, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, and big data and artificial intelligence platforms.
The terminal 120 may be implemented in hardware or software. For example, when implemented in hardware, it may be any of various electronic devices having a display screen and supporting page display, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, desktop computers, and the like. When the terminal 120 is implemented in software, it can be installed in the electronic devices listed above; it may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services) or as a single piece of software or software module, which is not specifically limited herein.
It should be noted that the method for acquiring the target content provided in the embodiment of the present application may be executed by the terminal 120, or may be executed by the server 110. It should be understood that the number of terminals, networks, and servers in fig. 1 are illustrative only and not intended to be limiting. There may be any number of terminals, networks, and servers, as desired for an implementation.
Fig. 2 is a schematic view of an application scenario of the target image acquisition method according to an embodiment of the present disclosure. As shown in Fig. 2, the content 201 to be collected includes the character string "Abcdefgh ijkl mnopq rstuvw xyz". It should be understood that the content 201 may also be an image; the content 201 is not limited herein. The user scans the content 201 with the device's image acquisition apparatus, producing m image frames 202a, 202b, 202c, …, 202m, where m is a positive integer. The image frame 202a includes the character string "Abcde", the image frame 202b includes "defgh", the image frame 202c includes "h ijkl", …, and the image frame 202m includes "xyz". Each later image frame may be stitched onto the previous one in order, i.e., their common portions are fused and then the frames are joined. Since the image frame 202a is the first frame, it can be used directly for stitching. The image frame 202b may be stitched onto the image frame 202a as follows: the image frame 202b is segmented sequentially at n preset sizes size1, size2, …, sizen, where n is a positive integer. After each segmentation, the resulting sub-image is matched against the previous image frame 202a to compute its matching degree and offset with respect to 202a; once the matching degree falls within the preset range, segmentation stops and the image frame 202b is fused with the image frame 202a based on the corresponding offset. For example, the image frames 202b and 202a share the character string "de", so the "de" portion of 202b may be fused with the "de" portion of 202a to stitch the image frame 202b onto the image frame 202a. This continues until all image frames 202a-202m are stitched, and the final panoramic image is obtained as the target image 203 of the content 201.
In this way, by segmenting each frame acquired frame by frame according to different sizes, matching the segmented image against the previous image frame, and fusing the current frame with the previous frame once matching succeeds, the target image is obtained. Compared with the traditional approach, matching a segmented image against the previous image frame requires less computation than feature-point-based matching, which reduces delay; the method is also not limited by the number of feature points contained in an image frame, which improves the quality of the resulting target image.
The following describes a method for acquiring target content according to an exemplary embodiment of the present disclosure, with reference to an application scenario of fig. 2. It should be noted that the above application scenarios are only illustrated for the convenience of understanding the spirit and principles of the present disclosure, and the embodiments of the present disclosure are not limited in any way in this respect. Rather, embodiments of the present disclosure may be applied to any scenario where applicable.
Exemplary method
First, the present disclosure provides a method for acquiring a target image, which may be executed by a server, such as the server 110 in Fig. 1; by a terminal, such as the terminal 120 in Fig. 1; or jointly by the server and the terminal. For example, the terminal 120 may obtain a plurality of image frames of the target image and send them to the server 110 via the network 130; the server 110 performs the segmentation, matching, and fusion steps on the image frames to obtain the final target image, which may then be sent back to the terminal 120. Referring to Fig. 3, the target image acquisition method 300 according to the embodiment of the present disclosure may include the following steps.
At step S301, a plurality of image frames with respect to the target image are acquired frame by frame; wherein the image frame includes a partial image of the target image.
The image acquisition device captures the content to be acquired at a fixed frequency during scanning, obtaining a plurality of corresponding image frames. Each image frame may include a partial image of the target image corresponding to the content to be acquired. For example, in Fig. 2, the image acquisition device scans the character string "Abcdefgh ijkl mnopq rstuvw xyz" to obtain a plurality of image frames 202a-202m, each of which includes a partial image of the target image 203.
In some embodiments, the method 300 further comprises: and performing light supplement processing and/or light spot removal processing when the image frame is acquired.
If the target image obtained after the image frames are subsequently fused and stitched is too dark, downstream target detection and recognition, such as text detection and text recognition, are affected. Therefore, the image acquisition device may be fitted with a light source for supplementary lighting while image frames are captured. If the carrier of the content to be acquired has a glossy surface, obvious light spots may appear on image frames during scanning; the light spots of each image frame can therefore be processed to preserve the frame's clarity and improve the accuracy of subsequent fusion and stitching.
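The spot-removal step is not specified further. A toy stand-in that replaces saturated pixels with the mean of their non-saturated 4-neighbors gives the flavor; the saturation threshold of 240 and the fallback to the global mean are assumptions, not values from the patent:

```python
def remove_light_spots(img, thresh=240):
    """Replace saturated 'spot' pixels (>= thresh) in a 2-D grayscale image
    with the average of their non-saturated 4-neighbors; if every neighbor is
    also saturated, fall back to the global image mean."""
    h, w = len(img), len(img[0])
    global_mean = sum(sum(row) for row in img) / (h * w)
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] >= thresh:
                nb = [img[ny][nx]
                      for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                      if 0 <= ny < h and 0 <= nx < w and img[ny][nx] < thresh]
                out[y][x] = sum(nb) / len(nb) if nb else global_mean
    return out
```

Real specular-highlight removal would typically use inpainting or polarization, but even this simple filter prevents saturated spots from dominating the later matching scores.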
At step S302, the segmentation step: segmenting the current image frame based on the ith preset size among n different preset sizes to obtain an ith size image corresponding to the current image frame; wherein i is an integer greater than or equal to 1 and less than or equal to n, and n is a positive integer.
The scanning speed of the image acquisition device varies, so the position of the overlapping region in two adjacent image frames varies as well. For example, the faster the scanning speed, the smaller the similarity between adjacent image frames; conversely, the slower the speed, the greater the similarity. The n preset sizes used for segmentation, e.g., size1, size2, …, sizen, where n is a positive integer, may be set according to the frame rate, field of view, and other properties of the image acquisition device. Fig. 4 shows a schematic diagram of image frame segmentation according to an embodiment of the present disclosure. In Fig. 4, the image frame 202b shown in Fig. 2 may be segmented at a first preset size size1 to obtain a corresponding first size image 401. The first size image 401 may be matched against the previous image frame 202a to obtain a corresponding first matching degree. If the first matching degree meets the preset requirement, segmentation of the image frame 202b at the second preset size size2 is skipped. If it does not, the image frame 202b is further segmented at the second preset size size2 to obtain a second size image (not shown). By analogy, if the ith matching degree corresponding to the ith preset size meets the preset requirement, segmentation of the image frame 202b at the remaining preset sizes stops; otherwise, segmentation continues at the other preset sizes until the segmented image of some size matches the previous image frame to the preset requirement, or all preset sizes have been tried.
Further, in some embodiments, the segmentation of the current image frame at the ith preset size may follow a preset rule. In some embodiments, the preset rule may be to segment a preset region of the current image frame. For example, taking the left edge of the current image frame as a reference, a first preset region of the ith size adjacent to the left edge (401 in Fig. 4) may be segmented as the ith size image. Taking the center line of the current image frame as a reference, a second preset region to the left of the center line and/or a third preset region to the right of it (e.g., 402 and 403 in Fig. 4) may be segmented as the ith size image. A fourth preset region (not shown) centered on the center point of the current image frame may also be segmented, taking the center point as a reference. It should be understood that these preset regions are only examples and are not limiting; the position of a preset region may be set as required, and no limitation is imposed here.
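The preset regions just listed can be written down as crop rectangles. The exact coordinates below are illustrative; as the text notes, the positions are configurable:

```python
def preset_regions(frame_w, frame_h, size_w):
    """Candidate crop rectangles (x, y, w, h) of width size_w, mirroring the
    preset regions in the text: adjacent to the left edge, left/right of the
    center line, and centered on the center point. Layout is illustrative."""
    cx = frame_w // 2
    return {
        "left":          (0, 0, size_w, frame_h),
        "left_of_mid":   (max(cx - size_w, 0), 0, size_w, frame_h),
        "right_of_mid":  (cx, 0, size_w, frame_h),
        "centered":      (max(cx - size_w // 2, 0), 0, size_w, frame_h),
    }
```

Each rectangle would then be cropped out of the current frame and matched against the previous frame as the ith size image.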
In some embodiments, the preset size may have a priority. The image frames may be segmented according to a priority order of n preset sizes.
At step S303, the matching step: performing image matching between the ith size image and the previous image frame of the current image frame to obtain the ith matching degree between the ith size image and the previous image frame and the ith offset of the ith size image in the previous image frame.
For each segmentation, similarity matching can be performed between the segmented size image and the previous image frame: the higher the matching degree, the greater the similarity between the size image and the previous image frame; the lower the matching degree, the smaller the similarity. On this basis, the offset between the size image and the previous image frame can also be calculated. For example, in Fig. 4, after the image frame 202b is segmented at the first preset size size1, a first matching degree S401 between the first size image 401 and the previous image frame 202a may be calculated. Further, a first offset Δw401 of the first size image 401 with respect to the previous image frame 202a may also be obtained. Fig. 5 shows a schematic diagram of an offset according to an embodiment of the present disclosure. In Fig. 5, the upper left corner of the previous image frame 202a is the origin (0, 0) of the coordinate system. When the first size image 401 is matched with the previous image frame 202a, the two are aligned based on their overlapping region; the upper left corner of the aligned first size image 401 is marked as point A in the coordinate system, and the horizontal distance between point A and the origin (0, 0) is the first offset Δw401. It should be understood that computing the ith offset between the ith size image and the previous image frame is similar to the case of the first size image 401 and is not repeated here.
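A cheap, feature-point-free way to obtain a matching degree and a horizontal offset Δw like the ones described is to slide a 1-D column-intensity profile of the size image across that of the previous frame. The score 1/(1 + mean absolute difference) is an illustrative similarity measure, not the one the patent uses:

```python
def column_profile(img):
    """Mean intensity of each column: a cheap 1-D signature of a 2-D frame."""
    h = len(img)
    return [sum(img[y][x] for y in range(h)) / h for x in range(len(img[0]))]

def best_offset(size_profile, prev_profile):
    """Slide the size image's profile across the previous frame's profile and
    return (matching degree, horizontal offset). Degree is 1/(1 + MAD), so a
    perfect overlap scores exactly 1.0."""
    best = (0.0, 0)
    n = len(size_profile)
    for off in range(len(prev_profile) - n + 1):
        mad = sum(abs(a - b) for a, b in
                  zip(size_profile, prev_profile[off:off + n])) / n
        deg = 1.0 / (1.0 + mad)
        if deg > best[0]:
            best = (deg, off)
    return best
```

Because only one score per candidate offset is computed, the cost grows linearly with frame width, in contrast to detecting and matching feature descriptors.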
Further, in some embodiments, the speed and the acceleration of each first image frame may be obtained based on the displacement amount and the frame rate between every two adjacent first image frames in a plurality of first image frames located before the current image frame in the plurality of image frames;
predicting a prediction matching region in the current image frame that matches the previous image frame based on the velocity and acceleration;
determining priorities of the n preset sizes based on the predicted matching region, so as to segment the current image frame in priority order of the preset sizes.
The displacement amount corresponding to the ith size image whose matching degree is within the preset range can be used as the displacement amount between the current image frame and the previous image frame. The speed and acceleration during scanning can be calculated from the offsets among a plurality of image frames, so that the motion trajectory during scanning is predicted and the likely position of the next image frame is deduced. The priority of the preset sizes used during segmentation can therefore be adjusted dynamically, which speeds up the segmentation step, further reduces the amount of calculation, and reduces delay. For example, in fig. 2, when the displacement amount between the image frame 202b and the image frame 202a is Δwba, the displacement amount between the image frame 202c and the image frame 202b is Δwcb, and the frame rate of the image acquisition apparatus is f, the velocity of the image frame 202b is vb = Δwba × f, the velocity of the image frame 202c is vc = Δwcb × f, and the acceleration from the image frame 202b to the image frame 202c is a = (vc − vb) × f. From this, the position of the matching area between the image frame 202d and the image frame 202c can be predicted: the speed of the image frame 202d can be predicted as vd = vc + a/f, and the displacement amount between the image frame 202d and the image frame 202c as Δwdc = vd/f. Since the width of each image frame is fixed and consistent, assuming the width is w, the predicted matching area of the image frame 202d and the image frame 202c can be determined to be the trailing region of width w − Δwdc of the image frame 202c, as shown in fig. 6, which is a schematic diagram of the predicted matching area according to an embodiment of the present disclosure. In fig. 6, the predicted matching area of the image frame 202d and the image frame 202c includes the character string "kl", and the priority of the preset sizes may be adjusted according to the size w − Δwdc of this area. Since the predicted matching region is, with high probability, the region where the image frame 202d and the image frame 202c overlap, a segmented image (for example, the ith size image) whose size is closer to that of the predicted matching region will have a higher matching degree and thus more easily satisfy the preset range of the matching degree. This speeds up the segmentation step and further reduces the delay in acquiring the entire target image. The priority of preset sizes closer to the size of the predicted matching area can be raised: for example, the difference between the size of the predicted matching area and each preset size can be calculated, and the priority order of the preset sizes set from the smallest difference to the largest. The smaller the difference, the closer the preset size is to the size of the predicted matching area, and the higher its priority; the larger the difference, the further the preset size is from the size of the predicted matching area, and the lower its priority.
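The velocity and acceleration arithmetic in the example above can be sketched as follows. The function name and the priority rule (sorting preset sizes by absolute difference from the predicted width) are illustrative assumptions consistent with the text, not part of the patent.

```python
def predict_region_and_priorities(dwba, dwcb, f, w, preset_sizes):
    """Predict the width of the matching region between the next frame
    and the current one, then order the preset sizes by closeness to it.
    Velocities are displacement x frame rate (time between frames = 1/f)."""
    vb = dwba * f                  # velocity at frame b
    vc = dwcb * f                  # velocity at frame c
    a = (vc - vb) * f              # acceleration from frame b to frame c
    vd = vc + a / f                # predicted velocity at frame d
    dwdc = vd / f                  # predicted displacement between d and c
    predicted_width = w - dwdc     # width of the predicted matching region
    # smaller difference to the predicted width = higher priority
    ordered = sorted(preset_sizes, key=lambda s: abs(s - predicted_width))
    return predicted_width, ordered
```

For instance, with Δwba = 2, Δwcb = 3, f = 10 and frame width w = 10, the predicted overlap width is 6, so a preset size of 5 would be tried before 8 or 2.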
At step S304, a fusion step: in response to the ith matching degree being within a preset range, fusing the current image frame with the previous image frame based on the ith offset.
The ith preset size is used as a cropping template to perform template matching on the current image frame; the template matching can be a traditional template matching method or a pyramid-based template matching method. Compared with matching by key feature points in the prior art, the amount of calculation in template matching is greatly reduced, the speed of acquiring the target image is increased, and delay is reduced. Specifically, the current image frame is cropped with the ith preset size as the cropping template to obtain the corresponding ith size image; template matching is performed between the ith size image and the previous image frame to obtain the corresponding ith matching degree; and when the ith matching degree is within the preset range, fusion can be performed.
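As a hedged sketch of why pyramid-based template matching is cheaper than a flat scan (the patent does not give an implementation), the code below searches a decimated copy exhaustively and then refines the offset only in a small full-resolution window. Simple column decimation stands in for a proper Gaussian pyramid, the score is a plain sum of absolute differences, and all names are illustrative.

```python
import numpy as np

def _best_offset(prev, templ, lo, hi):
    # exhaustive search of the horizontal offset in [lo, hi] by SAD
    best, best_dx = None, lo
    for dx in range(lo, hi + 1):
        sad = float(np.abs(prev[:, dx:dx + templ.shape[1]] - templ).sum())
        if best is None or sad < best:
            best, best_dx = sad, dx
    return best_dx

def pyramid_match(prev, templ, factor=2):
    """Two-level coarse-to-fine matching: search every offset on images
    decimated by `factor`, then refine only +/- `factor` columns at full
    resolution instead of scanning the whole frame again."""
    small_prev = prev[:, ::factor]
    small_templ = templ[:, ::factor]
    coarse = _best_offset(small_prev, small_templ,
                          0, small_prev.shape[1] - small_templ.shape[1])
    center = coarse * factor
    lo = max(0, center - factor)
    hi = min(prev.shape[1] - templ.shape[1], center + factor)
    return _best_offset(prev, templ, lo, hi)
```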
In some embodiments, the method 300 further comprises:
in response to the ith matching degree being smaller than the lower limit value of the preset range, determining whether i is equal to n;
in response to i not being equal to n, segmenting the current image frame based on the (i+1)th preset size to obtain an (i+1)th size image corresponding to the current image frame;
performing image matching on the (i+1)th size image and the previous image frame to obtain an (i+1)th matching degree of the (i+1)th size image with the previous image frame and an (i+1)th offset of the (i+1)th size image in the previous image frame;
in response to the (i+1)th matching degree being within the preset range, fusing the current image frame with the previous image frame based on the (i+1)th offset.
In some embodiments, the method 300 further comprises:
in response to i being equal to n, fusing the current image frame with the previous image frame based on the offset corresponding to the maximum of all matching degrees corresponding to the n different preset sizes.
When the ith matching degree corresponding to the ith size image of the current image frame is not within the preset range (i.e., is smaller than its lower limit value), it is determined whether i equals n at this point, that is, whether the current image frame has already been segmented based on all preset sizes. If i is not n, the current image frame has not yet been segmented with all preset sizes, and the segmentation step continues with the next preset size (such as the (i+1)th preset size). If i is n, the current image frame has been segmented based on all preset sizes; at this time, the maximum of all matching degrees corresponding to the 1st to nth size images obtained with all preset sizes may be taken, and the offset corresponding to that maximum matching degree used as the basis for fusing the current image frame with the previous image frame.
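The i-to-n fallback just described can be sketched as a loop. Here `match_fn` stands in for the segmentation-plus-matching of one preset size and is a placeholder, as are the bound names `lower` and `upper`.

```python
def match_with_fallback(presets, match_fn, lower, upper):
    """Try preset sizes in priority order (i = 1..n). Accept the first
    one whose matching degree falls inside [lower, upper]; if none does
    after i == n, fall back to the offset of the highest matching degree
    seen. `match_fn(size)` returns (degree, offset)."""
    results = []
    for size in presets:                      # the ith preset size
        degree, offset = match_fn(size)
        if lower <= degree <= upper:
            return degree, offset             # in the preset range: fuse now
        results.append((degree, offset))
    # i == n and nothing in range: use the maximum matching degree
    return max(results, key=lambda r: r[0])
```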
In some embodiments, the method 300 further comprises:
fusing the current image frame with the previous image frame based on the ith matching degree and the ith offset, including:
in response to the ith matching degree being less than a first threshold, fusing the current image frame with the previous image frame based on a first fusion method;
in response to the ith matching degree being greater than or equal to a first threshold and less than a second threshold, fusing the current image frame with the previous image frame based on a second fusing method.
Different fusion methods, such as linear weighted fusion or Poisson fusion, can be used for different matching degrees to improve the effect and efficiency of fusion. Specifically, a first fusion method may be employed when the matching degree is smaller than a first threshold, and a second fusion method when the matching degree is greater than or equal to the first threshold and smaller than a second threshold. Further, the first fusion method may be different from the second fusion method. Still further, in some embodiments, the first fusion method comprises a Poisson fusion method. For example, if the first threshold is a score of 70 and the matching degree is less than 70, the Poisson fusion method is used.
In some embodiments, the method 300 further comprises:
the second fusion method includes a linear weighted fusion method in which the weight of the current image frame in fusion with the previous image frame differs when the ith matching degree falls in different intervals between the first threshold and the second threshold.
For the linear weighted fusion method, different weights are used for different matching degrees. For example, the second threshold may be a score of 95, so that (70, 95) lies between the first threshold and the second threshold, and this range may be divided into a plurality of intervals of a preset width (for example, 5 points). A matching degree above 70, that is, greater than or equal to the first threshold, uses the linear weighted fusion method. Each interval is then assigned a set of weights: for example, a matching degree between 70 and 75 uses a first set of weights (e.g., the current image frame is weighted w1 and the previous image frame w2), a matching degree between 76 and 80 uses a second set of weights (e.g., the current image frame is weighted w3 and the previous image frame w4), and so on. In this way, different fusion methods can be selected according to the matching degree, and a clearer panoramic target image obtained.
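One possible interval-based weight schedule is sketched below. The text does not specify the weight values w1, w2, ..., so this just shows a monotone scheme per 5-point interval under the 70/95 thresholds of the running example; the function name is illustrative.

```python
def blend_weights(degree, first=70, second=95, step=5):
    """Cut [first, second) into fixed-width intervals (5 points here) and
    give the current frame a weight that grows with the matching degree;
    the previous frame gets the remainder. Returns (w_current, w_previous)."""
    assert first <= degree < second
    k = int((degree - first) // step)          # interval index 0, 1, 2, ...
    n = (second - first) // step               # number of intervals
    w_cur = (k + 1) / (n + 1)                  # weight of the current frame
    return w_cur, 1.0 - w_cur
```

In the overlap region the fused pixel would then be `w_cur * current + w_prev * previous`.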
In some embodiments, the method 300 further comprises: discarding the current image frame in response to the ith matching degree being greater than or equal to the second threshold.
Different matching degrees carry different meanings. An extremely high matching degree (for example, at or above the second threshold of 95) may indicate that the moving distance during scanning was small or that there was no movement, so the current frame may optionally be discarded. This avoids unnecessary calculation, further improves the acquisition efficiency of the target image, and reduces delay.
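Putting the three cases together, the dispatch can be sketched as follows; the threshold values are assumed from the running example (70 and 95) and the labels are illustrative.

```python
def fuse_decision(degree, first=70, second=95):
    """Choose the handling of the current frame by matching degree:
    Poisson fusion below the first threshold, linear weighted fusion in
    [first, second), and discarding near-duplicate frames at or above
    the second threshold."""
    if degree < first:
        return "poisson"
    if degree < second:
        return "linear_weighted"
    return "discard"
```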
In some embodiments, the method 300 further comprises:
and executing the segmentation step, the matching step and the fusion step for every two adjacent image frames in the plurality of image frames until the fusion of the plurality of image frames is completed to obtain the target image.
Specifically, as image frames are continuously acquired, the segmentation step, the matching step, and the fusion step of the embodiments of the present disclosure may be performed on every two adjacent image frames in real time, until scanning is complete and all adjacent image frames have been processed, thereby obtaining a panoramic target image of the content to be acquired.
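The frame-by-frame loop described here can be sketched as below; `process_pair` stands for steps S302 to S304 (segment, match, fuse) and is a caller-supplied placeholder.

```python
def stitch(frames, process_pair):
    """Fold the segmentation/matching/fusion pipeline over every pair of
    adjacent frames, growing the panorama one frame at a time.
    `process_pair(panorama_so_far, current_frame)` returns the new
    panorama."""
    if not frames:
        return None
    panorama = frames[0]
    for cur in frames[1:]:
        panorama = process_pair(panorama, cur)
    return panorama
```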
In some embodiments, the method 300 further comprises: performing target recognition based on the target image. Further, the target recognition includes text recognition and/or image recognition. For example, after the target image 203 shown in fig. 2 is obtained, the character string in the target image 203 may be recognized. Further, a corresponding database may be searched to return the meaning of the string (e.g., Chinese and/or foreign-language interpretation). Thus, according to the embodiments of the present disclosure, when the image acquisition device is a dictionary pen, it can quickly and effectively help the user accurately recognize the content to be looked up and return the query result, realizing low-delay, efficient, and accurate query and improving the user experience.
Exemplary device
Referring to fig. 7, based on the same inventive concept as any of the above-mentioned embodiments of the target image acquisition method, an embodiment of the present disclosure further provides a target image acquisition apparatus. The target image acquisition apparatus comprises:
an acquisition module for acquiring a plurality of image frames on the target image frame by frame; wherein the image frame comprises a partial image of the target image;
the image segmentation module is used for segmenting a current image frame based on the ith preset size in n different preset sizes to obtain an ith size image corresponding to the current image frame; wherein i is an integer of 1 or more and n or less, and n is a positive integer;
the matching module is used for carrying out image matching on the ith size image and a previous image frame of the current image frame to obtain the ith matching degree of the ith size image and the previous image frame and the ith offset of the ith size image in the previous image frame;
and the fusion module is used for fusing the current image frame and the previous image frame based on the ith offset in response to the ith matching degree being in a preset range.
In some embodiments, for each two adjacent image frames of the plurality of image frames, the segmentation module is further configured to perform the segmentation step, the matching module is further configured to perform the matching step, and the fusion module is further configured to perform the fusion step until the fusion of the plurality of image frames is completed to obtain the target image.
In some embodiments, the apparatus further comprises:
the judging module is used for responding to the fact that the ith matching degree is smaller than the lower limit value of the preset range and judging whether i is equal to n or not;
the segmentation module is further configured to: in response to i not being equal to n, segment the current image frame based on the (i+1)th preset size to obtain an (i+1)th size image corresponding to the current image frame;
the matching module is further configured to: perform image matching on the (i+1)th size image and the previous image frame to obtain an (i+1)th matching degree of the (i+1)th size image with the previous image frame and an (i+1)th offset of the (i+1)th size image in the previous image frame;
the fusion module is further configured to: in response to the (i+1)th matching degree being within the preset range, fuse the current image frame with the previous image frame based on the (i+1)th offset.
In some embodiments, the fusion module is further configured to: in response to i being equal to n, fusing the current image frame with the previous image frame based on an offset corresponding to a maximum value of all matching degrees corresponding to the n different preset sizes.
In some embodiments, the apparatus further comprises:
the calculation module is used for calculating the speed and the acceleration of each first image frame based on the displacement and the frame rate between every two adjacent first image frames in a plurality of first image frames positioned before the current image frame in the plurality of image frames;
a prediction module for predicting a prediction matching region in the current image frame that matches the previous image frame based on the velocity and acceleration;
the segmentation module is further configured to determine priorities of the n preset sizes based on the prediction matching region, so as to segment the current image frame in priority order of the preset sizes.
In some embodiments, the fusion module is further configured to:
in response to the ith matching degree being less than a first threshold, fusing the current image frame with the previous image frame based on a first fusion method;
in response to the ith matching degree being greater than or equal to the first threshold and less than the second threshold, fusing the current image frame with the previous image frame based on a second fusion method.
In some embodiments, the second fusion method includes a linear weighted fusion method, wherein the weight of the current image frame in fusion with the previous image frame differs when the ith matching degree falls in different intervals between the first threshold and the second threshold.
In some embodiments, the fusion module is further configured to: discard the current image frame in response to the ith matching degree being greater than or equal to the second threshold.
In some embodiments, the apparatus further comprises:
and the preprocessing module is used for performing light supplement processing and/or light spot removal processing on each image frame when the image frame is acquired.
The apparatus of the foregoing embodiment is used to implement the corresponding target image obtaining method in any one of the foregoing exemplary target image obtaining method portions, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept as any of the above embodiments of the target image acquisition method, an embodiment of the present disclosure further provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the target image acquisition method of any of the above embodiments.
Fig. 8 shows a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure, where the electronic device may include: a processor 810, a memory 820, an input/output interface 830, a communication interface 840, and a bus 850. Wherein processor 810, memory 820, input/output interface 830, and communication interface 840 are communicatively coupled to each other within the device via bus 850.
The processor 810 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present specification.
The memory 820 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 820 may store an operating system and other application programs; when the technical solutions provided by the embodiments of the present specification are implemented by software or firmware, the relevant program code is stored in the memory 820 and called for execution by the processor 810.
The input/output interface 830 is used for connecting an input/output module to realize information input and output. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 840 is used for connecting a communication module (not shown in the figure) to realize communication interaction between the device and other devices. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, bluetooth and the like).
Bus 850 includes a pathway for communicating information between the various components of the device, such as the processor 810, the memory 820, the input/output interface 830, and the communication interface 840.
It should be noted that although the above-mentioned device only shows the processor 810, the memory 820, the input/output interface 830, the communication interface 840 and the bus 850, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The electronic device of the foregoing embodiment is used to implement the corresponding target image acquisition method in any of the foregoing exemplary method portions, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Exemplary program product
Based on the same inventive concept as any of the above embodiments of the target image acquisition method, the disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the corresponding target image acquisition method in any of the foregoing exemplary method portions.
The non-transitory computer-readable storage medium may be any available medium or data storage device that can be accessed by a computer, including but not limited to magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor memory (e.g., ROM, EPROM, EEPROM, non-volatile memory (NAND FLASH), Solid State Disks (SSDs)), etc.
The computer instructions stored in the storage medium of the above embodiment are used to enable the computer to execute the target image obtaining method according to any one of the above exemplary method embodiments, and have the beneficial effects of the corresponding method embodiments, and are not described herein again.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software, referred to herein generally as a "circuit," "module," or "system." Furthermore, in some embodiments, the invention may also be embodied in the form of a computer program product in one or more computer-readable media having computer-readable program code embodied therein.
Any combination of one or more computer-readable media may be employed. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Rather, the steps depicted in the flowcharts may change their order of execution; additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be broken into multiple steps.
Use of the verbs "comprise", "comprise" and their conjugations in this application does not exclude the presence of elements or steps other than those stated in this application. The article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements.
While the spirit and principles of the invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, nor does the division into aspects imply that features in these aspects cannot be combined to advantage; such division is for convenience of presentation only. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims (10)

1. A method for acquiring a target image, comprising:
acquiring a plurality of image frames on the target image frame by frame; wherein the image frame comprises a partial image of the target image;
a segmentation step: segmenting a current image frame based on an ith preset size in n different preset sizes to obtain an ith size image corresponding to the current image frame; wherein i is an integer greater than or equal to 1 and less than or equal to n, and n is a positive integer;
matching: performing image matching on the ith size image and a previous image frame of the current image frame to obtain an ith matching degree of the ith size image and the previous image frame and an ith offset of the ith size image in the previous image frame;
a fusion step: in response to the ith matching degree being within a preset range, fusing the current image frame with the previous image frame based on the ith offset.
2. The method of claim 1, further comprising:
and executing the segmentation step, the matching step and the fusion step for every two adjacent image frames in the plurality of image frames until the fusion of the plurality of image frames is completed to obtain the target image.
3. The method of claim 1, further comprising:
in response to the ith matching degree being smaller than the lower limit value of the preset range, determining whether i is equal to n;
in response to i not being equal to n, segmenting the current image frame based on the (i+1)th preset size to obtain an (i+1)th size image corresponding to the current image frame;
performing image matching on the (i+1)th size image and the previous image frame to obtain an (i+1)th matching degree of the (i+1)th size image with the previous image frame and an (i+1)th offset of the (i+1)th size image in the previous image frame;
in response to the (i+1)th matching degree being within the preset range, fusing the current image frame with the previous image frame based on the (i+1)th offset.
4. The method of claim 3, further comprising:
in response to i being equal to n, fusing the current image frame with the previous image frame based on the offset corresponding to the maximum of all matching degrees corresponding to the n different preset sizes.
5. The method of claim 1, further comprising:
calculating the speed and the acceleration of each first image frame based on the displacement and the frame rate between every two adjacent first image frames in a plurality of first image frames positioned before the current image frame in the plurality of image frames;
predicting a prediction matching region in the current image frame that matches the previous image frame based on the velocity and acceleration;
determining priorities of the n preset sizes based on the predicted matching region, and segmenting the current image frame in priority order of the preset sizes.
6. The method of claim 1, wherein fusing the current image frame with the previous image frame based on the ith matching degree and the ith offset comprises:
in response to the ith matching degree being less than a first threshold, fusing the current image frame with the previous image frame based on a first fusion method;
in response to the ith matching degree being greater than or equal to a first threshold and less than a second threshold, fusing the current image frame with the previous image frame based on a second fusing method.
7. The method according to claim 6, wherein the second fusion method comprises a linear weighted fusion method, wherein the current image frame has a different weight in fusion with the previous image frame when the ith matching degree is in a different interval between the first threshold and the second threshold.
8. The method of claim 6, comprising:
discarding the current image frame in response to the ith matching degree being greater than or equal to the second threshold.
9. The method of claim 1, further comprising:
and performing light supplement processing and/or light spot removing processing when each image frame is acquired.
10. An apparatus for acquiring a target image, comprising:
an acquisition module for acquiring, frame by frame, a plurality of image frames of the target image; wherein each image frame comprises a partial image of the target image;
an image segmentation module for segmenting a current image frame based on the ith preset size among n different preset sizes to obtain an ith size image corresponding to the current image frame; wherein i is an integer greater than or equal to 1 and less than or equal to n, and n is a positive integer;
a matching module for performing image matching between the ith size image and a previous image frame of the current image frame to obtain an ith matching degree between the ith size image and the previous image frame and an ith offset of the ith size image within the previous image frame; and
a fusion module for fusing the current image frame with the previous image frame based on the ith offset in response to the ith matching degree being within a preset range.
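The per-frame pipeline implied by the modules above — try each of the n preset sizes in order, fuse as soon as a matching degree falls in the preset range, and otherwise fall back to the offset of the maximum matching degree — can be sketched as one loop. All names and callables here are hypothetical stand-ins for the claimed modules:

```python
def fuse_frames(curr, prev, preset_sizes, lo, hi, segment, match, fuse):
    """Hypothetical sketch of the claimed pipeline: segment the current
    frame at each preset size, match against the previous frame, and
    fuse when the matching degree lies in the preset range [lo, hi)."""
    best_degree, best_off = float("-inf"), None
    for size in preset_sizes:                # i = 1 .. n
        sized = segment(curr, size)          # i-th size image
        degree, offset = match(sized, prev)  # i-th matching degree/offset
        if lo <= degree < hi:                # degree within preset range
            return fuse(curr, prev, offset)
        if degree > best_degree:
            best_degree, best_off = degree, offset
    # i reached n with no degree in range: fall back to the offset of
    # the maximum matching degree over all n sizes
    return fuse(curr, prev, best_off)
```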
CN202210667286.3A 2022-06-13 2022-06-13 Target image acquisition method and related equipment Pending CN115147623A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210667286.3A CN115147623A (en) 2022-06-13 2022-06-13 Target image acquisition method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210667286.3A CN115147623A (en) 2022-06-13 2022-06-13 Target image acquisition method and related equipment

Publications (1)

Publication Number Publication Date
CN115147623A true CN115147623A (en) 2022-10-04

Family

ID=83407979

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210667286.3A Pending CN115147623A (en) 2022-06-13 2022-06-13 Target image acquisition method and related equipment

Country Status (1)

Country Link
CN (1) CN115147623A (en)

Similar Documents

Publication Publication Date Title
US11423695B2 (en) Face location tracking method, apparatus, and electronic device
KR20200040885A (en) Target tracking methods and devices, electronic devices, storage media
US11625433B2 (en) Method and apparatus for searching video segment, device, and medium
CN110246160B (en) Video target detection method, device, equipment and medium
CN109934229B (en) Image processing method, device, medium and computing equipment
CN110660102B (en) Speaker recognition method, device and system based on artificial intelligence
CN111612696B (en) Image stitching method, device, medium and electronic equipment
CN113539304B (en) Video strip splitting method and device
CN111314626B (en) Method and apparatus for processing video
CN113496208B (en) Video scene classification method and device, storage medium and terminal
CN113903036B (en) Text recognition method and device, electronic equipment, medium and product
CN115205925A (en) Expression coefficient determining method and device, electronic equipment and storage medium
CN111325798A (en) Camera model correction method and device, AR implementation equipment and readable storage medium
CN114723646A (en) Image data generation method with label, device, storage medium and electronic equipment
CN113436226A (en) Method and device for detecting key points
CN109934185B (en) Data processing method and device, medium and computing equipment
CN112085842B (en) Depth value determining method and device, electronic equipment and storage medium
CN111444834A (en) Image text line detection method, device, equipment and storage medium
CN116543397A (en) Text similarity calculation method and device, electronic equipment and storage medium
CN111062374A (en) Identification method, device, system, equipment and readable medium of identity card information
CN115147623A (en) Target image acquisition method and related equipment
CN115004245A (en) Target detection method, target detection device, electronic equipment and computer storage medium
CN114299074A (en) Video segmentation method, device, equipment and storage medium
CN114842396A (en) Video motion positioning method and device
CN117115139A (en) Endoscope video detection method and device, readable medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination