CN111464716B - Certificate scanning method, device, equipment and storage medium

Info

Publication number: CN111464716B (application CN202010273442.9A; also published as CN111464716A)
Authority: CN (China)
Prior art keywords: line segment, certificate, video frame, image, edge
Legal status: Active (granted)
Inventor: 李悦馨
Assignee (original and current): Tencent Technology Shenzhen Co Ltd
Other languages: Chinese (zh)
Application filed by Tencent Technology Shenzhen Co Ltd; priority to CN202010273442.9A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N 1/04 Scanning arrangements, i.e. arrangements for the displacement of active reading or reproducing elements relative to the original or reproducing medium, or vice versa
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Abstract

The embodiment of the application discloses a certificate scanning method. When a certificate needs to be scanned, after the user triggers the scanning operation, a target video frame image including the certificate to be scanned is obtained, and edge detection and filtering are performed on the target video frame image in real time to obtain an edge line segment set. The certificate outline of the certificate to be scanned is then determined from the edge line segment set; the image corresponding to the area surrounded by the certificate outline is the image of the certificate to be scanned with the background removed. The target video frame image is therefore processed according to the certificate outline to obtain a target area image, the scanned image of the certificate to be scanned is determined from the target area image, and the scanned image is displayed to the user. In this way, the certificate outline is located directly by an image processing method to obtain the scanned image of the certificate, without a large amount of complex deep-learning computation, which greatly reduces computational complexity, improves computational efficiency, and enables real-time certificate scanning.

Description

Certificate scanning method, device, equipment and storage medium
Technical Field
The present application relates to the field of automatic identification, and in particular, to a certificate scanning method, apparatus, device, and storage medium.
Background
With the development of internet technology, online transaction services are becoming more and more common, and more and more industries, such as telecommunications, finance and immigration, need to acquire and register users' certificate information for real-name management. To make acquiring and registering certificate information more efficient, the certificate information is generally acquired by scanning the certificate.
Currently, a scanned image of a certificate can be obtained using a deep learning technique, and the certificate information is then identified from the scanned image. However, this approach has high computational complexity and cannot achieve real-time certificate scanning.
Disclosure of Invention
In order to solve the above technical problems, the application provides a certificate scanning method, apparatus, device and storage medium. A large amount of complex deep-learning computation is not required, so the computational complexity is greatly reduced, the computational efficiency is improved, and real-time certificate scanning is achieved.
The embodiment of the application discloses the following technical scheme:
in a first aspect, an embodiment of the present application provides a certificate scanning method, where the method includes:
responding to the scanning trigger operation, and acquiring a target video frame image comprising a certificate to be scanned;
performing edge detection filtering on the target video frame image to obtain an edge line segment set;
determining the certificate outline of the certificate to be scanned according to the edge line segment set;
processing the target video frame image according to the certificate outline to obtain a target area image, wherein the target area image is an image corresponding to an area surrounded by the certificate outline;
and displaying the scanned image of the certificate to be scanned to a user according to the target area image.
In a second aspect, embodiments of the present application provide a document scanning device, the device including an acquisition unit, a first determination unit, a second determination unit, a processing unit, and a display unit:
the acquisition unit is used for responding to the scanning trigger operation and acquiring a target video frame image comprising a certificate to be scanned;
the first determining unit is used for performing edge detection and filtering on the target video frame image to obtain an edge line segment set;
the second determining unit is used for determining the certificate outline of the certificate to be scanned according to the edge line segment set;
the processing unit is used for processing the target video frame image according to the certificate outline to obtain a target area image, and the target area image is an image corresponding to an area surrounded by the certificate outline;
and the display unit is used for displaying the scanned image of the certificate to be scanned to a user according to the target area image.
In a third aspect, an embodiment of the present application provides an apparatus for credential scanning, the apparatus comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method of any of the first aspects in accordance with instructions in the program code.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium for storing program code for executing the method of any one of the first aspect.
According to the technical scheme, when a certificate needs to be scanned, image processing can be performed directly on the target video frame image including the certificate to be scanned to obtain the scanned image. Specifically, after the user triggers the scanning operation, a target video frame image including the certificate to be scanned is acquired, and edge detection and filtering are performed on it in real time to obtain an edge line segment set. The certificate outline of the certificate to be scanned is then determined from the edge line segment set; the image corresponding to the area surrounded by the certificate outline is the image of the certificate to be scanned with the background removed. The target video frame image is therefore processed according to the certificate outline to obtain a target area image, the scanned image of the certificate to be scanned is determined from the target area image, and the scanned image is displayed to the user. In this way, the certificate outline is located directly by an image processing method to obtain the scanned image of the certificate, without a large amount of complex deep-learning computation, which greatly reduces computational complexity, improves computational efficiency, and enables real-time certificate scanning.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic view of an application scenario of a certificate scanning method according to an embodiment of the present application;
FIG. 2 is a flowchart of a certificate scanning method provided in an embodiment of the present application;
FIG. 3 is a diagram illustrating an example of a scanning interface provided in an embodiment of the present application;
FIG. 4 is a diagram illustrating an example of a scanning interface provided in an embodiment of the present application;
FIG. 5 is a flowchart of stability determination provided in an embodiment of the present application;
FIG. 6 is an exemplary diagram of a display interface provided in an embodiment of the present application;
FIG. 7 is a flowchart of determining a set of edge line segments according to an embodiment of the present application;
FIG. 8 is a flowchart of cluster merging provided in an embodiment of the present application;
FIG. 9 is a flowchart of determining a certificate outline provided by an embodiment of the present application;
FIG. 10 is a flowchart of determining a certificate outline provided by an embodiment of the present application;
FIG. 11 is a flowchart of determining a scanned image according to an embodiment of the present application;
FIG. 12 is a diagram illustrating an example of a display interface for a scanned image according to an embodiment of the present application;
FIG. 13 is a flowchart of a certificate scanning method provided in an embodiment of the present application;
FIG. 14 is a block diagram of a certificate scanning apparatus according to an embodiment of the present application;
FIG. 15 is a block diagram of a certificate scanning device according to an embodiment of the present application;
FIG. 16 is a block diagram of a server according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
In the related art, certificates are scanned based on deep learning, and the computational load of deep learning is huge, so the computation takes a long time, the processing speed is slow, and the real-time requirement of certificate scanning is difficult to meet.
In order to solve the above technical problem, an embodiment of the present application provides a certificate scanning method, which may directly perform image processing on a target video frame image including a certificate to be scanned to obtain a scanned image, and does not need to perform a large amount of complex calculations in a deep learning manner, thereby greatly reducing the calculation complexity, improving the calculation efficiency, and further implementing real-time certificate scanning.
It should be emphasized that the certificate scanning method provided by the embodiment of the application can be applied to various scenarios of automatic certificate identification and electronic certificate material submission. For example, when a user transacts business at a bank, the user's certificate can be scanned and the information extracted from the scanned image, so that the certificate information is registered automatically instead of being entered manually.
It should be noted that the method provided in the embodiment of the present application may be applied to a processing device with a scanning function, for example, a scanning client is installed on the processing device. The processing device may have a camera through which video frame images are captured. The processing device may be a terminal device, and the terminal device may be, for example, a smart phone, a tablet computer, a notebook computer, a desktop computer, and the like, but is not limited thereto.
In order to facilitate understanding of the technical solution of the present application, the certificate scanning method provided in the embodiment of the present application is described below with reference to an actual application scenario. Referring to fig. 1, the application scenario includes a terminal device 101, and a scanning client is installed on the terminal device 101.
When a user needs to scan a certain certificate, for example, a certificate to be scanned, the user can open the scanning client to trigger automatic scanning. The certificate can be various types of certificates such as identity cards, passports, driving licenses, bills, house property cards, business cards, work cards and the like.
After the terminal device 101 starts the scanning function, it may capture a video frame image, and the terminal device 101 obtains a target video frame image including a certificate to be scanned from the captured video frame image. The target video frame image is a video frame image which needs to be subjected to subsequent processing in the acquired video frame images.
Since the document to be scanned is located in a certain area of the target video frame image, the area and the background can be distinguished by the document outline. Therefore, in order to accurately obtain a scanned image of a document to be scanned, the profile of the document needs to be determined. The certificate profile is composed of line segments serving as edges, so that the terminal device 101 performs edge detection filtering on the target video frame image to obtain an edge line segment set, and the certificate profile of the certificate to be scanned is determined according to the edge line segment set.
The image corresponding to the area surrounded by the certificate outline is the image of the certificate to be scanned with the background removed, so the terminal device 101 can cut out the target area image from the target video frame image according to the certificate outline, and further display the scanned image of the certificate to be scanned to the user according to the target area image.
After obtaining the scanned image, the user can save it so that materials can be conveniently submitted later. The user can also go on to identify the certificate information in the scanned image, so that the certificate information can be collected, registered, audited and so on.
Next, a document scanning method provided in an embodiment of the present application will be described in detail with reference to the drawings.
Referring to FIG. 2, FIG. 2 shows a flow diagram of a credential scanning method, the method comprising:
s201, responding to the scanning trigger operation, and acquiring a target video frame image including the certificate to be scanned.
When a certificate needs to be scanned, a user can open the scanning client on the terminal device and scan the certificate with it. The scanning interface can be seen in fig. 3: the detected certificate lies inside the quadrilateral enclosed by the line segments in the dashed box, and other invalid areas, such as the background, lie outside the quadrilateral.
The scanning trigger operation may be the user opening the scanning client. In addition, since there may be many types of certificates, the subsequent processing of the scanned image may differ between types: for example, some types of certificates contain only text, while others contain pictures in addition to text, and the two types may be processed differently. To make it easier to choose a suitable processing mode later, the user can select the certificate type before scanning; once the user finishes selecting the certificate type, the terminal device is automatically triggered to start scanning. In this case, the scan trigger operation may be the user selecting a certificate type.
For example, as shown in fig. 3, fig. 3 takes the example that the certificate type includes an identity card and a business card, and when the certificate to be recognized is the business card shown in fig. 4, the user selects "business card", and then the terminal device is triggered to start scanning. The function of 'certificate recognition' is also included in fig. 3, and when the user selects 'certificate recognition', the information on the certificate can be directly recognized.
After the user opens the scanning client, the terminal device may acquire video frame images through the camera. In some cases, however, the terminal device may not yet be aimed at the certificate to be scanned or may not be held steady, so the acquired video frame image is not what the user actually wants, or it may be shaky and blurred, which is not conducive to determining the certificate outline later. Therefore, in order to avoid shaky and blurred video frame images and ensure accurate identification of the certificate outline, a stability determination can be performed on the acquired video frame images, and the target video frame image is selected from the acquired video frame images according to the stability determination result for subsequent processing, rather than performing subsequent processing on every video frame image.
By the method, unnecessary processing on some video frame images can be avoided, unnecessary calculation amount is reduced, and calculation efficiency is improved. In addition, because the target video frame image is stable, the video frame image is prevented from shaking and blurring, and the accurate identification of the certificate outline is ensured.
The stability determination may be performed in various ways, such as a three-frame difference method or a two-frame difference method. The embodiment of the application uses the three-frame difference method as an example. Suppose the stability of the current video frame image, called the pending video frame image, needs to be determined, where the pending video frame image is a video frame image whose frame number is greater than 3. A difference map can be obtained from the pending video frame image based on the three-frame difference method. The difference map reflects the difference between the certificate to be scanned in the pending video frame image and the certificate to be scanned in the two previous video frame images; the smaller the difference, the more stable the pending video frame image. The difference can be measured by the number of pixel points whose pixel value is greater than zero: if this number in the difference map is smaller than a first threshold, the pending video frame image differs little from the two previous video frame images, the stability determination passes, and the pending video frame image can serve as the target video frame image.
However, to avoid relying on a single, possibly accidental, result, it may also be determined whether several consecutive video frame images are stable, that is, the number of consecutive video frame images that pass the stability determination is counted. If, after the pending video frame image passes the stability determination, the number of consecutive video frame images that have passed reaches a second threshold, the consecutive video frame images acquired during this period are considered stable, the pending video frame image is taken as the target video frame image, and step S202 is performed on it.
Of course, it may also be considered that the video frame image subsequent to the pending video frame image is also stable, and the step of S202 may also be performed on the subsequent video frame image.
Referring to fig. 5, which shows a stability determination flowchart, a difference map is obtained from the pending video frame image based on the three-frame difference method (S501 in fig. 5). For example, if the pending video frame image is F1, the previous video frame image of F1 is F2 and the video frame image two frames before F1 is F3, the three-frame difference method may difference F1 with each of the two previous video frame images to obtain two difference results, and then subtract the two difference results to obtain the difference map. It is then determined whether the number C1 of pixel points with pixel values greater than zero in the difference map is less than a first threshold (Thres1) (S502 in fig. 5); the pixel points with pixel values greater than zero may be, for example, the white points in the difference map. If C1 is smaller than Thres1, the count C2 of consecutive video frame images that have passed the stability determination is incremented (S503 in fig. 5), and it is determined whether C2 reaches a second threshold (Thres2) (S504 in fig. 5). If so, S202 is performed to carry out edge detection and filtering on the target video frame image to obtain an edge line segment set (S505 in fig. 5); if not, the processing flow for the pending video frame image is terminated (S506 in fig. 5).
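For illustration only, the following is a minimal Python sketch of this stability check using OpenCV (a library choice assumed here, not specified by the application). The threshold values are assumptions, and resetting the counter when a frame fails the check is also an assumption; the flow in fig. 5 only terminates processing for that frame.

```python
import cv2

THRES1 = 500   # first threshold on changed-pixel count (assumed value)
THRES2 = 5     # second threshold on consecutive stable frames (assumed value)

stable_count = 0  # C2: number of consecutive frames that passed the check

def is_frame_stable(f1_gray, f2_gray, f3_gray):
    """Three-frame difference as described: difference F1 against each of the
    two previous frames, then subtract the two difference results."""
    d12 = cv2.absdiff(f1_gray, f2_gray)   # |F1 - F2|
    d13 = cv2.absdiff(f1_gray, f3_gray)   # |F1 - F3|
    diff_map = cv2.absdiff(d12, d13)      # combine the two results
    c1 = cv2.countNonZero(diff_map)       # pixels with value > 0
    return c1 < THRES1

def update_stability(f1_gray, f2_gray, f3_gray):
    """Returns True when the pending frame can be used as the target frame."""
    global stable_count
    if is_frame_stable(f1_gray, f2_gray, f3_gray):
        stable_count += 1
    else:
        stable_count = 0  # reset on an unstable frame (assumption)
    return stable_count >= THRES2
```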
After the stability determination is passed, the subsequent processing may be performed on the target video frame image (for example, the video frame image of a certain hospital information card), at this time, a display interface on the terminal device may be as shown in fig. 6, and the display interface may prompt the user that "intelligent recognition is being performed", that is, the subsequent processing such as S202 is being performed on the target video frame image.
S202, performing edge detection and filtering on the target video frame image to obtain an edge line segment set.
And the terminal equipment carries out edge detection and filtering on the target video frame image to obtain an edge line segment set, and line segments included in the edge line segment set can form a certificate outline.
In some embodiments, S202 may be implemented by performing edge finding on the target video frame image to obtain, for each pixel, a probability value that the pixel belongs to an edge, and then determining an edge filter graph from these probability values. For example, if the probability value is greater than a certain threshold (Thres3), the pixel is very likely to belong to an edge and its value is retained; otherwise, the probability value is set to 0 and the pixel is considered not to belong to an edge.
The edge searching method can include multiple methods, and in the embodiment of the application, the possible edges can be searched based on the structure forest algorithm. The structure forest algorithm is a machine learning algorithm and can ensure that the extracted edge is accurate. In addition, in the embodiment of the application, although the machine learning algorithm is adopted, the machine learning algorithm is only used for edge finding, and the whole process of determining the scanned image does not utilize the machine learning algorithm, so that the calculation amount is greatly reduced compared with that of the related art.
In some cases, edge finding yields the probability values and generates an edge map at the same time. In the edge map, the probability that a pixel belongs to an edge is reflected by the color of the pixel; for example, the whiter the pixel, the larger the probability value.
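As an illustrative sketch only: structured-forest edge detection is available in OpenCV's ximgproc contrib module, which could produce the per-pixel edge probabilities described above. The model file path, the threshold value for Thres3 and the function name below are assumptions.

```python
import cv2
import numpy as np

THRES3 = 0.3  # probability threshold for keeping edge pixels (assumed value)

# Structured-forest edge detector from opencv-contrib; the model file path
# must point to a pretrained structured-edge model (path is an assumption).
edge_detector = cv2.ximgproc.createStructuredEdgeDetection("model.yml.gz")

def edge_filter_map(frame_bgr):
    """Return a per-pixel edge-probability map with weak responses zeroed out."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    prob = edge_detector.detectEdges(rgb)   # probabilities in [0, 1]
    filtered = prob.copy()
    filtered[filtered <= THRES3] = 0.0      # keep only likely edge pixels
    return filtered
```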
Since a certificate is generally rectangular and its outline consists of straight line segments, straight lines can be searched for on the edge filter graph to obtain approximate straight lines, from which a straight-line graph is drawn. Methods of line finding include, but are not limited to, the Hough transform. A straight-line edge filter graph is then obtained from the edge filter graph and the straight-line graph, so that the straight-line graph and the pixels belonging to edges correct each other, which improves the accuracy of the subsequent certificate outline recognition. For example, the straight-line edge filter graph may be determined by intersecting the edge filter graph with the straight-line graph.
Then, a Line Segment Detector (LSD) algorithm is used to convert the straight lines in the straight-line graph into line segments, and the edge line segment set is obtained from the conversion result. The flow of determining the edge line segment set can be seen in fig. 7: an edge map of the pending video frame image is determined based on the structured forest algorithm (S701 in fig. 7). Edge filtering is then performed on the edge map to obtain an edge filter graph (S702 in fig. 7); the edge filtering may check, for each pixel of the edge map, whether its probability value is greater than the threshold Thres3 and, if so, retain the value, otherwise set it to 0, which yields the edge filter graph. Approximate straight lines are then found on the edge filter graph using the Hough transform to draw a straight-line graph (S703 in fig. 7). The edge filter graph and the straight-line graph are intersected to obtain a straight-line edge filter graph (S704 in fig. 7). Finally, the LSD algorithm converts the edges in the straight-line edge filter graph into line segments, yielding the edge line segment set (S705 in fig. 7).
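The sketch below, under the same illustrative assumptions, shows how the straight-line graph, its intersection with the edge filter graph, and the LSD segment conversion might be chained with OpenCV. Note that cv2.createLineSegmentDetector is absent from some OpenCV builds for licensing reasons; the Hough threshold and line thickness are assumed values.

```python
import cv2
import numpy as np

def edge_line_segments(filtered_prob):
    """Hough line map, intersection with the edge filter graph, then LSD segments."""
    edge_u8 = (filtered_prob > 0).astype(np.uint8) * 255  # binarized edge pixels

    # Draw the approximate straight lines found by the Hough transform.
    line_map = np.zeros_like(edge_u8)
    lines = cv2.HoughLines(edge_u8, 1, np.pi / 180, 120)  # threshold assumed
    if lines is None:
        return []
    for rho, theta in lines[:, 0]:
        a, b = np.cos(theta), np.sin(theta)
        x0, y0 = a * rho, b * rho
        p1 = (int(x0 + 2000 * (-b)), int(y0 + 2000 * a))
        p2 = (int(x0 - 2000 * (-b)), int(y0 - 2000 * a))
        cv2.line(line_map, p1, p2, 255, 3)

    # "Straight-line edge filter graph": keep edge pixels lying on a Hough line.
    straight_edge = cv2.bitwise_and(edge_u8, line_map)

    # LSD turns the remaining straight edges into line segments.
    lsd = cv2.createLineSegmentDetector()
    segments = lsd.detect(straight_edge)[0]   # N x 1 x 4 array of (x1, y1, x2, y2)
    return [] if segments is None else segments.reshape(-1, 4)
```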
S203, determining the certificate outline of the certificate to be scanned according to the edge line segment set.
After the terminal equipment determines the edge line segment set, the certificate outline of the certificate to be scanned can be determined according to the edge line segment set.
If the certificate outline is a rectangle, it consists of two pairs of parallel line segments. Therefore, the certificate outline of the certificate to be scanned can be determined from the edge line segment set as follows: a parallel line segment set is determined according to the edge line segment set and the parallel relationship between line segments, and then, for any two pairs of parallel line segments in the parallel line segment set, a quadrangle set is determined according to the adjacent-edge rules. For example, if two pairs of parallel line segments can form a quadrangle under the adjacent-edge rules, the quadrangle is added to the quadrangle set; after all the parallel line segments in the set have been compared, the quadrangle set is obtained. The certificate outline is then determined from the optimal quadrangle selected from the quadrangle set.
Wherein, the parallel relation may include an included angle, a distance, etc.; the adjacent edge rules may include included angles, whether to intersect, intersection points and line segment positions, line segment proportions, line segment lengths, quadrilateral areas, and the like.
In some cases, several line segments in the edge line segment set may actually correspond to the same physical line segment. To avoid such detection errors, before determining the parallel line segment set from the edge line segment set and the parallel relationship between line segments, the line segments in the edge line segment set may be clustered and merged to obtain a second merged line segment set, and the certificate outline is then determined according to the second merged line segment set and the parallel relationship between line segments. Referring to fig. 7, after the edge line segment set is obtained, it is determined whether the edge line segment set is non-empty (S706 in fig. 7); if so, the line segments are clustered and merged according to the relationships between them (S707 in fig. 7), otherwise the processing flow for the pending video frame image is terminated (S708 in fig. 7).
In some embodiments, the line segments included in the edge line segment set may be clustered and merged according to inter-line segment relationships, which include inter-line segment angles, line segment projection overlap ratios, inter-line segment distances, and the like. For example, if the distance between two line segments is very small, the two line segments may be considered to be the same line segment, and the two line segments may be clustered together to form a line segment class. The line segment classes include all line segments that may be the same line segment, and each line segment class includes one or more line segments.
Referring to fig. 8, the edge line segment set is clustered according to the relationships between line segments to obtain line segment classes (S801 in fig. 8), where line segments in different classes are shown in different colors. Then, for each line segment class, the line segment with the highest probability is found according to line segment characteristics such as the degree of coincidence with the edge filter graph, line segment length and line width; its slope and intercept are taken, and all line segments in the class are projected onto it and merged, giving the merged line segment of that class. Finally, the merged line segments of all the line segment classes in the edge line segment set form the first merged line segment set (S802 in fig. 8).
In some cases, during line segment detection, a single line segment may be detected as two line segments because some portion of it was missed; the two segments form one straight line when extended. To reduce this kind of detection error, each line segment in the first merged line segment set may be extended, and the clustering and merging steps above repeated, to obtain a second merged line segment set (S803 in fig. 8). It is then determined whether the second merged line segment set is non-empty (S804 in fig. 8); if so, the certificate outline is determined from the second merged line segment set (S805 in fig. 8), and if not, the processing flow for the pending video frame image is terminated (S806 in fig. 8).
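Purely as an illustration of the clustering-and-merging idea (not the application's exact rules, which also weigh projection overlap, line width and coincidence with the edge filter graph), a greedy sketch in Python might look like this; the tolerances are assumed values and the representative segment is simplified to the longest member of each class.

```python
import numpy as np

def seg_angle(s):
    """Undirected segment angle in [0, pi)."""
    return np.arctan2(s[3] - s[1], s[2] - s[0]) % np.pi

def angle_diff(a, b):
    d = abs(a - b)
    return min(d, np.pi - d)

def point_line_dist(p, s):
    """Perpendicular distance from point p to the infinite line through segment s."""
    x1, y1, x2, y2 = s
    length = max(np.hypot(x2 - x1, y2 - y1), 1e-6)
    return abs((y2 - y1) * p[0] - (x2 - x1) * p[1] + x2 * y1 - y2 * x1) / length

def cluster_and_merge(segments, angle_tol=np.deg2rad(5), dist_tol=8.0):
    """Greedily cluster near-collinear segments, then merge each cluster by
    projecting all endpoints onto its longest member."""
    clusters = []
    for s in segments:
        mid = ((s[0] + s[2]) / 2.0, (s[1] + s[3]) / 2.0)
        for c in clusters:
            if (angle_diff(seg_angle(s), seg_angle(c[0])) < angle_tol
                    and point_line_dist(mid, c[0]) < dist_tol):
                c.append(s)
                break
        else:
            clusters.append([s])

    merged = []
    for c in clusters:
        ref = max(c, key=lambda s: np.hypot(s[2] - s[0], s[3] - s[1]))
        direction = np.array([ref[2] - ref[0], ref[3] - ref[1]], dtype=float)
        direction /= max(np.linalg.norm(direction), 1e-6)
        origin = np.array(ref[:2], dtype=float)
        ts = [float(np.dot(np.asarray(p, dtype=float) - origin, direction))
              for s in c for p in (s[:2], s[2:])]
        p1, p2 = origin + min(ts) * direction, origin + max(ts) * direction
        merged.append((p1[0], p1[1], p2[0], p2[1]))
    return merged
```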
As shown in fig. 9, the certificate outline is determined from the second merged line segment set and the parallel relationship between its line segments: a parallel line segment set is determined according to the parallel relationship between the line segments in the second merged line segment set (S901 in fig. 9), where different pairs of parallel line segments are shown in different colors; then, for any two pairs of parallel line segments in the parallel line segment set, a quadrangle set is determined according to the adjacent-edge rules (S902 in fig. 9). An optimal quadrangle is then determined from the quadrangle set (S903 in fig. 9).
In some embodiments, the optimal quadrangle may be determined by calculating, for each quadrangle in the quadrangle set, a coincidence degree of the quadrangle with the line segments constituting the quadrangle and with the edge filter map E2 to obtain a coincidence probability. Finally, taking the quadrangle with the highest coincidence probability as an optimal quadrangle (Sperfect) for the whole quadrangle set. And judging whether the optimal quadrangle exists (see S904 in FIG. 9), if so, performing stability judgment on the optimal quadrangle (see S905 in FIG. 9), otherwise, ending the processing flow for the target video frame image (see S906 in FIG. 9).
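The following sketch illustrates one way such candidate quadrangles could be built from pairs of near-parallel segments and scored by their coincidence with the edge filter graph. It is a simplification under stated assumptions: only an angle test stands in for the parallel relationship, only convexity and area checks stand in for the full adjacent-edge rules, and the thresholds are invented for illustration.

```python
import itertools
import cv2
import numpy as np

def hom_line(s):
    """Homogeneous line through the segment's endpoints (cross product)."""
    return np.cross([s[0], s[1], 1.0], [s[2], s[3], 1.0])

def intersect(l1, l2):
    p = np.cross(l1, l2)
    return None if abs(p[2]) < 1e-9 else (p[0] / p[2], p[1] / p[2])

def best_quadrangle(segments, edge_filter_graph, angle_tol=np.deg2rad(8)):
    """Build candidate quadrangles from two pairs of near-parallel segments and
    keep the one whose border best coincides with the edge filter graph."""
    def ang(s):
        return np.arctan2(s[3] - s[1], s[2] - s[0]) % np.pi

    pairs = [(a, b) for a, b in itertools.combinations(segments, 2)
             if min(abs(ang(a) - ang(b)), np.pi - abs(ang(a) - ang(b))) < angle_tol]

    h, w = edge_filter_graph.shape[:2]
    best, best_score = None, -1.0
    for (a, b), (c, d) in itertools.combinations(pairs, 2):
        la, lb, lc, ld = hom_line(a), hom_line(b), hom_line(c), hom_line(d)
        corners = [intersect(la, lc), intersect(lb, lc),
                   intersect(lb, ld), intersect(la, ld)]
        if any(p is None for p in corners):
            continue
        quad = np.array(corners, dtype=np.float32)
        if (not cv2.isContourConvex(quad.astype(np.int32))
                or cv2.contourArea(quad) < 0.05 * h * w):   # area rule (assumed)
            continue
        # Coincidence of the quadrangle border with the edge filter graph.
        border = np.zeros((h, w), dtype=np.uint8)
        cv2.polylines(border, [quad.astype(np.int32).reshape(-1, 1, 2)], True, 255, 5)
        score = float(edge_filter_graph[border > 0].sum())
        if score > best_score:
            best, best_score = quad, score
    return best
```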
It should be noted that, in this embodiment, the processing of S202-S203 may be performed on every one of the consecutive video frame images counted toward the second threshold in the stability determination, so a plurality of optimal quadrangles may be obtained. The stability determination here refers to whether the number of consecutive video frame images whose coincidence degree reaches a third threshold reaches a fourth threshold.
Therefore, in S203, the implementation manner may be to calculate a coincidence ratio of the optimal quadrangle corresponding to the target video frame image and the optimal quadrangle corresponding to the previous video frame image, and determine the certificate contour according to a relationship between the coincidence ratio and the third threshold. Wherein, the coincidence degree can refer to the ratio of the intersection and union of the two optimal quadrilaterals.
The flow of determining the certificate outline is shown in fig. 10. The coincidence degree between the optimal quadrangle of the target video frame image and the optimal quadrangle of the previous video frame image is calculated (S1001 in fig. 10): the two optimal quadrangles are intersected to obtain an intersection image and merged to obtain a union image, and the coincidence degree is the ratio between the intersection image and the union image. It is then determined whether the coincidence degree is greater than the third threshold (S1002 in fig. 10); if so, the number of consecutive stable frames is incremented (S1003 in fig. 10). It is then determined whether the number of consecutive stable frames reaches the fourth threshold (S1004 in fig. 10); if so, the optimal quadrangle of the video frame image at which the number of consecutive stable frames reaches the fourth threshold is taken as the certificate outline (S1005 in fig. 10), otherwise the processing flow for the target video frame image is terminated (S1006 in fig. 10).
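A minimal sketch of this coincidence-degree (intersection-over-union) computation, using filled masks in OpenCV; the function name and the example threshold in the usage comment are assumptions.

```python
import cv2
import numpy as np

def quad_overlap_ratio(quad_a, quad_b, frame_shape):
    """Coincidence degree of two quadrangles: intersection area over union area,
    computed on filled masks of the frame size."""
    h, w = frame_shape[:2]
    mask_a = np.zeros((h, w), dtype=np.uint8)
    mask_b = np.zeros((h, w), dtype=np.uint8)
    cv2.fillPoly(mask_a, [np.int32(quad_a)], 255)
    cv2.fillPoly(mask_b, [np.int32(quad_b)], 255)
    inter = cv2.countNonZero(cv2.bitwise_and(mask_a, mask_b))
    union = cv2.countNonZero(cv2.bitwise_or(mask_a, mask_b))
    return inter / union if union else 0.0

# Usage against the flow in FIG. 10 (the 0.9 threshold is an assumed value):
# if quad_overlap_ratio(best_quad, prev_best_quad, frame.shape) > 0.9:
#     stable_quad_count += 1
```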
And S204, processing the target video frame image according to the certificate outline to obtain a target area image, wherein the target area image is an image corresponding to an area surrounded by the certificate outline.
And S205, displaying the scanned image of the certificate to be scanned to a user according to the target area image.
The mode of processing the target video frame image according to the certificate outline can be to cut the target video frame image according to the certificate outline so as to obtain a target area image.
In some cases, if the document outline is rectangular, the quadrilateral obtained in the previous step may not be a standard rectangle, and therefore, after obtaining the target area image, the target area image may be converted into a rectangular image by using perspective transformation.
After the rectangular image is obtained, in order to improve the image definition, the rectangular image may be sharpened according to the color distribution to obtain a scanned image.
As shown in fig. 11, the specific process may be as follows: the target video frame image is cut according to the certificate outline to obtain the target area image, a rectangular image is obtained from the target area image by perspective transformation, the rectangular image is sharpened to obtain the scanned image, and the process then terminates.
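As an illustrative sketch of the crop, warp and sharpen step with OpenCV: the output size, the corner ordering convention and the sharpening kernel are assumptions; the application only specifies a perspective transformation followed by sharpening according to the color distribution.

```python
import cv2
import numpy as np

def scanned_image_from_quad(frame_bgr, quad, out_w=856, out_h=540):
    """Warp the area enclosed by the certificate outline to a rectangle and
    sharpen it. Corners in quad are assumed ordered top-left, top-right,
    bottom-right, bottom-left to match the destination rectangle."""
    src = np.float32(quad)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    m = cv2.getPerspectiveTransform(src, dst)
    rect = cv2.warpPerspective(frame_bgr, m, (out_w, out_h))

    # Simple unsharp-style kernel as a stand-in for the color-distribution-based
    # sharpening described in the text.
    kernel = np.array([[0, -1, 0],
                       [-1, 5, -1],
                       [0, -1, 0]], dtype=np.float32)
    return cv2.filter2D(rect, -1, kernel)
```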
And the terminal equipment displays the scanned image to a user after determining the scanned image according to the target area image. The scanned image is displayed on an editing page, that is, the terminal device can display the scanned image through the editing page, and the editing page includes an editing operation area. An editing page can be seen in fig. 12, in which a scanned image 1201 and an editing operation area 1202 are included, and in the editing operation area 1202, a user can perform various editing operations, such as filling out a name, a telephone call, a mailbox, adding a material, adding an address book, saving, and the like. When the user performs an editing operation in the editing operation area, the terminal may perform processing on the scanned image in response to the editing operation, such as saving the scanned image, or adding related information to the scanned image, such as filling out a name, a telephone, a mailbox, and the like.
According to the technical scheme, when a certificate needs to be scanned, image processing can be performed directly on the target video frame image including the certificate to be scanned to obtain the scanned image. Specifically, after the user triggers the scanning operation, a target video frame image including the certificate to be scanned is acquired, and edge detection and filtering are performed on it in real time to obtain an edge line segment set. The certificate outline of the certificate to be scanned is then determined from the edge line segment set; the image corresponding to the area surrounded by the certificate outline is the image of the certificate to be scanned with the background removed. The target video frame image can therefore be processed according to the certificate outline to obtain a target area image, the scanned image of the certificate to be scanned is determined from the target area image, and the scanned image is displayed to the user. In this way, the certificate outline is located directly by an image processing method to obtain the scanned image of the certificate, without a large amount of complex deep-learning computation, which greatly reduces computational complexity, improves computational efficiency, and enables real-time certificate scanning.
Through real-time edge detection and filtering, accurate certificate contour finding, and intelligent capture, correction and sharpening, the method provided by the application gives the user a convenient, efficient, free and high-quality certificate scanning scheme without requiring the user to go to a specific place, be restricted to a specific scene, spend a large amount of time and energy, or bear an economic cost. The scheme has a good imaging effect and improves user satisfaction with the quality of the certificate images; it has low requirements on the shooting environment and handles complex scenes freely, lowering the usage threshold for users; and it can automatically correct, optimize and adjust images to scan various certificates, improving efficiency for users.
Next, the embodiments of the present application are described with reference to a practical application scenario, as shown in fig. 13. In this scenario, video frame images are input from the camera (S1301). Starting from the third video frame, a corresponding difference map is obtained for each video frame image based on the three-frame difference method (S1302); it is determined whether C1 is smaller than the first threshold (S1303); if so, it is determined whether C2 reaches the second threshold (S1304); and if so, an edge map is determined based on the structured forest algorithm (S1305). Edge filtering is performed on the edge map to obtain an edge filter graph (S1306); a straight-line graph is drawn by Hough transform according to the edge filter graph to obtain the edge line segment set (S1307); it is determined whether the edge line segment set is non-empty (S1308); if so, the edge line segment set is clustered according to the relationships between line segments to obtain line segment classes (S1309); all line segments in the edge line segment set are merged based on line segment characteristics to obtain a first merged line segment set (S1310); the merged line segments are extended, clustered and merged again to obtain a second merged line segment set (S1311); it is determined whether the second merged line segment set is non-empty (S1312); if so, a quadrangle set is determined from the second merged line segment set (S1313); it is determined whether the quadrangle set is non-empty (S1314); if so, an optimal quadrangle is determined from the quadrangle set (S1315); it is determined whether the coincidence degree with the optimal quadrangle corresponding to the previous frame exceeds the third threshold (S1316); if so, it is determined whether the number of consecutive stable frames exceeds the fourth threshold (S1317); if so, the target video frame image is cut according to the certificate outline to obtain the target area image (S1318), a rectangular image is obtained by performing perspective transformation on the target area image (S1319), and the scanned image is obtained by sharpening the rectangular image (S1320, S1321). In any of the judgment steps, if the answer is no, the process returns to S1301. S1302-S1304 may be referred to as stability determination 1, S1305-S1307 as edge filtering detection, S1309-S1311 as cluster merging, S1313-S1315 as optimal quadrangle determination, S1316-S1317 as stability determination 2, and S1318-S1320 as adaptive image processing.
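For orientation only, the driver below chains the hypothetical helpers sketched earlier in this description (update_stability, edge_filter_map, edge_line_segments, cluster_and_merge, best_quadrangle, quad_overlap_ratio, scanned_image_from_quad); it assumes those definitions are in scope, and the thresholds are invented. It omits the extend-and-recluster pass and is not the application's exact flow.

```python
import cv2

def scan_certificate(camera_index=0):
    """Illustrative end-to-end loop over camera frames."""
    cap = cv2.VideoCapture(camera_index)
    frames, prev_quad, stable_quads = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
        if len(frames) < 3:
            continue
        # Stability determination 1 (three-frame difference).
        if not update_stability(frames[-1], frames[-2], frames[-3]):
            continue
        # Edge filtering detection, cluster merging, optimal quadrangle.
        filtered = edge_filter_map(frame)
        segments = cluster_and_merge(edge_line_segments(filtered))
        quad = best_quadrangle(segments, filtered)
        if quad is None:
            continue
        # Stability determination 2 (inter-frame coincidence of quadrangles).
        if prev_quad is not None and \
                quad_overlap_ratio(quad, prev_quad, frame.shape) > 0.9:
            stable_quads += 1
        else:
            stable_quads = 0
        prev_quad = quad
        if stable_quads >= 4:  # fourth threshold (assumed value)
            cap.release()
            return scanned_image_from_quad(frame, quad)
    cap.release()
    return None
```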
Based on the foregoing method, an embodiment of the present application further provides a document scanning apparatus, referring to fig. 14, the apparatus includes an acquisition unit 1401, a first determination unit 1402, a second determination unit 1403, a processing unit 1404, and a display unit 1405:
the acquisition unit 1401 is used for responding to the scanning trigger operation and acquiring a target video frame image comprising a certificate to be scanned;
the first determining unit 1402 is configured to perform edge detection filtering on the target video frame image to obtain an edge line segment set;
the second determining unit 1403 is configured to determine a certificate outline of the certificate to be scanned according to the edge line segment set;
the processing unit 1404 is configured to cut out a target area image from the target video frame image according to the certificate outline, where the target area image is an image corresponding to an area surrounded by the certificate outline;
the display unit 1405 is configured to display the scanned image of the document to be scanned to a user according to the target area image.
In a possible implementation manner, before determining the certificate outline of the certificate to be scanned according to the target video frame image, the first determining unit 1402 is further configured to:
judging the stability of the collected video frame image;
and determining the target video frame image from the acquired video frame images according to the stability judgment result.
In a possible implementation manner, the first determining unit 1402 is further configured to:
obtaining a difference image based on a three-frame difference method according to the image of the video frame to be determined; the video frame image to be determined is a video frame image with the frame number larger than 3;
if the number of the pixels with the pixel values larger than zero in the difference image is smaller than a first threshold value, the stability judgment is passed;
and if the number of the continuous video frame images passing the stability judgment reaches a second threshold value, taking the to-be-determined video frame image as the target video frame image, and executing the step of determining the certificate outline of the to-be-scanned certificate according to the target video frame image.
In a possible implementation manner, if the certificate contour is a rectangle, the second determining unit 1403 is configured to:
determining a parallel line segment set according to the edge line segment set and the parallel relation between the line segments;
aiming at any two pairs of parallel line segments in the parallel line segment set, determining a quadrilateral set according to an adjacent edge rule;
and determining the certificate outline according to the optimal quadrangle determined from the quadrangle set.
In a possible implementation manner, the second determining unit 1403 is configured to:
calculating the contact ratio of the optimal quadrangle corresponding to the target video frame image and the optimal quadrangle corresponding to the previous frame video frame image;
and determining the certificate profile according to the relation between the contact ratio and a third threshold value.
In a possible implementation manner, the first determining unit 1402 is configured to:
performing edge search on the target video frame image, and determining an edge filter graph according to the probability value of each pixel point belonging to the edge;
performing straight line search according to the edge filter graph, and drawing a straight line graph;
acquiring a linear edge filtering graph according to the edge filtering graph and the linear graph;
and converting the straight lines in the straight line graph into line segments by using a line segment detection algorithm, and obtaining an edge line segment set according to a conversion result.
In a possible implementation manner, the display unit 1405 is configured to display the scanned image through an editing page, where the editing page includes an editing operation area;
the display unit 1405 is also configured to:
and processing the scanned image in response to the editing operation of the user in the editing operation area.
The embodiment of the application also provides a device for certificate scanning, which is described below with reference to the drawings. Referring to fig. 15, an embodiment of the present application provides a device 1500 for certificate scanning. The device 1500 may be a terminal device, which may be any intelligent terminal such as a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a Point of Sales (POS) terminal or a vehicle-mounted computer. Taking the terminal device being a mobile phone as an example:
fig. 15 is a block diagram illustrating a partial structure of a mobile phone related to a terminal device provided in an embodiment of the present application. Referring to fig. 15, the cellular phone includes: a Radio Frequency (RF) circuit 1510, a memory 1520, an input unit 1530, a display unit 1540, a sensor 1550, an audio circuit 1560, a wireless fidelity (WiFi) module 1570, a processor 1580, and a power supply 1590. Those skilled in the art will appreciate that the handset configuration shown in fig. 15 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the mobile phone in detail with reference to fig. 15:
the RF circuit 1510 may be configured to receive and transmit signals during information transmission and reception or during a call, and in particular, receive downlink information from a base station and process the received downlink information to the processor 1580; in addition, the data for designing uplink is transmitted to the base station. In general, RF circuit 1510 includes, but is not limited to, an antenna, at least one Amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, RF circuit 1510 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), and the like.
The memory 1520 may be used to store software programs and modules, and the processor 1580 performs various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1520. The memory 1520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile phone. Further, the memory 1520 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The input unit 1530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the cellular phone. In particular, the input unit 1530 may include a touch panel 1531 and other input devices 1532. The touch panel 1531, also referred to as a touch screen, can collect touch operations of a user (e.g., operations of the user on or near the touch panel 1531 using any suitable object or accessory such as a finger or a stylus) and drive corresponding connection devices according to a preset program. Alternatively, the touch panel 1531 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, and sends the touch point coordinates to the processor 1580, and can receive and execute commands sent by the processor 1580. In addition, the touch panel 1531 may be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 1530 may include other input devices 1532 in addition to the touch panel 1531. In particular, other input devices 1532 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 1540 may be used to display information input by the user or information provided to the user and various menus of the mobile phone. The Display unit 1540 may include a Display panel 1541, and optionally, the Display panel 1541 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 1531 may cover the display panel 1541, and when the touch panel 1531 detects a touch operation on or near the touch panel 1531, the touch operation is transmitted to the processor 1580 to determine a type of the touch event, and then the processor 1580 provides a corresponding visual output on the display panel 1541 according to the type of the touch event. Although in fig. 15 the touch panel 1531 and the display panel 1541 are shown as two separate components to implement the input and output functions of the mobile phone, in some embodiments the touch panel 1531 and the display panel 1541 may be integrated to implement the input and output functions of the mobile phone.
The handset can also include at least one sensor 1550, such as light sensors, motion sensors, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1541 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1541 and/or the backlight when the mobile phone is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing gestures of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometers and taps), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
Audio circuitry 1560, speaker 1561, and microphone 1562 may provide an audio interface between a user and a cell phone. The audio circuit 1560 may transmit the electrical signal converted from the received audio data to the speaker 1561, and convert the electrical signal into an audio signal by the speaker 1561 and output the audio signal; on the other hand, the microphone 1562 converts collected sound signals into electrical signals, which are received by the audio circuit 1560 and converted into audio data, which are processed by the audio data output processor 1580 and then passed through the RF circuit 1510 for transmission to, for example, another cellular phone, or for output to the memory 1520 for further processing.
WiFi belongs to a short-distance wireless transmission technology, a mobile phone can help a user to receive and send electronic mails, browse webpages, access streaming media and the like through a WiFi module 1570, and wireless broadband internet access is provided for the user. Although fig. 15 shows WiFi module 1570, it is understood that it does not belong to the essential constitution of the handset and can be omitted entirely as needed within the scope not changing the essence of the invention.
The processor 1580 is a control center of the mobile phone, connects various parts of the entire mobile phone by using various interfaces and lines, and performs various functions of the mobile phone and processes data by operating or executing software programs and/or modules stored in the memory 1520 and calling data stored in the memory 1520, thereby integrally monitoring the mobile phone. Optionally, the processor 1580 may include one or more processing units; preferably, the processor 1580 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, and the like, and a modem processor, which mainly handles wireless communications. It is to be appreciated that the modem processor may not be integrated into the processor 1580.
The handset also includes a power supply 1590 (e.g., a battery) for powering the various components; preferably, the power supply may be logically coupled to the processor 1580 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which will not be described herein.
In this embodiment, the processor 1580 included in the terminal device further has the following functions:
in response to a scanning trigger operation, acquiring a target video frame image comprising a certificate to be scanned;
performing edge detection filtering on the target video frame image to obtain an edge line segment set;
determining the certificate outline of the certificate to be scanned according to the edge line segment set;
processing the target video frame image according to the certificate outline to obtain a target area image, wherein the target area image is an image corresponding to an area surrounded by the certificate outline;
and displaying the scanned image of the certificate to be scanned to a user according to the target area image.
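For illustration only, the following is a minimal sketch, assuming an OpenCV-based implementation, of the last two steps listed above: once the certificate outline has been located, the region it encloses is warped into an axis-aligned scanned image that can be displayed to the user. The function name, the corner ordering, and the ID-1 output size are assumptions made for the example, not details taken from this application.

```python
import cv2
import numpy as np

def warp_certificate(frame_bgr, outline_pts, out_size=(856, 540)):
    """outline_pts: four (x, y) corners of the detected certificate outline,
    ordered top-left, top-right, bottom-right, bottom-left (assumed order)."""
    w, h = out_size                             # ID-1 card ratio (85.6 : 54.0) assumed
    src = np.float32(outline_pts)
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    M = cv2.getPerspectiveTransform(src, dst)   # map the outline onto a flat rectangle
    return cv2.warpPerspective(frame_bgr, M, (w, h))
```

The perspective warp is what removes the background: only the quadrilateral bounded by the certificate outline contributes pixels to the output scan.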
Referring to fig. 16, fig. 16 is a structural diagram of a server 1600 provided in an embodiment of the present application. The server 1600 may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 1622 (e.g., one or more processors), a memory 1632, and one or more storage media 1630 (e.g., one or more mass storage devices) storing an application program 1642 or data 1644. The memory 1632 and the storage medium 1630 may provide transient or persistent storage. The program stored on the storage medium 1630 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processing unit 1622 may be configured to communicate with the storage medium 1630 and execute, on the server 1600, the series of instruction operations stored in the storage medium 1630.
The server 1600 may also include one or more power supplies 1626, one or more wired or wireless network interfaces 1650, one or more input/output interfaces 1658, and/or one or more operating systems 1641, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
In this embodiment, the steps performed by the server in the above-described embodiment may be performed by the structure shown in fig. 16.
Embodiments of the present application also provide a computer-readable storage medium for storing program code for executing the method of any one of the foregoing embodiments.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged under appropriate circumstances such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on such an understanding, the part of the technical solution of the present application that in essence contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (6)

1. A certificate scanning method, the method comprising:
in response to a scanning trigger operation, acquiring a target video frame image comprising a certificate to be scanned;
performing edge searching on the target video frame image, and determining an edge filter graph according to the probability value that each pixel point belongs to an edge;
performing straight line search according to the edge filter graph, and drawing a straight line graph;
taking the intersection of the edge filter graph and the straight line graph to obtain a straight-line edge filter graph;
converting the edges in the straight-line edge filter graph into line segments by using a line segment detection algorithm, and obtaining an edge line segment set according to the conversion result;
clustering the edge line segment set according to the relationships between the line segments to obtain a line segment class set;
for each line segment class in the line segment class set, merging the line segments in the class according to line segment characteristics to obtain a first merged line segment set, wherein the line segment characteristics comprise the coincidence degree between a line segment and the edge filter graph, the line segment length, and the line width;
extending each line segment in the first merged line segment set, and repeatedly executing the clustering and merging steps to obtain a second merged line segment set;
determining a parallel line segment set according to the second merged line segment set and the parallel relationships between the line segments;
for any two pairs of parallel line segments in the parallel line segment set, determining a quadrangle set according to an adjacent-side rule;
for each quadrangle in the quadrangle set, calculating the coincidence degree between the quadrangle and the line segments forming it and the coincidence degree between the quadrangle and the edge filter graph to obtain a coincidence probability;
for the whole quadrangle set, taking the quadrangle with the highest coincidence probability as the optimal quadrangle;
judging the stability of the collected video frame image;
determining the target video frame image from the collected video frame images according to the stability judgment result;
calculating the coincidence degree between the optimal quadrangle corresponding to the target video frame image and the optimal quadrangle corresponding to the previous video frame image;
determining the certificate outline according to the relationship between the coincidence degree and a third threshold;
processing the target video frame image according to the certificate outline to obtain a target area image, wherein the target area image is an image corresponding to an area surrounded by the certificate outline;
and displaying the scanned image of the certificate to be scanned to a user according to the target area image.
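As a non-authoritative illustration of the quadrangle-selection steps in claim 1 (pairing near-parallel line segments, forming candidate quadrangles under an adjacent-side constraint, and scoring each candidate by its coincidence with the edge filter graph), the sketch below assumes OpenCV and NumPy; all function names, thresholds, and the brute-force pairing strategy are assumptions for the example, not the claimed implementation, and the clustering, merging, and segment-extension steps are omitted.

```python
import itertools
import cv2
import numpy as np

def seg_angle(seg):
    """Direction of a segment in [0, pi), ignoring orientation."""
    x1, y1, x2, y2 = seg
    return np.arctan2(y2 - y1, x2 - x1) % np.pi

def acute_diff(a1, a2):
    """Smallest angle between two segment directions."""
    d = abs(a1 - a2)
    return min(d, np.pi - d)

def line_intersection(s1, s2):
    """Intersection of the infinite lines through two segments, or None if parallel."""
    x1, y1, x2, y2 = s1
    x3, y3, x4, y4 = s2
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(den) < 1e-9:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def best_quadrangle(segments, edge_map,
                    parallel_tol=np.deg2rad(5), corner_min=np.deg2rad(60)):
    """segments: iterable of (x1, y1, x2, y2); edge_map: single-channel uint8 map.
    Returns (corners as a float32 (4, 2) array or None, best coincidence score)."""
    # Group segments whose directions are nearly identical ("parallel line segments").
    pairs = [(a, b) for a, b in itertools.combinations(segments, 2)
             if acute_diff(seg_angle(a), seg_angle(b)) < parallel_tol]
    best, best_score = None, -1.0
    for (a, b), (c, d) in itertools.combinations(pairs, 2):
        # Rough adjacent-side rule: the two pairs must meet at a wide enough angle.
        if acute_diff(seg_angle(a), seg_angle(c)) < corner_min:
            continue
        corners = [line_intersection(a, c), line_intersection(c, b),
                   line_intersection(b, d), line_intersection(d, a)]
        if any(p is None for p in corners):
            continue
        # Coincidence score: fraction of the candidate border lying on edge pixels.
        border = np.zeros_like(edge_map)
        cv2.polylines(border, [np.int32(corners)], True, 255, 3)
        overlap = cv2.countNonZero(cv2.bitwise_and(border, edge_map))
        score = overlap / max(cv2.countNonZero(border), 1)
        if score > best_score:
            best, best_score = np.float32(corners), score
    return best, best_score
```

The exhaustive pairing above is O(n^4) in the number of segments and is only meant to make the scoring idea concrete; the candidate with the highest coincidence score plays the role of the optimal quadrangle in the claim.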
2. The method of claim 1, wherein the judging the stability of the collected video frame images comprises:
obtaining a difference image based on a three-frame difference method according to a to-be-determined video frame image, wherein the to-be-determined video frame image is a video frame image whose frame number is greater than 3;
if the number of pixels whose pixel values are greater than zero in the difference image is smaller than a first threshold, the stability judgment is passed;
and wherein the determining the target video frame image from the collected video frame images according to the stability judgment result comprises:
if the number of consecutive video frame images passing the stability judgment reaches a second threshold, taking the to-be-determined video frame image as the target video frame image, and executing the step of determining the certificate outline of the certificate to be scanned according to the target video frame image.
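The following is a rough sketch, under assumed parameter values, of a three-frame-difference stability test of the kind described in claim 2; the binarization threshold and the pixel-count and consecutive-frame thresholds (the claim's first and second thresholds) are illustrative only.

```python
import cv2

def frame_is_stable(prev2_gray, prev1_gray, curr_gray,
                    diff_thresh=15, first_threshold=500):
    """Three-frame difference: a frame counts as stable when few pixels changed
    in both of the last two frame-to-frame differences."""
    d1 = cv2.absdiff(curr_gray, prev1_gray)
    d2 = cv2.absdiff(prev1_gray, prev2_gray)
    m1 = cv2.threshold(d1, diff_thresh, 255, cv2.THRESH_BINARY)[1]
    m2 = cv2.threshold(d2, diff_thresh, 255, cv2.THRESH_BINARY)[1]
    motion = cv2.bitwise_and(m1, m2)          # the "difference image"
    return cv2.countNonZero(motion) < first_threshold

# The target frame would then be chosen only after enough consecutive stable
# frames (the claim's second threshold), e.g.:
#   stable_run = stable_run + 1 if frame_is_stable(f0, f1, f2) else 0
#   if stable_run >= second_threshold: target_frame = f2
```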
3. The method of claim 1, wherein the displaying the scanned image of the certificate to be scanned to a user according to the target area image comprises:
displaying the scanned image through an editing page, wherein the editing page comprises an editing operation area;
the method further comprises the following steps:
and processing the scanned image in response to the editing operation of the user in the editing operation area.
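Purely as an illustration of responding to an editing operation in the editing operation area, the sketch below applies a few assumed operations (rotation, grayscale conversion, simple contrast enhancement) to the scanned image; the operation names do not come from this application.

```python
import cv2

def apply_edit(scan_bgr, operation):
    """Apply one editing operation selected in the editing operation area."""
    if operation == "rotate_cw":              # rotate the scan 90 degrees clockwise
        return cv2.rotate(scan_bgr, cv2.ROTATE_90_CLOCKWISE)
    if operation == "grayscale":              # document-style black and white
        return cv2.cvtColor(scan_bgr, cv2.COLOR_BGR2GRAY)
    if operation == "enhance":                # simple contrast/brightness boost
        return cv2.convertScaleAbs(scan_bgr, alpha=1.3, beta=10)
    return scan_bgr                           # unknown operation: leave unchanged
```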
4. A certificate scanning device, characterized in that the device comprises an acquisition unit, a first determining unit, a second determining unit, a processing unit, and a display unit:
the acquisition unit is used for acquiring, in response to a scanning trigger operation, a target video frame image comprising a certificate to be scanned;
the first determining unit is used for performing edge searching on the target video frame image and determining an edge filter graph according to the probability value that each pixel point belongs to an edge; performing straight line search according to the edge filter graph, and drawing a straight line graph; taking the intersection of the edge filter graph and the straight line graph to obtain a straight-line edge filter graph; and converting the edges in the straight-line edge filter graph into line segments by using a line segment detection algorithm, and obtaining an edge line segment set according to the conversion result;
the second determining unit is used for clustering the edge line segment set according to the relationships between the line segments to obtain a line segment class set; for each line segment class in the line segment class set, merging the line segments in the class according to line segment characteristics to obtain a first merged line segment set, wherein the line segment characteristics comprise the coincidence degree between a line segment and the edge filter graph, the line segment length, and the line width; extending each line segment in the first merged line segment set, and repeatedly executing the clustering and merging steps to obtain a second merged line segment set; determining a parallel line segment set according to the second merged line segment set and the parallel relationships between the line segments; for any two pairs of parallel line segments in the parallel line segment set, determining a quadrangle set according to an adjacent-side rule; for each quadrangle in the quadrangle set, calculating the coincidence degree between the quadrangle and the line segments forming it and the coincidence degree between the quadrangle and the edge filter graph to obtain a coincidence probability; taking the quadrangle with the highest coincidence probability in the whole quadrangle set as the optimal quadrangle; judging the stability of collected video frame images; determining the target video frame image from the collected video frame images according to the stability judgment result; calculating the coincidence degree between the optimal quadrangle corresponding to the target video frame image and the optimal quadrangle corresponding to the previous video frame image; and determining the certificate outline according to the relationship between the coincidence degree and a third threshold;
the processing unit is used for processing the target video frame image according to the certificate outline to obtain a target area image, and the target area image is an image corresponding to an area surrounded by the certificate outline;
and the display unit is used for displaying the scanned image of the certificate to be scanned to a user according to the target area image.
5. An apparatus for certificate scanning, the apparatus comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method of any of claims 1-3 according to instructions in the program code.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium is configured to store a program code for performing the method of any of claims 1-3.
CN202010273442.9A 2020-04-09 2020-04-09 Certificate scanning method, device, equipment and storage medium Active CN111464716B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010273442.9A CN111464716B (en) 2020-04-09 2020-04-09 Certificate scanning method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN111464716A CN111464716A (en) 2020-07-28
CN111464716B true CN111464716B (en) 2022-08-19

Family

ID=71678468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010273442.9A Active CN111464716B (en) 2020-04-09 2020-04-09 Certificate scanning method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111464716B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112749613B (en) * 2020-08-27 2024-03-26 腾讯科技(深圳)有限公司 Video data processing method, device, computer equipment and storage medium
CN112183517B (en) * 2020-09-22 2023-08-11 平安科技(深圳)有限公司 Card edge detection method, device and storage medium
CN112163575B (en) * 2020-09-23 2023-11-03 平安科技(深圳)有限公司 Certificate input detection method, device, equipment and storage medium
WO2022170554A1 (en) * 2021-02-10 2022-08-18 Oppo广东移动通信有限公司 Image display method, terminal, chip and storage medium
WO2022222047A1 (en) * 2021-04-20 2022-10-27 Oppo广东移动通信有限公司 Method and apparatus for scanning document, storage medium, and electronic device
CN113763235A (en) * 2021-09-08 2021-12-07 北京琥珀创想科技有限公司 Method for converting picture into scanning piece and intelligent mobile terminal
CN114234796A (en) * 2021-10-26 2022-03-25 深圳市裕展精密科技有限公司 Hole detection method, hole detection device and hole detection equipment
CN114241485A (en) * 2022-02-24 2022-03-25 深圳大道云科技有限公司 Information identification method, device, equipment and storage medium of property certificate

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109729231A (en) * 2018-12-17 2019-05-07 中国科学院深圳先进技术研究院 A kind of file scanning method, device and equipment
CN110248037A (en) * 2019-05-30 2019-09-17 苏宁金融服务(上海)有限公司 A kind of identity document scan method and device
CN110929715A (en) * 2019-11-26 2020-03-27 深圳市信联征信有限公司 Intelligent scanning method and device for terminal identity card and terminal

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07195885A (en) * 1993-12-29 1995-08-01 Seki Electron Kk Method for preparing certification identification card containing photograph of face
CN101625760A (en) * 2009-07-28 2010-01-13 谭洪舟 Method for correcting certificate image inclination
WO2017050739A1 (en) * 2015-09-24 2017-03-30 Sicpa Holding Sa Remote passport and security document marking
CN109711415A (en) * 2018-11-13 2019-05-03 平安科技(深圳)有限公司 Certificate profile determines method, apparatus and storage medium, server
CN110415183A (en) * 2019-06-18 2019-11-05 平安科技(深圳)有限公司 Picture bearing calibration, device, computer equipment and computer readable storage medium
CN110414502B (en) * 2019-08-02 2022-04-01 泰康保险集团股份有限公司 Image processing method and device, electronic equipment and computer readable medium


Also Published As

Publication number Publication date
CN111464716A (en) 2020-07-28

Similar Documents

Publication Publication Date Title
CN111464716B (en) Certificate scanning method, device, equipment and storage medium
US11846877B2 (en) Method and terminal for acquiring panoramic image
US11704771B2 (en) Training super-resolution convolutional neural network model using a high-definition training image, a low-definition training image, and a mask image
CN108875451B (en) Method, device, storage medium and program product for positioning image
CN105528606B (en) Area recognizing method and device
CN108885614B (en) Text and voice information processing method and terminal
US11055547B2 (en) Unlocking control method and related products
CN108256853B (en) Payment method and mobile terminal
CN108022274B (en) Image processing method, image processing device, computer equipment and computer readable storage medium
WO2015100913A1 (en) Image thumbnail generation method and device, and terminal
CN108958576B (en) Content identification method and device and mobile terminal
CN110969056B (en) Document layout analysis method, device and storage medium for document image
CN110431563B (en) Method and device for correcting image
CN110674662A (en) Scanning method and terminal equipment
CN110463177A (en) The bearing calibration of file and picture and device
CN109684277B (en) Image display method and terminal
CN109085982B (en) Content identification method and device and mobile terminal
CN110909209A (en) Live video searching method and device, equipment, server and storage medium
CN112488914A (en) Image splicing method, device, terminal and computer readable storage medium
CN108510267B (en) Account information acquisition method and mobile terminal
CN111353946B (en) Image restoration method, device, equipment and storage medium
CN112541489A (en) Image detection method and device, mobile terminal and storage medium
CN110362699B (en) Picture searching method and device, mobile terminal and computer readable medium
CN108255389B (en) Image editing method, mobile terminal and computer readable storage medium
CN114140655A (en) Image classification method and device, storage medium and electronic equipment

Legal Events

Code Title Description
PB01 Publication
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 40025935; country of ref document: HK)
SE01 Entry into force of request for substantive examination
GR01 Patent grant