WO2021147437A1 - Card edge detection method, device, and storage medium (证卡边缘检测方法、设备及存储介质) - Google Patents

Card edge detection method, device, and storage medium (证卡边缘检测方法、设备及存储介质)

Info

Publication number
WO2021147437A1
WO2021147437A1 (PCT/CN2020/125083, CN2020125083W)
Authority
WO
WIPO (PCT)
Prior art keywords
edge
card
target frame
key point
iteration
Prior art date
Application number
PCT/CN2020/125083
Other languages
English (en)
French (fr)
Inventor
张国辉
雷晨雨
宋晨
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021147437A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V 10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06F 18/24 Classification techniques
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06T 7/13 Edge detection
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components

Definitions

  • This application belongs to the field of image processing technology, and in particular relates to a method, device and storage medium for detecting the edge of a card.
  • With the massive use of various cards such as ID cards, social security cards, and bank cards, related card identification services have followed. Among them, edge (frame) detection of these various cards is a very important part of card recognition.
  • Current card edge detection mainly uses neural networks or traditional edge detection algorithms to find all the edge information in an image, and then applies various filtering conditions to remove some of it and obtain the edge information of the card.
  • The inventor realized that the above method is prone to misjudgment when the background is complex or the edges are blurred, leading to edge detection errors that affect subsequent services such as the extraction of card information. Moreover, the computational efficiency of the above method is very low, and it does not support card edge detection on mobile terminals.
  • One of the objectives of the embodiments of the present application is to provide a card edge detection method, device, and storage medium, so as to solve the technical problems that card edge detection methods in the prior art are prone to misjudgment and have low computational efficiency.
  • In a first aspect, an embodiment of the present application provides a card edge detection method. The method includes: obtaining a target frame to be processed in a target video; according to the position of the target frame, obtaining first key point information of an adjacent frame adjacent to the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card; inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, wherein the determination information is used to characterize whether the target frame contains a card; and determining a first card detection result of the target frame according to the second key point information and the determination information.
  • an embodiment of the present application provides a card edge detection device, including:
  • the first obtaining module is used to obtain the target frame to be processed in the target video
  • the second acquisition module is configured to acquire first key point information of an adjacent frame adjacent to the target frame according to the position of the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card;
  • the position tracking module is used to input the first key point information and the target frame into a preset key point position tracking model to obtain the second key point information of the target frame and the determination information of the target frame, wherein the determination information is used to characterize whether the target frame contains a card;
  • the first determining module is configured to determine the first card detection result of the target frame according to the second key point information and the determination information.
  • In a third aspect, an embodiment of the present application provides a card edge detection device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following when executing the computer program: obtaining a target frame to be processed in a target video; according to the position of the target frame, obtaining first key point information of an adjacent frame adjacent to the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card; inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, wherein the determination information is used to characterize whether the target frame contains a card; and determining a first card detection result of the target frame according to the second key point information and the determination information.
  • the embodiments of the present application provide a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • The computer-readable storage medium stores a computer program which, when executed by a processor, implements: obtaining a target frame to be processed in a target video; according to the position of the target frame, obtaining first key point information of an adjacent frame adjacent to the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card; inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, wherein the determination information is used to characterize whether the target frame contains a card; and determining a first card detection result of the target frame according to the second key point information and the determination information.
  • Compared with the prior art, the embodiments of the present application have the following beneficial effects. The card edge detection method provided by the embodiments obtains the first key point information of the adjacent frame adjacent to the target frame according to the position of the target frame. Because the adjacent frame is a video frame that precedes the target frame on the time axis of the target video and contains a card, its first key point information can be used as the initial constraint position of the key points of the target frame. Key point tracking (prediction by the key point position tracking model) is then performed according to the first key point information of the adjacent frame to obtain the second key point information of the target frame, and the first card detection result of the target frame is determined according to the second key point information. Compared with prior-art methods that determine the object edge information of the target frame directly with an edge detection algorithm, the card edge detection method provided by this application is less affected by complex backgrounds and/or blurred edges of the video frames, has a small detection error, and its key point tracking model requires no feature point matching.
  • FIG. 1 is a schematic flow chart of a card edge detection method provided by an embodiment of this application.
  • Figure 2 is a schematic diagram of the first card provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of a process for obtaining second key point information according to an embodiment of the application
  • FIG. 4 is a schematic flowchart of a card edge detection method provided by another embodiment of this application.
  • FIG. 5 is a schematic diagram of a process for determining a detection result of a second card according to an embodiment of the application
  • FIG. 6 is a schematic flowchart of determining the edge line corresponding to each edge region according to an embodiment of the application
  • FIG. 7 is a schematic diagram of a first edge area and a first direction provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of a sub-image provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of a network structure of an encoder provided by an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of a card edge detection device provided by an embodiment of the application.
  • FIG. 11 is a schematic structural diagram of a card edge detection device provided by another embodiment of this application.
  • FIG. 12 is a schematic diagram of the hardware composition of a card edge detection device provided by an embodiment of the present application.
  • the card edge detection method, device and storage medium provided in this application are applicable to the field of artificial intelligence and image processing technology.
  • Fig. 1 is a schematic flow chart of a card edge detection method provided by an embodiment of this application, which is suitable for execution in a terminal device or a server. As shown in Fig. 1, the method includes:
  • the target video includes M consecutive video frames, which are the first frame, the second frame...the Mth frame, the target frame can be any frame in the target video, and M is an integer greater than 1.
  • the first key point information includes the corner point information of the card.
  • The position where the target frame is located may refer to the position of the target frame in the target video when the video frames are sorted by playing time, for example, the position of the target frame on the time axis of the target video.
  • For example, the target video includes M video frames sorted by playback time as the first frame, the second frame, ..., the Mth frame, where the first frame is the first frame of the target video. If the target frame is the jth frame, the adjacent frame is the (j-1)th frame, where j is an integer greater than 1 and less than or equal to M; that is, the target frame is not the first frame of the target video.
  • the card may refer to various cards such as an ID card, a social security card, and a bank card, which is not specifically limited here.
  • the first key point information of the adjacent frame may include the corner point coordinates of the card.
  • FIG. 2 is a schematic diagram of the first card provided by an embodiment of the application.
  • the first card is in the XOY coordinate system, which is the coordinate system of the adjacent frame.
  • The key point information of the first card includes the coordinates of the four corners of the first card, that is, the coordinates of the four corners A, B, C and D in FIG. 2. After obtaining the four corner coordinates of the first card, the length and width of the first card and the straight line parameters of its four edge lines can be calculated from the four corner coordinates, as in the sketch below.
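  • For illustration, a minimal numpy sketch of this computation; the corner coordinates below are hypothetical, not from the application:

```python
import numpy as np

# Hypothetical corner coordinates of the first card, as in FIG. 2:
# A lower-left, B lower-right, C upper-left, D upper-right.
A = np.array([120.0, 80.0])
B = np.array([420.0, 80.0])
C = np.array([120.0, 270.0])
D = np.array([420.0, 270.0])

length = B[0] - A[0]  # w = x' - x
width = C[1] - A[1]   # h = y' - y

def line_through(p, q):
    """Straight-line parameters (a, b, c) with a*x + b*y + c = 0 through p and q."""
    a, b = q[1] - p[1], p[0] - q[0]
    return a, b, -(a * p[0] + b * p[1])

# Straight-line parameters of the four edge lines of the first card.
edge_lines = [line_through(A, B),  # bottom edge
              line_through(C, D),  # top edge
              line_through(A, C),  # left edge
              line_through(B, D)]  # right edge
```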
  • the preset key point position tracking model may be a pre-trained active contour model.
  • The inputs of the key point position tracking model are the initial contour (initial edge information) and the target frame; the model then iterates step by step from the initial contour, updating the contour of the object contained in the target frame until a preset condition is reached.
  • the preset condition may be a preset number of iterations or the iteration error is less than a preset value, etc., and there is no specific limitation here.
  • the key point position tracking model may include an input layer, two convolution layers (the first convolution layer Conv1 and the second convolution layer Conv2), a classifier, and an output layer.
  • The network structures of the first convolutional layer Conv1 and the second convolutional layer Conv2 may be the same.
  • both the first convolution layer Conv1 and the second convolution layer Conv2 include a convolution layer, a BN layer and an activation function, and the size of the convolution kernel is 3*3.
  • the key point position tracking model can output the classification result and the convolution result in parallel.
  • the classification result may refer to determination information that characterizes whether the target frame contains a card, and the convolution result may be used to calculate the second key point information of the target frame.
  • The second key point information may include the corner point coordinates of the object contained in the target frame. A sketch of a network with this shape follows below.
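  • As an illustration only, a minimal PyTorch sketch of this structure; the channel widths, pooling, and head sizes are assumptions, not the application's actual model:

```python
import torch
import torch.nn as nn

class KeyPointTrackerNet(nn.Module):
    """Sketch of the described structure: input layer, two convolution blocks
    (3*3 convolution + BN + activation each), then a classifier output and a
    convolution (regression) output produced in parallel."""
    def __init__(self, in_channels=1, hidden=16, num_keypoints=32):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(in_channels, hidden, 3, padding=1),
                                   nn.BatchNorm2d(hidden), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(hidden, hidden, 3, padding=1),
                                   nn.BatchNorm2d(hidden), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(hidden, 2)                 # card / no card
        self.regressor = nn.Linear(hidden, num_keypoints * 2)  # key point coordinates

    def forward(self, frame):
        features = self.conv2(self.conv1(frame))
        flat = self.pool(features).flatten(1)
        # Classification result and convolution (regression) result in parallel.
        return self.classifier(flat), self.regressor(flat)
```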
  • S40 Determine the first card detection result of the target frame according to the second key point information and the determination information.
  • The first card detection result may include label information indicating whether the target frame contains a card, and, when the target frame contains a card, the edge information of the card.
  • the edge information includes the parameters of the edge line and the corner coordinates.
  • Specifically, whether the target frame contains a card is determined according to the determination information. If the determination information indicates that the target frame contains a card, the edge information of the card is calculated according to the second key point information; otherwise, a detection result indicating that the target frame does not contain a card is generated.
  • After the first card detection result of the target frame is determined, the next frame of the target video can be obtained, where the next frame is the video frame adjacent to the target frame in the target video whose playback time is later than that of the target frame. The next frame is then used as the updated target frame, and the steps of this embodiment are repeated until the card detection result of every video frame in the target video is obtained.
  • In this embodiment, the first key point information of the adjacent frame adjacent to the target frame is obtained according to the position of the target frame. Because the adjacent frame is a video frame that precedes the target frame on the time axis of the target video and contains a card, its first key point information can be used as the initial constraint position of the key points of the target frame. Key point tracking (prediction by the key point position tracking model) is then performed according to the first key point information of the adjacent frame to obtain the second key point information of the target frame, and the first card detection result of the target frame is determined according to the second key point information. Compared with prior-art methods that determine the object edge information of the target frame directly with an edge detection algorithm, the card edge detection method provided by this application is less affected by complex backgrounds and/or blurred edges of the video frames and has a small detection error; moreover, the key point tracking model requires no feature point matching, which greatly reduces the amount of computation, improves edge detection efficiency, and meets the real-time card edge detection requirements of mobile terminals.
  • FIG. 3 is a schematic flowchart of obtaining the second key point information provided by an embodiment of the application, and describes a possible implementation of obtaining the second key point information of the target frame in S30 of the FIG. 1 embodiment. As shown in FIG. 3, inputting the first key point information and the target frame into the key point position tracking model to obtain the second key point information of the target frame includes:
  • S301: Determine, according to the first key point information, the first reference position of the object contained in the target frame; the first reference position is used to determine the initial edge information of the object contained in the target frame.
  • the first reference position may include the coordinates of a corner point of the card, the length of the card, and the width of the card, and the initial edge information may be an edge straight line calculated according to the first reference position.
  • For example, the first key point information includes the corner point coordinates of the first card. According to these corner coordinates, the edge information of the first card is determined, including one corner coordinate, the length of the first card, and the width of the first card; this edge information is then taken as the first reference position of the object contained in the target frame.
  • For example, if the first key point information includes the four corner coordinates of the first card A(x, y), B(x', y), C(x, y') and D(x', y'), the first reference position can be expressed as G1(x, y, w, h), where (x, y) is the coordinate of the lower-left corner A of the first card, w is the length of the first card and equals x' - x, and h is the width of the first card and equals y' - y.
  • In the first iteration, the first reference position is determined from the first key point information, and the multiple key points of the first iteration are located on first reference straight lines, which include the edge straight lines determined according to the first reference position. If i is an integer greater than 1, the ith reference position is determined according to the result of the (i-1)th iteration, and the multiple key points of the ith iteration are located on second reference straight lines, which include the edge straight lines determined according to the ith reference position.
  • the input in the first iteration is the first reference position and the target frame
  • the input of the i-th iteration is the multiple key points and the target frame of the i-1th iteration.
  • In the first iteration, the first reference position and the target frame are input into the key point position tracking model to obtain the multiple key points X_1 of the first iteration and the iteration error delta_1 of the first iteration, and the first reference position G1 is updated according to delta_1 to obtain the second reference position G2(x2, y2, w, h).
  • the multiple key points of the first iteration are located on the first reference straight line, and the first reference straight line includes the edge straight line determined according to the first reference position.
  • The first reference straight lines are determined according to the first reference position. In this embodiment, they include the 4 edge straight lines represented by the first reference position, that is, the 4 edge straight lines of the first card; correspondingly, the multiple key points obtained in the first iteration are evenly distributed on those four edge straight lines.
  • In the ith iteration, the positions of the multiple key points of the (i-1)th iteration and the target frame are input into the key point position tracking model to obtain the multiple key points X_i of the ith iteration and the iteration error delta_i of the ith iteration, and the ith reference position is updated according to delta_i to obtain the (i+1)th reference position, where i is an integer greater than 1.
  • the multiple key points of the i-th iteration are located on the second reference straight line, and the second reference straight line includes the edge straight line determined according to the i-th reference position;
  • For example, the ith reference position can be expressed as Gi(xi, yi, w, h); the second reference straight lines of the ith iteration then contain the four edge straight lines determined according to Gi, and correspondingly, the multiple key points X_i obtained in the ith iteration are evenly distributed on the four edge straight lines determined according to Gi.
  • In this way, strong constraints are added to the key points of each prediction, so that the predicted key points always lie on four edge straight lines, specifically the four edge straight lines of the rectangular card.
  • the termination condition of the iteration is the number of iterations. After a preset number of iterations, the iteration is terminated to obtain the second key point information, where the preset number can be preset by the user.
  • the second key point information may include the fourth reference position, G4 (x4, y4, w, h).
  • After the iteration terminates, the coordinates of the multiple key points are input into the classifier, which determines whether the target frame contains a card according to these coordinates and generates the corresponding determination information.
  • For example, suppose the preset number of iterations is 3, the first reference position is expressed as G1(x, y, w, h), and the key point tracking model is expressed as evolve_gcn. The reconstruction below restores the intermediate even-numbered steps implied by the update sentences in the original text.
  • Step 1: In the first iteration, multiple initial key points are initialized according to G1(x, y, w, h) and the target frame; they can be represented by X_0, which can be written as Formula (1): X_0 = [(p_1, q_1), (p_2, q_2), ..., (p_n, q_n)], where n is the number of key points and (p_n, q_n) is the coordinate of the nth initial key point. Initializing X_0 according to G1 may mean performing linear interpolation between the four corner points determined by G1.
  • Step 2: The key point tracking model evolve_gcn is run with X_0 and the target frame as inputs to obtain the multiple key points X_1 and the iteration error delta_1 of the first iteration; at the same time, the first reference position G1 is updated according to delta_1 to obtain G2.
  • Step 3: In the second iteration, the multiple key points X_1 of the first iteration and the target frame are used as input, and evolve_gcn is run to obtain the multiple key points X_2 and the iteration error delta_2 of the second iteration.
  • Step 4: The second reference position G2 is updated according to delta_2 to obtain G3.
  • Step 5: In the third iteration, the multiple key points X_2 of the second iteration and the target frame are used as inputs, and evolve_gcn is run to obtain the multiple key points X_3 and the iteration error delta_3 of the third iteration.
  • Step 6: The third reference position G3 is updated according to delta_3 to obtain G4.
  • Step 7: The second key point information of the target frame is determined according to X_3, and whether the target frame contains a card is determined accordingly. The three-iteration loop is sketched below.
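  • A minimal Python sketch of this loop; evolve_gcn and the reference-position update are placeholders standing in for the application's model, not its actual implementation:

```python
import numpy as np

def evolve_gcn(keypoints, frame):
    """Placeholder for the key point tracking model: assumed to return the
    refined key points X_i and the iteration error delta_i."""
    raise NotImplementedError

def update_reference(G, delta):
    """Hypothetical update of the reference position G by the iteration error."""
    x, y, w, h = G
    return (x + delta[0], y + delta[1], w, h)

def init_keypoints(G, n=32):
    """Formula (1): linear interpolation between the four corner points
    determined by G = (x, y, w, h) gives the initial key points X_0."""
    x, y, w, h = G
    corners = np.array([[x, y], [x + w, y], [x + w, y + h], [x, y + h], [x, y]])
    segments = []
    for p, q in zip(corners[:-1], corners[1:]):
        t = np.linspace(0.0, 1.0, n // 4, endpoint=False)[:, None]
        segments.append(p + t * (q - p))
    return np.vstack(segments)  # X_0 = [(p_1, q_1), ..., (p_n, q_n)]

def track_keypoints(G1, frame, iterations=3):
    G, X = G1, init_keypoints(G1)
    for _ in range(iterations):
        X, delta = evolve_gcn(X, frame)  # key points stay on the 4 edge lines
        G = update_reference(G, delta)   # G1 -> G2 -> G3 -> G4
    return X, G  # second key point information of the target frame
```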
  • There are two cases in which the first key point information of an adjacent frame cannot be obtained. The first is that the target frame is the first frame of the target video, in which case the target frame has no adjacent frame. The second is that, although the target frame is not the first frame of the target video, its adjacent frame does not contain a card, so the first key point information of the adjacent frame cannot be obtained. In both cases, card edge detection of the target frame is performed based on the end-to-end edge detection model of this application, which is exemplified by the embodiment of FIG. 4 below.
  • Fig. 4 is a schematic flowchart of a card edge detection method according to another embodiment of the application. As shown in Fig. 4, after obtaining the target frame to be processed in the target video, the card edge detection method further includes:
  • the first frame of the target video refers to the video frame with the earliest playing time in the target video.
  • the size of the gray image is smaller than the size of the target frame.
  • binarization processing is performed on the scaled target frame to obtain a grayscale image of the target frame.
  • For example, if the target frame is a color picture with a size of 1080*1090, preprocessing the target frame may mean first scaling the target frame to an image with a size of 128*256 and then binarizing that image to obtain the corresponding grayscale image.
  • the purpose of this step is to scale and binarize the target frame, so as to reduce the amount of data processing for edge detection in subsequent steps and improve the efficiency of edge detection.
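  • A minimal OpenCV sketch of this preprocessing; reading the 128*256 size as height 128 and width 256, and using Otsu thresholding as a stand-in for the unspecified binarization, are both assumptions:

```python
import cv2

def preprocess(target_frame_bgr):
    """Scale the target frame down and binarize it.
    cv2.resize takes (width, height), so 128*256 is passed as (256, 128)."""
    small = cv2.resize(target_frame_bgr, (256, 128), interpolation=cv2.INTER_AREA)
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary  # the "grayscale image" fed to the edge detection model
```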
  • S60 Input the gray image to an edge detection model to obtain third key point information of the gray image; the edge detection model is an end-to-end neural network model, and the third key point information includes multiple edge line parameters of the gray image.
  • The purpose of this embodiment is to detect the edges of the card. Since the edges of a card are all straight lines, the edge detection model in this embodiment adds linear regression processing after sampling the grayscale image; by adding linear constraints, it directly outputs the parameters of the edge straight lines, realizing end-to-end edge detection of the image.
  • the edge detection model includes an encoder, a decoder, and a linear regression sub-model connected in sequence.
  • the encoder is used to obtain multiple local features of the grayscale image, and classify the pixel values of the grayscale image according to the multiple local features, to obtain local pixel values corresponding to different elements; the elements include edge lines.
  • the encoder may be a lightweight convolutional neural network to meet the application requirements of mobile terminals with limited computing power.
  • For example, the encoder may be a ShuffleNet network model.
  • the decoder is used to match the classified local pixel values with the pixels of the grayscale image. Specifically, the decoder is used to perform up-sampling processing on the reduced feature map, and perform convolution processing on the up-sampling processed image to make up for the loss of detail caused by the reduction of the image by the pooling layer in the encoder.
  • the linear regression sub-model is used to determine multiple edge straight line parameters according to the pixel points of the matching edge straight line.
  • the optimal solution of the linear regression sub-model satisfies the weighted least squares method.
  • The input of the linear regression sub-model can be expressed as input with a size of 4*128*256, containing 4 feature maps of size 128*256, corresponding to the 4 straight lines whose classification feature is "edge".
  • In the linear regression sub-model, W represents a feature map and supplies the weights; X_map represents the sub-feature map formed by the x-axis coordinates of the pixels on the feature map; Y_map represents the sub-feature map formed by the y-axis coordinates of the pixels on the feature map; V represents the straight line parameters of the linear constraint function; T(X_map) and T(Y_map) represent the transpositions of X_map and Y_map; and inv represents matrix inversion. From these quantities the value of V is calculated in closed form by weighted least squares. Since the input has 4 feature maps, 4 sets of straight line parameters are obtained, as in the sketch below.
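  • A minimal numpy sketch of such a weighted least-squares fit for one feature map; the closed form V = inv(T(A) W A) T(A) W y with A = [X_map, 1] is the standard WLS solution and an assumed reading of the formula described above:

```python
import numpy as np

def fit_edge_line(weight_map):
    """Weighted least-squares fit of one edge straight line from a 128x256
    feature map. The map W supplies per-pixel weights; X_map and Y_map hold
    the pixel coordinates."""
    h, w = weight_map.shape
    Y_map, X_map = np.mgrid[0:h, 0:w]
    wts = weight_map.ravel()
    A = np.stack([X_map.ravel(), np.ones(h * w)], axis=1)  # columns [x, 1]
    y = Y_map.ravel()
    AtW = A.T * wts                         # T(A) scaled by the weights
    V = np.linalg.inv(AtW @ A) @ (AtW @ y)  # slope and intercept
    return V                                # y = V[0]*x + V[1]

# The input holds 4 feature maps, so 4 straight-line parameter vectors result:
# lines = [fit_edge_line(m) for m in input_maps]
```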
  • the third key point information includes a plurality of edge line parameters of the gray image, and the shape of the object contained in the gray image can be determined according to the plurality of edge line parameters.
  • If the multiple edge line parameters determine a rectangle, the object contained in the grayscale image is a card. In this case, the corner coordinates of the card are determined according to the multiple edge line parameters, and the second card detection result of the target frame is then determined according to the corner coordinates.
  • The card edge detection method provided in this embodiment is suitable for the case where the target frame is the first frame of the target video or the adjacent frame does not contain a card. The method first obtains a grayscale image from the target frame and inputs the grayscale image into the edge detection model, which reduces the amount of data processed for edge detection and improves its efficiency. Moreover, the edge detection model in this embodiment is an end-to-end neural network model whose training/prediction result is directly the multiple edge straight line parameters of the grayscale image; while improving detection speed, its fitting effect is better than the segmented processing method in the prior art (the method described in the background art).
  • By combining the card edge detection method of the FIG. 4 embodiment with that of the FIG. 1 embodiment, card edge detection of every video frame in the target video is realized. Specifically, key point tracking of a video frame can be performed based on the card edge detection method of the FIG. 1 embodiment. If key point tracking succeeds, the process stays in the key point tracking loop of the FIG. 1 embodiment, achieving high-precision and efficient card edge detection. If key point tracking fails, that is, the frame does not contain the tracked card, it usually means the card has been replaced in the target video; in this case the card edge detection method of the FIG. 4 embodiment performs edge detection directly, and the end-to-end edge detection model also supports real-time, efficient card edge detection. After the updated card edges are obtained, the process enters the key point tracking loop of the FIG. 1 embodiment again, and this repeats until the card edge detection result of every video frame in the target video is obtained, realizing efficient, high-precision detection of the target video that can be applied to real-time card detection on mobile terminals, as sketched below.
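  • A sketch of this overall loop; track_with_keypoints and detect_edges_end_to_end are hypothetical names standing in for the FIG. 1 and FIG. 4 paths described above:

```python
def detect_video(frames, track_with_keypoints, detect_edges_end_to_end):
    """Per-frame loop: use the tracking path while key points are available,
    falling back to the end-to-end detection path otherwise."""
    results, prev_keypoints = [], None
    for frame in frames:
        if prev_keypoints is None:                   # first frame or tracking lost
            result = detect_edges_end_to_end(frame)  # FIG. 4 path
        else:
            result = track_with_keypoints(prev_keypoints, frame)  # FIG. 1 path
        prev_keypoints = result["keypoints"] if result["contains_card"] else None
        results.append(result)
    return results
```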
  • In the embodiment of FIG. 4, the edge information of the card contained in the target frame can be calculated directly from the corner coordinates. However, because the target frame is reduced before entering the edge detection model, the corner coordinates are obtained on the zoomed grayscale image; after being scaled back up to the original image, the detected edges of the target frame may have errors. To improve the accuracy of edge detection, the edges of the card contained in the target frame can be corrected after the grayscale image is enlarged, which is explained by way of example below.
  • FIG. 5 is a schematic flowchart of determining the second card detection result provided by an embodiment of the application, and describes a possible implementation of S70 in the embodiment of FIG. 4. As shown in FIG. 5, determining the second card detection result of the target frame according to the third key point information includes:
  • If the grayscale image contains a card, the target frame also contains a card, and the card contained in the target frame is taken as the card to be detected.
  • S701: Determine the corner point coordinates of the card in the grayscale image according to the multiple edge line parameters, and enlarge the corner point coordinates according to a preset ratio to obtain the multiple corner point coordinates of the card to be detected.
  • The preset ratio corresponds to the reduction ratio used when preprocessing the target frame in the embodiment of FIG. 4.
  • the card to be detected contains 4 corner points, and the coordinates of the 4 corner points of the card to be detected can be obtained in this step.
  • S702 According to the coordinates of the multiple corner points, intercept multiple edge regions of the card to be detected, and the multiple edge regions correspond to the multiple corner points in a one-to-one manner.
  • Specifically, the region of interest corresponding to each corner point is determined according to the multiple corner point coordinates, and each region of interest is intercepted to obtain multiple edge regions in one-to-one correspondence with the corner points. A region of interest refers to a region to be processed that is intercepted from the target frame in the form of a box, circle, ellipse, or irregular polygon; in this embodiment, a box may be used for interception, as in the sketch below.
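  • A minimal numpy-style sketch of box interception around each corner; the box size is a hypothetical choice:

```python
def crop_edge_regions(frame, corner_coords, box_h=64, box_w=64):
    """Intercept a box-shaped region of interest around each corner point,
    giving edge regions in one-to-one correspondence with the corner points."""
    H, W = frame.shape[:2]
    regions = []
    for cx, cy in corner_coords:
        x0, x1 = max(int(cx) - box_w // 2, 0), min(int(cx) + box_w // 2, W)
        y0, y1 = max(int(cy) - box_h // 2, 0), min(int(cy) + box_h // 2, H)
        regions.append(frame[y0:y1, x0:x1])
    return regions
```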
  • S703: Determine the edge straight line corresponding to each edge region, and take these edge straight lines as the edge straight lines of the card to be detected.
  • the method of determining the edge line corresponding to each edge region is the same.
  • Specifically, an edge region can be partitioned into multiple sub-regions; after the target line segment corresponding to each sub-region is determined (the target line segment being the edge line segment of the sub-region), fitting is performed on the multiple target line segments to obtain the edge straight line corresponding to that edge region. In this way, the error caused by the image scaling process can be effectively reduced and the accuracy of the edge straight line corresponding to each edge region is improved, thereby improving the accuracy of the edge straight lines of the card to be detected.
  • FIG. 6 is a schematic flowchart of determining the edge straight line corresponding to each edge region provided by an embodiment of the application, and describes a possible implementation of S703 in the embodiment of FIG. 5. As shown in FIG. 6, determining the edge straight line corresponding to each edge region includes:
  • S7031: Perform edge detection on the first edge region to obtain an edge image of the first edge region along the first direction; the first edge region is any one of the multiple edge regions, and the first direction is the direction of any edge of the card to be detected.
  • the edge is composed of pixels whose pixel values undergo transitions (gradient changes) in the image. Based on this characteristic, the edge detection of the first edge region can be performed according to the Sobel operator.
  • the Sobel operator contains two sets of 3x3 matrices, namely X-direction matrix and Y-direction matrix.
  • The two matrices are separately subjected to planar convolution with the image of the first edge region, giving the approximate gradients of the first edge region in the X and Y directions, from which the edges of the first edge region along the X and Y directions can be obtained, as in the sketch below.
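  • A minimal sketch of the planar convolution with the two Sobel matrices; OpenCV's filter2D is used here as one possible implementation:

```python
import cv2
import numpy as np

# The two 3x3 Sobel matrices for the X and Y directions.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
SOBEL_Y = SOBEL_X.T

def sobel_edges(edge_region):
    """Planar convolution of the first edge region with both matrices,
    giving the approximate gradients (edge images) along X and Y."""
    img = edge_region.astype(np.float32)
    gx = cv2.filter2D(img, -1, SOBEL_X)
    gy = cv2.filter2D(img, -1, SOBEL_Y)
    return np.abs(gx), np.abs(gy)
```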
  • The first direction is the direction of some edge of the card to be detected. Since the constituent elements of the card to be detected include content and edges, the first direction can be determined according to the position of the first edge region relative to the content of the card: depending on that relative position, the first direction is the Y direction or the X direction. Alternatively, the first direction may simply be a preset direction.
  • Depending on the position of the first edge region, the first edge region may be flipped first, and the planar convolution along the first direction is then performed on the flipped first edge region. The flip includes horizontal flip and vertical flip.
  • FIG. 7 is a schematic diagram of a first edge area and a first direction provided in an embodiment of the present application.
  • As shown in FIG. 7, each edge area is a rectangular area selected by a dashed box, and the four edge areas of the card to be detected are labeled 1, 2, 3 and 4; the first edge area can be any one of them. In this example the first direction is the Y direction, so the Y-direction matrix of the Sobel operator is used. Depending on which area is the first edge area, it may need to be flipped first.
  • If the first edge area is 1, planar convolution based on the Y-direction matrix is performed on it directly to obtain its edge image along the Y direction; the content of the card to be detected is then on the right side of the first edge area. If the first edge area is 2, it is first flipped horizontally and planar convolution based on the Y-direction matrix is then performed on the flipped area; the content of the card is again on the right side of the first edge area. If the first edge area is 3, it is first flipped clockwise and vertically before the planar convolution, and if the first edge area is 4, it is first flipped counterclockwise and vertically; in both cases the resulting edge image is along the Y direction and the content of the card is located on the right side of the first edge area.
  • That is, different first edge areas require different flip directions; similarly, if the relative position between the content of the card to be detected and the edge line differs, the flip direction differs. In this way, edge detection is performed on the first edge area to obtain an edge image along the first direction in which the relative position of the card content and the target edge is fixed.
  • S7032: The edge image can be equally divided into N sub-images, which are then binarized.
  • S7033: Perform straight line detection on the N binarized sub-images to obtain N target straight lines and the 2N end points of the N target straight lines; the target straight line is the straight line closest to the target edge in the binarized sub-image, and the target edge is determined according to the first direction.
  • Specifically, each binarized sub-image is subjected to straight line detection to obtain the multiple straight lines it contains, and the line closest to the target edge among them is determined as the target straight line. The target edge is the edge closest to the content of the card to be detected in the sub-image and can be determined according to the first direction.
  • FIG. 8 is a schematic diagram of a sub-image provided by an embodiment of this application. As shown in FIG. 8, after straight line detection, two straight line segments contained in the sub-image are obtained, Z1 (PQ) and Z2 (RS). In the edge image after planar convolution in this example, the content of the card to be detected is always on the right side of the first edge area, so the target edge in FIG. 8 is Z3. Performing this processing on the N sub-images of the first edge region yields 2N end points, as in the sketch below.
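  • A minimal OpenCV sketch of this per-sub-image step; Hough line detection, the thresholds, and the vertical-target-edge assumption are illustrative choices, not mandated by the application:

```python
import cv2
import numpy as np

def target_line_endpoints(binary_sub_image, target_edge_x):
    """For one binarized sub-image: detect straight line segments and keep
    the one closest to the target edge, here assumed to be the vertical line
    x = target_edge_x (the edge nearest the card content)."""
    segments = cv2.HoughLinesP(binary_sub_image, rho=1, theta=np.pi / 180,
                               threshold=20, minLineLength=10, maxLineGap=3)
    if segments is None:
        return None
    def distance_to_target(seg):
        x1, _, x2, _ = seg[0]
        return abs((x1 + x2) / 2.0 - target_edge_x)
    x1, y1, x2, y2 = min(segments, key=distance_to_target)[0]
    return (x1, y1), (x2, y2)  # the 2 end points contributed by this sub-image
```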
  • S7034 Perform straight line fitting on the 2N endpoints to obtain an edge straight line corresponding to the first edge region.
  • For example, the straight line fitting can be performed based on the RANSAC algorithm.
  • In summary, the method for determining the edge straight line corresponding to each edge region partitions each edge region, obtains the target straight line of each sub-image, and fits the multiple end points of those target straight lines to obtain the edge straight line of the region. This effectively reduces the error caused by the image scaling process, improves the accuracy of the edge straight line corresponding to each edge region, and further improves the accuracy of the edge straight lines of the card to be detected; a sketch of the RANSAC fit follows below.
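  • A minimal self-contained RANSAC sketch over the 2N end points; the iteration count and inlier tolerance are hypothetical:

```python
import numpy as np

def ransac_line(points, n_iters=100, tol=2.0, seed=0):
    """Simple RANSAC straight-line fit over the 2N end points. Returns the
    parameters (a, b, c) of the consensus line a*x + b*y + c = 0."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best, best_count = None, -1
    for _ in range(n_iters):
        p, q = pts[rng.choice(len(pts), size=2, replace=False)]
        a, b = q[1] - p[1], p[0] - q[0]
        norm = np.hypot(a, b)
        if norm == 0.0:
            continue  # degenerate sample: the two points coincide
        c = -(a * p[0] + b * p[1])
        inliers = np.abs(pts @ np.array([a, b]) + c) / norm < tol
        if inliers.sum() > best_count:
            best, best_count = (a, b, c), int(inliers.sum())
    return best
```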
  • A lightweight convolutional neural network in the prior art, such as the ShuffleNet network model, usually includes a channel shuffle layer to handle multi-channel images. However, since the image input to the edge detection model in this application is a single-channel grayscale image, no channel shuffling is required; the embodiment of the present application therefore further optimizes the network structure of the prior-art ShuffleNet model.
  • The card edge detection method, device, and storage medium of the present application can also be used for processing medical data, which helps to improve the efficiency, security, or stability of medical data processing, for example for rapid identification of patient ID documents.
  • FIG. 9 is a schematic diagram of a network structure of an encoder provided by an embodiment of the application.
  • each network node of the encoder includes a first branch and a second branch that are operated in parallel.
  • the first branch includes a sequentially connected average pooling layer, a 1*1 convolutional layer, and an up-sampling layer
  • the second branch includes a 1*1 convolutional layer.
  • the first branch is used to extract local features of grayscale images
  • the second branch is used to extract global features of grayscale images.
  • the connection layer can be implemented based on the Concat function.
  • The average pooling layer in the first branch down-samples the grayscale image and passes scale-invariant features to the next layer (the 1*1 convolutional layer), and the 1*1 convolutional layer obtains the local features of the incoming feature map.
  • the BN in Figure 9 mainly realizes the normalization of the distribution of the image to accelerate learning.
  • the up-sampling layer in the first branch may perform up-sampling processing based on the bilinear interpolation method.
  • The encoder network structure provided by the embodiments of the application streamlines the encoder of the prior-art lightweight convolutional neural network by removing the channel shuffle layer, which further reduces the computational complexity of the edge detection model and improves its calculation rate, so as to meet the real-time requirements of card edge detection on mobile terminals. A sketch of one such node follows below.
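  • A minimal PyTorch sketch of one such encoder node under the stated structure; the channel widths and the exact placement of BN are illustrative assumptions:

```python
import torch
import torch.nn as nn

class EncoderNode(nn.Module):
    """One network node of the described encoder: a first branch of average
    pooling -> 1*1 convolution -> bilinear up-sampling (local features), a
    parallel second branch of a 1*1 convolution (global features), joined by
    a Concat-style connection layer followed by BN."""
    def __init__(self, in_ch=16, out_ch=16):
        super().__init__()
        self.local_branch = nn.Sequential(
            nn.AvgPool2d(kernel_size=2),
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False))
        self.global_branch = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.bn = nn.BatchNorm2d(out_ch * 2)

    def forward(self, x):
        merged = torch.cat([self.local_branch(x), self.global_branch(x)], dim=1)
        return self.bn(merged)
```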
  • the embodiment of the present application further provides an embodiment of a device that implements the foregoing method embodiment.
  • FIG. 10 is a schematic structural diagram of a card edge detection device provided by an embodiment of the application.
  • the card edge detection device 80 includes a first acquisition module 801, a second acquisition module 802, a position tracking module 803, and a first determination module 804, wherein:
  • the first obtaining module 801 is configured to obtain a target frame to be processed in a target video
  • the second acquisition module 802 is configured to acquire first key point information of an adjacent frame adjacent to the target frame according to the position of the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card;
  • the position tracking module 803 is configured to input the first key point information and the target frame into a preset key point position tracking model to obtain the second key point information of the target frame and the determination information of the target frame, wherein the determination information is used to characterize whether the target frame contains a card;
  • the first determination module 804 is configured to determine the first card detection result of the target frame according to the second key point information and the determination information.
  • the position tracking module 803 inputs the first key point information and the target frame into the key point position tracking model to obtain the second key point information of the target frame, which specifically includes:
  • in the first iteration, determine, according to the first key point information, the first reference position of the object contained in the target frame; input the first reference position and the target frame into the key point position tracking model to obtain the multiple key points of the first iteration and the iteration error of the first iteration, and update the first reference position according to the iteration error to obtain the second reference position; the multiple key points of the first iteration are located on the first reference straight line, which includes the edge straight line determined according to the first reference position;
  • in the ith iteration, input the multiple key points of the (i-1)th iteration and the target frame into the key point position tracking model to obtain the multiple key points of the ith iteration and the iteration error of the ith iteration, and update the ith reference position according to the iteration error to obtain the (i+1)th reference position, where i is an integer greater than 1, and the multiple key points of the ith iteration are located on the second reference straight line, which includes the edge straight line determined according to the ith reference position;
  • when the iteration terminates, the second key point information is determined according to the multiple key points obtained in the current iteration; the second key point information includes the intersection coordinates of the reference straight lines in the current iteration.
  • the first determining module 804 determines the card detection result of the target frame according to the second key point information and the determination information, which specifically includes:
  • if the determination information indicates that the target frame contains a card, calculate the edge information of the card according to the second key point information; the edge information includes the parameters of the edge lines and the corner coordinates.
  • FIG. 11 is a schematic structural diagram of a card edge detection device provided by another embodiment of the application. As shown in FIG. 11, the card edge detection device 80 further includes a preprocessing module 805, an edge detection module 806, and a second determination module 807;
  • The preprocessing module 805 is used to preprocess the target frame to obtain the grayscale image of the target frame when the target frame is the first frame of the target video or the adjacent frame does not contain a card; the size of the grayscale image is smaller than the size of the target frame.
  • The edge detection module 806 is used to input the grayscale image into the edge detection model to obtain the third key point information of the grayscale image; the edge detection model is an end-to-end neural network model, and the third key point information includes multiple edge straight line parameters of the grayscale image.
  • the second determination module 807 is configured to determine the second card detection result of the target frame according to the third key point information.
  • the second determination module 807 determines the second card detection result of the target frame according to the third key point information, which specifically includes:
  • if the multiple edge line parameters determine a rectangle, determine the multiple corner point coordinates of the card to be detected according to the multiple edge line parameters; the card to be detected is the card contained in the target frame;
  • the second determining module 807 determines the edge line corresponding to each edge area, which specifically includes:
  • the first edge region is any one of the multiple edge regions, and the first direction is the direction of any edge of the card to be detected;
  • the target straight line is the line closest to the target edge in the binarized sub-image, and the target edge is determined according to the first direction;
  • the edge detection model is a lightweight convolutional neural network;
  • the edge detection model includes: an encoder, a decoder, and a linear regression sub-model connected in sequence;
  • the encoder is used to obtain multiple local features of the grayscale image, and classify the pixel values of the grayscale image according to the multiple local features, to obtain the local pixel values corresponding to different elements;
  • the elements include edge lines;
  • the decoder is used to match the local pixel value with the pixel point of the gray image
  • the linear regression sub-model is used to determine multiple edge line parameters according to the pixels of the matching edge line;
  • the optimal solution of the linear regression model satisfies the weighted least squares method.
  • the network node of the encoder includes a first branch and a second branch of parallel operation
  • the first branch includes a sequentially connected average pooling layer, a 1*1 convolutional layer, and an up-sampling layer
  • the second branch includes a 1*1 convolutional layer.
  • the card edge detection device provided by the embodiments shown in FIG. 10 and FIG. 11 can be used to implement the technical solutions in the foregoing method embodiments, and the implementation principles and technical effects are similar, and the details are not described herein again in this embodiment.
  • Fig. 12 is a schematic diagram of a card edge detection device provided by an embodiment of the present application.
  • the card edge detection device 90 includes: at least one processor 901, a memory 902, and a computer program stored in the memory 902 and running on the processor 901.
  • the card edge detection device further includes a communication component 903, wherein the processor 901, the memory 902, and the communication component 903 are connected by a bus 904.
  • When the processor 901 executes the computer program, the steps in the foregoing card edge detection method embodiments are implemented, for example, steps S10 to S40 in the embodiment shown in FIG. 1.
  • the processor 901 implements the functions of the modules/units in the foregoing device embodiments when executing the computer program, for example, the functions of the modules 801 to 804 shown in FIG. 10.
  • the computer program may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 902 and executed by the processor 901 to complete the application.
  • One or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program in the card edge detection device 90.
  • FIG. 12 is only an example of the card edge detection device and does not constitute a limitation on it; the device may include more or fewer components than shown in the figure, a combination of certain components, or different components, such as input and output devices, network access devices, buses, etc.
  • the card edge detection device in the embodiment of the present application may be a mobile terminal, including but not limited to a smart phone, a tablet computer, a personal digital assistant, an e-book, and the like.
  • the card edge detection device can also be a terminal device, a server, etc., which is not specifically limited here.
  • The so-called processor 901 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • The memory 902 may be an internal storage unit of the card edge detection device, or an external storage device of the card edge detection device, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc.
  • the memory 902 is used to store the computer program and other programs and data required by the card edge detection device.
  • the memory 902 can also be used to temporarily store data that has been output or will be output.
  • The bus may be an industry standard architecture (Industry Standard Architecture, ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the buses in the drawings of this application are not limited to only one bus or one type of bus.
  • the embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps in the foregoing method embodiments.
  • The embodiments of the present application also provide a computer program product which, when run on the card edge detection device, enables the device to implement the steps in the foregoing method embodiments.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, this application implements all or part of the processes in the above method embodiments by instructing relevant hardware through a computer program; the computer program can be stored in a computer-readable storage medium, and when it is executed by a processor, the steps of the above method embodiments can be realized.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • The computer-readable medium may at least include: any entity or device that can carry the computer program code to the camera/terminal device, a recording medium, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electric carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a mobile hard disk, a floppy disk or a CD-ROM. In some jurisdictions, according to legislation and patent practice, computer-readable media cannot be electric carrier signals and telecommunication signals.
  • the disclosed apparatus/network equipment and method may be implemented in other ways.
  • The device/network device embodiments described above are merely illustrative; for example, the division of modules or units is only a logical function division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

A card edge detection method, device, and storage medium, applicable to the field of image processing technology and to digital healthcare, for rapid identification of patient identity documents. The method includes: obtaining a target frame to be processed in a target video (S10); obtaining, according to the position of the target frame, first key point information of an adjacent frame of the target frame, where the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card (S20); inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, where the determination information indicates whether the target frame contains a card (S30); and determining a first card detection result of the target frame according to the second key point information and the determination information (S40). The method is little affected by complex backgrounds and/or blurred edges in the frames to be processed, and its detection error is small.

Description

Card edge detection method, device, and storage medium
This application claims priority to the Chinese patent application No. 202011002908.8, entitled "Card edge detection method, device, and storage medium", filed with the Patent Office of the China National Intellectual Property Administration on September 22, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application belongs to the technical field of image processing, and in particular relates to a card edge detection method, device, and storage medium.
Background
Cards such as ID cards, social security cards, and bank cards are in heavy use, and card recognition services have followed; edge (border) detection of such cards is an essential part of card recognition.
Current card edge detection mainly uses a neural network or a traditional edge detection algorithm to find all edge information in an image, and then applies various filtering conditions to discard spurious edges and keep the card's edge information.
The inventors realized that, against complex backgrounds or with blurred edges, this approach is prone to misjudgment, causing edge detection errors that disrupt downstream services such as card information extraction; moreover, its computational efficiency is low, so it cannot support card edge detection on mobile terminals.
Technical Problem
An object of the embodiments of this application is to provide a card edge detection method, device, and storage medium, so as to solve the technical problems that existing card edge detection methods are prone to misjudgment and computationally inefficient.
Technical Solution
In a first aspect, an embodiment of this application provides a card edge detection method, the method including:
obtaining a target frame to be processed in a target video;
obtaining, according to the position of the target frame, first key point information of an adjacent frame of the target frame, where the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card;
inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, where the determination information indicates whether the target frame contains a card;
determining a first card detection result of the target frame according to the second key point information and the determination information.
In a second aspect, an embodiment of this application provides a card edge detection apparatus, including:
a first obtaining module, configured to obtain a target frame to be processed in a target video;
a second obtaining module, configured to obtain, according to the position of the target frame, first key point information of an adjacent frame of the target frame, where the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card;
a position tracking module, configured to input the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, where the determination information indicates whether the target frame contains a card;
a first determining module, configured to determine a first card detection result of the target frame according to the second key point information and the determination information.
In a third aspect, an embodiment of this application provides a card edge detection device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor, when executing the computer program, implements:
obtaining a target frame to be processed in a target video;
obtaining, according to the position of the target frame, first key point information of an adjacent frame of the target frame, where the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card;
inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, where the determination information indicates whether the target frame contains a card;
determining a first card detection result of the target frame according to the second key point information and the determination information.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium, which may be non-volatile or volatile; the computer-readable storage medium stores a computer program which, when executed by a processor, implements:
obtaining a target frame to be processed in a target video;
obtaining, according to the position of the target frame, first key point information of an adjacent frame of the target frame, where the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card;
inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, where the determination information indicates whether the target frame contains a card;
determining a first card detection result of the target frame according to the second key point information and the determination information.
Beneficial Effects
Compared with the prior art, the embodiments of this application have the following beneficial effects. In the card edge detection method provided by the embodiments, first key point information of an adjacent frame of the target frame is obtained according to the position of the target frame. Because the adjacent frame precedes the target frame on the time axis of the target video and contains a card, its first key point information can serve as the initial constrained positions of the target frame's key points; key point tracking (prediction by the key point position tracking model) based on the adjacent frame's first key point information then yields the second key point information of the target frame, and the first card detection result of the target frame is determined from the second key points. Compared with prior-art methods that determine the edge information of the object in the target frame directly with an edge detection algorithm, the card edge detection method provided by this application is less affected by complex backgrounds and/or blurred edges in the frames to be processed and has a small detection error; moreover, the key point tracking model needs no feature point matching, which greatly reduces computation and improves edge detection efficiency, meeting the real-time card edge detection needs of mobile terminals.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed in the description of the embodiments or exemplary techniques are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; those of ordinary skill in the art may obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a card edge detection method provided by an embodiment of this application;
Fig. 2 is a schematic diagram of a first card provided by an embodiment of this application;
Fig. 3 is a schematic flowchart of obtaining second key point information provided by an embodiment of this application;
Fig. 4 is a schematic flowchart of a card edge detection method provided by another embodiment of this application;
Fig. 5 is a schematic flowchart of determining a second card detection result provided by an embodiment of this application;
Fig. 6 is a schematic flowchart of determining the edge line corresponding to each edge region provided by an embodiment of this application;
Fig. 7 is a schematic diagram of a first edge region and a first direction provided by an embodiment of this application;
Fig. 8 is a schematic diagram of a sub-image provided by an embodiment of this application;
Fig. 9 is a schematic diagram of the network structure of an encoder provided by an embodiment of this application;
Fig. 10 is a schematic structural diagram of a card edge detection apparatus provided by an embodiment of this application;
Fig. 11 is a schematic structural diagram of a card edge detection apparatus provided by another embodiment of this application;
Fig. 12 is a schematic diagram of the hardware composition of a card edge detection device provided by an embodiment of this application.
Embodiments of the Invention
In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and techniques are set forth to provide a thorough understanding of the embodiments of this application. However, it will be clear to those skilled in the art that this application may also be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description.
Reference to "one embodiment" or "some embodiments" in this specification means that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of this application. Thus, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like, appearing in different places of this specification, do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "comprising", "including", "having", and their variants mean "including but not limited to", unless specifically emphasized otherwise.
The technical solution of this application, and how it solves the above technical problems, are illustrated below with specific embodiments. It is worth noting that the specific embodiments listed below may be combined with each other, and identical or similar concepts or processes may not be repeated in some embodiments.
The card edge detection method, device, and storage medium provided by this application are applicable to the field of artificial intelligence and the technical field of image processing.
Fig. 1 is a schematic flowchart of the card edge detection method provided by an embodiment of this application; the method is suitable for execution on a terminal device or a server. As shown in Fig. 1, the method includes:
S10. Obtain a target frame to be processed in a target video.
In this embodiment, the target video contains M consecutive video frames, namely frame 1, frame 2, ..., frame M; the target frame may be any frame of the target video, and M is an integer greater than 1.
S20. Obtain, according to the position of the target frame, first key point information of an adjacent frame of the target frame, where the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card.
In this embodiment, the position of the target frame may refer to the position of the target frame after the frames of the target video are sorted by playback time.
For example, the position of the target frame on the time axis of the target video.
Illustratively, the target video contains M video frames, sorted by playback time as frame 1, frame 2, ..., frame M; frame 1 is then the first frame of the target video.
If the target frame is frame j, the adjacent frame is frame j-1, where j is an integer greater than 1 and less than or equal to M.
It will be understood that if the target frame has an adjacent frame, the target frame is not the first frame of the target video.
In this embodiment, a card may be any of various cards such as an ID card, a social security card, or a bank card, which is not specifically limited here.
In this embodiment, where the adjacent frame contains a card (hereinafter the first card), the first key point information of the adjacent frame may include the corner coordinates of the card.
Illustratively, refer also to Fig. 2, a schematic diagram of the first card provided by an embodiment of this application. As shown in Fig. 2, the first card lies in an XOY coordinate system, which is the coordinate system of the adjacent frame.
The key point information of the first card includes the coordinates of its four corner points, i.e. the corner coordinates of the four points A, B, C, and D in Fig. 2. Once the four corner coordinates of the first card are obtained, the length and width of the first card, as well as the line parameters of its four edge lines, can be computed from them.
S30. Input the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame; the determination information indicates whether the target frame contains a card.
In this embodiment, the preset key point position tracking model may be a pre-trained active contour model. Its inputs are an initial contour (initial edge information) and the target frame; it then iterates step by step from the initial contour, updating the contour of the object contained in the target frame until a preset condition is reached.
The initial contour may be determined from the first key point information. The preset condition may be a preset number of iterations, or the iteration error falling below a preset value, etc., and is not specifically limited here.
In this embodiment, the key point position tracking model may include an input layer, two convolutional layers (a first convolutional layer Conv1 and a second convolutional layer Conv2), a classifier, and an output layer.
The first convolutional layer Conv1 and the second convolutional layer Conv2 may share the same constituent network structure.
For example, to improve processing efficiency, Conv1 and Conv2 each consist of a convolution layer, a BN layer, and an activation function, with 3*3 convolution kernels throughout.
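As an illustration only, the following is a minimal sketch of such a block, assuming a PyTorch implementation; the module names and the choice of ReLU as the activation are assumptions, not taken from the disclosure:

```python
import torch.nn as nn

def make_conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """One tracking-model convolution block: 3*3 convolution -> BN -> activation."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),  # 3*3 kernel
        nn.BatchNorm2d(out_channels),  # the BN layer
        nn.ReLU(inplace=True),         # activation; the disclosure does not name one
    )
```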
In this embodiment, the key point position tracking model may output a classification result and a convolution result in parallel.
The classification result may be the determination information indicating whether the target frame contains a card, and the convolution result may be used to compute the second key point information of the target frame.
In this embodiment, the second key point information may include the corner coordinates of the object contained in the target frame.
S40. Determine a first card detection result of the target frame according to the second key point information and the determination information.
In this embodiment, the first card detection result may include flag information on whether a card is contained, and, when the target frame contains a card, the edge information of the card.
The edge information includes the parameters of the edge lines and the corner coordinates.
For example, whether the target frame contains a card may be determined from the determination information: if the determination information indicates that the target frame contains a card, the edge information of the card in the target frame may be determined from the second key point information; if it indicates that the target frame contains no card, a flag that the target frame contains no card is generated.
In this embodiment, after the first card detection result of the target frame is determined, the next frame of the target video may be fetched, where the next frame is the video frame adjacent to the target frame whose playback time is later than that of the target frame.
Taking this next frame as the updated target frame, the steps of this embodiment are repeated until a card detection result is obtained for every video frame of the target video; a sketch of this per-frame loop is given below.
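A minimal sketch of the per-frame loop, with hypothetical track/detect callables standing in for the key point tracking model and the full edge detection path of the later Fig. 4 embodiment:

```python
from typing import Any, Callable, List, Optional, Tuple

Corners = List[Tuple[float, float]]  # four (x, y) corner points of a card

def detect_video(frames: List[Any],
                 track: Callable[[Any, Corners], Optional[Corners]],
                 detect: Callable[[Any], Optional[Corners]]) -> List[Optional[Corners]]:
    """Chain key point tracking with full edge detection, frame by frame."""
    results: List[Optional[Corners]] = []
    prev: Optional[Corners] = None     # key points of the adjacent (previous) frame
    for frame in frames:
        if prev is None:
            prev = detect(frame)       # first frame, or the previous frame had no card
        else:
            prev = track(frame, prev)  # key point position tracking model
            if prev is None:
                prev = detect(frame)   # tracking failed, e.g. the card was replaced
        results.append(prev)
    return results
```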
In the card edge detection method provided by the embodiments of this application, first key point information of an adjacent frame of the target frame is obtained according to the position of the target frame. Because the adjacent frame precedes the target frame on the time axis of the target video and contains a card, its first key point information can serve as the initial constrained positions of the target frame's key points; key point tracking (prediction by the key point position tracking model) based on the adjacent frame's first key point information then yields the second key point information of the target frame, from which the first card detection result of the target frame is determined. Compared with prior-art methods that determine the edge information of the object in the target frame directly with an edge detection algorithm, the method provided here is less affected by complex backgrounds and/or blurred edges in the frames to be processed and has a small detection error; moreover, the key point tracking model needs no feature point matching, which greatly reduces computation, improves edge detection efficiency, and meets the real-time card edge detection needs of mobile terminals.
Fig. 3 is a schematic flowchart of obtaining the second key point information provided by an embodiment of this application, describing one possible implementation of obtaining the second key point information of the target frame in S30 of the Fig. 1 embodiment. As shown in Fig. 3, inputting the first key point information and the target frame into the key point position tracking model to obtain the second key point information of the target frame includes:
S301. Determine, according to the first key point information, a first reference position of the object contained in the target frame.
In this embodiment, the first reference position is used to determine the initial edge information of the object contained in the target frame.
For example, the first reference position may include one corner coordinate of the card together with the card's length and width, and the initial edge information may be the edge lines computed from the first reference position.
In this embodiment, the first key point information includes the corner coordinates of the first card. Determining the first reference position of the object contained in the target frame from the first key point information may mean: determining, from the corner coordinates of the first card, the edge information of the first card, including any one corner coordinate plus the length and width of the first card, and taking this edge information of the first card as the first reference position of the object contained in the target frame.
Illustratively, referring again to Fig. 2, the first key point information includes the four corner coordinates of the first card, A(x, y), B(x', y), C(x, y'), and D(x', y'). The first reference position can then be written as G1(x, y, w, h), where (x, y) is the coordinate of the bottom-left corner A of the first card, w is the length of the first card, equal to x' - x, and h is its width, equal to y' - y.
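A small sketch of deriving G1(x, y, w, h) from the four corner coordinates; the helper name is hypothetical:

```python
def reference_position(corners):
    """corners: four (x, y) corner points of the card; returns (x, y, w, h)."""
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    x, y = min(xs), min(ys)  # corner A(x, y) of the first card
    w = max(xs) - x          # length w = x' - x
    h = max(ys) - y          # width  h = y' - y
    return x, y, w, h

print(reference_position([(10, 20), (110, 20), (10, 80), (110, 80)]))  # (10, 20, 100, 60)
```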
S302. In the 1st iteration, input the 1st reference position and the target frame into the key point position tracking model to obtain the multiple key points of the 1st iteration and the iteration error of the 1st iteration, and update the 1st reference position according to that iteration error to obtain the 2nd reference position.
S303. In the i-th iteration, input the multiple key points of the (i-1)-th iteration and the target frame into the key point position tracking model to obtain the multiple key points of the i-th iteration and the iteration error of the i-th iteration, and update the i-th reference position according to that iteration error to obtain the (i+1)-th reference position, where i is an integer greater than 1.
In this embodiment, if i is 1, the 1st reference position is determined from the first key point information, and the multiple key points of the 1st iteration lie on first reference lines, which comprise the edge lines determined from the 1st reference position; if i is an integer greater than 1, the i-th reference position is determined from the result of the (i-1)-th iteration, and the multiple key points of the i-th iteration lie on second reference lines, which comprise the edge lines determined from the i-th reference position.
In this embodiment, multiple key points and the iteration error of the current iteration are obtained in every iteration.
The inputs of the 1st iteration are the 1st reference position and the target frame; the inputs of the i-th iteration are the multiple key points of the (i-1)-th iteration and the target frame.
In this embodiment, in the 1st iteration, the 1st reference position and the target frame are input into the key point position tracking model to obtain the multiple key points X_1 of the 1st iteration and the iteration error delta_1 of the 1st iteration, and the 1st reference position G1 is updated according to delta_1 to obtain the 2nd reference position G2(x2, y2, w, h).
The multiple key points of the 1st iteration lie on the first reference lines, which comprise the edge lines determined from the 1st reference position.
Illustratively, in the 1st iteration the first reference lines are determined from the 1st reference position; specifically, they comprise the four edge lines represented by the 1st reference position, i.e. the four edge lines of the first card. Correspondingly, the multiple key points obtained in the 1st iteration are distributed uniformly on the four edge lines of the first card.
In this embodiment, in the i-th iteration, the positions of the multiple key points of the (i-1)-th iteration and the target frame are input into the key point position tracking model to obtain the multiple key points X_i of the i-th iteration and the iteration error delta_i of the i-th iteration, and the i-th reference position is updated according to delta_i to obtain the (i+1)-th reference position, where i is an integer greater than 1.
The multiple key points of the i-th iteration lie on the second reference lines, which comprise the edge lines determined from the i-th reference position.
Illustratively, in the i-th iteration the i-th reference position may be written Gi(xi, yi, w, h); the second reference lines of the i-th iteration then comprise the four edge lines determined from Gi, and correspondingly the multiple key points X_i obtained in the i-th iteration are distributed uniformly on the four edge lines determined from Gi.
In this embodiment, to ensure that the multiple key points obtained in each iteration lie on the reference lines (the first or the second reference lines), a strong constraint is imposed on the key points of every prediction when the key point position tracking model is pre-trained, so that every predicted key point lies on the four edge lines, namely the four edge lines of the rectangular card.
S304. After a preset number of iterations, obtain the multiple key points of the current iteration and determine the second key point information from them; the second key point information includes the intersection coordinates of the reference lines in the current iteration.
In this embodiment, the termination condition of the iteration is the number of iterations: after a preset number of iterations, iteration terminates and the second key point information is obtained, where the preset number may be set in advance by the user.
For example, with a preset number of 4, the second key point information may include the 4th reference position, G4(x4, y4, w, h).
In this embodiment, after obtaining the multiple key points of the current iteration, the key point tracking model feeds their coordinates into the classifier, which judges from them whether the target frame contains a card and generates the corresponding determination information.
Illustratively, to describe this embodiment more clearly, consider the following example, in which the preset number is 3, the 1st reference position is written G1(x, y, w, h), and the key point tracking model is denoted evolve_gcn.
Step 1. In the 1st iteration, initialize the multiple initial key points of the 1st iteration from G1(x, y, w, h) and the target frame; they may be denoted X_0, specifically as in Eq. (1):
X_0 = [(p_1, q_1), (p_2, q_2), ..., (p_n, q_n)]      (1)
where n is the number of key points and (p_n, q_n) is the coordinate of the n-th initial key point.
In this step, initializing X_0 from G1 may mean linearly interpolating between the four corner points determined by G1 to obtain X_0.
For example, referring again to Fig. 2, sample uniformly on the boundary line between corner A and corner B to obtain 128 key points; likewise sample uniformly on each of the other three boundary lines, obtaining 512 key points in total.
In this step, after X_0 is obtained, the evolve_gcn model is run to obtain the iteration error delta_1 of the 1st iteration.
Step 2. Update the key point coordinates according to the iteration error of the 1st iteration to obtain the multiple key points of the 1st iteration, denoted X_1, where X_1 = X_0 + delta_1.
In this step, the 1st reference position G1 is simultaneously updated according to the iteration error delta_1 of the 1st iteration to obtain G2.
Step 3. In the 2nd iteration, take the multiple key points X_1 of the 1st iteration and the target frame as inputs and run the key point tracking model evolve_gcn to obtain the iteration error delta_2 of the 2nd iteration.
Step 4. Update the key point coordinates according to delta_2 to obtain the multiple key points of the 2nd iteration, denoted X_2, where X_2 = X_1 + delta_2; the key points in X_2 are distributed uniformly on the four lines determined from G2.
In this step, the 2nd reference position G2 is simultaneously updated according to the iteration error delta_2 of the 2nd iteration to obtain G3.
Step 5. In the 3rd iteration, take the multiple key points X_2 of the 2nd iteration and the target frame as inputs and run the key point tracking model evolve_gcn to obtain the iteration error delta_3 of the 3rd iteration.
Step 6. Update the key point coordinates according to delta_3 to obtain the multiple key points of the 3rd iteration, denoted X_3, where X_3 = X_2 + delta_3.
Step 7. Determine the second key point information of the target frame from X_3, and judge whether the target frame contains a card. A sketch of this loop follows.
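A minimal sketch of steps 1-7, assuming evolve_gcn is a callable that returns a per-point offset; its real interface is not given in the disclosure:

```python
import numpy as np

def init_keypoints(g1, n_points):
    """Uniformly sample n_points key points on the rectangle given by (x, y, w, h)."""
    x0, y0, w, h = g1
    per_edge = n_points // 4                  # e.g. 128 points per edge line
    t = np.linspace(0.0, 1.0, per_edge, endpoint=False)
    top    = np.stack([x0 + t * w, np.full_like(t, y0)], axis=1)
    right  = np.stack([np.full_like(t, x0 + w), y0 + t * h], axis=1)
    bottom = np.stack([x0 + w - t * w, np.full_like(t, y0 + h)], axis=1)
    left   = np.stack([np.full_like(t, x0), y0 + h - t * h], axis=1)
    return np.concatenate([top, right, bottom, left], axis=0)

def track_keypoints(evolve_gcn, frame, g1, n_points=512, n_iters=3):
    x = init_keypoints(g1, n_points)  # X_0: linear interpolation along the 4 edges
    for _ in range(n_iters):
        delta = evolve_gcn(frame, x)  # iteration error predicted by the model
        x = x + delta                 # X_i = X_{i-1} + delta_i
    return x                          # X_3: basis for the second key point information
```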
In practice, key point tracking based on the adjacent frame's first key point information cannot be performed when either of the following occurs:
first, the target frame is the first frame of the target video, in which case it has no adjacent frame;
second, the target frame is not the first frame of the target video, but its adjacent frame contains no card, in which case the first key point information of the adjacent frame cannot be obtained.
When either case occurs, edge detection must be performed on the target frame to determine its card edge information and obtain its card edge detection result. To guarantee the accuracy of card edge detection while meeting the real-time processing requirements of mobile terminals, this application performs card edge detection on the target frame with an end-to-end edge detection model, illustrated below with the embodiment of Fig. 4.
Fig. 4 is a schematic flowchart of the card edge detection method provided by another embodiment of this application. As shown in Fig. 4, after obtaining the target frame to be processed in the target video, the card edge detection method further includes:
S50. When the target frame is the first frame of the target video, or the adjacent frame contains no card, preprocess the target frame to obtain a grayscale image of the target frame.
In this embodiment, the first frame of the target video is the video frame with the earliest playback time in the target video.
The size of the grayscale image is smaller than that of the target frame.
In this embodiment, the target frame is scaled to a target size, and the scaled target frame is then binarized to obtain the grayscale image of the target frame.
For example, if the target frame is a 1080*1090 color picture, preprocessing it may mean first scaling it to a 128*256 image and then binarizing that image to obtain the corresponding grayscale image.
The purpose of this step is to scale and binarize the target frame so as to reduce the data volume of edge detection in subsequent steps and improve edge detection efficiency.
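A sketch of this preprocessing, assuming OpenCV; the use of an Otsu threshold for the binarization, and reading "128*256" as 128 rows by 256 columns, are assumptions:

```python
import cv2

def preprocess(frame_bgr):
    """Scale the target frame down and binarize it before edge detection."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, (256, 128))  # cv2.resize takes (width, height)
    _, binary = cv2.threshold(small, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary
```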
S60. Input the grayscale image into the edge detection model to obtain third key point information of the grayscale image; the edge detection model is an end-to-end neural network model, and the third key point information includes multiple edge line parameters of the grayscale image.
The aim of this embodiment is to detect the edges of the card. Since the card's edges are all straight lines, the edge detection model in this embodiment adds linear regression after sampling the grayscale image; by adding a linear constraint it directly outputs the parameters of the straight edges, achieving end-to-end edge detection of the image.
In this embodiment, the edge detection model includes a sequentially connected encoder, decoder, and linear regression sub-model.
The encoder is used to obtain multiple local features of the grayscale image and to classify its pixel values according to those local features, yielding local pixel values corresponding to different elements; the elements include edge lines.
For example, the encoder may be a lightweight convolutional neural network, meeting the application needs of compute-constrained mobile terminals; illustratively, it may be a ShuffleNet network model.
The decoder is used to match the classified local pixel values with the pixels of the grayscale image. Specifically, the decoder up-samples the reduced feature maps and convolves the up-sampled images, to compensate for the loss of detail caused by the pooling layers in the encoder shrinking the image.
The linear regression sub-model is used to determine the multiple edge line parameters from the pixels matched to the edge lines; its optimal solution satisfies weighted least squares.
For example, the input of the linear regression sub-model may be denoted input, of size 4*128*256, containing four 128*256 feature maps that correspond to the four lines whose classification feature is "edge".
For each 128*256 feature map W, a linear constraint function y = ax + b is added, i.e. every pixel of the feature map satisfies this constraint function; on this basis the following formula is obtained:
W*Y_map = (W*[X_map, 1])*V      (2)
where W is the feature map, X_map is the sub-feature-map formed by the x-axis coordinates of the feature map's pixels, Y_map is the sub-feature-map formed by their y-axis coordinates, and V = [a, b]^T contains the line parameters of the linear constraint function.
Based on Eq. (2), the line parameters V can be computed as in Eq. (3):
V = inv(T(M)*M)*T(M)*(W*Y_map), with M = W*[X_map, 1]      (3)
where T(·) denotes the transpose, and inv the matrix inverse.
The value of V is computed by weighted least squares; since input has four feature maps, four sets of line parameters are obtained.
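A numpy sketch of Eqs. (2)-(3) for one edge feature map, where every pixel votes for y = a*x + b with weight W[i, j]; the helper name is illustrative:

```python
import numpy as np

def fit_edge_line(weight_map):
    """weight_map: (H, W) feature map for one edge; returns V = (a, b)."""
    h, w = weight_map.shape
    y_map, x_map = np.mgrid[0:h, 0:w]   # pixel y/x coordinate sub-feature-maps
    wts = weight_map.ravel()
    m = np.stack([x_map.ravel(), np.ones(h * w)], axis=1) * wts[:, None]  # W*[X_map, 1]
    target = y_map.ravel() * wts        # W*Y_map
    # V = inv(T(M)*M)*T(M)*(W*Y_map); np.linalg.lstsq would be numerically
    # safer, but the explicit inverse mirrors Eq. (3)
    v = np.linalg.inv(m.T @ m) @ (m.T @ target)
    return v                            # (a, b) of the edge line y = a*x + b
```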
S70. Determine a second card detection result of the target frame according to the third key point information.
In this embodiment, the third key point information includes multiple edge line parameters of the grayscale image, and the shape of the object contained in the grayscale image can be determined from those parameters.
If the multiple edge line parameters determine a rectangle, the object contained in the grayscale image can be judged to be a card; the corner coordinates of the card are determined from those edge line parameters, and the corner coordinates of the card in the target frame are determined from them in turn.
If the multiple edge line parameters determine that the object is not a rectangle, it can be judged that the object contained in the grayscale image is not a card, and flag information that the target frame contains no card is generated. A sketch of the corner computation follows.
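A sketch of the rectangle/corner step: intersecting the four fitted lines y = a*x + b pairwise gives candidate corner coordinates; the helper is illustrative only:

```python
def intersect(l1, l2):
    """l1, l2: (a, b) of y = a*x + b; returns their intersection, or None if parallel."""
    (a1, b1), (a2, b2) = l1, l2
    if abs(a1 - a2) < 1e-9:
        return None  # parallel lines never meet: no corner here
    x = (b2 - b1) / (a1 - a2)
    return (x, a1 * x + b1)

# e.g. corners = [intersect(top, left), intersect(top, right),
#                 intersect(bottom, left), intersect(bottom, right)]
```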
The card edge detection method provided by this embodiment applies when the target frame is the first frame of the target video or the adjacent frame contains no card. The method first derives a grayscale image from the target frame and feeds it into the edge detection model, reducing the data volume and improving the efficiency of edge detection. Moreover, the edge detection model in this embodiment is an end-to-end neural network model whose training/prediction result is directly the multiple edge line parameters of the grayscale image; besides increasing detection speed, its fitting quality is better than the staged processing method of the prior art (the method in the Background).
Further, by combining the card edge detection methods of the Fig. 4 and Fig. 1 embodiments, card edge detection is achieved for every video frame of each target video. Once a frame containing a card has been obtained for the first time, key point tracking of subsequent frames can proceed with the method of the Fig. 1 embodiment; as long as tracking keeps succeeding, the process stays in the key point tracking loop of the Fig. 1 embodiment, achieving highly accurate and efficient card edge detection. If tracking fails, i.e. a frame without a card appears in the target video, this usually means the card in the video is being replaced; card edge detection is then performed directly with the method of the Fig. 4 embodiment, whose end-to-end edge detection model likewise supports real-time, efficient card edge detection, and once the updated card's edges are obtained, the process re-enters the key point tracking loop of the Fig. 1 embodiment. This repeats until a card edge detection result has been obtained for every video frame of the target video, achieving efficient, high-accuracy detection of the target video that can be applied to real-time card detection on mobile terminals.
In this embodiment, after the corner coordinates of the card in the target frame are obtained, the edge information of the card in the target frame can be computed directly from them. Because the target frame is shrunk before entering the edge detection model and the corner coordinates of the grayscale image are enlarged afterwards, the edges of the target frame may carry errors after enlargement back to the original image. To improve the accuracy of edge detection, after the grayscale image is enlarged and the edges of the card in the target frame are obtained, those edges can be corrected to improve the accuracy of the final card edge detection of the target frame, as illustrated below with the embodiments of Figs. 5 and 6.
Fig. 5 is a schematic flowchart of determining the second card detection result provided by an embodiment of this application, describing one possible implementation of S70 in the Fig. 4 embodiment. As shown in Fig. 5, determining the second card detection result of the target frame from the third key point information includes:
S701. If the multiple edge line parameters can determine a rectangle, determine multiple corner coordinates of the card to be detected from the multiple edge line parameters; the card to be detected is the card contained in the target frame.
In this embodiment, when the multiple edge line parameters determine a rectangle, the grayscale image can be judged to contain a card, i.e. the target frame contains a card, and the card contained in the target frame may be the card to be detected.
In this embodiment, the corner coordinates of the card contained in the grayscale image are determined from the multiple edge line parameters, and those corner coordinates are then enlarged by a preset ratio to obtain the multiple corner coordinates of the card to be detected.
The preset ratio is the reduction ratio used when preprocessing the target frame in the Fig. 4 embodiment.
It will be understood that the card to be detected has four corner points, and this step yields its four corner coordinates.
S702. Crop, according to the multiple corner coordinates, multiple edge regions of the card to be detected, the multiple edge regions corresponding one-to-one to the multiple corner points.
In this embodiment, a region of interest is determined for each corner point from the multiple corner coordinates, and cropping those regions of interest yields the multiple edge regions corresponding one-to-one to the multiple corner points.
A region of interest is a region to be processed, cropped from the target frame as a box, circle, ellipse, irregular polygon, or the like; in this embodiment a box may be used for cropping.
S703. Determine the edge line corresponding to each edge region, and take that edge line as an edge line of the card to be detected.
In this embodiment, the method of determining the edge line corresponding to each edge region is the same for all regions.
For example, an edge region may be partitioned into multiple sub-regions; after the target line segment of each sub-region is determined, the multiple target line segments are fitted to obtain the edge line corresponding to that edge region.
A target line segment is an edge line segment of a sub-region.
In the method provided by this embodiment, fitting multiple target line segments to obtain the edge line of an edge region effectively reduces the error introduced by image scaling, improving the accuracy of each edge region's edge line and thus the accuracy of the edge lines of the card to be detected.
Fig. 6 is a schematic flowchart of determining the edge line corresponding to each edge region provided by an embodiment of this application, describing one possible implementation of S703 in the Fig. 5 embodiment. As shown in Fig. 6, determining the edge line corresponding to each edge region includes:
S7031. Perform edge detection on a first edge region to obtain an edge image of the first edge region along a first direction; the first edge region is any one of the multiple edge regions, and the first direction is the direction of any one edge of the card to be detected.
Edges consist of pixels where the image's pixel values jump (gradient changes); based on this property, edge detection of the first edge region can be performed with the Sobel operator.
The Sobel operator consists of two 3x3 matrices, an X-direction matrix and a Y-direction matrix. Plane-convolving the first edge region's image with each of the two matrices yields the approximate X-direction and Y-direction gradients of the first edge region, from which its edges in the X and Y directions can be obtained.
In this embodiment, the first direction is the direction of any one edge of the card to be detected, whose constituent elements include content and edges.
In one embodiment, since a plane image admits convolution in two directions, X (left-right) and Y (up-down), the first direction may be determined from the position of the first edge region relative to the content of the card to be detected.
For example, when the first edge region lies to the left or right of the card content, the first direction is the Y direction; when it lies above or below the card content, the first direction is the X direction.
In another embodiment, the first direction is a preset direction; to obtain the edge line of the first edge region, the first edge region may first be flipped and the flipped region then plane-convolved along the first direction.
Flipping includes horizontal flipping and vertical flipping.
Illustratively, refer also to Fig. 7, a schematic diagram of the first edge region and the first direction provided by an embodiment of this application. As shown in Fig. 7, the first edge region is the rectangular region selected by the dashed box; it may be any one of the four edge regions of the card to be detected, i.e. any one of ①, ②, ③, and ④.
In this example the first direction is the Y direction, and the Sobel operator is taken as the Y-direction matrix. So that the card content always lies to the right of the first edge region in the plane-convolved edge image, the first edge region may first be flipped.
If the first edge region is ①, plane convolution with the Y-direction matrix is applied directly to obtain its Y-direction edge image, the card content then lying to the right of the first edge region.
If the first edge region is ②, it is first flipped horizontally and then plane-convolved with the Y-direction matrix to obtain its Y-direction edge image, the card content again lying to the right of the first edge region.
If the first edge region is ③, it is first flipped vertically (clockwise) and then plane-convolved with the Y-direction matrix to obtain its Y-direction edge image, the card content again lying to the right of the first edge region.
If the first edge region is ④, it is first flipped vertically (counterclockwise) and then plane-convolved with the Y-direction matrix to obtain its Y-direction edge image, the card content again lying to the right of the first edge region.
It should be understood that a different preset first direction entails a different flipping direction; likewise, a different relative position of the card content and the edge line entails a different flipping direction.
In this step, edge detection is performed on the first edge region to obtain its edge image along the first direction, with the relative position of the card content and the target edge fixed in the edge image. A sketch of this step follows.
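A sketch of S7031, assuming OpenCV; mapping the disclosure's "Y-direction matrix" onto OpenCV's (dx, dy) arguments is an assumption here (dx=1 responds to vertical edge lines):

```python
import cv2

def directional_edge_image(region, flip_code=None):
    """region: grayscale edge region; flip_code: None, 1 (horizontal), 0 (vertical)."""
    if flip_code is not None:
        region = cv2.flip(region, flip_code)  # keep the card content to the right
    grad = cv2.Sobel(region, cv2.CV_64F, 1, 0, ksize=3)  # 3x3 Sobel gradient
    return cv2.convertScaleAbs(grad)  # 8-bit edge image along one direction
```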
S7032. Divide the edge image into N sub-images and binarize each sub-image to obtain N binarized sub-images, where N is an integer greater than 1.
In this embodiment, the edge image may be divided evenly into N sub-images.
In this embodiment, each sub-image may be adaptively binarized by Otsu's method to obtain the corresponding N binarized sub-images.
S7033. Perform line detection on the N binarized sub-images to obtain N target lines and the 2N endpoints of those N target lines, where a target line is the line in a binarized sub-image closest to the target edge, the target edge being determined from the first direction.
In this embodiment, for each of the N binarized sub-images, line detection yields the multiple lines contained in that sub-image, and the line among them closest to the target edge is taken as the target line.
The target edge is the edge of the sub-image closest to the content of the card to be detected, and can be determined from the first direction.
Illustratively, refer also to Fig. 8, a schematic diagram of a sub-image provided by an embodiment of this application. As shown in Fig. 8, line detection yields the two line segments contained in the sub-image, Z1(PQ) and Z2(RS). In this example the card content always lies to the right of the first edge region in the plane-convolved edge image, so the target edge in Fig. 8 is Z3.
As Fig. 8 shows, of the two line segments, Z2(RS) is closer to the target edge Z3, so Z2(RS) is determined to be the target line, yielding its two endpoints R and S.
Applying this step to the N sub-images of the first edge region yields 2N endpoints.
S7034. Fit a line to the 2N endpoints to obtain the edge line corresponding to the first edge region.
In this embodiment, the line fitting may be performed with the RANSAC algorithm. A sketch of S7032-S7034 follows.
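A sketch of S7032-S7034, assuming OpenCV/numpy; the HoughLinesP parameters and the "right-most segment" selection rule are illustrative, and cv2.fitLine with a robust (Huber) cost stands in here for the RANSAC fit named above:

```python
import cv2
import numpy as np

def fit_region_edge(edge_img, n_subimages=4):
    """edge_img: 8-bit edge image of one region; returns a fitted edge line or None."""
    points, offset = [], 0
    for sub in np.array_split(edge_img, n_subimages, axis=0):  # N sub-images
        _, binary = cv2.threshold(sub, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        segs = cv2.HoughLinesP(binary, 1, np.pi / 180, threshold=20,
                               minLineLength=10, maxLineGap=5)
        if segs is not None:
            # target line: the segment closest to the card-content side
            # (the content was kept to the right, so take the right-most one)
            x1, y1, x2, y2 = max(segs[:, 0, :], key=lambda s: s[0] + s[2])
            points += [(x1, y1 + offset), (x2, y2 + offset)]  # 2 endpoints per sub-image
        offset += sub.shape[0]  # map sub-image rows back to edge_img coordinates
    if len(points) < 2:
        return None
    vx, vy, x0, y0 = cv2.fitLine(np.array(points, dtype=np.float32),
                                 cv2.DIST_HUBER, 0, 0.01, 0.01).ravel()
    return vx, vy, x0, y0  # direction vector and a point on the fitted edge line
```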
In the method for determining each edge region's edge line provided by the embodiments of this application, each edge region is partitioned; after the target line of each sub-image is obtained, a fit through the target lines' multiple endpoints yields the edge line corresponding to the edge region. This effectively reduces the error introduced by image scaling and improves the accuracy of each edge region's edge line, and thus the accuracy of the edge lines of the card to be detected.
Lightweight convolutional neural network models of the prior art, such as the ShuffleNet network model, usually include a channel shuffle layer to handle the computation of multi-channel images. In this embodiment the image input to the edge detection model is a grayscale image, so channel shuffling is unnecessary; to further reduce the computational load, the embodiments of this application further optimize the network structure of the prior-art ShuffleNet network model.
The card edge detection method, device, and storage medium of this application can be used for processing medical data, helping to improve the efficiency, security, or stability of medical data processing, and for rapid identification of patient identity documents.
Fig. 9 is a schematic diagram of the encoder's network structure provided by an embodiment of this application. As shown in Fig. 9, each network node of the encoder includes a first branch and a second branch operating in parallel: the first branch consists of a sequentially connected average pooling layer, 1*1 convolutional layer, and up-sampling layer, while the second branch consists of a 1*1 convolutional layer.
In this embodiment, the first branch extracts local features of the grayscale image and the second branch extracts its global features; after the local and global features are obtained, they are joined by a concatenation layer and the result serves as the decoder's input. Specifically, the concatenation layer may be implemented with the Concat function.
In this embodiment, the average pooling layer in the first branch down-samples the grayscale image and passes scale-invariant features to the next layer (the 1*1 convolutional layer); the 1*1 convolutional layer extracts local features of the incoming feature map; and BN in Fig. 9 mainly normalizes the image's distribution to speed up learning.
The up-sampling layer in the first branch may perform up-sampling by bilinear interpolation. A sketch of one such node follows.
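A minimal PyTorch sketch of one such encoder node; the module names and the pooling stride are assumptions, since the disclosure only fixes the branch layout:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EncoderNode(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.pool = nn.AvgPool2d(kernel_size=2)        # down-sample, pass scale-invariant features
        self.local_conv = nn.Conv2d(in_ch, out_ch, 1)  # 1*1 conv: local features
        self.global_conv = nn.Conv2d(in_ch, out_ch, 1) # 1*1 conv: global features
        self.bn = nn.BatchNorm2d(2 * out_ch)           # BN normalizes the joint distribution

    def forward(self, x):
        local = self.local_conv(self.pool(x))
        local = F.interpolate(local, size=x.shape[2:],
                              mode="bilinear", align_corners=False)  # bilinear up-sampling
        return self.bn(torch.cat([local, self.global_conv(x)], dim=1))  # Concat of both branches
```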
The encoder network structure provided by the embodiments of this application streamlines the encoder of the prior-art lightweight convolutional neural network by removing the channel shuffle layer, further lowering the computational load of the edge detection model and raising its speed, so as to meet the real-time card edge detection needs of mobile terminals.
It should be understood that the order of the step numbers in the above embodiments does not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of this application in any way.
Based on the card edge detection method provided by the above embodiments, the embodiments of this application further provide apparatus embodiments implementing the above method embodiments.
Fig. 10 is a schematic structural diagram of the card edge detection apparatus provided by an embodiment of this application. As shown in Fig. 10, the card edge detection apparatus 80 includes a first obtaining module 801, a second obtaining module 802, a position tracking module 803, and a first determining module 804, where:
the first obtaining module 801 is configured to obtain a target frame to be processed in a target video;
the second obtaining module 802 is configured to obtain, according to the position of the target frame, first key point information of an adjacent frame of the target frame, where the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information includes corner point information of the card;
the position tracking module 803 is configured to input the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, where the determination information indicates whether the target frame contains a card;
the first determining module 804 is configured to determine a first card detection result of the target frame according to the second key point information and the determination information.
Optionally, the position tracking module 803 inputting the first key point information and the target frame into the key point position tracking model to obtain the second key point information of the target frame specifically includes:
determining, according to the first key point information, a first reference position of the object contained in the target frame;
in the 1st iteration, inputting the 1st reference position and the target frame into the key point position tracking model to obtain the multiple key points of the 1st iteration and the iteration error of the 1st iteration, and updating the 1st reference position according to that iteration error to obtain the 2nd reference position; the multiple key points of the 1st iteration lie on first reference lines, which comprise the edge lines determined from the 1st reference position;
in the i-th iteration, inputting the multiple key points of the (i-1)-th iteration and the target frame into the key point position tracking model to obtain the multiple key points of the i-th iteration and the iteration error of the i-th iteration, and updating the i-th reference position according to that iteration error to obtain the (i+1)-th reference position, where i is an integer greater than 1, the multiple key points of the i-th iteration lie on second reference lines, and the second reference lines comprise the edge lines determined from the i-th reference position;
after a preset number of iterations, obtaining the multiple key points of the current iteration and determining the second key point information from them; the second key point information includes the intersection coordinates of the reference lines in the current iteration.
Optionally, the first determining module 804 determining the card detection result of the target frame according to the second key point information and the determination information specifically includes:
when the determination information indicates that the target frame contains a card, determining from the second key point information the edge information of the card contained in the target frame; the edge information includes the parameters of the edge lines and the corner coordinates.
Fig. 11 is a schematic structural diagram of the card edge detection apparatus provided by another embodiment of this application. As shown in Fig. 11, the card edge detection apparatus 80 further includes a preprocessing module 805, an edge detection module 806, and a second determining module 807;
the preprocessing module 805 is configured to preprocess the target frame to obtain a grayscale image of the target frame when the target frame is the first frame of the target video or the adjacent frame contains no card, the size of the grayscale image being smaller than that of the target frame;
the edge detection module 806 is configured to input the grayscale image into the edge detection model to obtain third key point information of the grayscale image, the edge detection model being an end-to-end neural network model and the third key point information including multiple edge line parameters of the grayscale image;
the second determining module 807 is configured to determine a second card detection result of the target frame according to the third key point information.
Optionally, the second determining module 807 determining the second card detection result of the target frame according to the third key point information specifically includes:
if the multiple edge line parameters can determine a rectangle, determining multiple corner coordinates of the card to be detected from the multiple edge line parameters, the card to be detected being the card contained in the target frame;
cropping, according to the multiple corner coordinates, multiple edge regions of the card to be detected, the multiple edge regions corresponding one-to-one to the multiple corner points;
determining the edge line corresponding to each edge region, and taking that edge line as an edge line of the card to be detected.
Optionally, the second determining module 807 determining the edge line corresponding to each edge region specifically includes:
performing edge detection on a first edge region to obtain an edge image of the first edge region along a first direction, the first edge region being any one of the multiple edge regions and the first direction being the direction of any one edge of the card to be detected;
dividing the edge image into N sub-images and binarizing each sub-image to obtain N binarized sub-images, where N is an integer greater than 1;
performing line detection on the N binarized sub-images to obtain N target lines and the 2N endpoints of those N target lines, where a target line is the line in a binarized sub-image closest to the target edge, the target edge being determined from the first direction;
fitting a line to the 2N endpoints to obtain the edge line corresponding to the first edge region.
Optionally, the edge detection model is a lightweight convolutional neural network; the edge detection model includes a sequentially connected encoder, decoder, and linear regression sub-model;
where the encoder is used to obtain multiple local features of the grayscale image and to classify the grayscale image's pixel values according to those local features, yielding local pixel values corresponding to different elements, the elements including edge lines;
the decoder is used to match the local pixel values with the pixels of the grayscale image;
the linear regression sub-model is used to determine the multiple edge line parameters from the pixels matched to the edge lines;
where the optimal solution of the linear regression model satisfies weighted least squares.
Optionally, a network node of the encoder includes a first branch and a second branch operating in parallel;
where the first branch includes a sequentially connected average pooling layer, 1*1 convolutional layer, and up-sampling layer, and the second branch includes one 1*1 convolutional layer.
The card edge detection apparatus provided by the embodiments shown in Figs. 10 and 11 can be used to execute the technical solutions of the above method embodiments; its implementation principles and technical effects are similar and are not repeated here.
Fig. 12 is a schematic diagram of the card edge detection device provided by an embodiment of this application. As shown in Fig. 12, the card edge detection device 90 includes at least one processor 901, a memory 902, and a computer program stored in the memory 902 and runnable on the processor 901. The card edge detection device further includes a communication component 903, with the processor 901, the memory 902, and the communication component 903 connected via a bus 904.
When executing the computer program, the processor 901 implements the steps in each of the above card edge detection method embodiments, for example steps S10 to S40 in the embodiment shown in Fig. 1; alternatively, when executing the computer program, the processor 901 implements the functions of the modules/units in each of the above apparatus embodiments, for example the functions of modules 801 to 804 shown in Fig. 10.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 902 and executed by the processor 901 to complete this application. The one or more modules/units may be a series of computer program instruction segments capable of accomplishing particular functions, used to describe the execution process of the computer program in the card edge detection device 90.
Those skilled in the art will appreciate that Fig. 12 is merely an example of the card edge detection device and does not limit it; the device may include more or fewer components than shown, or combine certain components, or have different components, such as input/output devices, network access devices, buses, etc.
The card edge detection device in the embodiments of this application may be a mobile terminal, including but not limited to a smartphone, a tablet computer, a personal digital assistant, an e-book reader, etc.
The card edge detection device may also be a terminal device, a server, etc., which is not specifically limited here.
The so-called processor 901 may be a central processing unit (CPU, Central Processing Unit), or another general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), an application-specific integrated circuit (ASIC, Application Specific Integrated Circuit), a field-programmable gate array (FPGA, Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 902 may be an internal storage unit of the card edge detection device, or an external storage device of the card edge detection device, such as a plug-in hard disk, a smart media card (SMC, Smart Media Card), a secure digital (SD, Secure Digital) card, a flash card, etc. The memory 902 is used to store the computer program and other programs and data required by the card edge detection device. The memory 902 may also be used to temporarily store data that has been or will be output.
The bus may be an industry standard architecture (ISA, Industry Standard Architecture) bus, a peripheral component interconnect (PCI, Peripheral Component Interconnect) bus, an extended industry standard architecture (EISA, Extended Industry Standard Architecture) bus, etc. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of representation, the buses in the figures of this application are not limited to a single bus or a single type of bus.
An embodiment of this application further provides a computer-readable storage medium, which may be non-volatile or volatile; the computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps in each of the above method embodiments.
An embodiment of this application provides a computer program product which, when run on the card edge detection device, causes the card edge detection device to implement the steps in each of the above method embodiments.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes of the methods in the above embodiments of this application may be completed by instructing the relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, some intermediate forms, etc. The computer-readable medium may at least include: any entity or device capable of carrying the computer program code to the photographing apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disc. In some jurisdictions, according to legislation and patent practice, a computer-readable medium may not be an electrical carrier signal or a telecommunications signal.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed or recorded in one embodiment, refer to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will realize that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
In the embodiments provided by this application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the apparatus/network device embodiments described above are merely illustrative; for instance, the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e. they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
The above embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent substitutions for some of their technical features; such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all be included within the protection scope of this application.

Claims (20)

  1. A card edge detection method, comprising:
    obtaining a target frame to be processed in a target video;
    obtaining, according to the position of the target frame, first key point information of an adjacent frame of the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information comprises corner point information of the card;
    inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, wherein the determination information indicates whether the target frame contains a card;
    determining a first card detection result of the target frame according to the second key point information and the determination information.
  2. The card edge detection method according to claim 1, wherein inputting the first key point information and the target frame into the key point position tracking model to obtain the second key point information of the target frame comprises:
    determining, according to the first key point information, a first reference position of an object contained in the target frame;
    in the 1st iteration, inputting the 1st reference position and the target frame into the key point position tracking model to obtain multiple key points of the 1st iteration and an iteration error of the 1st iteration, and updating the 1st reference position according to that iteration error to obtain a 2nd reference position, wherein the multiple key points of the 1st iteration lie on first reference lines, and the first reference lines comprise edge lines determined from the 1st reference position;
    in the i-th iteration, inputting the multiple key points of the (i-1)-th iteration and the target frame into the key point position tracking model to obtain multiple key points of the i-th iteration and an iteration error of the i-th iteration, and updating the i-th reference position according to that iteration error to obtain an (i+1)-th reference position, wherein i is an integer greater than 1, the multiple key points of the i-th iteration lie on second reference lines, and the second reference lines comprise edge lines determined from the i-th reference position;
    after a preset number of iterations, obtaining multiple key points of the current iteration, and determining the second key point information from the multiple key points of the current iteration, wherein the second key point information comprises intersection coordinates of the reference lines in the current iteration.
  3. The card edge detection method according to claim 1, wherein determining the card detection result of the target frame according to the second key point information and the determination information comprises:
    when the determination information indicates that the target frame contains a card, determining from the second key point information the edge information of the card contained in the target frame, the edge information comprising parameters of edge lines and corner coordinates.
  4. The card edge detection method according to any one of claims 1-3, wherein after obtaining the target frame to be processed in the target video, the method further comprises:
    when the target frame is the first frame of the target video, or the adjacent frame contains no card, preprocessing the target frame to obtain a grayscale image of the target frame, the size of the grayscale image being smaller than that of the target frame;
    inputting the grayscale image into an edge detection model to obtain third key point information of the grayscale image, the edge detection model being an end-to-end neural network model and the third key point information comprising multiple edge line parameters of the grayscale image;
    determining a second card detection result of the target frame according to the third key point information.
  5. The card edge detection method according to claim 4, wherein determining the second card detection result of the target frame according to the third key point information comprises:
    when the multiple edge line parameters can determine a rectangle, determining multiple corner coordinates of a card to be detected from the multiple edge line parameters, the card to be detected being the card contained in the target frame;
    cropping, according to the multiple corner coordinates, multiple edge regions of the card to be detected, the multiple edge regions corresponding one-to-one to the multiple corner points;
    determining an edge line corresponding to each edge region, and taking the edge line as an edge line of the card to be detected.
  6. The card edge detection method according to claim 5, wherein determining the edge line corresponding to each edge region comprises:
    performing edge detection on a first edge region to obtain an edge image of the first edge region along a first direction, the first edge region being any one of the multiple edge regions and the first direction being the direction of any one edge of the card to be detected;
    dividing the edge image into N sub-images and binarizing each sub-image to obtain N binarized sub-images, wherein N is an integer greater than 1;
    performing line detection on the N binarized sub-images to obtain N target lines and 2N endpoints of the N target lines, wherein a target line is the line in a binarized sub-image closest to a target edge, the target edge being determined from the first direction;
    fitting a line to the 2N endpoints to obtain the edge line corresponding to the first edge region.
  7. The card edge detection method according to claim 4, wherein the edge detection model is a lightweight convolutional neural network;
    the edge detection model comprises a sequentially connected encoder, decoder, and linear regression sub-model;
    wherein the encoder is used to obtain multiple local features of the grayscale image and to classify the pixel values of the grayscale image according to the multiple local features, obtaining local pixel values corresponding to different elements, the elements comprising edge lines;
    the decoder is used to match the local pixel values with the pixels of the grayscale image;
    the linear regression sub-model is used to determine the multiple edge line parameters from the pixels matched to the edge lines;
    wherein the optimal solution of the linear regression model satisfies weighted least squares.
  8. The card edge detection method according to claim 7, wherein a network node of the encoder comprises a first branch and a second branch operating in parallel;
    wherein the first branch comprises a sequentially connected average pooling layer, 1*1 convolutional layer, and up-sampling layer, and the second branch comprises one 1*1 convolutional layer.
  9. A card edge detection apparatus, comprising:
    a first obtaining module, configured to obtain a target frame to be processed in a target video;
    a second obtaining module, configured to obtain, according to the position of the target frame, first key point information of an adjacent frame of the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information comprises corner point information of the card;
    a position tracking module, configured to input the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, wherein the determination information indicates whether the target frame contains a card;
    a first determining module, configured to determine a first card detection result of the target frame according to the second key point information and the determination information.
  10. A card edge detection device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements:
    obtaining a target frame to be processed in a target video;
    obtaining, according to the position of the target frame, first key point information of an adjacent frame of the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information comprises corner point information of the card;
    inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, wherein the determination information indicates whether the target frame contains a card;
    determining a first card detection result of the target frame according to the second key point information and the determination information.
  11. The card edge detection device according to claim 10, wherein the processor, when executing the computer program, further implements:
    determining, according to the first key point information, a first reference position of an object contained in the target frame;
    in the 1st iteration, inputting the 1st reference position and the target frame into the key point position tracking model to obtain multiple key points of the 1st iteration and an iteration error of the 1st iteration, and updating the 1st reference position according to that iteration error to obtain a 2nd reference position, wherein the multiple key points of the 1st iteration lie on first reference lines, and the first reference lines comprise edge lines determined from the 1st reference position;
    in the i-th iteration, inputting the multiple key points of the (i-1)-th iteration and the target frame into the key point position tracking model to obtain multiple key points of the i-th iteration and an iteration error of the i-th iteration, and updating the i-th reference position according to that iteration error to obtain an (i+1)-th reference position, wherein i is an integer greater than 1, the multiple key points of the i-th iteration lie on second reference lines, and the second reference lines comprise edge lines determined from the i-th reference position;
    after a preset number of iterations, obtaining multiple key points of the current iteration, and determining the second key point information from the multiple key points of the current iteration, wherein the second key point information comprises intersection coordinates of the reference lines in the current iteration.
  12. The card edge detection device according to claim 10, wherein the processor, when executing the computer program, further implements:
    when the determination information indicates that the target frame contains a card, determining from the second key point information the edge information of the card contained in the target frame, the edge information comprising parameters of edge lines and corner coordinates.
  13. The card edge detection device according to any one of claims 10-12, wherein the processor, when executing the computer program, further implements:
    when the target frame is the first frame of the target video, or the adjacent frame contains no card, preprocessing the target frame to obtain a grayscale image of the target frame, the size of the grayscale image being smaller than that of the target frame;
    inputting the grayscale image into an edge detection model to obtain third key point information of the grayscale image, the edge detection model being an end-to-end neural network model and the third key point information comprising multiple edge line parameters of the grayscale image;
    determining a second card detection result of the target frame according to the third key point information.
  14. The card edge detection device according to claim 13, wherein the processor, when executing the computer program, further implements:
    when the multiple edge line parameters can determine a rectangle, determining multiple corner coordinates of a card to be detected from the multiple edge line parameters, the card to be detected being the card contained in the target frame;
    cropping, according to the multiple corner coordinates, multiple edge regions of the card to be detected, the multiple edge regions corresponding one-to-one to the multiple corner points;
    determining an edge line corresponding to each edge region, and taking the edge line as an edge line of the card to be detected.
  15. The card edge detection device according to claim 14, wherein the processor, when executing the computer program, further implements:
    performing edge detection on a first edge region to obtain an edge image of the first edge region along a first direction, the first edge region being any one of the multiple edge regions and the first direction being the direction of any one edge of the card to be detected;
    dividing the edge image into N sub-images and binarizing each sub-image to obtain N binarized sub-images, wherein N is an integer greater than 1;
    performing line detection on the N binarized sub-images to obtain N target lines and 2N endpoints of the N target lines, wherein a target line is the line in a binarized sub-image closest to a target edge, the target edge being determined from the first direction;
    fitting a line to the 2N endpoints to obtain the edge line corresponding to the first edge region.
  16. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements:
    obtaining a target frame to be processed in a target video;
    obtaining, according to the position of the target frame, first key point information of an adjacent frame of the target frame, wherein the adjacent frame precedes the target frame on the time axis of the target video, the adjacent frame contains a card, and the first key point information comprises corner point information of the card;
    inputting the first key point information and the target frame into a preset key point position tracking model to obtain second key point information of the target frame and determination information of the target frame, wherein the determination information indicates whether the target frame contains a card;
    determining a first card detection result of the target frame according to the second key point information and the determination information.
  17. The computer-readable storage medium according to claim 16, wherein the computer program, when executed by the processor, further implements:
    determining, according to the first key point information, a first reference position of an object contained in the target frame;
    in the 1st iteration, inputting the 1st reference position and the target frame into the key point position tracking model to obtain multiple key points of the 1st iteration and an iteration error of the 1st iteration, and updating the 1st reference position according to that iteration error to obtain a 2nd reference position, wherein the multiple key points of the 1st iteration lie on first reference lines, and the first reference lines comprise edge lines determined from the 1st reference position;
    in the i-th iteration, inputting the multiple key points of the (i-1)-th iteration and the target frame into the key point position tracking model to obtain multiple key points of the i-th iteration and an iteration error of the i-th iteration, and updating the i-th reference position according to that iteration error to obtain an (i+1)-th reference position, wherein i is an integer greater than 1, the multiple key points of the i-th iteration lie on second reference lines, and the second reference lines comprise edge lines determined from the i-th reference position;
    after a preset number of iterations, obtaining multiple key points of the current iteration, and determining the second key point information from the multiple key points of the current iteration, wherein the second key point information comprises intersection coordinates of the reference lines in the current iteration.
  18. The computer-readable storage medium according to claim 16, wherein the computer program, when executed by the processor, further implements:
    when the determination information indicates that the target frame contains a card, determining from the second key point information the edge information of the card contained in the target frame, the edge information comprising parameters of edge lines and corner coordinates.
  19. The computer-readable storage medium according to any one of claims 16-18, wherein the computer program, when executed by the processor, further implements:
    when the target frame is the first frame of the target video, or the adjacent frame contains no card, preprocessing the target frame to obtain a grayscale image of the target frame, the size of the grayscale image being smaller than that of the target frame;
    inputting the grayscale image into an edge detection model to obtain third key point information of the grayscale image, the edge detection model being an end-to-end neural network model and the third key point information comprising multiple edge line parameters of the grayscale image;
    determining a second card detection result of the target frame according to the third key point information.
  20. The computer-readable storage medium according to claim 19, wherein the computer program, when executed by the processor, further implements:
    when the multiple edge line parameters can determine a rectangle, determining multiple corner coordinates of a card to be detected from the multiple edge line parameters, the card to be detected being the card contained in the target frame;
    cropping, according to the multiple corner coordinates, multiple edge regions of the card to be detected, the multiple edge regions corresponding one-to-one to the multiple corner points;
    determining an edge line corresponding to each edge region, and taking the edge line as an edge line of the card to be detected.
PCT/CN2020/125083 2020-09-22 2020-10-30 Card edge detection method, device and storage medium WO2021147437A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011002908.8 2020-09-22
CN202011002908.8A CN112183517B (zh) 2020-09-22 2020-09-22 Card edge detection method, device and storage medium

Publications (1)

Publication Number Publication Date
WO2021147437A1 true WO2021147437A1 (zh) 2021-07-29

Family

ID=73956308

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/125083 WO2021147437A1 (zh) 2020-09-22 2020-10-30 Card edge detection method, device and storage medium

Country Status (2)

Country Link
CN (1) CN112183517B (zh)
WO (1) WO2021147437A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610799A (zh) * 2021-08-04 2021-11-05 沭阳九鼎钢铁有限公司 Artificial-intelligence-based method, apparatus and device for detecting rainbow patterns on photovoltaic panels

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837285B (zh) * 2021-01-29 2022-07-26 山东建筑大学 Edge detection method and apparatus for board surface images
CN112991280B (zh) * 2021-03-03 2024-05-28 望知科技(深圳)有限公司 Visual inspection method and system, and electronic device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109584276A (zh) * 2018-12-04 2019-04-05 北京字节跳动网络技术有限公司 Key point detection method, apparatus, device and readable medium
US20190295267A1 (en) * 2014-07-29 2019-09-26 Alibaba Group Holding Limited Detecting specified image identifiers on objects
CN110660078A (zh) * 2019-08-20 2020-01-07 平安科技(深圳)有限公司 Object tracking method and apparatus, computer device and storage medium
CN110929738A (zh) * 2019-11-19 2020-03-27 上海眼控科技股份有限公司 Card edge detection method, apparatus and device, and readable storage medium
CN111027495A (zh) * 2019-12-12 2020-04-17 京东数字科技控股有限公司 Method and apparatus for detecting human body key points
CN111461209A (zh) * 2020-03-30 2020-07-28 深圳市凯立德科技股份有限公司 Model training apparatus and method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2517674A (en) * 2013-05-17 2015-03-04 Wonga Technology Ltd Image capture using client device
CN105678347A (zh) * 2014-11-17 2016-06-15 中兴通讯股份有限公司 Pedestrian detection method and device
EP3300524A1 (en) * 2015-08-06 2018-04-04 Accenture Global Services Limited Condition detection using image processing
US10977520B2 (en) * 2018-12-18 2021-04-13 Slyce Acquisition Inc. Training data collection for computer vision
CN111464716B (zh) * 2020-04-09 2022-08-19 腾讯科技(深圳)有限公司 Certificate scanning method, apparatus, device and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190295267A1 (en) * 2014-07-29 2019-09-26 Alibaba Group Holding Limited Detecting specified image identifiers on objects
CN109584276A (zh) * 2018-12-04 2019-04-05 北京字节跳动网络技术有限公司 Key point detection method, apparatus, device and readable medium
CN110660078A (zh) * 2019-08-20 2020-01-07 平安科技(深圳)有限公司 Object tracking method and apparatus, computer device and storage medium
CN110929738A (zh) * 2019-11-19 2020-03-27 上海眼控科技股份有限公司 Card edge detection method, apparatus and device, and readable storage medium
CN111027495A (zh) * 2019-12-12 2020-04-17 京东数字科技控股有限公司 Method and apparatus for detecting human body key points
CN111461209A (zh) * 2020-03-30 2020-07-28 深圳市凯立德科技股份有限公司 Model training apparatus and method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610799A (zh) * 2021-08-04 2021-11-05 沭阳九鼎钢铁有限公司 Artificial-intelligence-based method, apparatus and device for detecting rainbow patterns on photovoltaic panels
CN113610799B (zh) * 2021-08-04 2022-07-08 沭阳九鼎钢铁有限公司 Artificial-intelligence-based method, apparatus and device for detecting rainbow patterns on photovoltaic panels

Also Published As

Publication number Publication date
CN112183517B (zh) 2023-08-11
CN112183517A (zh) 2021-01-05

Similar Documents

Publication Publication Date Title
WO2021147437A1 (zh) Card edge detection method, device and storage medium
WO2021164228A1 (zh) Method and system for selecting an augmentation strategy for image data
CN110348294B (zh) Method and apparatus for locating charts in PDF documents, and computer device
CN109344727B (zh) ID card text information detection method and apparatus, readable storage medium and terminal
CN110852311A (zh) Method and apparatus for locating key points of a three-dimensional human hand
WO2021164269A1 (zh) Attention-mechanism-based disparity map acquisition method and apparatus
WO2021151319A1 (zh) Card border detection method, apparatus and device, and readable storage medium
WO2023124040A1 (zh) Face recognition method and apparatus
CN112052845A (zh) Image recognition method, apparatus and device, and storage medium
CN111626295A (zh) Training method and apparatus for a license plate detection model
CN110321908A (zh) Image recognition method, terminal device and computer-readable storage medium
CN111160242A (zh) Image target detection method and system, electronic terminal and storage medium
CN111104941B (zh) Image orientation correction method and apparatus, and electronic device
WO2022199395A1 (zh) Face liveness detection method, terminal device and computer-readable storage medium
CN112597940B (zh) Certificate image recognition method and apparatus, and storage medium
CN112488054B (zh) Face recognition method and apparatus, terminal device and storage medium
CN114842478A (zh) Text region recognition method, apparatus and device, and storage medium
CN114444565A (zh) Image tampering detection method, terminal device and storage medium
CN112348008A (zh) Certificate information recognition method and apparatus, terminal device and storage medium
WO2020244076A1 (zh) Face recognition method and apparatus, electronic device and storage medium
CN113228105A (zh) Image processing method and apparatus, and electronic device
CN116486151A (zh) Image classification model training method, image classification method, device and storage medium
CN116246298A (zh) Space occupancy counting method, terminal device and storage medium
WO2023060575A1 (zh) Image recognition method and apparatus, electronic device and storage medium
CN115147469A (zh) Registration method, apparatus, device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20915385

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20915385

Country of ref document: EP

Kind code of ref document: A1