WO2020024939A1 - Copywriting region recognition method and device (文案区域识别方法和装置) - Google Patents

Copywriting region recognition method and device

Info

Publication number
WO2020024939A1
WO2020024939A1 (PCT/CN2019/098414)
Authority
WO
WIPO (PCT)
Prior art keywords
information
area
copywriting
pixel
copy
Prior art date
Application number
PCT/CN2019/098414
Other languages
English (en)
French (fr)
Inventor
吴立薪
吕晶晶
包勇军
陈晓东
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东尚科信息技术有限公司 and 北京京东世纪贸易有限公司
Priority to EP19845554.5A priority Critical patent/EP3812965A4/en
Publication of WO2020024939A1 publication Critical patent/WO2020024939A1/zh
Priority to US17/155,168 priority patent/US11763167B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/242Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/84Arrangements for image or video recognition or understanding using pattern recognition or machine learning using probabilistic graphical models from image or video features, e.g. Markov models or Bayesian networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20016Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20072Graph-based image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection

Definitions

  • the present disclosure relates to the field of image processing technology, and in particular, to a copywriting region recognition method and device.
  • E-commerce websites display a large number of product advertisements, both on and off the website, in the form of pictures every day.
  • The inventory of these advertisement pictures reaches the level of 10 billion, and at least hundreds of thousands are added every day.
  • The platform formulates image copy design specifications and conducts a manual review after users upload images, a process that often requires a great deal of time and labor.
  • A copywriting region identification method is provided, which includes: extracting feature information of multiple layers of an image to be processed; separately encoding the feature information of the multiple layers; jointly decoding according to the multi-layer encoding information to obtain a joint decoded output; obtaining pixel information according to the joint decoded output, wherein the pixel information includes the distance between each pixel point and the border of the copywriting area and the rotation angle information of the copywriting area; and determining the border position of the copywriting area according to the pixel information.
  • Obtaining the pixel information according to the joint decoded output includes: fusing the joint decoded output of each pixel with the joint decoded output of at least one of the previous pixel and the next pixel to obtain fused decoding information; and obtaining the pixel information according to the fused decoding information.
  • the feature information of multiple layers is extracted through a Convolutional Neural Network (CNN).
  • Encoding the extracted features separately includes: inputting the feature information into a GCN (Global Convolutional Network) to obtain the encoding information of each layer.
  • The GCN obtains a first code by passing each feature through a 1×k convolution followed by a k×1 convolution, where k is a predetermined constant; a second code is obtained by passing each feature through a k×1 convolution followed by a 1×k convolution; the first code and the second code are summed and passed through a further convolution to obtain the encoding information.
  • Jointly decoding according to the multi-layer encoding information includes: decoding the highest-level features to obtain the highest-level decoded output; then, in order from high level to low level, jointly decoding using the decoded output of the previous layer and the encoding information of the current layer and outputting the result to the next layer, until the current layer is the lowest layer, at which point the joint decoding information is output.
  • Joint decoding using the decoded output of the previous layer and the encoding information of the current layer includes: 2× sampling the encoding information of the current layer, concatenating it with the decoded output of the previous layer, and outputting the result after convolution.
  • Obtaining the pixel information according to the fused decoding information includes: obtaining, through a convolution with a depth of 5, the distance of each pixel point from the border of the copy area in four directions and the rotation angle information of the copy area. Determining the copy area based on the pixel information includes: determining the border position of the copy area by a non-maximum suppression algorithm according to these distances and the rotation angle information.
  • Obtaining the pixel information according to the fused decoding information further includes: obtaining, through a convolution with a depth of 1, the probability that the position of each pixel point is in the copy area. Determining the copy area based on the pixel information further includes: screening for pixels whose probability of being located in the copy area is greater than or equal to a predetermined threshold; the border position of the copy area is then determined by the non-maximum suppression algorithm according to the distances of the screened pixels from the copy border in the four directions and the rotation angle information of the copy area.
  • the copywriting area identification method further includes: reviewing the copywriting area according to a predetermined copywriting review rule; and rejecting the copywriting solution corresponding to the image to be processed if the copywriting area does not satisfy the predetermined copywriting review rule.
  • the predetermined copywriting review rule includes at least one of the following: the copywriting font size is within a predetermined font size range; or, the copywriting area does not occupy a predetermined protection area.
  • A copywriting area recognition device is provided, including: a feature extraction module configured to extract feature information of multiple layers of an image to be processed; a codec module configured to separately encode the feature information of the multiple layers and jointly decode according to the multi-layer encoding information to obtain a joint decoded output; a pixel information acquisition module configured to obtain pixel information according to the joint decoded output, where the pixel information includes the distance between each pixel and the border of the copy area and the rotation angle information of the copy area; and an area determination module configured to determine the border position of the copy area according to the pixel information.
  • The pixel information acquisition module includes: a front-back information fusion unit configured to fuse the joint decoded output of each pixel with the joint decoded output of at least one of the previous pixel and the next pixel to obtain the fused decoding information; and a coordinate regression unit configured to obtain the pixel information according to the fused decoding information.
  • The codec module includes a GCN unit configured to obtain the encoding information of each layer according to the feature information.
  • The GCN unit is configured to obtain a first code by passing each feature through a 1×k convolution and then a k×1 convolution, where k is a predetermined constant; to obtain a second code by passing each feature through a k×1 convolution and then a 1×k convolution; and to sum the first code and the second code and pass the sum through a further convolution to obtain the encoding information.
  • The codec module includes a decoding unit configured to decode the highest-level features to obtain the highest-level decoded output and, in order from high level to low level, to jointly decode using the decoded output of the previous layer and the encoding information of the current layer and output the result to the next layer, until the current layer is the lowest layer, at which point the joint decoding information is output.
  • The coordinate regression unit is configured to obtain, through a convolution with a depth of 5 on the fused decoding information, the distance of each pixel point from the border of the copy area in four directions and the rotation angle information of the copy area.
  • The area determination module is configured to determine the border position of the copy area by a non-maximum suppression algorithm according to those distances and the rotation angle information of the copy area.
  • The coordinate regression unit is further configured to obtain, through a convolution with a depth of 1 on the fused decoding information, the probability that the position of each pixel point is in the copy area. The area determination module is further configured to screen for pixels whose probability of being located in the copy area is greater than or equal to a predetermined threshold, and to determine the border position of the copy area by the non-maximum suppression algorithm according to the distances of the screened pixels from the copy border in the four directions and the rotation angle information of the copy area.
  • the copywriting area identification device further includes an auditing unit configured to: audit the copywriting area according to a predetermined copywriting review rule; and reject the copywriting solution corresponding to the image to be processed if the copywriting area does not satisfy the predetermined copywriting review rule.
  • A copy area identification device is provided, which includes: a memory; and a processor coupled to the memory, the processor being configured to perform, based on instructions stored in the memory, the copy area identification method according to any one of the above.
  • a computer-readable storage medium on which computer program instructions are stored, and the instructions, when executed by a processor, implement the steps of any one of the copy area identification methods described above.
  • FIG. 1 is a flowchart of an embodiment of a copywriting area identification method according to the present disclosure.
  • FIG. 2 is a hierarchical structure diagram of the ResNet (Residual Network) used in the copywriting area identification method of the present disclosure.
  • FIG. 3 is a flowchart of another embodiment of a method for identifying a copywriting area of the present disclosure.
  • FIG. 4 is a flowchart of an embodiment of obtaining the pixel information and the border position of the copy area in the copy area identification method of the present disclosure.
  • FIG. 5 is a schematic diagram of an embodiment of a copywriting area identification device according to the present disclosure.
  • FIG. 6 is a schematic diagram of an embodiment of a pixel information acquisition module in the copywriting area identification device of the present disclosure.
  • FIG. 7 is a schematic diagram of an embodiment of a codec module in the copywriting area identification device of the present disclosure.
  • FIG. 8 is a schematic diagram of an embodiment of a front-back information fusion unit in a copywriting area recognition device of the present disclosure.
  • FIG. 9 is a schematic diagram of an embodiment of pixel information obtained by a coordinate regression unit in a copywriting area recognition device of the present disclosure.
  • FIG. 10 is a schematic diagram of another embodiment of a copy area identification device according to the present disclosure.
  • FIG. 11 is a schematic diagram of another embodiment of the copywriting area identification device of the present disclosure.
  • A flowchart of an embodiment of the copywriting area identification method of the present disclosure is shown in FIG. 1.
  • In step 101, feature information of multiple layers of an image to be processed is extracted.
  • In step 102, the feature information of the multiple layers is separately encoded, and joint decoding is performed according to the multi-layer encoding information to obtain a joint decoded output.
  • In this way, the encoding information of each layer is mixed into the decoding process, so that a joint decoded output is obtained.
  • In step 103, pixel information is obtained according to the joint decoded output, where the pixel information includes the distance between each pixel point and the border of the copywriting area and the rotation angle information of the copywriting area.
  • The joint decoded output may be passed through a convolution operation of a predetermined depth, and the result of the operation is used as the pixel information.
  • In step 104, the border position of the copy area is determined according to the pixel information.
  • the position of the border of the copy area can be determined based on the relative position of each pixel to the border of the copy area and the pixel position of each pixel.
  • Copywriting detection algorithms in the related art, such as EAST (Efficient and Accurate Scene Text detector), have only been tested on some public English data sets to verify the correctness of the algorithm.
  • However, Chinese copy and English copy in advertising pictures differ considerably, so such technology cannot be used to directly detect the copy area and font size of Chinese advertising pictures.
  • Moreover, the copy forms in advertising pictures are complex and diverse, and the technology does not take this complexity and variability into account.
  • For Chinese copywriting, and especially for long and short copywriting, its effect is poor: it cannot accurately detect the bounding box of the copy, which affects the accuracy of copy detection and of font size discrimination.
  • In the present disclosure, feature information of multiple depths can be obtained through feature extraction, the features of each depth can be taken into account together through encoding and joint decoding, and the border position of the copywriting area can then be determined through the acquisition and analysis of pixel information, improving the speed and accuracy of identifying Chinese copy areas in pictures.
  • the feature information of multiple layers of the image to be processed may be extracted by CNN.
  • A CNN is a locally connected network. Compared with a fully connected network, it has local connectivity and weight sharing: for a given pixel in an image, nearby pixels generally have a greater influence on it than distant ones (local connectivity); in addition, according to the statistical characteristics of natural images, the weights learned in one area of an image can also be used in another area (weight sharing).
  • Weight sharing here means convolution kernel sharing: convolving an image with one convolution kernel extracts one kind of image feature, and different convolution kernels extract different features.
  • A modified ResNet model can be adopted to provide the feature expression of the original input image.
  • The hierarchy of this ResNet model is shown in FIG. 2.
  • A flowchart of another embodiment of the copywriting area identification method of the present disclosure is shown in FIG. 3.
  • In step 301, feature information of multiple layers of an image to be processed is extracted through a CNN.
  • A ResNet-50 model with the max-pool layer removed can be used for feature extraction, and the second to fifth layers are selected for feature analysis.
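To make the multi-layer features concrete, the sketch below computes the spatial sizes of ResNet stage outputs under the common convention that stage i downsamples the input by 2^i. The function name and the stride convention are illustrative assumptions; the disclosure does not specify the exact strides of the modified model.

```python
def feature_pyramid_shapes(h, w, stages=(2, 3, 4, 5)):
    """Spatial size of the feature map output by each ResNet stage,
    assuming stage i downsamples the input by a factor of 2**i."""
    return {f"stage{i}": (h // 2 ** i, w // 2 ** i) for i in stages}

# A 512x512 advertising image would yield a 4-level feature pyramid:
print(feature_pyramid_shapes(512, 512))
```

Under this assumption, the four selected layers provide progressively coarser (more abstract) views of the same image, which the encoder and joint decoder below combine.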
  • In step 302, the feature information is respectively input into the GCN to obtain the encoding information of each layer, and the multi-layer encoding information output by the GCN is then jointly decoded.
  • The GCN obtains a first code by passing each feature through a 1×k convolution followed by a k×1 convolution, where k is a predetermined constant, and obtains a second code by passing each feature through a k×1 convolution followed by a 1×k convolution; the first code and the second code are then summed and passed through a further convolution to obtain the encoding information. Because the GCN can enlarge the receptive field, this approach improves the ability to detect long and short copy.
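The two factorized branches described above can be sketched in NumPy as follows. The `conv2d` and `gcn_block` names, the random placeholder weights, and the single-channel setting are illustrative assumptions; a trained model would use learned multi-channel convolutions. The point of the factorization is that a 1×k followed by a k×1 convolution covers a k×k receptive field with only 2k weights per branch.

```python
import numpy as np

def conv2d(x, kernel):
    """Naive single-channel 'same' convolution with zero padding."""
    kh, kw = kernel.shape
    xp = np.pad(x, ((kh // 2, kh - 1 - kh // 2), (kw // 2, kw - 1 - kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def gcn_block(feature, k=7, seed=0):
    """Sum of two factorized branches: (1xk -> kx1) and (kx1 -> 1xk).
    Weights are random placeholders; a real model would learn them."""
    rng = np.random.default_rng(seed)
    w_1k, w_k1 = rng.standard_normal((1, k)), rng.standard_normal((k, 1))
    v_k1, v_1k = rng.standard_normal((k, 1)), rng.standard_normal((1, k))
    first_code = conv2d(conv2d(feature, w_1k), w_k1)
    second_code = conv2d(conv2d(feature, v_k1), v_1k)
    return first_code + second_code  # summed before the final convolution

feature = np.random.default_rng(1).standard_normal((16, 16))
out = gcn_block(feature, k=7)
print(out.shape)  # (16, 16)
```

The output keeps the spatial size of the input feature map, so each layer's encoding can be passed on to the joint decoder unchanged.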
  • The joint decoding process may include: decoding the highest-level features to obtain the highest-level decoded output; then, in order from high level to low level, jointly decoding using the decoded output of the previous layer and the encoding information of the current layer and outputting the result to the next layer, until the current layer is the lowest layer, at which point the joint decoding information is output.
  • Joint decoding using the decoded output of the previous layer and the encoding information of the current layer includes: 2× sampling the encoding information of the current layer, concatenating it with the decoded output of the previous layer, and outputting the result after a 3×3 convolution.
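A minimal sketch of one joint-decoding step, assuming the 2× sampling serves to bring the coarser map to the resolution of the finer one (the disclosure does not spell out which map is resampled) and using a box filter as a stand-in for the learned 3×3 convolution:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling along both spatial axes."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def conv3x3_box(x):
    """3x3 box filter standing in for the learned 3x3 convolution."""
    p = np.pad(x, 1)
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = p[i:i + 3, j:j + 3].mean()
    return out

def decode_step(prev_decoded, cur_encoded):
    """One joint-decoding step: bring the coarser decoded map to the
    current layer's resolution, splice the two maps together, and mix
    them with a 3x3 convolution (placeholder weights)."""
    up = upsample2x(prev_decoded)           # the 2x sampling step
    spliced = np.stack([up, cur_encoded])   # splicing (channel concat)
    mixed = spliced.mean(axis=0)            # stand-in for a learned channel mix
    return conv3x3_box(mixed)

out = decode_step(np.ones((4, 4)), np.ones((8, 8)))
print(out.shape)  # (8, 8)
```

Repeating this step from the highest layer down to the lowest yields the joint decoding information at the resolution of the lowest selected layer.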
  • In step 303, the joint decoded output of each pixel is fused with the joint decoded output of at least one of the previous pixel and the next pixel to obtain fused decoding information, and the pixel information is obtained according to the fused decoding information.
  • the fusion decoded information of each pixel can have the characteristics of the pixels before and after it, which helps to further improve the accuracy of determining the copy area.
  • In step 304, the fused decoding information is passed through a 3×3 convolution with a depth of 5 to obtain the distance of each pixel point from the border of the copy area in four directions and the rotation angle information of the copy area.
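The per-pixel geometry from step 304 can be turned back into box corners as follows. The function name and corner ordering are illustrative assumptions; the 5 inputs mirror the depth-5 output (4 border distances plus the rotation angle), and for an angle of 0 the distances simply give an axis-aligned rectangle around the pixel.

```python
import numpy as np

def pixel_to_box(x, y, d_top, d_right, d_bottom, d_left, angle):
    """Recover the four corners of the copy box predicted at pixel (x, y)
    from the 5 regressed values (4 border distances + rotation angle).
    Image coordinates: x grows to the right, y grows downward."""
    corners = np.array([
        [-d_left, -d_top], [d_right, -d_top],
        [d_right, d_bottom], [-d_left, d_bottom],
    ], dtype=float)
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return corners @ rot.T + np.array([x, y], dtype=float)

# Unrotated example: a 10x6 box centred on pixel (10, 10)
box = pixel_to_box(10, 10, d_top=3, d_right=5, d_bottom=3, d_left=5, angle=0.0)
print(box.tolist())
```

Each pixel inside a text region thus proposes one candidate box, and non-maximum suppression (step 305) reduces the proposals to the final border positions.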
  • In step 305, the border position of the copy area is determined by a non-maximum suppression algorithm according to the distances of the pixels from the copy border in the four directions and the rotation angle information of the copy area.
  • In step 306, the copy area is reviewed according to a predetermined copy review rule. If the copy area meets the predetermined copy review rule, step 307 is performed; if the copy area does not meet the predetermined copy review rule, step 308 is performed.
  • the predetermined copy review rule may include requiring the copywriting font size to be within the predetermined font size range.
  • The copy font size corresponds to the height or width of the copy area: if the copy is arranged horizontally, the font size corresponds to the height of the copy; if the copy is arranged vertically, the font size corresponds to its width.
  • The height or width of the copy can therefore determine the copy font size, which is then compared with the predetermined font size interval; if the font size is not within that interval, the requirement is not met.
  • In this way, the copy font size can be guaranteed to lie within a predetermined range, avoiding reading difficulties caused by fonts that are too small and aesthetic problems caused by fonts that are too large, thereby optimizing the display effect.
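A minimal sketch of the font-size rule, assuming the font size is taken directly as the box height for horizontal copy (or the box width for vertical copy); the function names and the pixel thresholds are hypothetical:

```python
def copy_font_size(box_width, box_height, horizontal=True):
    """Approximate the font size by the box height for horizontal copy,
    or by the box width for vertical copy, per the rule above."""
    return box_height if horizontal else box_width

def passes_size_rule(box_width, box_height, min_px, max_px, horizontal=True):
    """Check membership in the predetermined font size interval [min_px, max_px]."""
    return min_px <= copy_font_size(box_width, box_height, horizontal) <= max_px

print(passes_size_rule(200, 36, min_px=24, max_px=72))  # True
print(passes_size_rule(200, 12, min_px=24, max_px=72))  # False
```

The protection-area rule described next can be checked in the same spirit, by testing the detected border coordinates for overlap with the protected rectangle.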
  • The predetermined copywriting review rule may include a requirement that the copywriting area does not occupy a predetermined protection area, such as an area where items are displayed in a picture or an area that must not be occupied for design and aesthetic reasons. The coordinates of the border of the copywriting area are matched against the coordinates of the predetermined protection area to ensure that the copy area does not occupy it, avoiding the loss of important picture information caused by the copy.
  • In step 307, it is determined that the copy scheme corresponding to the image to be processed passes.
  • In step 308, the copy scheme corresponding to the image to be processed is rejected.
  • A GCN and a recurrent neural network can be added to fuse long text information and refine short text information, thereby improving the detection accuracy of long and short copy areas in advertising pictures, reducing the manpower required for review, and improving efficiency.
  • An embodiment of obtaining the pixel information and the border position of the copy area in the copy area identification method of the present disclosure is shown in FIG. 4.
  • In step 401, the fused decoding information is passed through a 3×3 convolution with a depth of 1, and the result (a value between 0 and 1) is used as the probability that the position of the pixel point is in the copy area.
  • In step 402, the fused decoding information is passed through a 3×3 convolution with a depth of 5 to obtain the distance of each pixel point from the border of the copywriting area in four directions and the rotation angle information of the copywriting area.
  • In step 403, the probability that the position of each pixel point is in the copy area is compared with a predetermined threshold (such as 0.8). If the probability is greater than or equal to the predetermined threshold, step 405 is performed; if the probability is less than the predetermined threshold, step 404 is performed.
  • In step 404, pixels whose probability of being located in the copy area is less than the predetermined threshold are discarded.
  • In step 405, the non-maximum suppression algorithm is used to determine the border position of the copy area according to the distances of the remaining pixels from the copy border in the four directions and the rotation angle information of the copy area, thereby improving computational efficiency.
  • In this way, pixels determined not to belong to the copywriting area are filtered out first, and only the remaining pixels are further processed to obtain the border of the copywriting area, reducing the amount of calculation and improving processing efficiency.
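The threshold-then-suppress flow of steps 403 to 405 can be sketched with standard non-maximum suppression over axis-aligned boxes (rotated boxes change only the IoU computation; `filter_and_nms` and the 0.5 IoU threshold are illustrative assumptions):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def filter_and_nms(boxes, scores, prob_threshold=0.8, iou_threshold=0.5):
    """Drop boxes whose copy-area probability is below the threshold
    (step 404), then suppress lower-scored overlapping boxes (step 405)."""
    keep_idx = [i for i, s in enumerate(scores) if s >= prob_threshold]
    keep_idx.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in keep_idx:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.95, 0.90, 0.40]
print(filter_and_nms(boxes, scores))  # [(0, 0, 10, 10)]
```

In the example, the third box is removed by the 0.8 probability threshold and the second is suppressed because it heavily overlaps the top-scoring box, leaving one border per copy area.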
  • A schematic diagram of an embodiment of the copy area identification device of the present disclosure is shown in FIG. 5.
  • the feature extraction module 51 can extract feature information of multiple layers of an image to be processed.
  • The feature extraction module 51 may be a CNN, which extracts multiple features of the image from concrete to abstract and improves the accuracy of identifying the Chinese copy area of the picture.
  • the codec module 52 can separately encode the characteristic information of multiple layers, and jointly decode according to the multiple-layered coding information to obtain a joint decoded output.
  • In this way, the encoding information of each layer is mixed into the decoding process, so that a joint decoded output is obtained.
  • the pixel information obtaining module 53 can obtain pixel information according to a joint decoding output, where the pixel information includes a distance between each pixel point and a border of the copywriting area and rotation angle information of the copywriting area.
  • the area determination module 54 can determine the frame position of the copy area based on the pixel information. In some embodiments, the position of the border of the copy area can be determined based on the relative position of each pixel to the border of the copy area and the pixel position of each pixel.
  • Such a device can obtain multi-depth feature information through feature extraction, and simultaneously consider the features of each depth through encoding and joint decoding, and then determine the frame position of the copy area through the acquisition and analysis of pixel information, and improve the recognition of Chinese pictures Speed and accuracy of the area.
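The per-pixel information just described (four distances plus an angle) can be turned into box corners with a little geometry. The sketch below is illustrative only: the patent does not spell out this computation, so the corner ordering, the image-coordinate convention (y growing downward), and the direction of `theta` are assumptions.

```python
import math

def box_from_pixel(x, y, d_left, d_right, d_up, d_down, theta):
    """Recover the rotated bounding box implied by one pixel's predictions.

    The pixel at (x, y) lies d_left/d_right/d_up/d_down from the four
    sides of the box, and the box is rotated by theta radians.
    Returns four (x, y) corners: top-left, top-right, bottom-right,
    bottom-left, in image coordinates (y grows downward).
    """
    # Axis-aligned corner offsets relative to the pixel, before rotation.
    corners = [(-d_left, -d_up), (d_right, -d_up),
               (d_right, d_down), (-d_left, d_down)]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Rotate each corner offset about the pixel, then translate back.
    return [(x + cx * cos_t - cy * sin_t, y + cx * sin_t + cy * cos_t)
            for cx, cy in corners]
```

With `theta = 0` the result reduces to the axis-aligned box spanned by the four distances.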
  • The copy area identification device may further include a review unit 55 capable of reviewing the copy area according to a predetermined copy review rule. If the copy area satisfies the predetermined copy review rule, it is determined that the copy scheme corresponding to the image to be processed passes; if the copy area does not satisfy the predetermined copy review rule, the copy scheme corresponding to the image to be processed is rejected.
  • Such a device can review the copy area according to a predetermined copy review rule and output the review result, avoiding manual operation and improving execution efficiency.
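As a rough illustration of how such review rules might be checked, the sketch below tests a detected box against a font-size interval and a set of protected rectangles. The function name, the axis-aligned box representation, and the thresholds are all hypothetical; the description only states the two rule types (font size in range, no overlap with a protected area).

```python
def review_copy_area(box, protected_areas, min_font, max_font, horizontal=True):
    """Toy review check. Assumes: the copy box is axis-aligned
    (x1, y1, x2, y2); font size is approximated by box height for
    horizontal copy (width for vertical); a rule fails if the box
    intersects any protected rectangle."""
    font = (box[3] - box[1]) if horizontal else (box[2] - box[0])
    if not (min_font <= font <= max_font):
        return False  # font size outside the allowed interval
    for p in protected_areas:
        # Standard overlap test between two axis-aligned rectangles.
        if box[0] < p[2] and p[0] < box[2] and box[1] < p[3] and p[1] < box[3]:
            return False  # copy occludes a protected area
    return True
```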
  • As shown in FIG. 6, the pixel information acquisition module may include a front-back information fusion unit 601 and a coordinate regression unit 602.
  • The front-back information fusion unit 601 can fuse the joint decoding output of each pixel with at least one of the joint decoding output of the previous pixel or of the next pixel to obtain fused decoding information.
  • The coordinate regression unit 602 can obtain pixel information according to the fused decoding information. By using such a device to process an image, the fused decoding information of each pixel can possess the features of the pixels before and after it, which helps to further improve the accuracy of determining the copy area.
  • A schematic diagram of an embodiment of the codec module in the copy area identification device of the present disclosure is shown in FIG. 7.
  • The multi-layer features extracted by the feature extraction module 51, such as the second to fifth layer features, are respectively input into GCNs 2 to 5 and decoded by the multi-layer decoding units.
  • The GCN passes each feature through a 1*k convolution and then a k*1 convolution to obtain a first code, where k is a predetermined constant; passes each feature through a k*1 convolution and then a 1*k convolution to obtain a second code; and then sums the first code and the second code and outputs the sum after convolution to obtain the encoding information.
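The two-branch encoding described above can be sketched on a single-channel feature map as follows. This is a toy version, with hand-picked kernels standing in for learned multi-channel weights; `conv2d` is a naive same-padded cross-correlation written out for clarity.

```python
import numpy as np

def conv2d(x, kernel):
    """'Same'-padded 2-D cross-correlation of a single-channel map."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, kh - 1 - ph), (pw, kw - 1 - pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def gcn_encode(feature, k1k, kk1, k1k_b, kk1_b, mix):
    """Large-kernel encoding in the spirit of the described GCN block:
    one branch applies a 1*k then a k*1 convolution, the other a k*1
    then a 1*k; the two codes are summed and passed through a final
    convolution to obtain the encoding information."""
    first = conv2d(conv2d(feature, k1k), kk1)       # 1*k then k*1
    second = conv2d(conv2d(feature, kk1_b), k1k_b)  # k*1 then 1*k
    return conv2d(first + second, mix)              # sum, then mix conv
```

With identity-like center kernels the output is simply twice the input, which makes the data flow easy to check.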
  • Except for the decoding unit that decodes the highest-layer encoding, each decoding unit jointly decodes using the decoding output of the previous layer and the encoding information of the current layer and outputs the result to the next layer; the decoding unit of the lowest layer outputs the joint decoding information.
  • The decoding unit upsamples the encoding information of the current layer by a factor of 2, concatenates it with the decoding output of the previous layer, and outputs it after a 3*3 convolution, thereby jointly decoding the decoding output of the previous layer and the encoding information of the current layer.
  • Such a device enables the joint decoding of each pixel to possess both high-dimensional and low-dimensional features, enriching the feature content of the joint decoding and improving the accuracy of determining the text area.
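One joint-decoding step as described (resample one layer's encoding by a factor of 2, concatenate with the previous layer's decoding output, then a 3*3 convolution) might look like the following toy sketch. The learned 3*3 convolution is replaced by a channel mean, since its weights are not part of the description, and which feature map gets resampled follows the text as written.

```python
import numpy as np

def upsample2(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def decode_step(prev_decoded, current_encoded):
    """One joint-decoding step: upsample the current layer's encoding
    by 2, concatenate it with the previous layer's decoding output
    along the channel axis, then apply a stand-in for the learned
    3*3 convolution (here: a channel-collapsing mean)."""
    up = upsample2(current_encoded)
    merged = np.concatenate([prev_decoded, up], axis=-1)
    return merged.mean(axis=-1, keepdims=True)  # stand-in for 3*3 conv
```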
  • A schematic diagram of an embodiment of the front-back information fusion unit in the copy area recognition device of the present disclosure is shown in FIG. 8, where the left side is a structure diagram of a Bidirectional Long Short-Term Memory network (BLSTM) and the right side is an unrolled view of a unidirectional Long Short-Term Memory network (LSTM).
  • BLSTM Bidirectional Long Short-Term Memory
  • LSTM Long Short-Term Memory
  • The features of dimension C (C is the number of channels, a positive integer) corresponding to all the windows of each row output by the codec module are input into a bidirectional RNN (Recurrent Neural Network), i.e. a BLSTM, to obtain a 256-dimensional output (256 being the number of RNN hidden units), and the feature size is then changed back to C through a fully connected layer.
  • After processing by the recurrent neural network, each pixel has not only high-dimensional and low-dimensional features but also the features of the pixels before and after it. Such a device can improve the accuracy of the obtained boundary information when processing long copy information.
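The effect of the front-back fusion, where each position's feature absorbs context from both directions along a row, can be illustrated without a full LSTM. In the sketch below, two exponential averages stand in for the forward and backward LSTM passes, and their mean stands in for the fully connected projection; this is an analogy for the data flow, not the described 256-unit BLSTM.

```python
import numpy as np

def bidirectional_fuse(row_feats, alpha=0.5):
    """Toy bidirectional recurrence over one row of per-window features
    (shape: steps x C). A real implementation would use a BLSTM with
    256 hidden units followed by a fully connected layer back to C."""
    fwd = np.zeros_like(row_feats)
    bwd = np.zeros_like(row_feats)
    state = np.zeros(row_feats.shape[1])
    for t in range(len(row_feats)):            # left-to-right pass
        state = alpha * row_feats[t] + (1 - alpha) * state
        fwd[t] = state
    state = np.zeros(row_feats.shape[1])
    for t in reversed(range(len(row_feats))):  # right-to-left pass
        state = alpha * row_feats[t] + (1 - alpha) * state
        bwd[t] = state
    return (fwd + bwd) / 2                     # stand-in projection back to C
```

Feeding in a single impulse shows context spreading to both neighbours, which is the property the fusion unit relies on for long copy.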
  • The coordinate regression unit 602 first passes the output of the front-back information fusion module through a 3*3 convolution (with a depth of 1) to obtain the first pixel information, and then through another, parallel 3*3 convolution (with a depth of 5) to obtain the second pixel information.
  • The amplitude value of each point in the first pixel information represents the probability (between 0 and 1) that the point belongs to text.
  • The second pixel information contains 5 channels. As shown in FIG. 9, the amplitude values of each pixel represent the distance d_left from the pixel to the left side of the bounding box of the copy containing it, the distance d_right to the right side, the distance d_up to the top, the distance d_down to the bottom, and the rotation angle theta of the copy bounding box.
  • The area determination module selects, according to the probability that each pixel lies in the copy area, the pixels whose probability is greater than or equal to a predetermined threshold, and determines the border position of the copy area through a non-maximum suppression algorithm according to the distances of the selected pixels from the copy border in the four directions and the rotation angle information of the copy area.
  • Such a device can first filter out pixels determined not to belong to the copy area, and then further process the remaining pixels to obtain the boundaries of the copy area, reducing the amount of computation and improving processing efficiency.
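The filter-then-suppress procedure can be sketched as follows. For simplicity the IoU here is computed on axis-aligned boxes, whereas the described method handles rotated boxes; the 0.8 score threshold matches the example value given in the description, and the 0.5 IoU threshold is an assumption.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def filter_and_nms(candidates, score_threshold=0.8, iou_threshold=0.5):
    """Drop low-probability pixels, then greedy non-maximum suppression.
    candidates is a list of (score, box) pairs, one per pixel."""
    kept = []
    pool = sorted((c for c in candidates if c[0] >= score_threshold),
                  key=lambda c: c[0], reverse=True)
    for score, box in pool:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(box, kb) < iou_threshold for _, kb in kept):
            kept.append((score, box))
    return kept
```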
  • the copy area identification device includes a memory 1001 and a processor 1002.
  • the memory 1001 may be a magnetic disk, a flash memory, or any other non-volatile storage medium.
  • The memory is configured to store the instructions in the corresponding embodiments of the copy area identification method above.
  • the processor 1002 is coupled to the memory 1001 and may be implemented as one or more integrated circuits, such as a microprocessor or a microcontroller.
  • The processor 1002 is configured to execute instructions stored in the memory, which can improve the speed and accuracy of identifying the copy area in a picture.
  • the copy area identification device may also be as shown in FIG. 11.
  • the copy area identification device 1100 includes a memory 1101 and a processor 1102.
  • the processor 1102 is coupled to the memory 1101 through a BUS bus 1103.
  • The copy area identification device 1100 can also be connected to an external storage device 1105 through the storage interface 1104 to call external data, and can also be connected to a network or another computer system (not shown) through the network interface 1106. Details are not repeated here.
  • A computer-readable storage medium stores computer program instructions that, when executed by a processor, implement the steps of the copy area identification method in the corresponding embodiments.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • the methods and apparatus of the present disclosure may be implemented in many ways.
  • the methods and devices of the present disclosure can be implemented by software, hardware, firmware or any combination of software, hardware, firmware.
  • the above order of the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless otherwise specifically stated.
  • the present disclosure may also be implemented as programs recorded in a recording medium, which programs include machine-readable instructions for implementing the method according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing a method according to the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure provides a copy area identification method and device, relating to the technical field of image processing. A copy area identification method of the present disclosure includes: extracting multi-layer feature information of an image to be processed; encoding the feature information of the multiple layers separately, and jointly decoding according to the multi-layer encoding information to obtain a joint decoding output; obtaining pixel information according to the joint decoding output, where the pixel information includes the distance of each pixel from the border of the copy area and the rotation angle information of the copy area; and determining the border position of the copy area according to the pixel information. With such a method, multi-depth feature information can be obtained through feature extraction; the features of each depth are considered simultaneously through encoding and joint decoding; the border position of the copy area is then determined through the acquisition and analysis of pixel information, improving the speed and accuracy of identifying the copy area in a picture.

Description

Copy area identification method and device
Cross-reference to related applications
This application is based on and claims priority to the CN application No. ZL201810861942.7 filed on August 1, 2018, the disclosure of which is hereby incorporated into this application in its entirety.
Technical field
The present disclosure relates to the technical field of image processing, and in particular to a copy area identification method and device.
Background
Every day, e-commerce websites present a large number of product advertisements in the form of pictures, both on-site and off-site. The stock of these advertisement pictures reaches the level of tens of billions, while the daily increment is at least in the hundreds of thousands. To improve the click-through and conversion rates of picture advertisements, platforms formulate copy design specifications for pictures and manually review pictures after users upload them, a process that often consumes a great deal of time and manpower.
Summary
According to some embodiments of the present disclosure, a copy area identification method is provided, including: extracting multi-layer feature information of an image to be processed; encoding the feature information of the multiple layers separately, and jointly decoding according to the multi-layer encoding information to obtain a joint decoding output; obtaining pixel information according to the joint decoding output, where the pixel information includes the distance of each pixel from the border of the copy area, and the rotation angle information of the copy area; and determining the border position of the copy area according to the pixel information.
In some embodiments, obtaining pixel information according to the joint decoding output includes: fusing the joint decoding output of each pixel with at least one of the joint decoding output of the previous pixel or of the next pixel to obtain fused decoding information; and obtaining the pixel information according to the fused decoding information.
In some embodiments, the multi-layer feature information is extracted by a CNN (Convolutional Neural Network).
In some embodiments, encoding the extracted features separately includes: inputting the feature information into GCNs (Graph Convolutional Networks) respectively to obtain the encoding information of each layer.
In some embodiments, the GCN passes each feature through a 1*k convolution and then a k*1 convolution to obtain a first code, where k is a predetermined constant; passes each feature through a k*1 convolution and then a 1*k convolution to obtain a second code; and sums the first code and the second code and outputs the sum after convolution to obtain the encoding information.
In some embodiments, jointly decoding according to the multi-layer encoding information includes: decoding the highest-layer features to obtain a highest-layer decoding output; and, in order from higher layers to lower layers, jointly decoding using the decoding output of the previous layer and the encoding information of the current layer and outputting to the next layer, until the current layer is the lowest layer, and outputting the joint decoding information.
In some embodiments, jointly decoding using the decoding output of the previous layer and the encoding information of the current layer includes: upsampling the encoding information of the current layer by a factor of 2, concatenating it with the decoding output of the previous layer, and outputting after convolution.
In some embodiments, obtaining pixel information according to the fused decoding information includes: passing the fused decoding information through a convolution with a depth of 5 to obtain the distances of each pixel from the border of the copy area in four directions, and the rotation angle information of the copy area; determining the copy area according to the pixel information includes: determining the border position of the copy area through a non-maximum suppression algorithm according to the distances of each pixel from the copy border in the four directions, and the rotation angle information of the copy area.
In some embodiments, obtaining pixel information according to the fused decoding information further includes: passing the fused decoding information through a convolution with a depth of 1 to obtain the probability that the position of each pixel belongs to the copy area; determining the copy area according to the pixel information further includes: selecting, according to the probability that each pixel lies in the copy area, the pixels whose probability is greater than or equal to a predetermined threshold; determining the border position of the copy area through the non-maximum suppression algorithm consists of: determining the border position of the copy area through the non-maximum suppression algorithm according to the distances of the selected pixels from the copy border in the four directions, and the rotation angle information of the copy area.
In some embodiments, the copy area identification method further includes: reviewing the copy area according to a predetermined copy review rule; and rejecting the copy scheme corresponding to the image to be processed in a case where the copy area does not satisfy the predetermined copy review rule.
In some embodiments, the predetermined copy review rule includes at least one of the following: the copy font size is within a predetermined font size range; or the copy area does not occupy a predetermined protected area.
According to other embodiments of the present disclosure, a copy area identification device is provided, including: a feature extraction module configured to extract multi-layer feature information of an image to be processed; a codec module configured to encode the feature information of the multiple layers separately and jointly decode according to the multi-layer encoding information to obtain a joint decoding output; a pixel information acquisition module configured to obtain pixel information according to the joint decoding output, where the pixel information includes the distance of each pixel from the border of the copy area and the rotation angle information of the copy area; and an area determination module configured to determine the border position of the copy area according to the pixel information.
In some embodiments, the pixel information acquisition module includes: a front-back information fusion unit configured to fuse the joint decoding output of each pixel with at least one of the joint decoding output of the previous pixel or of the next pixel to obtain fused decoding information; and a coordinate regression unit configured to obtain the pixel information according to the fused decoding information.
In some embodiments, the codec module includes a GCN unit configured to obtain the encoding information of each layer according to the feature information.
In some embodiments, the GCN unit is configured to: pass each feature through a 1*k convolution and then a k*1 convolution to obtain a first code, where k is a predetermined constant; pass each feature through a k*1 convolution and then a 1*k convolution to obtain a second code; and sum the first code and the second code and output through convolution to obtain the encoding information.
In some embodiments, the codec module includes a decoding unit configured to: decode the highest-layer features to obtain a highest-layer decoding output; and, in order from higher layers to lower layers, jointly decode using the decoding output of the previous layer and the encoding information of the current layer and output to the next layer, until the current layer is the lowest layer, and output the joint decoding information.
In some embodiments, the coordinate regression unit is configured to: pass the fused decoding information through a convolution with a depth of 5 to obtain the distances of each pixel from the border of the copy area in four directions, and the rotation angle information of the copy area; the area determination module is configured to: determine the border position of the copy area through a non-maximum suppression algorithm according to the distances of each pixel from the copy border in the four directions, and the rotation angle information of the copy area.
In some embodiments, the coordinate regression unit is further configured to: pass the fused decoding information through a convolution with a depth of 1 to obtain the probability that the position of each pixel belongs to the copy area; the area determination module is further configured to: select, according to the probability that each pixel lies in the copy area, the pixels whose probability is greater than or equal to a predetermined threshold; determining the border position of the copy area through the non-maximum suppression algorithm consists of determining the border position of the copy area through the non-maximum suppression algorithm according to the distances of the selected pixels from the copy border in the four directions, and the rotation angle information of the copy area.
In some embodiments, the copy area identification device further includes a review unit configured to: review the copy area according to a predetermined copy review rule; and reject the copy scheme corresponding to the image to be processed in a case where the copy area does not satisfy the predetermined copy review rule.
According to still other embodiments of the present disclosure, a copy area identification device is provided, including: a memory; and a processor coupled to the memory, the processor being configured to execute any one of the copy area identification methods above based on instructions stored in the memory.
According to yet other embodiments of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored, the instructions, when executed by a processor, implementing the steps of any one of the copy area identification methods above.
Brief description of the drawings
The drawings described here are provided for a further understanding of the present disclosure and constitute a part of the present disclosure. The illustrative embodiments of the present disclosure and their descriptions are used to explain the present disclosure and do not constitute an improper limitation of the present disclosure. In the drawings:
FIG. 1 is a flowchart of an embodiment of the copy area identification method of the present disclosure.
FIG. 2 is a diagram of the layer structure of a Resnet (Residual Network) in the copy area identification method of the present disclosure.
FIG. 3 is a flowchart of another embodiment of the copy area identification method of the present disclosure.
FIG. 4 shows an embodiment of obtaining pixel information and obtaining the border position of the copy area in the copy area identification method of the present disclosure.
FIG. 5 is a schematic diagram of an embodiment of the copy area identification device of the present disclosure.
FIG. 6 is a schematic diagram of an embodiment of the pixel information acquisition module in the copy area identification device of the present disclosure.
FIG. 7 is a schematic diagram of an embodiment of the codec module in the copy area identification device of the present disclosure.
FIG. 8 is a schematic diagram of an embodiment of the front-back information fusion unit in the copy area identification device of the present disclosure.
FIG. 9 is a schematic diagram of an embodiment of the pixel information obtained by the coordinate regression unit in the copy area identification device of the present disclosure.
FIG. 10 is a schematic diagram of another embodiment of the copy area identification device of the present disclosure.
FIG. 11 is a schematic diagram of yet another embodiment of the copy area identification device of the present disclosure.
Detailed description
The technical solutions of the present disclosure are described in further detail below with reference to the drawings and embodiments.
A flowchart of an embodiment of the copy area identification method of the present disclosure is shown in FIG. 1.
In step 101, multi-layer feature information of an image to be processed is extracted.
In step 102, the feature information of the multiple layers is encoded separately, and joint decoding is performed according to the multi-layer encoding information to obtain a joint decoding output. In some embodiments, after the feature information of each layer is encoded separately, the encoding or decoding of each layer can be mixed during the decoding process to obtain the joint decoding output.
In step 103, pixel information is obtained according to the joint decoding output, where the pixel information includes the distance of each pixel from the border of the copy area and the rotation angle information of the copy area. In some embodiments, the joint decoding output can be processed by a convolution operation of a predetermined depth, and the resulting output is used as the pixel information.
In step 104, the border position of the copy area is determined according to the pixel information. In some embodiments, the border position of the copy area can be determined based on the relative position of each pixel to the border of the copy area and the pixel position of each pixel.
Copy detection algorithms in the related art, such as EAST (Efficient and Accurate Scene Text Detector), have only been tested on some public English datasets to verify their correctness. However, the Chinese copy in advertisement pictures differs greatly from English copy, so this technique cannot be directly used for copy area detection and font size discrimination in Chinese advertisement pictures. Moreover, the copy in advertisement pictures takes complex and diverse forms; this technique does not take complex and variable Chinese copy into account, performs especially poorly on longer and shorter copy, and cannot accurately detect the bounding box of the copy, which affects the precision of copy detection and of font size discrimination.
With the approach in the above embodiments, multi-depth feature information can be obtained through feature extraction; the features of each depth are considered simultaneously through encoding and joint decoding; the border position of the copy area is then determined through the acquisition and analysis of pixel information, improving the speed and accuracy of identifying the copy area in a picture.
In some embodiments, the multi-layer feature information of the image to be processed can be extracted by a CNN. A CNN is a locally connected network; compared with a fully connected network, it has local connectivity and weight sharing. For a given pixel in an image, pixels closer to that pixel generally have a greater influence on it (local connectivity); in addition, according to the statistical characteristics of natural images, the weights of one region can also be used in another region (weight sharing). Weight sharing here means convolution kernel sharing: convolving a given image with one convolution kernel extracts one kind of image feature, and different convolution kernels can extract different image features. Feature extraction is performed on the image to be processed through a multi-layer CNN; the deeper the network, the more abstract the feature expressions it can extract. In some embodiments, a modified Resnet model can be used as the feature expression of the original input image. A schematic diagram of the layers of the Resnet model is shown in FIG. 2. By using a CNN to perform shallow-to-deep multi-layer feature extraction on the image to be processed, multiple features of the image from the concrete to the abstract can be extracted, improving the accuracy of the copy area in the picture.
A flowchart of another embodiment of the copy area identification method of the present disclosure is shown in FIG. 3.
In step 301, multi-layer feature information of the image to be processed is extracted by a CNN. In some embodiments, to balance precision and computation speed, the Resnet-50 model can be used for feature extraction, with the max-pool layer removed and the second to fifth layers selected for feature analysis.
In step 302, the feature information is input into GCNs respectively to obtain the encoding information of each layer, and joint decoding is performed according to the multi-layer encoding information output by the GCNs. In some embodiments, the GCN passes each feature through a 1*k convolution and then a k*1 convolution to obtain a first code, where k is a predetermined constant; passes each feature through a k*1 convolution and then a 1*k convolution to obtain a second code; and then sums the first code and the second code and outputs the sum after convolution to obtain the encoding information. Since the GCN has the ability to enlarge the receptive field, such a method can improve the ability to detect long and short copy.
In some embodiments, the joint decoding process may include decoding the highest-layer features to obtain a highest-layer decoding output; and, in order from higher layers to lower layers, jointly decoding using the decoding output of the previous layer and the encoding information of the current layer and outputting to the next layer, until the current layer is the lowest layer, and outputting the joint decoding information. In some embodiments, jointly decoding using the decoding output of the previous layer and the encoding information of the current layer includes: upsampling the encoding information of the current layer by a factor of 2, concatenating it with the decoding output of the previous layer, and outputting after a 3*3 convolution. With such a method, the joint decoding of each pixel can possess both high-dimensional and low-dimensional features, enriching the feature content of the joint decoding and improving the accuracy of determining the text area.
In step 303, the joint decoding output of each pixel is fused with at least one of the joint decoding output of the previous pixel or of the next pixel to obtain fused decoding information, and the pixel information is obtained according to the fused decoding information. With such a method, the fused decoding information of each pixel can possess the features of the pixels before and after it, which helps to further improve the accuracy of determining the copy area.
In step 304, the fused decoding information is passed through a 3*3 convolution with a depth of 5 to obtain the distances of each pixel from the border of the copy area in four directions, and the rotation angle information of the copy area.
In step 305, the border position of the copy area is determined through a non-maximum suppression algorithm according to the distances of each pixel from the copy border in the four directions and the rotation angle information of the copy area.
In step 306, the copy area is reviewed according to a predetermined copy review rule. If the copy area satisfies the predetermined copy review rule, step 307 is executed; if the copy area does not satisfy the predetermined copy review rule, step 308 is executed.
In some embodiments, the predetermined copy review rule may include a requirement that the copy font size be within a predetermined font size range. The copy font size corresponds to the height or width of the copy (if the copy is arranged horizontally, the font size corresponds to the copy height; if vertically, the font size corresponds to the copy width). The font size of the copy can thus be determined from its height or width and compared with the predetermined font size interval; if the copy font size is not within the predetermined interval, the requirement is not met.
With such a method, it can be ensured that the copy font size is within a predetermined range, avoiding reading difficulties caused by a font that is too small, or an unattractive appearance caused by a font that is too large, optimizing the display effect.
In other embodiments, the predetermined copy review rule may include a requirement that the copy area not occupy a predetermined protected area, such as the area of the picture that displays the product, or an area that must not be occupied for design or aesthetic reasons. The coordinates of the copy area border are matched against the coordinates of the predetermined protected area to ensure that the text area does not occupy the protected area, preventing the copy from occluding important information in the picture.
In step 307, it is determined that the copy scheme corresponding to the image to be processed passes.
In step 308, the copy scheme corresponding to the image to be processed is rejected.
With such a method, a GCN and a recurrent neural network can be incorporated to fuse long-text information and refine short-text information, improving the detection precision of long and short copy areas in advertisement pictures, reducing the manpower needed for review, and improving efficiency.
An embodiment of obtaining pixel information and obtaining the border position of the copy area in the copy area identification method of the present disclosure is shown in FIG. 4.
In step 401, the fused decoding information is passed through a 3*3 convolution with a depth of 1, and the result (between 0 and 1) is taken as the probability that the position of the pixel belongs to the copy area.
In step 402, the fused decoding information is passed through a 3*3 convolution with a depth of 5 to obtain the distances of each pixel from the border of the copy area in four directions, and the rotation angle information of the copy area.
In step 403, the probability that the position of a pixel belongs to the copy area is compared with a predetermined threshold (e.g., 0.8). If the probability is greater than or equal to the predetermined threshold, step 405 is executed; if the probability is less than the predetermined threshold, step 404 is executed.
In step 404, pixels whose probability of belonging to the copy area is less than the predetermined threshold are discarded.
In step 405, the border position of the copy area is determined through the non-maximum suppression algorithm according to the distances of the pixels from the copy border in the four directions and the rotation angle information of the copy area, which improves computational efficiency.
With such a method, pixels determined not to belong to the copy area can be filtered out first, and the remaining pixels can then be further processed to obtain the boundaries of the copy area, reducing the amount of computation and improving processing efficiency.
A schematic diagram of an embodiment of the copy area identification device of the present disclosure is shown in FIG. 5.
The feature extraction module 51 can extract multi-layer feature information of an image to be processed. In some embodiments, the feature extraction module 51 may be a CNN, which extracts multiple features of the image from the concrete to the abstract, improving the accuracy of identifying the copy area in the picture.
The codec module 52 can encode the feature information of the multiple layers separately, and jointly decode according to the multi-layer encoding information to obtain a joint decoding output. In some embodiments, after the feature information of each layer is encoded separately, the encoding or decoding of each layer can be mixed during the decoding process to obtain the joint decoding output.
The pixel information acquisition module 53 can obtain pixel information according to the joint decoding output, where the pixel information includes the distance of each pixel from the border of the copy area and the rotation angle information of the copy area.
The area determination module 54 can determine the border position of the copy area according to the pixel information. In some embodiments, the border position of the copy area can be determined based on the relative position of each pixel to the border of the copy area and the pixel position of each pixel.
Such a device can obtain multi-depth feature information through feature extraction, consider the features of each depth simultaneously through encoding and joint decoding, and then determine the border position of the copy area through the acquisition and analysis of pixel information, improving the speed and accuracy of identifying the copy area in a picture.
In some embodiments, the copy area identification device may further include a review unit 55 capable of reviewing the copy area according to a predetermined copy review rule. If the copy area satisfies the predetermined copy review rule, it is determined that the copy scheme corresponding to the image to be processed passes; if the copy area does not satisfy the predetermined copy review rule, the copy scheme corresponding to the image to be processed is rejected.
Such a device can review the copy area according to a predetermined copy review rule and output the review result, avoiding manual operation and improving execution efficiency.
A schematic diagram of an embodiment of the pixel information acquisition module in the copy area identification device of the present disclosure is shown in FIG. 6. The pixel information acquisition module may include a front-back information fusion unit 601 and a coordinate regression unit 602.
The front-back information fusion unit 601 can fuse the joint decoding output of each pixel with at least one of the joint decoding output of the previous pixel or of the next pixel to obtain fused decoding information. The coordinate regression unit 602 can obtain pixel information according to the fused decoding information. By using such a device to process an image, the fused decoding information of each pixel can possess the features of the pixels before and after it, which helps to further improve the accuracy of determining the copy area.
A schematic diagram of an embodiment of the codec module in the copy area identification device of the present disclosure is shown in FIG. 7. The multi-layer features extracted by the feature extraction module 51, such as the second to fifth layer features, are respectively input into GCNs 2 to 5 and decoded by the multi-layer decoding units. In some embodiments, the GCN passes each feature through a 1*k convolution and then a k*1 convolution to obtain a first code, where k is a predetermined constant; passes each feature through a k*1 convolution and then a 1*k convolution to obtain a second code; and then sums the first code and the second code and outputs the sum after convolution to obtain the encoding information.
In some embodiments, as shown in FIG. 7, except for the decoding unit that decodes the highest-layer encoding, each decoding unit jointly decodes using the decoding output of the previous layer and the encoding information of the current layer and outputs to the next layer; the decoding unit of the lowest layer outputs the joint decoding information. In some embodiments, the decoding unit upsamples the encoding information of the current layer by a factor of 2, concatenates it with the decoding output of the previous layer, and outputs after a 3*3 convolution, thereby realizing joint decoding of the decoding output of the previous layer and the encoding information of the current layer.
Such a device enables the joint decoding of each pixel to possess both high-dimensional and low-dimensional features, enriching the feature content of the joint decoding and improving the accuracy of determining the text area.
Since the various kinds of promotional copy in advertisement pictures vary greatly, the length of some copy may exceed the range of the neural network's receptive field, making it impossible to obtain a precise bounding box for long copy. The front-back information fusion unit therefore performs information fusion on the output of the codec module. A schematic diagram of an embodiment of the front-back information fusion unit in the copy area identification device of the present disclosure is shown in FIG. 8, where the left side is a structure diagram of a BLSTM (Bidirectional Long Short-Term Memory network) and the right side is an unrolled view of a unidirectional LSTM (Long Short-Term Memory network). The features of dimension C (C denotes the number of channels, a positive integer) corresponding to all the windows of each row output by the codec module are input into a bidirectional RNN (Recurrent Neural Network), i.e. a BLSTM, to obtain a 256-dimensional output (256 being the number of RNN hidden units), and the feature size is then changed back to C through a fully connected layer. After processing by the recurrent neural network, each pixel possesses not only high-dimensional and low-dimensional features but also the features of the pixels before and after it. When processing long copy information, such a device can improve the accuracy of the obtained boundary information.
In some embodiments, the coordinate regression unit 602 first passes the output of the front-back information fusion module through a 3*3 convolution (with a depth of 1) to obtain first pixel information, and then through another, parallel 3*3 convolution (with a depth of 5) to obtain second pixel information. The amplitude value of each point in the first pixel information represents the probability (between 0 and 1) that the point belongs to text. The second pixel information contains 5 channels; as shown in FIG. 9, the amplitude values of each pixel respectively represent the distance d_left from the pixel to the left side of the bounding box of the copy containing it, the distance d_right to the right side, the distance d_up to the top, the distance d_down to the bottom, and the rotation angle theta of the copy bounding box.
The area determination module selects, according to the probability that each pixel lies in the copy area, the pixels whose probability is greater than or equal to a predetermined threshold, and determines the border position of the copy area through a non-maximum suppression algorithm according to the distances of the selected pixels from the copy border in the four directions and the rotation angle information of the copy area.
Such a device can first filter out pixels determined not to belong to the copy area, and then further process the remaining pixels to obtain the boundaries of the copy area, reducing the amount of computation and improving processing efficiency.
A structural schematic diagram of an embodiment of the copy area identification device of the present disclosure is shown in FIG. 10. The copy area identification device includes a memory 1001 and a processor 1002. The memory 1001 may be a magnetic disk, a flash memory, or any other non-volatile storage medium. The memory is used to store the instructions of the corresponding embodiments of the copy area identification method above. The processor 1002 is coupled to the memory 1001 and may be implemented as one or more integrated circuits, such as a microprocessor or a microcontroller. The processor 1002 is used to execute the instructions stored in the memory, and can improve the speed and accuracy of identifying the copy area in a picture.
In some embodiments, the copy area identification device may also be as shown in FIG. 11. The copy area identification device 1100 includes a memory 1101 and a processor 1102. The processor 1102 is coupled to the memory 1101 through a BUS 1103. The copy area identification device 1100 can also be connected to an external storage device 1105 through a storage interface 1104 to call external data, and can also be connected to a network or another computer system (not shown) through a network interface 1106. Details are not repeated here.
In this embodiment, by storing data instructions in the memory and processing those instructions with the processor, the speed and accuracy of identifying the copy area in a picture can be improved.
In another embodiment, a computer-readable storage medium has computer program instructions stored thereon, the instructions, when executed by a processor, implementing the steps of the methods in the corresponding embodiments of the copy area identification method. Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a device, or a computer program product. Therefore, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, optical storage, and so on) containing computer-usable program code.
The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The present disclosure has now been described in detail. Some details well known in the art have not been described in order to avoid obscuring the concept of the present disclosure. Those skilled in the art can fully understand from the above description how to implement the technical solutions disclosed herein.
The methods and devices of the present disclosure may be implemented in many ways. For example, the methods and devices of the present disclosure can be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless otherwise specifically stated. Furthermore, in some embodiments, the present disclosure may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the methods according to the present disclosure.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present disclosure and not to limit them. Although the present disclosure has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the specific implementations of the present disclosure may still be modified, or some technical features may be equivalently replaced, without departing from the spirit of the technical solutions of the present disclosure, and all such modifications shall fall within the scope of the technical solutions claimed by the present disclosure.

Claims (21)

  1. A copy area identification method, comprising:
    extracting multi-layer feature information of an image to be processed;
    encoding the feature information of the multiple layers separately, and jointly decoding according to the multi-layer encoding information to obtain a joint decoding output;
    obtaining pixel information according to the joint decoding output, wherein the pixel information comprises the distance of each pixel from the border of the copy area, and the rotation angle information of the copy area;
    determining the border position of the copy area according to the pixel information.
  2. The copy area identification method according to claim 1, wherein said obtaining pixel information according to the joint decoding output comprises:
    fusing the joint decoding output of each pixel with at least one of the joint decoding output of the previous pixel or of the next pixel to obtain fused decoding information;
    obtaining the pixel information according to the fused decoding information.
  3. The copy area identification method according to claim 1, wherein the multi-layer feature information is extracted by a convolutional neural network (CNN).
  4. The copy area identification method according to claim 1, wherein said encoding the extracted features separately comprises:
    inputting the feature information into graph convolutional networks (GCNs) respectively to obtain the encoding information of each layer.
  5. The copy area identification method according to claim 4, wherein:
    the GCN passes each feature through a 1*k convolution and then a k*1 convolution to obtain a first code, wherein k is a predetermined constant;
    passes each feature through a k*1 convolution and then a 1*k convolution to obtain a second code; and
    sums the first code and the second code, and outputs the sum after convolution to obtain the encoding information.
  6. The copy area identification method according to claim 1, wherein said jointly decoding according to the multi-layer encoding information comprises:
    decoding the highest-layer features to obtain a highest-layer decoding output;
    in order from higher layers to lower layers, jointly decoding using the decoding output of the previous layer and the encoding information of the current layer, and outputting to the next layer, until the current layer is the lowest layer, and outputting the joint decoding information.
  7. The copy area identification method according to claim 6, wherein said jointly decoding using the decoding output of the previous layer and the encoding information of the current layer comprises:
    upsampling the encoding information of the current layer by a factor of 2, concatenating it with the decoding output of the previous layer, and outputting after convolution.
  8. The copy area identification method according to claim 2, wherein:
    said obtaining pixel information according to the fused decoding information comprises:
    passing the fused decoding information through a convolution with a depth of 5 to obtain the distances of each pixel from the border of the copy area in four directions, and the rotation angle information of the copy area;
    said determining the copy area according to the pixel information comprises:
    determining the border position of the copy area through a non-maximum suppression algorithm according to the distances of each pixel from the copy border in the four directions, and the rotation angle information of the copy area.
  9. The copy area identification method according to claim 8, wherein said obtaining pixel information according to the fused decoding information further comprises:
    passing the fused decoding information through a convolution with a depth of 1 to obtain the probability that the position of each pixel belongs to the copy area;
    said determining the copy area according to the pixel information further comprises:
    selecting, according to the probability that each pixel lies in the copy area, the pixels whose probability is greater than or equal to a predetermined threshold;
    said determining the border position of the copy area through a non-maximum suppression algorithm consists of: determining the border position of the copy area through the non-maximum suppression algorithm according to the distances of the selected pixels from the copy border in the four directions, and the rotation angle information of the copy area.
  10. The copy area identification method according to claim 1, further comprising:
    reviewing the copy area according to a predetermined copy review rule;
    rejecting the copy scheme corresponding to the image to be processed in a case where the copy area does not satisfy the predetermined copy review rule.
  11. The copy area identification method according to claim 10, wherein the predetermined copy review rule comprises at least one of the following:
    the copy font size is within a predetermined font size range; or,
    the copy area does not occupy a predetermined protected area.
  12. A copy area identification device, comprising:
    a feature extraction module configured to extract multi-layer feature information of an image to be processed;
    a codec module configured to encode the feature information of the multiple layers separately, and jointly decode according to the multi-layer encoding information to obtain a joint decoding output;
    a pixel information acquisition module configured to obtain pixel information according to the joint decoding output, wherein the pixel information comprises the distance of each pixel from the border of the copy area and the rotation angle information of the copy area;
    an area determination module configured to determine the border position of the copy area according to the pixel information.
  13. The copy area identification device according to claim 12, wherein the pixel information acquisition module comprises:
    a front-back information fusion unit configured to fuse the joint decoding output of each pixel with at least one of the joint decoding output of the previous pixel or of the next pixel to obtain fused decoding information;
    a coordinate regression unit configured to obtain the pixel information according to the fused decoding information.
  14. The copy area identification device according to claim 12, wherein the codec module comprises a graph convolutional network (GCN) unit configured to obtain the encoding information of each layer according to the feature information.
  15. The copy area identification device according to claim 14, wherein the GCN unit is configured to: pass each feature through a 1*k convolution and then a k*1 convolution to obtain a first code, wherein k is a predetermined constant;
    pass each feature through a k*1 convolution and then a 1*k convolution to obtain a second code; and
    sum the first code and the second code, and output the sum after convolution to obtain the encoding information.
  16. The copy area identification device according to claim 12, wherein the codec module comprises a decoding unit configured to:
    decode the highest-layer features to obtain a highest-layer decoding output;
    in order from higher layers to lower layers, jointly decode using the decoding output of the previous layer and the encoding information of the current layer, and output to the next layer, until the current layer is the lowest layer, and output the joint decoding information.
  17. The copy area identification device according to claim 13, wherein:
    the coordinate regression unit is configured to:
    pass the fused decoding information through a convolution with a depth of 5 to obtain the distances of each pixel from the border of the copy area in four directions, and the rotation angle information of the copy area;
    the area determination module is further configured to:
    determine the border position of the copy area through a non-maximum suppression algorithm according to the distances of each pixel from the copy border in the four directions, and the rotation angle information of the copy area.
  18. The copy area identification device according to claim 17, wherein the coordinate regression unit is further configured to:
    pass the fused decoding information through a convolution with a depth of 1 to obtain the probability that the position of each pixel belongs to the copy area;
    the area determination module is configured to:
    select, according to the probability that each pixel lies in the copy area, the pixels whose probability is greater than or equal to a predetermined threshold;
    said determining the border position of the copy area through a non-maximum suppression algorithm consists of: determining the border position of the copy area through the non-maximum suppression algorithm according to the distances of the selected pixels from the copy border in the four directions, and the rotation angle information of the copy area.
  19. The copy area identification device according to claim 12, further comprising a review unit configured to:
    review the copy area according to a predetermined copy review rule;
    reject the copy scheme corresponding to the image to be processed in a case where the copy area does not satisfy the predetermined copy review rule.
  20. A copy area identification device, comprising:
    a memory; and
    a processor coupled to the memory, the processor being configured to execute the method according to any one of claims 1 to 11 based on instructions stored in the memory.
  21. A computer-readable storage medium on which computer program instructions are stored, the instructions, when executed by a processor, implementing the steps of the method according to any one of claims 1 to 11.
PCT/CN2019/098414 2018-08-01 2019-07-30 Copy area identification method and device WO2020024939A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19845554.5A EP3812965A4 (en) 2018-08-01 2019-07-30 TEXT ZONE IDENTIFICATION METHOD AND DEVICE
US17/155,168 US11763167B2 (en) 2018-08-01 2021-01-22 Copy area identification method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810861942.7 2018-08-01
CN201810861942.7A CN110796133B (zh) Copy area identification method and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/155,168 Continuation US11763167B2 (en) 2018-08-01 2021-01-22 Copy area identification method and device

Publications (1)

Publication Number Publication Date
WO2020024939A1 true WO2020024939A1 (zh) 2020-02-06

Family

ID=69231441

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/098414 WO2020024939A1 (zh) 2018-08-01 2019-07-30 文案区域识别方法和装置

Country Status (4)

Country Link
US (1) US11763167B2 (zh)
EP (1) EP3812965A4 (zh)
CN (1) CN110796133B (zh)
WO (1) WO2020024939A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950356A (zh) * 2020-06-30 2020-11-17 深圳市雄帝科技股份有限公司 Seal text positioning method and apparatus, and electronic device

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10616443B1 (en) * 2019-02-11 2020-04-07 Open Text Sa Ulc On-device artificial intelligence systems and methods for document auto-rotation
CN112101165B (zh) * 2020-09-07 2022-07-15 腾讯科技(深圳)有限公司 Point-of-interest identification method and device, computer equipment, and storage medium
CN113453012B (zh) * 2021-06-25 2023-02-28 杭州海康威视数字技术股份有限公司 Encoding and decoding method and device, and electronic device
CN113887535B (zh) * 2021-12-03 2022-04-12 北京世纪好未来教育科技有限公司 Model training method, text recognition method, device, equipment, and medium
CN115575857B (zh) * 2022-12-08 2023-04-28 江西广凯新能源股份有限公司 Emergency protection method and device for fallen high-voltage power lines

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103686181A (zh) * 2013-12-13 2014-03-26 洪雪荣 Encoding method and encoding system for screen display information
CN107247950A (zh) * 2017-06-06 2017-10-13 电子科技大学 Machine-learning-based text recognition method for ID card images
CN107820096A (zh) * 2017-11-09 2018-03-20 京东方科技集团股份有限公司 Image processing device and method, image processing system, and training method
CN108229303A (zh) * 2017-11-14 2018-06-29 北京市商汤科技开发有限公司 Detection and recognition, and training method and device for detection and recognition networks, equipment, and medium

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6532302B2 (en) * 1998-04-08 2003-03-11 Canon Kabushiki Kaisha Multiple size reductions for image segmentation
JP5008572B2 (ja) * 2004-12-21 2012-08-22 キヤノン株式会社 Image processing method, image processing apparatus, and computer-readable medium
US20080310721A1 (en) * 2007-06-14 2008-12-18 John Jinhwan Yang Method And Apparatus For Recognizing Characters In A Document Image
US8462394B2 (en) * 2008-08-05 2013-06-11 Xerox Corporation Document type classification for scanned bitmaps
US20150200998A1 (en) * 2012-01-30 2015-07-16 Google Inc. Displaying portions of a host display area of a host device at a client device
US9424767B2 (en) * 2012-06-18 2016-08-23 Microsoft Technology Licensing, Llc Local rendering of text in image
US9628865B2 (en) * 2012-09-10 2017-04-18 Apple Inc. Enhanced closed caption feature
CN103002288B (zh) * 2012-12-28 2015-10-21 北京视博云科技有限公司 Video image encoding and decoding method and device
CN107391505B (zh) * 2016-05-16 2020-10-23 腾讯科技(深圳)有限公司 Image processing method and system
CN106778867B (zh) * 2016-12-15 2020-07-07 北京旷视科技有限公司 Object detection method and device, and neural network training method and device
CN106960206B (zh) * 2017-02-08 2021-01-01 北京捷通华声科技股份有限公司 Character recognition method and character recognition system
CN107527059B (zh) * 2017-08-07 2021-12-21 北京小米移动软件有限公司 Text recognition method, device, and terminal
CN108171010B (zh) * 2017-12-01 2021-09-14 华南师范大学 Protein complex detection method and device based on a semi-supervised network embedding model
CN108230329B (zh) * 2017-12-18 2021-09-21 孙颖 Semantic segmentation method based on a multi-scale convolutional neural network
US10248664B1 (en) * 2018-07-02 2019-04-02 Inception Institute Of Artificial Intelligence Zero-shot sketch-based image retrieval techniques using neural networks for sketch-image recognition and retrieval

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103686181A (zh) * 2013-12-13 2014-03-26 洪雪荣 Encoding method and encoding system for screen display information
CN107247950A (zh) * 2017-06-06 2017-10-13 电子科技大学 Machine-learning-based text recognition method for ID card images
CN107820096A (zh) * 2017-11-09 2018-03-20 京东方科技集团股份有限公司 Image processing device and method, image processing system, and training method
CN108229303A (zh) * 2017-11-14 2018-06-29 北京市商汤科技开发有限公司 Detection and recognition, and training method and device for detection and recognition networks, equipment, and medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950356A (zh) * 2020-06-30 2020-11-17 深圳市雄帝科技股份有限公司 Seal text positioning method and apparatus, and electronic device
CN111950356B (zh) * 2020-06-30 2024-04-19 深圳市雄帝科技股份有限公司 Seal text positioning method and apparatus, and electronic device

Also Published As

Publication number Publication date
US20210142513A1 (en) 2021-05-13
EP3812965A1 (en) 2021-04-28
CN110796133A (zh) 2020-02-14
EP3812965A4 (en) 2022-03-30
CN110796133B (zh) 2024-05-24
US11763167B2 (en) 2023-09-19

Similar Documents

Publication Publication Date Title
WO2020024939A1 (zh) Copy area identification method and device
Diem et al. cBAD: ICDAR2017 competition on baseline detection
US20220116347A1 (en) Location resolution of social media posts
US11405344B2 (en) Social media influence of geographic locations
US20180122114A1 (en) Method and apparatus for processing video image and electronic device
US8649602B2 (en) Systems and methods for tagging photos
CN103488764B (zh) 个性化视频内容推荐方法和系统
CN110427859A (zh) 一种人脸检测方法、装置、电子设备及存储介质
Pang et al. A robust panel extraction method for manga
JP2010211785A (ja) 顔によって画像をグループ化する方法
US20230169554A1 (en) System and method for automated electronic catalogue management and electronic image quality assessment
WO2020159437A1 (en) Method and system for face liveness detection
Zhang et al. A comprehensive survey on computational aesthetic evaluation of visual art images: Metrics and challenges
CN110647956A (zh) 一种联合二维码识别的发票信息提取方法
CN100371945C (zh) 一种计算机辅助书法作品真伪鉴别方法
EP4195136A1 (en) Automated video generation from images for e-commerce applications
CN110427819A (zh) 一种识别图像中ppt边框的方法及相关设备
Wang et al. Thermal images-aware guided early fusion network for cross-illumination RGB-T salient object detection
CN113807066A (zh) 一种图表生成方法、装置及电子设备
US10198791B2 (en) Automatic correction of facial sentiment of portrait images
Su et al. 2.5 D visual relationship detection
Hoh et al. Salient-centeredness and saliency size in computational aesthetics
CN115546906A (zh) 检测图像中人脸活度的系统和方法及电子设备
CN117597702A (zh) 缩放无关的水印提取
CN116391200A (zh) 缩放不可知水印提取

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19845554

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019845554

Country of ref document: EP

Effective date: 20210122

NENP Non-entry into the national phase

Ref country code: DE