WO2022236875A1 - File scanning method, device, medium and product - Google Patents

File scanning method, device, medium and product

Info

Publication number
WO2022236875A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
feature
structured information
contour feature
target image
Prior art date
Application number
PCT/CN2021/095960
Other languages
French (fr)
Chinese (zh)
Inventor
刘光禄
杨旭
段无悔
张守龙
Original Assignee
广州广电运通金融电子股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州广电运通金融电子股份有限公司
Publication of WO2022236875A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/00681 Detecting the presence, position or size of a sheet or correcting its position before scanning
    • H04N1/00684 Object of the detection
    • H04N1/00721 Orientation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/00681 Detecting the presence, position or size of a sheet or correcting its position before scanning
    • H04N1/00742 Detection methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/00681 Detecting the presence, position or size of a sheet or correcting its position before scanning
    • H04N1/00763 Action taken as a result of detection
    • H04N1/00774 Adjusting or controlling
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/04 Scanning arrangements, i.e. arrangements for the displacement of active reading or reproducing elements relative to the original or reproducing medium, or vice versa
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/21 Intermediate information storage
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The invention relates to the field of file scanning processing, and in particular to a file scanning method, device, medium and product.
  • Preserving documents usually requires the scanned images to face the correct way, that is, the documents must be stored in an orderly manner: upright, with the front side first and the back side after it.
  • The following methods are usually used for document scanning: 1. In accordance with the characteristics of the machine, the documents to be scanned are placed into the machine in a specific, correct direction and order; 2. The documents are placed into the machine in an arbitrary direction, and the image information is later corrected and adjusted manually on the electronic device side. Both methods require manual intervention, which greatly reduces production efficiency.
  • One object of the present invention is to provide a file scanning method that solves the current problem that scanning documents with headers requires manual intervention, which greatly reduces production efficiency.
  • The second object of the present invention is to provide an electronic device that solves the same problem of manual intervention when scanning documents with headers.
  • The third object of the present invention is to provide a computer-readable storage medium that solves the same problem of manual intervention when scanning documents with headers.
  • The fourth object of the present invention is to provide a computer program product that solves the same problem of manual intervention when scanning documents with headers.
  • A file scanning method, applied to the process of scanning a target file containing a header, comprises the following steps:
  • Receiving the target image: receiving the target image obtained by scanning the target file;
  • Extracting contour features: performing contour feature extraction on the characters in the target image to obtain a contour feature structured information set containing several items of contour feature structured information, each of which includes an area value and coordinate information, the coordinate information including an abscissa value and an ordinate value;
  • First filtering processing: filtering all the contour feature structured information in the set according to the pre-stored first feature filtering threshold, and saving the contour feature structured information whose area value is greater than the first feature filtering threshold as first contour feature structured information;
  • Overlap processing: judging whether the projection regions of the first contour feature structured information in the ordinate direction overlap, and saving the first contour feature structured information whose projection regions in the ordinate direction overlap as second contour feature structured information;
  • Second filtering processing: filtering all the first contour feature structured information according to the pre-stored second feature filtering threshold, and saving the first contour feature structured information whose area value is greater than the second feature filtering threshold as second contour feature structured information;
  • Rotation correction: rotating the target image by 180° and outputting the rotated target image to the host computer for storage.
  • Before the first filtering processing step, the method further includes calculating the first feature filtering threshold: sorting all the contour feature structured information in the set by area value from largest to smallest; taking the number of items of contour feature structured information as the first quantity value; calculating the first target serial number from the first preset coefficient and the first quantity value; and storing the area value of the contour feature structured information at the first target serial number as the first feature filtering threshold.
  • Before the second filtering processing step, the method further includes calculating the second feature filtering threshold: sorting the second contour feature structured information by area value from largest to smallest; taking the number of items of second contour feature structured information as the second quantity value; calculating the second target serial number from the second quantity value and the second preset coefficient; and using the area value of the contour feature structured information at the second target serial number as the second feature filtering threshold.
  • Before the step of extracting contour features, the method further includes image preprocessing: performing binarization processing on the target image.
  • The image preprocessing specifically includes binarizing the target image using the average gray threshold method.
  • A connected domain segmentation algorithm is used to perform contour feature extraction on the characters in the target image.
  • Calculating the one-dimensional head-up feature value specifically includes: taking the product of the area value corresponding to each item of second contour feature structured information and its one-dimensional distance feature value as the corresponding one-dimensional head-up feature value.
  • An electronic device, comprising: a processor; a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, and the program includes instructions for performing the file scanning method described in this application.
  • A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the file scanning method described in this application.
  • A computer program product, including a computer program, characterized in that the computer program, when executed by a processor, implements the file scanning method described in this application.
  • The beneficial effect of the present invention is as follows: when scanning a target file containing a header, the file scanning method of the present application identifies the header of the target file by performing contour feature extraction, first filtering processing, overlap processing, second filtering processing and distance feature value calculation on the target image, and then judges whether the facing of the target file is correct according to the obtained one-dimensional head-up feature value and the ordinate value of the center line of the target image.
  • FIG. 1 is a schematic flowchart of a file scanning method of the present invention.
  • The file scanning method in this embodiment is applied to scanning a target file containing a header and, as shown in FIG. 1, specifically comprises the following steps:
  • Receive the target image: receive the target image obtained by scanning the target file.
  • The user places the target document in the corresponding document holder of the scanning device, and the scanning apparatus in the scanning device scans the target document to obtain the target image. At this point the overall width and height (i.e., length) of the target image are also obtained from the target image, and half of the height is stored as the ordinate value of the target image center line.
  • Image preprocessing: the target image is binarized using the average gray threshold method; the width and height of the binarized target image remain unchanged.
  • Extract contour features: use the connected domain segmentation algorithm to perform contour feature extraction on the characters in the target image, obtaining a contour feature structured information set containing several items of contour feature structured information. Each item includes an area value, coordinate information, a width value and a height value, and the coordinate information includes an abscissa value and an ordinate value. The coordinate information is the coordinate point in a coordinate system that takes any one of the four corners of the target image as the origin, the width of the target image as the abscissa and the height of the target image as the ordinate. The abscissa value is the distance from the corresponding character to the abscissa, and the ordinate value is the distance from the corresponding character to the ordinate.
  • Calculate the first feature filtering threshold: sort all the contour feature structured information in the set by area value from largest to smallest, so that the item with the largest area value ranks first, and so on. Take the number of items as the first quantity value, calculate the first target serial number from the first preset coefficient and the first quantity value, and store the area value of the item at the first target serial number as the first feature filtering threshold.
  • Let the first target serial number be $K$, the first quantity value be $A_N$ and the first preset coefficient be $\alpha$; then $K = \alpha \cdot A_N$, where $\alpha \in (0, 1)$. In this embodiment, $\alpha$ is preferably 0.15.
  • For example, if the first quantity value is 100 and the first preset coefficient is 0.15, the first target serial number is 15. The 15th-ranked item is selected from the contour feature structured information sorted from largest to smallest, and its area value is stored as the first feature filtering threshold.
  • First filtering processing: filter all the contour feature structured information in the set according to the pre-stored first feature filtering threshold, and save the contour feature structured information whose area value is greater than the first feature filtering threshold as first contour feature structured information.
  • Overlap processing: judge whether the projection regions of the first contour feature structured information in the ordinate direction overlap, and save the first contour feature structured information whose projection regions in the ordinate direction overlap as second contour feature structured information.
  • Calculate the second feature filtering threshold: sort the second contour feature structured information by area value from largest to smallest, take the number of items of second contour feature structured information as the second quantity value, calculate the second target serial number from the second quantity value and the second preset coefficient, and use the area value of the contour feature structured information at the second target serial number as the second feature filtering threshold.
  • Let the second preset coefficient be $\alpha_t$, the second target serial number be $K_t$ and the second quantity value be $A_{Nt}$; then $K_t = \alpha_t \cdot A_{Nt}$.
  • For example, if the second quantity value is 50 and the second preset coefficient is 0.5, the second target serial number is 25. The 25th-ranked item is selected from the second contour feature structured information sorted from largest to smallest, and its area value is stored as the second feature filtering threshold.
  • Second filtering processing: filter all the first contour feature structured information according to the pre-stored second feature filtering threshold, and save the first contour feature structured information whose area value is greater than the second feature filtering threshold as second contour feature structured information.
  • Calculate the distance feature value: the height of the target image is obtained in the aforementioned step of receiving the target image, and half of that height is used as the centerline ordinate value of the target image. If the height of the target image is $H$, the centerline ordinate value of the target image is $H/2$. Let the one-dimensional distance feature value be $d_j$ and the ordinate value in the second contour feature structured information be $\mathrm{Reg2}[j].y$, where $j = 0, 1, \ldots, A_{N2}$ is the position number of the second contour feature structured information; then $d_j$ is the distance between $\mathrm{Reg2}[j].y$ and the centerline ordinate value $H/2$.
  • Calculate the one-dimensional head-up feature value: the product of the area value corresponding to each item of second contour feature structured information and its one-dimensional distance feature value is used as the corresponding one-dimensional head-up feature value.
  • Facing judgment: judge whether the ordinate value in the second contour feature structured information corresponding to the largest one-dimensional head-up feature value is greater than the centerline ordinate value of the target image. If so, the target image is upside down and the rotation correction step is performed; if not, the target image is upright and is output to the host computer for storage.
  • Let the ordinate value in the second contour feature structured information corresponding to the largest one-dimensional head-up feature value be $Y$ and the centerline ordinate value of the target image be $H/2$, where $H$ is the height of the target image, and judge whether $Y > H/2$ holds. If it holds, the target image is upside down and needs to be corrected. If it does not hold, the target image is upright, needs no correction, and can be output directly to the host computer for storage as the image corresponding to the target file.
  • Rotation correction: rotate the target image by 180° and output the rotated target image to the host computer for storage.
  • An electronic device, including: a processor; a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, and the program includes instructions for performing the file scanning method described in this application.
  • A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the file scanning method described in this application.
  • A computer program product, including a computer program, characterized in that the computer program, when executed by a processor, implements the file scanning method described in this application.
  • When scanning a target file containing a header, the file scanning method of the present application identifies the header of the target file by performing contour feature extraction, first filtering processing, overlap processing, second filtering processing and distance feature value calculation on the target image. It then judges whether the facing of the target file is correct according to the obtained one-dimensional head-up feature value and the ordinate value of the target image center line, and determines from that facing information whether the target image needs automatic rotation correction. This achieves orderly storage of the target image without manual intervention at any point: the user does not need to check the position of the document in the scanning device, the document to be scanned can be placed in the scanning device in any direction, and no manual processing is needed afterwards, which greatly improves the efficiency of digitizing images of physical documents.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Character Input (AREA)

Abstract

A file scanning method, comprising: receiving a target image, extracting contour features, first filtering processing, overlapping processing, second filtering processing, calculating distance feature values, calculating one-dimensional head-up feature values, facing determination, and rotation correction. The described method achieves the orderly storage of the target image, manual intervention is not needed for the entire process, the position of a file in a scanning apparatus does not need to be checked when a user scans the file, a file to be scanned can be placed into the scanning apparatus in any direction, and manual intervention is not needed for a later stage, thereby greatly increasing the digital processing efficiency of a physical information image of existing files.

Description

File scanning method, device, medium and product
Technical field
The invention relates to the field of file scanning processing, and in particular to a file scanning method, device, medium and product.
Background
With the development of the digital information age, information traditionally recorded on paper or physical materials is rapidly being converted and stored with digital technology. Scanning valuable documents used in daily life, such as invoices, deposit receipts and checks, and scanning and archiving documents such as supporting materials, certificates and ID cards, are all forms of digitally converting and storing information. To facilitate subsequent digital information management and improve production efficiency, storing the image information of scanned physical documents according to orderly rules is an important prerequisite.
At present, when physical documents are scanned to store digital image information, preserving the documents usually requires the scanned images to face the correct way, that is, the documents must be stored in an orderly manner: upright, with the front side first and the back side after it. To ensure that the scanned images face the correct way, the following methods are usually used: 1. In accordance with the characteristics of the machine, the documents to be scanned are placed into the machine in a specific, correct direction and order; 2. The documents are placed into the machine in an arbitrary direction, and the image information is later corrected and adjusted manually on the electronic device side. Both methods require manual intervention, which greatly reduces production efficiency.
Summary of the invention
To overcome the deficiencies of the prior art, one object of the present invention is to provide a file scanning method that solves the current problem that scanning documents with headers requires manual intervention, which greatly reduces production efficiency.
The second object of the present invention is to provide an electronic device that solves the same problem of manual intervention when scanning documents with headers.
The third object of the present invention is to provide a computer-readable storage medium that solves the same problem of manual intervention when scanning documents with headers.
The fourth object of the present invention is to provide a computer program product that solves the same problem of manual intervention when scanning documents with headers.
The first object of the present invention is achieved by the following technical solution:
A file scanning method, applied to the process of scanning a target file containing a header, comprising the following steps:
Receiving the target image: receiving the target image obtained by scanning the target file;
Extracting contour features: performing contour feature extraction on the characters in the target image to obtain a contour feature structured information set containing several items of contour feature structured information, each of which includes an area value and coordinate information, the coordinate information including an abscissa value and an ordinate value;
First filtering processing: filtering all the contour feature structured information in the set according to the pre-stored first feature filtering threshold, and saving the contour feature structured information whose area value is greater than the first feature filtering threshold as first contour feature structured information;
Overlap processing: judging whether the projection regions of the first contour feature structured information in the ordinate direction overlap, and saving the first contour feature structured information whose projection regions in the ordinate direction overlap as second contour feature structured information;
Second filtering processing: filtering all the first contour feature structured information according to the pre-stored second feature filtering threshold, and saving the first contour feature structured information whose area value is greater than the second feature filtering threshold as second contour feature structured information;
Calculating the distance feature value: calculating the one-dimensional distance feature value corresponding to each item of second contour feature structured information from the pre-stored centerline ordinate value of the target image and the ordinate value in that item;
Calculating the one-dimensional head-up feature value: calculating the corresponding one-dimensional head-up feature value from the area value and the one-dimensional distance feature value corresponding to each item of second contour feature structured information;
Facing judgment: judging whether the ordinate value in the second contour feature structured information corresponding to the largest one-dimensional head-up feature value is greater than the centerline ordinate value of the target image; if so, the target image is upside down and the rotation correction step is performed; if not, the target image is upright and is output to the host computer for storage;
Rotation correction: rotating the target image by 180° and outputting the rotated target image to the host computer for storage.
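The overlap processing step can be illustrated with a minimal Python sketch. It rests on an interpretation that the text does not spell out: the "projection region in the ordinate direction" of a region is taken to be the interval it covers on the vertical axis, and a region is kept when that interval intersects the interval of at least one other region. The dictionary keys `y` and `height` and both function names are hypothetical.

```python
from typing import Dict, List


def vertical_extent_overlaps(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """Return True if the y-intervals [y, y + height) of two regions intersect."""
    return a["y"] < b["y"] + b["height"] and b["y"] < a["y"] + a["height"]


def keep_overlapping(regions: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Keep regions whose vertical extent overlaps that of at least one other region.

    Assumption: this pairwise test is one plausible reading of the overlap
    processing step, not a confirmed detail of the method.
    """
    return [
        region
        for i, region in enumerate(regions)
        if any(
            i != j and vertical_extent_overlaps(region, other)
            for j, other in enumerate(regions)
        )
    ]
```

Under this reading, characters that sit on the same text line (such as the characters of a header) survive the step, while an isolated mark with no vertical neighbour is dropped.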
Further, before the first filtering processing step, the method also includes calculating the first feature filtering threshold: sorting all the contour feature structured information in the set by area value from largest to smallest, taking the number of items of contour feature structured information as the first quantity value, calculating the first target serial number from the first preset coefficient and the first quantity value, and storing the area value of the contour feature structured information at the first target serial number as the first feature filtering threshold.
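The rank-based threshold described above can be sketched in Python as follows. This is an illustrative sketch, not the claimed implementation: the function names are hypothetical, and rounding the product of coefficient and count to the nearest integer (with a minimum of 1) is an assumption, since the text does not say how a non-integer serial number is handled.

```python
from typing import List, Sequence


def rank_threshold(areas: Sequence[float], coefficient: float) -> float:
    """Return the area at 1-based rank K = coefficient * count, largest first."""
    if not areas:
        raise ValueError("no regions to rank")
    if not 0.0 < coefficient < 1.0:
        raise ValueError("coefficient must lie in (0, 1)")
    ranked = sorted(areas, reverse=True)
    # Assumption: K is rounded to the nearest integer and is at least 1.
    k = max(1, round(coefficient * len(ranked)))
    return ranked[k - 1]


def filter_by_area(areas: Sequence[float], threshold: float) -> List[float]:
    """Keep only the areas strictly greater than the threshold."""
    return [a for a in areas if a > threshold]
```

With 100 regions and a coefficient of 0.15, the serial number is 15 and the threshold is the 15th-largest area, so the strict comparison keeps the 14 larger regions. The same function covers the second threshold when it is called with the second coefficient.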
Further, before the second filtering processing step, the method also includes calculating the second feature filtering threshold: sorting the second contour feature structured information by area value from largest to smallest, taking the number of items of second contour feature structured information as the second quantity value, calculating the second target serial number from the second quantity value and the second preset coefficient, and using the area value of the contour feature structured information at the second target serial number as the second feature filtering threshold.
Further, before the step of extracting contour features, the method also includes image preprocessing: performing binarization processing on the target image.
Further, the image preprocessing specifically includes binarizing the target image using the average gray threshold method.
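A minimal Python sketch of binarization with an average gray threshold is given below. It assumes that the image is a list of rows of gray levels and that characters are darker than the paper, so pixels below the mean become foreground; the polarity, the data layout and the function name are assumptions rather than details taken from the text.

```python
from typing import List


def binarize_mean_threshold(gray: List[List[int]]) -> List[List[int]]:
    """Binarize a grayscale image using its mean gray level as the threshold.

    Assumption: dark pixels are ink, so values below the mean map to 1
    (foreground) and all other values map to 0 (background).
    """
    pixels = [p for row in gray for p in row]
    if not pixels:
        return []
    mean = sum(pixels) / len(pixels)
    # The output has the same number of rows and columns as the input.
    return [[1 if p < mean else 0 for p in row] for row in gray]
```

The output keeps the width and height of the input, which matches the statement in the embodiment that binarization leaves the image dimensions unchanged.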
Further, a connected domain segmentation algorithm is used to perform contour feature extraction on the characters in the target image.
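One way to picture the connected domain segmentation is a breadth-first labeling of foreground pixels that records, for each component, its area and bounding box. The sketch below is written under assumptions: 4-connectivity, an origin at the top-left corner with the ordinate increasing downward, and hypothetical field names (`area`, `x`, `y`, `width`, `height`). The text does not fix any of these choices.

```python
from collections import deque
from typing import Dict, List


def extract_regions(binary: List[List[int]]) -> List[Dict[str, int]]:
    """Label 4-connected foreground components and describe each one.

    Each record holds the pixel count (area) and the bounding box, where
    (x, y) is the top-left corner of the box in image coordinates.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions: List[Dict[str, int]] = []
    for r in range(height):
        for c in range(width):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one component starting from this unseen pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            area, min_r, max_r, min_c, max_c = 0, r, r, c, c
            while queue:
                cr, cc = queue.popleft()
                area += 1
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and binary[nr][nc] == 1 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append({
                "area": area,
                "x": min_c,
                "y": min_r,
                "width": max_c - min_c + 1,
                "height": max_r - min_r + 1,
            })
    return regions
```

Each returned record plays the role of one item of contour feature structured information: an area value, coordinate information, and the width and height mentioned in the embodiment.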
Further, calculating the one-dimensional head-up feature value specifically includes: taking the product of the area value corresponding to each item of second contour feature structured information and its one-dimensional distance feature value as the corresponding one-dimensional head-up feature value.
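The product rule above, together with the facing judgment and the 180° rotation, can be sketched in Python. Two points are assumptions rather than statements from the text: the one-dimensional distance feature value is taken to be the absolute difference between a region's ordinate and the centerline ordinate, and the origin is assumed to be the top-left corner so that a larger ordinate means lower in the image. All function names and dictionary keys are hypothetical.

```python
from typing import Dict, List


def headup_features(regions: List[Dict[str, float]], image_height: float) -> List[float]:
    """Return area * distance-to-centerline for each region."""
    center_y = image_height / 2
    # Assumption: the distance feature is the absolute vertical offset.
    return [r["area"] * abs(r["y"] - center_y) for r in regions]


def is_upside_down(regions: List[Dict[str, float]], image_height: float) -> bool:
    """Judge facing from the region with the largest head-up feature value."""
    if not regions:
        raise ValueError("no candidate header regions")
    features = headup_features(regions, image_height)
    best = max(range(len(regions)), key=features.__getitem__)
    # The header is expected in the upper half of an upright page.
    return regions[best]["y"] > image_height / 2


def rotate_180(image: List[List[int]]) -> List[List[int]]:
    """Rotate an image by 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in image[::-1]]
```

A caller would rotate the image only when `is_upside_down` returns `True` and otherwise store it unchanged, mirroring the two branches of the facing judgment step.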
The second object of the present invention is achieved by the following technical solution:
An electronic device, comprising: a processor;
a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, and the program includes instructions for performing the file scanning method described in this application.
The third object of the present invention is achieved by the following technical solution:
A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the file scanning method described in this application.
The fourth object of the present invention is achieved by the following technical solution:
A computer program product, including a computer program, characterized in that the computer program, when executed by a processor, implements the file scanning method described in this application.
Compared with the prior art, the beneficial effects of the present invention are as follows. When scanning a target file that contains a header, the file scanning method of this application identifies the header by applying contour feature extraction, first filtering, overlap processing, second filtering and distance feature calculation to the target image. It then uses the resulting one-dimensional header feature values and the ordinate value of the center line of the target image to judge whether the orientation of the target file is correct, and decides from that orientation whether the target image needs automatic rotation correction, so that target images are stored in a consistent order. The whole process requires no manual intervention: the user does not need to check how the file is positioned in the scanning device and can insert the file in any direction, and no manual post-processing is needed, which greatly improves the efficiency of digitizing images of physical documents.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the description, preferred embodiments are described in detail below with reference to the accompanying drawing. Specific embodiments of the invention are given in detail by the following embodiments and the drawing.
Description of drawings
The drawing described here provides a further understanding of the present invention and constitutes a part of this application. The schematic embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic flowchart of a file scanning method of the present invention.
Detailed description
The present invention is further described below with reference to the drawing and specific embodiments. Provided that they do not conflict, the embodiments and technical features described below may be combined arbitrarily to form new embodiments.
The file scanning method in this embodiment is applied to scanning a target file that contains a header. As shown in FIG. 1, it specifically comprises the following steps:
Receiving the target image: the target image obtained by scanning the target file is received. In this embodiment, the user places the target file in the document placement area of the scanning device, and the scanning equipment in the device scans the file to obtain the target image. The overall width and height (i.e., length) of the target image are obtained at the same time, the center line of the image is calculated from them, and half the height is stored as the ordinate value of the center line of the target image.
Image preprocessing: the target image is binarized using the average gray threshold method. Binarization does not change the width or height of the target image.
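As an illustration of the preprocessing step, the following minimal sketch binarizes a grayscale image using its mean gray level as the threshold. The function name `binarize_mean_threshold`, the list-of-rows image representation and the convention that dark pixels become foreground (1) are assumptions made for this example; the source only names the average gray threshold method.

```python
def binarize_mean_threshold(gray):
    """Binarize a grayscale image given as a list of rows of 0-255 integers."""
    height = len(gray)
    width = len(gray[0])
    # The threshold is the mean gray level of the whole image.
    mean = sum(sum(row) for row in gray) / (height * width)
    # Assumption: dark ink on light paper, so pixels darker than the mean
    # are treated as foreground (1) and everything else as background (0).
    return [[1 if px < mean else 0 for px in row] for row in gray]


# Made-up 3x4 image: two dark pixels on a light background.
gray = [
    [250, 250, 250, 250],
    [250, 10, 20, 250],
    [250, 250, 250, 250],
]
binary = binarize_mean_threshold(gray)
```

The output has the same number of rows and columns as the input, in line with the statement that binarization leaves the width and height unchanged.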
Extracting contour features: a connected-domain segmentation algorithm is used to extract contour features from the characters in the target image, yielding a contour feature structured information set that contains several pieces of contour feature structured information. Each piece includes an area value, coordinate information, a width value and a height value, and the coordinate information includes an abscissa value and an ordinate value. The coordinate information is a point in a coordinate system whose origin is any one of the four corners of the target image, whose abscissa runs along the width of the target image and whose ordinate runs along its height. The abscissa value is the distance from the corresponding character to the abscissa, and the ordinate value is the distance from the corresponding character to the ordinate.
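The extraction step can be sketched with a plain breadth-first connected-component labeling. This is only one possible reading: the function name `extract_regions`, 4-connectivity, the use of the pixel count as the area value, and the use of the top-left corner of the bounding box as the coordinate are all assumptions, since the source does not fix these details.

```python
from collections import deque


def extract_regions(binary):
    """Return one record per 4-connected foreground component of a 0/1 image."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one component, tracking its pixel count and bounding box.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            min_r = max_r = r
            min_c = max_c = c
            while queue:
                cr, cc = queue.popleft()
                area += 1
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if 0 <= nr < h and 0 <= nc < w and binary[nr][nc] == 1 and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Assumption: area is the pixel count; (x, y) is the bounding-box corner.
            regions.append({
                "area": area,
                "x": min_c,
                "y": min_r,
                "width": max_c - min_c + 1,
                "height": max_r - min_r + 1,
            })
    return regions


binary = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
regions = extract_regions(binary)
```

Each dictionary plays the role of one piece of contour feature structured information, carrying the area value, coordinate information, width value and height value named in the text.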
Calculating the first feature filtering threshold: all pieces of contour feature structured information in the set are sorted by area value in descending order, so that the piece with the largest area value comes first, and so on. The number of pieces is taken as the first quantity value, a first target serial number is calculated from a first preset coefficient and the first quantity value, and the area value of the piece located at the first target serial number is stored as the first feature filtering threshold. In this embodiment, let the first target serial number be K, the first quantity value be A_N and the first preset coefficient be α; then K = α * A_N, where α ∈ (0, 1), and α is preferably 0.15. For example, if the first quantity value is 100 and the first preset coefficient is 0.15, the first target serial number is 15; the piece ranked 15th in the descending order is selected, and its area value is stored as the first feature filtering threshold.
First filtering: all pieces of contour feature structured information in the set are filtered against the pre-stored first feature filtering threshold, and each piece whose area value is greater than the first feature filtering threshold is saved as first contour feature structured information.
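The threshold calculation and the filtering that follows it can be sketched as below, reproducing the worked example with 100 regions and a coefficient of 0.15. The helper names `rank_threshold` and `filter_by_area`, the rounding of K to the nearest integer and the assumption of a non-empty input are choices made for this example; the source does not say how a non-integer K is handled.

```python
def rank_threshold(areas, coeff):
    """Area value at the target serial number K = coeff * N in the descending ranking."""
    ordered = sorted(areas, reverse=True)
    # Assumption: K is rounded to the nearest integer and clamped to 1..N.
    k = max(1, min(len(ordered), round(coeff * len(ordered))))
    return ordered[k - 1]  # serial numbers are 1-based


def filter_by_area(areas, threshold):
    """Keep only the area values strictly greater than the threshold."""
    return [a for a in areas if a > threshold]


areas = list(range(1, 101))  # 100 made-up area values: 1, 2, ..., 100
threshold = rank_threshold(areas, 0.15)  # K = 15, so the 15th largest area
kept = filter_by_area(areas, threshold)
print(threshold, len(kept))  # prints: 86 14
```

Because the comparison is strict, the region at rank K itself is discarded along with everything smaller. The same two helpers would serve the second filtering stage with the second preset coefficient.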
Overlap processing: it is judged whether the projection regions of the pieces of first contour feature structured information in the ordinate direction overlap, and each piece of first contour feature structured information whose projection region in the ordinate direction overlaps is saved as second contour feature structured information.
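One way to read the overlap test is sketched below. It assumes that the projection region in the ordinate direction is the interval [y, y + height) on the vertical axis, and that a region is kept when this interval intersects that of at least one other region; the source does not spell out either point, and the name `keep_overlapping` is hypothetical.

```python
def keep_overlapping(regions):
    """Keep regions whose vertical extent overlaps that of at least one other region."""
    kept = []
    for i, a in enumerate(regions):
        for j, b in enumerate(regions):
            if i == j:
                continue
            # Two half-open intervals [y, y + height) intersect when each
            # one starts before the other one ends.
            if a["y"] < b["y"] + b["height"] and b["y"] < a["y"] + a["height"]:
                kept.append(a)
                break
    return kept


regions = [
    {"name": "A", "y": 10, "height": 20},   # rows 10..29
    {"name": "B", "y": 15, "height": 20},   # rows 15..34, overlaps A
    {"name": "C", "y": 200, "height": 10},  # isolated
]
overlapping = keep_overlapping(regions)
```

Under this reading, characters that share a text line (such as the large characters of a header) survive, while an isolated mark with no vertical neighbor is dropped.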
Calculating the second feature filtering threshold: the pieces of second contour feature structured information are sorted by area value in descending order, the number of pieces is taken as the second quantity value, a second target serial number is calculated from the second quantity value and a second preset coefficient, and the area value of the piece located at the second target serial number is taken as the second feature filtering threshold. In this embodiment, let the second preset coefficient be αt, the second target serial number be Kt and the second quantity value be A_Nt; then Kt = αt * A_Nt, where αt ∈ (0, 1], and αt is preferably 0.5. For example, if the second quantity value is 50 and the second preset coefficient is 0.5, the second target serial number is 25; the piece of second contour feature structured information ranked 25th in the descending order is selected, and its area value is stored as the second feature filtering threshold.
Second filtering: all pieces of first contour feature structured information are filtered against the pre-stored second feature filtering threshold, and each piece of first contour feature structured information whose area value is greater than the second feature filtering threshold is saved as second contour feature structured information.
Calculating distance feature values: a one-dimensional distance feature value is calculated for each piece of second contour feature structured information from the pre-stored center-line ordinate value of the target image and the ordinate value in that piece of information. In this embodiment, the height of the target image is obtained in the receiving step described above, and half of that height is taken as the center-line ordinate value. Let the height of the target image be H; the center-line ordinate value of the target image is then $H/2$ (equation image PCTCN2021095960-appb-000001).
Let the one-dimensional distance feature value be $d_j$ and the ordinate value in the second contour feature structured information be Reg2[j].y, where j is the position index of the second contour feature structured information, j = 0, 1, ..., A_N2. Then $d_j = |\mathrm{Reg2}[j].y - H/2|$ (equation image PCTCN2021095960-appb-000002).
Calculating one-dimensional header feature values: the one-dimensional header feature value of each piece of second contour feature structured information is calculated from its area value and its one-dimensional distance feature value. Specifically, the area value is multiplied by the one-dimensional distance feature value, and the product is taken as the corresponding one-dimensional header feature value.
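The distance and header feature calculations can be sketched together. The absolute difference from the center line is an assumption, because the distance equation survives only as an image reference; the product of area and distance follows the text. The function name `header_features` and the sample values are made up for illustration.

```python
def header_features(regions, image_height):
    """Return area * distance-from-center-line for each region."""
    mid_y = image_height / 2  # center-line ordinate value, H / 2
    features = []
    for reg in regions:
        # Assumption: the one-dimensional distance is |y - H/2|.
        distance = abs(reg["y"] - mid_y)
        features.append(reg["area"] * distance)
    return features


# Made-up regions in an image 1000 pixels high.
regions = [
    {"area": 400, "y": 40},   # large character far from the center line
    {"area": 120, "y": 480},  # small character close to the center line
]
features = header_features(regions, 1000)
print(features)  # prints: [184000.0, 2400.0]
```

The product favors regions that are both large and far from the middle of the page, which is where a header set in large type would be expected.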
Orientation judgment: it is judged whether the ordinate value in the second contour feature structured information corresponding to the largest one-dimensional header feature value is greater than the center-line ordinate value of the target image. If so, the target image is upside down and the rotation correction step is performed; if not, the target image is upright and is output to the host computer for storage. In this embodiment, let Y be the ordinate value in the second contour feature structured information corresponding to the largest one-dimensional header feature value, and let the center-line ordinate value of the target image be $H/2$ (equation image PCTCN2021095960-appb-000003). It is judged whether $Y > H/2$ (equation image PCTCN2021095960-appb-000004) holds. If it holds, the target image is upside down and needs to be corrected. If it does not hold, the target image is upright, needs no correction, and can be output directly to the host computer for storage as the image corresponding to the target file.
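The judgment reduces to picking the region with the largest header feature value and comparing its ordinate with the center line. The sketch below assumes the origin is at the top-left corner so that the ordinate grows downward, and it recomputes the feature as area times |y - H/2|, which is itself an assumption; the name `is_upside_down` is hypothetical.

```python
def is_upside_down(regions, image_height):
    """Return True when the strongest header candidate lies below the center line."""
    mid_y = image_height / 2
    # Region with the largest one-dimensional header feature value.
    best = max(regions, key=lambda r: r["area"] * abs(r["y"] - mid_y))
    # Assumption: origin at the top-left corner, ordinate growing downward,
    # so Y > H/2 means the header sits in the lower half of the image.
    return best["y"] > mid_y


upright_page = [{"area": 400, "y": 40}, {"area": 120, "y": 480}]
flipped_page = [{"area": 400, "y": 940}, {"area": 120, "y": 500}]

print(is_upside_down(upright_page, 1000))  # prints: False
print(is_upside_down(flipped_page, 1000))  # prints: True
```

If a different corner were chosen as the origin, the direction of the comparison would have to be adjusted accordingly.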
Rotation correction: the target image is rotated by 180°, and the rotated target image is output to the host computer for storage.
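For an image stored as a list of pixel rows, a 180° rotation amounts to reversing the row order and then reversing each row. The sketch below assumes that representation; the function name `rotate_180` is illustrative only.

```python
def rotate_180(image):
    """Rotate an image (list of pixel rows) by 180 degrees."""
    # Reversing the row order flips the image vertically; reversing each
    # row flips it horizontally. Together they give a 180-degree rotation.
    return [row[::-1] for row in image[::-1]]


image = [
    [1, 2, 3],
    [4, 5, 6],
]
rotated = rotate_180(image)
print(rotated)  # prints: [[6, 5, 4], [3, 2, 1]]
```

Applying the rotation twice returns the original image, which makes for a convenient sanity check.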
This embodiment also provides an electronic device, comprising: a processor;
a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, the program comprising instructions for performing the file scanning method described in this application.
This embodiment also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the file scanning method described in this application.
This embodiment also provides a computer program product, comprising a computer program, characterized in that the computer program, when executed by a processor, implements the file scanning method described in this application.
When scanning a target file that contains a header, the file scanning method of this application identifies the header by applying contour feature extraction, first filtering, overlap processing, second filtering and distance feature calculation to the target image. It then uses the resulting one-dimensional header feature values and the ordinate value of the center line of the target image to judge whether the orientation of the target file is correct, and decides from that orientation whether the target image needs automatic rotation correction, so that target images are stored in a consistent order. The whole process requires no manual intervention: the user does not need to check how the file is positioned in the scanning device and can insert the file in any direction, and no manual post-processing is needed, which greatly improves the efficiency of digitizing images of physical documents.
The above are only preferred embodiments of the present invention and do not limit the invention in any form. Any person of ordinary skill in the art can readily implement the invention as shown in the drawing and described above. Equivalent changes, modifications and developments made by those skilled in the art using the technical content disclosed above, without departing from the scope of the technical solution of the invention, are equivalent embodiments of the invention; likewise, any equivalent changes, modifications and developments made to the above embodiments in accordance with the substantive technology of the invention still fall within the protection scope of the technical solution of the invention.

Claims (10)

  1. A file scanning method, applied to scanning a target file that contains a header, characterized by comprising the following steps:
    receiving a target image: receiving the target image obtained by scanning the target file;
    extracting contour features: performing contour feature extraction on the characters in the target image to obtain a contour feature structured information set containing several pieces of contour feature structured information, each piece including an area value and coordinate information, the coordinate information including an abscissa value and an ordinate value;
    first filtering: filtering all pieces of contour feature structured information in the set against a pre-stored first feature filtering threshold, and saving each piece whose area value is greater than the first feature filtering threshold as first contour feature structured information;
    overlap processing: judging whether the projection regions of the pieces of first contour feature structured information in the ordinate direction overlap, and saving each piece of first contour feature structured information whose projection region in the ordinate direction overlaps as second contour feature structured information;
    second filtering: filtering all pieces of first contour feature structured information against a pre-stored second feature filtering threshold, and saving each piece of first contour feature structured information whose area value is greater than the second feature filtering threshold as second contour feature structured information;
    calculating distance feature values: calculating a one-dimensional distance feature value for each piece of second contour feature structured information from the pre-stored center-line ordinate value of the target image and the ordinate value in that piece of second contour feature structured information;
    calculating one-dimensional header feature values: calculating the one-dimensional header feature value of each piece of second contour feature structured information from its area value and its one-dimensional distance feature value;
    orientation judgment: judging whether the ordinate value in the second contour feature structured information corresponding to the largest one-dimensional header feature value is greater than the center-line ordinate value of the target image; if so, the target image is upside down and the rotation correction step is performed; if not, the target image is upright and is output to a host computer for storage;
    rotation correction: rotating the target image by 180° and outputting the rotated target image to the host computer for storage.
  2. The file scanning method according to claim 1, characterized in that: before the first filtering step, the method further comprises calculating the first feature filtering threshold: sorting all pieces of contour feature structured information in the set by area value in descending order, taking the number of pieces as a first quantity value, calculating a first target serial number from a first preset coefficient and the first quantity value, and storing the area value of the piece located at the first target serial number as the first feature filtering threshold.
  3. The file scanning method according to claim 1, characterized in that: before the second filtering step, the method further comprises calculating the second feature filtering threshold: sorting the pieces of second contour feature structured information by area value in descending order, taking the number of pieces of second contour feature structured information as a second quantity value, calculating a second target serial number from the second quantity value and a second preset coefficient, and taking the area value of the piece located at the second target serial number as the second feature filtering threshold.
  4. The file scanning method according to claim 1, characterized in that: before the contour feature extraction step, the method further comprises image preprocessing in which the target image is binarized.
  5. The file scanning method according to claim 4, characterized in that: the image preprocessing specifically comprises binarizing the target image using the average gray threshold method.
  6. The file scanning method according to claim 1, characterized in that: a connected-domain segmentation algorithm is used to extract contour features from the characters in the target image.
  7. The file scanning method according to claim 1, characterized in that: calculating the one-dimensional header feature value specifically comprises multiplying the area value corresponding to each piece of second contour feature structured information by its one-dimensional distance feature value and taking the product as the corresponding one-dimensional header feature value.
  8. An electronic device, characterized by comprising: a processor;
    a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, the program comprising instructions for performing the file scanning method according to any one of claims 1-7.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that: the computer program, when executed by a processor, performs the file scanning method according to any one of claims 1-7.
  10. A computer program product, comprising a computer program, characterized in that the computer program, when executed by a processor, implements the file scanning method according to any one of claims 1-7.
PCT/CN2021/095960 2021-05-14 2021-05-26 File scanning method, device, medium and product WO2022236875A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110528815.7 2021-05-14
CN202110528815.7A CN113286053B (en) 2021-05-14 2021-05-14 File scanning method, equipment, medium and product

Publications (1)

Publication Number Publication Date
WO2022236875A1 true WO2022236875A1 (en) 2022-11-17

Family

ID=77279165

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/095960 WO2022236875A1 (en) 2021-05-14 2021-05-26 File scanning method, device, medium and product

Country Status (2)

Country Link
CN (1) CN113286053B (en)
WO (1) WO2022236875A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107169489A (en) * 2017-05-08 2017-09-15 北京京东金融科技控股有限公司 The method and apparatus of tilted image correction
CN108182398A (en) * 2017-12-26 2018-06-19 广东金赋科技股份有限公司 The method and device in the direction based on scanning device adjustment scan image
CN109101963A (en) * 2018-08-10 2018-12-28 深圳市碧海扬帆科技有限公司 Certificate image automatic positive method, image processing apparatus and readable storage medium storing program for executing
US10223618B1 (en) * 2016-09-27 2019-03-05 Matrox Electronic Systems Ltd. Method and apparatus for transformation of dot text in an image into stroked characters based on dot pitches
CN111062317A (en) * 2019-12-16 2020-04-24 中国计量大学上虞高等研究院有限公司 Method and system for cutting edges of scanned document
CN111461100A (en) * 2020-03-31 2020-07-28 重庆农村商业银行股份有限公司 Bill identification method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003195797A (en) * 2001-12-21 2003-07-09 Canon Inc Device and method for displaying picture
CN104376319B (en) * 2014-10-22 2018-03-23 西安工程大学 A kind of method based on anisotropic Gaussian core extraction closed edge image outline
CN108122203B (en) * 2016-11-29 2020-04-07 上海东软医疗科技有限公司 Geometric parameter correction method, device, equipment and system
CN106780352B (en) * 2016-12-16 2020-06-09 珠海赛纳打印科技股份有限公司 Image rotation method and device and image forming equipment

Also Published As

Publication number Publication date
CN113286053A (en) 2021-08-20
CN113286053B (en) 2022-08-30

Similar Documents

Publication Publication Date Title
CN109086714B (en) Form recognition method, recognition system and computer device
CN109657665B (en) Invoice batch automatic identification system based on deep learning
CN104143094B (en) A kind of paper automatic marking processing method and system without answering card
WO2022063199A1 (en) Pulmonary nodule automatic detection method, apparatus and computer system
JP6139396B2 (en) Method and program for compressing binary image representing document
WO2020143325A1 (en) Electronic document generation method and device
CN112183038A (en) Form identification and typing method, computer equipment and computer readable storage medium
CN109902737A (en) A kind of bill classification method and terminal
US9202146B2 (en) Duplicate check image resolution
WO2021051527A1 (en) Image segmentation-based text positioning method, apparatus and device, and storage medium
CN110889311A (en) Financial electronic facsimile document identification system and method
CN112949471A (en) Domestic CPU-based electronic official document identification reproduction method and system
CN110490185A (en) One kind identifying improved method based on repeatedly comparison correction OCR card information
CN110378351A (en) Seal discrimination method and device
CN114648776B (en) Financial reimbursement data processing method and processing system
CN113642380A (en) Identification technology for wireless form
CN113139535A (en) OCR document recognition method
CN115082776A (en) Electric energy meter automatic detection system and method based on image recognition
CN115100657A (en) Line recognition method for characters and strip widths of electrical CAD drawing scanned graph
CN114581928A (en) Form identification method and system
WO2022236875A1 (en) File scanning method, device, medium and product
JP2008204184A (en) Image processor, image processing method, program and recording medium
WO2019071476A1 (en) Express information input method and system based on intelligent terminal
CN113947778A (en) Archive file based digital processing method
CN112308141B (en) Scanning bill classification method, system and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21941444

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21941444

Country of ref document: EP

Kind code of ref document: A1