WO2021174786A1 - 训练样本制作方法、装置、计算机设备及可读存储介质 - Google Patents

训练样本制作方法、装置、计算机设备及可读存储介质 Download PDF

Info

Publication number
WO2021174786A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
image
cropped
threshold
cropping
Prior art date
Application number
PCT/CN2020/112302
Other languages
English (en)
French (fr)
Inventor
盛建达
叶明�
张国辉
宋晨
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021174786A1 publication Critical patent/WO2021174786A1/zh

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Definitions

  • This application relates to the field of artificial intelligence image detection technology, and in particular to a training sample preparation method, device, computer equipment, and readable storage medium. It can be applied in the field of making training samples of artificial neural networks.
  • In the field of OCR recognition, text varies in length, and in many real-life scenarios the semantic correlation between adjacent pieces of text is very low, such as the information on ID cards (name, gender, nationality, etc.), formatted forms, mailing addresses, and so on.
  • The purpose of this application is to provide a training sample preparation method, device, computer equipment, and readable storage medium, which are used to solve the problem in the prior art that the obtained training samples are full of useless zero-padding information, resulting in reduced efficiency of OCR recognition training.
  • This application can be applied to smart government scenarios to promote the construction of smart cities.
  • To achieve the above purpose, this application provides a method for making training samples, including: obtaining at least one training picture; identifying training pictures whose length exceeds a preset cropping threshold, setting them as pictures to be cropped, and cropping the pictures to be cropped to obtain at least one cropped picture and image fragment whose length does not exceed the cropping threshold; identifying training pictures whose length is lower than a preset splicing threshold, setting them as pictures to be spliced, and splicing the image fragments with the pictures to be spliced to obtain spliced pictures; and identifying training pictures whose length does not exceed the cropping threshold and exceeds the splicing threshold, setting them as pictures to be zero-padded, performing a zero-padding operation on them so that their length reaches the cropping threshold, and summarizing the cropped pictures, the spliced pictures, and the training pictures to form a training sample.
  • this application also provides a training sample making device, including:
  • the input module is used to obtain at least one training picture
  • a picture cropping module configured to identify training pictures whose length exceeds a preset cropping threshold and set them as a picture to be cropped, and crop the picture to be cropped to obtain at least one cropped picture and image fragment whose length does not exceed the cropping threshold;
  • a picture splicing module for identifying training pictures whose length is lower than a preset splicing threshold and setting them as pictures to be spliced, and splicing the image fragments with the pictures to be spliced to obtain a spliced picture;
  • the sample summary module is used to identify training pictures whose length does not exceed the cropping threshold and exceeds the splicing threshold, set them as pictures to be zero-padded, perform a zero-padding operation on the pictures to be zero-padded so that their length reaches the cropping threshold, and summarize the cropped pictures, the spliced pictures, and the training pictures to form a training sample.
  • To achieve the above purpose, this application also provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor of the computer device executes the computer program, a training sample preparation method is implemented, the method comprising the same steps as described above.
  • To achieve the above purpose, the present application also provides a computer-readable storage medium storing a computer program; when the computer program stored in the storage medium is executed by a processor, a training sample preparation method is implemented, the method comprising the same steps as described above.
  • The training sample preparation method, device, computer equipment, and readable storage medium provided in this application crop the pictures to be cropped to obtain cropped pictures and image fragments whose length does not exceed the cropping threshold, and splice the image fragments with the pictures to be spliced to obtain spliced pictures. By converting the training pictures into cropped pictures and image fragments within the cropping threshold, over-long training pictures are cut into image fragments that are spliced onto shorter training pictures, which greatly reduces the zero-padding information in the picture set and thus helps improve the efficiency of OCR recognition training; by zero-padding the remaining training pictures to the cropping threshold and summarizing the cropped pictures, spliced pictures, and training pictures into training samples, batch training is preserved while the proportion of zero-padding information is greatly reduced.
  • FIG. 1 is a flowchart of Embodiment 1 of the method for making training samples of this application;
  • FIG. 2 is a schematic diagram of the environmental application of the training sample preparation method in the second embodiment of the training sample preparation method of the application;
  • Fig. 3 is a specific method flowchart of the training sample preparation method in the second embodiment of the training sample preparation method of the present application;
  • FIG. 4 is a flowchart of cropping the picture to be cropped to obtain cropped pictures and image fragments in the second embodiment of the method for making training samples of the present application;
  • FIG. 5 is a flowchart of a method for obtaining the image label of the cropped image and the image label of the image fragment when the type of the picture to be cropped is a print in the second embodiment of the training sample making method of the present application;
  • FIG. 6 is a flowchart of a method for obtaining the image label of the cropped image and the image label of the image fragment when the type of the picture to be cropped is non-printed in the second embodiment of the training sample making method of the present application;
  • FIG. 7 is a flowchart of another method for obtaining the image label of the cropped image and the image label of the image fragment when the type of the picture to be cropped is non-printed in the second embodiment of the training sample making method of the present application;
  • FIG. 8 is a flowchart of stitching the image fragment and the picture to be stitched to obtain a stitched picture in the second embodiment of the training sample making method of the present application;
  • FIG. 9 is a schematic diagram of the program modules of Embodiment 3 of the training sample making device of this application.
  • FIG. 10 is a schematic diagram of the hardware structure of the computer device in the fourth embodiment of the computer device of this application.
  • a method for making training samples of this embodiment includes:
  • S101 Obtain at least one training picture.
  • S103 Identify a training picture whose length exceeds a preset cropping threshold and set it as a picture to be cropped, and crop the picture to be cropped to obtain at least one cropped picture and image fragment whose length does not exceed the cropping threshold.
  • S105 Identify a training picture with a length lower than a preset splicing threshold and set it as a picture to be spliced, and splice the image fragment with the picture to be spliced to obtain a spliced picture.
  • S107 Identify training pictures whose length does not exceed the cropping threshold and exceed the splicing threshold, perform zero padding on the training pictures so that their length reaches the cropping threshold, and summarize the cropped pictures, the spliced pictures, and the training pictures to form a training sample .
  • In an exemplary embodiment, training pictures are acquired from a database or received from a user terminal; the training picture has a character string for training the OCR model, and the character string is composed of at least one character; the training picture also has an image tag, which is used to express the character string in the training picture.
  • By identifying training pictures whose length exceeds the preset cropping threshold, setting them as pictures to be cropped, and cropping them into at least one cropped picture and image fragment whose length does not exceed the cropping threshold, the training pictures are converted into cropped pictures and image fragments within the cropping threshold, so that the portion exceeding the threshold can be spliced onto shorter training pictures.
  • By identifying training pictures whose length is lower than the preset splicing threshold, setting them as pictures to be spliced, and splicing the image fragments onto them to obtain spliced pictures, the spliced pictures do not rely on the zero-padding operation used in the prior art; instead, over-long training pictures are cut into image fragments that are spliced onto shorter training pictures, which greatly reduces the zero-padding information in the picture collection and in turn helps improve the efficiency of OCR recognition training.
  • This application can be applied to smart government affairs scenarios to promote the construction of smart cities.
  • This embodiment is a specific application scenario of the foregoing Embodiment 1. Through this embodiment, the method provided by this application can be described more clearly and specifically.
  • Fig. 2 schematically shows an environmental application diagram of the method for making training samples according to the second embodiment of the present application.
  • the server 2 on which the training sample preparation method runs is connected to the database 3 and the user terminal 4 respectively through a network; the server 2 may provide services through one or more networks, and the network 3 may include various network devices, such as routers, switches, multiplexers, hubs, modems, bridges, repeaters, firewalls, proxy devices, and the like.
  • the network 3 may include physical links, such as coaxial cable links, twisted pair cable links, optical fiber links, combinations thereof, and/or the like.
  • the network may include wireless links, such as cellular links, satellite links, Wi-Fi links, and the like; the database 3 may be a database server storing training pictures, and the user terminal 4 may be a computer device such as a smart phone, tablet computer, laptop, or desktop computer.
  • FIG. 3 is a specific method flowchart of a method for making training samples provided by an embodiment of the present application. The method specifically includes steps S201 to S207.
  • the training picture has a character string for training the OCR model, and the character string is composed of at least one character; the training picture also has an image tag, which is used to express the character string in the training picture.
  • S202 Calculate, through a preset loading model, the complexity of cropping and splicing the training pictures according to at least one preset cropping length and the preset cropping and splicing rules, set the cropping length with the lowest complexity as the cropping threshold, and subtract the preset redundancy value from the cropping threshold to obtain the splicing threshold.
  • To avoid occupying too much of the server's computing power while cropping and splicing the obtained training pictures, this step calculates the complexity of cropping and splicing the training pictures, takes the cropping length with the lowest complexity as the cropping threshold, and subtracts the preset redundancy value from the cropping threshold to obtain the splicing threshold, making full use of the server's computing capacity while avoiding excessive consumption of it, which in turn helps increase service concurrency and reduce service latency.
  • This application also loads the model and, given a plurality of fixed candidate lengths (for example, L1, L2 = 1.5L1, L3 = 1.5L2, and so on), calculates the computational complexity of the picture set obtained by cropping and splicing with the following formula: Q = a1 × m1 + a2 × m2, where Q is the overall complexity, a1 is the complexity of a cropping event, a2 is the complexity of a splicing event, m1 is the number of times pictures are cropped, and m2 is the number of times pictures are spliced. The cropping event complexity reflects the computing power consumed by cropping a picture, the splicing event complexity reflects the computing power consumed by splicing pictures, and a1 and a2 can be set according to the actual situation of the server.
  • The cropping and splicing rules include:
  • cropping the training pictures whose length exceeds the cropping length to obtain at least one predicted cropped picture and predicted fragment whose length does not exceed the cropping length, and recording the number of times the training pictures are cropped as m1;
  • splicing the predicted fragments onto training pictures whose length is less than the cropping length to obtain predicted spliced pictures whose length does not exceed the cropping length, and recording the number of times the predicted fragments are spliced onto the training pictures as m2 (see the illustrative sketch below).
  • the redundancy value can be set as required.
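  • As a rough illustration of how these rules could drive the choice of threshold, the following Python sketch evaluates Q = a1·m1 + a2·m2 for several candidate crop lengths and keeps the cheapest one; the cost weights, candidate lengths, picture widths, and redundancy value are hypothetical placeholders, not values prescribed by this application.

```python
# Sketch only: pick the crop length with the lowest simulated crop/splice complexity.
# All numeric values (a1, a2, candidate lengths, picture widths) are illustrative.

def simulate_complexity(widths, crop_len, a1=1.0, a2=0.5):
    """Estimate Q = a1*m1 + a2*m2 for one candidate crop length."""
    m1 = 0                      # number of cropping operations
    fragments, short_pictures = [], []
    for w in widths:
        if w > crop_len:
            pieces = -(-w // crop_len)          # ceil division
            m1 += pieces - 1                    # each over-long picture is cut (pieces - 1) times
            fragments.append(w - (pieces - 1) * crop_len)
        else:
            short_pictures.append(w)
    m2 = min(len(fragments), len(short_pictures))   # one splice per fragment placed on a short picture
    return a1 * m1 + a2 * m2

def choose_thresholds(widths, candidates, redundancy=32):
    crop_threshold = min(candidates, key=lambda c: simulate_complexity(widths, c))
    return crop_threshold, crop_threshold - redundancy   # (cropping threshold, splicing threshold)

widths = [120, 480, 900, 260, 1500, 310]    # hypothetical line-image widths in pixels
print(choose_thresholds(widths, candidates=[256, 384, 576]))
```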
  • S203 Identify a training picture whose length exceeds a preset cropping threshold and set it as a picture to be cropped, and crop the picture to be cropped to obtain at least one cropped picture and image fragment whose length does not exceed the cropping threshold.
  • In order to convert the training pictures into cropped pictures and image fragments whose length does not exceed the cropping threshold, so that the portion exceeding the threshold can later be spliced onto shorter training pictures, this step identifies training pictures whose length exceeds the preset cropping threshold, sets them as pictures to be cropped, and crops the pictures to be cropped to obtain at least one cropped picture and image fragment whose length does not exceed the cropping threshold.
  • the step of cropping the picture to be cropped to obtain at least one cropped picture and image fragment whose length does not exceed the cropping threshold includes:
  • S31 Set the picture to be cropped as a first cropped picture and perform a positioning process: taking an end of the first cropped picture as the starting point, set the position on the first cropped picture whose distance from that end equals the cropping threshold as the threshold position, and move from the threshold position toward the end by a preset blank value to obtain the blank position; wherein the training picture contains characters, and the end is either of the two ends of the cropped picture in the arrangement direction of those characters;
  • To ensure the diversity of the resulting training samples, the end in this step may be either the head end or the tail end of the first cropped picture.
  • the blank value can be set according to user needs.
  • It should be noted that if the characters in the training picture are arranged horizontally, the end is either of the two ends of the cropped picture in the horizontal direction, such as the left end or the right end; if the characters in the training picture are arranged vertically, the end is either of the two ends of the cropped picture in the vertical direction, such as the top end or the bottom end.
  • S32 Perform a cropping process to cut the first cropped picture at the blank position and the threshold position, obtaining a front blank segment located between the end and the blank position, an overlap area located between the blank position and the threshold position, and a rear threshold segment located between the threshold position and the other end of the first cropped picture;
  • S33 Perform a first splicing process to copy the overlap area, splice one copy of the overlap area onto the tail of the front blank segment to obtain a cropped picture, and splice the other copy onto the head of the rear threshold segment to obtain a segment to be evaluated; or
  • S34 Perform a second splicing process to copy the overlap area, splice one copy of the overlap area onto the head of the front blank segment to obtain a cropped picture, and splice the other copy onto the tail of the rear threshold segment to obtain a segment to be evaluated;
  • S35 Perform an evaluation process to determine whether the segment to be evaluated exceeds the cropping threshold;
  • S36 If so, set the segment to be evaluated as a second cropped picture, perform the positioning process, the cropping process, and the first or second splicing process in sequence to obtain a cropped picture and a new segment to be evaluated, and perform the evaluation process again;
  • S37 If not, set the segment to be evaluated as an image fragment.
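  • The following sketch shows one way the positioning, cropping, and splicing processes of S31-S37 could be realized on an image array; working from the left-hand end and the particular blank (overlap) value are assumptions of this example rather than requirements of the method.

```python
import numpy as np

def crop_with_overlap(picture, crop_threshold, blank=16):
    """Cut a wide line image (H x W x C) into pieces no wider than crop_threshold,
    duplicating the overlap area around every cut so no piece ends mid-character.
    Returns (cropped_pictures, image_fragment)."""
    cropped, current = [], picture
    while current.shape[1] > crop_threshold:           # evaluation process (S35)
        threshold_pos = crop_threshold                  # threshold position from the left end
        blank_pos = threshold_pos - blank               # move back by the blank value
        front = current[:, :blank_pos]                  # front blank segment
        overlap = current[:, blank_pos:threshold_pos]   # overlap area
        rear = current[:, threshold_pos:]               # rear threshold segment
        # First splicing process (S33): the overlap goes to the tail of the front segment
        # (a finished cropped picture) and to the head of the rear segment (to be evaluated).
        cropped.append(np.concatenate([front, overlap], axis=1))
        current = np.concatenate([overlap, rear], axis=1)
    return cropped, current                             # S37: the remainder is the image fragment

pic = np.zeros((32, 700, 3), dtype=np.uint8)            # hypothetical 700-px-wide line image
pieces, fragment = crop_with_overlap(pic, crop_threshold=256)
print([p.shape[1] for p in pieces], fragment.shape[1])  # [256, 256] 220
```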
  • S204 Obtain the image tag of the picture to be cropped; obtain from the image tag the characters corresponding to the number of characters in the cropped picture, and summarize them as the image tag of the cropped picture; and obtain from the image tag the characters corresponding to the number of characters in the image fragment, and summarize them as the image tag of the image fragment.
  • Current OCR labels correspond to a whole line of characters in a picture rather than one label per character, so if a picture were simply cut, the label would no longer correspond to any of the resulting image fragments; the image tags of the cropped pictures and image fragments obtained by cropping would then fail to reflect the character strings they contain, making them unusable for OCR recognition training.
  • This step therefore obtains the image tag of the picture to be cropped, obtains from it the characters corresponding to the number of characters in the cropped picture and summarizes them as the image tag of the cropped picture, and obtains from it the characters corresponding to the number of characters in the image fragment and summarizes them as the image tag of the image fragment, so that the image tags of the cropped picture and the image fragment accurately reflect the character strings they contain, ensuring that the cropped picture and its image tag and the image fragment and its image tag can be used for OCR recognition training.
  • the types of the pictures to be cropped include printed and non-printed images.
  • the spacing of characters in the printed image is equal, and the spacing of characters in the non-printed image is unequal.
  • If the type of the picture to be cropped is printed, the step of obtaining from the image tag the characters corresponding to the number of characters in the cropped picture and summarizing them as the image tag of the cropped picture, and obtaining from the image tag the characters corresponding to the number of characters in the image fragment and summarizing them as the image tag of the image fragment, includes:
  • S4-01 Extract the length and image tag of the picture to be cropped, and obtain the lengths of the cropped picture and the image fragment;
  • S4-02 Divide the length of the cropped picture by the length of the picture to be cropped to obtain a cropping probability, multiply the cropping probability by the number of characters in the image tag to obtain a cropping quantity, obtain from the image tag the characters corresponding to the cropping quantity, and summarize them as the image tag of the cropped picture;
  • S4-03 Divide the length of the image fragment by the length of the picture to be cropped to obtain a fragment probability, multiply the fragment probability by the number of characters in the image tag to obtain a fragment quantity, obtain from the image tag the characters corresponding to the fragment quantity, and summarize them as the image tag of the image fragment.
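  • For printed text with evenly spaced characters, the label can therefore be split in proportion to the piece lengths, as in S4-01 to S4-03. A minimal sketch, assuming the image tag is a plain string and that lengths are pixel widths (both assumptions of this example, not of the application):

```python
def split_label_by_length(label, source_len, piece_lens):
    """Give each piece the slice of the label whose character count is
    proportional to that piece's share of the source picture's length."""
    tags, start = [], 0
    for i, plen in enumerate(piece_lens):
        if i == len(piece_lens) - 1:
            count = len(label) - start                     # remainder goes to the last piece
        else:
            count = round(plen / source_len * len(label))  # cropping/fragment probability x label size
        tags.append(label[start:start + count])
        start += count
    return tags

# Hypothetical example: a 10-character label over pieces of width 256, 256 and 220 px.
print(split_label_by_length("ABCDEFGHIJ", 700, [256, 256, 220]))   # ['ABCD', 'EFGH', 'IJ']
```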
  • If the type of the picture to be cropped is non-printed, the step of obtaining from the image tag the characters corresponding to the number of characters in the cropped picture and summarizing them as the image tag of the cropped picture, and obtaining from the image tag the characters corresponding to the number of characters in the image fragment and summarizing them as the image tag of the image fragment, includes:
  • S4-11 Extract the image tag of the picture to be cropped, and perform binarization on the cropped picture and the image fragment respectively to obtain a binary cropped picture and a binary image fragment correspondingly;
  • In this step, the cropped picture is binarized to obtain the binary cropped picture, and the image fragment is binarized to obtain the binary image fragment;
  • S4-12 Obtain, through a vertical projection module, the starting positions on the left and right sides of each character in the binary cropped picture, and draw rectangular boxes according to the starting positions to mark the position of each character; obtain from the image tag the characters corresponding to the number of rectangular boxes in the binary cropped picture, and summarize them as the image tag of the cropped picture;
  • S4-13 Obtain, through the vertical projection module, the starting positions on the left and right sides of each character in the binary image fragment, and draw rectangular boxes according to the starting positions to mark the position of each character; obtain from the image tag the characters corresponding to the number of rectangular boxes in the binary image fragment, and summarize them as the image tag of the cropped picture.
  • In this step, vertical projection refers to counting a particular kind of pixel of the binarized image along the vertical direction. Since each pixel of a binarized image is either black or white, in this embodiment the black pixels of the binarized image are counted, and the upper and lower boundaries of each row and the left and right boundaries of each column can be determined from the counts; for example, the position between a column whose count is 0 and a column whose count is non-zero is set as a starting position, thereby achieving the purpose of segmentation.
  • In this embodiment, OpenCV is used as the vertical projection module. OpenCV is a cross-platform computer vision and machine learning software library released under a BSD license (open source) that can run on the Linux, Windows, Android, and Mac OS operating systems. It is lightweight and efficient, consisting of a series of C functions and a small number of C++ classes; it also provides interfaces for languages such as Python, Ruby, and MATLAB, and implements many common algorithms in image processing and computer vision. A segmentation sketch follows.
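  • A minimal sketch of the vertical-projection idea with OpenCV, assuming dark characters on a light background; the Otsu thresholding and the way boxes are derived from zero-count gaps are illustrative choices, not a prescribed implementation.

```python
import cv2

def character_boxes(gray_line_image):
    """Return (x_start, x_end) column ranges of characters found by counting
    foreground pixels in each column of a binarized line image."""
    # Binarize so characters become white (non-zero) and background black.
    _, binary = cv2.threshold(gray_line_image, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    column_counts = (binary > 0).sum(axis=0)   # the vertical projection
    boxes, in_char, start = [], False, 0
    for x, count in enumerate(column_counts):
        if count > 0 and not in_char:          # count goes 0 -> non-zero: a character starts
            in_char, start = True, x
        elif count == 0 and in_char:           # non-zero -> 0: the character ends
            in_char = False
            boxes.append((start, x))
    if in_char:
        boxes.append((start, len(column_counts)))
    return boxes

# Usage sketch: the piece's tag gets as many characters as there are boxes in it.
# line = cv2.imread("line.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file name
# print(len(character_boxes(line)))
```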
  • Alternatively, if the type of the picture to be cropped is non-printed, the step of obtaining from the image tag the characters corresponding to the number of characters in the cropped picture and summarizing them as the image tag of the cropped picture, and obtaining from the image tag the characters corresponding to the number of characters in the image fragment and summarizing them as the image tag of the image fragment, may also include:
  • S4-21 Extract the length and image tag of the picture to be cropped, and obtain the length of the cropped picture and image fragment.
  • S4-22 Calculate, using a preset character model, the number of characters in the cropped picture from its length, obtain from the image tag the characters corresponding to that number, and summarize them as the image tag of the cropped picture.
  • S4-23 Calculate, using the preset character model, the number of characters in the image fragment from its length, obtain from the image tag the characters corresponding to that number, and summarize them as the image tag of the image fragment.
  • In this embodiment, the character model is obtained in the following manner:
  • Images of the non-printed type are used as character training samples. An equal-spacing model divides the length of each training sample according to a preset spacing value and predicts the number of characters in the training sample to obtain a predicted number; the predicted number is divided by the number of characters in the training label of the training sample to obtain the accuracy with which the equal-spacing model predicts the number of characters; the spacing value is adjusted according to this accuracy until the accuracy of the equal-spacing model in predicting the number of characters of the training samples reaches a preset accuracy threshold, and the equal-spacing model is then set as the character model (see the sketch below).
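  • A sketch of how such an equal-spacing model might be calibrated, assuming the character count of a piece is simply its length divided by a spacing value and that the spacing is tuned on labelled samples; the candidate grid and the accuracy definition below are simplifications introduced for this example.

```python
def predict_count(length, spacing):
    """Equal-spacing model: predicted number of characters in a piece of this length."""
    return max(1, round(length / spacing))

def calibrate_spacing(samples, candidate_spacings, accuracy_threshold=0.95):
    """samples: (piece_length, true_char_count) pairs from non-printed images.
    Return the first spacing whose mean accuracy reaches the threshold,
    otherwise the best spacing found."""
    best_spacing, best_acc = None, 0.0
    for spacing in candidate_spacings:
        accs = []
        for length, true_n in samples:
            pred = predict_count(length, spacing)
            accs.append(min(pred, true_n) / max(pred, true_n))   # 1.0 when the prediction is exact
        acc = sum(accs) / len(accs)
        if acc >= accuracy_threshold:
            return spacing
        if acc > best_acc:
            best_spacing, best_acc = spacing, acc
    return best_spacing

# Hypothetical labelled crops: (width in px, number of characters).
samples = [(300, 10), (145, 5), (420, 13), (90, 3)]
print(calibrate_spacing(samples, candidate_spacings=[24, 28, 30, 32, 36]))   # -> 30
```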
  • S205 Identify training pictures whose length is lower than a preset splicing threshold and set them as pictures to be spliced, and splice the image fragments with the pictures to be spliced to obtain a spliced picture.
  • To avoid training samples filled with too much useless zero-padding information, which would reduce the efficiency of OCR recognition training, this step identifies training pictures whose length is lower than the preset splicing threshold, sets them as pictures to be spliced, and splices the image fragments with the pictures to be spliced to obtain spliced pictures. Because the spliced pictures are produced not by the zero-padding operation used in the prior art but by cutting over-long training pictures into image fragments and splicing them onto shorter training pictures, the zero-padding information in the picture set is greatly reduced, which in turn helps improve the efficiency of OCR recognition training.
  • the splicing threshold can be obtained in step S202, or can be set by the user.
  • the step of stitching the image fragment and the picture to be spliced to obtain a spliced picture includes:
  • S51 Take any image fragment as a target fragment, and stitch it with the picture to be spliced to obtain a picture to be evaluated.
  • S52 Execute a judgment thread to judge whether the length of the picture to be evaluated exceeds a preset cropping threshold.
  • S53 If not, execute a continuous splicing thread to extract any image fragment other than the target fragment, set it as a newly added target fragment, splice that target fragment with the picture to be evaluated to obtain an updated picture to be evaluated, and execute the judgment thread again. In this step, the image tag of the fragment serving as the target fragment is recorded as the target tag; when the continuous splicing thread is executed, the image tag of any image fragment other than the target tag is extracted, the image fragment corresponding to that image tag is obtained and set as the newly added target fragment, and its image tag is in turn set as the target tag, so that fragments other than the current target fragments can be extracted and spliced onto the picture to be evaluated repeatedly, ensuring that a picture to be evaluated whose length exceeds the cropping threshold is eventually obtained.
  • S54 If so, detach the image fragment that was last spliced onto the picture to be evaluated, and perform a zero-padding operation at the tail of the picture to be evaluated from which that fragment was detached, so as to obtain a spliced picture whose length equals the cropping threshold.
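  • A sketch of the splicing loop of S51-S54, keeping only widths for brevity; concatenating the actual pixel arrays and the corresponding image tags (step S206) would follow the same control flow. Checking the overshoot before splicing, rather than splicing and then detaching, is an equivalent simplification made for this example.

```python
def splice_onto_picture(picture_width, fragment_widths, crop_threshold):
    """Greedily splice image fragments onto one picture to be spliced.
    Returns (indices_of_used_fragments, zero_pad_width)."""
    used, total = [], picture_width
    for i, w in enumerate(fragment_widths):
        if total + w > crop_threshold:   # S52/S54: the next splice would overshoot, so stop
            break
        used.append(i)                   # S53: keep splicing fragments
        total += w
    return used, crop_threshold - total  # S54: zero-pad the tail up to the cropping threshold

used, pad = splice_onto_picture(picture_width=120,
                                fragment_widths=[100, 90, 60],  # hypothetical fragment widths
                                crop_threshold=256)
print(used, pad)   # [0] 36
```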
  • S206 Splicing the image tags of the pictures to be spliced and the image fragments in the spliced pictures to obtain a spliced label, and set the spliced label as the image label of the spliced picture.
  • To prevent the image tag of a spliced picture from failing to correspond to it, this step splices the image tags of the picture to be spliced and of the image fragments contained in the spliced picture to obtain a spliced tag, and sets the spliced tag as the image tag of the spliced picture, solving the problem that an image tag unable to reflect the character string in the spliced picture would make the spliced picture unusable for OCR recognition training.
  • In this embodiment, the image tags of the picture to be spliced and of the image fragments are spliced according to the positional relationship of the picture to be spliced and the image fragments within the spliced picture. For example, if the picture to be spliced is located at the head of an image fragment, the image tag of the picture to be spliced is spliced onto the head of the image tag of that image fragment to form the image tag of the spliced picture; if the picture to be spliced is located at the tail of an image fragment, the image tag of the picture to be spliced is spliced onto the tail of the image tag of that image fragment to form the image tag of the spliced picture.
  • S207 Identify training pictures whose length does not exceed the cropping threshold and exceed the splicing threshold, perform zero padding on the training pictures so that their length reaches the cropping threshold, and summarize the cropped pictures, spliced pictures, and training pictures to form a training sample .
  • To obtain training samples that allow batch training of the OCR recognition model while reducing the amount and proportion of zero-padding information, this step identifies training pictures whose length does not exceed the cropping threshold but exceeds the splicing threshold, performs a zero-padding operation on them so that their length reaches the cropping threshold, and summarizes the cropped pictures, spliced pictures, and training pictures to form the training samples, ensuring that the training samples support batch training while greatly reducing the proportion of zero-padding information (a padding sketch follows).
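  • Zero-padding the remaining pictures to the cropping threshold can be as simple as appending zero-valued columns; a minimal NumPy sketch, assuming fixed-height line images stored as arrays:

```python
import numpy as np

def pad_to_threshold(picture, crop_threshold):
    """Append zero columns so the picture width equals crop_threshold."""
    pad = crop_threshold - picture.shape[1]
    if pad <= 0:
        return picture
    pad_spec = [(0, 0), (0, pad)] + [(0, 0)] * (picture.ndim - 2)
    return np.pad(picture, pad_spec)             # constant zero padding

batch = [pad_to_threshold(p, 256) for p in
         (np.ones((32, 200, 3), np.uint8), np.ones((32, 256, 3), np.uint8))]
print([b.shape for b in batch])                  # every picture now has width 256
```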
  • the corresponding summary information is obtained based on the training samples.
  • The summary information is obtained by hashing the training samples, for example using the SHA-256 algorithm.
  • Uploading summary information to the blockchain can ensure its security and fairness and transparency to users.
  • the user equipment can download the summary information from the blockchain to verify whether the training sample has been tampered with.
  • the blockchain referred to in this example is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain is essentially a decentralized database, a chain of data blocks generated and linked using cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
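  • A sketch of producing the summary information: serialize the training sample metadata and hash it (here with Python's standard hashlib SHA-256); how the sample is serialized and how the digest is written to and read back from the chain are outside the scope of this example.

```python
import hashlib
import json

def sample_digest(sample: dict) -> str:
    """Return the SHA-256 digest of a training sample serialized as canonical JSON."""
    payload = json.dumps(sample, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Hypothetical sample record: picture file names and their image tags, not pixel data.
sample = {"pictures": ["crop_0001.png", "splice_0002.png"],
          "tags": ["广东省深圳市", "平安科技"]}
digest = sample_digest(sample)       # upload this digest to the blockchain
print(digest)                        # later, recompute and compare to detect tampering
```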
  • a training sample making device 1 of this embodiment includes:
  • the input module 11 is used to obtain at least one training picture
  • the picture cropping module 13 is configured to identify a training picture whose length exceeds a preset cropping threshold and set it as a picture to be cropped, and crop the picture to be cropped to obtain at least one cropped picture and image fragment whose length does not exceed the cropping threshold;
  • the picture splicing module 15 is configured to identify training pictures whose length is lower than a preset splicing threshold and set them as pictures to be spliced, and splice the image fragments and the pictures to be spliced to obtain a spliced picture;
  • the sample summary module 17 is used to identify training pictures whose length does not exceed the cropping threshold and exceeds the splicing threshold, set them as pictures to be zero-padded, perform a zero-padding operation on the pictures to be zero-padded so that their length reaches the cropping threshold, and summarize the cropped pictures, the spliced pictures, and the training pictures to form a training sample.
  • the training sample making device 1 further includes:
  • the complexity evaluation module 12 is configured to calculate, through a preset loading model, the complexity of cropping and splicing the training pictures according to at least one preset cropping length and the preset cropping and splicing rules, set the cropping length with the lowest complexity as the cropping threshold, and subtract the preset redundancy value from the cropping threshold to obtain the splicing threshold.
  • the training sample making device 1 further includes:
  • the label cropping module 14 is configured to obtain the image label of the picture to be cropped; obtain the characters corresponding to the number of characters of the cropped picture from the image label, and summarize them as the image label of the cropped image; and Characters corresponding to the number of characters of the image segment are obtained from the image tags, and collected as the image tags of the image segment.
  • the training sample making device 1 further includes:
  • the label splicing module 16 is configured to splice the image labels of the pictures to be spliced and the image fragments in the spliced pictures to obtain a spliced label, and set the spliced label as the image label of the spliced picture.
  • This technical solution is applied to the image detection field of artificial intelligence: cropped pictures and image fragments whose length does not exceed the cropping threshold are obtained by cropping the pictures to be cropped, the image fragments are spliced with the pictures to be spliced to obtain spliced pictures, the remaining training pictures are zero-padded so that their length reaches the cropping threshold, and the cropped pictures, spliced pictures, and training pictures are summarized to form training samples, so that the solution can be applied to OCR recognition training in image processing.
  • the present application also provides a computer device 5.
  • the components of the training sample making device 1 in the third embodiment can be dispersed in different computer devices.
  • the computer device 5 can be a smart phone, a tablet computer, a laptop, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple application servers) that executes the program.
  • the computer device of this embodiment at least includes, but is not limited to, a memory 51 and a processor 52 that can be communicatively connected to each other through a system bus, as shown in FIG. 10. It should be pointed out that FIG. 10 only shows a computer device with some of its components, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
  • the memory 51 (ie, readable storage medium) includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), Read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 51 may be an internal storage unit of a computer device, such as a hard disk or memory of the computer device.
  • the memory 51 may also be an external storage device of the computer device, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the computer device.
  • the memory 51 may also include both the internal storage unit of the computer device and its external storage device.
  • the memory 51 is generally used to store an operating system and various application software installed in a computer device, such as the program code of the training sample making device in the third embodiment, and so on.
  • the memory 51 can also be used to temporarily store various types of data that have been output or will be output.
  • in some embodiments, the processor 52 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip.
  • the processor 52 is generally used to control the overall operation of the computer equipment.
  • the processor 52 is used to run the program code or process data stored in the memory 51, for example, to run a training sample preparation device, so as to implement the training sample preparation methods of the first and second embodiments.
  • the readable storage medium may be non-volatile or volatile, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, servers, app stores, etc., on which a computer program is stored; when the program is executed by the processor 52, the corresponding functions are realized.
  • the computer-readable storage medium of this embodiment is used to store the training sample preparation device, and when executed by the processor 52, the training sample preparation method of Embodiment 1 and Embodiment 2 is implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A training sample preparation method, apparatus, computer device, and readable storage medium, relating to the field of artificial intelligence and also to blockchain technology; the information may be stored in blockchain nodes. The method comprises: obtaining at least one training picture (S101); identifying training pictures whose length exceeds a preset cropping threshold, setting them as pictures to be cropped, and cropping the pictures to be cropped to obtain at least one cropped picture and image fragment whose length does not exceed the cropping threshold (S103); identifying training pictures whose length is lower than a preset splicing threshold, setting them as pictures to be spliced, and splicing the image fragments with the pictures to be spliced to obtain spliced pictures (S105); identifying training pictures whose length does not exceed the cropping threshold and exceeds the splicing threshold, performing a zero-padding operation on them so that their length reaches the cropping threshold, and summarizing the cropped pictures, spliced pictures, and training pictures to form training samples (S107). While ensuring batch training of the training samples, this reduces the proportion of zero-padding information and improves the efficiency of OCR recognition training.

Description

训练样本制作方法、装置、计算机设备及可读存储介质
本申请要求2020年07月29日提交中国专利局、申请号为CN202010739646.7,发明名称为“训练样本制作方法、装置、计算机设备及可读存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能的图像检测技术领域,尤其涉及一种训练样本制作方法、装置、计算机设备及可读存储介质。其可应用在人工智能的神经网络的训练样本制作领域。
背景技术
在OCR识别领域,文本的长度是长短不一的,并且在实际生活中,很多场景下文字的前后语义相关性极低,例如证件卡上的信息(人名,性别,国籍等),格式化表格,邮寄地址等。
在工程化使用的场景下,为避免受制于GPU的IO瓶颈,导致并发量很低,GPU无法充分利用的情况,通常会对获得的至少一个图片进行汇总得到图片集合,以进行批量化ORC识别训练;由于所述图片集合中的图片通常是长度不一的,因此,通常是对所述集合中长度较短的图片进行补零,使其长度达到所述集合中最长图片的长度,以得到便于批量化ORC识别训练的训练样本,然而,发明人发现当前的做法会使获得的训练样本中将会充斥着过多无用的补零信息,导致OCR识别训练效率降低。
发明内容
本申请的目的是提供一种训练样本制作方法、装置、计算机设备及可读存储介质,用于解决现有技术存在的获得的训练样本因充斥着过多无用的补零信息,导致OCR识别训练效率降低的问题;本申请可应用于智慧政务场景中,从而推动智慧城市的建设。
为实现上述目的,本申请提供一种训练样本制作方法,包括:
获取至少一个训练图片;
识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
为实现上述目的,本申请还提供一种训练样本制作装置,包括:
输入模块,用于获取至少一个训练图片;
图片裁剪模块,用于识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
图片拼接模块,用于识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
样本汇总模块,用于识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
为实现上述目的,本申请还提供一种计算机设备,各计算机设备包括存储器、处理器以及存储在存储器上并可在处理器上运行的计算机程序,所述计算机设备的处理器执行所述计算机程序时实现训练样本制作方法,所述训练样本制作方法包括:
获取至少一个训练图片;
识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
为实现上述目的,本申请还提供一种计算机可读存储介质,其存储有计算机程序,所述存储介质存储的所述计算机程序被处理器执行时实现训练样本制作方法,所述训练样本制作方法包括:
获取至少一个训练图片;
识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
本申请提供的训练样本制作方法、装置、计算机设备及可读存储介质,通过裁剪待裁剪图片得到至少一个长度不超过裁剪阈值的裁剪图片及图像片段,将图像片段与待拼接图片进行拼接得到拼接图片,以通过将训练图片转换成长度不超过裁剪阈值的裁剪图片和图像片段,实现裁切长度过长的训练图片得到图像片段并将其拼接在长度较短的训练图片上的技术效果,极大的降低了图片集合中的补零信息,进而有助于提高OCR识别训练的效率;通过对训练图片进行补零操作使其长度达到裁剪阈值,并汇总裁剪图片、拼接图片和训练图片形成训练样本,实现了在保证训练样本能够实现批量化训练的同时,极大的降低了补零信息的占比。
附图说明
图1为本申请训练样本制作方法实施例一的流程图;
图2为本申请训练样本制作方法实施例二中训练样本制作方法的环境应用示意图;
图3是本申请训练样本制作方法实施例二中训练样本制作方法的具体方法流程图;
图4是本申请训练样本制作方法实施例二中裁剪所述待裁剪图片得到裁剪图片及图像片段的流程图;
图5是本申请训练样本制作方法实施例二中若待裁剪图片类型为印刷体时,得到所述裁剪图像的图像标签及所述图像片段的图像标签的方法的流程图;
图6是本申请训练样本制作方法实施例二中若待裁剪图片类型为非印刷体时,得到所述裁剪图像的图像标签及所述图像片段的图像标签的一种方法流程图;
图7是本申请训练样本制作方法实施例二中若待裁剪图片类型为非印刷体时,得到所述裁剪图像的图像标签及所述图像片段的图像标签的另一种方法流程图;
图8是本申请训练样本制作方法实施例二中将所述图像片段与所述待拼接图片进行拼接得到拼接图片的流程图;
图9为本申请训练样本制作装置实施例三的程序模块示意图;
图10为本申请计算机设备实施例四中计算机设备的硬件结构示意图。
具体实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施例仅用以解释本申请,并不用于限定本申请。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
现提供以下实施例:
实施例一:
请参阅图1,本实施例的一种训练样本制作方法,包括:
S101:获取至少一个训练图片。
S103:识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段。
S105:识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片。
S107:识别长度不超过裁剪阈值且超过拼接阈值的训练图片,对所述训练图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
在示例性的实施例中,从数据库获取或接收用户端发送的训练图片,所述训练图片中具有用于对OCR模型进行训练的字符串,该字符串至少由一个字符构成;所述训练图片还具有图像标签,其用于表达所述训练图片中的字符串。
通过识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段,实现将训练图片转换成长度不超过裁剪阈值的裁剪图片和图像片段,以便于将超过裁剪阈值的部分能够补充拼接在长度较短的训练图片上。
通过识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片,由于拼接图片中不是现有技术中采用的补零操作,而是裁切长度过长的训练图片得到图像片段并将其拼接在长度较短的训练图片上,极大的降低了图片集合中的补零信息,进而有助于提高OCR识别训练的效率。
通过识别长度不超过裁剪阈值且超过拼接阈值的训练图片,对所述训练图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本,以在保证训练样本能够实现批量化训练的同时,极大的降低了补零信息的占比。
本申请可应用于智慧政务场景中,从而推动智慧城市的建设。
实施例二:
本实施例为上述实施例一的一种具体应用场景,通过本实施例,能够更加清楚、具体地阐述本申请所提供的方法。
下面,以在运行有训练样本制作方法的服务器中,对其中的缓存器和数据库进行信息同步及返回认证口令识别为例,来对本实施例提供的方法进行具体说明。需要说明的是,本实施例只是示例性的,并不限制本申请实施例所保护的范围。
图2示意性示出了根据本申请实施例二的训练样本制作方法的环境应用示意图。
在示例性的实施例中,训练样本制作方法所在的服务器2通过网络分别数据库3和用户端4;所述服务器2可以通过一个或多个网络提供服务,网络3可以包括各种网络设备,例如路由器,交换机,多路复用器,集线器,调制解调器,网桥,中继器,防火墙,代理设备和/或等等。网络3可以包括物理链路,例如同轴电缆链路,双绞线电缆链路,光纤链路,它们的组合和/或类似物。网络可以包括无线链路,例如蜂窝链路,卫星链路,Wi-Fi链路和/或类似物;所述数据库3可为保存有训练图片的数据库服务器,用户端4可为智能手机、平板电脑、笔记本电脑、台式电脑等计算机设备。
图3是本申请一个实施例提供的一种训练样本制作方法的具体方法流程图,该方法具体包括步骤S201至S207。
S201:获取至少一个训练图片。
从数据库获取或接收用户端发送的训练图片,所述训练图片中具有用于对OCR模型进行训练的字符串,该字符串至少由一个字符构成;所述训练图片还具有图像标签,其用于表达所述训练图片中的字符串。
S202:通过预设的加载模型根据至少一个预设的裁剪长度并按照预设的裁剪拼接规则,计算对所述训练图片进行裁剪及拼接的复杂度,将复杂度最低的裁剪长度设为裁剪阈值,将所述裁剪阈值与预设的冗余值相减得到拼接阈值。
为在获得的训练图片进行裁剪和拼接过程中,避免占用服务器过多的算力,本步骤通过计算训练图片的裁剪及拼接的复杂度,得到复杂度最低的裁剪长度并将其设为裁剪阈值,将所述裁剪阈值与预设的冗余值相减得到拼接阈值,充分利用服务器的计算算力,避免了占用服务器过多的算力,进而有助于提高业务并发量,降低服务延迟。
本申请还通过加载模型并给定多个固定长度(例如:L1=l Byte,L2=1.5L1,L3=1.5L2……),通过以下公式计算上述通过裁剪拼接所获得的图片集合的计算复杂度:
Q=a1*m1+a2*m2
其中,Q为整体复杂度,a1为裁剪事件复杂度,a2为拼接事件复杂度,m1为对图片进行裁剪的次数,m2为对图片进行拼接的次数;所述裁剪事件复杂度反应了对图片进行裁剪所消耗的计算算力,所述拼接事件复杂度反应了对图片进行拼接所消耗的计算算力,所述a1和a2可根据服务器实际情况设置。
所述裁剪拼接规则包括:
裁剪超过裁剪长度的训练图片,以得到至少一个长度不超过所述裁剪长度的预测裁剪图片和预测图像片段,记录裁剪所述训练图片的次数,并将其记录为m1;
将预测片段拼接在长度低于裁剪长度的训练图片,以得到长度不超过所述裁剪长度的预测拼接图片,记录将预测片段拼接在所述训练图片上的次数,并将其记录为m2。
所述冗余值可根据需要设置。
S203:识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段。
为将训练图片转换成长度不超过裁剪阈值的裁剪图片和图像片段,以便于将超过裁剪阈值的部分能够补充拼接在长度较短的训练图片上,本步骤通过识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段。
在一个优选的实施例中,请参阅图4,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段的步骤,包括:
S31:将所述待裁剪图片设为第一裁剪图片并执行定位进程,以将所述第一裁剪图片的端部为起点,将所述第一裁剪图片上距所述端部长度为所述裁剪阈值的位置设为阈值位置,从所述阈值位置起朝向所述端部移动预置的留空值得到留空位置;其中,所述训练图片中具有字符,所述端部是裁剪图片在其中字符的排列方向上的首尾两端中的任意一端;
为保证最终得到的训练样本的多样性,本步骤的端部可为所述第一裁剪图片的首端,也可为所述第一裁剪图片的尾端。
同时,本步骤中,所述留空值可根据用户需要而设置。
需要说明的是,如果训练图片中的字符是按照水平方向排列的,那么所述端部则是所述裁剪图片在水平方向上的两端中的任意一端,如左端或右端;如果训练图片中的字符是按照垂直方向排列的,那么所述端部则是所述裁剪图片在垂直方向上的两端中的任意一端,如:顶端和底端。
S32:执行裁剪进程,以裁剪所述第一裁剪图片上的留空位置和阈值位置得到,位于所述端部和留空位置之间的前留空片段,位于所述留空位置和阈值位置之间的重叠区域,位于所述阈值位置和位于所述第一裁剪图片的所述端部另一端之间的后阈值片段;
S33:执行第一拼接进程,以复制所述重叠区域,并将其中一个重叠区域拼接在所述前留空片段的末端得到裁剪图片,及将另一个重叠区域拼接在所述后阈值片段的首端得到待评价片段;或
S34:执行第二拼接进程,以复制所述重叠区域,并将其中一个重叠区域拼接在所述前留空片段的首端得到裁剪图片,及将另一个重叠区域拼接在所述后阈值片段的末端得到待评价片段;
S35:执行评价进程,以判断所述待评价片段是否超过所述裁剪阈值;
S36:若是,则将所述待评价片段设为第二裁剪图片,并依次执行所述定位进程、裁剪进程以及所述第一拼接进程或第二拼接进程,得到裁剪图片和待评价片段,并再次执行所述评价进程;
S37:若否,则将所述待评价片段设为图像片段。
进一步地,在对图片进行裁剪时,可能会裁剪到图片上的字符,如此一来会导致因裁减图片所获得的图像片段不完整,导致图像片段信息失效;本申请通过提供重叠区域字体识别的方法,在裁剪的位置附近提供一部分重叠区域,该重叠区域分别添加在一个图像片段的尾部和另一图像片段的头部,以避免两个图像片段因裁剪到某一字符而造成图像片段失效的问题发生;而对比文件1并未公开相关技术特征。
S204:获取所述待裁剪图片的图像标签;从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签。
当前OCR标签是标签对应到一行字符串的图片而不是一个图片中的字符对应一个标签,而如果贸然对图片进行截取,将会导致标签无法对应到任一被截取形成的图像片段;为避免经裁剪所获得的裁剪图片和图像片段的图像标签,无法准确反应所述裁剪图片或图像片段中的字符串,导致其无法用户OCR识别训练;
本步骤通过获取所述待裁剪图片的图像标签;从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签,以实现了裁剪图像和图像片段的图像标签,能够分别准确反应裁剪图像和图像片段中的字符串,保证了得到的裁剪图像及其图像标签和图像片段及其图像标签能够被用于OCR识别训练。
所述待裁剪图片的类型包括印刷体和非印刷体,所述印刷体的图片中各字符的间距相等,所述非印刷体的图片中各字符的间距不等。
在一个优选的实施例中,请参阅图5,若所述待裁剪图片类型为印刷体,从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签的步骤,包括:
S4-01:提取所述待裁剪图片的长度和图像标签,并获得所述裁剪图片和图像片段的长度;
S4-02:将所述裁剪图片的长度与待裁剪图片的长度相除得到裁剪概率,将所述裁剪概率与图像标签中字符的数量相乘得到裁剪数量,从所述图像标签中获取与所述裁剪数量对应的字符,并汇总以作为所述裁剪图像的图像标签;
S4-03:将所述图像片段的长度与待裁剪图片的长度相除得到片段概率,将所述片段概率与图像标签中字符的数量相乘得到片段数量,从所述图像标签中获取与所述片段数量对应的字符,并汇总以作为所述图像片段的图像标签。
在一个优选的实施例中,请参阅图6,若所述待裁剪图片类型为非印刷体,从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签的步骤,包括:
S4-11:提取所述待裁剪图片的图像标签,并分别对所述裁剪图像和图像片段进行二值化处理,对应得到二值裁剪图片和二值图像片段;
本步骤中,通过对裁剪图像进行二值化处理得到二值裁剪图片,对图像片段进行二值化处理得到二值图像片段;
S4-12:通过垂直投影模块获得二值裁剪图片中每一个字符左右两侧的起始位置,并根据所述起始位置绘制矩形框以标注各字符的位置;从图像标签中得到与所述二值裁剪图片中矩形框的数量对应的字符,并将汇总以作为所述裁剪图片的图像标签。
S4-13:通过垂直投影模块获得二值图像片段中每一个字符左右两侧的起始位置,并根据所述起始位置绘制矩形框以标注各字符的位置;从图像标签中得到与所述二值图像片段中矩形框的数量对应的字符,并汇总以作为所述裁剪图片的图像标签。
本步骤中,所述垂直投影是指在垂直方向上对二值化图像的某一种像素进行统计,对于二值化图像非黑即白,因此,于本实施例中,对二值化图像中的黑点进行统计,根据统计结果就可以判断出每一行的上下边界以及每一列的左右边界,例如,将统计结果为0的列与统计结果为非0的列之间的位置设为起始位置,从而实现分割的目的。于本实施例中,采用OpenCV作为所述垂直投影模块,所述OpenCV实现是一个基于BSD许可(开源)发行的跨平台计算机视觉和机器学习软件库,可以运行在Linux、Windows、Android和Mac OS操作系统上。它轻量级而且高效--由一系列 C 函数和少量 C++ 类构成,同时提供了Python、Ruby、MATLAB等语言的接口,实现了图像处理和计算机视觉方面的很多通用算法。
在一个优选的实施例中,请参阅图7,若所述待裁剪图片类型为非印刷体,从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签的步骤,包括:
S4-21:提取所述待裁剪图片的长度和图像标签,并获得所述裁剪图片和图像片段的长度。
S4-22:通过预设的字符模型计算所述裁剪图片的长度得到所述裁剪图片中字符的数量,从所述图像标签中获取与所述数量对应的字符,并汇总以作为所述裁剪图像的图像标签。
S4-23:通过预设的字符模型计算所述图像片段的长度得到所述图像片段中字符的数量,从所述图像标签中获取与所述数量对应的字符,并汇总以作为所述图像片段的图像标签。
于本实施例中,所述字符模型是通过以下方式获得:
以类型为非印刷体的图像作为字符训练样本;所述等间距模型按照其中预设的间距值对所述训练样本的长度进行分割,并预测所述训练样本中的字符数量以得到预测数量;将所述预测数量与所述训练样本中训练标签的字符数量相除,得到所述等间距模型对训练样本中字符数量预测的准确率;根据所述准确率调整所述间距值,直至所述等间距模型对所述训练样本的字符数量预测的准确率达到预设的准确阈值,并将所述等间距模型设为字符模型。
S205:识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片。
为避免获得的训练样本中充斥着过多无用的补零信息,进而导致OCR识别训练效率降低,本步骤通过识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片,由于拼接图片中不是现有技术中采用的补零操作,而是裁切长度过长的训练图片得到图像片段并将其拼接在长度较短的训练图片上,极大的降低了图片集合中的补零信息,进而有助于提高OCR识别训练的效率。
本步骤中,所述拼接阈值可通过步骤S202所得到,也可由用户自行设置。
在一个优选的实施例中,请参阅图8,将所述图像片段与所述待拼接图片进行拼接得到拼接图片的步骤,包括:
S51:将任一图像片段作为目标片段,并将其与所述待拼接图片进行拼接得到待评价图片。
S52:执行判断线程,判断待评价图片的长度是否超过预设的裁剪阈值。
S53:若否,则执行持续拼接线程,以提取除所述目标片段外的任一图像片段并将其设为新增的目标片段,并将该目标片段与所述待评价图片进行拼接得到更新的待评价图片,并执行所述判断线程。本步骤中,记录作为目标片段的图像标签并将其设为目标标签,执行所述持续拼接线程时,提取除目标标签外的任一图像片段的图像标签,并获得该图像标签所对应的图像片段,及将该图像片段设为新增的目标片段,同时,再将所述新增的目标片段所对应的图像标签设为目标标签,以便于再次提取除目标片段外的任一图像片段并将其设为新增的目标片段,并将该目标片段与所述待评价图片进行拼接得到更新的待评价图片的持续进行,以保证最终能够获得长度超过所述裁剪阈值的待评价图片。
S54:若是,则断开最后与所述待评价图片拼接的图像片段,并在断开所述图像片段的待评价图片的尾部进行补零操作,以得到长度为所述裁剪阈值的拼接图片。
S206:拼接所述拼接图片中待拼接图片和图像片段的图像标签得到拼接标签,将所述拼接标签设为所述拼接图片的图像标签。
为避免因拼接图片的图像标签无法准确对应其拼接图片,本步骤通过拼接所述拼接图片中待拼接图片和图像片段的图像标签得到拼接标签,将所述拼接标签设为所述拼接图片的图像标签,以解决拼接图片的图像标签无法反应其拼接图片中的字符串,导致得到的拼接图片无法用于OCR识别训练的问题。
于本实施例中,按照待拼接图片和图像片段在拼接图片中的位置关系,拼接所述待拼接图片的图像标签和图像片段的图像标签,
例如:如果待拼接图片位于图像片段的头部,则将待拼接图片的图像标签拼接在图像片段的图像标签的头部形成拼接图片的图像标签;如果待拼接图片位于图像片段的尾部,则将待拼接图片的图像标签拼接在图像片段的图像标签的尾部形成拼接图片的图像标签。
S207:识别长度不超过裁剪阈值且超过拼接阈值的训练图片,对所述训练图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
为能够得到对OCR识别模型进行批量化训练的训练样本,并且降低该训练样本中补零信息的数量及占比,本步骤通过识别长度不超过裁剪阈值且超过拼接阈值的训练图片,对所述训练图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本,以在保证训练样本能够实现批量化训练的同时,极大的降低了补零信息的占比。
汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本之后,还可包括:
将所述训练样本保存至区块链中。
需要说明的是,基于训练样本得到对应的摘要信息,具体来说,摘要信息由训练样本进行散列处理得到,比如利用sha256s算法处理得到。将摘要信息上传至区块链可保证其安全性和对用户的公正透明性。用户设备可以从区块链中下载得该摘要信息,以便查证训练样本是否被篡改。本示例所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain),本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。
实施例三:
请参阅图9,本实施例的一种训练样本制作装置1,包括:
输入模块11,用于获取至少一个训练图片;
图片裁剪模块13,用于识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
图片拼接模块15,用于识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
样本汇总模块17,用于识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
可选的,所述训练样本制作装置1还包括:
复杂度评价模块12,用于通过预设的加载模型根据至少一个预设的裁剪长度并按照预设的裁剪拼接规则,计算对所述训练图片进行裁剪及拼接的复杂度,将复杂度最低的裁剪长度设为裁剪阈值,将所述裁剪阈值与预设的冗余值相减得到拼接阈值。
可选的,所述训练样本制作装置1还包括:
标签裁剪模块14,用于获取所述待裁剪图片的图像标签;从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签。
可选的,所述训练样本制作装置1还包括:
标签拼接模块16,用于拼接所述拼接图片中待拼接图片和图像片段的图像标签得到拼接标签,将所述拼接标签设为所述拼接图片的图像标签。
本技术方案应用于人工智能的图像检测领域,通过裁剪待裁剪图片得到长度不超过裁剪阈值的裁剪图片及图像片段,将图像片段与待拼接图片进行拼接得到拼接图片,对训练图片进行补零操作使其长度达到裁剪阈值,汇总裁剪图片、拼接图片和训练图片形成训练样本,以便于应用在图像处理的OCR识别训练领域。
实施例四:
为实现上述目的,本申请还提供一种计算机设备5,实施例三的训练样本制作装置1的组成部分可分散于不同的计算机设备中,计算机设备5可以是执行程序的智能手机、平板电脑、笔记本电脑、台式计算机、机架式服务器、刀片式服务器、塔式服务器或机柜式服务器(包括独立的服务器,或者多个应用服务器所组成的服务器集群)等。本实施例的计算机设备至少包括但不限于:可通过系统总线相互通信连接的存储器51、处理器52,如图10所示。需要指出的是,图10仅示出了具有组件-的计算机设备,但是应理解的是,并不要求实施所有示出的组件,可以替代的实施更多或者更少的组件。
本实施例中,存储器51(即可读存储介质)包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。在一些实施例中,存储器51可以是计算机设备的内部存储单元,例如该计算机设备的硬盘或内存。在另一些实施例中,存储器51也可以是计算机设备的外部存储设备,例如该计算机设备上配备的插接式硬盘,智能存储卡(Smart Media Card, SMC),安全数字(Secure Digital, SD)卡,闪存卡(Flash Card)等。当然,存储器51还可以既包括计算机设备的内部存储单元也包括其外部存储设备。本实施例中,存储器51通常用于存储安装于计算机设备的操作系统和各类应用软件,例如实施例三的训练样本制作装置的程序代码等。此外,存储器51还可以用于暂时地存储已经输出或者将要输出的各类数据。
处理器52在一些实施例中可以是中央处理器(Central Processing Unit,CPU)、控制器、微控制器、微处理器、或其他数据处理芯片。该处理器52通常用于控制计算机设备的总体操作。本实施例中,处理器52用于运行存储器51中存储的程序代码或者处理数据,例如运行训练样本制作装置,以实现实施例一和实施例二的训练样本制作方法。
实施例五:
为实现上述目的,本申请还提供一种计算机可读存储介质,所述可读存储介质可以是非易失性,也可以是易失性,如闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘、服务器、App应用商城等等,其上存储有计算机程序,程序被处理器52执行时实现相应功能。本实施例的计算机可读存储介质用于存储训练样本制作装置,被处理器52执行时实现实施例一和实施例二的训练样本制作方法。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。
以上仅为本申请的优选实施例,并非因此限制本申请的专利范围,凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换,或直接或间接运用在其他相关的技术领域,均同理包括在本申请的专利保护范围内。

Claims (20)

  1. 一种训练样本制作方法,包括:
    获取至少一个训练图片;
    识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
    识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
    识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
  2. 根据权利要求1所述的训练样本制作方法,其中,获取至少一个训练图片之后,还可包括:
    通过预设的加载模型根据至少一个预设的裁剪长度并按照预设的裁剪拼接规则,计算对所述训练图片进行裁剪及拼接的复杂度,将复杂度最低的裁剪长度设为裁剪阈值,将所述裁剪阈值与预设的冗余值相减得到拼接阈值。
  3. 根据权利要求1所述的训练样本制作方法,其中,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段的步骤,包括:
    将所述待裁剪图片设为第一裁剪图片并执行定位进程,以将所述第一裁剪图片的端部为起点,将所述第一裁剪图片上距所述端部长度为所述裁剪阈值的位置设为阈值位置,从所述阈值位置起朝向所述端部移动预置的留空值得到留空位置;其中,所述训练图片中具有字符,所述端部是裁剪图片在其中字符的排列方向上的首尾两端中的任意一端;执行裁剪进程,以裁剪所述第一裁剪图片上的留空位置和阈值位置得到,位于所述端部和留空位置之间的前留空片段,位于所述留空位置和阈值位置之间的重叠区域,位于所述阈值位置和位于所述第一裁剪图片的所述端部另一端之间的后阈值片段;
    执行第一拼接进程,以复制所述重叠区域,并将其中一个重叠区域拼接在所述前留空片段的末端得到裁剪图片,及将另一个重叠区域拼接在所述后阈值片段的首端得到待评价片段;或执行第二拼接进程,以复制所述重叠区域,并将其中一个重叠区域拼接在所述前留空片段的首端得到裁剪图片,及将另一个重叠区域拼接在所述后阈值片段的末端得到待评价片段;
    执行评价进程,以判断所述待评价片段是否超过所述裁剪阈值;
    若是,则将所述待评价片段设为第二裁剪图片,并依次执行所述定位进程、裁剪进程以及所述第一拼接进程或第二拼接进程,得到裁剪图片和待评价片段,并再次执行所述评价进程;
    若否,则将所述待评价片段设为图像片段。
  4. 根据权利要求1所述的训练样本制作方法,其中,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段之后,还包括:
    获取所述待裁剪图片的图像标签;从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签。
  5. 根据权利要求4所述的训练样本制作方法,其中,从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签的步骤,包括:
    提取所述待裁剪图片的图像标签,并分别对所述裁剪图像和图像片段进行二值化处理,对应得到二值裁剪图片和二值图像片段;
    通过垂直投影模块获得二值裁剪图片中每一个字符左右两侧的起始位置,并根据所述起始位置绘制矩形框以标注各字符的位置;从图像标签中得到与所述二值裁剪图片中矩形框的数量对应的字符,并将汇总以作为所述裁剪图片的图像标签;
    通过垂直投影模块获得二值图像片段中每一个字符左右两侧的起始位置,并根据所述起始位置绘制矩形框以标注各字符的位置;从图像标签中得到与所述二值图像片段中矩形框的数量对应的字符,并汇总以作为所述裁剪图片的图像标签。
  6. 根据权利要求1所述的训练样本制作方法,其中,将所述图像片段与所述待拼接图片进行拼接得到拼接图片的步骤,包括:
    将任一图像片段作为目标片段,并将其与所述待拼接图片进行拼接得到待评价图片;
    执行判断线程,以判断所述待评价图片的长度是否超过预设的裁剪阈值;
    若否,则执行持续拼接线程,以继续提取除所述目标片段外的任一图像片段并将其设为新增的目标片段,并将该目标片段与所述待评价图片进行拼接得到更新的待评价图片,并执行所述判断线程;
    若是,则断开最后与所述待评价图片拼接的图像片段,并在断开所述图像片段的待评价图片的尾部进行补零操作,以得到长度为所述裁剪阈值的拼接图片。
  7. 根据权利要求1所述的训练样本制作方法,其中,将所述图像片段与所述待拼接图片进行拼接得到拼接图片之后,还可包括:
    拼接所述拼接图片中待拼接图片和图像片段的图像标签得到拼接标签,将所述拼接标签设为所述拼接图片的图像标签;
    汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本之后,还可包括:
    将所述训练样本保存至区块链中。
  8. 一种训练样本制作装置,包括:
    输入模块,用于获取至少一个训练图片;
    图片裁剪模块,用于识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
    图片拼接模块,用于识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
    样本汇总模块,用于识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
  9. 一种计算机设备,各计算机设备包括存储器、处理器以及存储在存储器上并可在处理器上运行的计算机程序,所述计算机设备的处理器执行所述计算机程序时实现训练样本制作方法,所述训练样本制作方法包括:
    获取至少一个训练图片;
    识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
    识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
    识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
  10. 根据权利要求9所述的计算机设备,其中,获取至少一个训练图片之后,还可包括:
    通过预设的加载模型根据至少一个预设的裁剪长度并按照预设的裁剪拼接规则,计算对所述训练图片进行裁剪及拼接的复杂度,将复杂度最低的裁剪长度设为裁剪阈值,将所述裁剪阈值与预设的冗余值相减得到拼接阈值。
  11. 根据权利要求9所述的计算机设备,其中,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段的步骤,包括:
    将所述待裁剪图片设为第一裁剪图片并执行定位进程,以将所述第一裁剪图片的端部为起点,将所述第一裁剪图片上距所述端部长度为所述裁剪阈值的位置设为阈值位置,从所述阈值位置起朝向所述端部移动预置的留空值得到留空位置;其中,所述训练图片中具有字符,所述端部是裁剪图片在其中字符的排列方向上的首尾两端中的任意一端;执行裁剪进程,以裁剪所述第一裁剪图片上的留空位置和阈值位置得到,位于所述端部和留空位置之间的前留空片段,位于所述留空位置和阈值位置之间的重叠区域,位于所述阈值位置和位于所述第一裁剪图片的所述端部另一端之间的后阈值片段;
    执行第一拼接进程,以复制所述重叠区域,并将其中一个重叠区域拼接在所述前留空片段的末端得到裁剪图片,及将另一个重叠区域拼接在所述后阈值片段的首端得到待评价片段;或执行第二拼接进程,以复制所述重叠区域,并将其中一个重叠区域拼接在所述前留空片段的首端得到裁剪图片,及将另一个重叠区域拼接在所述后阈值片段的末端得到待评价片段;
    执行评价进程,以判断所述待评价片段是否超过所述裁剪阈值;
    若是,则将所述待评价片段设为第二裁剪图片,并依次执行所述定位进程、裁剪进程以及所述第一拼接进程或第二拼接进程,得到裁剪图片和待评价片段,并再次执行所述评价进程;
    若否,则将所述待评价片段设为图像片段。
  12. 根据权利要求9所述的计算机设备,其中,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段之后,还包括:
    获取所述待裁剪图片的图像标签;从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签;
    从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签的步骤,包括:
    提取所述待裁剪图片的图像标签,并分别对所述裁剪图像和图像片段进行二值化处理,对应得到二值裁剪图片和二值图像片段;
    通过垂直投影模块获得二值裁剪图片中每一个字符左右两侧的起始位置,并根据所述起始位置绘制矩形框以标注各字符的位置;从图像标签中得到与所述二值裁剪图片中矩形框的数量对应的字符,并将汇总以作为所述裁剪图片的图像标签;
    通过垂直投影模块获得二值图像片段中每一个字符左右两侧的起始位置,并根据所述起始位置绘制矩形框以标注各字符的位置;从图像标签中得到与所述二值图像片段中矩形框的数量对应的字符,并汇总以作为所述裁剪图片的图像标签。
  13. 根据权利要求9所述的计算机设备,其中,将所述图像片段与所述待拼接图片进行拼接得到拼接图片的步骤,包括:
    将任一图像片段作为目标片段,并将其与所述待拼接图片进行拼接得到待评价图片;
    执行判断线程,以判断所述待评价图片的长度是否超过预设的裁剪阈值;
    若否,则执行持续拼接线程,以继续提取除所述目标片段外的任一图像片段并将其设为新增的目标片段,并将该目标片段与所述待评价图片进行拼接得到更新的待评价图片,并执行所述判断线程;
    若是,则断开最后与所述待评价图片拼接的图像片段,并在断开所述图像片段的待评价图片的尾部进行补零操作,以得到长度为所述裁剪阈值的拼接图片。
  14. 根据权利要求9所述的计算机设备,其中,将所述图像片段与所述待拼接图片进行拼接得到拼接图片之后,还可包括:
    拼接所述拼接图片中待拼接图片和图像片段的图像标签得到拼接标签,将所述拼接标签设为所述拼接图片的图像标签;
    汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本之后,还可包括:
    将所述训练样本保存至区块链中。
  15. 一种计算机可读存储介质,其存储有计算机程序,所述存储介质存储的所述计算机程序被处理器执行时实现训练样本制作方法,所述训练样本制作方法包括:
    获取至少一个训练图片;
    识别长度超过预置的裁剪阈值的训练图片并将其设为待裁剪图片,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段;
    识别长度低于预置拼接阈值的训练图片并将其设为待拼接图片,将所述图像片段与所述待拼接图片进行拼接得到拼接图片;
    识别长度不超过裁剪阈值且超过拼接阈值的训练图片并将其设为待补零图片,对所述待补零图片进行补零操作使其长度达到所述裁剪阈值,汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本。
  16. 根据权利要求15所述的计算机可读存储介质,其中,获取至少一个训练图片之后,还可包括:
    通过预设的加载模型根据至少一个预设的裁剪长度并按照预设的裁剪拼接规则,计算对所述训练图片进行裁剪及拼接的复杂度,将复杂度最低的裁剪长度设为裁剪阈值,将所述裁剪阈值与预设的冗余值相减得到拼接阈值。
  17. 根据权利要求15所述的计算机可读存储介质,其中,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段的步骤,包括:
    将所述待裁剪图片设为第一裁剪图片并执行定位进程,以将所述第一裁剪图片的端部为起点,将所述第一裁剪图片上距所述端部长度为所述裁剪阈值的位置设为阈值位置,从所述阈值位置起朝向所述端部移动预置的留空值得到留空位置;其中,所述训练图片中具有字符,所述端部是裁剪图片在其中字符的排列方向上的首尾两端中的任意一端;执行裁剪进程,以裁剪所述第一裁剪图片上的留空位置和阈值位置得到,位于所述端部和留空位置之间的前留空片段,位于所述留空位置和阈值位置之间的重叠区域,位于所述阈值位置和位于所述第一裁剪图片的所述端部另一端之间的后阈值片段;
    执行第一拼接进程,以复制所述重叠区域,并将其中一个重叠区域拼接在所述前留空片段的末端得到裁剪图片,及将另一个重叠区域拼接在所述后阈值片段的首端得到待评价片段;或执行第二拼接进程,以复制所述重叠区域,并将其中一个重叠区域拼接在所述前留空片段的首端得到裁剪图片,及将另一个重叠区域拼接在所述后阈值片段的末端得到待评价片段;
    执行评价进程,以判断所述待评价片段是否超过所述裁剪阈值;
    若是,则将所述待评价片段设为第二裁剪图片,并依次执行所述定位进程、裁剪进程以及所述第一拼接进程或第二拼接进程,得到裁剪图片和待评价片段,并再次执行所述评价进程;
    若否,则将所述待评价片段设为图像片段。
  18. 根据权利要求15所述的计算机可读存储介质,其中,裁剪所述待裁剪图片得到至少一个长度不超过所述裁剪阈值的裁剪图片及图像片段之后,还包括:
    获取所述待裁剪图片的图像标签;从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签;
    从所述图像标签中获取与所述裁剪图片的字符数量对应的字符,并汇总以作为所述裁剪图像的图像标签;以及从所述图像标签中获取与所述图像片段的字符数量对应的字符,并汇总以作为所述图像片段的图像标签的步骤,包括:
    提取所述待裁剪图片的图像标签,并分别对所述裁剪图像和图像片段进行二值化处理,对应得到二值裁剪图片和二值图像片段;
    通过垂直投影模块获得二值裁剪图片中每一个字符左右两侧的起始位置,并根据所述起始位置绘制矩形框以标注各字符的位置;从图像标签中得到与所述二值裁剪图片中矩形框的数量对应的字符,并将汇总以作为所述裁剪图片的图像标签;
    通过垂直投影模块获得二值图像片段中每一个字符左右两侧的起始位置,并根据所述起始位置绘制矩形框以标注各字符的位置;从图像标签中得到与所述二值图像片段中矩形框的数量对应的字符,并汇总以作为所述裁剪图片的图像标签。
  19. 根据权利要求15所述的计算机可读存储介质,其中,将所述图像片段与所述待拼接图片进行拼接得到拼接图片的步骤,包括:
    将任一图像片段作为目标片段,并将其与所述待拼接图片进行拼接得到待评价图片;
    执行判断线程,以判断所述待评价图片的长度是否超过预设的裁剪阈值;
    若否,则执行持续拼接线程,以继续提取除所述目标片段外的任一图像片段并将其设为新增的目标片段,并将该目标片段与所述待评价图片进行拼接得到更新的待评价图片,并执行所述判断线程;
    若是,则断开最后与所述待评价图片拼接的图像片段,并在断开所述图像片段的待评价图片的尾部进行补零操作,以得到长度为所述裁剪阈值的拼接图片。
  20. 根据权利要求15所述的计算机可读存储介质,其中,将所述图像片段与所述待拼接图片进行拼接得到拼接图片之后,还可包括:
    拼接所述拼接图片中待拼接图片和图像片段的图像标签得到拼接标签,将所述拼接标签设为所述拼接图片的图像标签;
    汇总所述裁剪图片、拼接图片和所述训练图片形成训练样本之后,还可包括:
    将所述训练样本保存至区块链中。
PCT/CN2020/112302 2020-07-28 2020-08-29 训练样本制作方法、装置、计算机设备及可读存储介质 WO2021174786A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010739646.7 2020-07-28
CN202010739646.7A CN111881902B (zh) 2020-07-28 2020-07-28 训练样本制作方法、装置、计算机设备及可读存储介质

Publications (1)

Publication Number Publication Date
WO2021174786A1 true WO2021174786A1 (zh) 2021-09-10

Family

ID=73200957

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/112302 WO2021174786A1 (zh) 2020-07-28 2020-08-29 训练样本制作方法、装置、计算机设备及可读存储介质

Country Status (2)

Country Link
CN (1) CN111881902B (zh)
WO (1) WO2021174786A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070074B (zh) * 2020-11-12 2021-02-05 中电科新型智慧城市研究院有限公司 物体检测方法、装置、终端设备和存储介质
CN112329777B (zh) * 2021-01-06 2021-05-04 平安科技(深圳)有限公司 基于方向检测的文字识别方法、装置、设备及介质
CN113362218B (zh) * 2021-05-21 2022-09-27 北京百度网讯科技有限公司 数据处理方法、装置、电子设备及存储介质
CN113256652A (zh) * 2021-05-24 2021-08-13 中国长江三峡集团有限公司 一种混合图像数据增强方法
CN113989804A (zh) * 2021-11-11 2022-01-28 北京百度网讯科技有限公司 字符识别方法、装置、设备和存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0790574A2 (en) * 1996-02-19 1997-08-20 Fujitsu Limited Character recognition apparatus and method
CN108734708A (zh) * 2018-05-23 2018-11-02 平安科技(深圳)有限公司 胃癌识别方法、装置及存储介质
CN111340037A (zh) * 2020-03-25 2020-06-26 上海智臻智能网络科技股份有限公司 文本版面分析方法、装置、计算机设备和存储介质
CN111444922A (zh) * 2020-03-27 2020-07-24 Oppo广东移动通信有限公司 图片处理方法、装置、存储介质及电子设备

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11030486B2 (en) * 2018-04-20 2021-06-08 XNOR.ai, Inc. Image classification through label progression
US10671878B1 (en) * 2019-01-11 2020-06-02 Capital One Services, Llc Systems and methods for text localization and recognition in an image of a document
CN110097564B (zh) * 2019-04-04 2023-06-16 平安科技(深圳)有限公司 基于多模型融合的图像标注方法、装置、计算机设备及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0790574A2 (en) * 1996-02-19 1997-08-20 Fujitsu Limited Character recognition apparatus and method
CN108734708A (zh) * 2018-05-23 2018-11-02 平安科技(深圳)有限公司 胃癌识别方法、装置及存储介质
CN111340037A (zh) * 2020-03-25 2020-06-26 上海智臻智能网络科技股份有限公司 文本版面分析方法、装置、计算机设备和存储介质
CN111444922A (zh) * 2020-03-27 2020-07-24 Oppo广东移动通信有限公司 图片处理方法、装置、存储介质及电子设备

Also Published As

Publication number Publication date
CN111881902A (zh) 2020-11-03
CN111881902B (zh) 2023-06-27

Similar Documents

Publication Publication Date Title
WO2021174786A1 (zh) 训练样本制作方法、装置、计算机设备及可读存储介质
CN111401371B (zh) 一种文本检测识别方法、系统及计算机设备
WO2022156066A1 (zh) 文字识别方法、装置、电子设备及存储介质
US20210216880A1 (en) Method, equipment, computing device and computer-readable storage medium for knowledge extraction based on textcnn
EP3869385B1 (en) Method for extracting structural data from image, apparatus and device
CN112257613B (zh) 体检报告信息结构化提取方法、装置及计算机设备
WO2022048363A1 (zh) 网站分类方法、装置、计算机设备及存储介质
WO2022156178A1 (zh) 图像目标对比方法、装置、计算机设备及可读存储介质
CN113642584B (zh) 文字识别方法、装置、设备、存储介质和智能词典笔
US11495014B2 (en) Systems and methods for automated document image orientation correction
CN113627395B (zh) 文本识别方法、装置、介质及电子设备
CN112541443B (zh) 发票信息抽取方法、装置、计算机设备及存储介质
CN113221983B (zh) 迁移学习模型的训练方法及装置、图像处理方法及装置
CN113033543A (zh) 曲形文本识别方法、装置、设备及介质
WO2019024231A1 (zh) 数据自动匹配方法、电子设备及计算机可读存储介质
CN111338688B (zh) 数据长效缓存方法、装置、计算机系统及可读存储介质
CN112966687B (zh) 图像分割模型训练方法、装置及通信设备
CN112579958B (zh) 网页转换方法、装置、计算机设备及可读存储介质
WO2021174869A1 (zh) 用户图片数据的处理方法、装置、计算机设备及存储介质
CN112017763A (zh) 医疗影像数据传输方法、装置、设备及介质
CN112035774A (zh) 网络页面生成方法、装置、计算机设备及可读存储介质
WO2023134080A1 (zh) 相机作弊识别方法、装置、设备及存储介质
CN116416632A (zh) 基于人工智能的文件自动归档方法及相关设备
CN114743030A (zh) 图像识别方法、装置、存储介质和计算机设备
WO2021258991A1 (zh) 目标轮廓圈定方法、装置、计算机系统及可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20923441

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20923441

Country of ref document: EP

Kind code of ref document: A1