CN113469161A - Method, device and storage medium for processing logistics list - Google Patents

Publication number
CN113469161A
Authority
CN
China
Prior art keywords
target
information
text
logistics
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010241316.5A
Other languages
Chinese (zh)
Inventor
武晨
赵培
杨刘洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SF Technology Co Ltd
SF Tech Co Ltd
Original Assignee
SF Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SF Technology Co Ltd
Priority to CN202010241316.5A
Publication of CN113469161A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management

Abstract

The embodiments of the application provide a method, a device and a storage medium for processing a logistics list. The method comprises: acquiring a target logistics list picture to be identified; marking key field information in the target logistics list picture with a target labeling box; acquiring coordinate information of the target labeling box in the target logistics list picture; intercepting (cropping), according to the coordinate information of the target labeling box, a target area corresponding to the key field information from the target logistics list picture; identifying target text in the target area; and outputting the target text as target key information, where the target key information represents the logistics information corresponding to the target logistics list picture. The scheme can improve the efficiency and accuracy of extracting key field information from massive business-related pictures and reduce the extraction cost.

Description

Method, device and storage medium for processing logistics list
Technical Field
The embodiment of the application relates to the technical field of image recognition, in particular to a method, a device and a storage medium for processing a logistics list.
Background
In the existing mechanism, in the air transportation business of logistics, staff must manually read the information on the air bill of lading of each airline and compare it with the settlement bill information to confirm the authenticity of the business. The air bills of lading are browsed manually one by one, and the 8 key fields on each picture are entered into the system for comparison. Manually browsing air bills of lading and extracting multiple key fields is a tedious, repetitive task that makes up the daily work of the staff concerned, and browsing a single master air bill of lading takes about one minute.
Even if existing document recognition software is applied to the air transportation business to read the text in an air bill of lading image, it can only convert all characters on the bill into a document without distinction. It cannot return the characters corresponding to each field required by the user; that is, the key field information in the image cannot be read directly and must still be looked up manually in the extracted document, so the efficiency of obtaining the key field information is low. In addition, the accuracy of general-purpose recognition software is not high, and its efficiency in extracting a document from a picture is mediocre.
Disclosure of Invention
The embodiment of the application provides a method, a device and a storage medium for processing a logistics list, which can improve the efficiency and accuracy of extracting key field information from a mass of service related pictures.
In a first aspect, an embodiment of the present application provides a method for processing a logistics list, where the method includes:
acquiring a target logistics list picture to be identified;
marking key field information in the target logistics list picture with a target labeling box;
acquiring coordinate information of the target labeling box in the target logistics list picture;
intercepting (cropping), according to the coordinate information of the target labeling box, a target area corresponding to the key field information from the target logistics list picture;
identifying target text in the target area;
and outputting the target text as target key information, wherein the target key information is used for representing logistics information corresponding to the target logistics list picture.
In one possible design, after identifying the target text in the target area, the method further includes:
calculating the confidence of the recognized target text;
and taking the target text whose confidence is higher than a preset confidence as the target key information.
In one possible design, there is at least one target area and at least one target text; calculating the confidence of the recognized target text comprises:
obtaining the confidence of each target character in each target area;
and multiplying the confidences of the target characters to obtain the confidence of the target text.
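The confidence design above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the dictionary layout of `recognized`, and the threshold value 0.8 are all assumptions made for the example.

```python
from math import prod


def text_confidence(char_confidences):
    """Confidence of a text: product of its per-character confidences."""
    return prod(char_confidences)


def select_key_information(recognized, min_confidence=0.8):
    """Keep only texts whose confidence exceeds the preset threshold.

    `recognized` maps each recognized text to the list of confidences of
    its characters (hypothetical layout, chosen for this sketch).
    """
    kept = {}
    for text, char_confidences in recognized.items():
        confidence = text_confidence(char_confidences)
        if confidence > min_confidence:
            kept[text] = confidence
    return kept


# Hypothetical recognition results for two target areas.
recognized = {
    "528kg": [0.99, 0.98, 0.99, 0.97, 0.99],
    "0.80": [0.9, 0.6, 0.9, 0.9],
}
print(select_key_information(recognized))
```

Because the confidences are multiplied, a single poorly recognized character pulls the whole text below the threshold, which is why the second text is rejected here.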
In one possible design, identifying the target text in the target area includes:
carrying out horizontal projection on the target area to obtain a projection area;
determining a text area and a blank area in the projection area;
acquiring pixel values of the text area and the blank area;
and determining the boundary of the target text in the text area according to the difference between the pixel value of the text area and the pixel value of the blank area.
In one possible design, acquiring the pixel values of the text area and the blank area, and determining the boundary information of the target text in the text area according to the difference between the pixel value of the text area and the pixel value of the blank area, comprises:
calculating the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
determining a starting row, an ending row, a starting column and an ending column of the target text according to a preset pixel threshold, the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
and determining the boundary information of the target text according to the starting row, the ending row, the starting column and the ending column of the target text.
In one possible design, the method further includes:
obtaining a training sample, wherein the training sample comprises a plurality of logistics list pictures;
marking, with labeling boxes, the positions of a plurality of items of key information of the air transportation business in the logistics list pictures, and recording the coordinate information of the labeling boxes that mark those positions;
inputting the logistics list pictures marked with the position information into a positioning model, identifying the category of each labeling box on the logistics list pictures through the positioning model, and taking the size of each category of labeling box as the prior size of a candidate box in the positioning model;
compressing the size of the pictures, and updating the weights of each layer of the positioning model according to the coordinate information of each labeling box in the logistics list pictures, so as to obtain the optimal model parameters of the positioning model.
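As an illustration of deriving prior candidate-box sizes from the annotated boxes, the sketch below averages the width and height of the boxes in each category. The text does not say how the prior size is computed, so the averaging, the `prior_sizes` name, the category names and the `(left, top, right, bottom)` box layout are all assumptions.

```python
from collections import defaultdict


def prior_sizes(annotations):
    """Average (width, height) of the labeling boxes of each category.

    `annotations` is a list of (category, (left, top, right, bottom)).
    """
    sizes = defaultdict(list)
    for category, (left, top, right, bottom) in annotations:
        sizes[category].append((right - left, bottom - top))
    return {
        category: (
            sum(w for w, _ in boxes) / len(boxes),
            sum(h for _, h in boxes) / len(boxes),
        )
        for category, boxes in sizes.items()
    }


# Hypothetical annotations from two training pictures.
annotations = [
    ("gross_weight", (100, 200, 180, 230)),
    ("gross_weight", (102, 198, 190, 232)),
    ("rate", (300, 200, 340, 225)),
]
print(prior_sizes(annotations))
```

A clustering step (for example K-means over box sizes) would be another way to obtain such priors; the simple per-category mean is used here only to keep the example short.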
In one possible design, after the target logistics list picture to be identified is obtained, and before the key field information in the target logistics list picture is marked with the target labeling box, the method further includes:
acquiring the contour information of the target logistics list picture;
acquiring the straight lines in the target logistics list picture;
acquiring the deflection angle of each straight line according to the start coordinate and the end coordinate of each straight line;
taking the deflection angle that occurs most often as the target deflection angle of the target logistics list picture;
and correcting the picture according to the target deflection angle to obtain the corrected target logistics list picture.
In one possible design, after identifying the target text in the target area, the method further includes:
determining the type of the target text according to the type of the labeling box;
and setting a label on the target text according to the type of the target text, wherein the label is used for identifying the key information type to which the target text belongs.
In a second aspect, an embodiment of the present application provides an apparatus for processing a logistics list, which has a function of implementing a method for processing a logistics list corresponding to the first aspect. The functions can be realized by hardware, and the functions can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the above functions, which may be software and/or hardware.
In one possible design, the apparatus includes:
the input and output module is used for acquiring a target logistics list picture to be identified;
the processing module is used for marking key field information in the target logistics list picture acquired by the input and output module with a target labeling box; acquiring coordinate information of the target labeling box in the target logistics list picture through the input and output module; intercepting (cropping), according to the coordinate information of the target labeling box, a target area corresponding to the key field information from the target logistics list picture; and identifying target text in the target area;
the input and output module is further configured to output the target text as target key information, where the target key information is used to represent logistics information corresponding to the target logistics list picture.
In one possible design, there is at least one target area and at least one target text; the processing module is specifically configured to:
obtaining the confidence of each target character in each target area through the input and output module;
and multiplying the confidences of the target characters to obtain the confidence of the target text.
In one possible design, the processing module is specifically configured to:
carrying out horizontal projection on the target area to obtain a projection area;
determining a text area and a blank area in the projection area;
acquiring pixel values of the text area and the blank area;
and determining the boundary of the target text in the text area according to the difference between the pixel value of the text area and the pixel value of the blank area.
In one possible design, the processing module is specifically configured to:
calculating the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
determining a starting row, an ending row, a starting column and an ending column of the target text according to a preset pixel threshold, the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
and determining the boundary information of the target text according to the starting line, the ending line, the starting column and the ending column of the target text.
In one possible design, the processing module is further configured to:
acquiring a training sample through the input and output module, wherein the training sample comprises a plurality of logistics list pictures;
marking, with labeling boxes, the positions of a plurality of items of key information of the air transportation business in the logistics list pictures, and recording the coordinate information of the labeling boxes that mark those positions;
inputting the logistics list pictures marked with the position information into a positioning model, identifying the category of each labeling box on the logistics list pictures through the positioning model, and taking the size of each category of labeling box as the prior size of a candidate box in the positioning model;
compressing the size of the pictures, and updating the weights of each layer of the positioning model according to the coordinate information of each labeling box in the logistics list pictures, so as to obtain the optimal model parameters of the positioning model.
In one possible design, after the input and output module obtains a target logistics list picture to be identified, and before the key field information in the target logistics list picture is marked with a target labeling box, the processing module is further configured to:
acquiring the contour information of the target logistics list picture through the input and output module;
acquiring the straight lines in the target logistics list picture;
acquiring the deflection angle of each straight line according to the start coordinate and the end coordinate of each straight line;
taking the deflection angle that occurs most often as the target deflection angle of the target logistics list picture;
and correcting the picture according to the target deflection angle to obtain the corrected target logistics list picture.
In one possible design, after identifying the target text in the target region, the processing module is further configured to:
determining the type of the target text according to the type of the labeling box;
and setting a label on the target text according to the type of the target text, wherein the label is used for identifying the key information type to which the target text belongs.
In another aspect, an apparatus for processing a logistics list is provided, which includes at least one connected processor, a memory and an input/output unit, where the memory is used for storing a computer program, and the processor is used for calling the computer program in the memory to execute the method of the first aspect.
Yet another aspect of the embodiments of the present application provides a computer-readable storage medium, which includes instructions that, when executed on a computer, cause the computer to perform the method of the first aspect.
Compared with the prior art, the scheme provided by the embodiments of the application acquires a target logistics list picture to be identified; marks key field information in the target logistics list picture with a target labeling box; acquires coordinate information of the target labeling box in the target logistics list picture; intercepts (crops), according to the coordinate information of the target labeling box, a target area corresponding to the key field information from the target logistics list picture; identifies target text in the target area; and outputs the target text as target key information, where the target key information represents the logistics information corresponding to the target logistics list picture. The scheme can improve the efficiency and accuracy of extracting key field information from massive business-related pictures and reduce the extraction cost.
Drawings
FIG. 1 is a schematic flow chart of a method for processing a logistics list in an embodiment of the present application;
FIG. 2 is a schematic illustration of an air bill of lading picture in an embodiment of the present application;
FIG. 3 is a diagram of a target text in an embodiment of the present application;
FIG. 4 is a schematic diagram of boundary information of a target text determined based on horizontal projection in an embodiment of the present application;
FIG. 5 is a schematic diagram of an apparatus for processing a logistics list in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a physical device for executing the method for processing a logistics list in an embodiment of the present application.
Detailed Description
The terms "first," "second," and the like in the description, the claims and the drawings of the embodiments of the application are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It will be appreciated that data so used may be interchanged under appropriate circumstances, so that the embodiments described herein may be practiced in an order other than that illustrated or described herein. Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover non-exclusive inclusions: a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules expressly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus. The division of modules presented in the present application is merely a logical division and may be implemented differently in a practical application; for example, multiple modules may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the couplings, direct couplings or communicative connections shown or discussed may be made through interfaces, and indirect couplings or communicative connections between modules may be electrical or take other similar forms; the embodiments of the present application are not limited in this respect. Moreover, the modules or sub-modules described as separate components may or may not be physically separated, may or may not be physical modules, or may be distributed in a plurality of circuit modules, and some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiments of the present application.
The embodiments of the application provide a method, a device and a storage medium for processing a logistics list, which can be used on a terminal side or a server side. For example, on the terminal side the scheme can be used to extract business-related key field information from a business-related picture, such as reading the information on the air bill of lading of each airline and the settlement bill information.
Referring to fig. 1, a method for processing a logistics list provided in an embodiment of the present application is described below, where the embodiment of the present application includes:
101. Acquire a target logistics list picture to be identified, and mark key field information in the target logistics list picture with a target labeling box.
The target logistics list picture is a picture of a logistics list that indicates the logistics information and characteristics of an article. For example, the target logistics list picture may be an air bill of lading picture, a land transportation logistics list picture, a port bill of lading picture, a warehouse bill of lading picture, and the like.
The key field information refers to the information content that best summarizes the logistics business to which the target logistics list picture belongs. Taking an air bill of lading picture as an example, it may include 8 key fields: originating station, destination station, flight date, piece size, gross weight, billing weight, rate and shipping charge. As shown in FIG. 2, the key field information includes key field titles and key field contents. The key field titles include the commodity code number, the billing weight, the rate, the air freight and the goods name. The key field contents include the commodity code number itself, a billing weight of 528 kg, a rate of 0.80, an air freight of 422, and the goods name "clothing and shoes".
The target labeling box is a box used for locating key field information in the target logistics list picture. Labeling boxes may be set according to the category of the key field information; the labeling boxes used for different key field information may be the same or different, and a labeling box may also be referred to as a positioning box, a detection box, and the like. The number, shape (e.g., rectangular or circular), type and name of the labeling boxes are not limited in the embodiments of the present application. For example, for an air bill of lading picture, separate and distinct labeling boxes may be used for the originating station, destination station, flight date, piece size, gross weight, billing weight, rate, and shipping charge, respectively. For example, as shown in FIG. 2, the ellipses are all target labeling boxes.
In some embodiments, in order to improve the accuracy of identifying the key field information, the central area of the target labeling box may be aligned with the key field information in the target logistics list picture.
In some embodiments, the target logistics list picture may be further preprocessed to improve its quality and thus facilitate identification of the key field information. Specifically, after the target logistics list picture to be identified is obtained, and before the key field information in the target logistics list picture is marked with the target labeling box, the method further includes:
acquiring the contour information of the target logistics list picture;
acquiring the straight lines in the target logistics list picture;
acquiring the deflection angle of each straight line according to the start coordinate and the end coordinate of each straight line;
taking the deflection angle that occurs most often as the target deflection angle of the target logistics list picture;
and correcting the picture according to the target deflection angle to obtain the corrected target logistics list picture.
Correcting the target deflection angle therefore improves the quality of the target logistics list picture and makes it easier to locate the key field information with the target labeling box afterwards. It avoids problems caused by irregularly displayed information in the picture, such as the target labeling box failing to enclose all of the key field information or partially enclosing non-key field information. Preprocessing thus further ensures the validity and completeness of the target area subsequently intercepted based on the target labeling box.
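The angle-voting step in the list above can be sketched in plain Python. The segments are assumed to be given as `(x1, y1, x2, y2)` tuples, for example as produced by a line detector; rounding each angle to the nearest degree before counting, and the function names, are assumptions of this sketch rather than details taken from the application.

```python
import math
from collections import Counter


def line_angle(x1, y1, x2, y2):
    """Deflection angle of a segment in degrees, rounded to the nearest degree."""
    return round(math.degrees(math.atan2(y2 - y1, x2 - x1)))


def dominant_angle(lines):
    """Return the deflection angle that occurs most often among the segments."""
    counts = Counter(line_angle(*line) for line in lines)
    return counts.most_common(1)[0][0]


# Hypothetical segments: three nearly horizontal table rules and one vertical rule.
lines = [
    (0, 0, 100, 5),
    (0, 50, 200, 60),
    (10, 80, 110, 85),
    (0, 0, 0, 100),
]
print(dominant_angle(lines))
```

Voting for the most frequent angle, rather than averaging, keeps a few stray lines (such as the vertical rule here) from distorting the estimate. The picture would then be rotated by the negative of this angle to straighten it.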
Take the preprocessing of an air bill of lading picture (including coarse and fine adjustment) as an example. The preprocessing comprises the following steps (1) to (3):
(1) Read the air bill of lading picture and obtain W and H of the picture.
(2) According to W and H, correct 90-degree and 240-degree rotations through coarse adjustment.
(3) Fine-tune small angular deflections:
a. Contour information of the picture is acquired with the Canny edge detection algorithm of OpenCV, which effectively suppresses noise. The specific operation is as follows:
The median of the picture's pixel values is computed, and two thresholds, minVal and maxVal, are set to determine which boundaries are true boundaries.
Pixels below the threshold minVal are considered not to be edges; pixels above the threshold maxVal are considered edges; a pixel between minVal and maxVal is considered an edge if it is adjacent to a pixel already identified as an edge, and otherwise is not.
b. Straight lines in the image are obtained with the HoughLinesP Hough line detection algorithm of OpenCV, the deflection angle of each straight line is obtained from its start coordinate and end coordinate, and the deflection angle that occurs most often is taken as the deflection angle of the whole air bill of lading picture.
c. After the image deflection angle is obtained, the image can be corrected with the warpAffine function of OpenCV.
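The two-threshold rule described in step a can be illustrated without OpenCV. The sketch below assumes the input is a small grid of gradient magnitudes; the `hysteresis` name, the 8-neighbour notion of adjacency, and letting edge status propagate along chains of in-between pixels are assumptions of this example. In practice OpenCV's `cv2.Canny` applies this kind of thresholding internally.

```python
def hysteresis(magnitude, min_val, max_val):
    """Classify pixels as edges using the minVal / maxVal rule.

    Pixels above max_val are edges; pixels below min_val are not; pixels in
    between become edges only if they touch a pixel already marked as an edge.
    """
    height, width = len(magnitude), len(magnitude[0])
    edges = [[magnitude[r][c] > max_val for c in range(width)] for r in range(height)]
    stack = [(r, c) for r in range(height) for c in range(width) if edges[r][c]]
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < height and 0 <= nc < width
                if inside and not edges[nr][nc] and magnitude[nr][nc] >= min_val:
                    edges[nr][nc] = True  # in-between pixel adjacent to an edge
                    stack.append((nr, nc))
    return edges


# Hypothetical gradient magnitudes: 200 is a strong edge, the 120 next to it
# is kept, while the isolated 120 in the bottom-left corner is rejected.
magnitude = [
    [10, 120, 200],
    [10, 10, 10],
    [120, 10, 10],
]
edges = hysteresis(magnitude, min_val=100, max_val=150)
for row in edges:
    print(row)
```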
102. Acquire the coordinate information of the target labeling box in the target logistics list picture.
The coordinate information of the target labeling box represents the position of the target labeling box in the target logistics list picture. For example, the coordinates of the upper left corner and the lower right corner of the target labeling box may be used to represent it. As another example, the coordinate information may include the coordinates of each point on the target labeling box: when the target labeling box is a rectangle, the coordinate information may include the coordinates of its 4 vertices, from which the position of the target labeling box in the target logistics list picture can be determined.
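The two coordinate representations mentioned above are interchangeable for an axis-aligned rectangle. A minimal sketch, assuming vertices are `(x, y)` pairs in pixel coordinates and using a hypothetical helper name:

```python
def corners_from_vertices(vertices):
    """Reduce 4 rectangle vertices to (left, top, right, bottom)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical box around one key field, vertices listed clockwise.
vertices = [(40, 20), (120, 20), (120, 50), (40, 50)]
print(corners_from_vertices(vertices))
```

The upper-left / lower-right form is the more compact of the two and is convenient as input to a cropping step.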
103. Intercept (crop) a target area corresponding to the key field information from the target logistics list picture according to the coordinate information of the target labeling box.
The target area may have a regular shape that matches the target labeling box, and it covers the pixel points corresponding to the key field information.
For example, when the key field information in the target logistics list picture lies in the central area of the target labeling box, the target area may be intercepted from the central area of the target labeling box. Extracting the target area improves the efficiency and accuracy of the later text extraction of the key field information.
As another example, the size of the target labeling box may be dynamically adjusted based on the pixel region occupied by the key field information, so that fewer non-key-field pixels (i.e., noise pixel points) are intercepted into the target area, which further improves the accuracy of the later text extraction of the key field information.
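A minimal sketch of the interception step, assuming the picture is a list of pixel rows and the box is `(left, top, right, bottom)` with exclusive right and bottom edges. The `central_box` helper and its `ratio` parameter are hypothetical; they illustrate restricting the crop to the central area of a labeling box.

```python
def crop(image, box):
    """Cut the region (left, top, right, bottom) out of a row-major image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def central_box(box, ratio=0.5):
    """Shrink a box around its center, keeping `ratio` of its width and height."""
    left, top, right, bottom = box
    new_w = int((right - left) * ratio)
    new_h = int((bottom - top) * ratio)
    new_left = left + (right - left - new_w) // 2
    new_top = top + (bottom - top - new_h) // 2
    return new_left, new_top, new_left + new_w, new_top + new_h


# Hypothetical 4x6 picture whose pixel value encodes its row and column.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
print(crop(image, (1, 1, 4, 3)))
print(central_box((0, 0, 8, 4)))
```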
In some embodiments, intercepting the target area corresponding to the key field information from the target logistics list picture according to the coordinate information of the target labeling box includes:
determining the edge features of the target labeling box in the target logistics list picture with an edge detection algorithm according to the coordinate information of the target labeling box, where the edge features comprise the positions of the pixel points in the target logistics list picture that correspond to the edge of the target labeling box;
extracting the features of each pixel point of the key field information from the target logistics list picture, using the edge features as the interception range;
and taking the features of each pixel point of the key field information as the target area.
Because the coordinate information of the target labeling box can include the coordinates of each point on the box, the position of the key field information in the target logistics list picture can be located accurately when the target area is intercepted, which improves the accuracy of the interception and reduces the probability of intercepting unnecessary pixel points.
As shown in FIG. 3, the rectangular solid-line box is the target area intercepted for the corresponding key field information.
104. Identify the target text in the target area.
The target text is a character string that expresses key field information; each target text is one item of key field information. The number of detected target texts is not limited in the embodiments of the application, and the types of key field information to be extracted can be selected based on the extraction requirement. Correspondingly, when only part of the key field information in the target logistics list picture is to be extracted, either only the prior labeling boxes of that key field information are set when the model is trained, or prior labeling boxes are set for all key field information and, after all key field information has been extracted based on them, the target areas belonging to unnecessary key field information are directly discarded. Examples of target texts: delivery location: Beijing; receiving area: Shenzhen; goods name: a digital code.
In some embodiments, identifying the target text in the target area comprises:
a. Horizontally projecting the target area to obtain a projection area.
b. Determining a text area and a blank area in the projection area.
c. Acquiring the pixel value of the text area and the pixel value of the blank area, and determining the boundary information of the target text in the text area according to the difference between the two.
In some embodiments, the boundary information of the target text in the text region is determined according to the following operations:
calculating the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
determining a starting row, an ending row, a starting column and an ending column of the target text according to a preset pixel threshold, the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
and determining the boundary information of the target text according to the starting row, the ending row, the starting column and the ending column of the target text.
As shown in fig. 6, a horizontal projection method is adopted. The pixel value of each pixel in the picture is a number from 0 to 255, and the pixel sum of each row of the left picture is calculated. For an all-white pixel row without characters, the pixel sum is 255 × width, and the pixel sum of a row containing characters is necessarily smaller than this value. The starting row and the ending row of the character area can therefore be found by setting a threshold, and the starting column and the ending column can be found in the same way.
Therefore, by adopting horizontal projection and then accurately segmenting the text boundaries according to the difference between the pixel sums of the text area and the blank area, the text recognition effect can be improved.
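The row and column sums described above can be sketched as follows. This is a minimal illustration, assuming dark text on a white background, a non-empty grayscale picture stored as a list of pixel rows, and a hypothetical `tolerance` parameter standing in for the preset pixel threshold; none of these names come from the application.

```python
from typing import List, Optional, Tuple


def first_last_below(sums: List[int], limit: int) -> Optional[Tuple[int, int]]:
    """Return the first and last index whose sum is below `limit`, or None if there is none."""
    hits = [i for i, s in enumerate(sums) if s < limit]
    return (hits[0], hits[-1]) if hits else None


def text_bounds(image: List[List[int]], tolerance: int = 0) -> Optional[Tuple[int, int, int, int]]:
    """Locate text via pixel sums.

    Returns (start_row, end_row, start_col, end_col), all inclusive,
    or None when every row or column looks blank.
    """
    height, width = len(image), len(image[0])
    row_sums = [sum(row) for row in image]
    col_sums = [sum(image[r][c] for r in range(height)) for c in range(width)]
    # An all-white row sums to 255 * width; any row containing text sums to less.
    rows = first_last_below(row_sums, 255 * width - tolerance)
    # The same reasoning applies to columns, with 255 * height as the blank value.
    cols = first_last_below(col_sums, 255 * height - tolerance)
    if rows is None or cols is None:
        return None
    return rows + cols
```

A larger `tolerance` makes the sketch less sensitive to light noise in otherwise blank rows, at the cost of possibly trimming faint strokes.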
In some embodiments, to make the extracted key field information easier to read and process later, the extracted target text may be further classified, and the classification may be performed by means of labels. Specifically, after identifying the target text in the target region, the method further comprises:
determining the type of the target text according to the type of the labeling box;
and setting a label on the target text according to the type of the target text, wherein the label is used for identifying the key information type to which the target text belongs.
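The labeling step above can be sketched as a lookup from labeling-box type to key-information type. The numeric box types and the label names below are hypothetical and chosen only for illustration; the application does not define them.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from labeling-box type to key-information label.
BOX_TYPE_TO_LABEL: Dict[int, str] = {
    0: "delivery_location",
    1: "receiving_area",
    2: "goods_name",
}


def tag_texts(recognized: List[Tuple[int, str]]) -> List[Dict[str, str]]:
    """Attach a key-information label to each (box_type, text) pair."""
    return [
        {"label": BOX_TYPE_TO_LABEL.get(box_type, "unknown"), "text": text}
        for box_type, text in recognized
    ]
```

Texts from a box type that is not in the mapping are tagged `"unknown"` here rather than dropped, so that nothing recognized is lost silently.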
In some embodiments, to improve the recognition accuracy of the target text, the following operations may be further performed after the target text in the target area is recognized:
(1) calculating a confidence level of the recognized target text.
In some embodiments, the target area is at least one and the target text is at least one. The calculating a confidence level that the target text is identified comprises:
obtaining the confidence coefficient of each target character in each target area;
and multiplying the confidence degrees of the target characters belonging to the same target text to obtain the confidence degree of the target text.
The confidence level of the target text may also be referred to as reliability or confidence coefficient. When a population parameter is estimated from a sample, the conclusion is always uncertain because of the randomness of the sample. A probabilistic statement, namely interval estimation in mathematical statistics, is therefore adopted: the confidence level is the probability that the estimated value lies within a certain allowable error range of the population parameter.
Alternatively, the confidence level of each target text may be calculated by a softmax function, and the confidence level may be represented by a score value. Specifically, the score value of each target character is respectively input into a softmax function, a corresponding result from 0 to 1 is calculated, and then the softmax results of all the target characters belonging to the same target text are multiplied to calculate the confidence coefficient of the target text.
(2) Taking the target text with a confidence higher than the preset confidence as the target key information.
The preset confidence can be determined according to the required recognition accuracy: the higher the required accuracy, the larger the preset confidence, and vice versa. For example, if the preset confidence is set to 0.9, recognition results with a confidence greater than 0.9 can be used directly for subsequent tasks, while recognition results with a confidence less than 0.9 need to be reviewed manually once. With this screening mechanism, massive numbers of target logistics list pictures can be screened quickly and accurately. In actual service operation, the threshold can be adjusted according to the actual conditions of the service, so that confidence mechanisms of different strictness are applied to the recognition results.
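Steps (1) and (2) above can be sketched together. The sketch assumes that the recognizer returns, for every target character, a list of raw class scores, and that the character confidence is the largest softmax probability; both points are assumptions, since the application only states that softmax results between 0 and 1 are multiplied. The function names are hypothetical, and a confidence exactly equal to the threshold is sent to manual review here, a case the description leaves open.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one character's class scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_confidence(char_scores: List[List[float]]) -> float:
    """Multiply the top softmax probability of every character in one target text."""
    confidence = 1.0
    for scores in char_scores:
        confidence *= max(softmax(scores))
    return confidence


def screen(results: List[Tuple[str, float]], threshold: float = 0.9) -> Tuple[List[str], List[str]]:
    """Split (text, confidence) pairs into directly usable texts and texts for manual review."""
    accepted: List[str] = []
    review: List[str] = []
    for text, confidence in results:
        (accepted if confidence > threshold else review).append(text)
    return accepted, review
```

Because the per-character probabilities are multiplied, longer texts tend to receive lower confidences, which is one reason the threshold may need tuning per field type.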
105. Outputting the target text as target key information.
The target key information is used for representing logistics information corresponding to the target logistics single picture.
Fig. 3 shows the target text extracted from the air bill of lading picture.
In the embodiment of the application, a target logistics list picture to be identified is obtained; a target labeling frame is used to mark key field information in the target logistics list picture; coordinate information of the target labeling frame in the target logistics list picture is acquired; a target area corresponding to the key field information is intercepted from the target logistics list picture according to the coordinate information of the target labeling frame; target text in the target area is identified; and the target text is output as target key information, which represents the logistics information corresponding to the target logistics list picture. Because the positions of the 8 key fields of the same logistics service are approximately fixed in the target logistics list picture, the scheme can improve the efficiency and accuracy of extracting key field information from massive service-related pictures and save extraction cost.
In addition, the embodiment of the application selects the strategy for locating the key field information according to whether the position of the key field information in the target logistics list picture is fixed. The strategies may include a strategy for locating key field content and a strategy for locating key field topics.
(1) Strategy for locating key field content
Specifically, as shown in fig. 2, when the positions of the 8 key fields of the same logistics service are approximately fixed in the target logistics list picture, the position of the key field information is located directly, and the result can be used directly for subsequent text recognition. The strategy for locating key field content is therefore generally adopted in this case.
(2) Strategy for locating key field topics
Specifically, as shown in fig. 2, when the positions of the 8 key fields of the same logistics service are not fixed, the position of the key field information is located by using the strategy for locating key field topics, which can improve the positioning accuracy.
Optionally, in some of the embodiments of the present application, the method is implemented by a neural network model. The method further comprises the following model training process:
(1) obtaining a training sample, wherein the training sample comprises a plurality of logistics single pictures.
(2) Marking the position information of a plurality of items of key information of the air transportation business in the logistics list pictures with labeling frames, and recording the coordinate information of each labeling frame used for marking the position information.
(3) Inputting the logistics single picture marked with the position information into a positioning model, identifying the category of each marking frame on the logistics single picture through the positioning model, and taking the size of each type of marking frame as the prior size of a candidate frame in the positioning model.
Optionally, the labeling frames can be clustered by type with the K-means algorithm according to their coordinate information, and the size of each type of labeling frame is output as the prior size of the candidate frames of the positioning model.
(4) Compressing the size of the picture, and updating the weight of each layer of the positioning model according to the coordinate information of each marking frame in the logistics single picture so as to obtain the optimal model parameter of the positioning model.
For example, the size of the original logistics list picture is 1500 × 2100. To accelerate model training and inference without affecting the precision, the picture can be compressed from 1500 × 2100 to 640 × 800, and the connection weights of each layer of the model are continuously updated with a stochastic gradient descent algorithm in combination with the coordinate information of the labeling frames, until the model can predict the positions of the eight key regions in the picture well. The model parameters are then saved to obtain the trained positioning model.
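Two pieces of the training preparation above can be sketched in plain Python: rescaling labeling-frame coordinates when the picture is compressed, and clustering frame sizes with K-means to obtain prior sizes. Both functions are illustrative assumptions. The application does not state that coordinates are rescaled, how K-means is initialized, or which distance is used; here the first `k` sizes serve as initial centers and squared Euclidean distance on (width, height) is used.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (left, top, right, bottom)
Size = Tuple[float, float]               # (width, height)


def scale_box(box: Box, src: Size, dst: Size) -> Box:
    """Map a labeling frame from a picture of size `src` to a compressed picture of size `dst`."""
    sx, sy = dst[0] / src[0], dst[1] / src[1]
    left, top, right, bottom = box
    return (left * sx, top * sy, right * sx, bottom * sy)


def kmeans_box_sizes(sizes: List[Size], k: int, iterations: int = 20) -> List[Size]:
    """Cluster (width, height) pairs and return one mean size per cluster."""
    centers = list(sizes[:k])  # deterministic initialization, an assumption of this sketch
    for _ in range(iterations):
        groups: List[List[Size]] = [[] for _ in range(k)]
        for w, h in sizes:
            nearest = min(
                range(k),
                key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2,
            )
            groups[nearest].append((w, h))
        new_centers = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:  # assignments are stable, so stop early
            break
        centers = new_centers
    return centers
```

Note that compressing 1500 × 2100 to 640 × 800 changes the aspect ratio, so the horizontal and vertical scale factors differ and are applied separately.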
Any technical feature mentioned in the embodiment corresponding to any one of fig. 1 to 3 is also applicable to the embodiments corresponding to fig. 4 and 5 in the embodiment of the present application, and similar details are not repeated below.
The above describes a method for processing a logistics list in the embodiment of the present application, and the following describes an apparatus for processing a logistics list in the embodiment of the present application.
Fig. 4 is a schematic diagram of an apparatus 40 for processing a logistics list. The apparatus for processing a logistics list in the embodiment of the present application can implement the steps of the method for processing a logistics list performed in the embodiment corresponding to fig. 1. The functions of the apparatus can be realized by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions, and the modules may be software and/or hardware. The apparatus 40 for processing a logistics list may include a processing module 401 and an input/output module 402. For the implementation of the functions of the processing module 401 and the input/output module 402, reference may be made to the operations executed in the embodiment corresponding to fig. 1, which are not described herein again. For example, the processing module may be used to control the input, output and acquisition operations of the input/output module.
In some embodiments, the input/output module 402 may be configured to obtain a target logistics list picture to be identified;
the processing module 401 may be configured to identify, by using a target label box, key field information in the target logistics list picture obtained by the input and output module 402; acquiring coordinate information of the target labeling box in the target logistics single picture through the input/output module 402; according to the coordinate information of the target labeling frame, intercepting a target area corresponding to key field information from the target logistics single picture; identifying target text in the target area;
the input/output module 402 is further configured to output the target text as target key information, where the target key information is used to represent logistics information corresponding to the target logistics single picture.
In this embodiment of the application, after the input/output module 402 acquires a target logistics list picture to be identified, the processing module 401 marks key field information in the target logistics list picture by using a target labeling frame; acquires coordinate information of the target labeling frame in the target logistics list picture; intercepts a target area corresponding to the key field information from the target logistics list picture according to the coordinate information of the target labeling frame; and identifies target text in the target area. The target text is output as target key information, which represents the logistics information corresponding to the target logistics list picture. Because the positions of the 8 key fields of the same logistics service are approximately fixed in the target logistics list picture, the scheme can improve the efficiency and accuracy of extracting key field information from massive service-related pictures and save extraction cost.
In some embodiments, the target area is at least one, and the target text is at least one; the processing module 401 is specifically configured to:
obtaining the confidence of each target character in each target area through the input/output module 402;
and multiplying the confidence degrees of the target characters belonging to the same target text to obtain the confidence degree of the target text.
In some embodiments, the processing module 401 is specifically configured to:
carrying out horizontal projection on the target area to obtain a projection area;
determining a text area and a blank area in the projection area;
acquiring pixel values of the text area and the blank area;
and determining the boundary information of the target text in the text area according to the difference between the pixel value of the text area and the pixel value of the blank area.
In some embodiments, the processing module 401 is specifically configured to:
calculating the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
determining a starting row, an ending row, a starting column and an ending column of the target text according to a preset pixel threshold, the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
and determining the boundary information of the target text according to the starting line, the ending line, the starting column and the ending column of the target text.
In some embodiments, the processing module 401 is further configured to:
acquiring a training sample through the input/output module 402, wherein the training sample comprises a plurality of logistics list pictures;
marking position information of a plurality of items of key information of the air transportation business in the logistics single pictures by adopting marking frames, and recording coordinate information of marking frames for marking the position information;
inputting the logistics single picture marked with the position information into a positioning model, identifying the category of each marking frame on the logistics single picture through the positioning model, and taking the size of each type of marking frame as the prior size of a candidate frame in the positioning model;
compressing the size of the picture, and updating the weight of each layer of the positioning model according to the coordinate information of each marking frame in the logistics single picture so as to obtain the optimal model parameter of the positioning model.
In some embodiments, after the input/output module acquires the target logistics single picture to be identified, and before the target markup frame is used to identify the key field information in the target logistics single picture, the processing module 401 is further configured to:
acquiring the outline information of the target logistics single picture through the input and output module 402;
acquiring a straight line in the target logistics single picture through the input and output module 402;
acquiring the deflection angle of each straight line according to the start coordinate and the end coordinate of each straight line;
counting the deflection angle with the most times as a target deflection angle of the target logistics single picture;
and correcting the target logistics list picture according to the target deflection angle to obtain the corrected target logistics list picture.
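The deflection-angle statistics in the steps above can be sketched as follows. The sketch assumes that the straight lines have already been detected and are given as pairs of start and end coordinates; the detection method is not specified here. Rounding to whole degrees and folding angles into the range (-90, 90] are choices made for this illustration so that equal directions are counted together. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]


def line_angle(start: Point, end: Point) -> int:
    """Deflection angle of one line in whole degrees, folded into (-90, 90]."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    angle = round(math.degrees(math.atan2(dy, dx)))
    # A line has no direction, so angles 180 degrees apart describe the same deflection.
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle


def dominant_angle(lines: List[Tuple[Point, Point]]) -> int:
    """Return the deflection angle that occurs most often among the lines."""
    counts = Counter(line_angle(start, end) for start, end in lines)
    return counts.most_common(1)[0][0]
```

In picture coordinates the y axis usually points downward, so the sign of the returned angle has to be matched to the rotation convention of whatever routine performs the actual correction.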
In some embodiments, the processing module 401, after identifying the target text in the target region, is further configured to:
determining the type of the target text according to the type of the labeling box;
and setting a label on the target text according to the type of the target text, wherein the label is used for identifying the key information type to which the target text belongs.
The apparatus 40 for processing the logistics list in the embodiment of the present application is described above from the perspective of the modular functional entity, and the apparatus is described below from the perspective of hardware processing.
The apparatus 40 shown in fig. 4 may have the structure shown in fig. 5. In that case, the processor and the input/output unit in fig. 5 can implement functions that are the same as or similar to those of the processing module and the input/output module provided in the corresponding apparatus embodiment, and the memory in fig. 5 stores the computer program that the processor calls when executing the method for processing the logistics list. In this embodiment of the application, the entity device corresponding to the input/output module in the embodiment shown in fig. 4 may be an input/output interface, and the entity device corresponding to the processing module may be a processor.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the apparatus and the module described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the embodiments of the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer readable storage medium.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program is loaded and executed on a computer, the procedures or functions described in the embodiments of the present application are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.
The technical solutions provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the embodiments, and the descriptions of the embodiments are only intended to help in understanding the method and its core ideas. Meanwhile, a person skilled in the art may, following the ideas of the embodiments of the present application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as a limitation on the embodiments of the present application.

Claims (10)

1. A method of processing a logistics list, the method comprising:
acquiring a target logistics list picture to be identified;
adopting a target marking frame to mark key field information in the target logistics single picture;
acquiring coordinate information of the target labeling frame in the target logistics single picture;
according to the coordinate information of the target labeling frame, intercepting a target area corresponding to key field information from the target logistics single picture;
identifying target text in the target area;
and outputting the target text as target key information, wherein the target key information is used for representing logistics information corresponding to the target logistics single picture.
2. The method of claim 1, wherein after identifying the target text in the target region, the method further comprises:
calculating the confidence of the recognized target text;
and taking the target text with the confidence coefficient higher than the preset confidence coefficient as the target key information.
3. The method of claim 2, wherein the target area is at least one, and the target text is at least one; the calculating a confidence level that the target text is identified comprises:
obtaining the confidence coefficient of each target character in each target area;
and multiplying the confidence degrees of the target characters belonging to the same target text to obtain the confidence degree of the target text.
4. The method of any of claims 1-3, wherein the identifying the target text in the target region comprises:
carrying out horizontal projection on the target area to obtain a projection area;
determining a text area and a blank area in the projection area;
acquiring pixel values of the text area and the blank area;
and determining the boundary information of the target text in the text area according to the difference value between the pixel value of the character area and the pixel value of the blank area.
5. The method according to claim 4, wherein the obtaining of the pixel values of the text region and the blank region; determining boundary information of the target text in the text area according to the difference value between the pixel value of the character area and the pixel value of the blank area, wherein the boundary information comprises:
calculating the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
determining a starting row, an ending row, a starting column and an ending column of the target text according to a preset pixel threshold, the pixel sum of each row of pixel points in the text area and the pixel sum of each column of pixel points in the blank area;
and determining the boundary information of the target text according to the starting line, the ending line, the starting column and the ending column of the target text.
6. The method of claim 5, further comprising:
obtaining a training sample, wherein the training sample comprises a plurality of logistics single pictures;
marking position information of a plurality of items of key information of the air transportation business in the logistics single pictures by adopting marking frames, and recording coordinate information of marking frames for marking the position information;
inputting the logistics single picture marked with the position information into a positioning model, identifying the category of each marking frame on the logistics single picture through the positioning model, and taking the size of each type of marking frame as the prior size of a candidate frame in the positioning model;
compressing the size of the picture, and updating the weight of each layer of the positioning model according to the coordinate information of each marking frame in the logistics single picture so as to obtain the optimal model parameter of the positioning model.
7. The method according to any one of claims 1 to 3, wherein after the obtaining of the target logistics single picture to be identified and before the identifying of the key field information in the target logistics single picture by using the target mark box, the method further comprises:
acquiring the outline information of the target logistics single picture;
acquiring a straight line in the target logistics single picture;
acquiring the deflection angle of each straight line according to the start coordinate and the end coordinate of each straight line;
counting the deflection angle with the most times as a target deflection angle of the target logistics single picture;
and correcting the target deflection angle to obtain the corrected target logistics list picture.
8. The method of any of claims 1-3, wherein after the identifying the target text in the target region, the method further comprises:
determining the type of the target text according to the type of the labeling box;
and setting a label on the target text according to the type of the target text, wherein the label is used for identifying the key information type to which the target text belongs.
9. An apparatus for processing a logistics sheet, the apparatus comprising:
the input and output module is used for acquiring a target logistics single picture to be identified;
the processing module is used for adopting a target marking frame to mark key field information in the target logistics single picture acquired by the input and output module; acquiring coordinate information of the target labeling frame in the target logistics single picture through the input and output module; according to the coordinate information of the target labeling frame, intercepting a target area corresponding to key field information from the target logistics single picture;
the detection module is used for identifying the target text in the target area intercepted by the real-time processing module;
the input and output module is further configured to output the target text as target key information, where the target key information is used to represent logistics information corresponding to the target logistics single picture.
10. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of any one of claims 1-8.
CN202010241316.5A 2020-03-31 2020-03-31 Method, device and storage medium for processing logistics list Pending CN113469161A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010241316.5A CN113469161A (en) 2020-03-31 2020-03-31 Method, device and storage medium for processing logistics list

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010241316.5A CN113469161A (en) 2020-03-31 2020-03-31 Method, device and storage medium for processing logistics list

Publications (1)

Publication Number Publication Date
CN113469161A true CN113469161A (en) 2021-10-01

Family

ID=77865999

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010241316.5A Pending CN113469161A (en) 2020-03-31 2020-03-31 Method, device and storage medium for processing logistics list

Country Status (1)

Country Link
CN (1) CN113469161A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160132739A1 (en) * 2014-11-06 2016-05-12 Alibaba Group Holding Limited Method and apparatus for information recognition
CN108446621A (en) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 Bank slip recognition method, server and computer readable storage medium
CN109426814A (en) * 2017-08-22 2019-03-05 顺丰科技有限公司 A kind of positioning of the specific plate of invoice picture, recognition methods, system, equipment
CN109614923A (en) * 2018-12-07 2019-04-12 上海智臻智能网络科技股份有限公司 The recognition methods of OCR document and its device
CN109840520A (en) * 2017-11-24 2019-06-04 中国移动通信集团广东有限公司 A kind of invoice key message recognition methods and system
CN109977949A (en) * 2019-03-20 2019-07-05 深圳市华付信息技术有限公司 Text positioning method, device, computer equipment and the storage medium of frame fine tuning
CN110490198A (en) * 2019-08-12 2019-11-22 上海眼控科技股份有限公司 Text orientation bearing calibration, device, computer equipment and storage medium
CN110598686A (en) * 2019-09-17 2019-12-20 携程计算机技术(上海)有限公司 Invoice identification method, system, electronic equipment and medium
CN110738602A (en) * 2019-09-12 2020-01-31 北京三快在线科技有限公司 Image processing method and device, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN109887153B (en) Finance and tax processing method and system
US11151369B2 (en) Systems and methods for classifying payment documents during mobile image processing
EP3437019B1 (en) Optical character recognition in structured documents
CN105528604A (en) Bill automatic identification and processing system based on OCR
CN113313111B (en) Text recognition method, device, equipment and medium
CN111814785B (en) Invoice recognition method, training method of relevant model, relevant equipment and device
CA3052248C (en) Detecting orientation of textual documents on a live camera feed
CN113569863B (en) Document checking method, system, electronic equipment and storage medium
US9212007B2 (en) Correction of customer mailing information
CN116092231A (en) Ticket identification method, ticket identification device, terminal equipment and storage medium
CN115018513A (en) Data inspection method, device, equipment and storage medium
CN110647824A (en) Value-added tax invoice layout extraction method based on computer vision technology
JP2019191665A (en) Financial statements reading device, financial statements reading method and program
US9597714B2 (en) Routing of an unknown mail item
CN105243365A (en) Data processing method and data processing system
CN111462388A (en) Bill inspection method and device, terminal equipment and storage medium
CN111428725A (en) Data structuring processing method and device and electronic equipment
CN113469161A (en) Method, device and storage medium for processing logistics list
CN111639905B (en) Enterprise business process management and control system, method, electronic equipment and storage medium
US9466044B2 (en) Use of organization chart to direct mail items from central receiving area to organizational entities using clusters based on a union of libraries
CN113066223A (en) Automatic invoice verification method and device
CN111860263A (en) Information input method and device and computer readable storage medium
US9213970B1 (en) Processing of co-mingled paper correspondence
US20230237558A1 (en) Object recognition systems and methods
CN116664066B (en) Method and system for managing enterprise planning income and actual income

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination