CN110990647A - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN110990647A
CN110990647A
Authority
CN
China
Prior art keywords
image
data
determining
character string
identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911198687.3A
Other languages
Chinese (zh)
Inventor
左凯 (Zuo Kai)
程钰茗 (Cheng Yuming)
应晓伟 (Ying Xiaowei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hanhai Information Technology Shanghai Co Ltd
Original Assignee
Hanhai Information Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hanhai Information Technology Shanghai Co Ltd filed Critical Hanhai Information Technology Shanghai Co Ltd
Priority to CN201911198687.3A priority Critical patent/CN110990647A/en
Publication of CN110990647A publication Critical patent/CN110990647A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/909 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083 Shipping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The specification discloses a data processing method and device. After an image is acquired, the character strings corresponding to text lines whose regions fall within the same designated area of the image are determined to be character strings having an association relationship. The data identifiers corresponding to the geographic position at which the image was captured are then determined according to the pre-stored correspondence between data identifiers and geographic positions; a data identifier matching at least one character string is selected from the determined data identifiers as an update identifier; and finally the data corresponding to the update identifier is updated according to the other character strings associated with the character string that matches the update identifier. By determining the association relationships among the character strings contained in the acquired image and updating data accordingly, a novel data updating method is realized and the difficulty of supplementing data is reduced.

Description

Data processing method and device
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a data processing method and apparatus.
Background
With the development of delivery services, the business of a distribution platform grows increasingly broad, and in order to execute its services better the platform needs to supplement supplier data. For example, to perform a delivery service the distribution platform needs at least the coordinate position of the supplier, the name of the supplier, and the items the supplier can deliver, while the remaining information, such as the ordering telephone number, address, postal code, and business hours, may not yet have been acquired by the platform and needs to be supplemented.
However, the distribution platform faces the problem that supplier data is difficult to collect and the collected supplier data is incomplete.
Disclosure of Invention
The embodiment of the specification provides a data processing method and a data processing device, which are used for partially solving the problems in the prior art.
The embodiment of the specification adopts the following technical scheme:
the data processing method provided by the present specification includes:
acquiring an image;
determining the area of the text line in the image and at least one designated area in the image;
determining character strings corresponding to the text lines of which the areas are located in the same designated area as the character strings with association relations;
determining a data identifier corresponding to the geographic position when the image is acquired according to the geographic position when the image is acquired and the corresponding relation between each pre-stored data identifier and each geographic position;
determining a data identifier matched with at least one character string in the determined data identifiers as an update identifier;
and updating the data corresponding to the update identifier according to the other character strings having an association relationship with the character string matched with the update identifier.
Optionally, acquiring an image specifically includes:
acquiring an image collected at a supplier when the distribution capacity performs a distribution task, wherein the image at least comprises the storefront of the supplier.
Optionally, determining the geographic location when the image is acquired specifically includes:
determining the geographic position when the image is acquired according to the coordinates contained in the POI of the supplier; or,
determining the geographic position as the position information of the acquisition device when the distribution capacity acquires the image through the acquisition device.
Optionally, determining, as each character string having an association relationship, a character string corresponding to each text line in which the region where the text line is located falls into the same designated region, specifically including:
determining the character strings respectively corresponding to the text lines according to the regions where the text lines are located in the image;
determining the area of each text line falling into the same designated area according to the position of each designated area in the image and the position of the area of each text line in the image;
and determining the association relationship between the character strings respectively corresponding to the areas of the text lines falling into the same designated area.
Optionally, determining a data identifier corresponding to the geographic location when the image is acquired specifically includes:
determining, from the pre-stored geographic positions, each geographic position whose distance from the geographic position when the image was acquired is smaller than a preset distance;
and according to the corresponding relation between each pre-stored data identifier and each geographic position, taking the data identifier corresponding to each determined geographic position with the distance smaller than the preset distance as the data identifier corresponding to the geographic position when the image is collected.
Optionally, updating the data corresponding to the update identifier according to another character string having an association relationship with the character string matched with the update identifier, specifically including:
determining data corresponding to the update identifier according to the corresponding relation between each piece of data and each geographic position stored in advance and the geographic position corresponding to the update identifier;
and updating the data corresponding to the update identifier according to the other character strings having an association relationship with the character string matched with the update identifier.
Optionally, updating the data corresponding to the update identifier specifically includes:
determining, from the other character strings having an association relationship with the character string matched with the update identifier, a character string that is inconsistent with the data corresponding to the update identifier;
and adding the determined character string to the data corresponding to the update identifier.
The data processing apparatus provided in the present specification includes:
an acquisition module for acquiring an image;
the first determining module is used for determining the area of the text line in the image and at least one designated area in the image;
the relation determining module is used for determining character strings corresponding to the text lines of which the areas are located in the same designated area as the character strings with the association relation;
the identification determining module is used for determining the data identification corresponding to the geographic position when the image is acquired according to the geographic position when the image is acquired and the corresponding relation between each pre-stored data identification and each geographic position;
the matching module is used for determining a data identifier matched with at least one character string in the determined data identifiers to serve as an updating identifier;
and the data processing module is used for updating the data corresponding to the update identifier according to the other character strings having an association relationship with the character string matched with the update identifier.
The electronic device provided by the present specification includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the data processing method when executing the program.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described data processing method.
The embodiment of the specification adopts at least one technical scheme which can achieve the following beneficial effects:
after an image is obtained, the character strings corresponding to text lines whose regions fall into the same designated area of the image are first determined as character strings having an association relationship; then the data identifiers corresponding to the geographic position at which the image was captured are determined according to the pre-stored correspondence between data identifiers and geographic positions; a data identifier matching at least one character string is then selected from the determined data identifiers as an update identifier; and finally the data corresponding to the update identifier is updated according to the other character strings associated with the character string that matches the update identifier. By determining the association relationships among the character strings contained in the acquired image, the character strings usable for data updating are identified and the data corresponding to a data identifier matching at least one of them is updated, realizing a novel data updating method and reducing the difficulty of supplementing data.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic diagram of a data processing process provided in an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a designated area provided in an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an area where text lines falling into the same designated area are located according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of determining data identifiers provided by an embodiment of the present description;
FIG. 5 is a schematic diagram of a minimum bounding rectangle determined by the prior art;
FIG. 6 is a schematic diagram of determining a vertical direction of a text provided in an embodiment of the present description;
FIG. 7 is a schematic diagram of a connection provided in an embodiment of the present disclosure;
FIG. 8 is a schematic view of a circumscribed quadrilateral provided herein;
fig. 9 and fig. 10 are schematic diagrams of a process for determining a region where a text line is located according to an embodiment of the present specification;
fig. 11 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present specification;
fig. 12 is a schematic structural diagram of an electronic device implementing a data processing method according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more apparent, the technical solutions of the present disclosure will be clearly and completely described below with reference to the specific embodiments of the present disclosure and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step are within the scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a data processing process provided in this specification, which may specifically include the following steps:
s100: an image is acquired.
In this specification, the data processing process may be performed by a terminal, or by a server that receives an image uploaded by a terminal and then performs the process. The server may be a single server or a system composed of a plurality of servers, for example a distributed server. The specific device that executes the data processing process is not limited in this specification and may be set as needed.
For convenience of description, this specification takes as an example the scenario in which an existing distribution platform needs to supplement supplier data, and describes the platform supplementing store data according to an image acquired by a terminal. For the distribution platform, a supplier is a store that provides objects to be delivered; for example, a supplier of a take-out platform may be a dining establishment offering take-out food. Thus, in the embodiments provided herein, the server of the distribution platform may obtain an image that includes the storefront of a store.
In one or more embodiments provided herein, the distribution platform may request the distribution capacity (i.e., the courier performing the delivery) to capture the supplier's storefront when performing a distribution task. The courier may then capture an image containing the storefront through the courier's terminal and upload it to a server of the distribution platform, so that the server obtains an image of the storefront captured during the distribution task.
Of course, this specification does not limit when the image is obtained; for example, the server may obtain it from historically collected images containing storefronts, or obtain images uploaded in real time by the terminal of the distribution capacity.
In this specification, the image containing the storefront may also be captured and uploaded by another user's terminal. For example, if a store-environment image uploaded by a user includes the storefront, or if a store operator uploads an image of the storefront of his own store to the server, the server may obtain the image used for supplementing data from the images uploaded by users.
S102: and determining the area of the text line in the image and at least one designated area in the image.
In this specification, after acquiring an image, a server may determine, according to the image, a region corresponding to at least one text line in the image and at least one designated region, where the region corresponding to the text line is determined by an image segmentation model, and the designated region is a region where a signboard of a shop in the image is located. Of course, the designated area may also be an area where a store billboard, a bulletin board, or the like is located according to different application scenes, or the designated area is an area where a shop front of the store is located, and the server may determine the designated area by training a corresponding model.
Fig. 2 is a schematic diagram of a designated area provided in this specification: the area where a store's signboard is located may be used as the designated area, the area where a store's billboard is located may be used as the designated area, or the area where the storefront is located may be used as the designated area.
Specifically, since the regions belonging to text lines and the regions belonging to non-text content in the image need to be distinguished, the server may first input the image into a pre-trained image segmentation model to obtain the segmentation result output by the model. As shown in fig. 3, the segmentation result may be the polygons corresponding to the text lines and the non-text regions obtained by segmenting the image; the server may then determine the minimum connected region of the polygon corresponding to each text line, so as to determine the minimum circumscribed rectangle of each text line in the image as the region where that text line is located. The image segmentation model may be a Fully Convolutional Network (FCN), a Conditional Random Field (CRF) model, and so on, which is not limited in this specification; for convenience of description, the FCN model is used hereinafter. The text-line regions output by the FCN model are polygons, and the minimum circumscribed rectangle of each text line can be determined by further processing these polygons. It should be noted that, because segmenting an image with an FCN model and determining the minimum circumscribed rectangle of a text line are mature technologies, the use of the FCN model, its training process, and the process of determining the minimum circumscribed rectangle are not described in detail here.
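As an illustrative sketch of this post-processing step (the patent provides no code), the minimum circumscribed rectangles can be recovered from a binary segmentation mask with OpenCV; the mask format and the small-area threshold below are assumptions:

```python
import cv2
import numpy as np

def text_line_rectangles(mask: np.ndarray) -> list:
    """From a binary segmentation mask (uint8, text pixels = 255), such as
    an FCN might output, return the minimum circumscribed (rotated)
    rectangle of each connected text-line region as four corner points."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    rects = []
    for contour in contours:
        if cv2.contourArea(contour) < 10:  # drop speckle noise (assumed threshold)
            continue
        # minAreaRect gives ((cx, cy), (w, h), angle); boxPoints gives the corners
        rects.append(cv2.boxPoints(cv2.minAreaRect(contour)).astype(int))
    return rects
```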
Similarly, a trained FCN model or another image detection model may be used to determine the designated areas from the image. The FCN model for determining designated areas and the FCN model for segmenting text lines may be different models trained on different training samples, i.e., two models trained separately to determine the text-line regions and the designated areas in the image. Alternatively, a model capable of simultaneously segmenting the text lines and the designated areas can be trained, and the server may input the image into that model to determine both the text-line regions and the designated areas. For example, an FCN model can simultaneously identify the regions of an airplane, the ground, and the sky in one image; likewise, a model that simultaneously identifies the text-line regions and the designated areas in an image can be obtained through training with suitable samples.
S104: and determining character strings corresponding to the text lines of which the areas are located in the same designated area as the character strings with the association relationship.
In this specification, in a scene in which the server acquires an image including a shop front to update the shop profile, since different specified areas correspond to different shops, a character string corresponding to each text line falling into the same specified area in the image should also be a character string corresponding to the same shop. Therefore, after the server determines the area where each text line in the image is located and each designated area, the server may determine the character strings corresponding to the text lines whose areas are located in the same designated area as the character strings having an association relationship.
Specifically, since the server needs to determine the association relationships existing between the character strings corresponding to the text lines contained in the image, the server may determine the character strings corresponding to the text lines according to the regions where the text lines are located in the image. The server may input the image corresponding to the region where a text line is located into a trained text line recognition model to obtain the character string output by the model, as the character string corresponding to the text line contained in that region.
Common text line recognition models include: attention-based models, Connectionist Temporal Classification (CTC) models, and the like. The server may input the image corresponding to the region where the text line is located into a trained attention model or CTC model to obtain the character string output by the model. Recognizing the text lines contained in an image through an attention or CTC model and determining the corresponding character strings is a relatively mature technology, which is not described in detail here.
Secondly, after determining the character strings corresponding to the text lines, the server may determine, for each designated area, according to the positions of the text-line regions and of the designated area in the image, whether at least two text lines fall into that designated area; if so, an association relationship exists between the character strings corresponding to those text lines.
Fig. 3 is a schematic diagram of the regions where text lines are located provided in this specification. The dashed boxes indicate the regions where text lines are located, of which four (A, B, D, and F) are visible, and the solid boxes indicate the designated areas, i.e., the regions marked C and E in fig. 3. It can be seen that text-line regions A and B fall into designated area C, text-line region D does not fall into any designated area, and text-line region F falls into designated area E. Because region F is the only text-line region that falls into designated area E, the character string corresponding to the text line contained in region F has no association relationship with any other character string.
Of course, the server may also determine that there is an association relationship between the regions where the text lines that fall into the same designated region are located, determine the character strings corresponding to the regions where the text lines that have an association relationship are located, and determine the association relationship between the character strings according to the association relationship between the regions where the text lines are located.
For example, still taking fig. 3 as an example, since both region A and region B fall into designated area C, the server may first determine that an association relationship exists between region A and region B, then determine the character string a corresponding to region A and the character string b corresponding to region B, and finally determine that an association relationship exists between character strings a and b according to the relationship between the regions. Determining associations between regions first can reduce the resources consumed in recognizing character strings and improve data processing efficiency.
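A minimal sketch of this grouping logic, assuming axis-aligned boxes (x1, y1, x2, y2) and an overlap-ratio containment test; the 0.8 threshold is an assumption, since the patent only requires that a text-line region "fall into" a designated area:

```python
def group_by_designated_area(text_boxes, areas, min_overlap=0.8):
    """Group text-line boxes whose region falls into the same designated
    area; the strings of the boxes in one group are treated as associated."""
    def overlap_ratio(box, area):
        # intersection area divided by the text-line box area
        ix1, iy1 = max(box[0], area[0]), max(box[1], area[1])
        ix2, iy2 = min(box[2], area[2]), min(box[3], area[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        box_area = (box[2] - box[0]) * (box[3] - box[1])
        return inter / box_area if box_area > 0 else 0.0

    groups = []
    for area in areas:
        group = [b for b in text_boxes if overlap_ratio(b, area) >= min_overlap]
        if len(group) >= 2:  # an association needs at least two text lines
            groups.append(group)
    return groups
```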
S106: and determining the data identifier corresponding to the geographic position when the image is acquired according to the geographic position when the image is acquired and the corresponding relation between each pre-stored data identifier and each geographic position.
In this specification, since an image can only present information within a certain geographic range, the server can update data within that geographic range according to the character strings determined from the image. That is, in this specification, the data updated during data processing has a correspondence with a geographic position (for example, data corresponding to a POI). Then, after determining the character strings having association relationships, the server may further determine which data can be updated from the image acquired in step S100.
Specifically, the server may first determine, according to each pre-stored geographic position and the geographic position at which the image was acquired, the distance between each stored position and the capture position, and then select the geographic positions whose distance from the capture position is smaller than a preset distance. A geographic position may specifically be the coordinates of a pre-saved POI, or the stored geographic position of a store.
Secondly, according to the correspondence between the pre-stored data identifiers and geographic positions, the server may take the data identifiers corresponding to the determined geographic positions whose distance is smaller than the preset distance as the data identifiers corresponding to the geographic position at which the image was acquired. That is, from the pre-stored geographic positions, the server determines those whose distance from the capture position is smaller than the preset distance, then determines the corresponding data identifiers according to the pre-stored correspondence, and finally takes the determined data identifiers as the data identifiers corresponding to the geographic position at which the image was acquired.
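The distance filtering of step S106 could look like the following sketch, where stored positions are (latitude, longitude) pairs keyed by data identifier, and the 200 m radius stands in for the patent's unspecified preset distance:

```python
import math

def nearby_identifiers(capture_pos, stored, max_dist_m=200.0):
    """Return the data identifiers whose stored geographic position lies
    within max_dist_m metres of the position where the image was captured.
    `stored` maps data identifier -> (lat, lon)."""
    def haversine(p, q):
        lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 6371000.0 * 2 * math.asin(math.sqrt(a))  # Earth radius in metres

    return [ident for ident, pos in stored.items()
            if haversine(capture_pos, pos) < max_dist_m]
```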
It should be noted that a data identifier is data corresponding to a geographic position, not an identifier of the geographic position itself. Taking a POI as an example, the data identifier is data corresponding to that POI, rather than the identification data that uniquely identifies the POI in storage. For example, the data corresponding to the POI of a dining store may include: the store name, the cuisine the store serves, the store's contact information, its business hours, and so on.
In addition, in a scene in which the shop profile is updated, the image acquired by the server should include the name of the shop, and therefore the server may use only the shop name corresponding to the POI stored in advance as the data identifier.
Fig. 4 is a schematic diagram of determining the data identifiers corresponding to the geographic position at which the image was acquired. The black dots in fig. 4 are POIs in an electronic map, each corresponding to a number of data identifiers; the white dot is the geographic position at which the image was acquired, which corresponds to a number of character strings having association relationships. The POIs inside the dotted circle are those whose distance from the capture position is smaller than the preset distance.
S108: and determining a data identifier matched with at least one character string as an updating identifier in the determined data identifiers.
In this specification, the character strings having association relationships determined in step S104 are the character strings that can be used for data updating; which data can specifically be updated is determined from the data identifiers, corresponding to the geographic position at which the image was captured, that were determined in step S106.
Specifically, for each determined data identifier and each character string determined in step S104, the server may compute the similarity between the data identifier and the character string, determine that a data identifier whose similarity exceeds a preset value matches the character string, and take any data identifier matching at least one character string as an update identifier. Taking fig. 4 as an example, the server may determine, from the data identifiers, the identifier matching a character string contained in the image. Since the character strings determined in step S104 all have association relationships, when a data identifier matches a certain character string, the other character strings associated with the matched string are also related to that data identifier (i.e., the update identifier), and the subsequent steps can update the data accordingly.
For example, if character string a matches data identifier c and an association relationship exists between character strings a and b, then character string b is also related to data identifier c and may be used to update the data corresponding to the geographic position associated with identifier c. More specifically, assume data identifier c is a store name corresponding to the geographic position of store X, character string a is also a store name, and character string b is a contact telephone number. When character string a matches data identifier c (i.e., the store names match), the telephone number should be the contact number of store X, and the subsequent step can update the data corresponding to store X's geographic position according to character string b.
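A hedged sketch of this matching step: the patent only requires some similarity measure with a preset value, so difflib's ratio and the 0.8 threshold below are stand-ins for whatever measure (edit distance, etc.) an implementation might choose:

```python
from difflib import SequenceMatcher

def find_update_identifiers(identifiers, strings, threshold=0.8):
    """Match each candidate data identifier (e.g. a stored store name)
    against the recognised strings; an identifier whose similarity to at
    least one string exceeds the preset value becomes an update identifier."""
    updates = []
    for ident in identifiers:
        for s in strings:
            if SequenceMatcher(None, ident, s).ratio() > threshold:
                updates.append((ident, s))  # keep the matched string too
                break
    return updates
```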
S110: and updating the data corresponding to the update identifier according to other character strings with incidence relation with the character string matched with the update identifier.
In this specification, after determining the update identifier, the server may determine the other character strings associated with the character string matched with the update identifier according to the association relationships determined in step S104, and update the data corresponding to the update identifier according to those character strings.
In addition, when updating the data, the server may determine, from the other character strings associated with the character string matching the update identifier, a character string that differs from the data corresponding to the update identifier but does not conflict with it, and then add the determined character string to that data. A character string that does not conflict with the data corresponding to the update identifier is one whose similarity to that data is below a preset threshold. A conflict between the data and a character string means the two are of the same type but with different content. For example, if the data corresponding to the update identifier contains a telephone number whose content is 123 while the character string is also a telephone number but reads 124, the two conflict. When the data recognized from the image and the pre-stored data both correspond to the same geographic position but differ, it is difficult to determine which is valid; in this case the server may send the conflicting data to staff for manual review.
If the data corresponding to the update identifier does not conflict with the character string, the server may update the data according to the character string: when the data differs from the character string, the server may store the determined character string as part of the data corresponding to the update identifier; when they are the same, the server may determine that the data corresponding to the update identifier is unchanged, i.e., no new data needs to be saved.
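The update logic of S110 might be sketched as follows, with records held as field-to-value dictionaries; the field names and the review queue are illustrative assumptions, and for brevity the "conflict" test is simplified to exact inequality rather than a similarity threshold:

```python
def update_record(record: dict, extracted: dict, review_queue: list):
    """Merge extracted fields (e.g. {"phone": "124"}) into the stored
    record for an update identifier. Same field with a different value is
    a conflict and is routed to manual review."""
    for field, value in extracted.items():
        if field not in record:
            record[field] = value  # supplement missing data
        elif record[field] != value:
            review_queue.append((field, record[field], value))  # conflict
        # identical value: nothing to update
    return record
```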
Based on the data processing method shown in fig. 1, after an image is acquired, the character strings corresponding to text lines whose regions fall into the same designated area of the image are first determined as character strings having an association relationship; then the data identifiers corresponding to the geographic position at which the image was captured are determined according to the pre-stored correspondence between data identifiers and geographic positions; a data identifier matching at least one character string is then selected from the determined data identifiers as an update identifier; and finally the data corresponding to the update identifier is updated according to the other character strings associated with the character string that matches the update identifier. By determining the association relationships among the character strings contained in the acquired image, the character strings usable for data updating are identified and the data corresponding to a data identifier matching at least one of them is updated, realizing a novel data updating method and reducing the difficulty of supplementing data.
In addition, in one or more embodiments provided in this specification, when the region where a text line is located is determined in step S102, the storefront image may not have been captured head-on: there may be a certain angle between the camera and the text, causing the text lines in the storefront image to exhibit a near-large, far-small perspective effect. As a result, the minimum circumscribed rectangle of a text line determined as described above may contain considerable background noise (i.e., non-text area) at the far end of the text line (the end with smaller characters), which affects the recognition of the text line in the image, as shown in fig. 5.
Fig. 5 is a schematic diagram of the minimum circumscribed rectangle determined from an image not captured head-on. The left side is the original image; the middle shows the text-line and non-text regions obtained by segmentation with the FCN model, with the light color indicating text-line regions and the dark color non-text regions; the right side shows the minimum circumscribed rectangle determined from the segmented region, drawn with a dotted line. It can be seen that at the far end of the text line the minimum circumscribed rectangle contains more background. Recognition based on the image area of this minimum circumscribed rectangle may therefore reduce the accuracy of the character string recognized for the text line.
Therefore, in this specification, when determining the region where a text line is located, the server may proceed as follows: first determine the polygonal region corresponding to at least one text line in the image; then, for each polygonal region, determine through a pre-trained angle correction model the text vertical direction of the text line contained in that region; next determine the minimum circumscribed rectangle corresponding to the polygonal region; and finally adjust the position and length of the vertical edges of that minimum circumscribed rectangle according to the determined text vertical direction and the polygonal region, and determine from the adjusted vertical edges a circumscribed quadrilateral region corresponding to the text line as the region where the text line is located.
Specifically, the server may input the image into a pre-trained FCN model to obtain the segmentation result output by the model, where the segmentation result may be the text-line polygons and non-text regions obtained by segmenting the image, as shown in the middle image of fig. 5.
Due to the image acquisition angle, besides the perspective relationship that may exist in the image, the characters of a text line in the image may also fail to lie on the same horizontal line. To correct this, the server may determine, for each determined polygonal region, the vertical direction of the characters of the text line contained in that region through a pre-trained angle correction model.
Specifically, first, the server may determine, for each polygonal area, an image containing that polygonal area. The server may obtain an image containing only that polygonal area by turning the other polygonal areas in the image into non-character areas, or by cropping along the vertical and horizontal directions of the image while keeping the orientation of the cropped image consistent with the original. Of course, the server may also determine the image containing the polygonal area in other ways, which this specification does not limit. By determining an image containing only the polygonal area, interference from other polygonal areas with the model's output can be avoided.
Secondly, the server may take the image containing the polygonal area as input to the pre-trained angle correction model and determine the angle between the polygonal area and the horizontal direction of the image. The result output by the model may be the tangent of that angle, with a value range of (-1, 1); the angle between the polygonal area and the horizontal direction can be determined from the tangent value. When the tangent is negative, the characters of the polygonal area lean to the left in the image; otherwise they lean to the right.
Finally, the server may determine, from the determined angle, the vertical direction of the characters of the text line contained in the polygonal area. Fig. 6 is a schematic diagram of determining the text vertical direction provided in an embodiment of this specification: the server determines the vertical direction of the characters according to the tangent value, where the angle correction model outputs the tangent of the angle between the straight line and the dotted line shown in the figure.
In addition, the angle correction model in this specification may be a regression model, such as a logistic regression or a linear regression model. Training the model may be based on training samples prepared in advance. Specifically, the server may obtain trainable sample images, typically images containing text lines from a database. The "label" of each training sample is then determined: a worker may annotate the upper-left corner and the lower-left corner of the text line in each image. The server then determines, for each annotated image, the line connecting the upper-left and lower-left corners and the tangent of the angle between that line and the horizontal direction of the image. For example, with the upper-left corner P0 and the lower-left corner P3, the server may determine the tangent as (P0(x) - P3(x)) / (P0(y) - P3(y)), where P0(x) and P0(y) are the x-axis and y-axis coordinates of the pixel P0 in the image, and similarly P3(x) and P3(y) are those of P3. Then, for each annotated image, the image is segmented by the FCN model to obtain an image containing a polygonal region, which serves as a training image, and a training sample consisting of the training image and the tangent value is determined.
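The label computation reduces to a few lines; this sketch assumes (x, y) pixel tuples for the annotated corners:

```python
def tangent_label(p0, p3):
    """Training label for the angle correction model from the annotated
    upper-left corner P0 and lower-left corner P3 of a text line:
    tan = (P0.x - P3.x) / (P0.y - P3.y), in pixel coordinates."""
    dx = p0[0] - p3[0]
    dy = p0[1] - p3[1]
    if dy == 0:
        return 0.0  # degenerate annotation; assumed fallback
    return dx / dy
```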
After the training samples are determined, the server may take minimizing the difference between the tangent value output by the angle correction model and the tangent value contained in the training sample as the optimization target, and adjust the model parameters of the angle correction model until the training termination condition is met.
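A minimal PyTorch sketch of such a training setup, assuming a small CNN regressor with a tanh output to keep predictions in (-1, 1) and mean-squared error as the difference being minimized; the architecture is an assumption, since the patent only fixes the optimization target:

```python
import torch
import torch.nn as nn

class AngleCorrector(nn.Module):
    """Image of one polygonal region in, tangent value in (-1, 1) out."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        return torch.tanh(self.head(self.features(x)))

def train_step(model, optimizer, images, tangents):
    """One optimisation step minimising the gap between the predicted and
    labelled tangent values, matching the stated optimisation target."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(images).squeeze(1), tangents)
    loss.backward()
    optimizer.step()
    return loss.item()
```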
Further, in this specification, the result output by the angle correction model may also be set as needed; when training the model, the corresponding training samples and optimization target are used to adjust its parameters, which this specification does not limit. For example, if the model is to output the angle between the polygonal area and the vertical direction of the image, the corresponding angle value needs to be annotated when determining the training samples, and the model can likewise be obtained through a similar training process.
In this specification, in order to adjust the minimum circumscribed rectangle corresponding to a polygonal region, reduce the background noise of non-character portions in the region to be analyzed, and improve recognition accuracy, the server may determine, for each polygonal region, its corresponding minimum circumscribed rectangle. Since determining the minimum circumscribed rectangle of a polygonal region is a mature technology, it is not described in detail here.
Finally, in this specification, for each polygon, after determining the text vertical direction of the text line contained in the polygon and the minimum circumscribed rectangle corresponding to the polygon, the server may adjust the position and length of the vertical edges of the minimum circumscribed rectangle according to the text vertical direction to determine the circumscribed quadrilateral region corresponding to the text line, and take the obtained circumscribed quadrilateral region as the region where the text line is located.
When determining the circumscribed quadrilateral region, the server may first determine the straight line on which each vertical edge of the circumscribed quadrilateral lies, according to the determined minimum circumscribed rectangle and the text vertical direction. Specifically, for each vertical side of the minimum circumscribed rectangle, i.e., the left and the right vertical sides, the server may determine the straight line passing through the midpoint of that side along the text vertical direction; this is the line on which the corresponding vertical edge of the circumscribed quadrilateral lies. In effect, the angle of the vertical side of the minimum circumscribed rectangle is adjusted to the text vertical direction.
The server may then determine the circumscribed quadrilateral region corresponding to the text line according to the determined lines on which the vertical edges lie, the corner points of the convex hull corresponding to the polygonal area, and the four corners of the minimum circumscribed rectangle. Specifically, the server may take the corner points of the polygonal area as points of the first type, take the four corners of the minimum circumscribed rectangle as points of the second type, and then, for each point of the second type, determine the line connecting each point of the first type to that point, and determine the intersections between the lines on which these connections lie and the lines on which the vertical edges lie, as shown in fig. 7.
Fig. 7 is a schematic diagram of the connections provided in this specification. The black points are points of the first type and the white points are points of the second type. Fig. 7 shows the connections (the dotted lines) between each first-type point and the second-type point at the upper-left corner of the minimum circumscribed rectangle, together with the adjusted left vertical edge of the rectangle (i.e., the line on which the circumscribed quadrilateral's vertical edge lies, the light line in the figure); the server may determine the intersection of each dotted line with the light line.
The server may then determine, from these intersections, the points at which the corners of the circumscribed quadrilateral lie, according to the positional relationship between the second-type point and the other second-type points; this amounts to adjusting the length of the vertical side of the minimum circumscribed rectangle. For example, in fig. 7 the intersection closest to the upper boundary of the image may be taken as one endpoint of the adjusted vertical side, i.e., the upper-left corner of the circumscribed quadrilateral, and the intersection closest to the lower boundary as the other endpoint, i.e., the lower-left corner. The distance from the upper-left to the lower-left corner is the length of the adjusted vertical side, which is also the left vertical edge of the circumscribed quadrilateral. The server determines the right vertical edge in the same way.
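The edge adjustment is plain 2-D line intersection. The sketch below handles one vertical side and one second-type (rectangle-corner) point: a line through the side's midpoint along the text vertical direction is intersected with the lines joining each first-type (polygon-corner) point to the rectangle corner, and the extreme intersections become the adjusted endpoints. All names are illustrative:

```python
import numpy as np

def adjusted_edge_endpoints(mid, tangent, first_points, second_point):
    """Intersect the line through `mid` along direction (tangent, 1) with
    each line joining a first-type point to `second_point`; return the
    intersections nearest the upper and lower image borders."""
    d = np.array([tangent, 1.0])  # text vertical direction in image coords
    mid = np.asarray(mid, dtype=float)
    hits = []
    for p in first_points:
        p = np.asarray(p, dtype=float)
        e = np.asarray(second_point, dtype=float) - p  # connecting line
        A = np.array([[e[0], -d[0]], [e[1], -d[1]]])
        if abs(np.linalg.det(A)) < 1e-9:  # parallel lines: no intersection
            continue
        t, _ = np.linalg.solve(A, mid - p)  # solves p + t*e == mid + s*d
        hits.append(p + t * e)
    if not hits:
        return None
    hits = np.array(hits)
    top = hits[np.argmin(hits[:, 1])]     # closest to the upper border
    bottom = hits[np.argmax(hits[:, 1])]  # closest to the lower border
    return top, bottom
```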
The server may finally determine the circumscribed quadrilateral region from the points at which the determined corners lie, i.e., the region enclosed by the four corners of the circumscribed quadrilateral, and take it as the region where the text line is located. Fig. 8 is a schematic diagram of the circumscribed quadrilateral finally determined from the text line shown in fig. 5, where the region enclosed by the gray line segments is the circumscribed quadrilateral.
In addition, in one or more embodiments of this specification, after the server determines the circumscribed quadrilateral, since the input of the text line recognition model used in step S104 is usually a rectangular image, the server may continue to process the circumscribed quadrilateral into a rectangle in order to facilitate text line recognition.
Specifically, the server may apply image processing such as stretching, rotating, or warping to adjust the circumscribed quadrilateral into a rectangle, and apply a coordinate transformation to the text line within it so that the characters in the transformed rectangle are of uniform size, taking the resulting rectangular image as the region where the text line is located. Alternatively, the circumscribed quadrilateral may simply be adjusted into a rectangle by stretching, rotating, warping, or similar processing.
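A perspective warp is one standard way to realize this stretching; a sketch with OpenCV, where the 256x32 output size is an assumed recognizer input size:

```python
import cv2
import numpy as np

def quad_to_rect(image, quad, out_w=256, out_h=32):
    """Warp the circumscribed quadrilateral of a text line into a
    rectangle. `quad` holds the four corners in order (top-left,
    top-right, bottom-right, bottom-left)."""
    src = np.asarray(quad, dtype=np.float32)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, M, (out_w, out_h))
```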
Finally, the region where the text line is located (i.e., the transformed rectangle) obtained by the above processing is cropped, yielding the image input to the text line recognition model in step S104.
The first processing case may be as shown in fig. 9. Fig. 9 is a schematic diagram of a process of determining the region where a text line is located according to an embodiment of this specification: the server may first determine the polygon corresponding to the text line, then the minimum circumscribed rectangle, then the circumscribed quadrilateral through the process described above, and finally adjust the circumscribed quadrilateral into a rectangle as the region where the text line is located, as shown in fig. 8. As can be seen from image m in fig. 9, the server may directly stretch the circumscribed quadrilateral to determine the corresponding rectangle, obtaining image m. Determining the circumscribed quadrilateral resolves the problem in fig. 8 of excessive far-end background noise caused by perspective, and processing such as stretching introduces no new background noise; it merely transforms the background already present.
The second processing case may be as shown in fig. 10. Fig. 10 is a schematic diagram of a process of determining the region where a text line is located according to an embodiment of this specification: the server may first determine the rectangle corresponding to the circumscribed quadrilateral by stretching, and then further apply a coordinate transformation to the text line within it, adjusting the size and position of the smaller characters so that all characters are of uniform size, and determine the region where the text line is located as shown in image n of fig. 10. In this way the characters shrunk by perspective are enlarged, further reducing the problems caused by background noise.
Further, the server may also directly use the circumscribed quadrilateral corresponding to the text line as the image input to the text line recognition model in step S104. Alternatively, besides adjusting the circumscribed quadrilateral into a rectangular image by stretching, rotating, warping, and the like, the server may determine a rectangle circumscribing the quadrilateral and fill the blank area between the quadrilateral and that rectangle, for example with a preset solid color, or with gray based on the average gray value of the quadrilateral. The circumscribing rectangle may be the minimum rectangle circumscribing the quadrilateral, or a preset circumscribing rectangle of fixed size. How the blank area is filled can also be set as needed; for example, a filling mode that introduces less background noise can be chosen.
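A sketch of the average-value filling alternative, assuming a 3-channel image and the minimum circumscribing rectangle; using the region's own mean keeps the fill low-contrast, which is the point of this variant:

```python
import cv2
import numpy as np

def pad_quad_to_rect(image, quad):
    """Crop the axis-aligned rectangle circumscribing the quadrilateral
    and fill everything outside the quadrilateral with the quadrilateral's
    average value, so the filling introduces little background noise."""
    quad = np.asarray(quad, dtype=np.int32)
    x, y, w, h = cv2.boundingRect(quad)
    crop = image[y:y + h, x:x + w].copy()
    mask = np.zeros(crop.shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [(quad - [x, y]).astype(np.int32)], 255)  # inside
    mean_val = cv2.mean(crop, mask=mask)[:3]  # average value inside quad
    crop[mask == 0] = mean_val                # fill the blank area
    return crop
```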
Through the above operation of converting the circumscribed quadrilateral into a rectangular image, the images input to the text line recognition model can be unified as rectangles, and the server may further unify their sizes through processing such as upsampling or image compression.
Based on the data processing method shown in fig. 1, the embodiment of this specification further provides a schematic structural diagram of a data processing apparatus, as shown in fig. 11.
Fig. 11 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present specification, where the apparatus includes:
an acquisition module 200 for acquiring an image;
a first determining module 202, configured to determine a region where a text line in the image is located and at least one designated region in the image;
a relationship determining module 204, configured to determine, as character strings having an association relationship, the character strings corresponding to the text lines whose regions fall into the same designated area;
an identifier determining module 206, configured to determine a data identifier corresponding to the geographic location when the image is acquired according to the geographic location when the image is acquired and a correspondence between each pre-stored data identifier and each geographic location;
a matching module 208, configured to determine, from the determined data identifiers, a data identifier matching at least one character string as an update identifier;
and a data processing module 210, configured to update the data corresponding to the update identifier according to the other character strings having an association relationship with the character string matched with the update identifier.
Optionally, the obtaining module 200 obtains an image collected at the supplier when the distribution capacity performs a distribution task, wherein the image at least includes the storefront of the supplier.
Optionally, the identifier determining module 206 determines the geographic position when the image was acquired according to the coordinates contained in the supplier's POI, or determines it as the position information of the acquisition device when the distribution capacity acquired the image through that device.
Optionally, the relationship determining module 204 determines character strings corresponding to the texts according to the region where the text line is located in the image, determines the region where each text line falls in the same designated region according to the position of each designated region in the image and the position of the region where each text line is located in the image, and determines the association relationship between the character strings corresponding to the regions where each text line falls in the same designated region.
Optionally, the identifier determining module 206 determines, from the pre-stored geographic positions, those whose distance from the geographic position at which the image was collected is less than a preset distance, and, according to the pre-stored correspondence between data identifiers and geographic positions, takes the data identifiers corresponding to those positions as the data identifiers corresponding to the geographic position at which the image was collected.
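The distance filter used by the identifier determining module 206 could, for example, look like the sketch below, which keeps the pre-stored geographic positions within a preset distance of the capture position and returns their data identifiers as candidates; the haversine formula and the 100-meter default are illustrative assumptions.

import math

def haversine_m(a, b):
    # Great-circle distance in meters between two (lat, lon) pairs in degrees.
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def candidate_identifiers(capture_pos, id_to_position, preset_distance_m=100.0):
    # id_to_position: pre-stored {data_identifier: (lat, lon)}.
    return [data_id for data_id, pos in id_to_position.items()
            if haversine_m(capture_pos, pos) < preset_distance_m]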
Optionally, the data processing module 210 determines the data corresponding to the update identifier according to the pre-stored correspondence between each piece of data and each geographic location and the geographic location corresponding to the update identifier, and updates that data according to the other character strings having an association relationship with the character string matched with the update identifier.
Optionally, the data processing module 210 determines, from the other character strings having an association relationship with the character string matched with the update identifier, the character strings that do not conflict with the data corresponding to the update identifier, and stores the determined character strings as data corresponding to the update identifier.
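Putting the matching module 208 and the data processing module 210 together, a hedged sketch of the update step follows; the deliberately naive conflict rule (a string is stored only if it is not already present in the record) stands in for whatever conflict test an actual deployment would use, and all names are assumptions.

def match_and_update(groups, candidates, records):
    # groups: {region_index: [recognized strings]} from the grouping step.
    # candidates: data identifiers near the capture position.
    # records: {identifier: {"name": str, "extra": set}}, mutated in place.
    for strings in groups.values():
        for data_id in candidates:
            name = records[data_id]["name"]
            if name in strings:                  # matching module 208
                for s in strings:                # data processing module 210
                    # Store only strings that do not conflict with the record.
                    if s != name and s not in records[data_id]["extra"]:
                        records[data_id]["extra"].add(s)
                return data_id                   # the update identifier
    return None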
An embodiment of the present specification also provides a computer-readable storage medium storing a computer program, the computer program being operable to execute any one of the data processing methods described above.
Based on the data processing method shown in fig. 1, an embodiment of the present specification further provides a schematic structural diagram of the electronic device shown in fig. 12. As shown in fig. 12, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may further include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it, so as to implement any one of the data processing methods described above.
Of course, besides a software implementation, the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the above processing flows is not limited to logic units, and may also be hardware or logic devices.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, improvements to many of today's method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized with hardware entity modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating an integrated circuit chip, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development, while the source code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It should also be clear to those skilled in the art that a hardware circuit implementing a logical method flow can easily be obtained merely by slightly logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, besides implementing a controller purely as computer-readable program code, the same functionality can be implemented entirely by logic-programming the method steps, so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for implementing various functions may also be regarded as structures within the hardware component. Or, the means for implementing various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functions of the various elements may be implemented in the same one or more software and/or hardware implementations of the present description.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in the form of a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims (10)

1. A data processing method, comprising:
acquiring an image;
determining the area of the text line in the image and at least one designated area in the image;
determining, as character strings having an association relationship, the character strings corresponding to text lines whose regions fall into the same designated region;
determining a data identifier corresponding to the geographic position when the image is acquired according to the geographic position when the image is acquired and the corresponding relation between each pre-stored data identifier and each geographic position;
determining a data identifier matched with at least one character string in the determined data identifiers as an update identifier;
and updating the data corresponding to the update identifier according to other character strings having an association relationship with the character string matched with the update identifier.
2. The method of claim 1, wherein acquiring the image specifically comprises:
acquiring an image collected at a supplier by the distribution capacity while performing a distribution task, wherein the image comprises at least the facade of the supplier.
3. The method of claim 2, wherein determining the geographic location at which the image was acquired comprises:
determining the geographic position when the image is acquired according to the coordinates contained in the POI of the supplier; or,
determining the position information of the acquisition device when the distribution capacity acquires the image through the acquisition device.
4. The method according to claim 1, wherein determining, as character strings having an association relationship, the character strings corresponding to text lines whose regions fall into the same designated region specifically comprises:
determining the character string corresponding to each text line according to the region where the text line is located in the image;
determining the text lines whose regions fall into the same designated region according to the position of each designated region in the image and the position of the region where each text line is located in the image;
and determining the association relationship among the character strings corresponding to the text lines whose regions fall into the same designated region.
5. The method of claim 1, wherein determining the data identifier corresponding to the geographic location at which the image was acquired comprises:
determining, from the pre-stored geographic positions, the geographic positions whose distance from the geographic position at which the image was collected is less than a preset distance;
and, according to the pre-stored correspondence between data identifiers and geographic positions, taking the data identifiers corresponding to the determined geographic positions as the data identifiers corresponding to the geographic position at which the image was collected.
6. The method according to claim 1, wherein updating the data corresponding to the update identifier according to other character strings having an association relationship with the character string matched with the update identifier specifically includes:
determining data corresponding to the update identifier according to the corresponding relation between each piece of data and each geographic position stored in advance and the geographic position corresponding to the update identifier;
and updating the data corresponding to the update identifier according to the other character strings having an association relationship with the character string matched with the update identifier.
7. The method of claim 6, wherein updating the data corresponding to the update identifier specifically includes:
determining, from the other character strings having an association relationship with the character string matched with the update identifier, a character string that does not conflict with the data corresponding to the update identifier;
and storing the determined character string as data corresponding to the update identifier.
8. A data processing apparatus, comprising:
an acquisition module for acquiring an image;
the first determining module is used for determining the area of the text line in the image and at least one designated area in the image;
the relationship determining module is used for determining, as character strings having an association relationship, the character strings corresponding to text lines whose regions fall into the same designated region;
the identifier determining module is used for determining the data identifier corresponding to the geographic position at which the image is acquired, according to that geographic position and the pre-stored correspondence between data identifiers and geographic positions;
the matching module is used for determining, from the determined data identifiers, a data identifier matched with at least one character string as an update identifier;
and the data processing module is used for updating the data corresponding to the update identifier according to other character strings having an association relationship with the character string matched with the update identifier.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1 to 7 when executing the program.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1 to 7.
CN201911198687.3A 2019-11-29 2019-11-29 Data processing method and device Pending CN110990647A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911198687.3A CN110990647A (en) 2019-11-29 2019-11-29 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911198687.3A CN110990647A (en) 2019-11-29 2019-11-29 Data processing method and device

Publications (1)

Publication Number Publication Date
CN110990647A true CN110990647A (en) 2020-04-10

Family

ID=70088267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911198687.3A Pending CN110990647A (en) 2019-11-29 2019-11-29 Data processing method and device

Country Status (1)

Country Link
CN (1) CN110990647A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111597802A (en) * 2020-05-14 2020-08-28 支付宝实验室(新加坡)有限公司 Service processing method and device and electronic equipment
CN111597802B (en) * 2020-05-14 2023-08-22 支付宝实验室(新加坡)有限公司 Service processing method and device and electronic equipment
CN111860225A (en) * 2020-06-30 2020-10-30 北京百度网讯科技有限公司 Image processing method and device, electronic equipment and storage medium
CN111860225B (en) * 2020-06-30 2023-12-12 阿波罗智能技术(北京)有限公司 Image processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110163198B (en) Table identification reconstruction method and device and storage medium
CN110991520B (en) Method and device for generating training samples
CN106447721B (en) Image shadow detection method and device
CN111950318A (en) Two-dimensional code image identification method and device and storage medium
CN108830220B (en) Visual semantic library construction and global positioning method based on deep learning
CN111311709B (en) Method and device for generating high-precision map
CN110929664B (en) Image recognition method and device
CN109255300B (en) Bill information extraction method, bill information extraction device, computer equipment and storage medium
CN112911281B (en) Video quality evaluation method and device
CN111639682A (en) Ground segmentation method and device based on point cloud data
CN107766349B (en) Method, device, equipment and client for generating text
CN112801229A (en) Training method and device for recognition model
CN111238450A (en) Visual positioning method and device
CN110414649B (en) DM code positioning method, device, terminal and storage medium
CN110990647A (en) Data processing method and device
CN112101386B (en) Text detection method, device, computer equipment and storage medium
CN116402074A (en) Code scanning equipment and working mode identification method, device and equipment thereof
CN111967490A (en) Model training method for map detection and map detection method
CN114255223A (en) Deep learning-based method and equipment for detecting surface defects of two-stage bathroom ceramics
CN111159451A (en) Power line point cloud dynamic monomer method based on spatial database
CN113591827A (en) Text image processing method and device, electronic equipment and readable storage medium
CN117237681A (en) Image processing method, device and related equipment
WO2023272495A1 (en) Badging method and apparatus, badge detection model update method and system, and storage medium
CN113344198B (en) Model training method and device
CN112329547A (en) Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination