CN110276352A

CN110276352A - Index identification method, device, electronic equipment and computer readable storage medium

Info

Publication number: CN110276352A
Application number: CN201910578100.5A
Authority: CN
Inventors: 龙力; 王佳军
Original assignee: Lazhasi Network Technology Shanghai Co Ltd
Current assignee: Rajax Network Technology Co Ltd; Lazhasi Network Technology Shanghai Co Ltd
Priority date: 2019-06-28
Filing date: 2019-06-28
Publication date: 2019-09-24

Abstract

Embodiments of the present disclosure disclose an index identification method, a device, electronic equipment and a computer readable storage medium. The method comprises: detecting multiple text boxes in an image to be recognized; obtaining size information of the text boxes, and determining a target text box from the multiple text boxes according at least to the size information of the text boxes; and identifying the mark of a target object from the target text box. With the embodiments of the present disclosure, for an image to be recognized of a target object, such as a storefront image of a physical store, the target text box most likely to contain the mark of the target object can be selected from the multiple text boxes detected in the image, and the text in the target text box can then be recognized. The mark of the target object, such as the name of a physical store, can thus be identified from the image automatically, quickly and accurately, which improves recognition efficiency and saves considerable labor and material cost.

Description

Index identification method, device, electronic equipment and computer readable storage medium

Technical field

The present disclosure relates to the field of computer technology, and in particular to an index identification method, a device, electronic equipment and a computer readable storage medium.

Background technique

With the development of Internet technology, more and more offline merchants are joining online platforms. To prevent low-quality merchants without a physical storefront from being served by the platform, an online platform usually requires a merchant to upload a photo of its storefront to prove that it is a physical merchant. In addition, the merchant is required to fill in a series of materials for the online platform to review, for example to check whether the name of the shop is consistent with the materials provided. As the number of merchants grows, manual review is slow, per-capita efficiency is low, and a large amount of manpower and material resources is consumed.

Summary of the invention

Embodiments of the present disclosure provide an index identification method, a device, electronic equipment and a computer readable storage medium.

In a first aspect, an embodiment of the present disclosure provides an index identification method, comprising:

Detecting multiple text boxes in an image to be recognized;

Obtaining size information of the text boxes, and determining a target text box from the multiple text boxes according at least to the size information of the text boxes;

Identifying the mark of a target object from the target text box.

With reference to the first aspect, in a first implementation of the first aspect of the present disclosure, the method further comprises:

Filtering out, according to the size information of the text boxes, text boxes that do not meet a first preset condition;

Merging two intersecting text boxes that meet a second preset condition.

With reference to the first aspect and/or the first implementation of the first aspect, in a second implementation of the first aspect of the present disclosure, filtering out, according to the size information of the text boxes, text boxes that do not meet the first preset condition comprises:

Filtering out text boxes whose area is less than a first preset threshold.

With reference to the first aspect, the first implementation of the first aspect and/or the second implementation of the first aspect, in a third implementation of the first aspect of the present disclosure, the size information of a text box includes the height and/or width of the text box.

With reference to the first aspect and/or any of the first through third implementations of the first aspect, in a fourth implementation of the first aspect of the present disclosure, determining the target text box from the multiple text boxes according at least to the size information of the text boxes comprises:

Sorting the text boxes by width and by height respectively to obtain two ranking results;

Determining the target text box according to the two ranking results.
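The two rankings can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Box` tuple, the function names and the descending sort order (widest or tallest first) are hypothetical; the source only states that the text boxes are sorted by width and by height. The helper `same_rank_indices` is included because the source goes on to branch on whether any text box holds the same rank in both results.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Hypothetical axis-aligned text box; (x, y) is the top-left corner."""
    x: float
    y: float
    w: float  # width
    h: float  # height


def rank_boxes(boxes: Sequence[Box]) -> Tuple[List[int], List[int]]:
    """Return box indices sorted by width and by height (largest first, assumed)."""
    by_width = sorted(range(len(boxes)), key=lambda i: boxes[i].w, reverse=True)
    by_height = sorted(range(len(boxes)), key=lambda i: boxes[i].h, reverse=True)
    return by_width, by_height


def same_rank_indices(by_width: Sequence[int], by_height: Sequence[int]) -> List[int]:
    """Indices of boxes that occupy the same position in both rankings."""
    return [i for i, j in zip(by_width, by_height) if i == j]
```

Returning indices rather than boxes keeps the two rankings comparable position by position even when several boxes have identical dimensions.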

With reference to the first aspect and/or any of the first through fourth implementations of the first aspect, in a fifth implementation of the first aspect of the present disclosure, determining the target text box according to the two ranking results comprises:

When the two ranking results contain first text boxes with identical rankings, determining as the target text box one of: the highest-ranked of the first text boxes, the widest first text box, and the tallest first text box;

When the two ranking results contain no first text box with identical rankings, determining as the target text box one of the widest second text box and the tallest third text box.

With reference to the first aspect and/or any of the first through fifth implementations of the first aspect, in a sixth implementation of the first aspect of the present disclosure, when the two ranking results contain first text boxes with identical rankings, determining as the target text box one of the highest-ranked first text box, the widest first text box and the tallest first text box comprises:

If the height of the highest-ranked first text box is greater than or equal to the average height of the multiple text boxes, determining the highest-ranked first text box as a candidate text box;

If the height of the highest-ranked first text box is less than the average height of the multiple text boxes, and the width of the tallest third text box is greater than or equal to the average width of the multiple text boxes, determining the tallest third text box as the candidate text box;

If the height of the highest-ranked first text box is less than the average height of the multiple text boxes, and the width of the tallest third text box is less than the average width of the multiple text boxes, determining the widest second text box as the candidate text box;

Determining the target text box according to the distance between the upper edge of the candidate text box and the top of the image to be recognized.
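The three candidate rules above can be sketched as a single function. This is an illustrative sketch only: `Box`, `choose_candidate` and the `same_rank` argument (the first text boxes, ordered best rank first) are hypothetical names, and the widest and tallest boxes are assumed to be taken over all detected text boxes.

```python
from statistics import mean
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Hypothetical axis-aligned text box; (x, y) is the top-left corner."""
    x: float
    y: float
    w: float
    h: float


def choose_candidate(boxes: Sequence[Box], same_rank: Sequence[Box]) -> Box:
    """Pick the candidate text box when some boxes share a rank in both orderings.

    boxes: all detected text boxes.
    same_rank: boxes with identical rank in both orderings, best rank first.
    """
    avg_h = mean(b.h for b in boxes)
    avg_w = mean(b.w for b in boxes)
    top_ranked = same_rank[0]
    widest = max(boxes, key=lambda b: b.w)   # the "second text box"
    tallest = max(boxes, key=lambda b: b.h)  # the "third text box"

    if top_ranked.h >= avg_h:
        return top_ranked
    if tallest.w >= avg_w:
        return tallest
    return widest
```

The candidate is not yet the final answer: the source still checks its distance from the top of the image before fixing the target text box.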

With reference to the first aspect and/or any of the first through sixth implementations of the first aspect, in a seventh implementation of the first aspect of the present disclosure, determining the target text box according to the distance between the upper edge of the candidate text box and the top of the image to be recognized comprises:

When the distance between the upper edge of the candidate text box and the top of the image to be recognized is less than or equal to a second preset threshold, and the candidate text box is the uppermost text box in the image to be recognized, determining the candidate text box as the target text box;

When the distance between the upper edge of the candidate text box and the top of the image to be recognized is less than or equal to the second preset threshold, and the candidate text box is not the uppermost text box in the image to be recognized, selecting the tallest text box from among the candidate text box and the fourth text boxes located above the candidate text box, and determining it as the target text box;

When the distance between the upper edge of the candidate text box and the top of the image to be recognized is greater than the second preset threshold, and the widest second text box is the uppermost text box in the image to be recognized, determining the widest second text box as the target text box;

When the distance between the upper edge of the candidate text box and the top of the image to be recognized is greater than the second preset threshold, and the widest second text box is not the uppermost text box in the image to be recognized, selecting the tallest text box from among the widest second text box and the fifth text boxes located above the widest second text box, and determining it as the target text box.
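The four cases above reduce to two decisions: which box serves as the base (the candidate or the widest box), and whether any box lies above that base. The sketch below assumes image coordinates with the origin at the top-left, so that `y` is the distance from the upper edge of a box to the top of the image; `Box`, `choose_target` and `second_threshold` are hypothetical names, and "above" is interpreted as having a strictly smaller `y`.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Hypothetical axis-aligned text box; (x, y) is the top-left corner."""
    x: float
    y: float  # distance from the upper edge to the top of the image (assumed)
    w: float
    h: float


def choose_target(boxes: Sequence[Box], candidate: Box, second_threshold: float) -> Box:
    """Apply the top-distance rules to turn a candidate into the target text box."""
    widest = max(boxes, key=lambda b: b.w)
    # Candidate close enough to the top: use it as the base; otherwise use the widest box.
    base = candidate if candidate.y <= second_threshold else widest
    # Boxes strictly above the base (the "fourth" or "fifth" text boxes in the source).
    above = [b for b in boxes if b.y < base.y]
    if not above:
        # The base is the uppermost text box, so it is the target.
        return base
    # Otherwise take the tallest of the base and the boxes above it.
    return max([base, *above], key=lambda b: b.h)
```

The value of `second_threshold` is not given in the source and would presumably be tuned on sample images.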

With reference to the first aspect and/or any of the first through seventh implementations of the first aspect, in an eighth implementation of the first aspect of the present disclosure, when the two ranking results contain no first text box with identical rankings, determining as the target text box one of the widest second text box and the tallest third text box comprises:

If the ratio of the width to the height of the widest second text box is less than or equal to a third preset threshold, determining the widest second text box as the target text box;

If the ratio of the width to the height of the widest second text box is greater than the third preset threshold, and the distance between the upper edge of the tallest third text box and the top of the image to be recognized is greater than a fourth preset threshold, determining the widest second text box as the target text box;

If the distance between the upper edge of the tallest third text box and the top of the image to be recognized is less than or equal to the fourth preset threshold, determining the tallest third text box as the target text box.
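A minimal sketch of this branch follows. It assumes that the third rule applies only when the width-to-height ratio of the widest box exceeds the third preset threshold (the source does not restate that condition), that `y` is the distance from the upper edge of a box to the top of the image, and that box heights are positive. `Box`, `choose_without_common_rank` and the threshold parameter names are hypothetical.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Hypothetical axis-aligned text box; (x, y) is the top-left corner."""
    x: float
    y: float  # distance from the upper edge to the top of the image (assumed)
    w: float
    h: float  # assumed to be greater than zero


def choose_without_common_rank(
    boxes: Sequence[Box], third_threshold: float, fourth_threshold: float
) -> Box:
    """Choose between the widest and the tallest box when no box shares a rank."""
    widest = max(boxes, key=lambda b: b.w)   # the "second text box"
    tallest = max(boxes, key=lambda b: b.h)  # the "third text box"

    # A widest box that is not excessively elongated is accepted directly.
    if widest.w / widest.h <= third_threshold:
        return widest
    # Elongated widest box: prefer the tallest box only if it sits near the top.
    if tallest.y > fourth_threshold:
        return widest
    return tallest
```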

With reference to the first aspect and/or any of the first through eighth implementations of the first aspect, in a ninth implementation of the first aspect of the present disclosure, detecting multiple text boxes in the image to be recognized comprises:

Detecting the multiple text boxes in the image to be recognized using a first artificial intelligence network model, wherein the first artificial intelligence network model is pre-trained on sample data.

With reference to the first aspect and/or any of the first through ninth implementations of the first aspect, in a tenth implementation of the first aspect of the present disclosure, identifying the mark of the target object from the target text box comprises:

Identifying the mark of the target object from the target text box using a second artificial intelligence network model, wherein the second artificial intelligence network model is pre-trained on sample data.

With reference to the first aspect and/or any of the first through tenth implementations of the first aspect, in an eleventh implementation of the first aspect of the present disclosure, the method further comprises:

Obtaining a background image set and a character set, wherein the background image set includes one or more background images generated using different colors, and/or one or more background images cropped from existing images, and the character set includes one or more texts generated using different colors and/or different fonts;

Generating the sample data according to the background image set and the character set, wherein the sample data includes at least one background image in the background image set and at least one text in the character set.

In a second aspect, an embodiment of the present disclosure provides an identification recognition device, comprising:

A detection module, configured to detect multiple text boxes in an image to be recognized;

A determining module, configured to obtain size information of the text boxes, and to determine a target text box from the multiple text boxes according at least to the size information of the text boxes;

An identification module, configured to identify the mark of a target object from the target text box.

The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.

In a possible design, the structure of the identification recognition device includes a memory and a processor. The memory is configured to store one or more computer instructions that support the identification recognition device in executing the index identification method of the first aspect, and the processor is configured to execute the computer instructions stored in the memory. The identification recognition device may further include a communication interface for communication between the identification recognition device and other equipment or a communication network.

The third aspect, the embodiment of the present disclosure provide a kind of electronic equipment, including memory and processor；Wherein, described Memory is for storing one or more computer instruction, wherein one or more computer instruction is by the processor It executes to realize following methods step:

Detect multiple text boxes in images to be recognized；

The mark of target object is identified from the target text box.

With reference to the third aspect, in a first implementation of the third aspect of the present disclosure, the one or more computer instructions are further executed by the processor to implement the following method steps:

With reference to the third aspect and/or the first implementation of the third aspect, in a second implementation of the third aspect of the present disclosure, filtering out, according to the size information of the text boxes, text boxes that do not meet the first preset condition comprises:

Filtering out text boxes whose area is less than a first preset threshold.

With reference to the third aspect, the first implementation of the third aspect and/or the second implementation of the third aspect, in a third implementation of the third aspect of the present disclosure, the size information of a text box includes the height and/or width of the text box.

With reference to the third aspect and/or any of the first through third implementations of the third aspect, in a fourth implementation of the third aspect of the present disclosure, determining the target text box from the multiple text boxes according at least to the size information of the text boxes comprises:

Determining the target text box according to the two ranking results.

With reference to the third aspect and/or any of the first through fourth implementations of the third aspect, in a fifth implementation of the third aspect of the present disclosure, determining the target text box according to the two ranking results comprises:

With reference to the third aspect and/or any of the first through fifth implementations of the third aspect, in a sixth implementation of the third aspect of the present disclosure, when the two ranking results contain first text boxes with identical rankings, determining as the target text box one of the highest-ranked first text box, the widest first text box and the tallest first text box comprises:

With reference to the third aspect and/or any of the first through sixth implementations of the third aspect, in a seventh implementation of the third aspect of the present disclosure, determining the target text box according to the distance between the upper edge of the candidate text box and the top of the image to be recognized comprises:

With reference to the third aspect and/or any of the first through seventh implementations of the third aspect, in an eighth implementation of the third aspect of the present disclosure, when the two ranking results contain no first text box with identical rankings, determining as the target text box one of the widest second text box and the tallest third text box comprises:

With reference to the third aspect and/or any of the first through eighth implementations of the third aspect, in a ninth implementation of the third aspect of the present disclosure, detecting multiple text boxes in the image to be recognized comprises:

Detecting the multiple text boxes in the image to be recognized using a first artificial intelligence network model, wherein the first artificial intelligence network model is pre-trained on sample data.

With reference to the third aspect and/or any of the first through ninth implementations of the third aspect, in a tenth implementation of the third aspect of the present disclosure, identifying the mark of the target object from the target text box comprises:

With reference to the third aspect and/or any of the first through tenth implementations of the third aspect, in an eleventh implementation of the third aspect of the present disclosure, the one or more computer instructions are further executed by the processor to implement the following method steps:

Generating the sample data according to the background image set and the character set, wherein the sample data includes at least one background image in the background image set and at least one text in the character set.
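One way to picture the sample generation step is as an enumeration of background and text combinations. The sketch below is only an illustration under stated assumptions: `SampleSpec` and `build_sample_specs` are hypothetical names, each sample is represented as a description rather than a rendered image, and skipping combinations whose text color equals a solid background color is an added assumption that the source does not state. Rendering the descriptions into images would require an imaging library and is left out.

```python
from itertools import product
from typing import List, NamedTuple, Sequence


class SampleSpec(NamedTuple):
    """Hypothetical description of one synthetic training sample."""
    background: str  # a solid color name, or the path of a cropped image
    text: str
    font: str
    text_color: str


def build_sample_specs(
    backgrounds: Sequence[str],
    texts: Sequence[str],
    fonts: Sequence[str],
    text_colors: Sequence[str],
) -> List[SampleSpec]:
    """Combine every background with every text, font and text color."""
    specs = []
    for background, text, font, color in product(backgrounds, texts, fonts, text_colors):
        if color == background:
            # Assumed rule: text in the background's own color would be unreadable.
            continue
        specs.append(SampleSpec(background, text, font, color))
    return specs
```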

In a fourth aspect, an embodiment of the present disclosure provides a computer readable storage medium for storing computer instructions used by the identification recognition device, including computer instructions for executing the index identification method of the first aspect.

The technical solutions provided by the embodiments of the present disclosure may provide the following beneficial effects:

After obtaining an image to be recognized, the embodiments of the present disclosure perform text box detection on the image, select from the detected text boxes, according to the size information of the text boxes, the text box that best matches the mark of the target object, and perform text recognition on that text box to obtain the mark of the target object. With the embodiments of the present disclosure, for an image to be recognized of a target object, such as a storefront image of a physical store, the target text box most likely to contain the mark of the target object can be selected from the multiple text boxes detected in the image, and the text in the target text box can then be recognized. The mark of the target object, such as the name of a physical store, can thus be identified from the image automatically, quickly and accurately, which improves recognition efficiency and saves considerable labor and material cost.

It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.

Brief description of the drawings

Other features, objects and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments, taken in conjunction with the accompanying drawings. In the drawings:

Fig. 1 shows a flowchart of an index identification method according to an embodiment of the present disclosure;

Fig. 2 shows a flowchart of step S102 of the embodiment shown in Fig. 1;

Fig. 3 shows a flowchart of step S202 of the embodiment shown in Fig. 2;

Fig. 4 shows a flowchart of step S301 of the embodiment shown in Fig. 3;

Fig. 5 shows a flowchart of step S404 of the embodiment shown in Fig. 4;

Fig. 6 shows a flowchart of step S302 of the embodiment shown in Fig. 3;

Fig. 7 shows an effect diagram of identifying a restaurant name according to an embodiment of the present disclosure;

Fig. 8 shows a structural block diagram of an identification recognition device according to an embodiment of the present disclosure;

Fig. 9 shows a structural block diagram of the determining module 802 of the embodiment shown in Fig. 4;

Fig. 10 shows a structural block diagram of the first determining submodule 902 of the embodiment shown in Fig. 9;

Fig. 11 shows a structural block diagram of the second determining submodule 1001 of the embodiment shown in Fig. 10;

Fig. 12 shows a structural block diagram of the seventh determining submodule 1104 of the embodiment shown in Fig. 11;

Fig. 13 shows a structural block diagram of the third determining submodule 1002 of the embodiment shown in Fig. 10;

Fig. 14 is a schematic structural diagram of electronic equipment suitable for implementing an index identification method according to an embodiment of the present disclosure.

Specific embodiment

Hereinafter, exemplary embodiments of the present disclosure are described in detail with reference to the accompanying drawings, so that those skilled in the art can readily implement them. For clarity, parts unrelated to the description of the exemplary embodiments are omitted from the drawings.

In the present disclosure, it should be understood that terms such as "including" or "having" are intended to indicate the presence of the features, numbers, steps, acts, components, parts or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts or combinations thereof exist or are added.

It should also be noted that, where no conflict arises, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

Fig. 1 shows a flowchart of an index identification method according to an embodiment of the present disclosure. As shown in Fig. 1, the index identification method includes the following steps S101-S103:

In step S101, multiple text boxes in an image to be recognized are detected;

In step S102, size information of the text boxes is obtained, and a target text box is determined from the multiple text boxes according at least to the size information of the text boxes;

In step S103, the mark of a target object is identified from the target text box.

In the present embodiment, the target object may be a system object in an online platform that provides services to users, such as a merchant or a product. Online platforms include but are not limited to e-commerce platforms. A system object may be a merchant, a product, and so on. The image to be recognized may be a storefront image of the merchant's physical store, an image of a product's physical outer packaging, and so on. The image to be recognized may include the mark of the target object, which may be a text mark, such as the shop name of a restaurant or a product name. It can be understood that, besides the mark of the target object, the image to be recognized may also include other text content, such as details about the target object (for example, a restaurant's address and telephone number, or a product's manufacturer).

In some embodiments, text box positions can be detected by an image segmentation network model. For example, a deep learning method can be used to perform image segmentation on the image to be recognized, determine the pixels that belong to text boxes, and obtain the text box positions by the minimum circumscribed rectangle method. The image to be recognized may contain multiple text boxes, one of which may contain the mark of the target object.
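The last step, turning a set of text pixels into a box, can be sketched as follows. This is a simplified illustration that assumes an axis-aligned rectangle; a minimum circumscribed rectangle may in general be rotated, which this sketch does not handle. The `Box` tuple and `bounding_box` function are hypothetical names, and the segmentation model that produces the pixel set is outside the sketch.

```python
from typing import Iterable, NamedTuple, Tuple


class Box(NamedTuple):
    """Hypothetical axis-aligned text box; (x, y) is the top-left corner."""
    x: int
    y: int
    w: int
    h: int


def bounding_box(pixels: Iterable[Tuple[int, int]]) -> Box:
    """Smallest axis-aligned rectangle enclosing the given (x, y) pixels."""
    points = list(pixels)
    if not points:
        raise ValueError("at least one pixel is required")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, top = min(xs), min(ys)
    # +1 because both the first and the last pixel column/row belong to the box.
    return Box(left, top, max(xs) - left + 1, max(ys) - top + 1)
```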

After multiple text boxes are detected, the size information of the detected text boxes can be obtained, the target text box most likely to contain the mark of the target object can be selected from the multiple text boxes according to that size information, and text recognition can be performed on the target text box to obtain the mark of the target object. The size information of a text box includes but is not limited to its height and/or width. The target text box may be the one of the multiple text boxes that has the highest probability of containing the mark of the target object.

In a practical application scenario, the image to be recognized may be an appearance image of the entity corresponding to the target object. Usually, the outer surface of that entity prominently displays the mark of the target object, such as its name or code name, and the mark may include text, characters, numbers and so on. Therefore, in the process of identifying the mark of the target object, the most prominent text box can be selected from the multiple text boxes as the target text box, that is, the target text box is chosen according to the prominence of the text boxes. The prominence of a text box can be determined by comparing one or more features such as its area, lateral width, longitudinal height and distance from the top of the image to be recognized. For example, the most prominent text box may be the one with the largest area, the greatest lateral width, the greatest longitudinal height and/or the position closest to the top of the image to be recognized. It should be noted that, unless otherwise specified, a text box in the embodiments of the present disclosure is a rectangle with four sides, where width refers to the distance between the two vertical sides and height refers to the distance between the upper and lower horizontal sides.

After the target text box has been determined, the text content in the target text box can be recognized by a text recognition model, such as a neural network model, and the recognized text content is determined as the mark of the target object.
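Steps S101 to S103 can be pictured as a three-stage pipeline. The sketch below is an assumption-laden outline rather than the disclosed implementation: `identify_mark` and its `detect`, `select` and `recognize` parameters are hypothetical, and the detection and recognition models are passed in as plain callables so that no particular library is implied.

```python
from typing import Callable, NamedTuple, Optional, Sequence, TypeVar

Image = TypeVar("Image")


class Box(NamedTuple):
    """Hypothetical axis-aligned text box; (x, y) is the top-left corner."""
    x: float
    y: float
    w: float
    h: float


def identify_mark(
    image: Image,
    detect: Callable[[Image], Sequence[Box]],
    select: Callable[[Sequence[Box]], Box],
    recognize: Callable[[Image, Box], str],
) -> Optional[str]:
    """Detect text boxes, select the target box by size, then read its text."""
    boxes = detect(image)            # S101: text box detection
    if not boxes:
        return None                  # nothing to recognize
    target = select(boxes)           # S102: size-based target selection
    return recognize(image, target)  # S103: text recognition on the target box
```

Keeping the selection step as a separate callable means that the size-based rules can be changed without touching either model.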

After obtaining an image to be recognized, the embodiments of the present disclosure perform text box detection on the image, select a target text box from the detected text boxes according to the size information of the text boxes, and perform text recognition on the target text box to obtain the mark of the target object. With the embodiments of the present disclosure, the target text box most likely to contain the mark of the target object can be selected from the multiple text boxes detected in the image to be recognized of the target object, such as a storefront image of a physical store, and the text in the target text box can then be recognized. The mark of the target object, such as the name of a physical store, can thus be identified from the image quickly and accurately, which improves recognition efficiency and saves considerable labor and material cost.

In an optional implementation of the present embodiment, step S101, that is, the step of detecting multiple text boxes in the image to be recognized, further comprises the following step:

Optimizing the multiple text boxes according to a preset optimization method.

In this optional implementation, the text boxes detected from the image to be recognized by the text box detection model may include misjudgments: some regions of the image may be wrongly classified as text boxes, and some text boxes with a large lateral width may be detected incompletely. Therefore, after text boxes are detected from the image, the multiple text boxes can be optimized according to the preset optimization method so that text boxes with complete text content are retained; for example, misjudged text boxes can be removed, and incompletely detected text boxes can be merged.

In an optional implementation of the present embodiment, the method further comprises following steps:

In the optional implementation, due to the embodiment of the present disclosure be from the mark of images to be recognized detected target object, And images to be recognized is the appearance image of target object correspondent entity, and under normal conditions, in the corresponding entity of target object On outer surface, the mark of target object may be designed larger and/or more significant, therefore can pass through experience and/or system The first preset condition is arranged in the modes such as meter analysis sample data, and does not meet first according to the filtering of the dimension information of text box and preset The text box of condition, the mark for further identification target object are laid a solid foundation；Wherein, the first preset condition can be text box Size range, lateral extent and/or longitudinal altitude range etc..

In addition, when two text boxes intersect, there are two possible cases: in the first, the text contents of the two text boxes are simply close to each other; in the second, text that belongs to a single whole has been wrongly detected as two parts. To avoid the latter misdetection, a second preset condition can be set in advance through experience and/or statistical analysis of sample data, and used to distinguish whether two intersecting text boxes belong to the first or the second case. If the two intersecting text boxes meet the second preset condition, the text in the two boxes is considered to belong to one whole that was wrongly split into two parts, so the two boxes can be merged; if they do not meet the second preset condition, the text contents of the two boxes are considered to be merely close together, and no merging is needed. The second preset condition may be based on the intersection ratio of the two text boxes, where the intersection ratio is the ratio between the intersecting part of the two boxes and the merged part obtained by merging them, for example, the ratio of the area, horizontal width and/or vertical height of the intersecting part to the area, horizontal width and/or vertical height of the merged part.
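The area-based form of the intersection ratio can be illustrated with a minimal Python sketch. The `Box` type, the function names, the default threshold of 0.3 and the choice to merge when the ratio is at or above the threshold are all assumptions made for illustration; the disclosure only states that the second preset condition may be based on this ratio.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned text box; (x1, y1) is the top-left corner, (x2, y2) the bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float


def area(b: Box) -> float:
    return max(0.0, b.x2 - b.x1) * max(0.0, b.y2 - b.y1)


def union_box(a: Box, b: Box) -> Box:
    """Smallest box enclosing both inputs (the 'merged part')."""
    return Box(min(a.x1, b.x1), min(a.y1, b.y1), max(a.x2, b.x2), max(a.y2, b.y2))


def intersection_ratio(a: Box, b: Box) -> float:
    """Area of the intersecting part divided by the area of the merged part."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1), min(a.x2, b.x2), min(a.y2, b.y2))
    if inter.x2 <= inter.x1 or inter.y2 <= inter.y1:
        return 0.0  # the boxes do not intersect
    return area(inter) / area(union_box(a, b))


def merge_if_same_text(a: Box, b: Box, min_ratio: float = 0.3) -> Optional[Box]:
    """Return the merged box when the ratio reaches min_ratio (assumed rule), else None."""
    if intersection_ratio(a, b) >= min_ratio:
        return union_box(a, b)
    return None
```

With `Box(0, 0, 10, 4)` and `Box(6, 0, 16, 4)` the intersecting area is 16 and the merged area is 64, so the ratio is 0.25; the pair would be merged under a threshold of 0.2 but kept separate under the default of 0.3.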

In an optional implementation of this embodiment, the step of filtering out, according to the size information of the text boxes, the text boxes that do not meet the first preset condition further includes the following step:

Filtering out the text boxes whose area is less than a first preset threshold.

In this optional implementation, the first preset threshold can be set to a minimum area. If the area of a text box is less than the minimum area, the text box can be considered either a false detection or a box whose text is unlikely to be the mark of the target object. Therefore, when the area of one or more detected text boxes is less than the first preset threshold, those text boxes are deleted, which reduces the complexity of text recognition and improves its efficiency.
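As a minimal sketch of this filtering step, the following Python snippet drops every box whose area is below a minimum area. The `Box` type, the function name and the example threshold of 200 are hypothetical; in practice the threshold would come from experience or from statistics over sample data, as described above.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned text box; (x1, y1) is the top-left corner, (x2, y2) the bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float


def filter_small_boxes(boxes: List[Box], min_area: float) -> List[Box]:
    """Keep only the boxes whose area is at least min_area (the first preset threshold)."""
    return [b for b in boxes if (b.x2 - b.x1) * (b.y2 - b.y1) >= min_area]


detected = [Box(0, 0, 100, 40), Box(5, 50, 15, 55), Box(0, 60, 80, 80)]
kept = filter_small_boxes(detected, min_area=200)  # the 10 x 5 box is removed
```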

In an optional implementation of this embodiment, step S102, that is, the step of determining the target text box from the multiple text boxes according at least to the size information of the text boxes, further includes the following step:

Determining the target text box from the multiple text boxes according at least to the height and/or width of the text boxes.

In this optional implementation, experience shows that the mark displayed on the exterior surface of the entity corresponding to the target object is usually eye-catching; for example, its font may be larger than that of other text. Therefore, the embodiment of the present disclosure can determine the target text box from the multiple text boxes at least by comparing the vertical heights and/or horizontal widths of the text boxes, and then recognize the mark of the target object from it.

In an optional implementation of this embodiment, as shown in Fig. 2, step S102, that is, the step of determining the target text box from the multiple text boxes according at least to the size information of the text boxes, further includes the following steps S201-S202:

In step S201, the text boxes are sorted by width and by height respectively to obtain two ranking results;

In step S202, the target text box is determined according to the two ranking results.

In this optional implementation, the multiple text boxes are sorted by horizontal width and by vertical height respectively, and the target text box most likely to contain the mark of the target object is selected according to the two ranking results. For example, depending on the type of the target object, the widest text box and/or the tallest text box can be selected as the target text box.
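Step S201 can be sketched in Python as two independent descending sorts over box identifiers. The `Box` type, the function name and the five example sizes are assumptions chosen for illustration, not values from the disclosure.

```python
from typing import Dict, List, NamedTuple, Tuple


class Box(NamedTuple):
    width: float
    height: float


def rank_boxes(boxes: Dict[int, Box]) -> Tuple[List[int], List[int]]:
    """Return box ids sorted by width and by height, largest first."""
    by_width = sorted(boxes, key=lambda i: boxes[i].width, reverse=True)
    by_height = sorted(boxes, key=lambda i: boxes[i].height, reverse=True)
    return by_width, by_height


# Hypothetical sizes for five detected boxes, keyed by box number.
boxes = {1: Box(300, 10), 2: Box(40, 90), 3: Box(80, 30), 4: Box(200, 60), 5: Box(120, 40)}
by_width, by_height = rank_boxes(boxes)
widest_id, tallest_id = by_width[0], by_height[0]
```

The first entry of each ranking gives the widest and the tallest text box, which are the two fallback choices used in step S202.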

In an optional implementation of this embodiment, as shown in Fig. 3, step S202, that is, the step of determining the target text box according to the two ranking results, further includes the following steps S301-S302:

In step S301, when the two ranking results contain first text boxes with the same rank, one of the foremost-ranked first text box, the widest first text box and the tallest first text box among the first text boxes is determined as the target text box;

In step S302, when the two ranking results contain no first text box with the same rank, one of the widest second text box and the tallest third text box is determined as the target text box.

In this optional implementation, if the two ranking results obtained by sorting on horizontal width and on vertical height contain one or more first text boxes with the same rank, the text box most likely to contain the mark of the target object can be selected as the target text box from among the foremost-ranked first text box, the widest second text box and the tallest third text box. This is because a large number of experiments show that the text box that has the same rank in the width ranking and the height ranking and is ranked nearer the front, the widest text box, and the tallest text box each have a relatively high probability of containing the mark of the target object. For example, suppose 5 text boxes are detected and numbered 1 to 5. The ranking result obtained by sorting on width is [1, 4, 5, 3, 2], and the ranking result obtained by sorting on height is [2, 4, 5, 3, 1]. The text boxes with the same rank are text boxes 4, 5 and 3, and the foremost of them is text box 4. One of text box 4, the widest text box and the tallest text box can therefore be selected as the target text box.

If the two ranking results contain no first text box with the same rank, one of the widest second text box and the tallest third text box can be selected as the target text box.
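The comparison of the two ranking results can be sketched as follows, using the five-box example given in this implementation. The function name is hypothetical; the sketch simply collects the box numbers that occupy the same position in both rankings, in ranking order.

```python
from typing import List


def same_rank_boxes(by_width: List[int], by_height: List[int]) -> List[int]:
    """Box ids that hold the same position in the width and height rankings."""
    return [w for w, h in zip(by_width, by_height) if w == h]


by_width = [1, 4, 5, 3, 2]   # ranking after sorting on width
by_height = [2, 4, 5, 3, 1]  # ranking after sorting on height

first_boxes = same_rank_boxes(by_width, by_height)
# The foremost same-rank box, or None when step S302 applies instead.
foremost = first_boxes[0] if first_boxes else None
```

For this example the same-rank boxes are 4, 5 and 3, and the foremost one is box 4; an empty result corresponds to the case handled by step S302.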

In an optional implementation of this embodiment, as shown in Fig. 4, step S301, that is, the step of determining, when the two ranking results contain first text boxes with the same rank, one of the foremost-ranked first text box, the widest first text box and the tallest first text box among the first text boxes as the target text box, further includes the following steps S401-S404:

In step S401, if the height of the foremost-ranked first text box is greater than or equal to the average height of the multiple text boxes, the foremost-ranked first text box is determined as the candidate text box;

In step S402, if the height of the foremost-ranked first text box is less than the average height of the multiple text boxes, and the width of the tallest third text box is greater than or equal to the average width of the multiple text boxes, the tallest third text box is determined as the candidate text box;

In step S403, if the height of the foremost-ranked first text box is less than the average height of the multiple text boxes, and the width of the tallest third text box is less than the average width of the multiple text boxes, the widest second text box is determined as the candidate text box;

In step S404, the target text box is determined according to the distance between the upper edge of the candidate text box and the top of the image to be recognized.

In this optional implementation, if the two ranking results contain first text boxes with the same rank, the foremost-ranked first text box is preferred as the candidate text box, provided that its height is greater than or equal to the average height of the multiple text boxes. If its height is less than the average height, the box is not tall enough, that is, it is relatively narrow in the vertical direction, and it is less likely to contain the mark of the target object than the widest second text box or the tallest third text box; one of those two is therefore selected as the candidate text box. In that case, it is first determined whether the width of the tallest third text box is greater than or equal to the average width of the multiple text boxes. If so, the tallest third text box is selected as the candidate text box; otherwise, the widest second text box is selected. The reason is that the mark of the target object usually appears tall in the image to be recognized, but a text box that is tall yet not wide enough is less likely to contain the mark; that is, when the width of the tallest third text box is less than the average width, its text is less likely to be the mark of the target object than the text in the widest second text box.

After the candidate text box is determined by the above conditions, whether the candidate text box is the target text box is determined according to the distance between the upper edge of the candidate text box and the top of the image to be recognized, that is, by judging whether the candidate text box is located toward the top or toward the bottom of the image to be recognized. In some embodiments, if the candidate text box is not the target text box, the target text box can be selected again from the other text boxes.
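The candidate selection of steps S401-S403 can be sketched as a short decision function. This is an illustrative reading of the steps, not the disclosed implementation: the `Box` layout (top-left corner plus width and height), the function name and the example boxes are assumptions.

```python
from statistics import mean
from typing import List, NamedTuple


class Box(NamedTuple):
    x: float  # left edge
    y: float  # upper edge, measured from the top of the image
    w: float  # horizontal width
    h: float  # vertical height


def choose_candidate(boxes: List[Box], foremost_same_rank: Box) -> Box:
    """Pick the candidate text box following the conditions of S401-S403."""
    avg_h = mean(b.h for b in boxes)
    avg_w = mean(b.w for b in boxes)
    widest = max(boxes, key=lambda b: b.w)
    tallest = max(boxes, key=lambda b: b.h)
    if foremost_same_rank.h >= avg_h:  # S401: the same-rank box is tall enough
        return foremost_same_rank
    if tallest.w >= avg_w:             # S402: the tallest box is also wide enough
        return tallest
    return widest                      # S403: fall back to the widest box
```

For instance, with boxes of sizes 300 x 10, 150 x 100, 200 x 30 and 20 x 5, the 200 x 30 box holds the second position in both rankings but is below the average height of 36.25, and the tallest box is narrower than the average width of 167.5, so the widest box becomes the candidate.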

In an optional implementation of this embodiment, as shown in Fig. 5, step S404, that is, the step of determining the target text box according to the distance between the upper edge of the candidate text box and the top of the image to be recognized, further includes the following steps S501-S504:

In step S501, when the distance between the upper edge of the candidate text box and the top of the image to be recognized is less than or equal to a second preset threshold, and the candidate text box is the uppermost text box in the image to be recognized, the candidate text box is determined as the target text box;

In step S502, when the distance between the upper edge of the candidate text box and the top of the image to be recognized is less than or equal to the second preset threshold, and the candidate text box is not the uppermost text box in the image to be recognized, the tallest text box among the candidate text box and the fourth text boxes located above the candidate text box is determined as the target text box;

In step S503, when the distance between the upper edge of the candidate text box and the top of the image to be recognized is greater than the second preset threshold, and the widest second text box is the uppermost text box in the image to be recognized, the widest second text box is determined as the target text box;

In step S504, when the distance between the upper edge of the candidate text box and the top of the image to be recognized is greater than the second preset threshold, and the widest second text box is not the uppermost text box in the image to be recognized, the tallest text box among the widest second text box and the fifth text boxes located above the widest second text box is determined as the target text box.

In this optional implementation, after the candidate text box is determined, if the candidate text box is located toward the top of the image to be recognized, that is, the distance between its upper edge and the top of the image to be recognized is less than or equal to the second preset threshold, and there is no other text box above it, the candidate text box can be determined as the target text box. If the candidate text box is located toward the top of the image to be recognized, that is, the distance between its upper edge and the top of the image is less than or equal to the second preset threshold, but there are other text boxes above it, the candidate text box and the fourth text boxes above it can be re-sorted by height, and the tallest text box after re-sorting is determined as the target text box.

If the candidate text box is located toward the bottom of the image to be recognized, that is, the distance between its upper edge and the top of the image to be recognized is greater than the second preset threshold, it is judged whether the widest second text box is the uppermost text box in the image to be recognized. If so, the widest second text box is determined as the target text box; if not, the tallest text box among the widest second text box and the fifth text boxes above it is selected as the target text box.

In some embodiments, the second preset threshold can be preset through experience, statistical analysis or similar means; for example, the second preset threshold can be set to a value less than or equal to half the height of the candidate text box.
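Steps S501-S504 can be condensed into one small function, sketched below in Python. The `Box` layout, the function name and the convention that the top of the image is at y = 0 (so the distance from the upper edge to the top equals `y`) are assumptions for illustration. The sketch relies on the observation that all four steps pick the tallest box among a reference box and the boxes above it, where the reference is the candidate when it lies near the top and the widest box otherwise.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    x: float  # left edge
    y: float  # upper edge, measured from the top of the image (top is y = 0)
    w: float  # horizontal width
    h: float  # vertical height


def pick_target(boxes: List[Box], candidate: Box, second_threshold: float) -> Box:
    """Choose the target text box following the conditions of S501-S504."""
    widest = max(boxes, key=lambda b: b.w)
    # Near the top: start from the candidate (S501/S502); otherwise from the widest box (S503/S504).
    reference = candidate if candidate.y <= second_threshold else widest
    above = [b for b in boxes if b.y < reference.y]
    # With nothing above, the reference itself is returned (S501/S503);
    # otherwise the tallest of the reference and the boxes above it (S502/S504).
    return max([reference] + above, key=lambda b: b.h)
```

A caller following the example in the text could pass `second_threshold=candidate.h / 2`.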

In an optional implementation of this embodiment, as shown in Fig. 6, step S302, that is, the step of determining, when the two ranking results contain no first text box with the same rank, one of the widest second text box and the tallest third text box as the target text box, further includes the following steps S601-S603:

In step S601, if the ratio of the width to the height of the widest second text box is less than or equal to a third preset threshold, the widest second text box is determined as the target text box;

In step S602, if the ratio of the width to the height of the widest second text box is greater than the third preset threshold, and the distance between the upper edge of the tallest third text box and the top of the image to be recognized is greater than a fourth preset threshold, the widest second text box is determined as the target text box;

In step S603, if the distance between the upper edge of the tallest third text box and the top of the image to be recognized is less than or equal to the fourth preset threshold, the tallest third text box is determined as the target text box.

In this optional implementation, if the two ranking results obtained by sorting on height and on width contain no first text box with the same rank, one of the widest second text box and the tallest third text box can be selected as the target text box. When the ratio of the width to the height of the widest second text box is not too large, that is, less than or equal to the third preset threshold, the widest second text box is taken as the target text box. The width-to-height ratio of the widest second text box is used for screening in order to avoid the case where the widest second text box is a passage of text rather than the mark of the target object.

If the ratio of the width to the height of the widest second text box is relatively large, that is, greater than the third preset threshold, then, because the mark of the target object usually does not contain many characters and its text box is therefore not excessively wide, the widest second text box can be considered unlikely to contain the mark of the target object. It can then be judged whether the tallest third text box is likely to contain the mark. If the distance between the upper edge of the tallest third text box and the top of the image to be recognized is greater than the fourth preset threshold, that is, the tallest third text box is located toward the bottom of the image to be recognized, the tallest third text box is considered even less likely than the widest second text box to contain the mark of the target object, and the widest second text box is still taken as the target text box.

If the ratio of the width to the height of the widest second text box is greater than the third preset threshold, and the distance between the upper edge of the tallest third text box and the top of the image to be recognized is less than or equal to the fourth preset threshold, that is, the tallest third text box is located toward the top of the image to be recognized, the tallest third text box can be determined as the target text box.

In some embodiments, the third preset threshold and the fourth preset threshold can be preset through experience, statistical analysis or similar means. For example, the third preset threshold can be a constant, and the fourth preset threshold can be set to a value less than or equal to half the height of the tallest text box.
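The decision of steps S601-S603 can be sketched in Python as follows. The `Box` layout, the function name and the example threshold values are assumptions; the sketch also assumes every box has a positive height and that the top of the image is at y = 0.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    x: float  # left edge
    y: float  # upper edge, measured from the top of the image (top is y = 0)
    w: float  # horizontal width
    h: float  # vertical height (assumed positive)


def pick_without_same_rank(boxes: List[Box], third_threshold: float,
                           fourth_threshold: float) -> Box:
    """Choose between the widest and the tallest box following S601-S603."""
    widest = max(boxes, key=lambda b: b.w)
    tallest = max(boxes, key=lambda b: b.h)
    if widest.w / widest.h <= third_threshold:  # S601: not elongated like a passage of text
        return widest
    if tallest.y > fourth_threshold:            # S602: the tallest box sits too low in the image
        return widest
    return tallest                              # S603: the tallest box is near the top


slogan = Box(0, 120, 400, 20)  # hypothetical long line of text, width/height = 20
name = Box(50, 10, 150, 80)    # hypothetical store name near the top
target = pick_without_same_rank([slogan, name], third_threshold=8, fourth_threshold=40)
```

Here the widest box exceeds the assumed ratio limit of 8 and the tallest box starts 10 units from the top, within the assumed limit of 40, so the tallest box is returned.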

In this optional implementation, a first artificial intelligence model can be trained in advance with sample data and used to detect text boxes in the image to be recognized. The first artificial intelligence model can implement text detection using the instance segmentation approach proposed in PixelLink, which makes two kinds of pixel-wise predictions based on a DNN: text/non-text prediction and link prediction. The text detection approach proposed in PixelLink is prior art and is not described in detail here. The backbone network of PixelLink can be a ResNet (Residual Neural Network).

In an optional implementation of this embodiment, step S103, that is, the step of recognizing the mark of the target object from the target text box, further includes the following step:

In this optional implementation, after the target text box is determined, a pre-trained second artificial intelligence model can be used to recognize the mark of the target object from the target text box. The second artificial intelligence model can be a CRNN model, which combines a convolutional neural network (CNN) and a recurrent neural network (RNN) and has strong recognition capability.

In an optional implementation of this embodiment, the method further comprises the following steps:

Obtaining a background image set and a text set; wherein the background image set includes one or more background images generated using different colors, and/or one or more background images cropped from existing images; the text set includes one or more texts generated using different colors and/or different fonts;

Generating sample data according to the background image set and the text set; wherein the sample data includes at least one background image in the background image set and at least one text in the text set.

In this optional implementation, a large amount of sample data can be collected in order to train the first artificial intelligence model and/or the second artificial intelligence model described above. In this embodiment, part of the sample data can be generated synthetically. To do so, multiple fonts and/or multiple colors can be chosen to form a text set containing a variety of texts, and a background image set containing a variety of background images can be constructed, for example from arbitrary solid-color backgrounds and/or backgrounds cropped from arbitrary images. To generate one sample, a background image is chosen at random from the background image set, and one or more texts are chosen at random from the text set and written at an arbitrary position on the chosen background image to form an image. The label of this image is the text box containing the one or more texts, and the image together with its label can serve as sample data for the first artificial intelligence model. In addition, the text box containing the one or more texts can be cropped from the image and labeled with the one or more texts; the cropped text box together with its label can serve as sample data for the second artificial intelligence model. In this way, multiple samples for training the first artificial intelligence model and the second artificial intelligence model can be generated.
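The sampling logic described above can be sketched without any imaging library. The Python snippet below only produces a description of one synthetic sample (which background, text, font and color to use, and where the text box lies); actually drawing the text onto the background is left to whatever imaging tool is used. All names, the fixed image size and the one-square-glyph-per-character size estimate are assumptions for illustration.

```python
import random
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    background: str                # identifier of the chosen background image
    text: str                      # label for the recognition model
    font: str
    color: str
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2), label for the detection model


def make_sample(rng: random.Random, backgrounds: List[str], texts: List[str],
                fonts: List[str], colors: List[str],
                image_size: Tuple[int, int] = (640, 480), font_px: int = 32) -> Sample:
    """Randomly combine a background and a text and place the text at a random position."""
    text = rng.choice(texts)
    # Rough size estimate: one square glyph per character (an assumption;
    # real text extents depend on the font and the rendering library).
    w, h = font_px * len(text), font_px
    img_w, img_h = image_size
    x1 = rng.randint(0, max(0, img_w - w))
    y1 = rng.randint(0, max(0, img_h - h))
    return Sample(rng.choice(backgrounds), text, rng.choice(fonts),
                  rng.choice(colors), (x1, y1, x1 + w, y1 + h))


rng = random.Random(0)  # seeded so that runs are repeatable
sample = make_sample(rng, ["solid:#ffffff", "crop:street-01"], ["nine garden"],
                     ["font-a", "font-b"], ["#000000", "#cc0000"])
```

The `box` field corresponds to the label for the detection model, and the `text` field to the label for the recognition model once the box region is cropped from the rendered image.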

Of course, real sample data can also be obtained, for example, by collecting images of natural scenes and annotating them manually.

Fig. 7 shows a schematic diagram of the effect of recognizing a restaurant name according to an embodiment of the present disclosure. A large number of restaurants join a food delivery ordering platform, and to protect users' rights, the restaurants are required to have a legitimate physical storefront. To ensure the compliance of the restaurants that join, they are usually required to upload a storefront photo, and the store name in the storefront photo must be consistent with the submitted store information. Verifying the authenticity of this information by manual review would consume substantial labor costs. With the mark recognition method proposed in the embodiment of the present disclosure, the storefront photo can instead be processed automatically by machine to recognize the store name in it. As shown in Fig. 7, after text box detection is performed on the image and the text boxes are filtered according to size information, two text boxes in the upper half of the image are retained; the target text box is then determined, according to the size information of the text boxes, to be the text box containing "nine garden steamed stuffed buns"; text recognition is performed on that text box, and the store corresponding to the image is found to be named "nine garden steamed stuffed buns".

The following are device embodiments of the present disclosure, which can be used to carry out the method embodiments of the present disclosure.

Fig. 8 shows a structural block diagram of a mark recognition device according to an embodiment of the present disclosure. The device can be implemented as part or all of an electronic device through software, hardware or a combination of both. As shown in Fig. 8, the mark recognition device includes:

A detection module 801, configured to detect multiple text boxes in the image to be recognized;

A determining module 802, configured to obtain the size information of the text boxes and determine the target text box from the multiple text boxes according at least to the size information of the text boxes;

A recognition module 803, configured to recognize the mark of the target object from the target text box.

In an optional implementation of this embodiment, the device further includes:

An optimization module, configured to optimize the multiple text boxes according to a preset optimization mode.

A filtering module, configured to filter out, according to the size information of the text boxes, the text boxes that do not meet the first preset condition;

A merging module, configured to merge two intersecting text boxes that meet the second preset condition.

In an optional implementation of this embodiment, the filtering module includes:

A filtering sub-module, configured to filter out the text boxes whose area is less than the first preset threshold.

In an optional implementation of this embodiment, the determining module 802 includes:

A sorting sub-module, configured to sort the text boxes by width and by height respectively to obtain two ranking results;

A first determining sub-module, configured to determine the target text box according to the two ranking results.

In an optional implementation of this embodiment, as shown in Fig. 9, the determining module 802 includes:

A sorting sub-module 901, configured to sort the text boxes by width and by height respectively to obtain two ranking results;

A first determining sub-module 902, configured to determine the target text box according to the two ranking results.

In an optional implementation of this embodiment, as shown in Fig. 10, the first determining sub-module 902 includes:

A second determining sub-module 1001, configured to determine, when the two ranking results contain first text boxes with the same rank, one of the foremost-ranked first text box, the widest first text box and the tallest first text box among the first text boxes as the target text box;

A third determining sub-module 1002, configured to determine, when the two ranking results contain no first text box with the same rank, one of the widest second text box and the tallest third text box as the target text box.

In an optional implementation of this embodiment, as shown in Fig. 11, the second determining sub-module 1001 includes:

A fourth determining sub-module 1101, configured to determine the foremost-ranked first text box as the candidate text box if the height of the foremost-ranked first text box is greater than or equal to the average height of the multiple text boxes;

A fifth determining sub-module 1102, configured to determine the tallest third text box as the candidate text box if the height of the foremost-ranked first text box is less than the average height of the multiple text boxes and the width of the tallest third text box is greater than or equal to the average width of the multiple text boxes;

A sixth determining sub-module 1103, configured to determine the widest second text box as the candidate text box if the height of the foremost-ranked first text box is less than the average height of the multiple text boxes and the width of the tallest third text box is less than the average width of the multiple text boxes;

A seventh determining sub-module 1104, configured to determine the target text box according to the distance between the upper edge of the candidate text box and the top of the image to be recognized.

In an optional implementation of this embodiment, as shown in Fig. 12, the seventh determining sub-module 1104 includes:

An eighth determining sub-module 1201, configured to determine the candidate text box as the target text box when the distance between the upper edge of the candidate text box and the top of the image to be recognized is less than or equal to the second preset threshold and the candidate text box is the uppermost text box in the image to be recognized;

A ninth determining sub-module 1202, configured to determine, when the distance between the upper edge of the candidate text box and the top of the image to be recognized is less than or equal to the second preset threshold and the candidate text box is not the uppermost text box in the image to be recognized, the tallest text box among the candidate text box and the fourth text boxes located above the candidate text box as the target text box;

A tenth determining sub-module 1203, configured to determine the widest second text box as the target text box when the distance between the upper edge of the candidate text box and the top of the image to be recognized is greater than the second preset threshold and the widest second text box is the uppermost text box in the image to be recognized;

An eleventh determining sub-module 1204, configured to determine, when the distance between the upper edge of the candidate text box and the top of the image to be recognized is greater than the second preset threshold and the widest second text box is not the uppermost text box in the image to be recognized, the tallest text box among the widest second text box and the fifth text boxes located above the widest second text box as the target text box.

In an optional implementation of this embodiment, as shown in Fig. 13, the third determining sub-module 1002 includes:

A twelfth determining sub-module 1301, configured to determine the widest second text box as the target text box if the ratio of the width to the height of the widest second text box is less than or equal to the third preset threshold;

A thirteenth determining sub-module 1302, configured to determine the widest second text box as the target text box if the ratio of the width to the height of the widest second text box is greater than the third preset threshold and the distance between the upper edge of the tallest third text box and the top of the image to be recognized is greater than the fourth preset threshold;

A fourteenth determining sub-module 1303, configured to determine the tallest third text box as the target text box if the distance between the upper edge of the tallest third text box and the top of the image to be recognized is less than or equal to the fourth preset threshold.

In an optional implementation of this embodiment, the detection module 801 includes:

A detection sub-module, configured to detect the multiple text boxes in the image to be recognized using the first artificial intelligence model; wherein the first artificial intelligence model has been trained in advance with sample data.

In an optional implementation of this embodiment, the recognition module 803 includes:

A recognition sub-module, configured to recognize the mark of the target object from the target text box using the second artificial intelligence model; wherein the second artificial intelligence model has been trained in advance with sample data.

An obtaining module, configured to obtain a background image set and a text set; wherein the background image set includes one or more background images generated using different colors, and/or one or more background images cropped from existing images; the text set includes one or more texts generated using different colors and/or different fonts;

A generating module, configured to generate the sample data according to the background image set and the text set; wherein the sample data includes at least one background image in the background image set and at least one text in the text set.

An embodiment of the present disclosure further provides an electronic device which, as shown in Fig. 14, includes at least one processor 1401 and a memory 1402 communicatively connected to the at least one processor 1401; wherein the memory 1402 stores instructions executable by the at least one processor 1401, and the instructions are executed by the at least one processor 1401 to implement:

Detecting multiple text boxes in the image to be recognized;

Recognizing the mark of the target object from the target text box.

Wherein, the one or more computer instructions are further executed by the processor to implement the following method steps:

Wherein, filtering out, according to the dimension information of the text boxes, the text boxes that do not meet a first preset condition comprises:

Filtering out the text boxes whose area is less than a first preset threshold.

Wherein, the dimension information of a text box includes the height and/or width of the text box.

Wherein, determining the target text box from the multiple text boxes according at least to the dimension information of the text boxes comprises:

Determining the target text box according to the two ranking results.

Wherein, determining the target text box according to the two ranking results comprises:

Wherein, when first text boxes with the same rank exist in the two ranking results, determining, among the first text boxes, one of the top-ranked first text box, the widest first text box, and the tallest first text box as the target text box comprises:

Wherein, determining the target text box according to the distance between the upper edge of the candidate text box and the top of the image to be recognized comprises:

Wherein, when no first text box with the same rank exists in the two ranking results, determining one of the widest second text box and the tallest third text box as the target text box comprises:

Wherein, detecting multiple text boxes in the image to be recognized comprises:

Wherein, identifying the mark of the target object from the target text box comprises:

Specifically, the processor 1401 and the memory 1402 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 14. As a non-volatile computer-readable storage medium, the memory 1402 can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. By running the non-volatile software programs, instructions, and modules stored in the memory 1402, the processor 1401 executes the various functional applications and data processing of the device, that is, implements the above method in the embodiments of the present disclosure.

The memory 1402 may include a program storage area and a data storage area, wherein the program storage area can store an operating system and an application program required by at least one function, and the data storage area can store historical data of shipping network transportation and the like. In addition, the memory 1402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the electronic device optionally includes a communication component 1403, and the memory 1402 optionally includes memories located remotely from the processor 1401; these remote memories may be connected to an external device over a network through the communication component 1403. Examples of such a network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

One or more modules are stored in the memory 1402 and, when executed by the one or more processors 1401, perform the above method in the embodiments of the present disclosure.
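The stored method (detect text boxes, select the target box, identify the mark) can be wired together as in the following sketch. The callable types stand in for the first and second network models and for the size-based selection; their names and signatures, as well as the `identify_mark` function, are hypothetical and not an interface defined by this disclosure.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]             # (x, y, width, height), assumed layout
Detector = Callable[[object], List[Box]]    # stands in for the first network model
Selector = Callable[[Sequence[Box]], Box]   # size-based choice of the target text box
Recognizer = Callable[[object, Box], str]   # stands in for the second network model


def identify_mark(image: object, detect: Detector, select: Selector,
                  recognize: Recognizer) -> str:
    """Run the three steps in order and return the recognized mark text."""
    boxes = detect(image)
    if not boxes:
        # Behavior for an image without text is not specified; raising is an assumption.
        raise ValueError("no text boxes detected")
    target = select(boxes)
    return recognize(image, target)
```

Keeping the three stages as separate callables lets each model be swapped or stubbed independently, for example when testing the selection logic without a trained detector.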

The above product can perform the method provided by the embodiments of the present disclosure, and has the functional modules and beneficial effects corresponding to performing the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present disclosure.

The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units or modules described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units or modules may also be provided in a processor, and the names of these units or modules do not, under certain circumstances, constitute a limitation on the units or modules themselves.

In another aspect, the present disclosure further provides a computer-readable storage medium. The computer-readable storage medium may be the one included in the device described in the above embodiments, or it may exist separately without being assembled into a device. The computer-readable storage medium stores one or more programs, which are used by one or more processors to execute the methods described in the present disclosure.

The above description is only a preferred embodiment of the present disclosure and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.

Claims

1. A mark identification method, characterized by comprising:

Detecting multiple text boxes in an image to be recognized;

Acquiring dimension information of the text boxes, and determining a target text box from the multiple text boxes according at least to the dimension information of the text boxes;

Identifying a mark of a target object from the target text box.

2. The method according to claim 1, characterized in that the method further comprises:

3. The method according to claim 2, characterized in that filtering out, according to the dimension information of the text boxes, the text boxes that do not meet the first preset condition comprises:

Filtering out the text boxes whose area is less than a first preset threshold.

4. The method according to any one of claims 1-3, characterized in that the dimension information of a text box includes the height and/or width of the text box.

5. The method according to any one of claims 1-3, characterized in that determining the target text box from the multiple text boxes according at least to the dimension information of the text boxes comprises:

Determining the target text box according to the two ranking results.

6. The method according to claim 5, characterized in that determining the target text box according to the two ranking results comprises:

When first text boxes with the same rank exist in the two ranking results, determining, among the first text boxes, one of the top-ranked first text box, the widest first text box, and the tallest first text box as the target text box;

When no first text box with the same rank exists in the two ranking results, determining one of the widest second text box and the tallest third text box as the target text box.

7. The method according to claim 6, characterized in that, when first text boxes with the same rank exist in the two ranking results, determining, among the first text boxes, one of the top-ranked first text box, the widest first text box, and the tallest first text box as the target text box comprises:

If the height of the top-ranked first text box is greater than or equal to the average height of the multiple text boxes, determining the top-ranked first text box as a candidate text box;

If the height of the top-ranked first text box is less than the average height of the multiple text boxes, and the width of the tallest third text box is greater than or equal to the average width of the multiple text boxes, determining the tallest third text box as the candidate text box;

If the height of the top-ranked first text box is less than the average height of the multiple text boxes, and the width of the tallest third text box is less than the average width of the multiple text boxes, determining the widest second text box as the candidate text box;

Determining the target text box according to the distance between the upper edge of the candidate text box and the top of the image to be recognized.

8. A mark identification device, characterized by comprising:

A determining module, configured to acquire dimension information of the text boxes and determine the target text box from the multiple text boxes according at least to the dimension information of the text boxes;

9. An electronic device, characterized by comprising a memory and a processor, wherein

The memory is used to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the following method steps:

Detecting multiple text boxes in an image to be recognized;

Identifying the mark of a target object from the target text box.

10. A computer-readable storage medium having computer instructions stored thereon, characterized in that the computer instructions, when executed by a processor, implement the method according to any one of claims 1-7.
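The area filter of claim 3 and the candidate selection of claims 6 and 7 can be sketched as follows. This is a hedged reading rather than a definitive implementation: it assumes the two ranking results are descending orderings by height and by width, that a "first text box" is a box occupying the same position in both orderings, and that ties are broken by input order. The `Box` type and the function names are hypothetical, and the case where no box shares a rank is left out (the sketch returns `None`).

```python
from typing import List, NamedTuple, Optional


class Box(NamedTuple):
    x: float
    y: float        # upper edge, measured from the image top (assumed layout)
    width: float
    height: float


def filter_small(boxes: List[Box], min_area: float) -> List[Box]:
    """Claim 3: drop boxes whose area is below the first preset threshold."""
    return [b for b in boxes if b.width * b.height >= min_area]


def pick_candidate(boxes: List[Box]) -> Optional[Box]:
    """Claim 7: choose the candidate text box from the two rankings."""
    if not boxes:
        return None
    by_height = sorted(boxes, key=lambda b: b.height, reverse=True)
    by_width = sorted(boxes, key=lambda b: b.width, reverse=True)
    # "First text boxes": boxes holding the same position in both rankings.
    same_rank = [h for h, w in zip(by_height, by_width) if h is w]
    if not same_rank:
        return None  # the no-shared-rank branch is not covered by this sketch
    top_first = same_rank[0]                     # best-ranked among the same-rank boxes
    tallest, widest = by_height[0], by_width[0]  # "third" and "second" text boxes
    avg_height = sum(b.height for b in boxes) / len(boxes)
    avg_width = sum(b.width for b in boxes) / len(boxes)
    if top_first.height >= avg_height:
        return top_first
    if tallest.width >= avg_width:
        return tallest
    return widest
```

The returned candidate would still need the final check against its distance from the top of the image, which the sketch does not perform.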