CN108427950A - Text line detection method and apparatus - Google Patents

Text line detection method and apparatus

Info

Publication number
CN108427950A
CN108427950A (application CN201810102229.4A; granted as CN108427950B)
Authority
CN
China
Prior art keywords
text line
inclination
layer
matrix
angle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810102229.4A
Other languages
Chinese (zh)
Other versions
CN108427950B (en)
Inventor
高大帅
李健
张连毅
武卫东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING INFOQUICK SINOVOICE SPEECH TECHNOLOGY CORP
Beijing Sinovoice Technology Co Ltd
Original Assignee
BEIJING INFOQUICK SINOVOICE SPEECH TECHNOLOGY CORP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING INFOQUICK SINOVOICE SPEECH TECHNOLOGY CORP filed Critical BEIJING INFOQUICK SINOVOICE SPEECH TECHNOLOGY CORP
Priority to CN201810102229.4A priority Critical patent/CN108427950B/en
Publication of CN108427950A publication Critical patent/CN108427950A/en
Application granted granted Critical
Publication of CN108427950B publication Critical patent/CN108427950B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/245Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present invention provides a text line detection method and apparatus. In the embodiments of the present invention, a preset YOLO model is used to simultaneously detect the position of a text line in an image under detection, the tilt angle of the text line, the orientation (upright or inverted) of the characters the text line contains, and the script of those characters. The embodiments of the present invention do not use adaptive binarization to extract the characters in the image, thereby avoiding the loss of text line detection accuracy caused by illumination or shadows, and do not use a bag-of-words feature classifier to determine the text direction and script within a text line, thereby avoiding the loss of detection accuracy caused by that classifier's weak generalization ability. Because the generalization ability of the YOLO model of the embodiments of the present invention is better than that of a bag-of-words feature classifier, the embodiments of the present invention can improve text line detection accuracy compared with the prior art.

Description

Text line detection method and apparatus
Technical field
The present invention relates to the field of computer technology, and in particular to a text line detection method and apparatus.
Background art
At present, many scenarios require detecting text in images. For example, an image of an identity card, a vehicle license, a driver's license, or a business card is captured, and text information such as a name, a number, or an address is then detected in the image. Each piece of text information comprises multiple characters arranged in a line: for example, the name "Zhang San" contains two Chinese characters, an identity card number contains 18 digits, and an address contains more than two Chinese characters.
Each such piece of text information forms a text line composed of multiple characters. When the text information in an image needs to be recognized, the text line usually needs to be located in the image first, and OCR (Optical Character Recognition) technology is then used to recognize the text information in the text line.
The prior art provides a text line detection method including: extracting the characters in an image using adaptive binarization; generating text lines from the sizes and positions of the characters using a clustering-based page-layout analysis method; and then determining the text direction and script within each text line using a bag-of-words feature classifier.
However, in the course of implementing the embodiments of the present invention, the inventors found the following defects in the prior art:
First, when adaptive binarization is used to extract the characters in an image, illumination or shadows often cause characters to be lost, or cause non-text noise to be included among the extracted characters. The detected text lines may therefore not fully match the actual text lines in the image, reducing text line detection accuracy.
Second, the generalization ability of a bag-of-words feature classifier is limited by the size of its dictionary and the corresponding feature vectors. It is an order-free classification method that cannot characterize the structural information of the image, so its generalization ability is relatively low, which reduces text line detection accuracy.
Invention content
To solve the above technical problems, the embodiments of the present invention provide a text line detection method and apparatus.
In a first aspect, an embodiment of the present invention provides a text line detection method, the method including:
obtaining a preset YOLO model, where the YOLO model includes 24 convolution stacks; a complete convolution stack includes a convolutional layer, a pooling layer, batch normalization, and an activation layer; the YOLO model has 4 complete convolution stacks and 20 convolution stacks containing only a convolutional layer and an activation layer; the activation function of each convolution stack is a rectified linear unit (ReLU), and residual skip connections are used between convolution stacks; the YOLO model further includes 8 output convolutional layers, the 8 output convolutional layers comprising 1 confidence layer, 4 text line coordinate layers, 1 text line tilt angle layer, 1 text line orientation layer, and 1 text line script layer;
inputting the image under detection into the YOLO model, and obtaining the matrices output by the YOLO model at the 8 output convolutional layers respectively;
determining, from the matrices output by the 8 output convolutional layers respectively, the position of the text line in the image under detection, the tilt angle of the text line, the orientation of the characters the text line contains, and the script of the characters the text line contains.
In an optional implementation, determining, from the matrices output by the 8 output convolutional layers respectively, the position of the text line in the image under detection, the tilt angle of the text line, the orientation of the characters the text line contains, and the script of the characters the text line contains includes:
parsing the matrix output by the confidence layer to obtain a confidence score;
judging whether the confidence score is greater than a first preset threshold;
if the confidence score is greater than the first preset threshold, parsing the matrices output by the 4 coordinate layers respectively and parsing the matrix output by the tilt angle layer, to obtain a predicted rectangular box containing the text line;
determining the position of the text line in the image under detection and the tilt angle of the text line according to the predicted rectangular box;
parsing the matrix output by the orientation layer to obtain the orientation of the characters contained in the text line in the image under detection;
parsing the matrix output by the script layer to obtain the script of the characters contained in the text line in the image under detection.
In an optional implementation, after parsing the matrices output by the 4 coordinate layers respectively and parsing the matrix output by the tilt angle layer, 1 predicted rectangular box containing the text line is obtained;
determining the position of the text line in the image under detection and the tilt angle of the text line according to the predicted rectangular box includes:
determining the position of the predicted rectangular box in the image under detection as the position of the text line in the image under detection;
determining the tilt angle of the predicted rectangular box as the tilt angle of the text line.
In an optional implementation, after parsing the matrices output by the 4 coordinate layers respectively and parsing the matrix output by the tilt angle layer, multiple predicted rectangular boxes containing the text line are obtained;
determining the position of the text line in the image under detection and the tilt angle of the text line according to the predicted rectangular boxes includes:
selecting two predicted rectangular boxes from the multiple predicted rectangular boxes;
calculating the area of the intersection between the two predicted rectangular boxes;
calculating the sum of the areas of the two predicted rectangular boxes;
calculating the ratio between the area of the intersection and the sum of the areas of the two predicted rectangular boxes;
judging whether the ratio is greater than a second preset threshold;
if the ratio is greater than the second preset threshold, determining, among the two predicted rectangular boxes, the position of the predicted rectangular box with the larger confidence score as the position of the text line, and determining the tilt angle of the predicted rectangular box with the larger confidence score as the tilt angle of the text line.
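As an illustrative sketch only, the pairwise selection rule described here — intersection area divided by the sum of the two box areas, keeping the higher-confidence box when the ratio exceeds the second preset threshold — can be expressed as follows. The axis-aligned (x1, y1, x2, y2) box representation and the helper names are assumptions for illustration; the patent's predicted boxes additionally carry a tilt angle.

```python
def intersection_area(a, b):
    # Boxes as axis-aligned (x1, y1, x2, y2) tuples; the patent's boxes
    # may be rotated, so this is a simplification for illustration.
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)

def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])

def select_box(box_a, conf_a, box_b, conf_b, second_threshold):
    """If the intersection-over-area-sum ratio exceeds the threshold,
    keep the box with the larger confidence score; otherwise signal
    that the two boxes should be merged (None here)."""
    ratio = intersection_area(box_a, box_b) / (area(box_a) + area(box_b))
    if ratio > second_threshold:
        return box_a if conf_a >= conf_b else box_b
    return None  # caller merges the two boxes instead
```

Note that this ratio divides by the sum of the two areas rather than by their union, so it is not the conventional IoU; a perfectly overlapping pair yields a ratio of 0.5 rather than 1.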
In a second aspect, an embodiment of the present invention provides a text line detection apparatus, the apparatus including:
an acquisition module, configured to obtain a preset YOLO model, where the YOLO model includes 24 convolution stacks; a complete convolution stack includes a convolutional layer, a pooling layer, batch normalization, and an activation layer; the YOLO model has 4 complete convolution stacks and 20 convolution stacks containing only a convolutional layer and an activation layer; the activation function of each convolution stack is a rectified linear unit (ReLU), and residual skip connections are used between convolution stacks; the YOLO model further includes 8 output convolutional layers, the 8 output convolutional layers comprising 1 confidence layer, 4 text line coordinate layers, 1 text line tilt angle layer, 1 text line orientation layer, and 1 text line script layer;
an input module, configured to input the image under detection into the YOLO model and obtain the matrices output by the YOLO model at the 8 output convolutional layers respectively;
a determining module, configured to determine, from the matrices output by the 8 output convolutional layers respectively, the position of the text line in the image under detection, the tilt angle of the text line, the orientation of the characters the text line contains, and the script of the characters the text line contains.
In an optional implementation, the determining module includes:
a first parsing unit, configured to parse the matrix output by the confidence layer to obtain a confidence score;
a judging unit, configured to judge whether the confidence score is greater than a first preset threshold;
a second parsing unit, configured to, if the confidence score is greater than the first preset threshold, parse the matrices output by the 4 coordinate layers respectively and parse the matrix output by the tilt angle layer, to obtain a predicted rectangular box containing the text line;
a determining unit, configured to determine the position of the text line in the image under detection and the tilt angle of the text line according to the predicted rectangular box;
a third parsing unit, configured to parse the matrix output by the orientation layer to obtain the orientation of the characters contained in the text line in the image under detection;
a fourth parsing unit, configured to parse the matrix output by the script layer to obtain the script of the characters contained in the text line in the image under detection.
In an optional implementation, after parsing the matrices output by the 4 coordinate layers respectively and parsing the matrix output by the tilt angle layer, 1 predicted rectangular box containing the text line is obtained;
the determining unit includes:
a first determining subunit, configured to determine the position of the predicted rectangular box in the image under detection as the position of the text line in the image under detection;
a second determining subunit, configured to determine the tilt angle of the predicted rectangular box as the tilt angle of the text line.
In an optional implementation, after parsing the matrices output by the 4 coordinate layers respectively and parsing the matrix output by the tilt angle layer, multiple predicted rectangular boxes containing the text line are obtained;
the determining unit includes:
a selecting subunit, configured to select two predicted rectangular boxes from the multiple predicted rectangular boxes;
a first computing subunit, configured to calculate the area of the intersection between the two predicted rectangular boxes;
a second computing subunit, configured to calculate the sum of the areas of the two predicted rectangular boxes;
a third computing subunit, configured to calculate the ratio between the area of the intersection and the sum of the areas of the two predicted rectangular boxes;
a judging subunit, configured to judge whether the ratio is greater than a second preset threshold;
a third determining subunit, configured to, if the ratio is greater than the second preset threshold, determine, among the two predicted rectangular boxes, the position of the predicted rectangular box with the larger confidence score as the position of the text line, and determine the tilt angle of the predicted rectangular box with the larger confidence score as the tilt angle of the text line.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the program, implements the steps of the text line detection method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the text line detection method according to the first aspect.
Compared with the prior art, the embodiments of the present invention have the following advantages:
In the embodiments of the present invention, a preset YOLO model is used to simultaneously detect the position of a text line in the image under detection, the tilt angle of the text line, the orientation of the characters the text line contains, and the script of those characters. The embodiments do not use adaptive binarization to extract the characters in the image, thereby avoiding the loss of text line detection accuracy caused by illumination or shadows, and do not use a bag-of-words feature classifier to determine the text direction and script within a text line, thereby avoiding the loss of detection accuracy caused by that classifier's weak generalization ability. Because the generalization ability of the YOLO model of the embodiments of the present invention is better than that of a bag-of-words feature classifier, the embodiments of the present invention can improve text line detection accuracy compared with the prior art.
Brief description of the drawings
Fig. 1 is a flowchart of the steps of an embodiment of a text line detection method of the present invention;
Fig. 2 is a schematic diagram of a text line of the present invention;
Fig. 3 is a schematic diagram of a text line of the present invention;
Fig. 4 is a structural block diagram of an embodiment of a text line detection apparatus of the present invention.
Detailed description of embodiments
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific implementations.
Referring to Fig. 1, which shows a flowchart of the steps of an embodiment of a text line detection method of the present invention, the method may specifically include the following steps:
In step S101, a preset YOLO model is obtained. The YOLO model includes 24 convolution stacks; a complete convolution stack includes a convolutional layer, a pooling layer, batch normalization, and an activation layer. The YOLO model has 4 complete convolution stacks and 20 convolution stacks containing only a convolutional layer and an activation layer; the activation function of each convolution stack is a rectified linear unit (ReLU), and residual skip connections are used between convolution stacks. The YOLO model further includes 8 output convolutional layers: 1 confidence layer, 4 text line coordinate layers, 1 text line tilt angle layer, 1 text line orientation layer, and 1 text line script layer.
In the YOLO model, the image under detection may be divided into a 16*16 grid, or alternatively into a 32*32 grid or an 8*8 grid; the embodiments of the present invention do not limit this.
In the embodiments of the present invention, a large number of images of a uniform size need to be synthesized in advance, and the YOLO model is then trained on these images, for example using adaptive stochastic gradient descent, with an initial learning rate of 0.00002 and 800 training epochs. To enhance scale robustness, the image size is doubled during one training round. After convergence, the model is fine-tuned on 2000 annotated images. A deep learning framework such as Theano may be used. The trained YOLO model is then optimized using the designed loss function, and the resulting YOLO model is stored locally as the preset YOLO model.
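The training hyperparameters described above can be collected into a plain configuration fragment. The dictionary keys below are illustrative names only, not an API of any particular framework; the values are the ones stated in the description.

```python
# Training configuration from the description above; key names are
# illustrative, not part of any framework's API.
train_config = {
    "optimizer": "adaptive stochastic gradient descent",
    "initial_learning_rate": 0.00002,
    "epochs": 800,
    "scale_augmentation_factor": 2,   # image size doubled in one round
    "finetune_images": 2000,          # annotated images for fine-tuning
    "framework": "theano",            # example framework named in the text
}
```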
The total loss is a weighted combination of the classification and regression losses: the loss of the position of the text line in the image, the loss of the tilt angle of the text line, the loss of the orientation of the characters the text line contains, and the loss of the script of the characters the text line contains, for example:
Loss = l_obj + 0.1 * l_nonObj + 5 * l_bnd + l_ori + l_script
where l_obj and l_nonObj are the classification losses for the presence or absence of text, corresponding to the 1 confidence layer among the 8 output convolutional layers; l_bnd is the regression loss of the minimum rotated bounding rectangle, corresponding to the 4 text line coordinate layers and the 1 text line tilt angle layer among the 8 output convolutional layers; l_ori is the character orientation loss, corresponding to the 1 text line orientation layer among the 8 output convolutional layers; and l_script is the script loss, corresponding to the 1 text line script layer among the 8 output convolutional layers.
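The weighted combination above can be sketched as a simple function. The per-term loss values here are stand-in numbers; in practice each term would be computed from the corresponding output layer.

```python
def total_loss(l_obj, l_nonObj, l_bnd, l_ori, l_script):
    # Weighted combination from the example formula:
    # Loss = l_obj + 0.1 * l_nonObj + 5 * l_bnd + l_ori + l_script
    return l_obj + 0.1 * l_nonObj + 5 * l_bnd + l_ori + l_script
```

The heavy weight (5) on the bounding-rectangle term prioritizes localization, while the light weight (0.1) on the no-text term keeps the many empty grid cells from dominating the loss.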
In step S102, the image under detection is input into the YOLO model, and the matrices output by the YOLO model at the 8 output convolutional layers respectively are obtained.
In step S103, the position of the text line in the image under detection, the tilt angle of the text line, the orientation of the characters the text line contains, and the script of the characters the text line contains are determined from the matrices output by the 8 output convolutional layers respectively.
In the embodiments of the present invention, the image under detection is rectangular. A text line includes multiple characters arranged in a line, and the angle between the line segment connecting the center points of the characters and the horizontal edge of the image under detection is the tilt angle of the text line. The scripts of the characters contained in a text line include Chinese, English, Japanese, Korean, Latin, Russian, and so on. The orientation of the characters contained in a text line indicates whether the characters are upright or inverted; for example, the characters of the text line shown in Fig. 2 are upright, and the characters of the text line shown in Fig. 3 are inverted.
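The tilt angle defined here — the angle between the segment through the character center points and the image's horizontal edge — can be sketched as follows. Using only the first and last center points is a simplifying assumption; the patent defines the angle geometrically without fixing a fitting method.

```python
import math

def tilt_angle_degrees(centers):
    """Angle, in degrees, between the segment joining the first and last
    character centers and the horizontal axis. `centers` is a list of
    (x, y) tuples in image coordinates; using only the endpoints is a
    simplification of the geometric definition in the text."""
    (x0, y0), (x1, y1) = centers[0], centers[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))
```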
Specifically, the matrix output by the confidence layer may be parsed to obtain a confidence score, and whether the confidence score is greater than the first preset threshold is then judged. If the confidence score is greater than the first preset threshold, the matrices output by the 4 coordinate layers respectively are parsed and the matrix output by the tilt angle layer is parsed, to obtain a predicted rectangular box containing the text line; the position of the text line in the image under detection and the tilt angle of the text line are then determined according to the predicted rectangular box. The matrix output by the orientation layer is then parsed to obtain the orientation of the characters contained in the text line, and the matrix output by the script layer is parsed to obtain the script of the characters contained in the text line.
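For illustration only, the parsing order just described can be sketched per grid cell as follows. The flat output dictionary and its key names are assumptions, since the patent does not specify how the 8 output matrices are laid out.

```python
def decode_outputs(outputs, first_threshold):
    """Decode the 8 output-layer values for one grid cell.
    `outputs` maps assumed layer names to scalars for that cell.
    Returns None when the confidence score does not exceed the threshold."""
    confidence = outputs["confidence"]
    if confidence <= first_threshold:
        return None  # no text line predicted at this cell
    box = (outputs["x"], outputs["y"], outputs["w"], outputs["h"])
    return {
        "box": box,                       # from the 4 coordinate layers
        "tilt_angle": outputs["angle"],   # from the tilt angle layer
        "orientation": outputs["orient"], # upright vs inverted
        "script": outputs["script"],      # language/script class
        "confidence": confidence,
    }
```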
In one embodiment of the present invention, if, after parsing the matrices output by the 4 coordinate layers respectively and parsing the matrix output by the tilt angle layer, 1 predicted rectangular box containing the text line is obtained, then determining the position of the text line in the image under detection and the tilt angle of the text line according to the predicted rectangular box may be: determining the position of the predicted rectangular box in the image under detection as the position of the text line in the image under detection, and determining the tilt angle of the predicted rectangular box as the tilt angle of the text line.
In another embodiment of the present invention, if, after parsing the matrices output by the 4 coordinate layers respectively and parsing the matrix output by the tilt angle layer, multiple predicted rectangular boxes containing the text line are obtained, then determining the position of the text line in the image under detection and the tilt angle of the text line according to the predicted rectangular boxes may be: selecting two predicted rectangular boxes from the multiple predicted rectangular boxes; calculating the area of the intersection between the two predicted rectangular boxes; calculating the sum of the areas of the two predicted rectangular boxes; calculating the ratio between the area of the intersection and the sum of the areas of the two predicted rectangular boxes; and then judging whether the ratio is greater than the second preset threshold. If the ratio is greater than the second preset threshold, then, among the two predicted rectangular boxes, the position of the predicted rectangular box with the larger confidence score is determined as the position of the text line, and the tilt angle of the predicted rectangular box with the larger confidence score is determined as the tilt angle of the text line.
If the ratio is less than or equal to the second preset threshold, the two predicted rectangular boxes are merged into a new predicted rectangular box, for example by creating a new predicted rectangular box of minimum area that contains both of the two predicted rectangular boxes at the same time. Another predicted rectangular box is then selected from the remaining predicted rectangular boxes among the multiple predicted rectangular boxes, and the above operations are continued with the new predicted rectangular box and the selected predicted rectangular box; the detailed process is not described here.
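The merge step described here — replacing two boxes whose ratio does not exceed the threshold with the minimum rectangle enclosing both — can be sketched as follows. Axis-aligned (x1, y1, x2, y2) boxes are a simplification for illustration; the patent's boxes additionally carry a tilt angle.

```python
def merge_boxes(a, b):
    # Minimum axis-aligned rectangle containing both input boxes,
    # with boxes given as (x1, y1, x2, y2) tuples.
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))
```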
In the embodiments of the present invention, a preset YOLO model is used to simultaneously detect the position of a text line in the image under detection, the tilt angle of the text line, the orientation of the characters the text line contains, and the script of those characters. The embodiments do not use adaptive binarization to extract the characters in the image, thereby avoiding the loss of text line detection accuracy caused by illumination or shadows, and do not use a bag-of-words feature classifier to determine the text direction and script within a text line, thereby avoiding the loss of detection accuracy caused by that classifier's weak generalization ability. Because the generalization ability of the YOLO model of the embodiments of the present invention is better than that of a bag-of-words feature classifier, the embodiments of the present invention can improve text line detection accuracy compared with the prior art.
It should be noted that the method embodiments are described as a series of action combinations for simplicity of description. However, those skilled in the art should understand that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention, certain steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 4, which shows a structural block diagram of an embodiment of a text line detection apparatus of the present invention, the apparatus may specifically include the following modules:
an acquisition module 11, configured to obtain a preset YOLO model, where the YOLO model includes 24 convolution stacks; a complete convolution stack includes a convolutional layer, a pooling layer, batch normalization, and an activation layer; the YOLO model has 4 complete convolution stacks and 20 convolution stacks containing only a convolutional layer and an activation layer; the activation function of each convolution stack is a rectified linear unit (ReLU), and residual skip connections are used between convolution stacks; the YOLO model further includes 8 output convolutional layers, the 8 output convolutional layers comprising 1 confidence layer, 4 text line coordinate layers, 1 text line tilt angle layer, 1 text line orientation layer, and 1 text line script layer;
an input module 12, configured to input the image under detection into the YOLO model and obtain the matrices output by the YOLO model at the 8 output convolutional layers respectively;
a determining module 13, configured to determine, from the matrices output by the 8 output convolutional layers respectively, the position of the text line in the image under detection, the tilt angle of the text line, the orientation of the characters the text line contains, and the script of the characters the text line contains.
In an optional realization method, the determining module 13 includes:
A first parsing unit, configured to parse the matrix output by the confidence layer to obtain a confidence score;
A judging unit, configured to judge whether the confidence score is greater than a first preset threshold;
A second parsing unit, configured to, if the confidence score is greater than the first preset threshold, parse the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer, to obtain a prediction rectangle frame containing a text line;
A determining unit, configured to determine, according to the prediction rectangle frame, the position of the text line in the image to be detected and the inclination angle of the text line;
A third parsing unit, configured to parse the matrix output by the forward/reverse orientation layer to obtain the forward/reverse orientation of the characters included in the text line in the image to be detected;
A fourth parsing unit, configured to parse the matrix output by the language layer to obtain the language of the characters included in the text line in the image to be detected.
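The sequence of parsing units above can be sketched as one decoding routine per grid cell. The box encoding (cx, cy, w, h), the 0.5 thresholds, and the head names are illustrative assumptions, not the patent's specification:

```python
import numpy as np

def decode_cell(heads, cell, conf_threshold=0.5):
    """Decode one grid cell following the parsing units above:
    confidence first, then (only above the first preset threshold)
    the box, inclination angle, orientation, and language."""
    y, x = cell
    confidence = float(heads["confidence"][0, y, x])
    if confidence <= conf_threshold:            # judging unit
        return None
    cx, cy, w, h = heads["position"][:, y, x]   # 4 position layers
    return {
        "box": (float(cx), float(cy), float(w), float(h)),
        "angle": float(heads["angle"][0, y, x]),
        "forward": bool(heads["orientation"][0, y, x] >= 0.5),
        "language": int(np.argmax(heads["language"][:, y, x])),
        "confidence": confidence,
    }

# Toy example: one confident cell in a 2x2 grid.
heads = {name: np.zeros((c, 2, 2))
         for name, c in [("confidence", 1), ("position", 4),
                         ("angle", 1), ("orientation", 1), ("language", 1)]}
heads["confidence"][0, 0, 1] = 0.9
det = decode_cell(heads, (0, 1))
```

Cells whose confidence score does not exceed the first preset threshold yield no prediction rectangle frame.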
In an optional implementation, after the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer are parsed, 1 prediction rectangle frame containing a text line is obtained;
The determining unit includes:
A first determining subunit, configured to determine the location of the prediction rectangle frame in the image to be detected as the position of the text line in the image to be detected;
A second determining subunit, configured to determine the inclination angle of the prediction rectangle frame as the inclination angle of the text line.
In an optional implementation, after the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer are parsed, multiple prediction rectangle frames each containing a text line are obtained;
The determining unit includes:
A selecting subunit, configured to select two prediction rectangle frames from the multiple prediction rectangle frames;
A first computing subunit, configured to calculate the area of the intersection between the two prediction rectangle frames;
A second computing subunit, configured to calculate the sum of the areas of the two prediction rectangle frames;
A third computing subunit, configured to calculate the ratio between the area of the intersection and the sum of the areas of the two prediction rectangle frames;
A judging subunit, configured to judge whether the ratio is greater than a second preset threshold;
A third determining subunit, configured to, if the ratio is greater than the second preset threshold, determine, among the two prediction rectangle frames, the location of the prediction rectangle frame with the highest confidence score as the position of the text line, and determine the inclination angle of the prediction rectangle frame with the highest confidence score as the inclination angle of the text line.
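The overlap test performed by the computing and judging subunits can be sketched as follows. Note that, as described above, the ratio is the intersection area divided by the sum of the two areas, not the standard intersection-over-union; the axis-aligned (x1, y1, x2, y2) box encoding and the 0.3 threshold are assumptions for illustration:

```python
def rect_area(rect):
    # rect is (x1, y1, x2, y2) with x1 < x2 and y1 < y2; an axis-aligned
    # simplification — the patent's frames also carry an inclination angle.
    return (rect[2] - rect[0]) * (rect[3] - rect[1])

def intersection_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def suppress_pair(box_a, box_b, threshold=0.3):
    """If intersection / (area_a + area_b) exceeds the second preset
    threshold, keep only the frame with the higher confidence score.
    Each box is (rect, confidence, angle); the encoding is assumed."""
    rect_a, conf_a, _ = box_a
    rect_b, conf_b, _ = box_b
    ratio = intersection_area(rect_a, rect_b) / (
        rect_area(rect_a) + rect_area(rect_b))
    if ratio > threshold:
        return [box_a if conf_a >= conf_b else box_b]
    return [box_a, box_b]

# Two heavily overlapping frames: only the higher-confidence one survives.
kept = suppress_pair(((0, 0, 10, 10), 0.9, 5.0), ((1, 1, 11, 11), 0.6, 7.0))
```

Applying this pairwise test over all selected frame pairs yields the final set of text line positions and inclination angles.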
In the embodiments of the present invention, a preset YOLO model is used to simultaneously detect the position of the text line in the image to be detected, the inclination angle of the text line, the forward/reverse orientation of the characters included in the text line, and the language of those characters. The embodiments of the present invention extract the text in the image without using an adaptive binarization method, thereby avoiding the reduction in text line detection accuracy caused by illumination or shadows, and determine the character orientation and language without using feature-based classifiers, thereby avoiding the reduction in detection accuracy caused by their poor generalization ability. The generalization ability of the YOLO model in the embodiments of the present invention is better than that of feature-based classifiers; therefore, compared with the prior art, the embodiments of the present invention can improve the detection accuracy of text lines.
As for the device embodiments, since they are basically similar to the method embodiments, the description is relatively simple; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts between the embodiments may refer to each other.
Those skilled in the art will appreciate that the embodiments of the present invention may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present invention may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing terminal equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal equipment produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal equipment to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal equipment, such that a series of operational steps are performed on the computer or other programmable terminal equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the embodiments of the present invention have been described, those skilled in the art, once apprised of the basic inventive concept, may make additional changes and modifications to these embodiments. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should be noted that, herein, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between such entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Absent further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The text line detection method and device provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core concept. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementations and the scope of application according to the concept of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (10)

1. A text line detection method, characterized in that the method comprises:
obtaining a preset YOLO model, wherein the YOLO model includes 24 convolution stacks; a complete convolution stack includes a convolutional layer, a pooling layer, batch normalization, and an activation layer; the YOLO model contains 4 complete convolution stacks and 20 convolution stacks containing only a convolutional layer and an activation layer; the activation function of each convolution stack is the rectified linear unit (ReLU), and residual skip connections are used between convolution stacks; the YOLO model further includes 8 output convolutional layers, the 8 output convolutional layers including 1 confidence layer, 4 position layers of the text line, 1 inclination angle layer of the text line, 1 forward/reverse orientation layer of the text line, and 1 language layer of the text line;
inputting the image to be detected into the YOLO model, and obtaining the matrices respectively output by the 8 output convolutional layers of the YOLO model;
determining, according to the matrices respectively output by the 8 output convolutional layers, the position of the text line in the image to be detected, the inclination angle of the text line, the forward/reverse orientation of the characters included in the text line, and the language of the characters included in the text line.
2. The method according to claim 1, characterized in that the determining, according to the matrices respectively output by the 8 output convolutional layers, the position of the text line in the image to be detected, the inclination angle of the text line, the forward/reverse orientation of the characters included in the text line, and the language of the characters included in the text line comprises:
parsing the matrix output by the confidence layer to obtain a confidence score;
judging whether the confidence score is greater than a first preset threshold;
if the confidence score is greater than the first preset threshold, parsing the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer, to obtain a prediction rectangle frame containing a text line;
determining, according to the prediction rectangle frame, the position of the text line in the image to be detected and the inclination angle of the text line;
parsing the matrix output by the forward/reverse orientation layer to obtain the forward/reverse orientation of the characters included in the text line in the image to be detected;
parsing the matrix output by the language layer to obtain the language of the characters included in the text line in the image to be detected.
3. The method according to claim 2, characterized in that after the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer are parsed, 1 prediction rectangle frame containing a text line is obtained;
the determining, according to the prediction rectangle frame, the position of the text line in the image to be detected and the inclination angle of the text line comprises:
determining the location of the prediction rectangle frame in the image to be detected as the position of the text line in the image to be detected;
determining the inclination angle of the prediction rectangle frame as the inclination angle of the text line.
4. The method according to claim 2, characterized in that after the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer are parsed, multiple prediction rectangle frames each containing a text line are obtained;
the determining, according to the prediction rectangle frame, the position of the text line in the image to be detected and the inclination angle of the text line comprises:
selecting two prediction rectangle frames from the multiple prediction rectangle frames;
calculating the area of the intersection between the two prediction rectangle frames;
calculating the sum of the areas of the two prediction rectangle frames;
calculating the ratio between the area of the intersection and the sum of the areas of the two prediction rectangle frames;
judging whether the ratio is greater than a second preset threshold;
if the ratio is greater than the second preset threshold, determining, among the two prediction rectangle frames, the location of the prediction rectangle frame with the highest confidence score as the position of the text line, and determining the inclination angle of the prediction rectangle frame with the highest confidence score as the inclination angle of the text line.
5. A text line detection device, characterized in that the device comprises:
an acquisition module, configured to obtain a preset YOLO model, wherein the YOLO model includes 24 convolution stacks; a complete convolution stack includes a convolutional layer, a pooling layer, batch normalization, and an activation layer; the YOLO model contains 4 complete convolution stacks and 20 convolution stacks containing only a convolutional layer and an activation layer; the activation function of each convolution stack is the rectified linear unit (ReLU), and residual skip connections are used between convolution stacks; the YOLO model further includes 8 output convolutional layers, the 8 output convolutional layers including 1 confidence layer, 4 position layers of the text line, 1 inclination angle layer of the text line, 1 forward/reverse orientation layer of the text line, and 1 language layer of the text line;
an input module, configured to input the image to be detected into the YOLO model, and obtain the matrices respectively output by the 8 output convolutional layers of the YOLO model;
a determining module, configured to determine, according to the matrices respectively output by the 8 output convolutional layers, the position of the text line in the image to be detected, the inclination angle of the text line, the forward/reverse orientation of the characters included in the text line, and the language of the characters included in the text line.
6. The device according to claim 5, characterized in that the determining module includes:
a first parsing unit, configured to parse the matrix output by the confidence layer to obtain a confidence score;
a judging unit, configured to judge whether the confidence score is greater than a first preset threshold;
a second parsing unit, configured to, if the confidence score is greater than the first preset threshold, parse the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer, to obtain a prediction rectangle frame containing a text line;
a determining unit, configured to determine, according to the prediction rectangle frame, the position of the text line in the image to be detected and the inclination angle of the text line;
a third parsing unit, configured to parse the matrix output by the forward/reverse orientation layer to obtain the forward/reverse orientation of the characters included in the text line in the image to be detected;
a fourth parsing unit, configured to parse the matrix output by the language layer to obtain the language of the characters included in the text line in the image to be detected.
7. The device according to claim 6, characterized in that after the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer are parsed, 1 prediction rectangle frame containing a text line is obtained;
the determining unit includes:
a first determining subunit, configured to determine the location of the prediction rectangle frame in the image to be detected as the position of the text line in the image to be detected;
a second determining subunit, configured to determine the inclination angle of the prediction rectangle frame as the inclination angle of the text line.
8. The device according to claim 6, characterized in that after the matrices respectively output by the 4 position layers and the matrix output by the inclination angle layer are parsed, multiple prediction rectangle frames each containing a text line are obtained;
the determining unit includes:
a selecting subunit, configured to select two prediction rectangle frames from the multiple prediction rectangle frames;
a first computing subunit, configured to calculate the area of the intersection between the two prediction rectangle frames;
a second computing subunit, configured to calculate the sum of the areas of the two prediction rectangle frames;
a third computing subunit, configured to calculate the ratio between the area of the intersection and the sum of the areas of the two prediction rectangle frames;
a judging subunit, configured to judge whether the ratio is greater than a second preset threshold;
a third determining subunit, configured to, if the ratio is greater than the second preset threshold, determine, among the two prediction rectangle frames, the location of the prediction rectangle frame with the highest confidence score as the position of the text line, and determine the inclination angle of the prediction rectangle frame with the highest confidence score as the inclination angle of the text line.
9. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the text line detection method according to any one of claims 1 to 4.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the text line detection method according to any one of claims 1 to 4.
CN201810102229.4A 2018-02-01 2018-02-01 Character line detection method and device Active CN108427950B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810102229.4A CN108427950B (en) 2018-02-01 2018-02-01 Character line detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810102229.4A CN108427950B (en) 2018-02-01 2018-02-01 Character line detection method and device

Publications (2)

Publication Number Publication Date
CN108427950A true CN108427950A (en) 2018-08-21
CN108427950B CN108427950B (en) 2021-02-19

Family

ID=63156322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810102229.4A Active CN108427950B (en) 2018-02-01 2018-02-01 Character line detection method and device

Country Status (1)

Country Link
CN (1) CN108427950B (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109409363A (en) * 2018-10-13 2019-03-01 长沙芯希电子科技有限公司 The reverse judgement of text image based on content and bearing calibration
CN109508710A (en) * 2018-10-23 2019-03-22 东华大学 Based on the unmanned vehicle night-environment cognitive method for improving YOLOv3 network
CN110135411A (en) * 2019-04-30 2019-08-16 北京邮电大学 Business card identification method and device
CN110163205A (en) * 2019-05-06 2019-08-23 网易有道信息技术(北京)有限公司 Image processing method, device, medium and calculating equipment
CN110211048A (en) * 2019-05-28 2019-09-06 湖北华中电力科技开发有限责任公司 A kind of complicated archival image Slant Rectify method based on convolutional neural networks
CN110674811A (en) * 2019-09-04 2020-01-10 广东浪潮大数据研究有限公司 Image recognition method and device
CN110751232A (en) * 2019-11-04 2020-02-04 哈尔滨理工大学 Chinese complex scene text detection and identification method
CN111062374A (en) * 2019-12-10 2020-04-24 爱信诺征信有限公司 Identification method, device, system, equipment and readable medium of identity card information
CN111353491A (en) * 2020-03-12 2020-06-30 中国建设银行股份有限公司 Character direction determining method, device, equipment and storage medium
CN111797838A (en) * 2019-04-08 2020-10-20 上海怀若智能科技有限公司 Blind denoising system, method and device for picture documents
CN112418238A (en) * 2020-12-09 2021-02-26 安徽吉秒科技有限公司 Image character recognition method and device
CN112651399A (en) * 2020-12-30 2021-04-13 中国平安人寿保险股份有限公司 Method for detecting same-line characters in oblique image and related equipment thereof
CN112766266A (en) * 2021-01-29 2021-05-07 云从科技集团股份有限公司 Text direction correction method, system and device based on staged probability statistics
CN113313117A (en) * 2021-06-25 2021-08-27 北京奇艺世纪科技有限公司 Method and device for recognizing text content
CN113785305A (en) * 2019-05-05 2021-12-10 华为技术有限公司 Method, device and equipment for detecting inclined characters

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105574513A (en) * 2015-12-22 2016-05-11 北京旷视科技有限公司 Character detection method and device
CN105809164A (en) * 2016-03-11 2016-07-27 北京旷视科技有限公司 Character identification method and device
CN107609560A (en) * 2017-09-27 2018-01-19 北京小米移动软件有限公司 Character recognition method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105574513A (en) * 2015-12-22 2016-05-11 北京旷视科技有限公司 Character detection method and device
CN105809164A (en) * 2016-03-11 2016-07-27 北京旷视科技有限公司 Character identification method and device
CN107609560A (en) * 2017-09-27 2018-01-19 北京小米移动软件有限公司 Character recognition method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JOSEPH REDMON et al.: "YOLO9000: Better, Faster, Stronger", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) *
JOSEPH REDMON et al.: "You Only Look Once: Unified, Real-Time Object Detection", arXiv preprint *
DING Mingyu et al.: "Recognition of product parameters in images based on deep learning", Journal of Software *
YE Hu: "Principle and implementation of the YOLO algorithm", http://www.dataguru.cn/article-12966-1.html *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109409363B (en) * 2018-10-13 2021-11-12 长沙芯希电子科技有限公司 Content-based text image inversion judgment and correction method
CN109409363A (en) * 2018-10-13 2019-03-01 长沙芯希电子科技有限公司 The reverse judgement of text image based on content and bearing calibration
CN109508710A (en) * 2018-10-23 2019-03-22 东华大学 Based on the unmanned vehicle night-environment cognitive method for improving YOLOv3 network
CN111797838A (en) * 2019-04-08 2020-10-20 上海怀若智能科技有限公司 Blind denoising system, method and device for picture documents
CN110135411B (en) * 2019-04-30 2021-09-10 北京邮电大学 Business card recognition method and device
CN110135411A (en) * 2019-04-30 2019-08-16 北京邮电大学 Business card identification method and device
CN113785305B (en) * 2019-05-05 2024-04-16 华为云计算技术有限公司 Method, device and equipment for detecting inclined characters
CN113785305A (en) * 2019-05-05 2021-12-10 华为技术有限公司 Method, device and equipment for detecting inclined characters
CN110163205A (en) * 2019-05-06 2019-08-23 网易有道信息技术(北京)有限公司 Image processing method, device, medium and calculating equipment
CN110211048B (en) * 2019-05-28 2020-06-16 国家电网有限公司 Complex archive image tilt correction method based on convolutional neural network
CN110211048A (en) * 2019-05-28 2019-09-06 湖北华中电力科技开发有限责任公司 A kind of complicated archival image Slant Rectify method based on convolutional neural networks
CN110674811A (en) * 2019-09-04 2020-01-10 广东浪潮大数据研究有限公司 Image recognition method and device
CN110751232A (en) * 2019-11-04 2020-02-04 哈尔滨理工大学 Chinese complex scene text detection and identification method
CN111062374A (en) * 2019-12-10 2020-04-24 爱信诺征信有限公司 Identification method, device, system, equipment and readable medium of identity card information
CN111353491A (en) * 2020-03-12 2020-06-30 中国建设银行股份有限公司 Character direction determining method, device, equipment and storage medium
CN111353491B (en) * 2020-03-12 2024-04-26 中国建设银行股份有限公司 Text direction determining method, device, equipment and storage medium
CN112418238A (en) * 2020-12-09 2021-02-26 安徽吉秒科技有限公司 Image character recognition method and device
CN112651399A (en) * 2020-12-30 2021-04-13 中国平安人寿保险股份有限公司 Method for detecting same-line characters in oblique image and related equipment thereof
CN112651399B (en) * 2020-12-30 2024-05-14 中国平安人寿保险股份有限公司 Method for detecting same-line characters in inclined image and related equipment thereof
CN112766266A (en) * 2021-01-29 2021-05-07 云从科技集团股份有限公司 Text direction correction method, system and device based on staged probability statistics
CN113313117A (en) * 2021-06-25 2021-08-27 北京奇艺世纪科技有限公司 Method and device for recognizing text content
CN113313117B (en) * 2021-06-25 2023-07-25 北京奇艺世纪科技有限公司 Method and device for identifying text content

Also Published As

Publication number Publication date
CN108427950B (en) 2021-02-19

Similar Documents

Publication Publication Date Title
CN108427950A (en) A kind of literal line detection method and device
CN112685565B (en) Text classification method based on multi-mode information fusion and related equipment thereof
CN107291822B (en) Problem classification model training method, classification method and device based on deep learning
CN115035538B (en) Training method of text recognition model, and text recognition method and device
KR20190126347A (en) Efficient Image Analysis Using Sensor Data
CN111488826A (en) Text recognition method and device, electronic equipment and storage medium
CN111709406A (en) Text line identification method and device, readable storage medium and electronic equipment
CN111242291A (en) Neural network backdoor attack detection method and device and electronic equipment
US20230137337A1 (en) Enhanced machine learning model for joint detection and multi person pose estimation
CN115422389B (en) Method and device for processing text image and training method of neural network
CN116311214B (en) License plate recognition method and device
CN114863429A (en) Text error correction method and training method based on RPA and AI and related equipment thereof
KR20210065076A (en) Method, apparatus, device, and storage medium for obtaining document layout
WO2021237227A1 (en) Method and system for multi-language text recognition model with autonomous language classification
CN113033660A (en) Universal language detection method, device and equipment
CN114639087A (en) Traffic sign detection method and device
CN110825874A (en) Chinese text classification method and device and computer readable storage medium
CN115578739A (en) Training method and device for realizing IA classification model by combining RPA and AI
US20230036812A1 (en) Text Line Detection
CN113204665A (en) Image retrieval method, image retrieval device, electronic equipment and computer-readable storage medium
CN110688511A (en) Fine-grained image retrieval method and device, computer equipment and storage medium
CN113033518B (en) Image detection method, image detection device, electronic equipment and storage medium
CN114140802B (en) Text recognition method and device, electronic equipment and storage medium
US11769323B2 (en) Generating assistive indications based on detected characters
CN115761752A (en) Natural scene text detection model training method and device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant