WO2022002262A1 - Method and apparatus for character sequence recognition based on computer vision, device, and medium - Google Patents

Method and apparatus for character sequence recognition based on computer vision, device, and medium

Info

Publication number
WO2022002262A1
WO2022002262A1 (PCT/CN2021/104308, CN2021104308W)
Authority
WO
WIPO (PCT)
Prior art keywords
character sequence
image
target area
character
horizontal
Prior art date
Application number
PCT/CN2021/104308
Other languages
English (en)
Chinese (zh)
Inventor
杨志成
李睿宇
Original Assignee
深圳思谋信息科技有限公司
上海思谋科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳思谋信息科技有限公司, 上海思谋科技有限公司 filed Critical 深圳思谋信息科技有限公司
Priority to JP2022564797A priority Critical patent/JP7429307B2/ja
Publication of WO2022002262A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/242Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Definitions

  • the present application relates to a computer vision-based character sequence recognition method, apparatus, computer equipment and storage medium.
  • in current practice, the process of recognizing a character sequence is to first detect the position of the character sequence, crop the character sequence at the detected position, and finally judge the angle of the cropped character sequence image and recognize it to obtain the corresponding text content; alternatively, character sequences are detected as a special kind of target by a classifier and aggregated into words based on an image structure model; or a neural network algorithm is used to establish a mapping from image features to character sequence positions and contents in order to identify character sequences.
  • a first aspect of the present application provides a computer vision-based character sequence recognition method, the method comprising:
  • the horizontal target area image is input into a pre-built content recognition model to obtain the character sequence content corresponding to the character sequence to be recognized.
  • a second aspect of the present application provides a computer vision-based character sequence recognition device, the device comprising:
  • an image acquisition module for acquiring an image carrying a character sequence to be recognized
  • a position detection module configured to obtain an image of the target area where the character sequence to be recognized is located in the image based on a pre-built position detection model
  • a horizontal correction module for performing horizontal correction on the target area image to obtain a horizontal target area image
  • an angle judgment module for obtaining the placement state of the character sequence of the horizontal target area image based on a pre-built angle judgment model
  • a content recognition module configured to input the horizontal target area image into a pre-built content recognition model if the character sequence is in an upright state, and obtain the character sequence content corresponding to the character sequence to be recognized.
  • a third aspect of the present application provides a computer device including a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the above method when executing the computer program.
  • a fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the steps of the above-mentioned method.
  • FIG. 1 is a schematic flowchart of a method for character sequence recognition based on computer vision in one embodiment.
  • FIG. 2 is a schematic flowchart of acquiring the placement state of the character sequence of the horizontal target area image based on a pre-built angle judgment model in one embodiment.
  • FIG. 3 is a schematic flowchart of acquiring an image of a target area where a character sequence to be recognized in an image is located based on a pre-built position detection model in one embodiment.
  • FIG. 4 is a schematic flowchart of inputting a horizontal target area image into a pre-built content recognition model to obtain the content of the character sequence corresponding to the character sequence to be recognized in one embodiment.
  • FIG. 5 is a schematic flowchart of a computer vision-based character sequence recognition method in another embodiment.
  • FIG. 6 is a schematic flowchart of algorithm training and prediction processing in an application example.
  • FIG. 7 is a schematic structural diagram of an image feature pyramid in an application example.
  • FIG. 8 is a schematic flowchart of a character sequence angle judgment algorithm in an application example.
  • FIG. 9 is a schematic flowchart of a character sequence content recognition algorithm in an application example.
  • FIG. 10 is a structural block diagram of an apparatus for character sequence recognition based on computer vision in one embodiment.
  • Figure 11 is a diagram of the internal structure of a computer device in one embodiment.
  • the current character sequence recognition methods are all based on low-dimensional handcrafted features, and lack the ability to adapt to changes in image shooting angles in industrial scenes. Therefore, the current character sequence recognition methods have low recognition accuracy for character sequences.
  • a computer vision-based character sequence recognition method is provided.
  • the method is described as applied to a terminal for illustration. It can be understood that the method can also be applied to a server, or to a system including a terminal and a server, where it is realized through the interaction between the terminal and the server.
  • the method includes the following steps:
  • Step S101 the terminal acquires an image carrying the character sequence to be recognized.
  • the character sequence to be recognized refers to a character sequence that the user needs to obtain from an image, and the image may be one captured in an industrial scene. Specifically, the user can record images carrying the character sequence to be recognized in different scenarios with a mobile phone camera, a video capture device or the like, and store them in the terminal, so that the terminal can obtain an image carrying the character sequence to be recognized.
  • Step S102 the terminal acquires an image of a target area where the character sequence to be recognized in the image is located based on a pre-built position detection model.
  • the position detection model is mainly used to detect the position area of the character to be recognized in the image, and the target area image refers to the image of the position area of the character sequence to be recognized in the image.
  • the terminal can use a pre-built position detection model to perform character sequence position detection on an image carrying the character sequence to be recognized, so as to determine the target area image where the character sequence to be recognized is located.
  • Step S103 the terminal performs horizontal correction on the target area image to obtain a horizontal target area image.
  • since images of character sequences to be recognized are often photographed by users at different shooting angles, in the image carrying the character sequence to be recognized obtained by the terminal, the character sequence is often not arranged horizontally but presented at a certain angle to the horizontal. Therefore, in order to improve the accuracy of character sequence recognition, after obtaining the target area image in step S102, the terminal first needs to perform horizontal correction on the target area image to obtain a horizontal target area image, in which the character sequence to be recognized is arranged along the horizontal direction. Specifically, the terminal may perform an affine transformation on the target area image to complete the horizontal correction, thereby obtaining the horizontal target area image.
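The description does not spell out the affine transformation; as a minimal numpy sketch (the function name and the simple rotation-about-center model are assumptions, similar in spirit to OpenCV's `getRotationMatrix2D`), the horizontal-correction matrix for a text line tilted by a known angle could be built as:

```python
import numpy as np

def horizontal_correction_matrix(angle_deg, center):
    """Build a 2x3 affine matrix that rotates points by -angle_deg about
    `center`, bringing a text line tilted by angle_deg back to horizontal.
    Illustrative only; the patent's actual affine transform is not given."""
    theta = np.deg2rad(-angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    cx, cy = center
    # rotation about (cx, cy): translate to origin, rotate, translate back
    return np.array([
        [cos_t, -sin_t, cx - cos_t * cx + sin_t * cy],
        [sin_t,  cos_t, cy - sin_t * cx - cos_t * cy],
    ])

# a box corner at 45 degrees from the centre maps back to the horizontal axis
M = horizontal_correction_matrix(45.0, center=(0.0, 0.0))
p = M @ np.array([1.0, 1.0, 1.0])   # homogeneous point (1, 1)
print(p)  # p is approximately (1.414, 0): back on the horizontal axis
```

In practice the matrix would be applied to the whole target area image (e.g. with `cv2.warpAffine`) rather than to single points.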
  • Step S104 based on the pre-built angle judgment model, the terminal acquires the placement state of the character sequence of the horizontal target area image.
  • the placement state of the character sequence in the obtained horizontal target area image may be an upright state or an inverted state, and different placement states will affect the final character sequence recognition result. Therefore, after the terminal obtains the horizontal target area image, it first needs to determine the placement state of the character sequence in that image. Specifically, the terminal may input the horizontal target area image into the pre-built angle judgment model to determine the placement state of the character sequence of the horizontal target area image.
  • Step S105 if the placement state of the character sequence is the upright state, the terminal inputs the horizontal target area image into the pre-built content recognition model and obtains the content of the character sequence corresponding to the character sequence to be recognized.
  • the terminal can directly input the horizontal target area image into the pre-built content recognition model, which is mainly used to recognize the character sequence in the target area image. Therefore, the terminal can use the content recognition model to obtain the character sequence content corresponding to the character sequence to be recognized.
  • in summary, the terminal acquires an image carrying the character sequence to be recognized; based on a pre-built position detection model, acquires an image of the target area where the character sequence to be recognized is located; performs horizontal correction on the target area image to obtain a horizontal target area image; based on a pre-built angle judgment model, obtains the placement state of the character sequence of the horizontal target area image; and, if the placement state is the upright state, inputs the horizontal target area image into the pre-built content recognition model to obtain the content of the character sequence corresponding to the character sequence to be recognized.
  • the image of the target area is horizontally corrected by the terminal, so as to realize the adaptive processing of the change of the image shooting angle in the industrial scene, thereby improving the accuracy of character sequence recognition.
  • step S104 includes:
  • Step S201 based on the angle judgment model, the terminal acquires the placement angle of the horizontal target area image.
  • the angle judgment model is mainly used to determine the placement angle of the horizontal target area image. Since the placement of the character sequence mainly results from the angle at which the user originally photographed the image, the terminal can use the angle judgment model to determine the placement angle of the horizontal target area image, and use that angle to determine the placement state of the character sequence.
  • Step S202 the terminal determines the placement state of the character sequence according to the placement angle interval in which the placement angle is located.
  • after the terminal determines the placement angle through the angle judgment model in step S201, it can select, from a preset table of placement angle intervals, the interval that contains the placement angle as the placement angle interval in which the angle is located, and determine the placement state of the character sequence from that interval.
  • the placement angle interval may include a first angle interval and a second angle interval
  • the character sequence placement state may include an upright state and an inverted state
  • step S202 may further include: if the placement angle interval is the first angle interval, the terminal determines that the placement state of the character sequence is the upright state; if the placement angle interval is the second angle interval, the terminal determines that the placement state of the character sequence is the inverted state.
  • the first angle interval and the second angle interval are two different angle intervals, used respectively to represent the two placement states of the character sequence. Specifically, if the placement angle of the horizontal target area image obtained by the terminal falls within the first angle interval, the terminal may determine that the horizontal target area image is in the upright state; if it falls within the second angle interval, the terminal may determine that the horizontal target area image is in the inverted state.
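The two interval boundaries are not specified in the description. A minimal sketch of the interval lookup, with assumed boundaries chosen only for illustration, could look like this:

```python
def placement_state(angle_deg):
    """Map the placement angle of a horizontally corrected character image
    to a placement state. The interval boundaries used here are assumed;
    the application does not fix their exact values."""
    a = angle_deg % 360.0
    # first angle interval: roughly upright (within 90 degrees of 0)
    if a <= 90.0 or a >= 270.0:
        return "upright"
    # second angle interval: roughly upside down
    return "inverted"

print(placement_state(10), placement_state(185))  # → upright inverted
```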
  • the horizontal target area image is rotated to an upright state and then input to the content recognition model to obtain the content of the character sequence.
  • if the terminal directly inputs an inverted horizontal target area image into the content recognition model, the content of the character sequence obtained by the model may deviate from the actual character content. Therefore, before the horizontal target area image is input into the content recognition model, it needs to be rotated to the upright state first; for example, the image can be rotated 180° around its center. The rotated horizontal target area image is then input into the content recognition model to obtain the character sequence content of the character sequence to be recognized.
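The 180° rotation about the image center amounts to reversing both image axes; a one-line numpy sketch (function name assumed):

```python
import numpy as np

def rotate_180(img):
    """Rotate an image array 180 degrees about its centre by flipping both
    spatial axes; used when the angle judgment reports an inverted sequence."""
    return img[::-1, ::-1]

img = np.arange(6).reshape(2, 3)
out = rotate_180(img)
# rotating twice restores the original image
print(np.array_equal(rotate_180(out), img))  # → True
```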
  • in this way, the terminal can obtain the placement angle of the horizontal target area image through the angle judgment model and then determine the placement state of the character sequence; if the placement state is the inverted state, the terminal can rotate the horizontal target area image into the upright state and input it into the content recognition model to obtain the character sequence content, which is beneficial for further improving the accuracy of the obtained character sequence content.
  • step S102 includes:
  • Step S301 using the position detection model, the terminal extracts the image features of the character area from the image.
  • the character area image feature refers to the image feature used to determine the position of the character sequence.
  • the terminal may extract the above-mentioned character region image features from the obtained image of the character sequence to be recognized by using a position detection model.
  • Step S302 the terminal acquires the prediction mask of the target area image according to the character area image feature.
  • the mask refers to using a selected image, figure or object to occlude (all or part of) the image being processed, so as to control the area or process of image processing.
  • the terminal may obtain a prediction mask corresponding to the image feature of the character region by using the image feature of the character region.
  • Step S303 the terminal obtains the connected domain and the minimum circumscribed rectangle of the prediction mask to obtain the target area image.
  • after the terminal obtains the prediction mask of the target area image in step S302, the mask can be processed to obtain the connected domain and the minimum circumscribed rectangle, yielding the target area image.
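As a rough sketch of this step (not the patent's implementation), the code below labels 4-connected foreground regions of a binary mask and returns a bounding box per region. It stands in for something like `cv2.connectedComponents` followed by `cv2.minAreaRect`, which would also recover rotated rectangles; here only axis-aligned boxes are computed to keep the example self-contained:

```python
import numpy as np
from collections import deque

def connected_boxes(mask):
    """Return one (x0, y0, x1, y1) bounding box per 4-connected foreground
    region of a binary mask, found by breadth-first flood fill."""
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y, x] and not seen[y, x]:
                q = deque([(y, x)])
                seen[y, x] = True
                ys, xs = [y], [x]
                while q:
                    cy, cx = q.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            ys.append(ny)
                            xs.append(nx)
                            q.append((ny, nx))
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

mask = np.zeros((5, 8), dtype=bool)
mask[1:3, 1:4] = True   # one character-sequence region
mask[4, 6:8] = True     # another
print(connected_boxes(mask))  # → [(1, 1, 3, 2), (6, 4, 7, 4)]
```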
  • step S301 may further include: the terminal preprocesses the image and extracts high-dimensional image features from the preprocessed image; the image feature pyramid is then used to perform a first feature enhancement process on the high-dimensional image features, which serve as the character area image features.
  • the preprocessing process may be that the terminal filters out small or difficult-to-recognize character sequence regions in the image carrying the character sequence to be recognized, so that the terminal can then extract high-dimensional image features from that image.
  • the terminal can also use the image feature pyramid to perform the first feature enhancement process on the extracted high-dimensional image features, which is beneficial to improve the feature expression ability of the image features in the character area, so that accurate features can be generated even in an environment with unclear features.
  • the terminal can extract the image features of the character area from the image and generate the corresponding prediction mask, and can also obtain an accurate target area image by processing the prediction mask to obtain the connected domain and the minimum circumscribed rectangle.
  • the terminal can perform the first feature enhancement process on the extracted image features through the image feature pyramid, so that the feature expression ability of the image features in the character area is increased and the accuracy of character sequence recognition can be further improved.
  • step S105 includes:
  • Step S401 the terminal uses the content recognition model to perform global image feature extraction on the horizontal target area image to obtain character sequence image features corresponding to the horizontal target area image.
  • the content recognition model is mainly used to recognize the character content of the character sequence to be recognized included in the horizontal target area image.
  • the terminal may use the content recognition model to perform global image feature extraction on the obtained horizontal target area image, so as to obtain character sequence image features corresponding to the horizontal target area image.
  • Step S402 the terminal uses a row vector convolution kernel to perform a second feature enhancement process on the character sequence image features along the horizontal direction.
  • the second feature enhancement processing refers to the feature enhancement processing performed on the character sequence image features. Specifically, after the character sequence image features are obtained in step S401, a row vector convolution kernel can be used to perform second feature enhancement processing on the character sequence image features along the horizontal direction, that is, along the direction of the character sequence.
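A 1×k (row-vector) kernel convolves only along the width axis, i.e. along the reading direction of the character sequence. The sketch below uses a fixed averaging kernel purely for illustration; in the model the kernel weights would be learned:

```python
import numpy as np

def horizontal_enhance(features, k=3):
    """Apply a 1 x k row-vector convolution along the width axis of a
    (H, W) feature map, with edge padding so the output keeps its shape.
    The averaging kernel is an illustrative stand-in for learned weights."""
    kernel = np.full(k, 1.0 / k)
    pad = k // 2
    padded = np.pad(features, ((0, 0), (pad, pad)), mode="edge")
    out = np.zeros_like(features, dtype=float)
    for i in range(k):
        out += kernel[i] * padded[:, i:i + features.shape[1]]
    return out

f = np.array([[1.0, 2.0, 3.0, 4.0]])
print(horizontal_enhance(f))  # each position mixes its horizontal neighbours
```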
  • Step S403 Based on the character sequence image features obtained by performing the second feature enhancement process, the terminal performs parallel prediction on the character sequence to be recognized to obtain the character sequence content.
  • the terminal can perform character sequence content recognition on the character sequence image features obtained by the second feature enhancement process.
  • the recognition process is parallel prediction, which can predict multiple characters of the sequence at the same time, so that efficient prediction of the content of character sequences can be achieved.
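One common way to realize such parallel prediction, sketched here as an assumption since the patent does not detail the decoder, is to take the argmax over the class axis at every horizontal position at once, with a blank class marking positions that carry no character (a simplified, CTC-like convention):

```python
import numpy as np

def parallel_decode(logits, charset):
    """Decode a character sequence from per-position class scores of shape
    (seq_len, num_classes), predicting all positions in parallel.
    Index 0 is assumed to be a blank (no character) class."""
    ids = logits.argmax(axis=1)           # all positions at once
    return "".join(charset[i] for i in ids if i != 0)

charset = ["", "A", "B", "7"]
logits = np.array([
    [0.1, 0.8, 0.05, 0.05],   # position 0: 'A'
    [0.9, 0.0, 0.05, 0.05],   # position 1: blank
    [0.1, 0.1, 0.1,  0.7],    # position 2: '7'
])
print(parallel_decode(logits, charset))  # → A7
```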
  • the terminal can accurately recognize the content of the character sequence through the content recognition model, and by performing the second feature enhancement process on the image features of the character sequence, the expressive ability of the features can be improved, so the accuracy of character sequence content recognition is improved; in addition, all character sequences are predicted by parallel prediction, which further improves the efficiency of character sequence content recognition.
  • a computer vision-based character sequence recognition method is provided.
  • the method is applied to a terminal for illustration.
  • the method includes the following steps:
  • Step S501 the terminal acquires an image carrying the character sequence to be recognized
  • Step S502 the terminal preprocesses the image, and extracts high-dimensional image features from the preprocessed image; uses the image feature pyramid to perform a first feature enhancement process on the high-dimensional image features, as character area image features;
  • Step S503 the terminal obtains the prediction mask of the target area image according to the character area image feature; the prediction mask is processed to obtain the connected domain and the minimum circumscribed rectangle to obtain the target area image;
  • Step S504 the terminal performs horizontal correction on the target area image to obtain a horizontal target area image
  • Step S505 the terminal obtains the placement angle of the horizontal target area image based on the angle judgment model
  • Step S506 if the placement angle interval is the first angle interval, the terminal determines that the placement state of the character sequence is the upright state; if the placement angle interval is the second angle interval, the terminal determines that the placement state is the inverted state;
  • Step S507 if the character sequence is in the upright state, the terminal inputs the horizontal target area image into the pre-built content recognition model; if the character sequence is in the inverted state, the terminal rotates the horizontal target area image to the upright state and then inputs it into the content recognition model;
  • Step S508 the terminal uses the content recognition model to perform global image feature extraction on the horizontal target area image to obtain character sequence image features corresponding to the horizontal target area image;
  • Step S509 the terminal uses a row vector convolution kernel to perform a second feature enhancement process on the character sequence image features along the horizontal direction;
  • Step S510 Based on the character sequence image features obtained by performing the second feature enhancement process, the terminal performs parallel prediction on the character sequence to be recognized to obtain the content of the character sequence.
  • the terminal performs horizontal correction on the image of the target area, so as to realize the adaptive processing of the change of the shooting angle of the image in the industrial scene, thereby improving the accuracy of character sequence recognition.
  • moreover, the terminal can obtain the placement angle of the horizontal target area image through the angle judgment model to determine the placement state of the character sequence, and if the placement state is the inverted state, the terminal can rotate the horizontal target area image into the upright state, which is beneficial for further improving the accuracy of the obtained character sequence content; the terminal can also use the image feature pyramid to perform the first feature enhancement process on the extracted high-dimensional image features, and perform the second feature enhancement process on the character sequence image features, which improves the expressive ability of the features and can further improve the accuracy of character sequence content recognition.
  • all character sequences are predicted by a parallel prediction method, which further improves the efficiency of character sequence content recognition.
  • a character sequence recognition algorithm for arbitrary angles in industrial scenes is also provided, which aims to efficiently solve the problems of missed recognition and misrecognition in current industrial character recognition algorithms under blur, illumination changes, angle changes and the like, making the recognition accuracy higher.
  • This application can be deployed in industrial environments with poor camera imaging conditions, while ensuring that the recognition algorithm is efficient and accurate, and supports the recognition of multi-angle and even inverted characters.
  • the algorithm training and prediction processing flow is shown in Figure 6; the main flow is divided into two processes, training and prediction. In the training process, three different models need to be trained, for detecting the position of the character sequence, judging the angle of the character sequence, and recognizing the content of the character sequence, respectively. In the prediction process, the trained models process the input test images in the order of position detection, angle judgment and content recognition, finally obtaining the character sequence position and the corresponding content.
  • the training sample is the entire sample image containing the character sequence, and the corresponding annotation is the position box of the character sequence in the image, which contains the position coordinate information of the character sequence, such as the upper left corner of the starting point and the lower right corner of the ending point of the character sequence. Due to the differences in scale and color distribution between different training samples, it is necessary to normalize the samples and filter out small or illegible character sequence position boxes.
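The filtering of small or illegible position boxes during preprocessing can be sketched as follows (the minimum-size threshold is an assumed value; the patent does not give one):

```python
def filter_boxes(boxes, min_side=8):
    """Drop character-sequence position boxes whose width or height is
    below min_side pixels, as part of training-data preprocessing.
    Boxes are (x0, y0, x1, y1); min_side is an illustrative threshold."""
    return [b for b in boxes
            if (b[2] - b[0]) >= min_side and (b[3] - b[1]) >= min_side]

boxes = [(0, 0, 100, 30), (5, 5, 9, 9)]   # second box is too small to read
print(filter_boxes(boxes))  # → [(0, 0, 100, 30)]
```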
  • the data after image preprocessing is used as the input of the character sequence position detection algorithm, in which features are enhanced by a deep neural network combined with an image feature pyramid structure, as shown in Figure 7, where conv denotes different convolutional layers and stride denotes different strides.
  • the features extracted at different scales are upsampled and added to the features obtained by the preceding network layers to obtain the final image features.
  • this feature retains both spatial information and semantic information at the same time.
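The upsample-and-add merge of the feature pyramid can be sketched in plain numpy (nearest-neighbour upsampling; the per-level 1×1 convolutions a real feature pyramid network would use are omitted here):

```python
import numpy as np

def upsample2x(f):
    """Nearest-neighbour 2x upsampling of a (H, W) feature map."""
    return f.repeat(2, axis=0).repeat(2, axis=1)

def pyramid_merge(features):
    """Merge multi-scale feature maps top-down, as in a feature pyramid:
    repeatedly upsample the coarser map and add it to the next finer one.
    `features` is ordered fine-to-coarse; a simplified illustrative sketch."""
    merged = features[-1]
    for finer in reversed(features[:-1]):
        merged = finer + upsample2x(merged)
    return merged

fine = np.ones((4, 4))     # high resolution: spatial information
coarse = np.ones((2, 2))   # low resolution: semantic information
result = pyramid_merge([fine, coarse])
print(result)  # a 4x4 map combining both levels (all values 2.0)
```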
  • the image features obtained by the position detection algorithm are used to predict the mask corresponding to the region of the final image character sequence. By obtaining the connected domain and the minimum circumscribed rectangle of the mask, the character sequence position box can be obtained.
  • a character sequence image corrected to horizontal can be obtained through an affine transformation. After correction to horizontal, however, the corrected character sequence cannot be guaranteed to be upright rather than inverted, due to the original shooting angle. Therefore, an angle judgment algorithm is added to determine whether the corrected character sequence is inverted: if it is inverted, it is rotated 180 degrees around its center; if it is upright, it is output directly without processing, ensuring that the final character sequence image is upright when output to the content recognition stage.
  • the deep neural network is used to learn the character sequence features.
  • a row vector is finally used as the convolution kernel on the extracted image features to perform feature enhancement along the horizontal direction, achieving parallel and efficient prediction of the content of character sequences.
  • for an input test image, the character sequence position detection algorithm is first used to detect the character sequence position; cropping and affine transformation are then performed on the detected image area, and the transformed cropped area is sent to the character sequence angle judgment algorithm. If the cropped area image is judged to be upside down, it is rotated 180 degrees around its center; if it is upright, it is not processed.
  • the image area processed by the character sequence position detection algorithm and the character sequence angle judgment algorithm is used as the input of the character sequence content recognition network. Through the content recognition network, the position of the character sequence in the image and the corresponding text content are finally obtained.
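The three-stage prediction flow described above can be sketched as a single orchestration function. The four callables are hypothetical stand-ins for the trained position-detection, horizontal-correction, angle-judgment and content-recognition stages:

```python
import numpy as np

def recognize(image, detect, correct, judge, read):
    """End-to-end prediction flow assumed from the description: detect the
    character-sequence region, correct it to horizontal, rotate it 180
    degrees if judged inverted, then recognize the content. All callables
    are illustrative stand-ins, not the patent's actual models."""
    region = detect(image)
    horizontal = correct(region)
    if judge(horizontal) == "inverted":
        horizontal = horizontal[::-1, ::-1]   # rotate 180 degrees
    return read(horizontal)

# toy stand-ins: identity detection/correction, always judged inverted,
# and "content" read off the top-left pixel after the flip
img = np.array([[1, 2], [3, 4]])
result = recognize(img, lambda x: x, lambda x: x,
                   lambda x: "inverted", lambda x: x[0, 0])
print(result)  # → 4
```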
  • the above application example uses a three-stage algorithm comprising a cascaded character sequence position detection algorithm, a character sequence angle judgment algorithm and a character sequence content recognition algorithm, finally obtaining an efficient character sequence recognition algorithm with stable performance under common industrial conditions such as changes in imaging sharpness, angle and illumination, laying a good foundation for character sequence recognition applications in industrial scenarios.
  • a computer vision-based character sequence recognition device is provided, including: an image acquisition module 1001, a position detection module 1002, a horizontal correction module 1003, an angle judgment module 1004 and a content recognition module 1005, wherein:
  • An image acquisition module 1001 configured to acquire an image carrying a character sequence to be recognized
  • the position detection module 1002 is used for obtaining the target area image where the character sequence to be recognized in the image is located based on the pre-built position detection model;
  • the horizontal correction module 1003 is used to perform horizontal correction on the target area image to obtain a horizontal target area image
  • the angle judgment module 1004 is used for obtaining the placement state of the character sequence of the horizontal target area image based on the pre-built angle judgment model
  • the content recognition module 1005 is configured to input the horizontal target area image into the pre-built content recognition model if the character sequence is in the upright state, and obtain the character sequence content corresponding to the character sequence to be recognized.
  • the angle judgment module 1004 is further configured to obtain the placement angle of the horizontal target area image based on the angle judgment model, and to determine the placement state of the character sequence according to the placement angle interval in which the placement angle is located.
  • the swing angle interval includes a first angle interval and a second angle interval;
  • the character sequence placement state includes an upright state and an inverted state;
  • the angle determination module 1004 is further configured to determine that the placement state of the character sequence is the upright state if the swing angle interval is the first angle interval, and to determine that the placement state of the character sequence is the inverted state if the swing angle interval is the second angle interval.
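The interval-to-state mapping described above can be sketched as follows. The numeric bounds of the first angle interval are illustrative assumptions, since the patent does not fix concrete values:

```python
def placement_state(swing_angle):
    """Map a predicted swing angle (degrees) to a placement state via two intervals.
    First interval (assumed here: [-90, 90)) -> upright; second interval -> inverted."""
    first_interval = (-90.0, 90.0)  # illustrative bounds, not specified by the patent
    if first_interval[0] <= swing_angle < first_interval[1]:
        return "upright"
    return "inverted"  # the second angle interval covers the remaining angles
```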
  • the content recognition module 1005 is further configured to, if the character sequence is in the inverted state, rotate the horizontal target area image to the upright state and input it into the content recognition model to obtain the character sequence content.
  • the position detection module 1002 is further configured to extract character region image features from the image using the position detection model; obtain a prediction mask of the target area image according to the character region image features; and perform connected-domain and minimum circumscribed rectangle processing on the prediction mask to obtain the target area image.
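The connected-domain step can be sketched with a dependency-free flood fill over the binary prediction mask. In a real implementation this would typically be `cv2.connectedComponents` followed by `cv2.minAreaRect` to obtain the rotated minimum circumscribed rectangle; this sketch returns an axis-aligned bounding box instead to stay self-contained:

```python
from collections import deque

def largest_component_bbox(mask):
    """Label connected foreground regions of a binary mask (4-connectivity, BFS)
    and return the bounding box (top, left, bottom, right) of the largest one,
    or None if the mask is empty."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = None  # (top, left, bottom, right, pixel_count) of the largest component so far
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                queue = deque([(sy, sx)])
                seen[sy][sx] = True
                ys, xs = [], []
                while queue:  # flood-fill one connected domain
                    y, x = queue.popleft()
                    ys.append(y)
                    xs.append(x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                box = (min(ys), min(xs), max(ys), max(xs), len(ys))
                if best is None or box[4] > best[4]:
                    best = box
    return best and best[:4]  # drop the pixel count; None if no foreground
```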
  • the position detection module 1002 is further configured to preprocess the image, extract high-dimensional image features from the preprocessed image, and perform a first feature enhancement process on the high-dimensional image features using an image feature pyramid to obtain the character region image features.
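A minimal sketch of one image-feature-pyramid merge step, assuming the standard pattern of upsampling the coarse high-dimensional map and adding it element-wise to the finer map; real feature pyramids also interpose 1×1 and 3×3 convolutions, which are omitted here:

```python
import numpy as np

def fpn_merge(fine, coarse):
    """One feature-pyramid enhancement step: nearest-neighbour upsample of the
    coarse (high-dimensional) feature map to the fine map's resolution, then
    element-wise addition. Assumes the fine map's sides are integer multiples
    of the coarse map's sides."""
    fh, fw = fine.shape
    up = coarse.repeat(fh // coarse.shape[0], axis=0).repeat(fw // coarse.shape[1], axis=1)
    return fine + up
```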
  • the content recognition module 1005 is further configured to perform global image feature extraction on the horizontal target area image using the content recognition model to obtain character sequence image features corresponding to the horizontal target area image; perform a second feature enhancement process on the character sequence image features along the horizontal direction using a row-vector convolution kernel; and predict the character sequence to be recognized in parallel, based on the character sequence image features obtained by the second feature enhancement process, to obtain the character sequence content.
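The row-vector convolution can be sketched as a 1×k kernel slid along the width axis, so that each column position aggregates its left and right context before the characters are predicted in parallel. The uniform kernel below is an illustrative assumption; the model would use learned weights:

```python
import numpy as np

def horizontal_enhance(features, k=3):
    """Second feature enhancement: convolve each row of the feature map with a
    1xk row-vector kernel along the horizontal (width) axis, preserving width
    via 'same' padding. A uniform averaging kernel stands in for learned weights."""
    kernel = np.ones(k) / k
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, features
    )
```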
  • Each module in the above-mentioned computer vision-based character sequence recognition device can be implemented in whole or in part by software, hardware and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided, and the computer device may be a terminal, and its internal structure diagram may be as shown in FIG. 11 .
  • the computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected via a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the nonvolatile storage medium stores an operating system and a computer program.
  • the internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium.
  • the communication interface of the computer device is used for wired or wireless communication with an external terminal, and the wireless communication can be realized by WIFI, operator network, NFC (Near Field Communication) or other technologies.
  • the computer program implements a computer vision-based character sequence recognition method when executed by the processor.
  • the display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device of the computer device may be a touch layer covering the display screen, a button, trackball or touchpad provided on the housing of the computer device, or an external keyboard, trackpad or mouse.
  • FIG. 11 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device including a memory and a processor, where a computer program is stored in the memory, and the processor implements the steps in the foregoing method embodiments when the processor executes the computer program.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the steps in the foregoing method embodiments.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, or optical memory, and the like.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM may be in various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a computer vision-based character sequence recognition method and apparatus, as well as a computer device and a storage medium. The method comprises the steps of: acquiring an image carrying a character sequence to be recognized; obtaining, based on a pre-built position detection model, a target area image in which the character sequence to be recognized in the image is located; performing horizontal correction on the target area image to obtain a horizontal target area image; obtaining a character sequence placement state of the horizontal target area image based on a pre-built angle determination model; and, if the character sequence placement state is an upright state, inputting the horizontal target area image into a pre-built content recognition model so as to obtain character sequence content corresponding to the character sequence to be recognized.
PCT/CN2021/104308 2020-07-03 2021-07-02 Procédé et appareil de reconnaissance de séquences de caractères basés sur la vision artificielle, dispositif et support WO2022002262A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2022564797A JP7429307B2 (ja) 2020-07-03 2021-07-02 コンピュータビジョンに基づく文字列認識方法、装置、機器及び媒体

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010630553.0 2020-07-03
CN202010630553.0A CN111832561B (zh) 2020-07-03 2020-07-03 基于计算机视觉的字符序列识别方法、装置、设备和介质

Publications (1)

Publication Number Publication Date
WO2022002262A1 true WO2022002262A1 (fr) 2022-01-06

Family

ID=72900995

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/104308 WO2022002262A1 (fr) 2020-07-03 2021-07-02 Procédé et appareil de reconnaissance de séquences de caractères basés sur la vision artificielle, dispositif et support

Country Status (3)

Country Link
JP (1) JP7429307B2 (fr)
CN (1) CN111832561B (fr)
WO (1) WO2022002262A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495083A (zh) * 2022-01-13 2022-05-13 深圳市瑞意博科技股份有限公司 钢印字符识别方法、装置、设备和介质

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111832561B (zh) * 2020-07-03 2021-06-08 深圳思谋信息科技有限公司 基于计算机视觉的字符序列识别方法、装置、设备和介质
CN113468905B (zh) * 2021-07-12 2024-03-26 深圳思谋信息科技有限公司 图形码识别方法、装置、计算机设备和存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140161365A1 (en) * 2012-12-12 2014-06-12 Qualcomm Incorporated Method of Perspective Correction For Devanagari Text
CN105279512A (zh) * 2015-10-22 2016-01-27 东方网力科技股份有限公司 一种倾斜车牌识别方法和装置
CN110163205A (zh) * 2019-05-06 2019-08-23 网易有道信息技术(北京)有限公司 图像处理方法、装置、介质和计算设备
CN110516672A (zh) * 2019-08-29 2019-11-29 腾讯科技(深圳)有限公司 卡证信息识别方法、装置及终端
CN111242126A (zh) * 2020-01-15 2020-06-05 上海眼控科技股份有限公司 不规则文本校正方法、装置、计算机设备和存储介质
CN111260569A (zh) * 2020-01-10 2020-06-09 百度在线网络技术(北京)有限公司 图像倾斜校正的方法、装置、电子设备和存储介质
CN111832561A (zh) * 2020-07-03 2020-10-27 深圳思谋信息科技有限公司 基于计算机视觉的字符序列识别方法、装置、设备和介质

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101154270A (zh) * 2006-09-30 2008-04-02 电子科技大学中山学院 基于补偿原理和中心区域扫描的车牌二值化方法
CN101814142B (zh) * 2009-02-24 2013-06-05 阿尔派株式会社 手写字符输入装置及字符处理方法
CN102890783B (zh) * 2011-07-20 2015-07-29 富士通株式会社 识别图像块中文字的方向的方法和装置
CN103927534B (zh) * 2014-04-26 2017-12-26 无锡信捷电气股份有限公司 一种基于卷积神经网络的喷码字符在线视觉检测方法
JP6744126B2 (ja) * 2016-05-18 2020-08-19 東芝インフラシステムズ株式会社 文字認識装置、文字認識プログラム、文字認識方法
CN106407979B (zh) * 2016-10-25 2019-12-10 深圳怡化电脑股份有限公司 一种票据字符校正的方法及装置
CN106650721B (zh) * 2016-12-28 2019-08-13 吴晓军 一种基于卷积神经网络的工业字符识别方法
CN108681729B (zh) * 2018-05-08 2023-06-23 腾讯科技(深圳)有限公司 文本图像矫正方法、装置、存储介质及设备

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140161365A1 (en) * 2012-12-12 2014-06-12 Qualcomm Incorporated Method of Perspective Correction For Devanagari Text
CN105279512A (zh) * 2015-10-22 2016-01-27 东方网力科技股份有限公司 一种倾斜车牌识别方法和装置
CN110163205A (zh) * 2019-05-06 2019-08-23 网易有道信息技术(北京)有限公司 图像处理方法、装置、介质和计算设备
CN110516672A (zh) * 2019-08-29 2019-11-29 腾讯科技(深圳)有限公司 卡证信息识别方法、装置及终端
CN111260569A (zh) * 2020-01-10 2020-06-09 百度在线网络技术(北京)有限公司 图像倾斜校正的方法、装置、电子设备和存储介质
CN111242126A (zh) * 2020-01-15 2020-06-05 上海眼控科技股份有限公司 不规则文本校正方法、装置、计算机设备和存储介质
CN111832561A (zh) * 2020-07-03 2020-10-27 深圳思谋信息科技有限公司 基于计算机视觉的字符序列识别方法、装置、设备和介质

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495083A (zh) * 2022-01-13 2022-05-13 深圳市瑞意博科技股份有限公司 钢印字符识别方法、装置、设备和介质

Also Published As

Publication number Publication date
CN111832561B (zh) 2021-06-08
JP7429307B2 (ja) 2024-02-07
JP2023523745A (ja) 2023-06-07
CN111832561A (zh) 2020-10-27

Similar Documents

Publication Publication Date Title
WO2022002262A1 (fr) Procédé et appareil de reconnaissance de séquences de caractères basés sur la vision artificielle, dispositif et support
WO2022213879A1 (fr) Procédé et appareil de détection d'objet cible, et dispositif électronique et support de stockage
CN108009543B (zh) 一种车牌识别方法及装置
CN108520229B (zh) 图像检测方法、装置、电子设备和计算机可读介质
US10936911B2 (en) Logo detection
CN109815770B (zh) 二维码检测方法、装置及系统
CN108875723B (zh) 对象检测方法、装置和系统及存储介质
WO2018188453A1 (fr) Procédé de détermination d'une zone de visage humain, support de stockage et dispositif informatique
WO2017096753A1 (fr) Procédé de suivi de point clé facial, terminal et support de stockage lisible par ordinateur non volatil
WO2021196389A1 (fr) Procédé et appareil de reconnaissance d'unité d'action faciale, dispositif électronique et support de stockage
WO2020134528A1 (fr) Procédé de détection cible et produit associé
JP6688277B2 (ja) プログラム、学習処理方法、学習モデル、データ構造、学習装置、および物体認識装置
WO2021258579A1 (fr) Procédé et appareil d'épissage d'image, dispositif informatique et support de stockage
CN110619656B (zh) 基于双目摄像头的人脸检测跟踪方法、装置及电子设备
WO2019080743A1 (fr) Procédé et appareil de détection de cible, et dispositif informatique
CN112101386B (zh) 文本检测方法、装置、计算机设备和存储介质
CN113673519B (zh) 基于文字检测模型的文字识别方法及其相关设备
CN112232311B (zh) 人脸跟踪方法、装置及电子设备
WO2022206680A1 (fr) Procédé et appareil de traitement d'image, dispositif informatique et support d'enregistrement
CN111008576A (zh) 行人检测及其模型训练、更新方法、设备及可读存储介质
CN113160231A (zh) 一种样本生成方法、样本生成装置及电子设备
US11270152B2 (en) Method and apparatus for image detection, patterning control method
WO2022063321A1 (fr) Procédé et appareil de traitement d'image, dispositif et support de stockage
CN110222576B (zh) 拳击动作识别方法、装置和电子设备
CN113723375B (zh) 一种基于特征抽取的双帧人脸跟踪方法和系统

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2022564797

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21833523

Country of ref document: EP

Kind code of ref document: A1