WO2022179138A1 - Image processing method and apparatus, computer device and storage medium - Google Patents
Image processing method and apparatus, computer device and storage medium
- Publication number
- WO2022179138A1 (PCT/CN2021/125266)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- text
- detected
- sensitive information
- images
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Definitions
- the present application relates to the field of artificial intelligence and digital medicine, and in particular, to an image processing method, apparatus, computer equipment and storage medium.
- the present application provides an image processing method, device, computer equipment and storage medium.
- By performing text detection, segmentation, and sensitive information masking on an image to be detected whose text was incorrectly recognized by an optical character recognition model, and allocating the resulting target images to terminals for manual identification, the accuracy and efficiency of identifying text information are improved.
- the present application provides an image processing method, the method comprising:
- acquiring an image to be detected, where the image to be detected is a bill image whose text is incorrectly recognized by an optical character recognition model;
- invoking a text detection model, inputting the image to be detected into the text detection model for text detection, and obtaining text position information corresponding to the image to be detected;
- segmenting the to-be-detected image according to the text position information to obtain at least one segmented image corresponding to the to-be-detected image;
- performing sensitive information masking on each of the segmented images to obtain at least one target image;
- allocating each of the target images to a corresponding target terminal, so that the target terminal recognizes the allocated target image and obtains a text recognition result.
- the present application also provides an image processing device, the device comprising:
- an image acquisition module configured to acquire an image to be detected, where the image to be detected is an image of a bill whose text is incorrectly recognized by an optical character recognition model;
- a text detection module configured to invoke a text detection model, input the image to be detected into the text detection model for text detection, and obtain text position information corresponding to the image to be detected;
- an image segmentation module configured to segment the to-be-detected image according to the text position information to obtain at least one segmented image corresponding to the to-be-detected image;
- an information masking module configured to perform sensitive information masking on each of the segmented images to obtain at least one target image;
- an image allocation module configured to allocate each target image to a corresponding target terminal, so that the target terminal recognizes the allocated target image and obtains a text recognition result.
- the present application also provides a computer device, the computer device comprising a memory and a processor;
- the memory is configured to store a computer program;
- the processor is configured to execute the computer program and implement the above-mentioned image processing method when executing the computer program.
- the present application further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the processor implements the above-mentioned image processing method.
- the present application discloses an image processing method, device, computer equipment and storage medium.
- By acquiring the image to be detected, a bill image whose text was incorrectly recognized by the optical character recognition model can be obtained. By invoking a text detection model and inputting the to-be-detected image into it for text detection, the text position information corresponding to the image to be detected can be obtained accurately. By segmenting the image to be detected according to the text position information, the segmented images corresponding to the image to be detected can be obtained.
- Performing sensitive information masking on each segmented image can avoid the leakage of private information. By allocating each target image to a corresponding target terminal so that the terminal recognizes the allocated target image, manual text recognition of the target images is realized, which improves the accuracy and efficiency of identifying text information.
- FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present application
- FIG. 2 is a schematic flowchart of a sub-step of acquiring an image to be detected provided by an embodiment of the present application
- FIG. 3 is a schematic flowchart of a sub-step of inputting an image to be detected into a text detection model for text detection provided by an embodiment of the present application;
- FIG. 4 is a schematic diagram of inputting an image to be detected into a text detection model for text detection provided by an embodiment of the present application
- FIG. 5 is a schematic flowchart of a sub-step of performing sensitive information shielding processing on each fragmented image provided by an embodiment of the present application
- FIG. 6 is a schematic diagram of allocating each target image to a corresponding target terminal according to an embodiment of the present application
- FIG. 7 is a schematic block diagram of an image processing apparatus provided by an embodiment of the present application.
- FIG. 8 is a schematic structural block diagram of a computer device provided by an embodiment of the present application.
- Embodiments of the present application provide an image processing method, apparatus, computer device, and storage medium.
- The image processing method can be applied to a server or a terminal: by performing text detection, segmentation, and sensitive information masking on an image to be detected whose text was incorrectly recognized by the optical character recognition model, and allocating the resulting target images to terminals for manual recognition, it improves the accuracy and efficiency of identifying text information.
- the server may be an independent server or a server cluster.
- Terminals can be electronic devices such as smart phones, tablet computers, notebook computers, and desktop computers.
- the image processing method includes steps S10 to S50.
- Step S10 acquiring an image to be detected, where the image to be detected is an image of a bill whose characters are incorrectly recognized by an optical character recognition model.
- the to-be-detected image refers to the bill image for which an error occurs in the character recognition performed by the optical character recognition model.
- the user in order to obtain useful information in the bill image, the user can input the bill image into the optical character recognition model for character recognition, so as to extract the text information in the bill image.
- For example, the bill image may be a medical bill image, but it may also be a bill image from another industry.
- the medical bill images may include, but are not limited to, images corresponding to documents such as expense lists, laboratory test sheets, invoices, outpatient and emergency medical records, diagnosis certificates, imaging examination reports, prescription sheets, and admission and discharge records. It should be noted that, by performing character recognition on images such as medical bill images to obtain character recognition results, medical informatization can be realized.
- FIG. 2 is a schematic flowchart of the sub-steps of acquiring an image to be detected in step S10 .
- the specific step S10 may include the following steps S101 to S103 .
- Step S101 When a selection operation on a bill image is detected, at least one bill image corresponding to the selection operation is determined.
- It should be noted that the bill images may be pre-stored in a database or on a local disk, or uploaded by a user.
- For example, when the user's selection operation on pre-stored or uploaded bill images is detected, at least one bill image corresponding to the selection operation is determined.
- Step S102 Invoke the optical character recognition model, and input each bill image into the optical character recognition model for character recognition.
- the optical character recognition model may be implemented by using an OCR (Optical Character Recognition, optical character recognition) technology.
- the OCR technology is used to analyze and recognize files such as pictures and tables to obtain text and layout information.
- the optical character recognition model may include a text detection model and a text recognition model.
- the text detection model is used to detect the location, range and layout of the text
- the text recognition model is used to recognize the text content on the basis of text detection, and convert the text information in the image into text information.
- the text detection model may include, but is not limited to, the CTPN network model (Detecting Text in Natural Image with Connectionist Text Proposal Network), the Faster R-CNN network model, the RRPN (Rotation Region Proposal Networks) network model, and so on.
- Text recognition models may include, but are not limited to, Convolutional Neural Networks, Recurrent Neural Networks, Multilayer Perceptrons, and Restricted Boltzmann Machines, among others.
- the initial optical character recognition model may be trained in advance, and the trained optical character recognition model may be invoked to perform text recognition on the bill image to obtain the text recognition result.
- the specific training process is not limited here.
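For illustration, a minimal Python sketch of step S102 follows, using the open-source pytesseract library as a stand-in for the patent's own optical character recognition model, which is not specified beyond its structure; the language setting and file paths are assumptions.

```python
from PIL import Image
import pytesseract  # assumes the Tesseract OCR engine is installed

def recognize_bills(image_paths):
    """Run a stand-in OCR engine over each selected bill image."""
    results = {}
    for path in image_paths:
        image = Image.open(path)
        # image_to_string returns the text recognized on the page
        results[path] = pytesseract.image_to_string(image, lang="chi_sim+eng")
    return results
```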
- Step S103 Determine the bill image with a character recognition error as the image to be detected.
- In some embodiments, after each bill image is input into the optical character recognition model for character recognition, the method may further include: verifying the obtained character recognition results to determine the bill images with character recognition errors.
- a regular expression check may be performed on the character recognition result to determine a bill image with a character recognition error. For example, check string length, date and name, etc.
- a bill image with a character recognition error is determined as an image to be detected.
- the unrecognized bill image can also be determined as the image to be detected.
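As one possible reading of the regular-expression check described above, the sketch below validates a few recognized fields; the field names and patterns are hypothetical examples, not the patent's actual rules.

```python
import re

# Hypothetical per-field patterns; real checks would depend on the bill layout.
CHECKS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "id_number": re.compile(r"\d{17}[\dXx]"),     # 18-character ID format
    "name": re.compile(r"[\u4e00-\u9fa5]{2,4}"),  # 2-4 Chinese characters
}

def has_recognition_error(fields: dict) -> bool:
    """Return True when any extracted field fails its regular-expression check,
    in which case the bill image becomes an image to be detected."""
    return any(
        not pattern.fullmatch(fields.get(key, ""))
        for key, pattern in CHECKS.items()
    )
```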
- Step S20 Invoke a text detection model, input the image to be detected into the text detection model to perform text detection, and obtain text position information corresponding to the image to be detected.
- the text detection model may be a CTPN network model.
- The CTPN network model accurately locates the text lines in an image; its basic approach is to detect text lines directly through a series of appropriately sized text proposal boxes generated on the feature map obtained by convolution.
- In the CTPN model, the seamless combination of a CNN (Convolutional Neural Network) and an RNN (Recurrent Neural Network) can be used to improve detection accuracy.
- CNN is used to extract deep features
- The RNN is used for sequence feature recognition, which improves the accuracy of recognition.
- the text detection model includes two sub-models, a text feature extraction model and a text position detection model.
- the text feature extraction model is used to extract text features in the image;
- the text position detection model is used to predict the text position.
- the feature extraction model may be a convolutional neural network
- the text position detection model may be a recurrent neural network.
- It should be emphasized that, to further ensure the privacy and security of the text detection model, the model may be stored in a node of a blockchain.
- When the text detection model needs to be used, it can be called from the blockchain node.
- FIG. 3 is a schematic flowchart of the sub-steps of inputting the image to be detected into the text detection model for text detection in step S20 .
- the specific step S20 may include the following steps S201 to S203 .
- Step S201 Input the image to be detected into the text feature extraction model to extract text features, and obtain a depth feature image corresponding to the image to be detected.
- the VGG16 network structure may be used as the basic model, and depth feature images of different scales are obtained by convolving the image to be detected through multiple convolution layers.
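A minimal sketch of this feature-extraction step, assuming a PyTorch environment; the VGG16 backbone follows the description, while the input size and untrained weights are placeholders.

```python
import torch
from torchvision import models

# VGG16 convolutional stack as the text-feature backbone, per the description.
backbone = models.vgg16(weights=None).features

image = torch.randn(1, 3, 600, 900)   # one bill image tensor (N, C, H, W)
feature_map = backbone(image)         # the "depth feature image"
print(feature_map.shape)              # torch.Size([1, 512, 18, 28])
```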
- FIG. 4 is a schematic diagram of inputting an image to be detected into a text detection model to perform text detection according to an embodiment of the present application.
- the image to be detected is input into a text feature extraction model to perform text feature extraction, and a depth feature image corresponding to the image to be detected is obtained.
- Step S202 Add text candidate boxes to the depth feature image, and determine the corresponding depth feature sequence according to the depth feature information corresponding to the text candidate boxes in the same row.
- For example, as shown in FIG. 4, multiple text candidate boxes of a preset size may be added to the depth feature image.
- The height and width of the text candidate boxes may be set according to actual conditions, and the specific values are not limited here.
- For example, the depth feature information corresponding to all text candidate boxes in each row may be determined, in order, as the depth feature sequence corresponding to that row.
- Step S203 Input the depth feature sequence into the text position detection model to predict the text position, and obtain the predicted text position information corresponding to the image to be detected.
- the depth feature sequence is input into the text position detection model for text position prediction, and the predicted text position information corresponding to the image to be detected is output.
- In this embodiment of the present application, the text position detection model may be a BI_LSTM-CRF neural network, a type of recurrent neural network.
- The BI_LSTM-CRF neural network combines a BI_LSTM (Bidirectional Long Short-Term Memory) network with a CRF (Conditional Random Field) layer.
- The BI_LSTM-CRF neural network model can use not only past input features and sentence label information but also future input features; by taking into account the effect of long-distance context on Chinese word segmentation, it ensures higher segmentation accuracy.
- In some embodiments, text features are extracted from the depth feature sequence through a bidirectional LSTM layer to obtain the predicted text position information corresponding to the image to be detected.
- The bidirectional LSTM layer consists of a forward LSTM layer and a backward LSTM layer.
- For example, the depth feature sequence is used as the input to each time step of the bidirectional LSTM layer; the hidden-state sequence output by the forward LSTM layer and the hidden states output by the backward LSTM layer are concatenated position by position to obtain the complete hidden-state sequence, which is then fed into a linear layer to obtain the predicted text position information corresponding to the image to be detected.
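The following sketch shows the bidirectional-LSTM-plus-linear-layer arrangement just described, again in PyTorch; the dimensions are assumptions, and the CRF layer of the full BI_LSTM-CRF model is omitted for brevity.

```python
import torch
import torch.nn as nn

class TextPositionHead(nn.Module):
    """Bidirectional LSTM over the per-row depth feature sequence, followed
    by a linear layer that outputs per-step position predictions."""

    def __init__(self, feat_dim=512, hidden=128, out_dim=4):
        super().__init__()
        # bidirectional=True runs forward and backward LSTM layers whose
        # hidden states are concatenated position by position
        self.bilstm = nn.LSTM(feat_dim, hidden,
                              bidirectional=True, batch_first=True)
        self.linear = nn.Linear(2 * hidden, out_dim)  # e.g. box offsets

    def forward(self, seq):               # seq: (batch, steps, feat_dim)
        hidden_seq, _ = self.bilstm(seq)  # complete hidden state sequence
        return self.linear(hidden_seq)    # predicted position info per step

head = TextPositionHead()
row_features = torch.randn(1, 28, 512)    # one row of candidate-box features
print(head(row_features).shape)           # torch.Size([1, 28, 4])
```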
- an image to be detected may be input into a text detection model for text detection based on a GPU cluster to obtain text position information corresponding to the image to be detected.
- A GPU (Graphics Processing Unit) cluster is a computer cluster in which each node is equipped with a graphics processing unit. Because general-purpose GPUs have a highly data-parallel architecture and can process large numbers of data points in parallel, a GPU cluster can perform very fast calculations and achieve high computing throughput.
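As a sketch of the single-node case, PyTorch's DataParallel can spread a detection batch over the GPUs visible on one cluster node; distributing images across nodes would need a separate job queue and is not shown.

```python
import torch
import torch.nn as nn

model = nn.Linear(512, 4)                 # stand-in for the detection model
if torch.cuda.is_available():
    # DataParallel splits each input batch across all visible GPUs
    model = nn.DataParallel(model).cuda()
    batch = torch.randn(64, 512).cuda()   # a batch of feature vectors
    with torch.no_grad():
        positions = model(batch)          # computed in parallel on the GPUs
```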
- Step S30 segment the to-be-detected image according to the text position information to obtain at least one segmented image corresponding to the to-be-detected image.
- For example, the image areas to be segmented may be determined according to the text position information, and the image corresponding to each area is then cropped out as a segmented image.
- Other segmentation methods may also be used; the specific segmentation method is not limited here.
- One segmented image is obtained for each piece of text position information.
- By segmenting the image to be detected according to the text position information, multiple discontinuous, incomplete segmented images are obtained. These segmented images can subsequently be allocated to multiple target terminals for recognition, which not only improves image-processing efficiency but also prevents the leakage of private information.
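A minimal sketch of the cropping-based segmentation, assuming the detection step yields axis-aligned boxes as (left, top, right, bottom) tuples:

```python
from PIL import Image

def segment_image(image_path, boxes):
    """Cut the image to be detected into one segmented image per text box."""
    image = Image.open(image_path)
    return [image.crop(box) for box in boxes]

# Each crop is discontinuous and incomplete on its own, which is what allows
# different segments to be sent to different terminals later.
segments = segment_image("bill.png", [(10, 20, 200, 60), (10, 80, 200, 120)])
```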
- Step S40 Perform sensitive information shielding processing on each of the segmented images to obtain at least one target image.
- FIG. 5 is a schematic flowchart of sub-steps of performing sensitive information shielding processing on each segmented image in step S40 .
- the specific step S40 may include the following steps S401 to S403 .
- Step S401 performing a sensitive information check on a plurality of the segmented images to determine whether each segmented image has sensitive information.
- sensitive information may include, but is not limited to, name, ID number, contact information, and the like.
- In some embodiments, before the sensitive information check is performed on the multiple segmented images to determine whether each segmented image contains sensitive information, the method may further include: determining the bill type corresponding to the image to be detected; and determining the sensitive information area in the image to be detected according to a preset correspondence between bill types and sensitive location areas and the bill type corresponding to the image to be detected.
- The bill types may include, but are not limited to, expense lists, laboratory test sheets, invoices, outpatient and emergency medical records, diagnosis certificates, and imaging examination reports.
- It can be understood that, for images to be detected of different bill types, the location areas where the corresponding sensitive information is found differ.
- the ticket type may be associated with the sensitive location area in advance.
- the sensitive location area may be the upper left corner, the upper right corner, the lower left corner and the lower right corner, and may also be other locations, which are not limited herein.
- In some embodiments, performing the sensitive information check on the multiple segmented images to determine whether each segmented image contains sensitive information may include: determining whether each segmented image contains a sensitive information area; and if a segmented image contains a sensitive information area, determining that the segmented image contains sensitive information.
- If the relative position of a segmented image within the image to be detected falls inside the sensitive information area, it can be determined that the segmented image contains a sensitive information area, and hence that it contains sensitive information.
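A sketch of the preset correspondence and the relative-position check; the bill types and region names are hypothetical, since the patent leaves the concrete mapping open.

```python
# Hypothetical preset mapping from bill type to its sensitive location area.
SENSITIVE_REGION = {
    "invoice": "top_left",
    "lab_test_sheet": "top_right",
    "expense_list": "bottom_left",
}

def segment_has_sensitive_area(bill_type: str, segment_region: str) -> bool:
    """A segment is flagged when its relative position inside the original
    image falls in the sensitive area preset for this bill type."""
    return SENSITIVE_REGION.get(bill_type) == segment_region
```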
- In other embodiments, performing the sensitive information check on the multiple segmented images may further include: determining whether a keyword group exists in the sensitive information area of each segmented image; and if a segmented image contains a keyword group, determining that the segmented image contains sensitive information.
- the keyword group may include, but is not limited to, "name”, “number”, “certificate”, “contact information” and so on.
- For example, whether a keyword group exists in the sensitive information area may be determined based on a preset phrase database; keyword groups corresponding to sensitive information may be collected and stored in the phrase database in advance.
- the phrase in the sensitive information area may be matched with a phrase in the phrase database to determine whether there is a keyword group in the sensitive information area.
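The keyword-group match can be sketched as a simple lookup against the preset phrase database; the phrases below are the examples quoted above.

```python
# Preset phrase database of keyword groups tied to sensitive information.
KEYWORD_GROUPS = {"name", "number", "certificate", "contact information"}

def has_sensitive_keywords(region_text: str) -> bool:
    """Check the text of a segment's sensitive information area against the
    phrase database."""
    return any(keyword in region_text for keyword in KEYWORD_GROUPS)
```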
- Step S402 performing masking processing on the segmented images with sensitive information to obtain the segmented images after the masking processing.
- masking the segmented images with sensitive information may include: determining a keyword group corresponding to the sensitive information area in the segmented image, and replacing the keyword group with a preset character identifier.
- the preset character identifier may be "*" or other characters, which are not limited herein.
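A minimal sketch of the replacement-based masking, using "*" as the preset character identifier as in the example above:

```python
def mask_sensitive(region_text: str, keyword_groups) -> str:
    """Replace each keyword group found in the sensitive information area
    with the preset character identifier."""
    for keyword in keyword_groups:
        region_text = region_text.replace(keyword, "*" * len(keyword))
    return region_text

masked = mask_sensitive("name: Zhang San", {"Zhang San"})
print(masked)  # name: *********
```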
- Step S403 Determine a plurality of the target images according to the segmented images after the masking process and the segmented images without sensitive information.
- the segmented image after the sensitive information masking process and the segmented image without sensitive information may be determined as the target image.
- Step S50 Allocate each target image to a corresponding target terminal, so that the target terminal can recognize the allocated target image and obtain a text recognition result.
- FIG. 6 is a schematic diagram of allocating each target image to a corresponding target terminal according to an embodiment of the present application. As shown in FIG. 6, an assignment operation by the user on the multiple target images can be received, and the target terminal corresponding to each target image is determined according to the assignment operation; each target image is then allocated to its corresponding target terminal for recognition.
- each target image is assigned to a corresponding target terminal, and the operator corresponding to the target terminal manually recognizes the target image, thereby obtaining a text recognition result.
- operators can place text recognition labels on target images.
- the target terminal can generate a text recognition result according to the text recognition label marked by the operator.
- In some embodiments, after each target image is allocated to its corresponding target terminal, the method may further include: receiving the text recognition result sent by the target terminal, where the text recognition result includes the text recognition label corresponding to the target image; and iteratively training the optical character recognition model according to the target image and the text recognition label until the model converges.
- In this embodiment of the present application, since the optical character recognition model includes a text detection model and a text recognition model, the text detection model and the text recognition model can be trained according to the target images and the text recognition labels.
- It can be understood that iteratively training the optical character recognition model on the target images and text recognition labels lets the model learn from the images it misrecognized, thereby improving the text recognition accuracy of the optical character recognition model.
- In some embodiments, training the text detection model and the text recognition model according to the target images and the text recognition labels may include: determining the training sample data for each round of training according to the target images and the text recognition labels; inputting the training sample data into the text detection model to obtain a text detection result; inputting the text detection result into the text recognition model to obtain a text recognition result; determining the loss function value for the current round from the text recognition labels and the text recognition result based on a preset loss function; and, if the loss function value is greater than a preset loss-value threshold, adjusting the parameters of the text detection model and the text recognition model and carrying out the next round of training, until the obtained loss function value is less than or equal to the loss-value threshold, at which point training ends and the trained text detection model and text recognition model are obtained.
- the preset loss value threshold may be set according to the actual situation, and the specific value is not limited herein.
- a loss function such as a 0-1 loss function, an absolute value loss function, a logarithmic loss function, a cross-entropy loss function, a squared loss function, or an exponential loss function can be used to calculate the loss function value.
- Convergence algorithms such as gradient descent algorithm, Newton algorithm, conjugate gradient method or Cauchy-Newton method can be used to adjust the parameters of the text detection model and the text recognition model.
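A sketch of the retraining loop under the stated stopping rule; the optimizer (plain SGD here), data loader, and loss function are placeholders for the loss functions and convergence algorithms listed above.

```python
import torch

def train_until_converged(detector, recognizer, loader, loss_fn,
                          loss_threshold=0.05, lr=0.01):
    """Adjust both models round by round until the loss function value falls
    to the preset loss-value threshold."""
    params = list(detector.parameters()) + list(recognizer.parameters())
    optimizer = torch.optim.SGD(params, lr=lr)
    while True:
        total = 0.0
        for images, labels in loader:             # target images + labels
            detections = detector(images)         # text detection result
            predictions = recognizer(detections)  # text recognition result
            loss = loss_fn(predictions, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total += loss.item()
        if total / len(loader) <= loss_threshold:  # converged: stop training
            return detector, recognizer
```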
- the above-mentioned trained optical character recognition model can also be stored in a node of a blockchain.
- the trained optical character recognition model needs to be used, it can be obtained from the nodes of the blockchain.
- By updating the parameters of the text detection model and the text recognition model according to the preset loss function and a convergence algorithm, both models can converge quickly, thereby improving the training efficiency and accuracy of the optical character recognition model.
- The image processing method provided by the above embodiments improves the efficiency of recognizing text information by invoking the optical character recognition model and inputting each bill image into it for character recognition. By verifying the obtained character recognition results, the bill images with character recognition errors can be identified and determined as images to be detected, which can subsequently be recognized manually. By inputting the image to be detected into the text feature extraction model for text feature extraction, adding text candidate boxes to the resulting depth feature image, and inputting the depth feature sequence into the BI_LSTM-CRF neural network model for text position prediction, the accuracy of the predicted text position information is effectively improved, since the word segmentation of the BI_LSTM-CRF model is more accurate.
- Performing text detection on a GPU cluster improves the accuracy and efficiency of detecting text position information. Segmenting the image to be detected according to the text position information yields multiple discontinuous, incomplete segmented images that can be allocated to multiple target terminals for recognition, which improves image-processing efficiency and prevents the leakage of private information. Checking the segmented images for sensitive information and masking those that contain it avoids the leakage of sensitive information and improves information security. Iteratively training the optical character recognition model on the target images and text recognition labels lets the model learn from the images it misrecognized, improving its text recognition accuracy. Updating the parameters of the text detection model and the text recognition model according to the preset loss function and a convergence algorithm makes both models converge quickly, thereby improving the training efficiency and accuracy of the optical character recognition model.
- FIG. 7 is a schematic block diagram of an image processing apparatus 1000 further provided by an embodiment of the present application, and the image processing apparatus is configured to execute the aforementioned image processing method.
- the image processing apparatus may be configured in a server or a terminal.
- the image processing apparatus 1000 includes: an image acquisition module 1001, a text detection module 1002, an image segmentation module 1003, an information masking module 1004, and an image allocation module 1005.
- the image acquisition module 1001 is configured to acquire an image to be detected, and the image to be detected is an image of a bill with an error in word recognition by an optical character recognition model.
- the text detection module 1002 is configured to call a text detection model, input the image to be detected into the text detection model to perform text detection, and obtain text position information corresponding to the image to be detected.
- the image segmentation module 1003 is configured to segment the to-be-detected image according to the text position information to obtain at least one segmented image corresponding to the to-be-detected image.
- the information masking module 1004 is configured to perform sensitive information masking on each of the segmented images to obtain at least one target image.
- the image allocation module 1005 is configured to allocate each target image to a corresponding target terminal, so that the target terminal can recognize the allocated target image and obtain a text recognition result.
- the above-mentioned apparatus can be implemented in the form of a computer program that can be executed on a computer device as shown in FIG. 8 .
- FIG. 8 is a schematic structural block diagram of a computer device provided by an embodiment of the present application.
- the computer device may be a server or a terminal.
- the computer device includes a processor and a memory connected through a system bus, wherein the memory may include a storage medium and an internal memory.
- the storage medium includes both non-volatile storage medium and volatile storage medium.
- the processor is used to provide computing and control capabilities to support the operation of the entire computer equipment.
- the internal memory provides an environment for running the computer program stored in the non-volatile storage medium; when the computer program is executed by the processor, it causes the processor to perform any of the image processing methods.
- the processor may be a central processing unit (Central Processing Unit, CPU), and may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- the general-purpose processor can be a microprocessor or the processor can also be any conventional processor or the like.
- the processor is configured to run a computer program stored in the memory to implement the following steps:
- acquiring an image to be detected, where the image to be detected is a bill image whose text is incorrectly recognized by an optical character recognition model; invoking a text detection model, inputting the image to be detected into the text detection model for text detection, and obtaining the text position information corresponding to the image to be detected;
- segmenting the image to be detected according to the text position information to obtain at least one segmented image corresponding to the image to be detected;
- performing sensitive information masking on each of the segmented images to obtain at least one target image;
- allocating each target image to a corresponding target terminal, so that the target terminal recognizes the allocated target image and obtains a text recognition result.
- In one embodiment, when acquiring the image to be detected, the processor is configured to implement:
- when a selection operation on a bill image is detected, determining at least one bill image corresponding to the selection operation; invoking the optical character recognition model and inputting each of the bill images into the optical character recognition model for character recognition; and determining a bill image with a character recognition error as the image to be detected.
- In one embodiment, the text detection model includes a text feature extraction model and a text position detection model; when invoking the text detection model, inputting the image to be detected into the text detection model for text detection, and obtaining the text position information corresponding to the image to be detected, the processor is configured to implement:
- inputting the image to be detected into the text feature extraction model for text feature extraction to obtain the depth feature image corresponding to the image to be detected; adding text candidate boxes to the depth feature image, and determining the corresponding depth feature sequence according to the depth feature information corresponding to the text candidate boxes in the same row; and inputting the depth feature sequence into the text position detection model for text position prediction to obtain the predicted text position information corresponding to the image to be detected.
- In one embodiment, when performing sensitive information masking on each of the segmented images to obtain at least one target image, the processor is configured to implement: performing a sensitive information check on the multiple segmented images to determine whether each segmented image contains sensitive information; masking the segmented images that contain sensitive information to obtain masked segmented images; and determining the multiple target images according to the masked segmented images and the segmented images that contain no sensitive information.
- In one embodiment, before performing the sensitive information check on the multiple segmented images, the processor is further configured to implement: determining the bill type corresponding to the image to be detected; and determining the sensitive information area in the image to be detected according to the preset correspondence between bill types and sensitive location areas and the bill type corresponding to the image to be detected.
- In one embodiment, when performing the sensitive information check on the multiple segmented images to determine whether each segmented image contains sensitive information, the processor is configured to implement: determining whether each segmented image contains the sensitive information area; and if a segmented image contains the sensitive information area, determining that the segmented image contains sensitive information.
- In one embodiment, when masking a segmented image that contains sensitive information, the processor is configured to implement:
- determining the keyword group corresponding to the sensitive information area in the segmented image, and replacing the keyword group with a preset character identifier.
- In one embodiment, after allocating each of the target images to the corresponding target terminal, the processor is further configured to implement:
- receiving the text recognition result sent by the target terminal, where the text recognition result includes the text recognition label corresponding to the target image; and iteratively training the optical character recognition model according to the target image and the text recognition label until the optical character recognition model converges.
- the embodiments of the present application further provide a computer-readable storage medium storing a computer program, the computer program including program instructions which, when executed by a processor, implement any of the image processing methods provided by the embodiments of the present application.
- the computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as a hard disk or a memory of the computer device.
- the computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk provided on the computer device, a smart media card (Smart Media Card, SMC), a secure digital card (Secure Digital Card, SD Card), a flash card (Flash Card), or the like.
- the computer-readable storage medium may be non-volatile or volatile.
- the computer-readable storage medium may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required by at least one function, and the like, and the storage data area may store data created according to the use of blockchain nodes, and the like.
- the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
- A blockchain is essentially a decentralized database: a chain of data blocks generated in association using cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity (anti-counterfeiting) of its information and to generate the next block.
- the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
Abstract
An image processing method, apparatus, device, and storage medium. The method includes: acquiring an image to be detected; inputting the image to be detected into a text detection model for text detection, and segmenting the image to be detected according to the obtained text position information to obtain segmented images; performing sensitive information masking on the segmented images to obtain target images; and allocating the target images to target terminals, so that the target terminals recognize the allocated target images and obtain text recognition results.
Description
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on February 26, 2021, with application number 2021102177307 and invention title "Image processing method and apparatus, computer device and storage medium", the entire contents of which are incorporated herein by reference.
The present application relates to the field of artificial intelligence and digital medicine, and in particular to an image processing method and apparatus, a computer device, and a storage medium.
In the claims-settlement process of the insurance business, realizing medical informatization requires entering and storing information from the large volume of medical bills of various kinds provided by customers. In a traditional insurance claims system, useful data must be manually identified, extracted, and entered one item at a time from the claim-related medical document images uploaded by users; faced with large amounts of complex, lengthy, and tedious data, this leads to a poor entry experience, low entry efficiency, and a high entry error rate.
In the prior art, OCR (Optical Character Recognition) technology is usually used to recognize the text information in bill images in order to improve efficiency. The inventors found that OCR technology cannot recognize some special characters, traditional Chinese characters, and similar data, and recognition errors often occur, which reduces the recognition accuracy of text information.
Therefore, how to improve the accuracy and efficiency of recognizing text information has become an urgent problem to be solved.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any equivalent modification or substitution that can readily occur to those skilled in the art within the technical scope disclosed in this application shall be covered by the protection scope of this application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (20)
- An image processing method, comprising: acquiring an image to be detected, the image to be detected being a bill image on which an optical character recognition model made a text recognition error; invoking a text detection model and inputting the image to be detected into the text detection model for text detection, to obtain text position information corresponding to the image to be detected; segmenting the image to be detected according to the text position information, to obtain at least one slice image corresponding to the image to be detected; performing sensitive information masking processing on each slice image, to obtain at least one target image; and distributing each target image to a corresponding target terminal, so that the target terminal recognizes the distributed target image and obtains a text recognition result.
- The image processing method according to claim 1, wherein acquiring the image to be detected comprises: when a selection operation on bill images is detected, determining at least one bill image corresponding to the selection operation; invoking the optical character recognition model and inputting each bill image into the optical character recognition model for text recognition; and determining a bill image on which a text recognition error occurred as the image to be detected.
- The image processing method according to claim 1, wherein the text detection model comprises a text feature extraction model and a text position detection model, and invoking the text detection model and inputting the image to be detected into the text detection model for text detection to obtain the text position information corresponding to the image to be detected comprises: inputting the image to be detected into the text feature extraction model for text feature extraction, to obtain a depth feature image corresponding to the image to be detected; adding text candidate boxes to the depth feature image, and determining a corresponding depth feature sequence from the depth feature information corresponding to the text candidate boxes in a same row; and inputting the depth feature sequence into the text position detection model for text position prediction, to obtain predicted text position information corresponding to the image to be detected.
- The image processing method according to claim 1, wherein performing sensitive information masking processing on each slice image to obtain at least one target image comprises: performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information; performing masking processing on the slice images that contain sensitive information, to obtain masked slice images; and determining the plurality of target images from the masked slice images and the slice images that do not contain sensitive information.
- The image processing method according to claim 4, wherein before performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information, the method further comprises: determining a bill type corresponding to the image to be detected; and determining a sensitive information region in the image to be detected based on the bill type and a preset correspondence between bill types and sensitive position regions; and wherein performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information comprises: determining whether each slice image contains the sensitive information region; and if a slice image contains the sensitive information region, determining that the slice image contains sensitive information.
- The image processing method according to claim 5, wherein performing masking processing on the slice images that contain sensitive information comprises: determining a keyword group corresponding to the sensitive information region in the slice image, and replacing the keyword group with a preset character identifier.
- The image processing method according to any one of claims 1-6, wherein after distributing each target image to the corresponding target terminal, the method further comprises: receiving the text recognition result sent by the target terminal, the text recognition result including a text recognition label corresponding to the target image; and iteratively training the optical character recognition model with the target image and the text recognition label until the optical character recognition model converges.
- An image processing apparatus, comprising: an image acquisition module configured to acquire an image to be detected, the image to be detected being a bill image on which an optical character recognition model made a text recognition error; a text detection module configured to invoke a text detection model and input the image to be detected into the text detection model for text detection, to obtain text position information corresponding to the image to be detected; an image segmentation module configured to segment the image to be detected according to the text position information, to obtain at least one slice image corresponding to the image to be detected; an information masking module configured to perform sensitive information masking processing on each slice image, to obtain at least one target image; and an image distribution module configured to distribute each target image to a corresponding target terminal, so that the target terminal recognizes the distributed target image and obtains a text recognition result.
- A computer device, comprising a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to execute the computer program and, when executing the computer program, to implement the following steps: acquiring an image to be detected, the image to be detected being a bill image on which an optical character recognition model made a text recognition error; invoking a text detection model and inputting the image to be detected into the text detection model for text detection, to obtain text position information corresponding to the image to be detected; segmenting the image to be detected according to the text position information, to obtain at least one slice image corresponding to the image to be detected; performing sensitive information masking processing on each slice image, to obtain at least one target image; and distributing each target image to a corresponding target terminal, so that the target terminal recognizes the distributed target image and obtains a text recognition result.
- The computer device according to claim 9, wherein the step, implemented by the processor, of acquiring the image to be detected comprises: when a selection operation on bill images is detected, determining at least one bill image corresponding to the selection operation; invoking the optical character recognition model and inputting each bill image into the optical character recognition model for text recognition; and determining a bill image on which a text recognition error occurred as the image to be detected.
- The computer device according to claim 9, wherein the text detection model comprises a text feature extraction model and a text position detection model, and the step, implemented by the processor, of invoking the text detection model and inputting the image to be detected into the text detection model for text detection to obtain the text position information corresponding to the image to be detected comprises: inputting the image to be detected into the text feature extraction model for text feature extraction, to obtain a depth feature image corresponding to the image to be detected; adding text candidate boxes to the depth feature image, and determining a corresponding depth feature sequence from the depth feature information corresponding to the text candidate boxes in a same row; and inputting the depth feature sequence into the text position detection model for text position prediction, to obtain predicted text position information corresponding to the image to be detected.
- The computer device according to claim 9, wherein the step, implemented by the processor, of performing sensitive information masking processing on each slice image to obtain at least one target image comprises: performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information; performing masking processing on the slice images that contain sensitive information, to obtain masked slice images; and determining the plurality of target images from the masked slice images and the slice images that do not contain sensitive information.
- The computer device according to claim 12, wherein before the step, implemented by the processor, of performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information, the following is further implemented: determining a bill type corresponding to the image to be detected; and determining a sensitive information region in the image to be detected based on the bill type and a preset correspondence between bill types and sensitive position regions; performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information comprises: determining whether each slice image contains the sensitive information region; and if a slice image contains the sensitive information region, determining that the slice image contains sensitive information; and the step, implemented by the processor, of performing masking processing on the slice images that contain sensitive information comprises: determining a keyword group corresponding to the sensitive information region in the slice image, and replacing the keyword group with a preset character identifier.
- The computer device according to any one of claims 9-13, wherein after the step, implemented by the processor, of distributing each target image to the corresponding target terminal, the following is further implemented: receiving the text recognition result sent by the target terminal, the text recognition result including a text recognition label corresponding to the target image; and iteratively training the optical character recognition model with the target image and the text recognition label until the optical character recognition model converges.
- A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, causes the processor to implement the following steps: acquiring an image to be detected, the image to be detected being a bill image on which an optical character recognition model made a text recognition error; invoking a text detection model and inputting the image to be detected into the text detection model for text detection, to obtain text position information corresponding to the image to be detected; segmenting the image to be detected according to the text position information, to obtain at least one slice image corresponding to the image to be detected; performing sensitive information masking processing on each slice image, to obtain at least one target image; and distributing each target image to a corresponding target terminal, so that the target terminal recognizes the distributed target image and obtains a text recognition result.
- The computer-readable storage medium according to claim 15, wherein the step, implemented by the processor, of acquiring the image to be detected comprises: when a selection operation on bill images is detected, determining at least one bill image corresponding to the selection operation; invoking the optical character recognition model and inputting each bill image into the optical character recognition model for text recognition; and determining a bill image on which a text recognition error occurred as the image to be detected.
- The computer-readable storage medium according to claim 15, wherein the text detection model comprises a text feature extraction model and a text position detection model, and the step, implemented by the processor, of invoking the text detection model and inputting the image to be detected into the text detection model for text detection to obtain the text position information corresponding to the image to be detected comprises: inputting the image to be detected into the text feature extraction model for text feature extraction, to obtain a depth feature image corresponding to the image to be detected; adding text candidate boxes to the depth feature image, and determining a corresponding depth feature sequence from the depth feature information corresponding to the text candidate boxes in a same row; and inputting the depth feature sequence into the text position detection model for text position prediction, to obtain predicted text position information corresponding to the image to be detected.
- The computer-readable storage medium according to claim 15, wherein the step, implemented by the processor, of performing sensitive information masking processing on each slice image to obtain at least one target image comprises: performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information; performing masking processing on the slice images that contain sensitive information, to obtain masked slice images; and determining the plurality of target images from the masked slice images and the slice images that do not contain sensitive information.
- The computer-readable storage medium according to claim 18, wherein before the step, implemented by the processor, of performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information, the following is further implemented: determining a bill type corresponding to the image to be detected; and determining a sensitive information region in the image to be detected based on the bill type and a preset correspondence between bill types and sensitive position regions; performing sensitive information inspection on the plurality of slice images to determine whether each slice image contains sensitive information comprises: determining whether each slice image contains the sensitive information region; and if a slice image contains the sensitive information region, determining that the slice image contains sensitive information; and the step, implemented by the processor, of performing masking processing on the slice images that contain sensitive information comprises: determining a keyword group corresponding to the sensitive information region in the slice image, and replacing the keyword group with a preset character identifier.
- The computer-readable storage medium according to any one of claims 15-19, wherein after the step, implemented by the processor, of distributing each target image to the corresponding target terminal, the following is further implemented: receiving the text recognition result sent by the target terminal, the text recognition result including a text recognition label corresponding to the target image; and iteratively training the optical character recognition model with the target image and the text recognition label until the optical character recognition model converges.
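For orientation, the detection flow recited in claims 3, 11, and 17 (feature extraction, per-row text candidate boxes, a depth feature sequence fed to a position predictor) resembles a CTPN-style detector. Below is a hedged PyTorch sketch; all layer sizes and the anchor parameterization are illustrative assumptions, not the claimed model:

```python
import torch
import torch.nn as nn

class TextPositionDetector(nn.Module):
    """CTPN-style sketch: a CNN produces the depth feature image, each
    feature-map row becomes a depth feature sequence, and a recurrent
    head predicts text positions per candidate (anchor) box."""
    def __init__(self, num_anchors: int = 10):
        super().__init__()
        # text feature extraction model -> depth feature image
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # text position detection model over per-row feature sequences
        self.rnn = nn.GRU(128, 64, bidirectional=True, batch_first=True)
        self.head = nn.Linear(128, num_anchors * 2)  # y-center / height per anchor

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(image)                            # (B, C, H, W)
        b, c, h, w = feat.shape
        rows = feat.permute(0, 2, 3, 1).reshape(b * h, w, c)   # one sequence per row
        seq, _ = self.rnn(rows)                                # depth feature sequence
        return self.head(seq).reshape(b, h, w, -1)             # predicted positions
```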
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110217730.7A CN112966583A (zh) | 2021-02-26 | 2021-02-26 | 图像处理方法、装置、计算机设备和存储介质 |
CN202110217730.7 | 2021-02-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022179138A1 (zh) | 2022-09-01
Family
ID=76275753
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/125266 WO2022179138A1 (zh) | 2021-02-26 | 2021-10-21 | 图像处理方法、装置、计算机设备和存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112966583A (zh) |
WO (1) | WO2022179138A1 (zh) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114564141A (zh) * | 2020-11-27 | 2022-05-31 | 华为技术有限公司 | 文本提取方法及装置 |
CN112966583A (zh) * | 2021-02-26 | 2021-06-15 | 深圳壹账通智能科技有限公司 | 图像处理方法、装置、计算机设备和存储介质 |
CN113763203A (zh) * | 2021-08-10 | 2021-12-07 | 国网湖北省电力有限公司检修公司 | 变电站智慧安监系统及现场作业安全管控方法 |
CN113705449A (zh) * | 2021-08-27 | 2021-11-26 | 上海商汤临港智能科技有限公司 | 标识识别方法及相关装置 |
CN113723420B (zh) * | 2021-09-03 | 2024-07-02 | 安徽淘云科技股份有限公司 | 一种扫描方法及其相关设备 |
CN114091699A (zh) * | 2021-11-18 | 2022-02-25 | 广东电网有限责任公司 | 一种电力通信设备故障诊断方法及系统 |
CN114173190B (zh) * | 2021-11-22 | 2024-05-03 | 闪捷信息科技有限公司 | 视频数据检测方法、装置、电子设备和存储介质 |
CN114826734B (zh) * | 2022-04-25 | 2024-10-01 | 维沃移动通信有限公司 | 文字识别方法、装置和电子设备 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108446621A (zh) * | 2018-03-14 | 2018-08-24 | 平安科技(深圳)有限公司 | 票据识别方法、服务器及计算机可读存储介质 |
CN108764226B (zh) * | 2018-04-13 | 2022-05-03 | 顺丰科技有限公司 | 图像文本识别方法、装置、设备及其存储介质 |
CN109919014B (zh) * | 2019-01-28 | 2023-11-03 | 平安科技(深圳)有限公司 | Ocr识别方法及其电子设备 |
CN110555372A (zh) * | 2019-07-22 | 2019-12-10 | 深圳壹账通智能科技有限公司 | 数据录入方法、装置、设备及存储介质 |
CN110647829A (zh) * | 2019-09-12 | 2020-01-03 | 全球能源互联网研究院有限公司 | 一种票据的文本识别方法及系统 |
2021
- 2021-02-26: CN application CN202110217730.7A filed; published as CN112966583A (status: pending)
- 2021-10-21: PCT application PCT/CN2021/125266 filed; published as WO2022179138A1 (application filing)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104484612A (zh) * | 2014-11-19 | 2015-04-01 | 中电长城(长沙)信息技术有限公司 | 一种用于远程桌面应用中的敏感信息屏蔽方法及系统 |
CN105528604A (zh) * | 2016-01-31 | 2016-04-27 | 华南理工大学 | 一种基于ocr的票据自动识别与处理系统 |
CN109344914A (zh) * | 2018-10-31 | 2019-02-15 | 焦点科技股份有限公司 | 一种端到端的不定长文字识别的方法和系统 |
CN112381038A (zh) * | 2020-11-26 | 2021-02-19 | 中国船舶工业系统工程研究院 | 一种基于图像的文本识别方法、系统和介质 |
CN112966583A (zh) * | 2021-02-26 | 2021-06-15 | 深圳壹账通智能科技有限公司 | 图像处理方法、装置、计算机设备和存储介质 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116939292A (zh) * | 2023-09-15 | 2023-10-24 | 天津市北海通信技术有限公司 | 轨道交通环境下的视频文本内容监测方法及系统 |
CN116939292B (zh) * | 2023-09-15 | 2023-11-24 | 天津市北海通信技术有限公司 | 轨道交通环境下的视频文本内容监测方法及系统 |
CN117727829A (zh) * | 2023-10-31 | 2024-03-19 | 北京城建集团有限责任公司 | 一种光电幕墙的制备方法及系统 |
Also Published As
Publication number | Publication date |
---|---|
CN112966583A (zh) | 2021-06-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022179138A1 (zh) | 图像处理方法、装置、计算机设备和存储介质 | |
US20210295114A1 (en) | Method and apparatus for extracting structured data from image, and device | |
US10140709B2 (en) | Automatic detection and semantic description of lesions using a convolutional neural network | |
US20230004604A1 (en) | Ai-augmented auditing platform including techniques for automated document processing | |
US20210142222A1 (en) | Automated data and label creation for supervised machine learning regression testing | |
WO2023056723A1 (zh) | 故障诊断的方法、装置、电子设备及存储介质 | |
US20200125695A1 (en) | Detecting hotspots in physical design layout patterns utilizing hotspot detection model with data augmentation | |
JP7364709B2 (ja) | 機械学習および自然言語処理を利用したワクチン接種データの抽出および確認 | |
WO2021196935A1 (zh) | 数据校验方法、装置、电子设备和存储介质 | |
CN112632268B (zh) | 投诉工单检测处理方法、装置、计算机设备及存储介质 | |
WO2019056496A1 (zh) | 图片复审概率区间生成方法及图片复审判定方法 | |
CN113728321A (zh) | 利用训练表的集合来准确预测各种表内的错误 | |
CN112559526A (zh) | 数据表导出方法、装置、计算机设备及存储介质 | |
US8787681B1 (en) | System and method for classifying documents | |
JP2024527831A (ja) | 画像マッチングの画像処理を行うためのシステム及び方法 | |
CN113434542B (zh) | 数据关系识别方法、装置、电子设备及存储介质 | |
CN111738290B (zh) | 图像检测方法、模型构建和训练方法、装置、设备和介质 | |
CN113591881A (zh) | 基于模型融合的意图识别方法、装置、电子设备及介质 | |
CN111639903A (zh) | 一种针对架构变更的评审处理方法及相关设备 | |
US11545253B2 (en) | Systems and methods to process electronic images to categorize intra-slide specimen tissue type | |
CN115809466A (zh) | 基于stride模型的安全需求生成方法、装置、电子设备及介质 | |
WO2019019456A1 (zh) | 理赔数据处理方法、装置、计算机设备和存储介质 | |
US20210342530A1 (en) | Framework for Managing Natural Language Processing Tools | |
CN117859122A (zh) | 包括用于自动化文档处理的技术的ai增强的审计平台 | |
KR102205810B1 (ko) | 인공지능 학습데이터 생성을 위한 크라우드소싱 기반 프로젝트의 재작업 결과의 자동 반려 방법 |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21927562; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.11.2023) |
122 | Ep: pct application non-entry in european phase | Ref document number: 21927562; Country of ref document: EP; Kind code of ref document: A1 |