WO2022121218A1 - Intelligent image recognition method and apparatus, computer device, and storage medium - Google Patents

Intelligent image recognition method and apparatus, computer device, and storage medium

Info

Publication number
WO2022121218A1
Authority
WO
WIPO (PCT)
Prior art keywords
character
image
pixel
text information
feature
Prior art date
Application number
PCT/CN2021/090576
Other languages
English (en)
French (fr)
Inventor
林婉娜
罗旭志
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022121218A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 - Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Definitions

  • The present application relates to the technical field of artificial intelligence and to the application scenario of intelligent image recognition in smart cities, and in particular to an intelligent image recognition method, apparatus, computer device, and storage medium.
  • Images can be recognized with OCR technology to obtain the corresponding text information.
  • In traditional approaches, the user uploads the image to be recognized to a management server through the client; the management server completes the recognition and feeds the corresponding text information back to the client.
  • The inventor found that, in many cases, the user only needs to recognize part of the image to be recognized. Although the traditional approach ensures recognition accuracy, the complete image to be recognized occupies a large amount of storage space.
  • If the transmission is unstable, transmitting the image to the management server takes a long time, so the user must wait a long time to obtain the recognized text information, which reduces image recognition efficiency.
  • Alternatively, recognition can be performed locally on the client, avoiding image transmission. Local recognition of digits and letters is accurate, but Chinese text contains a very large number of characters, and the same character written in different fonts corresponds to multiple matching templates. The resulting template library for Chinese text recognition is enormous, which hurts both the efficiency and the accuracy of recognizing Chinese text in images. Existing image recognition methods therefore suffer from low recognition efficiency and accuracy.
  • the embodiments of the present application provide an intelligent image recognition method, device, computer equipment, and storage medium, which aim to solve the problems of low recognition efficiency and accuracy in the image recognition methods in the prior art.
  • an intelligent image recognition method which includes:
  • the target pixel set includes a plurality of target pixels;
  • a plurality of the target pixels included in the target pixel set are divided to obtain a character image including a single character;
  • Image text information matching the intercepted image is obtained by integrating the first text information and the second text information.
  • an intelligent image recognition device which includes:
  • a captured image acquisition unit configured to perform real-time monitoring on the display interface of the client, so as to acquire a captured image obtained by a user performing a screenshot operation on the display interface through real-time monitoring;
  • a target pixel obtaining unit configured to obtain a target pixel set corresponding to the intercepted image, wherein the target pixel set includes a plurality of target pixels;
  • a character image acquisition unit configured to segment a plurality of the target pixels contained in the target pixel set according to the position information of the target pixels to obtain a character image containing a single character;
  • a character feature value acquisition unit configured to digitize the character pixels in each of the character images to obtain character feature values corresponding to each of the character images;
  • a first text information acquisition unit configured to identify the character feature value of each of the character images according to a preset character matching library to obtain the first text information;
  • the second text information acquisition unit is configured to judge whether there is an unrecognized character image according to the first text information, and if there is, send the character feature value matching the unrecognized character image to the management server, to obtain the second text information fed back by the management server;
  • a text information integration unit configured to integrate the first text information and the second text information to obtain image text information matching the intercepted image.
  • An embodiment of the present application further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein when the processor executes the computer program, the intelligent image recognition method described in the first aspect above is implemented.
  • An embodiment of the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when executed by a processor, the computer program causes the processor to execute the intelligent image recognition method described in the first aspect above.
  • Embodiments of the present application provide an intelligent image recognition method, apparatus, computer device, and storage medium.
  • The character feature values are recognized against the local character matching library to obtain the first text information; the character feature values of unrecognized character images are sent to the management server for remote recognition to obtain the second text information; and the two parts of text information are integrated to obtain the image text information.
  • Simple characters in the character images are recognized locally through the local character matching library without occupying a large amount of client storage space, while the character feature values of unrecognized character images are transmitted to the management server for remote recognition. In this way, fast and accurate recognition of the captured image is achieved, giving the method high recognition efficiency and high recognition accuracy.
  • FIG. 1 is a schematic flowchart of an intelligent image recognition method provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application scenario of an intelligent image recognition method provided by an embodiment of the present application
  • FIG. 3 is a schematic sub-flow diagram of the intelligent image recognition method provided by the embodiment of the present application.
  • FIG. 4 is a schematic diagram of another sub-flow of the intelligent image recognition method provided by the embodiment of the present application.
  • FIG. 5 is a schematic diagram of another sub-flow of the intelligent image recognition method provided by the embodiment of the present application.
  • FIG. 6 is a schematic diagram of another sub-flow of the intelligent image recognition method provided by the embodiment of the present application.
  • FIG. 7 is another schematic flowchart of the intelligent image recognition method provided by the embodiment of the present application.
  • FIG. 8 is another schematic flowchart of the intelligent image recognition method provided by the embodiment of the present application.
  • FIG. 9 is a schematic block diagram of an intelligent image recognition device provided by an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of a computer device provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of the intelligent image recognition method provided by the embodiment of the present application
  • FIG. 2 is a schematic diagram of an application scenario of the intelligent image recognition method provided by the embodiment of the present application.
  • The intelligent image recognition method is applied in the client 10 and is executed by application software installed in the client 10; a network connection is established between the client 10 and the management server 20 to transmit data information, and the client 10 is used to perform intelligent image recognition.
  • The client 10 is a terminal device capable of intelligent image recognition, such as a desktop computer, notebook computer, tablet computer, or mobile phone; the management server 20 is an enterprise server.
  • FIG. 2 only illustrates information transmission between one client 10 and the management server 20; in practical applications, the management server 20 can also transmit information with multiple clients 10 simultaneously.
  • the method includes steps S110-S170.
  • S110 Perform real-time monitoring on the display interface of the client, so as to obtain a captured image obtained by a user performing a screenshot operation on the display interface through real-time monitoring.
  • Real-time monitoring is performed on the display interface of the client, so as to obtain a captured image obtained by a user performing a screenshot operation on the display interface through the real-time monitoring.
  • the user is the user of the client terminal, the client terminal includes a display screen, and the display interface is the content displayed on the display screen.
  • the user can perform a screenshot operation based on the display interface, and the captured image is part of the display content included in the display interface.
  • The captured image can be a rectangle, circle, ellipse, or any other shape.
  • The user clicks the screenshot button in the display interface and selects a capture template of a specific shape; the display content selected by the template is the captured image that is obtained.
  • A target pixel set corresponding to the captured image is acquired, wherein the target pixel set includes multiple target pixels.
  • the image processing rule is the rule information for processing the intercepted image.
  • the target pixel set composed of target pixels can be obtained from the intercepted image according to the image processing rules.
  • the target pixel corresponds to the text information to be recognized in the intercepted image.
  • the corresponding text information can be obtained by further identifying the target pixels contained in the target pixel set.
  • the image processing rule includes a contrast threshold
  • The captured image contains a number of pixels, and each pixel has a corresponding pixel value in the captured image that represents its color information. If the captured image is a color image, each pixel corresponds to a pixel value on each of the three RGB color channels: red (R), green (G), and blue (B). If the captured image is a grayscale image, each pixel corresponds to a single pixel value on the grayscale channel, represented by a non-negative integer in the range [0, 255].
  • each pixel in the intercepted image can be screened through image processing rules to obtain the corresponding target pixels in the intercepted image, and all target pixels corresponding to the intercepted image are combined into a target pixel set.
  • step S120 includes sub-steps S121 , S122 , S123 and S124 .
  • Grayscale processing can be performed on the captured image to obtain a corresponding grayscale image; if the captured image is itself a grayscale image, no grayscale processing is required, and the captured image is used directly as the grayscale image. The grayscale value of each pixel in the grayscale image is obtained, and the average of the grayscale values of all pixels is taken as the pixel grayscale average of the grayscale image.
  • the grayscale difference between each pixel and the average grayscale value of the pixel can be calculated according to the grayscale value of each pixel.
  • The formula X_i = |F_i - F_v| can be used to calculate the grayscale difference X_i between the grayscale value F_i of the i-th pixel and the pixel grayscale average F_v. Whether the grayscale difference of each pixel is less than the contrast threshold is then determined, and the pixels whose grayscale difference is not less than the contrast threshold are obtained from the judgment result. If a pixel's grayscale difference is less than the contrast threshold, the pixel is judged to be a background pixel of the grayscale image.
  • The pixels contained in pixel blocks formed by multiple connected pixels in the image are taken as target pixels. Specifically, each pixel whose grayscale difference is not less than the contrast threshold is checked for isolation, that is, whether it is connected to any other pixel whose grayscale difference is not less than the contrast threshold. If it is connected, the pixel is not isolated; if it is not connected, the pixel is isolated. The target pixels are obtained by removing the isolated pixels.
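The target-pixel extraction described above (grayscale averaging, contrast thresholding, and isolated-pixel removal) can be sketched as follows. The 4-connectivity neighbour test and the example threshold value are illustrative assumptions, not details taken from the patent text.

```python
# Sketch of target-pixel extraction (steps S121-S124).
import numpy as np

def extract_target_pixels(gray, contrast_threshold=100):
    """Return the set of (row, col) target pixels of a grayscale image."""
    gray = np.asarray(gray, dtype=float)
    mean = gray.mean()                      # pixel grayscale average F_v
    diff = np.abs(gray - mean)              # grayscale difference X_i
    mask = diff >= contrast_threshold       # drop low-contrast background pixels

    # Remove isolated pixels: keep a pixel only if at least one of its
    # 4-connected neighbours also passed the contrast test.
    padded = np.pad(mask, 1, constant_values=False)
    neighbours = (padded[:-2, 1:-1] | padded[2:, 1:-1] |
                  padded[1:-1, :-2] | padded[1:-1, 2:])
    mask &= neighbours
    return {(r, c) for r, c in zip(*np.nonzero(mask))}
```

In this toy setting, a connected 2x2 block of bright pixels survives while a single bright pixel with no bright neighbour is removed as isolated.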
  • a plurality of the target pixels included in the target pixel set are divided to obtain a character image including a single character.
  • A segmented image containing a single character can be obtained by segmenting the target pixel set, and the segmented image can be adjusted according to the image adjustment rules to obtain a character image containing only a single character. That is, according to the number of characters contained in the target pixel set, the same number of character images is obtained.
  • step S130 includes sub-steps S131 , S132 , S133 and S134 .
  • The target pixels in the target pixel set can be binarized: according to the position information of the target pixels, a black pixel is filled at each target pixel position and white pixels are filled elsewhere. The resulting binarized image contains only the two colors black and white, where the position information of a target pixel is its coordinate position in the captured image.
  • a single character can correspond to a character block containing multiple pixels.
  • A pixel block is formed by a combination of multiple connected pixels, and character blocks are obtained by merging pixel blocks based on the position information of the target pixels; each character block includes at least one pixel block, and each character block contains one character.
  • The distance between character blocks is at least 2 pixels.
  • Multiple pixel blocks can be obtained from the binarized image, and it is determined whether the distance between pixel blocks is not greater than 1 pixel. If the distance between two pixel blocks is not greater than 1 pixel, the two pixel blocks are merged to form a character block; if the distance between a pixel block and every other pixel block is greater than 1 pixel, the pixel block is treated as a separate character block.
  • Each character block contains multiple target pixels. According to the position information of the target pixels contained in a character block, a segmented image corresponding to each character block can be extracted; the segmented image is the smallest rectangular image corresponding to the character block. Specifically, a minimum rectangular boundary for each character block can be determined from the coordinate values of the outermost pixels in the block; each character block has exactly one minimum rectangular boundary, and the segmented image corresponding to the character block is extracted from the binarized image along this boundary.
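The block-merging and minimum-boundary steps above can be sketched as below. The Chebyshev distance metric and the union-find merging strategy are assumptions made for illustration; the patent only specifies the 1-pixel merge distance and the minimum rectangular boundary.

```python
# Sketch of character-block segmentation (steps S131-S133): pixel blocks whose
# distance is at most 1 pixel are merged into character blocks, then the
# minimum rectangular boundary of each character block is computed.

def segment_character_blocks(pixel_blocks):
    """pixel_blocks: list of sets of (row, col) pixels.
    Returns one (min_row, min_col, max_row, max_col) boundary per character block."""
    n = len(pixel_blocks)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def block_distance(a, b):
        # smallest Chebyshev gap between any two pixels of the two blocks
        return min(max(abs(r1 - r2), abs(c1 - c2))
                   for r1, c1 in a for r2, c2 in b)

    # Merge every pair of pixel blocks whose distance is not greater than 1 pixel.
    for i in range(n):
        for j in range(i + 1, n):
            if block_distance(pixel_blocks[i], pixel_blocks[j]) <= 1:
                parent[find(j)] = find(i)

    groups = {}
    for i, blk in enumerate(pixel_blocks):
        groups.setdefault(find(i), set()).update(blk)

    # Minimum rectangular boundary from the outermost pixel coordinates.
    return sorted((min(r for r, _ in g), min(c for _, c in g),
                   max(r for r, _ in g), max(c for _, c in g))
                  for g in groups.values())
```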
  • the divided images are adjusted according to preset image adjustment rules to obtain character images corresponding to each of the divided images.
  • The segmented images can be adjusted to obtain a character image corresponding to each segmented image; the image adjustment rules include one or more of enlargement, reduction, and rotation, and the adjusted character image is an image that satisfies the image adjustment rules.
  • S140 digitize the character pixels in each of the character images to obtain character feature values corresponding to each of the character images.
  • the character pixels in each of the character images are digitized to obtain character feature values corresponding to each of the character images.
  • The digitization rule is the rule information for numerically processing a character image. After a character image is digitized, the corresponding character feature value is obtained; the character feature value is feature information that quantifies the features of the character image numerically.
  • The character feature value includes a size array and coordinate arrays: the size array represents the size information of the character image, and each coordinate array represents the coordinates of one character pixel in the character image.
  • step S140 includes sub-steps S141 , S142 and S143 , that is, the specific process of digitizing a character image according to the digitizing rule includes the following three steps.
  • the resulting size array contains a set of numerical values
  • the resulting coordinate array contains a set of numerical values
  • the number of coordinate arrays is equal to the number of character pixels contained in the character image.
  • For example, if the size information of a character image is 30 pixels long and 18 pixels wide, the size array corresponding to the character image is {30, 18}; if a character pixel in the character image is located at row 10, column 5, the coordinate array corresponding to that character pixel is {10, 5}.
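The digitization rule just described can be sketched as follows. The 1-based row/column indexing follows the worked example above; the 0/1 input representation is an assumption for illustration.

```python
# Sketch of step S140: a character image is reduced to a size array plus one
# coordinate array per character pixel.

def digitize(char_image):
    """char_image: 2D list of 0/1 values, 1 marking a character pixel.
    Returns (size_array, coordinate_arrays)."""
    height, width = len(char_image), len(char_image[0])
    size_array = [height, width]
    # one coordinate array {row, column} per character pixel, 1-based
    coordinate_arrays = [[r + 1, c + 1]
                         for r, row in enumerate(char_image)
                         for c, v in enumerate(row) if v]
    return size_array, coordinate_arrays
```

As stated above, the number of coordinate arrays equals the number of character pixels in the character image.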
  • The character matching library includes one or more pieces of sample feature information for each sample character; the same sample character written in a variety of different fonts can correspond to multiple pieces of sample feature information.
  • the first text information can be text information composed of Arabic numerals and English letters.
  • the characters contained in the character image can be Chinese characters, Arabic numerals or English letters; the character matching library includes sample feature information that matches numbers and letters.
  • The character matching library is used to recognize the character images, so that character images containing digits or letters are quickly recognized to obtain the first text information; the recognition of character images containing digits or letters is performed locally on the client.
  • the first text information includes characters that match numbers or letters and character codes that match each character, and a character code uniquely corresponds to a character feature value of a character image.
  • the matching rule includes a size threshold, a pixel density calculation formula, and a density threshold.
  • step S150 includes sub-steps S151 , S152 , S153 , S154 , S155 , S156 and S157 .
  • S153 Determine whether the difference between the first pixel density and each of the second pixel densities is less than a preset density threshold, so as to obtain sample feature information whose difference is less than the density threshold to obtain a candidate feature set.
  • The size threshold is the threshold information used to judge whether the size ratio of a character feature value matches the size ratio of a piece of sample feature information.
  • The size ratio can be calculated from the values of the size array. If the difference between the size ratio of the sample feature information and the size ratio of the character feature value is not greater than the size threshold, the two match; otherwise they do not match.
  • For example, if the size array of a character feature value is {30, 18}, the size array of a piece of sample feature information in the character matching library is {25, 13}, and the size threshold is 0.25: the size ratio of the character feature value is 30/18 ≈ 1.6667 and the size ratio of the sample feature information is 25/13 ≈ 1.9231. The difference between the two size ratios is about 0.2564, which is greater than the size threshold, so the sample feature information does not match the character feature value.
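The size-ratio check in the example above can be reproduced as a few lines of code. Rounding to four decimal places mirrors the worked numbers; whether the patent rounds before comparing is an assumption.

```python
# Sketch of the size-ratio pre-filter (step S151).

def size_ratio_matches(char_size, sample_size, size_threshold):
    char_ratio = round(char_size[0] / char_size[1], 4)        # e.g. 30/18 -> 1.6667
    sample_ratio = round(sample_size[0] / sample_size[1], 4)  # e.g. 25/13 -> 1.9231
    return abs(char_ratio - sample_ratio) <= size_threshold
```

With the example values, {30, 18} against {25, 13} differs by about 0.2564, exceeding the 0.25 threshold, so the pair is rejected.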
  • The pixel density calculation formula is used to calculate the pixel density corresponding to a character feature value or a piece of sample feature information. A large pixel density indicates that the corresponding character image contains more character pixels per unit area; a small one indicates fewer character pixels per unit area. The density threshold is the threshold information used to judge whether the pixel density of the character feature value matches the pixel density of the sample feature information: if the difference between the two pixel densities is not greater than the density threshold, the two match; otherwise they do not match.
  • Taking a character feature value as an example, the pixel density is calculated by dividing the number of coordinate arrays contained in the character feature value by the product of the values in its size array.
  • The pixel density calculation formula can be expressed as J = T / (C1 × C2), where J is the pixel density corresponding to the character feature value, T is the number of coordinate arrays in the character feature value, and C1 and C2 are the first and second values of the size array in the character feature value.
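The pixel-density formula J = T / (C1 × C2) above translates directly into code. The dict-based feature representation is an assumption for illustration.

```python
# Pixel density J = T / (C1 * C2): T coordinate arrays over a C1 x C2 image.

def pixel_density(feature):
    """feature: dict with a 'size' array [C1, C2] and a list of 'coords' arrays."""
    c1, c2 = feature["size"]
    t = len(feature["coords"])   # T: number of coordinate arrays (character pixels)
    return t / (c1 * c2)         # J: character pixels per unit area
```

For instance, a 30 × 18 character image with 108 character pixels has density 108 / 540 = 0.2.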
  • After the sample feature information matching each character image is obtained, it can be determined whether the number of matched sample feature information items for each character image is greater than zero. Because Chinese characters differ greatly from Arabic numerals and English letters, and the character matching library contains only the sample feature information of digits and letters, a count greater than zero indicates that the character image contains a digit or letter. A count not greater than zero indicates that the character image does not contain a digit or letter; such a character image is treated as an unrecognized character image, the next character image is acquired, and processing returns to step S151.
  • S155: If the number of sample feature information items contained in the candidate feature set is greater than zero, calculate the matching degree between the character feature value and each sample feature information item in the candidate feature set. S156: Obtain the sample character corresponding to the sample feature information with the highest matching degree in the candidate feature set as the target character matched with the character feature value. S157: Use the target characters obtained by recognizing each of the character images as the first text information.
  • Each value of each coordinate array in the character feature value is divided by the corresponding value of the size array of the character feature value to obtain a vector array corresponding to each coordinate array.
  • For example, if the size array of a character feature value is {30, 18} and one of its coordinate arrays is {10, 5}, the vector array corresponding to that coordinate array is {10/30, 5/18}, that is, {0.3333, 0.2778}.
  • step S1510 is further included before step S150 .
  • A corresponding character matching library can also be generated in advance from a preset sample character set.
  • The sample character set includes multiple sample characters, and each sample character corresponds to at least one sample image in the sample character set. Each sample image is recognized and processed to obtain the sample feature information corresponding to that sample image, so in the generated character matching library each sample character corresponds to one or more pieces of sample feature information.
  • the specific process of identifying a sample image includes:
  • S160 Determine whether there is an unrecognized character image according to the first text information, and if so, send a character feature value matching the unrecognized character image to the management server to obtain feedback from the management server the second text message.
  • the first text information contains characters corresponding to numbers or letters.
  • the number of characters in the first text information can be obtained and it can be determined whether it is equal to the number of character images. If they are equal, it means that all character images have been recognized.
  • Each character corresponds to a character code, and the order of the character codes can be obtained from the position of each character image in the captured image; the characters contained in the first text information are sorted according to the order of the character codes, so as to obtain the image text information matching the captured image.
  • The Chinese text contained in the character images is recognized to obtain the second text information, and the management server feeds the obtained second text information back to the client; the second text information contains the characters matching the Chinese text and a character code matching each character.
  • No image transmission is involved; only the character feature values corresponding to the unrecognized character images are transmitted. The amount of transmitted data is therefore greatly reduced, the transmission efficiency of the character feature values is improved, and the time a user waits for the recognition result is greatly shortened. Because the number of Chinese characters is huge, fast recognition on the client is difficult, and the management server can be used for rapid recognition to improve recognition efficiency and accuracy.
  • The specific method of recognizing the Chinese characters contained in a character image is the same as the method of recognizing character images to obtain the first text information; the difference is that recognizing the Chinese characters contained in a character image requires a Chinese character matching library.
  • each character in the first text information and the second text information corresponds to a character code
  • the character code is identification information that uniquely corresponds to the character feature value of each character image.
  • The order of the character codes corresponds to the positions at which the character images appear in the captured image.
  • The order of the character codes can be obtained from the position of each character image in the captured image, and the characters contained in the first text information and the second text information are sorted according to the order of the character codes.
  • the image text information may include numbers, letters and characters corresponding to Chinese texts.
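The integration step (S170) described above can be sketched as follows. Representing each recognized character as a (character code, character) pair is an assumption made for illustration; the patent only states that characters are sorted by the character codes, which follow the character images' positions in the captured image.

```python
# Sketch of step S170: merge the first and second text information and order
# the characters by their character codes.

def integrate_text(first_text, second_text):
    """first_text/second_text: lists of (character_code, character) pairs."""
    merged = dict(first_text)
    merged.update(second_text)
    # sort by character code, i.e. by position in the captured image
    return "".join(ch for _, ch in sorted(merged.items()))
```

For example, locally recognized digits/letters and server-recognized Chinese characters interleave back into their on-screen order.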
  • step S1701 is further included after step S170 .
  • the obtained image text information can be correspondingly displayed on the side of the intercepted image, and the user can conveniently and quickly operate the image text information in the client.
  • The user can copy the image text information displayed in the display interface, or extract part of the information from the image text information for use.
  • The technical methods in this application can be applied to application scenarios involving intelligent image recognition, such as smart government affairs, smart city management, smart communities, smart security, smart logistics, smart medical care, smart education, smart environmental protection, and smart transportation, so as to promote the construction of smart cities.
  • a captured image obtained by a user performing a screenshot operation is obtained, a corresponding target pixel set is obtained, a character image is obtained from the target pixel set by segmentation, and the corresponding character feature value is obtained by digitizing it.
  • The character feature values are identified according to the local character matching library to obtain the first text information; the character feature values of unrecognized character images are sent to the management server for remote recognition to obtain the second text information; and the two parts of text information are integrated to obtain the image text information.
  • Simple characters in the character images are recognized locally through the local character matching library without occupying a large amount of client storage space, while the character feature values of unrecognized character images are transmitted to the management server for remote recognition. In this way, fast and accurate recognition of the captured image is achieved, giving the method high recognition efficiency and high recognition accuracy.
  • the embodiment of the present application further provides an intelligent image recognition device, and the intelligent image recognition device is used for executing any one of the foregoing intelligent image recognition methods.
  • FIG. 9 is a schematic block diagram of an intelligent image recognition apparatus provided by an embodiment of the present application.
  • the intelligent image recognition device can be configured in the client 10 .
  • the intelligent image recognition device 100 includes a captured image acquisition unit 110, a target pixel acquisition unit 120, a character image acquisition unit 130, a character feature value acquisition unit 140, a first text information acquisition unit 150, a second text information acquisition unit 160 and a text information integration unit 170.
  • the captured image obtaining unit 110 is configured to monitor the display interface of the client in real time, so as to obtain a captured image obtained by a user performing a screenshot operation on the display interface through real-time monitoring.
  • the target pixel obtaining unit 120 is configured to obtain a target pixel set corresponding to the intercepted image, wherein the target pixel set includes a plurality of target pixels.
  • the target pixel acquisition unit 120 includes subunits: a grayscale average value acquisition unit, a grayscale difference value calculation unit, a grayscale difference value judgment unit, and an isolated pixel removal unit.
  • the grayscale average acquisition unit is used to perform grayscale processing on the captured image to obtain a corresponding grayscale image and calculate the average pixel gray value; the gray difference calculation unit is used to calculate the gray difference between the gray value of each pixel in the grayscale image and the average pixel gray value; the gray difference judgment unit is used to judge whether the gray difference of each pixel is smaller than a preset contrast threshold, so as to obtain the pixels whose gray difference is not smaller than the contrast threshold; the isolated pixel removal unit is used to judge whether each pixel whose gray difference is not smaller than the contrast threshold is isolated, so as to remove the isolated pixels and obtain the target pixels.
  • the character image obtaining unit 130 is configured to segment a plurality of the target pixels included in the target pixel set according to the position information of the target pixels, so as to obtain a character image including a single character.
  • the character image acquisition unit 130 includes subunits: a binarized image acquisition unit, a character block acquisition unit, a segmented image extraction unit, and an image adjustment unit.
  • a binarized image acquisition unit is configured to binarize the target pixels according to the position information of each target pixel in the target pixel set to obtain a binarized image corresponding to the target pixel set; a character block acquisition unit is used to integrate the pixel blocks contained in the binarized image to obtain character blocks, wherein each character block contains one character; a segmented image extraction unit is used to extract, according to the target pixels contained in the character blocks, a segmented image corresponding to each character block; an image adjustment unit is used to adjust the segmented images according to a preset image adjustment rule to obtain a character image corresponding to each segmented image.
  • the character feature value obtaining unit 140 is configured to digitize the character pixels in each of the character images to obtain character feature values corresponding to each of the character images.
  • the character feature value obtaining unit 140 includes subunits: a size array obtaining unit, a coordinate array obtaining unit, and an array combining unit.
  • a size array acquisition unit is used to obtain the size information of a character image and generate a size array corresponding to the size information according to a preset digitization rule; a coordinate array acquisition unit is used to obtain the coordinate positions of all character pixels in the character image and generate a coordinate array corresponding to each character pixel according to the digitization rule and the coordinate positions; an array combination unit is used to combine the size array with all the coordinate arrays as the character feature value corresponding to the character image.
  • the first text information obtaining unit 150 is configured to identify the character feature value of each of the character images according to a preset character matching library to obtain the first text information.
  • the first text information obtaining unit 150 includes subunits: a first feature set obtaining unit, a pixel density obtaining unit, an alternative feature set obtaining unit, a quantity judging unit, a matching degree calculating unit, and a target character determination unit. unit and text information acquisition unit.
  • a first feature set acquisition unit is configured to obtain, according to a preset size threshold, the sample feature information in the character matching library whose size ratio matches the size ratio of each character feature value, to obtain a first feature set; a pixel density acquisition unit is used to calculate, according to a preset pixel density calculation formula, the first pixel density of the character feature value and the second pixel density of each piece of sample feature information in the first feature set; a candidate feature set acquisition unit is used to judge whether the difference between the first pixel density and each second pixel density is smaller than a preset density threshold, to obtain the sample feature information whose difference is smaller than the density threshold and thereby obtain a candidate feature set; a quantity judgment unit is used to judge whether the number of pieces of sample feature information contained in the candidate feature set is greater than zero; a matching degree calculation unit is used to calculate, if that number is greater than zero, the matching degree between the character feature value and each piece of sample feature information in the candidate feature set; a target character determination unit is used to obtain the sample character corresponding to the piece of sample feature information with the highest matching degree in the candidate feature set as the target character matching the character feature value; and a text information acquisition unit is used to take the target characters obtained by recognizing each character image as the first text information.
  • the intelligent image recognition device further includes: a character matching library generating unit.
  • the character matching library generating unit is configured to perform identification processing on a preset sample character set according to the numerical rule to generate the character matching library.
  • the second text information acquisition unit 160 is configured to judge, according to the first text information, whether there is an unrecognized character image, and if so, send the character feature values matching the unrecognized character images to the management server, so as to obtain the second text information fed back by the management server.
  • the text information integration unit 170 is configured to integrate the first text information and the second text information to obtain image text information matching the intercepted image.
  • the intelligent image recognition device further includes: an image text information display unit.
  • An image text information display unit configured to display the image text information in an area adjacent to the captured image in the display interface.
  • the above-mentioned intelligent image recognition method is applied in the intelligent image recognition apparatus provided in the embodiment of the present application: a captured image produced by a user's screenshot operation is obtained and the corresponding target pixel set is acquired; character images are segmented from the target pixel set and digitized to obtain corresponding character feature values; the character feature values are recognized against the local character matching library to obtain the first text information; the character feature values of unrecognized character images are sent to the management server for remote recognition to obtain the second text information; and the two parts of text information are integrated to obtain the image text information.
  • the simple characters in the character images are recognized locally through the local character matching library, without occupying a large amount of storage space on the client, and the character feature values of unrecognized character images are transmitted to the management server for remote recognition, so as to achieve fast and accurate recognition of the captured image, with high recognition efficiency and high recognition accuracy.
  • the above-mentioned intelligent image recognition apparatus can be implemented in the form of a computer program, and the computer program can be executed on a computer device as shown in FIG. 10 .
  • FIG. 10 is a schematic block diagram of a computer device provided by an embodiment of the present application.
  • the computer device may be a client 10 for executing an intelligent image recognition method to intelligently recognize an image.
  • the computer device 500 includes a processor 502 , a memory and a network interface 505 connected by a system bus 501 , wherein the memory may include a non-volatile storage medium 503 and an internal memory 504 .
  • the nonvolatile storage medium 503 can store an operating system 5031 and a computer program 5032 .
  • the computer program 5032 when executed, can cause the processor 502 to execute the intelligent image recognition method.
  • the processor 502 is used to provide computing and control capabilities to support the operation of the entire computer device 500 .
  • the internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, the processor 502 can execute the intelligent image recognition method.
  • the network interface 505 is used for network communication, such as providing transmission of data information.
  • FIG. 10 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied.
  • the specific computer device 500 may include more or fewer components than shown, or combine certain components, or have a different arrangement of components.
  • the processor 502 is configured to run the computer program 5032 stored in the memory, so as to realize the corresponding functions in the above-mentioned intelligent image recognition method.
  • the embodiment of the computer device shown in FIG. 10 does not constitute a limitation on the specific structure of the computer device; in other embodiments, the computer device may include more or fewer components than shown, or combine certain components, or have a different arrangement of components.
  • the computer device may only include a memory and a processor.
  • the structures and functions of the memory and the processor are the same as those of the embodiment shown in FIG. 10 , which will not be repeated here.
  • the processor 502 may be a central processing unit (Central Processing Unit, CPU), and the processor 502 may also be other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), Application Specific Integrated Circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor can be a microprocessor or the processor can also be any conventional processor or the like.
  • a computer-readable storage medium may be a non-volatile computer-readable storage medium.
  • the computer-readable storage medium stores a computer program, wherein when the computer program is executed by the processor, the steps included in the above-mentioned intelligent image recognition method are implemented.
  • the disclosed devices, apparatuses and methods may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation there may be other division methods, or units with the same function may be grouped into one unit; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present application.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • the computer software product is stored in a computer-readable storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned computer-readable storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Character Discrimination (AREA)

Abstract

The present application discloses an intelligent image recognition method, apparatus, computer device and storage medium. The method includes: obtaining a captured image produced by a user's screenshot operation and obtaining the corresponding target pixel set; segmenting character images from the target pixel set and digitizing them to obtain corresponding character feature values; recognizing the character feature values against a local character matching library to obtain first text information; sending the character feature values of unrecognized character images to a management server for remote recognition to obtain second text information; and integrating the two parts of text information to obtain image text information. Based on OCR recognition technology and belonging to the field of artificial intelligence, the present application recognizes simple characters in character images locally via a local character matching library without occupying a large amount of storage space on the client, and transmits the character feature values of unrecognized character images to the management server for remote recognition, thereby achieving fast and accurate recognition of the captured image.

Description

Intelligent image recognition method, apparatus, computer device and storage medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on December 8, 2020, with application number 202011443365.3 and invention title "Intelligent image recognition method, apparatus, computer device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of artificial intelligence technology, belongs to the application scenario of intelligently recognizing images in smart cities, and in particular relates to an intelligent image recognition method, apparatus, computer device and storage medium.
Background
Recognizing an image based on OCR technology yields the corresponding text information. In traditional technical methods, the user uploads the image to be recognized to a management server through a client, and the management server completes the recognition and feeds the corresponding text information back to the client. However, the inventors found that in many cases the user may only need to recognize part of the image to be recognized. Although traditional methods can ensure recognition accuracy, the complete image to be recognized occupies a large amount of storage space; when network transmission is unstable, transmitting the image to the management server takes a long time, so the user must wait a long time to obtain the recognized text information, which affects image recognition efficiency. A technical method of local recognition on the client can also be adopted, which avoids transmitting the image to be recognized and achieves high accuracy for digits or letters. However, Chinese text contains a large number of characters, and the same character written in different fonts corresponds to multiple matching templates, so the number of matching templates required for recognizing Chinese text is enormous, which affects the efficiency and accuracy of recognizing Chinese text in images. Therefore, existing image recognition methods suffer from low recognition efficiency and accuracy.
Summary
Embodiments of the present application provide an intelligent image recognition method, apparatus, computer device and storage medium, aiming to solve the problem of low recognition efficiency and accuracy in the image recognition methods of the prior art.
In a first aspect, an embodiment of the present application provides an intelligent image recognition method, which includes:
monitoring the display interface of the client in real time, so as to obtain, through real-time monitoring, a captured image produced by the user performing a screenshot operation on the display interface;
obtaining a target pixel set corresponding to the captured image, wherein the target pixel set contains a plurality of target pixels;
segmenting the plurality of target pixels contained in the target pixel set according to position information of the target pixels, to obtain character images each containing a single character;
digitizing the character pixels in each character image to obtain a character feature value corresponding to each character image;
recognizing the character feature value of each character image according to a preset character matching library to obtain first text information;
judging, according to the first text information, whether there is an unrecognized character image, and if so, sending the character feature values matching the unrecognized character images to the management server, so as to obtain second text information fed back by the management server;
integrating the first text information and the second text information to obtain image text information matching the captured image.
In a second aspect, an embodiment of the present application provides an intelligent image recognition apparatus, which includes:
a captured image acquisition unit, configured to monitor the display interface of the client in real time, so as to obtain, through real-time monitoring, a captured image produced by the user performing a screenshot operation on the display interface;
a target pixel acquisition unit, configured to obtain a target pixel set corresponding to the captured image, wherein the target pixel set contains a plurality of target pixels;
a character image acquisition unit, configured to segment the plurality of target pixels contained in the target pixel set according to position information of the target pixels, to obtain character images each containing a single character;
a character feature value acquisition unit, configured to digitize the character pixels in each character image to obtain a character feature value corresponding to each character image;
a first text information acquisition unit, configured to recognize the character feature value of each character image according to a preset character matching library to obtain first text information;
a second text information acquisition unit, configured to judge, according to the first text information, whether there is an unrecognized character image, and if so, send the character feature values matching the unrecognized character images to the management server, so as to obtain second text information fed back by the management server;
a text information integration unit, configured to integrate the first text information and the second text information to obtain image text information matching the captured image.
In a third aspect, an embodiment of the present application further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor implements the intelligent image recognition method of the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the intelligent image recognition method of the first aspect.
Embodiments of the present application provide an intelligent image recognition method, apparatus, computer device and storage medium. A captured image produced by a user's screenshot operation is obtained and the corresponding target pixel set is acquired; character images are segmented from the target pixel set and digitized to obtain corresponding character feature values; the character feature values are recognized against the local character matching library to obtain first text information; the character feature values of unrecognized character images are sent to the management server for remote recognition to obtain second text information; and the two parts of text information are integrated to obtain the image text information. Through the above method, simple characters in character images are recognized locally via the local character matching library without occupying a large amount of storage space on the client, and the character feature values of unrecognized character images are transmitted to the management server for remote recognition, achieving fast and accurate recognition of the captured image with high recognition efficiency and high recognition accuracy.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of the intelligent image recognition method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of an application scenario of the intelligent image recognition method provided by an embodiment of the present application;
FIG. 3 is a schematic sub-flowchart of the intelligent image recognition method provided by an embodiment of the present application;
FIG. 4 is another schematic sub-flowchart of the intelligent image recognition method provided by an embodiment of the present application;
FIG. 5 is another schematic sub-flowchart of the intelligent image recognition method provided by an embodiment of the present application;
FIG. 6 is another schematic sub-flowchart of the intelligent image recognition method provided by an embodiment of the present application;
FIG. 7 is another schematic flowchart of the intelligent image recognition method provided by an embodiment of the present application;
FIG. 8 is another schematic flowchart of the intelligent image recognition method provided by an embodiment of the present application;
FIG. 9 is a schematic block diagram of the intelligent image recognition apparatus provided by an embodiment of the present application;
FIG. 10 is a schematic block diagram of the computer device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "comprise" and "include" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or combinations thereof.
It should also be understood that the terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic flowchart of the intelligent image recognition method provided by an embodiment of the present application, and FIG. 2 is a schematic diagram of an application scenario of the method. The intelligent image recognition method is applied in the client 10 and executed by application software installed in the client 10; a network connection is established between the client 10 and the management server 20 for transmitting data information. The client 10 is a terminal device, such as a desktop computer, a laptop, a tablet or a mobile phone, used to execute the intelligent image recognition method to intelligently recognize images; the management server 20 is the server end that establishes a network connection with the client 10, and may be an enterprise server set up by an enterprise. FIG. 2 only shows one client 10 transmitting information with the management server 20; in practical applications, the management server 20 may transmit information with multiple clients 10 simultaneously. As shown in FIG. 1, the method includes steps S110 to S170.
S110: Monitor the display interface of the client in real time, so as to obtain, through real-time monitoring, a captured image produced by the user performing a screenshot operation on the display interface.
The user is the operator of the client; the client includes a display screen, and the display interface is the content shown on the display screen. The user can perform a screenshot operation based on the display interface, and the captured image is the part of the display content contained in the display interface; the captured image may be rectangular, circular, elliptical or of any other shape.
For example, the user clicks the screenshot button in the display interface and selects a capture template of a specific shape, then clicks and drags the mouse to enlarge the capture template; when the mouse is released, the display content enclosed by the capture template is the resulting captured image.
S120: Obtain a target pixel set corresponding to the captured image, wherein the target pixel set contains a plurality of target pixels.
The image processing rule is the rule information for processing the captured image; a target pixel set composed of target pixels can be obtained from the captured image according to the image processing rule. The target pixels are the pixels corresponding to the text information to be recognized in the captured image, and further recognizing the target pixels contained in the target pixel set yields the corresponding text information. Specifically, the image processing rule includes a contrast threshold. The captured image contains a number of pixels, and each pixel has a corresponding pixel value, which is the color information of that pixel. If the captured image is a color image, each pixel corresponds to one pixel value on each of the three RGB color channels: red (R), green (G) and blue (B); if the captured image is a grayscale image, each pixel corresponds to one pixel value on the black channel. Pixel values are represented by non-negative integers in the range [0, 255]. Taking the black channel as an example, a pixel value of 0 means the pixel is black, a pixel value of 255 means the pixel is white, and any other value indicates a specific gray level between white and black. Each pixel in the captured image can be screened through the image processing rule to obtain the corresponding target pixels, and all target pixels corresponding to the captured image form the target pixel set.
In an embodiment, as shown in FIG. 3, step S120 includes sub-steps S121, S122, S123 and S124.
S121: Perform grayscale processing on the captured image to obtain a corresponding grayscale image, and calculate the average pixel gray value.
If the captured image is not a grayscale image, it can be grayscale-processed to obtain a corresponding grayscale image; if the captured image is already a grayscale image, no grayscale processing is needed and the captured image is used directly as the grayscale image. The gray value of each pixel in the grayscale image is obtained, and the average of the gray values of all pixels is taken as the average pixel gray value of the grayscale image.
S122: Calculate the gray difference between the gray value of each pixel in the grayscale image and the average pixel gray value. S123: Judge whether the gray difference of each pixel is smaller than a preset contrast threshold, so as to obtain the pixels whose gray difference is not smaller than the contrast threshold.
The gray difference between each pixel and the average pixel gray value can be calculated from the gray value of each pixel. Specifically, the formula X_i = |F_i − F_v| can be used to calculate the gray difference X_i between the i-th pixel and the average pixel gray value F_v. Whether the gray difference of each pixel is smaller than the contrast threshold is judged, and the pixels whose gray difference is not smaller than the contrast threshold are obtained according to the judgment result; if the gray difference of a pixel is smaller than the contrast threshold, that pixel can be judged to be a background pixel of the grayscale image.
S124: Judge whether each pixel whose gray difference is not smaller than the contrast threshold is isolated, so as to remove the isolated pixels and obtain the target pixels.
Since the valid information that can be recognized from an image always consists of pixel blocks formed by multiple connected pixels, the pixels contained in such pixel blocks can be taken as the target pixels. Specifically, whether each pixel whose gray difference is not smaller than the contrast threshold is isolated is judged, that is, whether each pixel obtained in the previous step is connected to other pixels whose gray difference is not smaller than the contrast threshold. If connected, the pixel is judged not isolated; if not connected, the pixel is judged isolated. Removing the isolated pixels yields the target pixels.
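The screening in steps S121 to S124 can be sketched as follows. This is an illustrative Python sketch only: the row-list image format and the 4-neighbour connectivity used to decide whether a pixel is isolated are assumptions, not details specified by the present application.

```python
# Steps S121-S124: extract target pixels from a grayscale image by
# thresholding each pixel's deviation from the mean gray value and
# discarding isolated pixels.

def extract_target_pixels(gray, contrast_threshold):
    rows, cols = len(gray), len(gray[0])
    # S121: average pixel gray value F_v
    mean = sum(sum(row) for row in gray) / (rows * cols)
    # S122/S123: keep pixels whose gray difference |F_i - F_v| is not
    # smaller than the contrast threshold
    kept = {(r, c) for r in range(rows) for c in range(cols)
            if abs(gray[r][c] - mean) >= contrast_threshold}
    # S124: remove isolated pixels (no kept pixel among the 4 neighbours)
    def has_neighbour(p):
        r, c = p
        return any(q in kept for q in [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)])
    return {p for p in kept if has_neighbour(p)}
```

On a small example, two connected dark pixels survive while a lone dark pixel is removed as isolated.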
S130: Segment the plurality of target pixels contained in the target pixel set according to the position information of the target pixels, to obtain character images each containing a single character.
After the target pixel set is obtained, the corresponding character images still need to be segmented from it, each character image containing one corresponding character. Specifically, according to the position information of each target pixel, segmented images each containing a single character can be obtained from the target pixel set, and the segmented images are adjusted according to an image adjustment rule to obtain character images each containing only a single character; that is, according to the number of characters contained in the target pixel set, the same number of character images is obtained.
In an embodiment, as shown in FIG. 4, step S130 includes sub-steps S131, S132, S133 and S134.
S131: Binarize the target pixels according to the position information of each target pixel in the target pixel set to obtain a binarized image corresponding to the target pixel set.
The target pixels in the target pixel set can be binarized: according to the position information of the target pixels, a black pixel is filled at the position of each target pixel and white pixels are filled at the other positions, so the resulting binarized image contains only the two colors black and white. The position information of a target pixel is its coordinate position in the captured image.
S132: Integrate the pixel blocks contained in the binarized image to obtain character blocks, wherein each character block contains one character.
Since a single character is formed by combining multiple pixels, a single character corresponds to a character block containing multiple pixels. According to the position information of each target pixel in the binarized image, the pixel blocks formed by connected pixels in the binarized image can be obtained, and the pixel blocks are integrated into character blocks based on the position information of the target pixels; each character block contains at least one pixel block, and each character block contains one character. Specifically, under normal circumstances the distance between character blocks is at least 2 pixels, so the corresponding pixel blocks can be obtained from the binarized image and it is judged whether the distance between pixel blocks is not greater than 1 pixel. If the distance between two pixel blocks is not greater than 1 pixel, the two pixel blocks are combined into one character block; if the distance between a pixel block and every other pixel block is greater than 1 pixel, that pixel block is taken as a single character block on its own.
S133: Extract, according to the target pixels contained in the character blocks, a segmented image corresponding to each character block.
Each character block contains multiple target pixels; according to the position information of the target pixels contained in a character block, the segmented image corresponding to each character block can be extracted, the segmented image being the minimal rectangular image corresponding to the character block. Specifically, according to the position information of each target pixel in a character block, the unique minimal rectangular boundary corresponding to that character block can be determined, namely the smallest rectangular frame determined by the coordinate values of the outermost pixels of the character block, and a segmented image corresponding to the character block is extracted from the binarized image according to this minimal rectangular boundary.
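The minimal-rectangle extraction of step S133 can be sketched as follows. This is an illustrative Python sketch; the 0/1 row-list encoding of the binarized image is an assumption made for the example, not a representation specified by the present application.

```python
# Step S133: given the target-pixel coordinates of one character block,
# determine its unique minimal rectangular boundary and cut the
# corresponding segmented image out of a binarized image.

def minimal_rectangle(block):
    # Smallest rectangle determined by the outermost pixels of the block
    rows = [r for r, _ in block]
    cols = [c for _, c in block]
    return min(rows), min(cols), max(rows), max(cols)

def extract_segment(binary_image, block):
    r0, c0, r1, c1 = minimal_rectangle(block)
    return [row[c0:c1 + 1] for row in binary_image[r0:r1 + 1]]
```

The segmented image is simply the sub-grid of the binarized image enclosed by that boundary.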
S134: Adjust the segmented images according to a preset image adjustment rule to obtain a character image corresponding to each segmented image.
Specifically, the segmented images can be adjusted according to feature information such as the size of each segmented image to obtain a character image corresponding to each segmented image; the image adjustment rule includes one or more of enlargement, reduction and rotation, and the adjusted character images are images that satisfy the image adjustment rule.
S140: Digitize the character pixels in each character image to obtain a character feature value corresponding to each character image.
The digitization rule is the rule information for digitizing a character image; digitizing a character image yields the character feature value corresponding to that character image. The character feature value is feature information that quantifies the features of the character image numerically, and includes a size array and coordinate arrays: the size array represents the size information of the character image, and the coordinate arrays represent the coordinate values of each character pixel in the character image.
In an embodiment, as shown in FIG. 5, step S140 includes sub-steps S141, S142 and S143; that is, the specific process of digitizing one character image according to the digitization rule includes the following three steps.
S141: Obtain the size information of a character image, and generate a size array corresponding to the size information according to a preset digitization rule;
S142: Obtain the coordinate positions of all character pixels in the character image, and generate a coordinate array corresponding to each character pixel according to the digitization rule and the coordinate positions;
S143: Combine the size array with all the coordinate arrays as the character feature value corresponding to the character image.
The resulting size array contains one group of values, each resulting coordinate array contains one group of values, and the number of coordinate arrays equals the number of character pixels contained in the character image.
For example, if the size information of a character image is 30 pixels long and 18 pixels wide, the size array corresponding to that character image is {30, 18}; if a character pixel in the character image is located at row 10, column 5, the coordinate array corresponding to that character pixel is {10, 5}.
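The digitization of steps S141 to S143 can be sketched as follows. This is an illustrative Python sketch: the binarized 0/1 row-list input, the 0-based indexing, and the dict used to bundle the two arrays are assumptions for the example, not a data layout specified by the present application.

```python
# Steps S141-S143: digitize one character image into a character feature
# value consisting of a size array plus one coordinate array per
# character pixel.

def character_feature_value(char_image):
    height, width = len(char_image), len(char_image[0])
    size_array = [height, width]                         # S141: size array
    coord_arrays = [[r, c]                               # S142: one array
                    for r, row in enumerate(char_image)  # per character pixel
                    for c, v in enumerate(row) if v]
    return {"size": size_array, "coords": coord_arrays}  # S143: combine
```

The number of coordinate arrays equals the number of character pixels, matching the property stated above.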
S150: Recognize the character feature value of each character image according to a preset character matching library, to obtain first text information.
Each character image is recognized according to the matching rule and the pre-stored character matching library to obtain the first text information, wherein the character matching library contains one or more pieces of sample feature information for each sample character: the same sample character written in several different fonts corresponds to multiple pieces of sample feature information. The first text information may be text information composed of Arabic numerals and English letters. The characters contained in a character image may be Chinese characters, Arabic numerals or English letters. The character matching library contains sample feature information matching digits and letters; since there are few Arabic numerals and English letters, character images can be recognized against a character matching library composed of digits and letters, so that character images containing digits or letters are quickly recognized to obtain the first text information, and the recognition of character images containing digits or letters is performed locally on the client. The first text information contains the characters matching the digits or letters as well as a character code matching each character; one character code corresponds uniquely to the character feature value of one character image. Specifically, the matching rule includes a size threshold, a pixel density calculation formula and a density threshold.
In an embodiment, as shown in FIG. 6, step S150 includes sub-steps S151, S152, S153, S154, S155, S156 and S157.
S151: Obtain, according to a preset size threshold, the sample feature information in the character matching library whose size ratio matches the size ratio of each character feature value, to obtain a first feature set;
S152: Calculate, according to a preset pixel density calculation formula, the first pixel density of the character feature value and the second pixel density of each piece of sample feature information in the first feature set;
S153: Judge whether the difference between the first pixel density and each second pixel density is smaller than a preset density threshold, to obtain the sample feature information whose difference is smaller than the density threshold and thereby obtain a candidate feature set.
The size threshold is the threshold information used to judge whether the size ratio in a character feature value matches the size ratio of a piece of sample feature information; the size ratio can be calculated from the specific values of the size array in the character feature value. If the difference between the size ratio of the character feature value and the size ratio of the sample feature information is not greater than the size threshold, the two match; otherwise they do not match.
For example, suppose the size array in a character feature value is {30, 18}, the size array of a piece of sample feature information in the character matching library is {25, 13}, and the size threshold is 0.25. The size ratio of the character feature value is 1.6667 and the size ratio of the sample feature information is 1.9231; the difference between the two size ratios is 0.2564, which is greater than the size threshold, so this sample feature information does not match the character feature value.
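The size-ratio check of step S151 can be sketched as follows, reproducing the worked example above. This is an illustrative Python sketch; treating the first value of the size array as the long side is an assumption taken from the example, not a definition stated by the present application.

```python
# Step S151: a character feature value and a piece of sample feature
# information match in size when the absolute difference of their size
# ratios does not exceed the size threshold.

def size_ratio(size_array):
    long_side, short_side = size_array
    return long_side / short_side

def sizes_match(feature_size, sample_size, size_threshold):
    return abs(size_ratio(feature_size) - size_ratio(sample_size)) <= size_threshold
```

With size arrays {30, 18} and {25, 13} and a threshold of 0.25, the ratios differ by about 0.2564, so the pair is rejected, as in the text.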
The pixel density calculation formula is the formula used to calculate the pixel density corresponding to a character feature value or a piece of sample feature information. A larger pixel density for a character feature value indicates that the character image corresponding to that feature value contains more character pixels per unit area, while a smaller pixel density indicates fewer character pixels per unit area. The density threshold is the threshold information used to judge whether the pixel density of the character feature value matches the pixel density of the sample feature information: if the difference between the two pixel densities is not greater than the density threshold, the two match; otherwise they do not match.
Taking a character feature value as an example, its pixel density is calculated by obtaining the number of coordinate arrays contained in the character feature value and dividing it by the product of the values in its size array. The pixel density calculation formula can be expressed as:
J = T / (C_1 × C_2)                       (1)
where J is the pixel density corresponding to the character feature value, T is the number of coordinate arrays in the character feature value, C_1 is the first value of the size array in the character feature value, and C_2 is the second value of the size array in the character feature value.
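Formula (1) can be sketched directly. This is an illustrative Python sketch; the dict layout of the feature value is an assumption carried over from the earlier example, not a structure specified by the present application.

```python
# Formula (1): pixel density J = T / (C_1 x C_2), where T is the number
# of coordinate arrays in a character feature value and C_1, C_2 are the
# two values of its size array.

def pixel_density(feature_value):
    c1, c2 = feature_value["size"]
    t = len(feature_value["coords"])
    return t / (c1 * c2)
```

For a 30-by-18 character image with 108 character pixels, the density is 108 / 540 = 0.2.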
S154: Judge whether the number of pieces of sample feature information contained in the candidate feature set is greater than zero.
The sample feature information matching each character image is screened out according to the above method. Since there are large differences between Chinese characters on the one hand and Arabic numerals and English letters on the other, and the character matching library only contains sample feature information for digits and letters, it can be judged whether the number of pieces of sample feature information matching each character image is greater than zero. If the number of pieces of sample feature information matching a character image is greater than zero, the character image contains a digit or a letter; if it is not greater than zero, the character image does not contain a digit or a letter, the character image can be taken as an unrecognized character image, and the next character image is fetched and step S151 is executed again to recognize the next character image.
S155: If the number of pieces of sample feature information contained in the candidate feature set is greater than zero, calculate the matching degree between the character feature value and each piece of sample feature information in the candidate feature set. S156: Obtain the sample character corresponding to the piece of sample feature information with the highest matching degree in the candidate feature set as the target character matching the character feature value. S157: Take the target characters obtained by recognizing each character image as the first text information.
Specifically, the values of each coordinate array in the character feature value are divided by the size array of that character feature value to obtain a vector array corresponding to each coordinate array. For example, if the size array of a character feature value is {30, 18} and one of its coordinate arrays is {10, 5}, the vector array corresponding to that coordinate array is {10/30, 5/18}, that is, {0.3333, 0.2778}. The vector arrays of each piece of sample feature information in the candidate feature set are obtained in the same way; the number of vector arrays of a piece of sample feature information that coincide with the vector arrays of the character feature value is obtained, and the result of dividing the number of coinciding arrays by the total number of vector arrays of the character feature value is taken as the matching degree between that sample feature information and the character feature value. The matching degree between each piece of sample feature information in the candidate feature set and the character feature value is calculated according to the above method, and the sample character corresponding to the piece of sample feature information with the highest matching degree is taken as the target character matching the character feature value. The next character image is then fetched and step S151 executed again, until the above recognition operation has been performed on all character images; the target characters obtained by recognizing each character image yield the first text information.
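The matching-degree calculation of steps S155 to S156 can be sketched as follows. This is an illustrative Python sketch; rounding normalized vectors to 4 decimals to decide whether two vector arrays "coincide" is an assumption chosen for the example, since the text does not fix a tolerance.

```python
# Steps S155-S156: normalise each coordinate array by the size array into
# a vector array, then score a sample by the share of the character's
# vector arrays that coincide with the sample's vector arrays.

def vector_arrays(feature_value):
    c1, c2 = feature_value["size"]
    return {(round(r / c1, 4), round(c / c2, 4))
            for r, c in feature_value["coords"]}

def matching_degree(char_feature, sample_feature):
    char_vectors = vector_arrays(char_feature)
    sample_vectors = vector_arrays(sample_feature)
    # coinciding vector arrays / total vector arrays of the character
    return len(char_vectors & sample_vectors) / len(char_vectors)
```

The worked example above — size array {30, 18} and coordinate array {10, 5} — normalizes to {0.3333, 0.2778} under this rounding.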
In an embodiment, as shown in FIG. 7, step S1510 is further included before step S150.
S1510: Perform recognition processing on a preset sample character set according to the digitization rule, to generate the character matching library.
Before character images are recognized, the corresponding character matching library can be generated from a preset sample character set. Specifically, the sample character set contains multiple sample characters, each sample character corresponds to at least one sample image in the sample character set, and recognizing one sample image yields the sample feature information corresponding to that sample image; thus one sample character in the generated character matching library may correspond to one or more pieces of sample feature information.
Specifically, the process of recognizing one sample image includes:
(1) obtaining the size information of a sample image, and generating a size array corresponding to the size information according to the preset digitization rule;
(2) obtaining the coordinate positions of all sample pixels in the sample image, and generating a coordinate array corresponding to each sample pixel according to the digitization rule and the coordinate positions;
(3) combining the size array with all the coordinate arrays as the sample feature information corresponding to the sample image. The specific process of obtaining the sample feature information of a sample image is the same as the process of obtaining a character feature value, and is not repeated here.
S160: Judge, according to the first text information, whether there is an unrecognized character image, and if so, send the character feature values matching the unrecognized character images to the management server, to obtain second text information fed back by the management server.
The first text information contains the characters corresponding to digits or letters. The number of characters in the first text information can be obtained and compared with the number of character images. If they are equal, all character images have been recognized; each character in the first text information corresponds to a character code, the ordering of the corresponding character codes can be obtained according to the position of each character image in the captured image, and the characters contained in the first text information are sorted according to the ordering of the character codes so as to integrate them into the image text information matching the captured image. If they are not equal, there remain unrecognized character images, and the character feature values matching the unrecognized character images can be sent to the management server, which recognizes the Chinese text contained in the unrecognized character images to obtain the second text information and can feed the second text information back to the client; the second text information then contains the characters matching the Chinese text as well as a character code matching each character. No image transmission is involved in this process: only the character feature values corresponding to the subset of unrecognized character images are transmitted, so the amount of transmitted data is greatly reduced, the transmission efficiency of the character feature values is improved, and the waiting time the user needs to obtain the recognition result is substantially shortened. Since the number of Chinese characters is enormous and the client can hardly recognize them quickly, fast recognition can be performed by the management server to improve recognition efficiency and accuracy. The specific way of recognizing the Chinese characters contained in a character image is the same as the way of recognizing character images to obtain the first text information, the difference being that recognizing the Chinese characters contained in a character image requires a character matching library containing Chinese characters.
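The check in step S160 can be sketched as follows. This is an illustrative Python sketch: the dicts keyed by character code are an assumed representation, and the function only selects what would be sent — the network exchange with the management server is outside the sketch.

```python
# Step S160: compare the number of locally recognised characters with the
# number of character images; if some images were not recognised, collect
# their character feature values for transmission to the management server.

def split_for_remote(feature_values, first_text):
    """feature_values: {char_code: feature value};
    first_text: {char_code: recognised character}."""
    if len(first_text) == len(feature_values):
        return {}  # every character image was recognised locally
    return {code: fv for code, fv in feature_values.items()
            if code not in first_text}
```

Only the feature values of the unrecognized subset are returned, which mirrors the point that no image data is transmitted.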
S170: Integrate the first text information and the second text information to obtain image text information matching the captured image.
Specifically, each character in the first text information and the second text information corresponds to a character code, the character code being identification information uniquely corresponding to the character feature value of each character image; the ordering of the character codes corresponds to the ordering of the character images in the captured image. The ordering of the corresponding character codes can be obtained according to the position of each character image in the captured image, and the characters contained in the first text information and the second text information are sorted according to the ordering of the character codes so as to integrate them into the image text information matching the captured image; the image text information may thus contain digits, letters and characters corresponding to Chinese text.
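The integration of step S170 can be sketched as follows. This is an illustrative Python sketch; representing each part of the text information as a dict from character code to character is an assumption for the example.

```python
# Step S170: merge the locally recognised first text information with the
# remotely recognised second text information, sorting characters by their
# character codes, which follow the order of the character images in the
# captured image.

def integrate_text(first_text, second_text):
    merged = {**first_text, **second_text}
    return "".join(merged[code] for code in sorted(merged))
```

Digits and letters from the first part interleave correctly with Chinese characters from the second part because both share the same code ordering.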
In an embodiment, as shown in FIG. 8, step S1701 is further included after step S170.
S1701: Display the image text information in the area of the display interface adjacent to the captured image.
Specifically, the obtained image text information can be displayed beside the captured image, and the user can conveniently and quickly operate on the image text information in the client; for example, the user can copy the image text information shown in the display interface, or extract part of the image text information for use.
The technical methods in the present application can be applied to application scenarios involving intelligent image recognition, such as smart government affairs, smart city management, smart communities, smart security, smart logistics, smart medical care, smart education, smart environmental protection and smart transportation, so as to promote the construction of smart cities.
In the intelligent image recognition method provided by the embodiments of the present application, a captured image produced by a user's screenshot operation is obtained and the corresponding target pixel set is acquired; character images are segmented from the target pixel set and digitized to obtain corresponding character feature values; the character feature values are recognized against the local character matching library to obtain first text information; the character feature values of unrecognized character images are sent to the management server for remote recognition to obtain second text information; and the two parts of text information are integrated to obtain the image text information. Through the above method, simple characters in character images are recognized locally via the local character matching library without occupying a large amount of storage space on the client, and the character feature values of unrecognized character images are transmitted to the management server for remote recognition, achieving fast and accurate recognition of the captured image with high recognition efficiency and high recognition accuracy.
An embodiment of the present application further provides an intelligent image recognition apparatus, which is used to execute any embodiment of the foregoing intelligent image recognition method. Specifically, please refer to FIG. 9, which is a schematic block diagram of the intelligent image recognition apparatus provided by an embodiment of the present application. The intelligent image recognition apparatus can be configured in the client 10.
As shown in FIG. 9, the intelligent image recognition apparatus 100 includes a captured image acquisition unit 110, a target pixel acquisition unit 120, a character image acquisition unit 130, a character feature value acquisition unit 140, a first text information acquisition unit 150, a second text information acquisition unit 160 and a text information integration unit 170.
The captured image acquisition unit 110 is configured to monitor the display interface of the client in real time, so as to obtain, through real-time monitoring, a captured image produced by the user performing a screenshot operation on the display interface.
The target pixel acquisition unit 120 is configured to obtain a target pixel set corresponding to the captured image, wherein the target pixel set contains a plurality of target pixels.
In an embodiment, the target pixel acquisition unit 120 includes sub-units: a gray average acquisition unit, a gray difference calculation unit, a gray difference judgment unit and an isolated pixel removal unit.
The gray average acquisition unit is configured to perform grayscale processing on the captured image to obtain a corresponding grayscale image and calculate the average pixel gray value; the gray difference calculation unit is configured to calculate the gray difference between the gray value of each pixel in the grayscale image and the average pixel gray value; the gray difference judgment unit is configured to judge whether the gray difference of each pixel is smaller than a preset contrast threshold, so as to obtain the pixels whose gray difference is not smaller than the contrast threshold; the isolated pixel removal unit is configured to judge whether each pixel whose gray difference is not smaller than the contrast threshold is isolated, so as to remove the isolated pixels and obtain the target pixels.
The character image acquisition unit 130 is configured to segment the plurality of target pixels contained in the target pixel set according to the position information of the target pixels, to obtain character images each containing a single character.
In an embodiment, the character image acquisition unit 130 includes sub-units: a binarized image acquisition unit, a character block acquisition unit, a segmented image extraction unit and an image adjustment unit.
The binarized image acquisition unit is configured to binarize the target pixels according to the position information of each target pixel in the target pixel set to obtain a binarized image corresponding to the target pixel set; the character block acquisition unit is configured to integrate the pixel blocks contained in the binarized image to obtain character blocks, wherein each character block contains one character; the segmented image extraction unit is configured to extract, according to the target pixels contained in the character blocks, a segmented image corresponding to each character block; the image adjustment unit is configured to adjust the segmented images according to a preset image adjustment rule to obtain a character image corresponding to each segmented image.
The character feature value acquisition unit 140 is configured to digitize the character pixels in each character image to obtain a character feature value corresponding to each character image.
In an embodiment, the character feature value acquisition unit 140 includes sub-units: a size array acquisition unit, a coordinate array acquisition unit and an array combination unit.
The size array acquisition unit is configured to obtain the size information of a character image and generate a size array corresponding to the size information according to a preset digitization rule; the coordinate array acquisition unit is configured to obtain the coordinate positions of all character pixels in the character image and generate a coordinate array corresponding to each character pixel according to the digitization rule and the coordinate positions; the array combination unit is configured to combine the size array with all the coordinate arrays as the character feature value corresponding to the character image.
The first text information acquisition unit 150 is configured to recognize the character feature value of each character image according to a preset character matching library, to obtain first text information.
In an embodiment, the first text information acquisition unit 150 includes sub-units: a first feature set acquisition unit, a pixel density acquisition unit, a candidate feature set acquisition unit, a quantity judgment unit, a matching degree calculation unit, a target character determination unit and a text information acquisition unit.
The first feature set acquisition unit is configured to obtain, according to a preset size threshold, the sample feature information in the character matching library whose size ratio matches the size ratio of each character feature value, to obtain a first feature set; the pixel density acquisition unit is configured to calculate, according to a preset pixel density calculation formula, the first pixel density of the character feature value and the second pixel density of each piece of sample feature information in the first feature set; the candidate feature set acquisition unit is configured to judge whether the difference between the first pixel density and each second pixel density is smaller than a preset density threshold, to obtain the sample feature information whose difference is smaller than the density threshold and thereby obtain a candidate feature set; the quantity judgment unit is configured to judge whether the number of pieces of sample feature information contained in the candidate feature set is greater than zero; the matching degree calculation unit is configured to calculate, if the number of pieces of sample feature information contained in the candidate feature set is greater than zero, the matching degree between the character feature value and each piece of sample feature information in the candidate feature set; the target character determination unit is configured to obtain the sample character corresponding to the piece of sample feature information with the highest matching degree in the candidate feature set as the target character matching the character feature value; the text information acquisition unit is configured to take the target characters obtained by recognizing each character image as the first text information.
In an embodiment, the intelligent image recognition apparatus further includes: a character matching library generation unit.
The character matching library generation unit is configured to perform recognition processing on a preset sample character set according to the digitization rule, to generate the character matching library.
The second text information acquisition unit 160 is configured to judge, according to the first text information, whether there is an unrecognized character image, and if so, send the character feature values matching the unrecognized character images to the management server, so as to obtain second text information fed back by the management server.
The text information integration unit 170 is configured to integrate the first text information and the second text information to obtain image text information matching the captured image.
In an embodiment, the intelligent image recognition apparatus further includes: an image text information display unit.
The image text information display unit is configured to display the image text information in the area of the display interface adjacent to the captured image.
The intelligent image recognition apparatus provided by the embodiments of the present application applies the above-mentioned intelligent image recognition method: a captured image produced by a user's screenshot operation is obtained and the corresponding target pixel set is acquired; character images are segmented from the target pixel set and digitized to obtain corresponding character feature values; the character feature values are recognized against the local character matching library to obtain first text information; the character feature values of unrecognized character images are sent to the management server for remote recognition to obtain second text information; and the two parts of text information are integrated to obtain the image text information. Through the above method, simple characters in character images are recognized locally via the local character matching library without occupying a large amount of storage space on the client, and the character feature values of unrecognized character images are transmitted to the management server for remote recognition, achieving fast and accurate recognition of the captured image with high recognition efficiency and high recognition accuracy.
The above-mentioned intelligent image recognition apparatus can be implemented in the form of a computer program, and the computer program can run on a computer device as shown in FIG. 10.
Please refer to FIG. 10, which is a schematic block diagram of the computer device provided by an embodiment of the present application. The computer device may be the client 10 used to execute the intelligent image recognition method to intelligently recognize images.
Referring to FIG. 10, the computer device 500 includes a processor 502, a memory and a network interface 505 connected by a system bus 501, wherein the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. When the computer program 5032 is executed, it can cause the processor 502 to execute the intelligent image recognition method.
The processor 502 is used to provide computing and control capabilities, supporting the operation of the entire computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, the processor 502 can execute the intelligent image recognition method.
The network interface 505 is used for network communication, such as providing transmission of data information. Those skilled in the art can understand that the structure shown in FIG. 10 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied; the specific computer device 500 may include more or fewer components than shown, or combine certain components, or have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory, so as to implement the corresponding functions of the above-mentioned intelligent image recognition method.
Those skilled in the art can understand that the embodiment of the computer device shown in FIG. 10 does not constitute a limitation on the specific structure of the computer device; in other embodiments, the computer device may include more or fewer components than shown, or combine certain components, or have a different arrangement of components. For example, in some embodiments the computer device may include only a memory and a processor; in such embodiments, the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 10 and are not repeated here.
It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
Another embodiment of the present application provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps included in the above-mentioned intelligent image recognition method.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the devices, apparatuses and units described above can refer to the corresponding processes in the foregoing method embodiments and are not repeated here. Those of ordinary skill in the art can appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Professionals may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed devices, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a logical function division, and in actual implementation there may be other division methods, or units with the same function may be grouped into one unit; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may also be electrical, mechanical or other forms of connection.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present application.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware, or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a computer-readable storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned computer-readable storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk or an optical disk.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present application, and these modifications or substitutions shall all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

  1. An intelligent image recognition method, applied in a client, wherein a network connection is established between the client and a management server for transmitting data information, and wherein the method comprises:
    monitoring the display interface of the client in real time, so as to obtain, through real-time monitoring, a captured image produced by the user performing a screenshot operation on the display interface;
    obtaining a target pixel set corresponding to the captured image, wherein the target pixel set contains a plurality of target pixels;
    segmenting the plurality of target pixels contained in the target pixel set according to position information of the target pixels, to obtain character images each containing a single character;
    digitizing the character pixels in each character image to obtain a character feature value corresponding to each character image;
    recognizing the character feature value of each character image according to a preset character matching library to obtain first text information;
    judging, according to the first text information, whether there is an unrecognized character image, and if so, sending the character feature values matching the unrecognized character images to the management server, so as to obtain second text information fed back by the management server;
    integrating the first text information and the second text information to obtain image text information matching the captured image.
  2. The intelligent image recognition method according to claim 1, wherein the obtaining a target pixel set corresponding to the captured image comprises:
    performing grayscale processing on the captured image to obtain a corresponding grayscale image, and calculating the average pixel gray value;
    calculating the gray difference between the gray value of each pixel in the grayscale image and the average pixel gray value;
    judging whether the gray difference of each pixel is smaller than a preset contrast threshold, so as to obtain the pixels whose gray difference is not smaller than the contrast threshold;
    judging whether each pixel whose gray difference is not smaller than the contrast threshold is isolated, so as to remove the isolated pixels and obtain the target pixels.
  3. The intelligent image recognition method according to claim 1, wherein segmenting, according to the position information of the target pixels, the plurality of target pixels in the target pixel set to obtain character images each containing a single character comprises:
    binarizing the target pixels according to the position information of each target pixel in the target pixel set to obtain a binarized image corresponding to the target pixel set;
    merging the pixel blocks contained in the binarized image into character blocks, each character block containing one character;
    extracting, from the target pixels contained in each character block, a segmented image corresponding to that character block; and
    adjusting the segmented images according to a preset image adjustment rule to obtain a character image corresponding to each segmented image.
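One plausible reading of claim 3 groups pixel blocks by connectivity and normalizes each crop to a fixed size. The claim does not say how blocks are merged or what the preset image adjustment rule is, so the 8-connected flood fill and the nearest-neighbour resize below are assumptions for illustration only.

```python
import numpy as np
from collections import deque

def split_into_character_images(mask, out_size=(32, 32)):
    """Illustrative sketch of claim 3: group binarized target pixels
    into character blocks by 8-connectivity, crop one sub-image per
    block, and resize each crop to a fixed size (a simple stand-in
    for the preset image adjustment rule)."""
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    chars = []
    for y in range(h):
        for x in range(w):
            if mask[y, x] and not seen[y, x]:
                # Flood-fill one 8-connected pixel block.
                block, queue = [], deque([(y, x)])
                seen[y, x] = True
                while queue:
                    cy, cx = queue.popleft()
                    block.append((cy, cx))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny, nx] and not seen[ny, nx]):
                                seen[ny, nx] = True
                                queue.append((ny, nx))
                ys, xs = zip(*block)
                crop = mask[min(ys):max(ys) + 1, min(xs):max(xs) + 1]
                # Nearest-neighbour resize to the fixed character size.
                ry = np.linspace(0, crop.shape[0] - 1, out_size[0]).round().astype(int)
                rx = np.linspace(0, crop.shape[1] - 1, out_size[1]).round().astype(int)
                chars.append(crop[np.ix_(ry, rx)])
    return chars
```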
  4. The intelligent image recognition method according to claim 1, wherein numericalizing the character pixels in each character image to obtain the character feature value corresponding to each character image comprises:
    obtaining size information of one character image, and generating, according to a preset numericalization rule, a size array corresponding to the size information;
    obtaining the coordinate positions of all character pixels in the character image, and generating, according to the numericalization rule and the coordinate positions, a coordinate array corresponding to each character pixel; and
    combining the size array with all the coordinate arrays as the character feature value corresponding to the target image.
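A minimal sketch of claim 4's numericalization, assuming the preset rule is simply a (height, width) size array followed by one (row, column) coordinate array per ink pixel; the actual preset rule is not given in the text.

```python
def character_feature_value(char_mask):
    """Illustrative sketch of claim 4: a size array built from the
    image dimensions, combined with one coordinate array per
    character pixel. A plain (height, width) + (row, col) encoding
    is assumed here; the real preset rule is unspecified."""
    h, w = len(char_mask), len(char_mask[0])
    size_array = (h, w)
    coord_arrays = tuple((y, x)
                         for y in range(h)
                         for x in range(w)
                         if char_mask[y][x])
    return size_array, coord_arrays
```

The pair it returns is what the matching step in claim 5 consumes: the size array drives the size-ratio filter, and the coordinate arrays drive the pixel-density and matching-degree computations.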
  5. The intelligent image recognition method according to claim 1, wherein recognizing the character feature value of each character image according to the preset character matching library to obtain the first text information comprises:
    obtaining, according to a preset size threshold, the sample feature information in the character matching library whose size ratio matches the size ratio of each character feature value, so as to obtain a first feature set;
    computing, according to a preset pixel density formula, a first pixel density of the character feature value and a second pixel density of each piece of sample feature information in the first feature set;
    determining whether the difference between the first pixel density and each second pixel density is smaller than a preset density threshold, so as to obtain the sample feature information whose difference is smaller than the density threshold as a candidate feature set;
    determining whether the number of pieces of sample feature information contained in the candidate feature set is greater than zero;
    if the number of pieces of sample feature information contained in the candidate feature set is greater than zero, computing the matching degree between the character feature value and each piece of sample feature information in the candidate feature set;
    taking the sample character corresponding to the piece of sample feature information with the highest matching degree in the candidate feature set as the target character matching the character feature value; and
    taking the target characters obtained by recognizing each character image as the first text information.
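The coarse-to-fine matching of claim 5 can be sketched as follows. The thresholds, the pixel-density formula (ink pixels divided by area), and the matching-degree measure (Jaccard overlap of ink coordinates) are assumptions chosen for illustration; the claim names these quantities but does not fix their formulas.

```python
def recognize_character(feature, library,
                        size_threshold=0.2, density_threshold=0.1):
    """Illustrative sketch of claim 5. `feature` is a
    (size_array, coord_arrays) pair as in claim 4; `library` maps
    sample characters to features of the same shape. Returning None
    models the unrecognized case handed to the management server."""
    (h, w), coords = feature
    ratio = h / w
    density = len(coords) / (h * w)  # assumed pixel density formula

    def stats(f):
        (sh, sw), scoords = f
        return sh / sw, len(scoords) / (sh * sw), set(scoords)

    # First feature set: samples whose size ratio matches.
    first = {c: stats(f) for c, f in library.items()
             if abs(stats(f)[0] - ratio) <= size_threshold}
    # Candidate feature set: further filtered by pixel density.
    candidates = {c: s for c, (r, d, s) in first.items()
                  if abs(d - density) < density_threshold}
    if not candidates:
        return None  # no candidate: left for the second text information
    # Matching degree: overlap of ink coordinates (Jaccard index).
    coords = set(coords)
    return max(candidates,
               key=lambda c: len(coords & candidates[c])
               / max(len(coords | candidates[c]), 1))
```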
  6. The intelligent image recognition method according to claim 4, wherein, before recognizing the character feature value of each character image according to the preset character matching library to obtain the first text information, the method further comprises:
    performing recognition processing on a preset sample character set according to the numericalization rule, so as to generate the character matching library.
  7. The intelligent image recognition method according to claim 1, wherein, after integrating the first text information and the second text information to obtain the image text information matching the captured image, the method further comprises:
    displaying the image text information in a region of the display interface adjacent to the captured image.
  8. An intelligent image recognition apparatus, comprising:
    a captured-image obtaining unit configured to monitor the display interface of the client in real time, so as to obtain, through the real-time monitoring, a captured image produced by a screenshot operation performed by a user on the display interface;
    a target pixel obtaining unit configured to obtain a target pixel set corresponding to the captured image, the target pixel set comprising a plurality of target pixels;
    a character image obtaining unit configured to segment, according to position information of the target pixels, the plurality of target pixels in the target pixel set to obtain character images each containing a single character;
    a character feature value obtaining unit configured to numericalize the character pixels in each character image to obtain a character feature value corresponding to each character image;
    a first text information obtaining unit configured to recognize the character feature value of each character image according to a preset character matching library to obtain first text information;
    a second text information obtaining unit configured to determine, according to the first text information, whether any unrecognized character image exists, and if so, send the character feature value matching the unrecognized character image to the management server to obtain second text information fed back by the management server; and
    a text information integration unit configured to integrate the first text information and the second text information to obtain image text information matching the captured image.
  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
    monitoring the display interface of the client in real time, so as to obtain, through the real-time monitoring, a captured image produced by a screenshot operation performed by a user on the display interface;
    obtaining a target pixel set corresponding to the captured image, the target pixel set comprising a plurality of target pixels;
    segmenting, according to position information of the target pixels, the plurality of target pixels in the target pixel set to obtain character images each containing a single character;
    numericalizing the character pixels in each character image to obtain a character feature value corresponding to each character image;
    recognizing the character feature value of each character image according to a preset character matching library to obtain first text information;
    determining, according to the first text information, whether any unrecognized character image exists, and if so, sending the character feature value matching the unrecognized character image to the management server to obtain second text information fed back by the management server; and
    integrating the first text information and the second text information to obtain image text information matching the captured image.
  10. The computer device according to claim 9, wherein obtaining the target pixel set corresponding to the captured image comprises:
    performing grayscale processing on the captured image to obtain a corresponding grayscale image and computing an average pixel grayscale value;
    computing the grayscale difference between the grayscale value of each pixel in the grayscale image and the average pixel grayscale value;
    determining whether the grayscale difference of each pixel is smaller than a preset contrast threshold, so as to obtain the pixels whose grayscale difference is not smaller than the contrast threshold; and
    determining whether each pixel whose grayscale difference is not smaller than the contrast threshold is isolated, and removing the isolated pixels to obtain the target pixels.
  11. The computer device according to claim 9, wherein segmenting, according to the position information of the target pixels, the plurality of target pixels in the target pixel set to obtain character images each containing a single character comprises:
    binarizing the target pixels according to the position information of each target pixel in the target pixel set to obtain a binarized image corresponding to the target pixel set;
    merging the pixel blocks contained in the binarized image into character blocks, each character block containing one character;
    extracting, from the target pixels contained in each character block, a segmented image corresponding to that character block; and
    adjusting the segmented images according to a preset image adjustment rule to obtain a character image corresponding to each segmented image.
  12. The computer device according to claim 9, wherein numericalizing the character pixels in each character image to obtain the character feature value corresponding to each character image comprises:
    obtaining size information of one character image, and generating, according to a preset numericalization rule, a size array corresponding to the size information;
    obtaining the coordinate positions of all character pixels in the character image, and generating, according to the numericalization rule and the coordinate positions, a coordinate array corresponding to each character pixel; and
    combining the size array with all the coordinate arrays as the character feature value corresponding to the target image.
  13. The computer device according to claim 9, wherein recognizing the character feature value of each character image according to the preset character matching library to obtain the first text information comprises:
    obtaining, according to a preset size threshold, the sample feature information in the character matching library whose size ratio matches the size ratio of each character feature value, so as to obtain a first feature set;
    computing, according to a preset pixel density formula, a first pixel density of the character feature value and a second pixel density of each piece of sample feature information in the first feature set;
    determining whether the difference between the first pixel density and each second pixel density is smaller than a preset density threshold, so as to obtain the sample feature information whose difference is smaller than the density threshold as a candidate feature set;
    determining whether the number of pieces of sample feature information contained in the candidate feature set is greater than zero;
    if the number of pieces of sample feature information contained in the candidate feature set is greater than zero, computing the matching degree between the character feature value and each piece of sample feature information in the candidate feature set;
    taking the sample character corresponding to the piece of sample feature information with the highest matching degree in the candidate feature set as the target character matching the character feature value; and
    taking the target characters obtained by recognizing each character image as the first text information.
  14. The computer device according to claim 12, wherein, before recognizing the character feature value of each character image according to the preset character matching library to obtain the first text information, the following is further included:
    performing recognition processing on a preset sample character set according to the numericalization rule, so as to generate the character matching library.
  15. The computer device according to claim 9, wherein, after integrating the first text information and the second text information to obtain the image text information matching the captured image, the following is further included:
    displaying the image text information in a region of the display interface adjacent to the captured image.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following operations:
    monitoring the display interface of the client in real time, so as to obtain, through the real-time monitoring, a captured image produced by a screenshot operation performed by a user on the display interface;
    obtaining a target pixel set corresponding to the captured image, the target pixel set comprising a plurality of target pixels;
    segmenting, according to position information of the target pixels, the plurality of target pixels in the target pixel set to obtain character images each containing a single character;
    numericalizing the character pixels in each character image to obtain a character feature value corresponding to each character image;
    recognizing the character feature value of each character image according to a preset character matching library to obtain first text information;
    determining, according to the first text information, whether any unrecognized character image exists, and if so, sending the character feature value matching the unrecognized character image to the management server to obtain second text information fed back by the management server; and
    integrating the first text information and the second text information to obtain image text information matching the captured image.
  17. The computer-readable storage medium according to claim 16, wherein obtaining the target pixel set corresponding to the captured image comprises:
    performing grayscale processing on the captured image to obtain a corresponding grayscale image and computing an average pixel grayscale value;
    computing the grayscale difference between the grayscale value of each pixel in the grayscale image and the average pixel grayscale value;
    determining whether the grayscale difference of each pixel is smaller than a preset contrast threshold, so as to obtain the pixels whose grayscale difference is not smaller than the contrast threshold; and
    determining whether each pixel whose grayscale difference is not smaller than the contrast threshold is isolated, and removing the isolated pixels to obtain the target pixels.
  18. The computer-readable storage medium according to claim 16, wherein segmenting, according to the position information of the target pixels, the plurality of target pixels in the target pixel set to obtain character images each containing a single character comprises:
    binarizing the target pixels according to the position information of each target pixel in the target pixel set to obtain a binarized image corresponding to the target pixel set;
    merging the pixel blocks contained in the binarized image into character blocks, each character block containing one character;
    extracting, from the target pixels contained in each character block, a segmented image corresponding to that character block; and
    adjusting the segmented images according to a preset image adjustment rule to obtain a character image corresponding to each segmented image.
  19. The computer-readable storage medium according to claim 16, wherein numericalizing the character pixels in each character image to obtain the character feature value corresponding to each character image comprises:
    obtaining size information of one character image, and generating, according to a preset numericalization rule, a size array corresponding to the size information;
    obtaining the coordinate positions of all character pixels in the character image, and generating, according to the numericalization rule and the coordinate positions, a coordinate array corresponding to each character pixel; and
    combining the size array with all the coordinate arrays as the character feature value corresponding to the target image.
  20. The computer-readable storage medium according to claim 16, wherein recognizing the character feature value of each character image according to the preset character matching library to obtain the first text information comprises:
    obtaining, according to a preset size threshold, the sample feature information in the character matching library whose size ratio matches the size ratio of each character feature value, so as to obtain a first feature set;
    computing, according to a preset pixel density formula, a first pixel density of the character feature value and a second pixel density of each piece of sample feature information in the first feature set;
    determining whether the difference between the first pixel density and each second pixel density is smaller than a preset density threshold, so as to obtain the sample feature information whose difference is smaller than the density threshold as a candidate feature set;
    determining whether the number of pieces of sample feature information contained in the candidate feature set is greater than zero;
    if the number of pieces of sample feature information contained in the candidate feature set is greater than zero, computing the matching degree between the character feature value and each piece of sample feature information in the candidate feature set;
    taking the sample character corresponding to the piece of sample feature information with the highest matching degree in the candidate feature set as the target character matching the character feature value; and
    taking the target characters obtained by recognizing each character image as the first text information.
PCT/CN2021/090576 2020-12-08 2021-04-28 Intelligent image recognition method and apparatus, computer device and storage medium WO2022121218A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011443365.3 2020-12-08
CN202011443365.3A CN112529004A (zh) Intelligent image recognition method and apparatus, computer device and storage medium

Publications (1)

Publication Number Publication Date
WO2022121218A1 true WO2022121218A1 (zh) 2022-06-16

Family

ID=74999971

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/090576 WO2022121218A1 (zh) Intelligent image recognition method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN112529004A (zh)
WO (1) WO2022121218A1 (zh)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112529004A (zh) Intelligent image recognition method and apparatus, computer device and storage medium
CN112989112B (zh) * 2021-04-27 2021-09-07 北京世纪好未来教育科技有限公司 Online classroom content collection method and apparatus
CN113192067B (zh) * 2021-05-31 2024-03-26 平安科技(深圳)有限公司 Image-detection-based intelligent prediction method, apparatus, device and medium
CN113705561A (zh) * 2021-09-02 2021-11-26 北京云蝶智学科技有限公司 Special symbol recognition method and apparatus
CN114387600A (zh) * 2022-01-19 2022-04-22 中国平安人寿保险股份有限公司 Text feature recognition method, apparatus, computer device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682670A (zh) * 2016-12-19 2017-05-17 Tcl集团股份有限公司 TV station logo recognition method and system
CN110532837A (zh) * 2018-05-25 2019-12-03 九阳股份有限公司 Image data processing method during item placement and retrieval, and household appliance
CN111401322A (zh) * 2020-04-17 2020-07-10 Oppo广东移动通信有限公司 Station entry and exit recognition method, apparatus, terminal and storage medium
CN112035821A (zh) * 2020-09-04 2020-12-04 平安科技(深圳)有限公司 Graphical verification code recognition method, apparatus, computer device and storage medium
CN112529004A (zh) * 2020-12-08 2021-03-19 平安科技(深圳)有限公司 Intelligent image recognition method and apparatus, computer device and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368827B (zh) * 2017-04-01 2020-09-15 阿里巴巴集团控股有限公司 Character recognition method and apparatus, user equipment and server
CN109086834B (zh) * 2018-08-23 2021-03-02 北京三快在线科技有限公司 Character recognition method and apparatus, electronic device and storage medium
CN110942074B (zh) * 2018-09-25 2024-04-09 京东科技控股股份有限公司 Character segmentation and recognition method and apparatus, electronic device and storage medium


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115578734A (zh) * 2022-09-23 2023-01-06 神州数码系统集成服务有限公司 Single-character image matching and recognition method based on pyramid features
CN115984859A (zh) * 2022-12-14 2023-04-18 广州市保伦电子有限公司 Image text recognition method, apparatus and storage medium
CN115981141A (zh) * 2023-03-17 2023-04-18 广东海新智能厨房股份有限公司 Adaptive-matching-based control method, apparatus, device and medium
CN116912780A (zh) * 2023-09-12 2023-10-20 杭州慕皓新能源技术有限公司 Charging monitoring and protection method and protection system based on dynamic mode switching
CN116912780B (zh) * 2023-09-12 2023-11-24 国网浙江省电力有限公司杭州供电公司 Charging monitoring and protection method and protection system based on dynamic mode switching

Also Published As

Publication number Publication date
CN112529004A (zh) 2021-03-19

Similar Documents

Publication Publication Date Title
WO2022121218A1 (zh) Intelligent image recognition method and apparatus, computer device and storage medium
US10867171B1 (en) Systems and methods for machine learning based content extraction from document images
AU2006252025B2 (en) Recognition of parameterised shapes from document images
US9311531B2 (en) Systems and methods for classifying objects in digital images captured using mobile devices
JP6994588B2 (ja) 顔特徴抽出モデル訓練方法、顔特徴抽出方法、装置、機器および記憶媒体
CN110516201B (zh) 图像处理方法、装置、电子设备及存储介质
US8503797B2 (en) Automatic document classification using lexical and physical features
US20180181594A1 (en) Searching Method and Apparatus
US11586863B2 (en) Image classification method and device
WO2021159802A1 (zh) Graphical verification code recognition method, apparatus, computer device and storage medium
WO2023173557A1 (zh) Image processing method and apparatus, electronic device and storage medium
WO2022217711A1 (zh) Information prediction method, apparatus, device and medium based on multi-layer association knowledge graph
CN112084812A (zh) Image processing method and apparatus, computer device and storage medium
JP2022185143A (ja) Text detection method, text recognition method and apparatus
JP2016024527A (ja) Information processing apparatus, program, and automatic page replacement method
US6360006B1 (en) Color block selection
CN110895811A (zh) Image tampering detection method and apparatus
CN113887375A (zh) Text recognition method, apparatus, device and storage medium
WO2013097072A1 (zh) Method and apparatus for recognizing characters in a video
US11495040B2 (en) Information processing apparatus for designation of image type, image reading apparatus, and non-transitory computer readable medium storing program
WO2022252613A1 (zh) Method for recognizing multiple types of lines in a PDF via function fitting, based on desktop software
CN114693955A (zh) Method and apparatus for comparing image similarity, and electronic device
RU2571510C2 (ru) Method and device using image magnification to suppress visually noticeable defects in an image
CN111627511A (zh) Ophthalmic report content recognition method and apparatus, and readable storage medium
CN110674091A (zh) Artificial-intelligence-based file upload method, system and storage medium

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21901933

Country of ref document: EP

Kind code of ref document: A1