WO2022100338A1 - Picture search method and apparatus, electronic device, computer-readable storage medium, and computer program product - Google Patents

Picture search method and apparatus, electronic device, computer-readable storage medium, and computer program product

Info

Publication number
WO2022100338A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
picture
ocr
recognition
low
Prior art date
Application number
PCT/CN2021/123256
Other languages
English (en)
French (fr)
Inventor
杜玮
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP21890866.3A (EP4184383A4)
Publication of WO2022100338A1
Priority to US17/951,824 (US20230082638A1)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F 16/5846 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/40 - Document-oriented image-based pattern recognition
    • G06V 30/41 - Analysis of document content
    • G06V 30/418 - Document matching, e.g. of document images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/50 - Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 - Character recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 - Character recognition
    • G06V 30/14 - Image acquisition
    • G06V 30/148 - Segmentation of character regions
    • G06V 30/158 - Segmentation of character regions using character size, text spacings or pitch estimation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 - Character recognition
    • G06V 30/19 - Recognition using electronic means
    • G06V 30/191 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V 30/1918 - Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/40 - Document-oriented image-based pattern recognition
    • G06V 30/41 - Analysis of document content
    • G06V 30/413 - Classification of content, e.g. text, photographs or tables

Definitions

  • the present application relates to the field of Internet technologies, and relates to, but is not limited to, an image search method, apparatus, electronic device, computer-readable storage medium, and computer program product.
  • OCR: Optical Character Recognition
  • Embodiments of the present application provide a picture search method, apparatus, electronic device, computer-readable storage medium, and computer program product, which relate to the technical field of artificial intelligence and can improve picture search efficiency.
  • An embodiment of the present application provides a method for searching images, including:
  • in response to the picture search request, the OCR recognition result of each picture in the preset picture library is obtained, where the OCR recognition result includes at least one of the following: a low-dimensional OCR recognition result obtained by a low-dimensional OCR recognition process based on an OCR recognition threshold, and a high-dimensional OCR recognition result obtained by a high-dimensional OCR recognition process based on depth recognition.
  • the target picture is determined as the search result of the picture search request, and the search result is displayed.
  • An embodiment of the present application provides a picture search device, including:
  • an obtaining module configured to obtain a picture search request, where the picture search request includes a key character string
  • a response module configured to obtain, in response to the picture search request, an OCR recognition result of each picture in the preset picture library, where the OCR recognition result includes at least one of the following: a low-dimensional OCR recognition result obtained by a low-dimensional OCR recognition process based on an OCR recognition threshold, and a high-dimensional OCR recognition result obtained by a high-dimensional OCR recognition process based on depth recognition, the recognition accuracy of the low-dimensional OCR recognition process being lower than that of the high-dimensional OCR recognition process;
  • a processing module configured to traverse the pictures in the preset picture library for which neither the low-dimensional nor the high-dimensional OCR recognition process has been completed, and perform the low-dimensional OCR recognition process on each traversed picture to obtain the low-dimensional OCR recognition result of each corresponding picture;
  • a first determination module configured to determine, according to at least one of the low-dimensional OCR recognition result and the high-dimensional OCR recognition result of each picture, the target picture matching the key character string in the preset picture library;
  • a second determination module configured to determine the target picture as a search result of the picture search request, and display the search result.
  • the embodiments of the present application provide a computer program product, including computer programs or instructions, and when the computer programs or instructions are executed by a processor, the image search method provided by the embodiments of the present application is implemented.
  • An embodiment of the present application provides an electronic device, including: a memory for storing executable instructions; and a processor for implementing the image search method provided by the embodiments of the present application when executing the executable instructions stored in the memory.
  • the embodiments of the present application provide a computer-readable storage medium storing executable instructions for implementing the image search method provided by the embodiments of the present application when the executable instructions are executed by a processor.
  • The embodiments of the present application have the following beneficial effects: since some pictures in the preset picture library already have corresponding OCR recognition results, when a picture search is performed in the preset picture library, the low-dimensional OCR recognition process only needs to be performed on the remaining pictures that do not yet have OCR recognition results, so the OCR recognition results of all pictures in the preset picture library can be obtained quickly; when these OCR recognition results are used for picture search in the preset picture library, the picture search efficiency can be improved.
  • FIG. 1A is a schematic diagram of an image search process;
  • FIG. 1B is a schematic diagram of a search scene in which the picture to be searched is a note;
  • FIG. 2 is a schematic diagram of an optional architecture of a picture search system provided by an embodiment of the present application;
  • FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
  • FIG. 6 is a third optional schematic flowchart of a picture search method provided by an embodiment of the present application;
  • FIG. 7 is an optional schematic flowchart of high-dimensional OCR recognition processing provided by an embodiment of the present application;
  • FIG. 8 is a fourth optional schematic flowchart of the picture search method provided by an embodiment of the present application;
  • FIG. 9 is a schematic flowchart of a picture search method provided by an embodiment of the present application;
  • FIG. 10 is a detailed schematic flowchart of a picture search method provided by an embodiment of the present application;
  • FIG. 11 is a schematic flowchart of a simplified recognition process provided by an embodiment of the present application;
  • FIG. 12 is a schematic diagram of a simplified recognition process in an embodiment of the present application;
  • FIG. 13 is a schematic flowchart of a depth recognition strategy provided by an embodiment of the present application;
  • FIG. 14A is a schematic diagram of a picture before division and enlargement provided by an embodiment of the present application;
  • FIG. 14B is a schematic diagram of a picture after division and enlargement provided by an embodiment of the present application.
  • Referring to FIG. 1A, which is a schematic diagram of the picture search process.
  • For example, the user can enter "Hammer Depressed", and the system will automatically match picture 11 and output it.
  • FIG. 1B is a schematic diagram of a search scene in which the picture to be searched is a note. The existing search function cannot support this scene, because the threshold for OCR recognition during a search is set relatively low and the system is set to prioritize content that is easier to recognize, for example large fonts 12, which keeps the recognition time during the search from becoming too long; as a result, users cannot search for detailed text information, such as text information 13, through the search function, and the search function therefore needs to be optimized for this situation. That is to say, if more detailed text information in the picture needs to be searched, such as the note content or a store name in the picture, the original search function cannot provide such a refined search.
  • In view of this, an embodiment of the present application proposes an image search method that combines rapid recognition with accurate recognition to process images a second time, optimizing the efficiency and accuracy of OCR text recognition during search so that recognition results are obtained faster and more accurately.
  • In the image search method provided by the embodiments of the present application, first, a picture search request is obtained, where the picture search request includes a key character string; then, in response to the picture search request, an OCR recognition result of each picture in a preset picture library is obtained, where the OCR recognition result includes at least one of the following: a low-dimensional OCR recognition result obtained by a low-dimensional OCR recognition process based on an OCR recognition threshold, and a high-dimensional OCR recognition result obtained by a high-dimensional OCR recognition process based on depth recognition, the recognition accuracy of the low-dimensional OCR recognition process being lower than that of the high-dimensional OCR recognition process; next, the pictures in the preset picture library for which neither the low-dimensional nor the high-dimensional OCR recognition process has been completed are traversed, and the low-dimensional OCR recognition process is performed on each traversed picture to obtain the low-dimensional OCR recognition result of each corresponding picture; then, according to at least one of the low-dimensional and high-dimensional OCR recognition results of each picture, the target picture matching the key character string is determined in the preset picture library; finally, the target picture is determined as the search result of the picture search request, and the search result is displayed.
  • In this way, since the image search combines the recognition results of the low-dimensional OCR recognition process and the high-dimensional OCR recognition process, the text information in images can be searched more accurately, a refined search can be achieved, accurate search results can be obtained, and search efficiency can be improved.
  • In one implementation, the electronic device for image search provided by the embodiments of the present application may be implemented as any terminal such as a notebook computer, a tablet computer, a desktop computer, a mobile device (for example, a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, or a portable game device), or an intelligent robot; in another implementation, the electronic device for image search provided by the embodiment of the present application may also be implemented as a server.
  • a client on the terminal may be used to perform image search.
  • FIG. 2 is a schematic diagram of an optional architecture of a picture search system provided by an embodiment of the present application.
  • The picture search system 10 provided in this embodiment of the present application includes a terminal 100 (i.e., an electronic device), a network 200, and a server 10-1, where a picture search application runs on the terminal 100.
  • the picture search application corresponds to a preset picture library 400.
  • the preset picture library 400 stores a plurality of pictures. The user can input a key string through the client of the picture search application running on the terminal 100.
  • the client terminal responds to the user's picture search request, so as to obtain a target picture by matching in a preset picture library, wherein the target picture includes at least one picture;
  • Each picture in the preset picture library 400 is subjected to a low-dimensional OCR identification process based on an OCR identification threshold.
  • The server 10-1 acts as a background server, and is used to perform the high-dimensional OCR recognition process based on depth recognition on each picture in the preset picture library 400 during idle time, obtain a high-dimensional OCR recognition result, and send the high-dimensional OCR recognition result to the terminal 100 through the network 200.
  • Here, the idle time refers to the idle time of the terminal, that is, a time period in which each operating index of the terminal (CPU occupancy rate, memory occupancy rate, graphics card occupancy rate, etc.) is below a threshold, for example late at night, while charging, or when no application function is in use.
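  • As a minimal illustrative sketch of such an idle-time check, assuming psutil is available and the threshold values are chosen arbitrarily (the embodiments do not prescribe specific values, and graphics card occupancy is omitted here for simplicity):

```python
# Hedged sketch: the terminal counts as idle when its operating indexes are
# below thresholds. psutil and the numeric thresholds are assumptions.
import psutil

def terminal_is_idle(cpu_threshold: float = 20.0, mem_threshold: float = 60.0) -> bool:
    """Return True when CPU and memory occupancy are both below their thresholds."""
    cpu_percent = psutil.cpu_percent(interval=1.0)   # sampled over one second
    mem_percent = psutil.virtual_memory().percent
    return cpu_percent < cpu_threshold and mem_percent < mem_threshold
```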
  • When a picture search request is obtained, the terminal 100 responds to the picture search request and obtains, from the server 10-1 through the network 200, the OCR recognition result of each picture in the preset picture library, where the OCR recognition result includes at least one of the following: a low-dimensional OCR recognition result obtained by the low-dimensional OCR recognition process based on the OCR recognition threshold, and a high-dimensional OCR recognition result obtained by the high-dimensional OCR recognition process based on depth recognition. The terminal 100 also obtains, through the network 200 from the server 10-1, the pictures in the preset picture library that have completed the low-dimensional or high-dimensional OCR recognition process, traverses the pictures in the preset picture library that have completed neither process, and performs the low-dimensional OCR recognition process on each traversed picture to obtain its low-dimensional OCR recognition result. According to at least one of the low-dimensional and high-dimensional OCR recognition results of each picture, the target picture that matches the key character string is determined in the preset picture library, and the target picture is determined as the search result of the picture search request and displayed.
  • the image search method provided by the embodiment of the present application relates to the field of artificial intelligence technology, and can be implemented at least by computer vision technology and machine learning technology in artificial intelligence technology.
  • Computer vision (CV) technology is a science that studies how to make machines "see". It refers to using cameras and computers in place of human eyes to identify, track, and measure targets and perform other machine vision tasks, and to further carry out graphics processing so that the computer-processed result becomes an image more suitable for human observation or for transmission to an instrument for detection.
  • computer vision studies related theories and technologies trying to build artificial intelligence systems that can obtain information from images or multidimensional data.
  • Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional (3D) object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, and common biometric recognition technologies such as face recognition and fingerprint recognition.
  • Machine Learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specializes in how computers simulate or realize human learning behaviors to acquire new knowledge or skills, and to reorganize existing knowledge structures to continuously improve their performance.
  • Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and its applications are in all fields of artificial intelligence.
  • Machine learning and deep learning usually include artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, teaching learning and other technologies.
  • the OCR recognition of the picture is realized by using the machine learning technology.
  • FIG. 3 is a schematic structural diagram of an electronic device 300 provided by an embodiment of the present application.
  • the electronic device 300 shown in FIG. 3 includes: at least one processor 310 , a memory 350 , at least one network interface 320 and a user interface 330 .
  • the various components in electronic device 300 are coupled together by bus system 340 . It is understood that the bus system 340 is used to implement the connection communication between these components.
  • the bus system 340 also includes a power bus, a control bus and a status signal bus. However, for clarity of illustration, the various buses are labeled as bus system 340 in FIG. 3 .
  • the processor 310 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc., where a general-purpose processor may be a microprocessor or any conventional processor or the like.
  • User interface 330 includes one or more output devices 331 that enable presentation of media content, including one or more speakers and/or one or more visual display screens.
  • User interface 330 also includes one or more input devices 332, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, and other input buttons and controls.
  • Memory 350 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 350 optionally includes one or more storage devices that are physically remote from processor 310 . Memory 350 includes volatile memory or non-volatile memory, and may also include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM, Read Only Memory), and the volatile memory may be a random access memory (RAM, Random Access Memory). The memory 350 described in the embodiments of the present application is intended to include any suitable type of memory. In some embodiments, memory 350 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
  • the operating system 351 includes system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
  • An input processing module 353 for detecting one or more user inputs or interactions from one of the one or more input devices 332 and translating the detected inputs or interactions.
  • the picture search apparatus provided by the embodiments of the present application may be implemented in software.
  • FIG. 3 shows a picture search apparatus 354 stored in the memory 350. The picture search apparatus 354 may be software in the electronic device 300 in the form of programs and plug-ins, and includes the following software modules: an acquisition module 3541, a response module 3542, a processing module 3543, a first determination module 3544, and a second determination module 3545. These modules are logical, so they can be arbitrarily combined or further split according to the functions they implement. The function of each module will be explained below.
  • the image search apparatus provided by the embodiments of the present application may be implemented in hardware.
  • As an example, the image search apparatus provided by the embodiments of the present application may be a processor in the form of a hardware decoding processor, which is programmed to perform the image search method provided by the embodiments of the present application. For example, the processor in the form of a hardware decoding processor may adopt one or more application-specific integrated circuits (ASIC), DSPs, programmable logic devices (PLD), complex programmable logic devices (CPLD), field-programmable gate arrays (FPGA), or other electronic components.
  • FIG. 4 is an optional first schematic flowchart of a picture search method provided by an embodiment of the present application, which will be described with reference to the steps shown in FIG. 4 .
  • A picture search application is running on the electronic device. The user can input a key character string on the client of the picture search application, and the client forms a picture search request based on the user's input operation or click-to-search operation, to request a search for the picture corresponding to this key character string. The key character string may be the type of the picture, the text in the picture, a summary of the text in the picture, and so on.
  • the client when the client performs a picture search in response to a picture search request, it may search in an online state or in an offline state.
  • the OCR identification result includes at least one of the following: a low-dimensional OCR identification result obtained by using a low-dimensional OCR identification process based on an OCR identification threshold and a high-dimensional OCR identification result obtained by a high-dimensional OCR identification process based on depth identification.
  • Here, the low-dimensional OCR recognition process is a simplified recognition strategy that performs simple recognition on a picture, while the high-dimensional OCR recognition process is a deep recognition strategy that performs more detailed and accurate recognition on a picture.
  • The recognition accuracy of the low-dimensional OCR recognition process is lower than that of the high-dimensional OCR recognition process. The low-dimensional OCR recognition process is less difficult, with lower recognition accuracy, higher recognition speed, and lower resource consumption; the high-dimensional OCR recognition process is more difficult, with higher recognition accuracy, lower recognition speed, and higher resource consumption.
  • the OCR recognition threshold is a relatively balanced value between recognition accuracy and recognition time, that is, when the OCR recognition threshold is met, not only the recognition speed is fast, but also the recognition error tolerance rate is high.
  • In some embodiments, the OCR recognition threshold may include a threshold corresponding to the font size or a threshold corresponding to the recognition reliability. That is, when characters of a certain font size can be recognized with both guaranteed recognition accuracy and guaranteed recognition efficiency, the font size value can be used as the OCR recognition threshold; or, when recognizing the text in a picture, if a certain reliability is reached at which both the recognition accuracy and the recognition efficiency are high, that reliability can be determined as the OCR recognition threshold.
  • the low-dimensional OCR recognition processing is performed based on the OCR recognition threshold, that is, when the low-dimensional OCR recognition processing is performed on a picture, the recognition parameters meet the OCR recognition threshold.
  • For example, when the OCR recognition threshold includes a font size threshold, in the low-dimensional OCR recognition process only the text in the picture whose font size is larger than the font size threshold is recognized, and text whose font size is smaller than the font size threshold is not recognized. That is, when the low-dimensional OCR recognition process is performed on a picture that contains detailed text, such as a note, OCR recognition is not performed on all the text in the picture; only the part of the text that is easy to recognize is recognized, which improves recognition efficiency.
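  • As a minimal illustrative sketch of the font-size-threshold rule, assuming a hypothetical text-region detector and recognizer passed in as parameters (the embodiments do not name a specific OCR engine), and using pixel height as a proxy for font size:

```python
# Hedged sketch of the low-dimensional OCR pass with a font size threshold:
# only text regions taller than the threshold are recognized.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class TextBox:
    image_crop: object   # cropped region containing one text line (hypothetical)
    height_px: int       # pixel height, used here as a proxy for font size

def low_dimensional_ocr(picture: object,
                        detect_text_boxes: Callable[[object], List[TextBox]],
                        recognize_text: Callable[[object], str],
                        font_size_threshold_px: int = 24) -> List[str]:
    results: List[str] = []
    for box in detect_text_boxes(picture):              # hypothetical detector
        if box.height_px <= font_size_threshold_px:
            continue                                     # skip small, detailed text
        results.append(recognize_text(box.image_crop))   # hypothetical recognizer
    return results
```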
  • Depth recognition refers to an accurate recognition method that also recognizes the detailed content in the picture. In the process of deep recognition, not only the overall content is recognized, but also the detailed text in the picture is recognized. The high-dimensional OCR recognition processing based on depth recognition can recognize each word in the picture. Therefore, the high-dimensional OCR recognition processing has higher recognition accuracy and is more time-consuming.
  • After the low-dimensional OCR recognition process, a low-dimensional OCR recognition result is obtained; after the high-dimensional OCR recognition process, a high-dimensional OCR recognition result is obtained.
  • The electronic device checks each picture in the preset picture library to determine whether it has undergone the low-dimensional OCR recognition process and the high-dimensional OCR recognition process, and traverses the pictures in the preset picture library that have undergone neither the low-dimensional nor the high-dimensional OCR recognition process.
  • Here, the electronic device can determine whether the low-dimensional or high-dimensional OCR recognition process has been performed on a picture by checking whether the low-dimensional or high-dimensional OCR recognition result of that picture is stored in the preset storage unit.
  • the low-dimensional OCR identification result of the picture may be correspondingly stored in the preset storage unit.
  • When matching the target picture, the electronic device can perform matching based not only on the low-dimensional OCR recognition result of a picture but also on its high-dimensional OCR recognition result. When a picture has a high-dimensional OCR recognition result, matching is preferably based on that result, because the high-dimensional OCR recognition result contains more recognized content and has higher recognition accuracy than the low-dimensional OCR recognition result; when a picture has only a low-dimensional OCR recognition result, matching is performed based on the low-dimensional OCR recognition result. In addition, matching can also be performed based on both the low-dimensional and the high-dimensional OCR recognition results.
  • In some embodiments, when matching the target picture, the key character string may be matched against the corresponding text content in the low-dimensional OCR recognition result or the high-dimensional OCR recognition result to determine the similarity between that text content and the key character string, and the picture with the highest similarity is determined as the target picture; alternatively, after the similarity between each picture and the key character string is determined, the pictures are sorted in descending order of similarity to form a picture sequence, and a specific number of pictures are selected from the picture sequence as the target pictures.
  • In some embodiments, when matching the target picture, an image key character string corresponding to each picture may first be determined according to the text content of its low-dimensional or high-dimensional OCR recognition result, and the key character string in the picture search request is then matched against the image key character string of each picture; the picture whose image key character string is identical or similar to the key character string in the picture search request is determined as the target picture.
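  • As a minimal illustrative sketch of the matching step, assuming the OCR recognition results are available as plain text per picture and using a generic fuzzy-similarity measure (the embodiments do not prescribe a specific metric):

```python
# Hedged sketch of similarity-based matching between the key character string
# and the OCR recognition result of each picture, returning the best matches.
from difflib import SequenceMatcher
from typing import Dict, List

def match_target_pictures(key_string: str,
                          ocr_results: Dict[str, str],   # picture id -> recognized text
                          top_k: int = 5) -> List[str]:
    scored = []
    for picture_id, text in ocr_results.items():
        # Containment counts as a perfect match; otherwise use fuzzy similarity.
        score = 1.0 if key_string in text else SequenceMatcher(None, key_string, text).ratio()
        scored.append((score, picture_id))
    scored.sort(reverse=True)                      # descending order of similarity
    return [pid for _, pid in scored[:top_k]]      # a specific number of target pictures
```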
  • In the embodiments of the present application, the high-dimensional OCR recognition results and low-dimensional OCR recognition results of some pictures in the preset picture library are pre-stored, so that during a search those pictures can be matched using their high-dimensional OCR recognition results while the remaining pictures are matched using low-dimensional OCR recognition results. The low-dimensional OCR recognition results can be pre-stored or obtained in real time; because low-dimensional OCR recognition results can be obtained quickly, both the efficiency and the speed of the picture search are guaranteed.
  • S405 Determine the target picture as the search result of the picture search request, and display the search result.
  • When only one target picture is determined, that picture is displayed on the current interface of the electronic device; when multiple target pictures are determined, the multiple pictures are displayed simultaneously on the current interface of the electronic device, or displayed in pages.
  • In the picture search method provided by the embodiments of the present application, the low-dimensional OCR recognition process based on the OCR recognition threshold and the high-dimensional OCR recognition process based on depth recognition are used to process the pictures in the preset picture library to obtain the corresponding low-dimensional and high-dimensional OCR recognition results, and the target picture of the picture search request is matched according to the low-dimensional or high-dimensional OCR recognition result of each picture. Performing the picture search on these recognition results makes it possible to search the text information in pictures more accurately, achieve a refined search, obtain accurate search results, and improve search efficiency.
  • FIG. 5 is an optional second schematic flowchart of the image search method provided by the embodiment of the present application, which will be described in conjunction with the steps shown in FIG. 5 .
  • Here, the OCR recognition result includes at least one of the following: a low-dimensional OCR recognition result obtained by a low-dimensional OCR recognition process based on an OCR recognition threshold, and a high-dimensional OCR recognition result obtained by a high-dimensional OCR recognition process based on depth recognition; the recognition accuracy of the low-dimensional OCR recognition process is lower than that of the high-dimensional OCR recognition process.
  • S501 to S503 are the same as the descriptions of the implementation processes corresponding to the foregoing S401 to S403.
  • the OCR identification threshold includes an identification speed threshold.
  • the low-dimensional OCR identification process can be performed through the following steps.
  • S504 Determine the recognition speed for each character in the traversed picture.
  • Here, the recognition speed for a character refers to the ratio of the number of characters of the text to be recognized to the time required to recognize it. The higher the recognition speed, the lower the recognition difficulty and the easier the corresponding text is to recognize; the lower the recognition speed, the higher the recognition difficulty and the harder the corresponding text is to recognize.
  • the electronic device may determine the recognition speed for each type of text in advance according to the OCR recognition situation, so as to determine an appropriate recognition speed threshold.
  • S505 Perform OCR recognition on the characters whose recognition speed is greater than the recognition speed threshold.
  • Text whose recognition speed is greater than the recognition speed threshold is relatively easy to recognize, and the electronic device can perform OCR recognition only on this relatively easy-to-recognize text to complete the low-dimensional OCR recognition process for the picture.
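  • As a minimal illustrative sketch of the recognition-speed variant, assuming a hypothetical per-region recognizer and measuring speed as recognized characters per second (the embodiments only require comparing a recognition speed against a threshold):

```python
# Hedged sketch: keep only the results of text regions recognized faster than
# the recognition speed threshold (higher speed means easier to recognize).
import time
from typing import Callable, List

def low_dim_ocr_by_speed(regions: List[object],
                         recognize: Callable[[object], str],
                         speed_threshold: float) -> List[str]:
    kept: List[str] = []
    for region in regions:
        start = time.perf_counter()
        text = recognize(region)                    # hypothetical OCR call
        elapsed = time.perf_counter() - start
        speed = len(text) / elapsed if elapsed > 0 else float("inf")
        if speed > speed_threshold:
            kept.append(text)                       # easy-to-recognize text only
    return kept
```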
  • the OCR recognition threshold includes a font size threshold.
  • the low-dimensional OCR recognition process can be performed through the following steps.
  • S506 Determine the font size of each character in the traversed picture.
  • characters with larger fonts are relatively easier to recognize, and thus the recognition speed is higher, and characters with smaller fonts are relatively more difficult to recognize, and thus the recognition speed is lower.
  • the electronic device may determine the recognition speed for recognizing characters of different font sizes in advance according to the situation of OCR recognition, so as to determine an appropriate font size threshold.
  • the pictures may also include variant characters, and correspondingly, the following steps may be used to perform low-dimensional OCR recognition processing.
  • Since the electronic device cannot accurately recognize variant characters, variant characters are not recognized.
  • S510 Determine the target picture as the search result of the picture search request, and display the search result.
  • In the embodiments of the present application, different OCR recognition thresholds can be set and used as reference conditions for text recognition, so that recognition speed is improved while recognition accuracy is ensured, achieving a balance between recognition accuracy and recognition efficiency.
  • FIG. 6 is a third optional schematic flowchart of the image search method provided by the embodiment of the present application. In some embodiments, the recognition accuracy of the low-dimensional OCR recognition process is lower than that of the high-dimensional OCR recognition process; as shown in FIG. 6, S404 can be implemented by the following steps.
  • If the judgment result is yes, S602 is executed; if the judgment result is no, the process returns to S403 to continue performing the low-dimensional OCR recognition process on the picture.
  • If the judgment result is yes, S603 is executed; if the judgment result is no, S604 is executed.
  • Since the accuracy of the high-dimensional OCR recognition result is higher than that of the low-dimensional OCR recognition result, when a picture has both a low-dimensional and a high-dimensional OCR recognition result, the higher-precision high-dimensional OCR recognition result is used as the basis for matching the target picture; when a picture has only a low-dimensional OCR recognition result, in order to ensure the timeliness of the current picture search task and improve its search efficiency, the low-dimensional OCR recognition result continues to be used as the basis for matching the target picture. Because the low-dimensional OCR recognition result also has a certain reliability and recognition accuracy, the accuracy of the final matching result can be guaranteed to a certain extent while picture search efficiency is ensured.
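  • As a minimal illustrative sketch of this selection rule (the field names are assumptions for illustration):

```python
# Hedged sketch: prefer the high-dimensional result when it exists,
# otherwise fall back to the low-dimensional one.
from typing import Optional

def text_for_matching(low_dim_result: Optional[str],
                      high_dim_result: Optional[str]) -> Optional[str]:
    if high_dim_result is not None:
        return high_dim_result   # more recognized content, higher accuracy
    return low_dim_result        # keeps the current search task responsive
```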
  • FIG. 7 is a schematic flowchart of an optional high-dimensional OCR identification process provided by an embodiment of the present application, which will be described in conjunction with the steps shown in FIG. 7 .
  • the high-dimensional OCR recognition processing can be implemented in idle time, that is, when the electronic device does not perform the image search task, the high-dimensional OCR recognition processing can be performed in the background. Since the image search task is not performed before the image search request is acquired, or after the response to the image search request is completed, or when the search request response is interrupted, the high-dimensional OCR recognition process can be performed in these time periods. In order to realize the high-dimensional OCR recognition processing for each image in the preset image library, in the subsequent image search tasks, the image search can be performed based on the high-dimensional OCR recognition results with higher accuracy.
  • Here, unprocessed pictures are pictures that have not undergone the high-dimensional OCR recognition process; that is, unprocessed pictures include not only pictures for which neither the low-dimensional nor the high-dimensional OCR recognition process has been completed, but also pictures for which the low-dimensional OCR recognition process has been completed while the high-dimensional OCR recognition process has not.
  • In some embodiments, after performing the low-dimensional OCR recognition process on each traversed picture and obtaining its low-dimensional OCR recognition result, the electronic device stores the low-dimensional OCR recognition result in the preset storage unit; after processing each unprocessed picture with the high-dimensional OCR recognition process and obtaining its high-dimensional OCR recognition result, it stores the high-dimensional OCR recognition result in the preset storage unit and deletes the corresponding picture's low-dimensional OCR recognition result.
  • In the embodiments of the present application, the electronic device stores each low-dimensional or high-dimensional OCR recognition result in the preset storage unit as soon as the corresponding recognition process is completed. In this way, when a picture search task is performed later, the low-dimensional or high-dimensional OCR recognition results can be obtained quickly and directly from the preset storage unit, and fast key character string matching can be performed on them, without having to run the low-dimensional or high-dimensional OCR recognition process on the pictures again, which improves picture search efficiency.
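  • As a minimal illustrative sketch of the preset storage unit's update rule, assuming an in-memory mapping stands in for whatever persistent store is actually used:

```python
# Hedged sketch: cache one recognition result per picture and let a
# high-dimensional result replace the low-dimensional one once available.
from typing import Dict, Optional

class PresetStorageUnit:
    def __init__(self) -> None:
        self._low: Dict[str, str] = {}    # picture id -> low-dimensional OCR text
        self._high: Dict[str, str] = {}   # picture id -> high-dimensional OCR text

    def store_low(self, picture_id: str, text: str) -> None:
        self._low[picture_id] = text

    def store_high(self, picture_id: str, text: str) -> None:
        self._high[picture_id] = text
        self._low.pop(picture_id, None)   # delete the superseded low-dimensional result

    def result_for(self, picture_id: str) -> Optional[str]:
        # Prefer the high-dimensional result when both exist.
        return self._high.get(picture_id, self._low.get(picture_id))

    def needs_low_dim_processing(self, picture_id: str) -> bool:
        # True when neither recognition process has been completed yet.
        return picture_id not in self._low and picture_id not in self._high
```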
  • S702 may be implemented through S7021 and S7022, and each step will be described below.
  • the text sharpening process includes the following steps: first, dividing the unprocessed picture to form at least two sub-pictures; then, enlarging each sub-picture to obtain an enlarged sub-picture.
  • Here, the electronic device may divide the unprocessed picture equally into at least two sub-pictures, or may divide it into at least two irregular or unequal sub-pictures, either in an arbitrary manner or based on a certain division rule.
  • When the unprocessed picture is divided irregularly or unequally, for example, if the left third of an unprocessed picture A is a pure image without any text while the right two-thirds is a text region formed by text, the unprocessed picture A can be divided into two parts: the first part is the sub-picture formed by the left third (the pure image), and the second part is the sub-picture formed by the right two-thirds (the text region). Such a division does not affect the continuity of the text in the second part, the text in the second part can be recognized more accurately, and OCR recognition only needs to be performed on the second part, so that not only is recognition accuracy improved, but recognition efficiency is also effectively improved.
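  • As a minimal illustrative sketch of the divide-and-enlarge step, assuming Pillow is used and an equal grid division with an arbitrary scale factor (the embodiments do not prescribe the grid size or scale):

```python
# Hedged sketch of "text sharpening": divide a picture into equal sub-pictures
# and enlarge each one so small text becomes easier for the OCR engine to read.
from typing import List
from PIL import Image

def divide_and_enlarge(picture: Image.Image,
                       rows: int = 2, cols: int = 2,
                       scale: int = 2) -> List[Image.Image]:
    width, height = picture.size
    tile_w, tile_h = width // cols, height // rows
    sub_pictures: List[Image.Image] = []
    for r in range(rows):
        for c in range(cols):
            box = (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
            tile = picture.crop(box)
            tile = tile.resize((tile_w * scale, tile_h * scale), Image.LANCZOS)
            sub_pictures.append(tile)
    return sub_pictures
```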
  • S7022 Perform OCR recognition on the text in the image after the text sharpening process, to obtain a high-dimensional OCR recognition result of each unprocessed image.
  • In some embodiments, in S7022 the OCR recognition of the text in the picture after the text sharpening process can be implemented by the following steps: performing OCR recognition on the text in each enlarged sub-picture to obtain a sub-recognition result corresponding to each sub-picture, and then fusing the sub-recognition results of the at least two sub-pictures to obtain the high-dimensional OCR recognition result of the unprocessed picture.
  • In some embodiments, the electronic device may determine whether the at least two sub-recognition results corresponding to the at least two sub-pictures include overlapping content; when they do, the non-overlapping content and the overlapping content in the at least two sub-recognition results are determined, and the non-overlapping content and the overlapping content are fused to obtain the high-dimensional OCR recognition result of the unprocessed picture.
  • the fusion of the non-overlapping content and the overlapping content refers to deleting the repeated part of the overlapping content in the high-dimensional OCR recognition result.
  • For example, if the sub-recognition result of the first sub-picture includes the four keywords A, B, C, and D, and the sub-recognition result of the second sub-picture includes the four keywords C, D, E, and F, then the non-overlapping contents of the two sub-recognition results are A, B, E, and F, and the overlapping contents are C and D. In this case, the high-dimensional OCR recognition result of the unprocessed picture should be A, B, C, D, E, F, and not A, B, C, D, C, D, E, F; that is, the repeated overlapping parts C and D need to be removed from the high-dimensional OCR recognition result.
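  • As a minimal illustrative sketch of the fusion step, mirroring the A, B, C, D and C, D, E, F example above (the keyword-list representation is an assumption for illustration):

```python
# Hedged sketch: fuse sub-recognition results while keeping overlapping
# keywords only once.
from typing import Iterable, List

def fuse_sub_results(sub_results: Iterable[List[str]]) -> List[str]:
    fused: List[str] = []
    seen = set()
    for result in sub_results:
        for keyword in result:
            if keyword not in seen:     # overlapping content is kept only once
                seen.add(keyword)
                fused.append(keyword)
    return fused

# fuse_sub_results([["A", "B", "C", "D"], ["C", "D", "E", "F"]])
# returns ["A", "B", "C", "D", "E", "F"]
```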
  • When the at least two sub-recognition results do not include overlapping content, the sub-recognition results are directly combined to determine the high-dimensional OCR recognition result of the unprocessed picture.
  • In some embodiments, the electronic device can divide, enlarge, and recognize a sub-picture again, and fuse the recognized results, so as to obtain a more accurate recognition result for the sub-picture.
  • the image search method may be implemented by a client in the image search system, a preset storage unit corresponding to the client, and a server.
  • FIG. 8 is a fourth optional schematic flowchart of the image search method provided by the embodiment of the present application. As shown in FIG. 8, the image search method includes S801 to S815, and each step will be described below.
  • the server processes each picture in the preset picture library by using a high-dimensional OCR recognition process based on depth recognition, and obtains a high-dimensional OCR recognition result of each picture.
  • the server performs high-dimensional OCR identification processing on each picture in the preset picture library during idle time, which can effectively utilize resources and avoid the problem of reducing search efficiency by performing high-dimensional OCR identification processing during image search tasks.
  • the server stores the high-dimensional OCR identification result in a preset storage unit.
  • Here, each time the server processes a picture and obtains its high-dimensional OCR recognition result, it stores the result in the preset storage unit, which ensures that the high-dimensional OCR recognition result can be obtained and used in time in the next picture search task.
  • the client obtains a picture search request, where the picture search request includes a key character string.
  • the client in response to the picture search request, obtains the OCR identification result of each picture in the preset picture library from the preset storage unit.
  • The OCR recognition result includes at least one of the following: a low-dimensional OCR recognition result obtained by a low-dimensional OCR recognition process based on an OCR recognition threshold, and a high-dimensional OCR recognition result obtained by a high-dimensional OCR recognition process based on depth recognition; the recognition accuracy of the low-dimensional OCR recognition process is lower than that of the high-dimensional OCR recognition process.
  • The client traverses the pictures in the preset picture library for which neither the low-dimensional nor the high-dimensional OCR recognition process has been completed, and performs the low-dimensional OCR recognition process on each traversed picture to obtain the low-dimensional OCR recognition result of each corresponding picture.
  • the low-dimensional OCR recognition process may be performed on the newly added image in time in the next image search task.
  • the client determines the reliability corresponding to the low-dimensional OCR identification result of each picture.
  • Here, a specific OCR recognition model can be used for OCR recognition. When the OCR recognition model performs recognition on a picture, not only the low-dimensional OCR recognition result can be obtained, but also the reliability corresponding to the current low-dimensional OCR recognition result.
  • the influencing factors of reliability include, but are not limited to, at least one of the following: the clarity of the picture, the type of the picture, and the number of words recognized, etc.
  • For example, the reliability of the recognition result will be relatively low for pictures with low definition or blurred content; there is also a difference in reliability between recognizing printed characters and handwritten characters, with the reliability of recognizing handwritten text being lower; and when the same picture is recognized, if the number of recognized characters is much smaller than the actual number of characters, the reliability of the recognition result is low.
  • the client deletes the low-dimensional OCR identification results whose reliability is lower than the threshold.
  • In this way, only low-dimensional OCR recognition results with high reliability are retained.
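  • As a minimal illustrative sketch of the reliability filter, assuming each low-dimensional result comes with a reliability score in [0, 1] and an arbitrary threshold (the embodiments do not prescribe a specific value):

```python
# Hedged sketch: discard low-dimensional OCR results whose reliability
# falls below the threshold, keeping only the reliable ones.
from typing import Dict, Tuple

def keep_reliable_results(results: Dict[str, Tuple[str, float]],
                          reliability_threshold: float = 0.6) -> Dict[str, str]:
    """results maps picture id -> (recognized text, reliability)."""
    return {picture_id: text
            for picture_id, (text, reliability) in results.items()
            if reliability >= reliability_threshold}
```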
  • the client stores the low-dimensional OCR recognition result of each corresponding picture into a preset storage unit.
  • the client determines a target picture matching the key string in a preset picture library according to at least one of a low-dimensional OCR identification result and a high-dimensional OCR identification result of each picture.
  • the client determines the target image as the search result of the image search request, and displays the search result.
  • the server continues to use the high-dimensional OCR recognition processing based on depth recognition to process the pictures in the preset picture library that have not been processed for high-dimensional OCR recognition, to obtain a high-dimensional OCR recognition result of the pictures.
  • Here, after completing a picture search task, the background server can continue, during idle time, to perform the high-dimensional OCR recognition process based on depth recognition on the pictures in the preset picture library that have not yet undergone high-dimensional OCR recognition.
  • the server stores the high-dimensional OCR identification result in a preset storage unit.
  • the server deletes the low-dimensional OCR identification result of the corresponding picture in the preset storage unit.
  • Since the recognition accuracy of the high-dimensional OCR recognition result is higher than that of the low-dimensional OCR recognition result, when any picture has both a low-dimensional and a high-dimensional OCR recognition result, only the higher-accuracy high-dimensional OCR recognition result can be retained, and the low-dimensional OCR recognition result stored in the preset storage unit is deleted.
  • In this way, the high-dimensional OCR recognition result stored in the preset storage unit can be used directly for key character string matching in subsequent picture search tasks, without first having to determine the higher-accuracy result from between the low-dimensional and high-dimensional OCR recognition results, which saves one step of judgment and selection and further improves search efficiency.
  • The following exemplary application describes, for the case where the preset picture library is an album, a process of accurately and quickly searching the album for a target picture matching the search keyword input by the user.
  • the embodiment of the present application provides a picture search method.
  • The user only needs to input a search keyword on the input interface of the picture search application, and the picture search application can automatically search for search results matching the search keyword.
  • the search result can accurately include the text existing in the picture, and the method of the embodiment of the present application can be applied to all the scenes of searching for pictures.
  • FIG. 9 it is a schematic flowchart of a picture search method provided by an embodiment of the present application. As shown in FIG. 9 , the picture search method is implemented by a client, including S901 to S903 , and each step is described below.
  • S901 acquiring a search keyword (called a key character string) input by a user.
  • S902 perform OCR identification on the picture, and determine whether the OCR identification result contains a search keyword, and obtain a search result (called a target picture).
  • In the embodiments of the present application, the picture search method can be implemented by combining a simplified recognition strategy and a depth recognition strategy, where the simplified recognition strategy corresponds to the low-dimensional OCR recognition process of the embodiments of the present application, and the depth recognition strategy corresponds to the high-dimensional OCR recognition process of the embodiments of the present application.
  • FIG. 10 is a detailed flowchart of a picture search method provided by an embodiment of the present application. As shown in FIG. 10 , the picture search method includes S1001 to S1013 , and each step is described below.
  • the client when the client starts to perform a picture search, it obtains a search keyword in response to a user operation.
  • If the simplified recognition of all pictures in the preset picture library has been completed, the recognition results can be used for the query; if it has not been completed, the simplified recognition process is entered. That is to say, if the judgment result is yes, the recognition results are used for the query and S1003 is executed; if the judgment result is no, S1004 is executed.
  • Here, when searching by using the OCR recognition content, if the full simplified recognition has been performed on the preset picture library, the search keyword can be used to search the recognition results, and the process ends with outputting the search results that contain the search keyword.
  • pictures here are pictures for which the low-dimensional OCR identification processing has not been completed, and the high-dimensional OCR identification processing has not been completed.
  • S1007 determine whether to perform a deep scan (that is, whether to adopt a deep recognition strategy to perform deep recognition in a background idle time).
  • When it is determined to use the depth recognition strategy for processing, the depth recognition strategy includes S1009 to S1013, and each step will be described below.
  • S1011 perform OCR identification on the segmented picture (referred to as a sub-picture).
  • Here, the existing results refer to the results already recognized, in the historical process, for the other equal parts of a picture while any one equal part (called a sub-picture) of that picture is currently being recognized.
  • FIG. 11 is a schematic flowchart of the abbreviated recognition process provided by an embodiment of the present application.
  • In the abbreviated recognition process, OCR is used to simply recognize the text on a picture. FIG. 12 is a schematic diagram of the abbreviated recognition process in an embodiment of the present application.
  • In the abbreviated recognition process, large characters and regular characters, such as the text 121 in FIG. 12, are the targets of active recognition, while small characters, variant characters, and the like, such as the text 122 in FIG. 12, require more time and resources to recognize; recognition of such fonts is therefore abandoned, ensuring that the recognition time of a single picture can be kept within 10 milliseconds.
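One hedged way to approximate the abbreviated strategy is to run a single OCR pass that reports per-word boxes and keep only words rendered above a size threshold, dropping small or uncertain text. The sketch assumes the Pillow and pytesseract packages are available and uses an illustrative 24-pixel height threshold; the 10-millisecond budget mentioned above would in practice be enforced by the engine configuration rather than by this filter.

```python
from PIL import Image              # assumes Pillow is installed
import pytesseract                 # assumes the Tesseract engine is available
from pytesseract import Output

def abbreviated_ocr(path, min_height_px=24):
    """Low-dimensional pass: keep only words rendered large enough to read cheaply.

    Small characters (and anything the engine is unsure about) are dropped,
    trading completeness for speed, as described for the abbreviated strategy.
    """
    image = Image.open(path)
    data = pytesseract.image_to_data(image, output_type=Output.DICT)
    words = []
    for text, height, conf in zip(data["text"], data["height"], data["conf"]):
        if text.strip() and float(conf) >= 0 and int(height) >= min_height_px:
            words.append(text.strip())
    return " ".join(words)
```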
  • The process of the abbreviated recognition strategy includes S111 to S117; each step is described below.
  • S111: traverse the pictures in the preset picture library. When the abbreviated recognition strategy starts to be executed, the client traverses the pictures in the preset picture library to obtain pictures for which neither the low-dimensional OCR recognition processing nor the high-dimensional OCR recognition processing has been completed.
  • S112: perform OCR recognition on the text in the pictures; the pictures here are pictures for which neither the low-dimensional OCR recognition processing nor the high-dimensional OCR recognition processing has been completed.
  • S113: determine whether the reliability of the recognition result is greater than 80%. Here, recognition results with high reliability are selected.
  • In the abbreviated recognition process, results with low reliability are excluded, because abbreviated recognition is triggered exactly when the user is searching and deep recognition has not yet been completed; a certain recognition accuracy therefore needs to be maintained to ensure that the user can search normally while avoiding too many interfering search items caused by low reliability.
  • the reliability is a value that can be obtained when performing OCR recognition, that is, when performing OCR recognition on a picture, not only the recognition result but also the reliability corresponding to the recognition result is output.
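Since the OCR engine reports a reliability value alongside each recognition result, the 80% gate of the abbreviated process can be expressed as a simple filter; `RecognitionResult` below is an illustrative container, not a structure defined by this application.

```python
from dataclasses import dataclass

@dataclass
class RecognitionResult:
    picture: str
    text: str
    reliability: float  # 0.0 - 1.0, reported by the OCR engine

def keep_reliable(results, threshold=0.8):
    """S113/S116: keep results above the reliability threshold, discard the rest."""
    kept, discarded = [], []
    for r in results:
        (kept if r.reliability > threshold else discarded).append(r)
    return kept, discarded
```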
  • S114: save the recognition result. When all pictures have been recognized, the OCR results of the corresponding pictures are stored in a database.
  • When a new picture is added, there is no need to perform full recognition again; only the newly added picture needs to be recognized once, and its recognition result is saved in the database (referred to as a preset storage unit).
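A minimal sketch of the database referred to as the preset storage unit, and of the incremental path for newly added pictures, using SQLite purely as a stand-in; the schema and function names are assumptions made for illustration.

```python
import sqlite3

def open_store(db_path="ocr_results.db"):
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ocr_results ("
        " picture TEXT PRIMARY KEY,"
        " text TEXT NOT NULL,"
        " level TEXT NOT NULL)"   # 'low' or 'high' dimensional result
    )
    return conn

def save_result(conn, picture, text, level="low"):
    conn.execute(
        "INSERT OR REPLACE INTO ocr_results (picture, text, level) VALUES (?, ?, ?)",
        (picture, text, level),
    )
    conn.commit()

def recognize_new_picture(conn, picture, abbreviated_ocr):
    """S117: only the newly added picture is recognized, not the whole library."""
    save_result(conn, picture, abbreviated_ocr(picture), level="low")
```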
  • The deep recognition strategy is described in detail below.
  • FIG. 13 is a schematic flowchart of the deep recognition strategy provided by an embodiment of the present application. As shown in FIG. 13, the deep recognition strategy includes S131 to S138; each step is described below.
  • S131: traverse the pictures in the preset picture library. Deep recognition is performed during idle time (for example, late at night, or while the device is charging and the application is not in use).
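How idle time is detected is not specified beyond the examples given (late at night, charging, application not in use), so the check below is only an illustrative guess, with placeholder predicates standing in for the platform-specific battery and foreground-app signals.

```python
import datetime

def is_idle_time(now=None, is_charging=lambda: False, app_in_use=lambda: True):
    """Return True when deep recognition may run in the background.

    is_charging and app_in_use stand in for platform APIs (battery state,
    foreground-app state) that are outside the scope of this sketch.
    """
    now = now or datetime.datetime.now()
    late_night = now.hour >= 23 or now.hour < 6
    return late_night or (is_charging() and not app_in_use())
```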
  • S132: divide the picture into four equal parts and enlarge them. In deep recognition, the picture is segmented and enlarged to ensure that more information can be recognized. FIG. 14A is a schematic diagram of a picture before it is divided into equal parts and enlarged, and FIG. 14B is a schematic diagram of the picture after it is divided into equal parts and enlarged. As shown in FIG. 14A and FIG. 14B, in the original picture area 141 before the division and enlargement, the text is small and difficult to recognize, while in the partially enlarged picture area 142 after the picture is divided into equal parts and enlarged, the text is enlarged and easy to recognize; the picture area 142 is the enlarged result of the picture area 141.
  • After segmentation, the picture is recognized. If a segmented picture does not contain any text information, that segmented picture is discarded and the segmented picture area is not recognized further.
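The quartering-and-enlarging step illustrated by FIG. 14A and FIG. 14B, together with discarding segments that contain no text, can be sketched with Pillow as follows; the 2x enlargement factor and the `has_text` / `ocr` callables are assumptions, since the original does not fix those details.

```python
from PIL import Image

def quarter_and_enlarge(path, scale=2):
    """Split a picture into four equal parts and enlarge each part."""
    image = Image.open(path)
    w, h = image.size
    boxes = [(0, 0, w // 2, h // 2), (w // 2, 0, w, h // 2),
             (0, h // 2, w // 2, h), (w // 2, h // 2, w, h)]
    for box in boxes:
        part = image.crop(box)
        yield part.resize((part.width * scale, part.height * scale), Image.LANCZOS)

def deep_recognize(path, ocr, has_text):
    """Recognize each enlarged quarter, discarding quarters with no text at all."""
    results = []
    for sub in quarter_and_enlarge(path):
        if not has_text(sub):          # a segment without text is discarded
            continue
        results.append(ocr(sub))
    return results
```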
  • For results with low reliability, the picture may be segmented and recognized again.
  • After the picture is divided into quarters, it may still contain too much text information (for example, panoramic pictures or long screenshots), and the reliability of the recognized results will then be low.
  • For such pictures, the segmented picture is segmented again and the twice-segmented pictures are likewise recognized; if high-reliability content (for example, reliability greater than 70%) has already been recognized from the picture, or the picture does not contain text information, it does not need to be segmented again.
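The re-segmentation rule (split again only while the result is still unreliable and the crop still appears to hold text) can be captured recursively. The 70% reliability floor comes from the surrounding description, while the recursion depth limit and the function signature are assumptions added for this sketch.

```python
def recognize_with_resplit(sub_image, ocr, split4, reliability_floor=0.7, max_depth=2):
    """Recognize a sub-picture; if the result is still unreliable, split and retry.

    ocr(image)    -> (text, reliability)
    split4(image) -> four enlarged sub-images
    max_depth is an assumed safeguard so panoramic pictures and long
    screenshots cannot trigger unbounded splitting.
    """
    text, reliability = ocr(sub_image)
    if reliability >= reliability_floor or max_depth == 0 or not text:
        return [(text, reliability)] if text else []
    results = []
    for part in split4(sub_image):
        results.extend(
            recognize_with_resplit(part, ocr, split4, reliability_floor, max_depth - 1)
        )
    return results
```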
  • The recognition result may be saved in the database; if the database already contains the result of the abbreviated recognition process for the picture, that result is replaced with the deep recognition result. Similarly, if a new picture is added, incremental recognition can be performed directly on the new picture, that is, deep recognition is performed on the new picture.
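Continuing the SQLite sketch above, replacing a stored abbreviated result with the deep result is a single upsert keyed on the picture path; the schema remains an assumption rather than something prescribed by the application.

```python
def save_deep_result(conn, picture, deep_text):
    """Overwrite any abbreviated ('low') result for this picture with the deep one."""
    conn.execute(
        "INSERT OR REPLACE INTO ocr_results (picture, text, level) VALUES (?, ?, 'high')",
        (picture, deep_text),
    )
    conn.commit()
```

Because the picture path is the primary key, the earlier low-dimensional row is overwritten rather than duplicated, which also matches the storage module's later requirement to delete the stale low-dimensional result.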
  • When searching for a photo, the picture search method provided by the embodiments of the present application can search the text information in the photo more accurately and provides photo search in more dimensions, so it can be used in more search scenarios, such as note search and chat-record screenshot search; moreover, it achieves a high accuracy rate without background cloud recognition and can be used offline.
  • The following describes an exemplary structure in which the picture search apparatus 354 provided by the embodiment of the present application is implemented as software modules. In some embodiments, as shown in FIG. 3, the picture search apparatus 354 stored in the memory 350 includes:
  • Obtaining module 3541 configured to obtain a picture search request, where the picture search request includes a key character string
  • The response module 3542 is configured to, in response to the picture search request, obtain the OCR recognition result of each picture in the preset picture library, where the OCR recognition result includes at least one of the following: a low-dimensional OCR recognition result obtained by low-dimensional OCR recognition processing based on an OCR recognition threshold, and a high-dimensional OCR recognition result obtained by high-dimensional OCR recognition processing based on deep recognition, the recognition accuracy of the low-dimensional OCR recognition processing being lower than that of the high-dimensional OCR recognition processing;
  • The processing module 3543 is configured to traverse the pictures in the preset picture library for which neither the low-dimensional OCR recognition processing nor the high-dimensional OCR recognition processing has been completed, and to perform the low-dimensional OCR recognition processing on each traversed picture to obtain the low-dimensional OCR recognition result of each corresponding picture;
  • The first determination module 3544 is configured to determine, in the preset picture library, the target picture matching the key character string according to at least one of the low-dimensional OCR recognition result and the high-dimensional OCR recognition result of each picture;
  • the second determining module 3545 is configured to determine the target picture as a search result of the picture search request, and display the search result.
  • In some embodiments, the processing module 3543 is further configured to: when it is determined that the preset picture library includes at least one picture for which neither the low-dimensional OCR recognition processing nor the high-dimensional OCR recognition processing has been completed, traverse the pictures for which the low-dimensional OCR recognition processing and the high-dimensional OCR recognition processing have not been completed, and perform the low-dimensional OCR recognition processing on each traversed picture.
  • In some embodiments, the OCR recognition threshold includes a recognition speed threshold, and the processing module 3543 is further configured to: determine a recognition speed for each character in the traversed picture; and perform OCR recognition on text whose recognition speed is greater than the recognition speed threshold, the OCR recognition being used to perform the low-dimensional OCR recognition processing on the picture.
  • In some embodiments, the OCR recognition threshold includes a font size threshold, and the processing module 3543 is further configured to: determine the font size of each character in the traversed picture; and perform OCR recognition on text whose font size is greater than the font size threshold, the OCR recognition being used to perform the low-dimensional OCR recognition processing on the picture.
  • the processing module 3543 is further configured to: when the traversed picture includes a variant character, end the process of performing the low-dimensional OCR identification process on the variant character.
  • In some embodiments, the first determination module 3544 is further configured to: when the picture has both the low-dimensional OCR recognition result and the high-dimensional OCR recognition result, determine the high-dimensional OCR recognition result as the OCR recognition result of the picture; when the picture has only the low-dimensional OCR recognition result, determine the low-dimensional OCR recognition result as the OCR recognition result of the picture; and determine, in the preset picture library, the target picture matching the key character string according to the OCR recognition result of the picture.
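The selection rule of the first determination module (prefer the high-dimensional result when both exist, otherwise fall back to the low-dimensional one) is a small lookup; representing the two result sets as dictionaries is an assumption made for illustration.

```python
def pick_ocr_result(picture, low_results, high_results):
    """Return the OCR text used for matching this picture, or None if neither exists."""
    if picture in high_results:
        return high_results[picture]   # the high-dimensional result takes precedence
    return low_results.get(picture)    # otherwise fall back to the low-dimensional one

def match_target_pictures(keyword, pictures, low_results, high_results):
    matches = []
    for picture in pictures:
        text = pick_ocr_result(picture, low_results, high_results)
        if text and keyword in text:
            matches.append(picture)
    return matches
```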
  • In some embodiments, the picture search device 354 further includes: a third determination module, configured to, before the picture search request is acquired, or after the response to the picture search request is completed, or when the response to the picture search request is interrupted, determine that the pictures in the preset picture library for which the high-dimensional OCR recognition processing has not been completed are unprocessed pictures, and to perform the high-dimensional OCR recognition processing on each of the unprocessed pictures to obtain the high-dimensional OCR recognition result of each unprocessed picture.
  • In some embodiments, the picture processing module is further configured to: perform text sharpening processing on the unprocessed picture to obtain a text-sharpened picture; and perform OCR recognition on the text in the text-sharpened picture to obtain the high-dimensional OCR recognition result of each unprocessed picture.
  • In some embodiments, the picture processing module is further configured to: segment the unprocessed picture to obtain at least two sub-pictures; perform enlargement processing on each of the sub-pictures to obtain enlarged sub-pictures; perform OCR recognition on the text in the enlarged sub-pictures to obtain a sub-recognition result corresponding to each of the sub-pictures; and merge the at least two sub-recognition results corresponding to the at least two sub-pictures to obtain the high-dimensional OCR recognition result of the unprocessed picture.
  • In some embodiments, the picture processing module is further configured to: when the at least two sub-recognition results corresponding to the at least two sub-pictures include overlapping content, determine the non-overlapping content and the overlapping content in the at least two sub-recognition results, and fuse the non-overlapping content with the overlapping content to obtain the high-dimensional OCR recognition result of the unprocessed picture; and when the at least two sub-recognition results corresponding to the at least two sub-pictures do not include overlapping content, perform segmentation again, the enlargement processing, the OCR recognition and the fusion of sub-recognition results on each of the sub-pictures to obtain the recognition result of each sub-picture, and determine the high-dimensional OCR recognition result of the unprocessed picture according to the recognition result of each sub-picture.
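The fusion of sub-recognition results keeps each piece of overlapping content only once, so that, for example, merging A, B, C, D with C, D, E, F yields A, B, C, D, E, F rather than repeating C and D; the order-preserving union below is one way to express this, treating each sub-recognition result as a list of recognized strings.

```python
def fuse_sub_results(sub_results):
    """Merge sub-recognition results, keeping overlapping content only once.

    sub_results: list of lists of recognized strings, one list per sub-picture.
    """
    fused, seen = [], set()
    for result in sub_results:
        for item in result:
            if item not in seen:
                seen.add(item)
                fused.append(item)
    return fused

# fuse_sub_results([["A", "B", "C", "D"], ["C", "D", "E", "F"]])
# -> ["A", "B", "C", "D", "E", "F"]
```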
  • In some embodiments, the picture search device 354 further includes: a storage module, configured to, after the low-dimensional OCR recognition processing is performed on each traversed picture to obtain the low-dimensional OCR recognition result of each corresponding picture, store the low-dimensional OCR recognition result in a preset storage unit; and, after the high-dimensional OCR recognition processing is performed on each unprocessed picture to obtain the high-dimensional OCR recognition result of each unprocessed picture, store the high-dimensional OCR recognition result in the preset storage unit and delete the low-dimensional OCR recognition result of the corresponding unprocessed picture.
  • In some embodiments, the picture search device 354 further includes: a fourth determination module, configured to determine the reliability corresponding to the low-dimensional OCR recognition result of each picture; and a deletion module, configured to delete the low-dimensional OCR recognition results whose reliability is lower than a threshold.
  • In some embodiments, the picture search device 354 further includes: an OCR recognition processing module, configured to, when a new picture is added to the preset picture library, perform the low-dimensional OCR recognition processing or the high-dimensional OCR recognition processing on the new picture.
  • Embodiments of the present application provide a computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device (electronic device for image search) reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the image search method in the embodiment of the present application.
  • Embodiments of the present application provide a computer-readable storage medium storing executable instructions; when the executable instructions are executed by a processor, they cause the processor to execute the picture search method provided by the embodiments of the present application, for example, the picture search method shown in FIG. 4.
  • The computer-readable storage medium may be, for example, a Ferroelectric Random Access Memory (FRAM), a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, a magnetic surface memory, an optical disk, or a Compact Disk-Read Only Memory (CD-ROM); it may also be any device including one of the above memories or any combination thereof.
  • Executable instructions may take the form of programs, software, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • As an example, executable instructions may, but need not, correspond to files in a file system; they may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (e.g., files that store one or more modules, subroutines, or code sections).
  • As an example, executable instructions may be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Character Discrimination (AREA)

Abstract

A picture search method and apparatus, an electronic device, a computer-readable storage medium, and a computer program product, relating to the field of artificial intelligence. The method includes: in response to a picture search request, obtaining an OCR recognition result of each picture in a preset picture library; traversing pictures in the preset picture library for which neither low-dimensional OCR recognition processing nor high-dimensional OCR recognition processing has been completed, and performing low-dimensional OCR recognition processing based on an OCR recognition threshold on each traversed picture to obtain a low-dimensional OCR recognition result of each corresponding picture; determining, in the preset picture library, a target picture matching a key character string according to at least one of the low-dimensional OCR recognition result and the high-dimensional OCR recognition result of each picture; and determining the target picture as a search result of the picture search request and displaying the search result.

Description

一种图片搜索方法、装置、电子设备、计算机可读存储介质及计算机程序产品
相关申请的交叉引用
本申请基于申请号为202011248141.7、申请日为2020年11月10日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此引入本申请作为参考。
技术领域
本申请涉及互联网技术领域,涉及但不限于一种图片搜索方法、装置、电子设备、计算机可读存储介质及计算机程序产品。
背景技术
基于光学字符识别(OCR,Optical Character Recognition)的图片搜索是依靠OCR识别图片上的文字,再进行搜索。在图片数量很多时,用户可能需要等待OCR全量识别完成后才能得到搜索结果,时间成本上相对较高,导致图片搜索的效率较低。
发明内容
本申请实施例提供一种图片搜索方法、装置、电子设备、计算机可读存储介质及计算机程序产品,涉及人工智能技术领域,能够提高图片搜索效率。
本申请实施例的技术方案是这样实现的:
本申请实施例提供一种图片搜索方法,包括:
获取图片搜索请求,所述图片搜索请求中包括关键字符串;
响应于所述图片搜索请求,获取预设图片库中每一图片的OCR识别结果;其中,所述OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果,所述低维OCR识别处理的识别精度小于所述高维OCR识别处理的识别精度;
遍历所述预设图片库中未完成所述低维OCR识别处理、且未完成所述高维OCR识别处理的图片,并对遍历到的每一图片进行所述低维OCR识别处理,得到每一对应图片的低维OCR识别结果;
根据每一图片的所述低维OCR识别结果和所述高维OCR识别结果中的至少一种,在所述预设图片库中确定与所述关键字符串匹配的目标图片;
将所述目标图片确定为所述图片搜索请求的搜索结果,并显示所述搜索结果。
本申请实施例提供一种图片搜索装置,包括:
获取模块,配置为获取图片搜索请求,所述图片搜索请求中包括关键字符串;
响应模块,配置为响应于所述图片搜索请求,获取预设图片库中每一图片的OCR识别结果;其中,所述OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果,所述低维OCR识别处理的识别精度小于所述高维OCR识别处理的识别精度;
处理模块,配置为遍历所述预设图片库中未完成所述低维OCR识别处理、且未完成所述高维OCR识别处理的图片,并对遍历到的每一图片进行所述低维OCR识别处理,得到每一对应图片的低维OCR识别结果;
第一确定模块,配置为根据每一图片的所述低维OCR识别结果和所述高维OCR识别结果中的至少一种,在所述预设图片库中确定与所述关键字符串匹配的目标图片;
第二确定模块,配置为将所述目标图片确定为所述图片搜索请求的搜索结果,并显示所述搜索结果。
本申请实施例提供一种计算机程序产品,包括计算机程序或指令,所述计算机程序或指令被处理器执行时,实现本申请实施例提供的图片搜索方法。
本申请实施例提供一种图片搜索设备,包括:
存储器,用于存储可执行指令;处理器,用于执行所述存储器中存储的可执行指令时,实现本申请实施例提供的图片搜索方法。
本申请实施例提供一种计算机可读存储介质,存储有可执行指令,用于被处理器执行所述可执行指令时,实现本申请实施例提供的图片搜索方法。
本申请实施例具有以下有益效果:由于预设图片库中的部分图片已经包括对应的OCR识别结果,在预设图片库中进行图片搜索时,仅对预设图片库中除包括OCR识别结果的部分图片之外的剩余图片进行低维OCR识别处理,就能获得预设图片库中所有图片的OCR识别结果;因此,OCR识别结果获取速度较快;进而,基于预设图片库中所有图片的OCR识别结果在预设图片库中进行图片搜索时,能够提高图片搜索效率。
附图说明
图1A是图片搜索的过程示意图;
图1B是待搜索图片为笔记的搜索场景示意图;
图2是本申请实施例提供的图片搜索系统的一个可选的架构示意图;
图3是本申请实施例提供的电子设备的结构示意图;
图4是本申请实施例提供的图片搜索方法的一个可选的流程示意图一;
图5是本申请实施例提供的图片搜索方法的一个可选的流程示意图二;
图6是本申请实施例提供的图片搜索方法的一个可选的流程示意图三;
图7是本申请实施例提供的高维OCR识别处理的一个可选的流程示意图;
图8是本申请实施例提供的图片搜索方法的一个可选的流程示意图四;
图9是本申请实施例提供的图片搜索方法的流程示意图;
图10是本申请实施例提供的图片搜索方法的详细流程示意图;
图11是本申请实施例提供的简略识别流程的流程示意图;
图12是本申请实施例中简略识别流程的示意图;
图13是本申请实施例提供的深度识别策略的流程示意图;
图14A是本申请实施例提供的对图片进行等分并放大前的示意图;
图14B是本申请实施例提供的对图片进行等分并放大后的示意图。
具体实施方式
为了使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请作进一步地详细描述,所描述的实施例不应视为对本申请的限制,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例,都属于本申请保护的范围。
在以下的描述中,涉及到“一些实施例”,其描述了所有可能实施例的子集,但是可以理解,“一些实施例”可以是所有可能实施例的相同子集或不同子集,并且可以在不冲突的情况下相互结合。除非另有定义,本申请实施例所使用的所有的技术和科学术语与属于本申请实施例的技术领域的技术人员通常理解的含义相同。本申请实施例所使用的术语只是为了描述本申请实施例的目的,不是旨在限制本申请。
为了更好地理解本申请实施例中提供的图片搜索方法,首先对图片搜索方法进行说明。
在图片搜索时(例如进行表情包图片搜索),可以通过图片上的一些关键字进行搜索,如图1A所示,是图片搜索的过程示意图,当用户想要搜索“锤头丧气”对应的表情包图片时,可以输入“锤头丧气”,系统会自动匹配到图片11并输出。
但是当用户想要搜索某些图片中拍摄到的细节文字信息时,通常的搜索方法是无法支持如此精细的识别的;例如,用户想要搜索一份笔记中的文字信息时,如图1B所示,是待搜索图片为笔记的搜索场景示意图,现有的搜索功能是无法支持的,因为在搜索时OCR识别的阈值会相对较低,会 设置成优先识别更容易识别的内容,例如,优先识别大字体12等,这样能够保证搜索时候的识别耗时不会过高,所以用户是无法通过搜索功能来搜索到细节的文字信息的,比如文字信息13,所以有必要针对这种情况下的搜索功能进行优化。也就是说,如果需要对图片中更细节的文字信息进行搜索,比如对图片中的笔记信息和店铺店名的搜索,这时候原有的搜索功能并不能提供这种精细化的搜索。
另外,部分方案使用的是先将图片上传到服务器,由服务器进行OCR识别,当识别完成后再将结果同步到客户端,这种方案能够有比较高的准确度,但是服务器上传下载等请求会有一定失败的风险,对于大图片和长图片的上传会耗费较多的时间与流量,同时用户图片的隐私性也难以得到保障。
综上所述,图片搜索过程中至少存在以下问题:高精度识别耗时过高,无法应用到搜索场景;低精度识别导致搜索时无法查找到细节内容;云端识别会面临上传下载失败的风险、上传下载耗时、隐私泄漏的风险与离线不可用的问题。
为了解决的图片搜索方法所存在的上述至少一个问题,本申请实施例提出一种图片搜索方法,将快速识别与对图片进行二次加工的精准识别相结合,优化搜索过程中OCR识别对图片上的文字信息进行识别的效率与准确度,使识别结果更加快捷与准确。
本申请实施例提供的图片搜索方法,首先,获取图片搜索请求,图片搜索请求中包括关键字符串;然后响应于图片搜索请求,获取预设图片库中每一图片的OCR识别结果;其中,OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果,其中,低维OCR识别处理的识别精度小于高维OCR识别处理的识别精度;再然后,遍历预设图片库中未完成低维OCR识别处理和高维OCR识别处理的图片,并对遍历到的每一图片进行低维OCR识别处理,得到每一对应图片的低维OCR识别结果;根据每一图片的低维OCR识别结果和高维OCR识别结果中的至少一种,在预设图片库中确定与关键字符串匹配的目标图片;最后,将目标图片确定为图片搜索请求的搜索结果,并显示搜索结果。如此,由于结合了低维OCR识别处理和高维OCR识别处理的识别结果进行图片搜索,能够更准确的搜索图片中的文字信息,实现精细化的搜索,得到准确的搜索结果,并且能够提高搜索效率。
下面,说明本申请实施例的用于图片搜索的电子设备的示例性应用。在一种实现方式中,本申请实施例提供的用于图片搜索的电子设备可以实施为笔记本电脑,平板电脑,台式计算机,移动设备(例如,移动电话,便携式音乐播放器,个人数字助理,专用消息设备,便携式游戏设备)、智能机器人等任意的终端;在另一种实现方式中,本申请实施例提供的用于 图片搜索的电子设备还可以实施为服务器。下面,将说明用于图片搜索的电子设备实施为终端时的示例性应用,可以采用终端上的客户端来进行图片搜索。
参见图2,图2是本申请实施例提供的图片搜索系统的一个可选的架构示意图。为实现对图片搜索请求进行准确的响应,以搜索得到准确的目标图片,本申请实施例提供的图片搜索系统10中包括终端100(即电子设备)、网络200和服务器10-1,其中,终端100上运行有图片搜索应用,图片搜索应用对应一预设图片库400,预设图片库400中存储有多张图片,用户可以通过终端100上运行的图片搜索应用的客户端输入关键字符串,以形成图片搜索请求,客户端对用户的图片搜索请求进行响应,以在预设图片库中匹配得到目标图片,其中,目标图片包括至少一张图片;本申请实施例中,客户端还可以对预设图片库400中的每一图片进行基于OCR识别阈值的低维OCR识别处理。服务器10-1作为后台服务器,用于在闲时对预设图片库400中的每一图片进行基于深度识别的高维OCR识别处理,得到高维OCR识别结果,并将高维OCR识别结果发送给终端100;其中,闲时是指终端的空闲时间,是指终端中的各运行指标(CPU占有率、内存占用率和显卡占用率等)低于阈值的时间段,比如,深夜,充电时,各功能应用未使用时,等等。
本申请实施例中,在获取到图片搜素请求时,终端100响应于图片搜索请求,通过网络200从服务器10-1获取预设图片库中每一图片的OCR识别结果;其中,OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果;通过网络200从服务器10-1获取预设图片库中完成低维OCR识别处理和高维OCR识别处理的图片,并遍历未预设图片库中完成低维OCR识别处理和高维OCR识别处理的图片,并对遍历到的每一图片进行低维OCR识别处理,得到每一对应图片的低维OCR识别结果;根据每一图片的低维OCR识别结果和高维OCR识别结果中的至少一种,在预设图片库中确定与关键字符串匹配的目标图片;将确定为图片搜索请求的搜索结果,并在终端100的当前界面100-1上显示搜索结果。
本申请实施例提供的图片搜索方法涉及人工智能技术领域,至少可以通过人工智能技术中的计算机视觉技术和机器学习技术来实现。其中,计算机视觉技术(CV,Computer Vision)是一门研究如何使机器“看”的科学,就是指用摄影机和电脑代替人眼对目标进行识别、跟踪和测量等机器视觉,并进一步做图形处理,使电脑处理成为更适合人眼观察或传送给仪器检测的图像。作为一个科学学科,计算机视觉研究相关的理论和技术,试图建立能够从图像或者多维数据中获取信息的人工智能系统。计算机视觉技术通常包括图像处理、图像识别、图像语义理解、图像检索、OCR、 视频处理、视频语义理解、视频内容/行为识别、三维(3D,Three Dimensional)物体重建、三维技术、虚拟现实、增强现实、同步定位与地图构建等技术,还包括常见的人脸识别、指纹识别等生物特征识别技术。
机器学习(ML,Machine Learning)是一门多领域交叉学科,涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。专门研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心,是使计算机具有智能的根本途径,其应用遍及人工智能的各个领域。机器学习和深度学习通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、式教学习等技术。本申请实施例中,通过机器学习技术实现对对图片的OCR识别。
图3是本申请实施例提供的电子设备300的结构示意图,图3所示的电子设备300包括:至少一个处理器310、存储器350、至少一个网络接口320和用户接口330。电子设备300中的各个组件通过总线系统340耦合在一起。可理解,总线系统340用于实现这些组件之间的连接通信。总线系统340除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图3中将各种总线都标为总线系统340。
处理器310可以是一种集成电路芯片,具有信号的处理能力,例如通用处理器、数字信号处理器(DSP,Digital Signal Processor),或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等,其中,通用处理器可以是微处理器或者任何常规的处理器等。
用户接口330包括使得能够呈现媒体内容的一个或多个输出装置331,包括一个或多个扬声器和/或一个或多个视觉显示屏。用户接口330还包括一个或多个输入装置332,包括有助于用户输入的用户接口部件,比如键盘、鼠标、麦克风、触屏显示屏、摄像头、其他输入按钮和控件。
存储器350可以是可移除的,不可移除的或其组合。示例性的硬件设备包括固态存储器,硬盘驱动器,光盘驱动器等。存储器350可选地包括在物理位置上远离处理器310的一个或多个存储设备。存储器350包括易失性存储器或非易失性存储器,也可包括易失性和非易失性存储器两者。非易失性存储器可以是只读存储器(ROM,Read Only Memory),易失性存储器可以是随机存取存储器(RAM,Random Access Memory)。本申请实施例描述的存储器350旨在包括任意适合类型的存储器。在一些实施例中,存储器350能够存储数据以支持各种操作,这些数据的示例包括程序、模块和数据结构或者其子集或超集,下面示例性说明。
操作系统351,包括用于处理各种基本系统服务和执行硬件相关任务的系统程序,例如框架层、核心库层、驱动层等,用于实现各种基础业务以及处理基于硬件的任务;
网络通信模块352,用于经由一个或多个(有线或无线)网络接口320 到达其他计算设备,示例性的网络接口320包括:蓝牙、无线相容性认证(Wi-Fi)、和通用串行总线(USB,Universal Serial Bus)等;
输入处理模块353,用于对一个或多个来自一个或多个输入装置332之一的一个或多个用户输入或互动进行检测以及翻译所检测的输入或互动。
在一些实施例中,本申请实施例提供的图片搜索装置可以采用软件方式实现,图3示出了存储在存储器350中的一种图片搜索装置354,该图片搜索装置354可以是电子设备300中的图片搜索装置,其可以是程序和插件等形式的软件,包括以下软件模块:获取模块3541、响应模块3542、处理模块3543、第一确定模块3544和第二确定模块3545,这些模块是逻辑上的,因此根据所实现的功能可以进行任意的组合或进一步拆分。将在下文中说明各个模块的功能。
在一些实施例中,本申请实施例提供的图片搜索装置可以采用硬件方式实现,作为示例,本申请实施例提供的图片搜索装置可以是采用硬件译码处理器形式的处理器,其被编程以执行本申请实施例提供的图片搜索方法,例如,硬件译码处理器形式的处理器可以采用一个或多个应用专用集成电路(ASIC,Application Specific Integrated Circuit)、DSP、可编程逻辑器件(PLD,Programmable Logic Device)、复杂可编程逻辑器件(CPLD,Complex Programmable Logic Device)、现场可编程门阵列(FPGA,Field-Programmable Gate Array)或其他电子元件。
下面将结合本申请实施例提供的电子设备300的示例性应用和实施,说明本申请实施例提供的图片搜索方法。参见图4,图4是本申请实施例提供的图片搜索方法的一个可选的流程示意图一,将结合图4示出的步骤进行说明。
S401,获取图片搜索请求,图片搜索请求中包括关键字符串。
这里,电子设备上运行有图片搜索应用,用户可以在图片搜索应用的客户端上输入关键字符串,则客户端基于用户的输入操作或者用户点击搜索的操作形成图片搜索请求,以请求客户端搜索与该关键字符串对应的图片。关键字符串可以是图片的类型、图片中的文字、图片中文字的摘要等等。
本申请实施例中,客户端在响应图片搜索请求进行图片搜索时,可以在在线状态下进行搜索,也可以在离线状态下进行搜索。
S402,响应于图片搜索请求,获取预设图片库中每一图片的OCR识别结果。
这里,OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果。其中,低维OCR识别处理是一种简略识别策略,高维OCR识别处理是一种深度识别策略;低维OCR识别处理对图片进行简单的识别,高维OCR识别处理对图片进行更加细致和精 确的识别,低维OCR识别处理的识别精度小于高维OCR识别处理的识别精度;低维OCR识别处理的难度较低,识别的精度较低,识别的速率较高,资源损耗较低,高维OCR识别处理的难度较高,识别的精度较高,识别的速率较低,资源损耗较大。
需要说明的是,OCR识别阈值是识别的精确度与识别时间之间相对平衡的一个值,也就是说,当满足OCR识别阈值时,不仅识别的速度快,且识别的容错率较高。举例来说,OCR识别阈值可以包括与字体大小对应的阈值或者与识别的可信度对应的阈值,即当对某一字体大小的文字进行识别的时候,不仅能够保证识别的准确率,还能保证识别的效率,则该字体大小对应的字体大小值即可以是OCR识别阈值;或者,当对图片中的文字进行识别时,在达到某一可信度时,识别的准确率较高,且识别的效率也较高时,则可以将该可信度确定为OCR识别阈值。
本申请实施例中,低维OCR识别处理是基于OCR识别阈值来进行的,也就是说,在对图片进行低维OCR识别处理时,识别参数是满足OCR识别阈值的。举例来说,当OCR识别阈值包括字体大小阈值时,则在低维OCR识别处理时,仅对图片中字体大小大于该字体大小阈值的文字进行识别,对小于该字体大小阈值的文字则不进行识别。即,在对图片进行低维OCR识别处理时,如果图片是类似于笔记等具有较多细节文字的图片,是不会对图片中的全部文字进行OCR识别,仅对图片中容易识别的部分文字进行OCR识别,如此能够提高识别的效率。
深度识别是指对图片中的细节内容也进行识别的一种精准识别方式,在深度识别过程中,不仅对整体内容进行识别,还对图片中的细节文字进行识别。基于深度识别的高维OCR识别处理能够对图片中的每一个字进行识别处理,因此,高维OCR识别处理的识别准确度更高,同时识别更耗时。
在一些实施例中,在对图片进行低维OCR识别处理之后,得到低维OCR识别结果,在对图片进行高维OCR识别处理之后,得到高维OCR识别结果,在得到低维OCR识别结果或高维OCR识别结果之后,均将对应的低维OCR识别结果或高维OCR识别结果、以及低维OCR识别结果与图片之间的映射关系、高维OCR识别结果与图片之间的映射关系,存储至预设存储单元中;如此,电子设备能够从预设存储单元中获取到预设图片库中部分图片的OCR识别结果。
S403,遍历预设图片库中未完成低维OCR识别处理、且未完成高维OCR识别处理的图片,并对遍历到的每一图片进行低维OCR识别处理,得到每一对应图片的低维OCR识别结果。
这里,电子设备对预设图片库中每一图片进行判断,确定每一图片是否已经进行过低维OCR识别处理和高维OCR识别处理,并对预设图片库中的未进行过低维OCR识别处理和高维OCR识别处理的图片进行遍历。例如,电子设备可以通过在预设存储单元中查找是否存储有每一图片的低 维OCR识别结果或高维OCR识别结果,来确定是否对每一图片已经进行过低维OCR识别处理或高维OCR识别处理。
本申请实施例中,对于在当前时刻仍然未进行低维OCR识别处理和高维OCR识别处理的图片,则对这些图片进行低维OCR识别处理,由于低维OCR识别处理的识别效率较高,因此在本次的图片搜索过程中,能够提高图片识别的效率,进而提高图片搜索的效率。
需要说明的是,在当前时刻对任一图片进行低维OCR识别处理之后,可以将该图片的低维OCR识别结果对应存储至预设存储单元中。
S404,根据每一图片的低维OCR识别结果和高维OCR识别结果中的至少一种,在预设图片库中确定与关键字符串匹配的目标图片。
本申请实施例中,在匹配目标图片时,电子设备不仅可以基于图片的低维OCR识别结果进行匹配,还可以基于图片的高维OCR识别结果进行匹配。当图片具有高维OCR识别结果,则优先选择基于高维OCR识别结果进行匹配,因为高维OCR识别结果比低维OCR识别结果的识别内容更多,识别准确度更高;当图片仅具有低维OCR识别结果时,则基于低维OCR识别结果进行匹配;另外,也可以基于低维OCR识别结果和高维OCR识别结果进行匹配。
在一些实施例中,在匹配目标图片时,可以是将关键字符串与低维OCR识别结果或与高维OCR识别结果中对应的文本内容进行匹配,确定低维OCR识别结果或高维OCR识别结果对应的文本内容与关键字符串之间的相似度,将具有最高相似度的图片确定为目标图片,或者,在确定出每一图片与关键字符串之间的相似度之后,按照相似度由大到小的顺序对图片进行排序,形成图片序列,然后在该图片序列中选择特定数量的图片作为目标图片。
在一些实施例中,在匹配目标图片时,还可以先根据低维OCR识别结果或高维OCR识别结果对应的文本内容确定每一图片对应的图片关键字符串,然后将图片搜索请求中的关键字符串与每一图片的图片关键字符串进行匹配,将与图片搜索请求中的关键字符串相同的图片关键字符串或者相近的图片关键字符串对应的图片确定为目标图片。
可以理解的是,预先存储了预设图片库中的部分图片的高维OCR识别结果和低维OCR识别结果中的至少一种,使得识别时可以对部分图片采用高维OCR识别结果进行搜索,对剩余图片采用低维OCR识别结果进行搜索;而低维OCR识别结果可以是预先存储的,也可以是实时获取的,但由于低维OCR识别结果的获取速度快,从而能够即保证图片搜索效率又保证图片搜索速度。
S405,将目标图片确定为图片搜索请求的搜索结果,并显示搜索结果。
本申请实施例中,当确定出的目标图片为一张时,在电子设备的当前界面上显示这一张图片,当确定出的目标图片为多张时,在电子设备的当 前界面上同时显示多张图片,或者分页显示多张图片。
本申请实施例提供的图片搜索方法,采用基于OCR识别阈值的低维OCR识别处理和基于深度识别的高维OCR识别处理对预设图片库中的图片进行处理,对应得到低维OCR识别结果和高维OCR识别结果,并根据每一图片的低维OCR识别结果或高维OCR识别结果,匹配得到图片搜索请求的目标图片,如此,由于同时结合低维OCR识别处理和高维OCR识别处理的识别结果进行图片搜索,能够更准确的搜索图片中的文字信息,实现精细化的搜索,得到准确的搜索结果,并且能够提高搜索效率。
在一些实施例中,可以采用不同的方式进行低维OCR识别处理,图5是本申请实施例提供的图片搜索方法的一个可选的流程示意图二,将结合图5示出的步骤进行说明。
S501,获取图片搜索请求,图片搜索请求中包括关键字符串。
S502,响应于图片搜索请求,获取预设图片库中每一图片的OCR识别结果。
其中,OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果,低维OCR识别处理的识别精度小于高维OCR识别处理的识别精度。
S503,当确定出预设图片库中存在至少一张图片未完成低维OCR识别处理、且未完成高维OCR识别处理时,遍历未完成低维OCR识别处理和高维OCR识别处理的图片。
需要说明的是,S501至S503与上述S401至S403对应的实现过程的描述相同。
在一些实施例中,OCR识别阈值包括识别速度阈值,对应地,可以通过以下步骤进行低维OCR识别处理。
S504,确定针对于遍历到的图片中的每一文字的识别速度。
这里,每一文字的识别速度是指要识别出某段文字所需要的时长与要识别的这段文字的字数之间的比值。识别速度越高表明对应的文字的识别难度越低,越容易识别,识别速度越低表明对应的文字的识别难度越高,越难识别。
本申请实施例中,电子设备可以预先根据OCR识别的情况确定对每一类型的文字的识别速度,从而确定出合适的识别速度阈值。
S505,对识别速度大于识别速度阈值的文字进行OCR识别。
这里,识别速度大于识别速度阈值的文字是相对较容易识别的文字,电子设备可以仅对相对容易识别的文字进行OCR识别,以完成对图片的低维OCR识别处理。
在一些实施例中,OCR识别阈值包括字体大小阈值,对应地,可以通过以下步骤进行低维OCR识别处理。
S506,确定遍历到的图片中的每一文字的字体大小。
这里,字体越大的文字相对更加容易识别,从而识别的速度更高,字体越小的文字相对更难识别,从而识别的速度更低。
本申请实施例中,电子设备可以预先根据OCR识别的情况确定不同字体大小的文字进行识别的识别速度,从而确定出合适的字体大小阈值。
S507,对字体大小大于字体大小阈值的文字进行OCR识别。
在一些实施例中,图片中还可以包括异体字,对应地,可以通过以下步骤进行低维OCR识别处理。
S508,当遍历到的图片中包括异体字时,结束对异体字进行低维OCR识别处理的流程。
这里,由于电子设备无法准确的识别出异体字,因此,不对异体字进行识别。
S509,根据每一图片的低维OCR识别结果或高维OCR识别结果,在预设图片库中确定与关键字符串匹配的目标图片。
S510,将目标图片确定为图片搜索请求的搜索结果,并显示搜索结果。
本申请实施例中,在对图片进行低维OCR识别处理时,可以设置不同的OCR识别阈值,以不同的OCR识别阈值作为识别的参考条件进行文字识别,从而能够实现在保证识别准确度的同时,提高识别的速度,实现识别准确度与识别效率之间的平衡。
基于图4,图6是本申请实施例提供的图片搜索方法的一个可选的流程示意图三;在一些实施例中,低维OCR识别处理的识别精度小于高维OCR识别处理的识别精度,如图6所示,S404可以通过以下步骤实现。
S601,判断每一图片是否具有低维OCR识别结果。
如果判断结果为是,则执行S602,如果判断结果为否,则返回步骤S403继续对该图片进行低维OCR识别处理。
S602,判断每一图片是否具有高维OCR识别结果。
如果判断结果为是,则执行步骤S603,如果判断结果为否,则执行步骤S604。
S603,当图片同时具有低维OCR识别结果和高维OCR识别结果时,将高维OCR识别结果确定为图片的OCR识别结果。
S604,当图片仅具有低维OCR识别结果时,将低维OCR识别结果确定为图片的OCR识别结果。
S605,根据图片的OCR识别结果,在预设图片库中确定与关键字符串匹配的目标图片。
本申请实施例中,由于高维OCR识别结果的精准度高于低维OCR识别结果的精准度,因此当同时存在低维OCR识别结果和高维OCR识别结果时,则以精准度更高的高维OCR识别结果为依据进行目标图片的匹配;并且,当仅具有低维OCR识别结果时,为了保证本次图片搜索任务的时效 性,提高本次图片搜索任务的搜索效率,则继续以该低维OCR识别结果为依据进行目标图片的匹配,此时由于低维OCR识别结果也是具有一定的可信度,且具有一定的识别准确度的,因此,在保证图片搜索效率的同时,在一定程度上也能够保证最终匹配结果的准确性。
图7是本申请实施例提供的高维OCR识别处理的一个可选的流程示意图,将结合图7示出的步骤进行说明。
S701,在获取图片搜索请求之前,或者,在完成对图片搜索请求的响应之后,或者,在搜索请求响应中断时,确定预设图片库中未完成高维OCR识别处理的图片为未处理图片。
本申请实施例中,可以在闲时实现高维OCR识别处理,也就是说,在电子设备不执行图片搜索任务时,可以在后台执行高维OCR识别处理。由于在获取图片搜索请求之前,或者,在完成对图片搜索请求的响应之后,或者,在搜索请求响应中断时,均没有执行图片搜索任务,因此可以在这些时间段内进行高维OCR识别处理,以实现对预设图片库中的每一图片完成高维OCR识别处理,使得在后续的图片搜索任务中,均能够基于精确度更高的高维OCR识别结果进行图片搜素。
需要说明的是,未处理图片是没有完成高维OCR识别处理的图片,即,未处理图片不仅包括未完成低维OCR识别处理且未完成高维OCR识别处理的图片,还包括已完成低维OCR识别处理且未完成高维OCR识别处理的图片。
S702,对每一未处理图片进行高维OCR识别处理,得到每一未处理图片的高维OCR识别结果。
在一些实施例中,电子设备在对遍历到的每一图片进行低维OCR识别处理,得到每一对应图片的低维OCR识别结果之后,将低维OCR识别结果存储至预设存储单元中;电子设备在采用高维OCR识别处理对每一未处理图片进行处理,得到每一未处理图片的高维OCR识别结果之后,将高维OCR识别结果存储至预设存储单元中,并删除对应未处理图片的低维OCR识别结果。
本申请实施例中,电子设备在每完成一次低维OCR识别处理或高维OCR识别处理之后,均将得到的低维OCR识别结果或高维OCR识别结果存储至预设存储单元中,如此,能够保证在后续执行图片搜索任务时,可以直接从预设存储单元中快速的获取到低维OCR识别结果或高维OCR识别结果,根据获取到的低维OCR识别结果或高维OCR识别结果进行快速的关键字符串匹配,而无需再对图片进行低维OCR识别处理或高维OCR识别处理,提高了图片搜索效率。
在一些实施例中,S702可以通过S7021和S7022实现,下面对各步骤分别进行说明。
S7021,对未处理图片进行文本清晰化处理,得到文本清晰化处理后的 图片。
这里,文本清晰化处理包括以下步骤:首先,对未处理图片进行分割,形成至少两个子图片;然后,对每一子图片进行放大处理,得到放大后的子图片。
本申请实施例中,电子设备可以将未处理图片等分为至少两个子图片,也可以采用任意的分割方式,或基于一定的分割规则,将未处理图片分割为不规则或不相等的至少两个子图片。
当对未处理图片进行不规则或不等分割时,例如,未处理图片A的左侧三分之一是纯图片,没有任何文字,而右侧的三分之二是由文字形成的文字图片,则可以将未处理图片A划分为两部分,第一部分是左侧三分之一的纯图片形成的一个子图片,第二部分是右侧三分之二的文字图片形成的一个子图片。这样,由于第一部分是纯图片,因此无需进行OCR识别,而第二部分是文字图片,这样划分还不会影响第二部分文字的连续性,能够对第二部分文字进行更加准确的识别,且仅需对第二部分进行OCR识别,如此,不仅提高了识别的准确性,还能够有效的提高识别效率。
本申请实施例中,由于高维OCR识别结果需要对图片中的细节内容也进行识别,而图片中的细节内容,例如文字,通常会比较小,因此,为了提高识别的准确度,可以对分割后的子图片进行放大处理,以降低细节内容的识别难度。
S7022,对文本清晰化处理后的图片中的文字进行OCR识别,得到每一未处理图片的高维OCR识别结果。
这里,S7022中电子设备对文本清晰化处理后的图片中的文字进行OCR识别可以通过以下步骤实现:对放大后的子图片中的文字进行OCR识别,得到每一子图片对应的子识别结果。然后,对至少两个子图片中每一子图片的子识别结果进行融合,得到未处理图片的高维OCR识别结果。
这里,电子设备可以判断至少两个子图片对应的至少两个子识别结果之间是否包括重叠内容;当至少两个子识别结果之间包括重叠内容时,确定至少两个子识别结果中的非重叠内容和重叠内容;将非重叠内容与重叠内容进行融合,得到未处理图片的高维OCR识别结果。
需要说明的是,这里将非重叠内容与重叠内容进行融合,是指将重叠内容在高维OCR识别结果中的重复的部分删除。举例来说,当第一张子图片的子识别结果中包括A、B、C、D四个关键字,第二张子图片的子识别结果中包括C、D、E、F四个关键字时,此时,第一张子图片的子识别结果与第二张子图片的子识别结果的非重叠内容为:A、B、E、F,而重叠内容为C、D,因此,在将非重叠内容与重叠内容进行融合,得到未处理图片的高维OCR识别结果则应该是:A、B、C、D、E、F,而不应该是:A、B、C、D、C、D、E、F,即需要将重叠内容C、D在高维OCR识别结果中的重复的部分C、D删除。
当至少两个子识别结果之间不包括重叠内容时,对每一子图片进行再次分割、放大处理、OCR识别和子识别结果的融合,以得到每一子图片的识别结果;根据每一子图片的识别结果,确定未处理图片的高维OCR识别结果。
这里,当至少两个子识别结果之间不包括重叠内容时,为了进一步提高识别的准确度,电子设备可以再次对子图片进行分割、放大和识别,以及识别后的结果融合处理,从而得到子图片的更加准确的识别结果。
在一些实施例中,图片搜索方法可以由图片搜索系统中的客户端、与客户端对应的预设存储单元和服务器来实现,图8是本申请实施例提供的图片搜索方法的一个可选的流程示意图四,如图8所示,该图片搜索方法包括S801至S815,下面对各步骤分别进行说明。
S801,服务器采用基于深度识别的高维OCR识别处理对预设图片库中的每一图片进行处理,得到每一图片的高维OCR识别结果。
这里,服务器在闲时对预设图片库中的每一图片进行高维OCR识别处理,能够有效的利用资源,避免在进行图片搜索任务时进行高维OCR识别处理而降低搜索效率的问题。
S802,服务器将高维OCR识别结果存储至预设存储单元中。
本申请实施例中,服务器在每处理得到一张图片的高维OCR识别结果时,即将该高维OCR识别结果存储至预设存储单元中,这样能够保证在接下来的图片搜索任务中就能够及时的使用到该高维OCR识别结果。
S803,客户端获取图片搜索请求,图片搜索请求中包括关键字符串。
S804,客户端响应于图片搜索请求,从预设存储单元中获取预设图片库中每一图片的OCR识别结果。
其中,OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果,低维OCR识别处理的识别精度小于高维OCR识别处理的识别精度。
S805,客户端遍历预设图片库中未完成低维OCR识别处理和高维OCR识别处理的图片,并对遍历到的每一图片进行低维OCR识别处理,得到每一对应图片的低维OCR识别结果。
S806,当预设图片库中增加新的图片时,客户端对新的图片进行低维OCR识别处理。
本申请实施例中,当预设图片库中新增图片时,还需要对新增图片进行低维OCR识别处理,以保证预设图片库中的每一图片均具有低维OCR识别结果。或者,当预设图片库中新增图片时,可以在下一次图片搜索任务中及时对该新增图片进行低维OCR识别处理。
S807,客户端确定每一图片的低维OCR识别结果对应的可信度。
本申请实施例中,可以采用特定的OCR识别模型进行OCR识别,在 采用该OCR识别模型进行OCR识别时,不仅能够得到低维OCR识别结果,还能够得到本次低维OCR识别结果对应的可信度。
可信度的影响因素包括但不限于以下至少之一:图片的清晰度、图片的类型和识别出的字数等。例如,对于本身清晰度比较低,拍摄的比较模糊的图片,识别结果的可信度相对会较低;对于印刷体和手写体的文字的识别,可信度也存在差别,相对于印刷体,手写体文字的识别结果可信度较低;当对于同一张图片进行识别时,如果识别出来的文字的数量远小于实际的字数,则识别结果的可信度较低。
S808,客户端删除可信度低于阈值的低维OCR识别结果。
本申请实施例中,选取具有高可信度的低维OCR识别结果。
S809,客户端将每一对应图片的低维OCR识别结果存储至预设存储单元中。
S810,客户端根据每一图片的低维OCR识别结果和高维OCR识别结果中的至少一种,在预设图片库中确定与关键字符串匹配的目标图片。
S811,客户端将目标图片确定为图片搜索请求的搜索结果,并显示搜索结果。
S812,服务器继续采用基于深度识别的高维OCR识别处理对预设图片库中还未进行高维OCR识别处理的图片进行处理,得到图片的高维OCR识别结果。
这里,由于预设图片库中的图片还没有完全完成高维OCR识别处理,因此,在完成一次图片搜索任务之后的空闲时间内,后台服务器可以继续基于深度识别的高维OCR识别处理对预设图片库中还未进行高维OCR识别处理的图片进行处理。
S813,服务器将高维OCR识别结果存储至预设存储单元中。
S814,当预设图片库中增加新的图片时,服务器对新的图片进行高维OCR识别处理。
本申请实施例中,当预设图片库中新增图片时,还需要对新增图片进行高维OCR识别处理,以保证预设图片库中的每一图片均具有高维OCR识别结果。
S815,在采用高维OCR识别处理对每一图片进行处理,得到每一图片的高维OCR识别结果之后,服务器删除对应图片在预设存储单元中的低维OCR识别结果。
本申请实施例中,由于高维OCR识别结果的识别准确度高于低维OCR识别结果的识别准确度,因此,当任一图片同时具有低维OCR识别结果和高维OCR识别结果时,可以仅保留具有较高识别准确度的高维OCR识别结果,删除预设存储单元中存储的低维OCR识别结果。这样,不仅能够节省预设存储单元中的存储空间,还能够保证在进行后续图片搜索任务时,可以直接采用预设存储单元中存储的高维OCR识别结果进行关键字符串匹 配,而无需从低维OCR识别结果和高维OCR识别结果中确定出识别准确度更高的高维OCR识别结果,即节省了一次判读和选择的步骤,进一步提高了搜索效率。
下面,将说明本申请实施例在一个实际的应用场景中的示例性应用。该示例性应用描述了预设图片库为相册时,基于获得的用户输入的搜索关键字,在相册中准确且快速查找出于搜索关键字匹配的目标图片的过程。
本申请实施例提供一种图片搜索方法,在实际产品应用上,用户只需要在图片搜索应用的输入界面输入与搜索关键字,图片搜索应用就可以自动搜索到与搜索关键字匹配的搜索结果,且搜索结果能够准确包括图片中存在的文字,本申请实施例的方法能够应用到所有搜索图片的场景中。
如图9所示,是本申请实施例提供的图片搜索方法的流程示意图,如图9所示,该图片搜索方法通过客户端实现,包括S901至S903,下面对各步骤分别进行说明。
S901,获取用户输入的搜索关键字(称为关键字符串)。
S902,对图片进行OCR识别,并确定OCR识别结果中是否包含搜索关键字,得到搜索结果(称为目标图片)。
S903,输出搜索结果。
由于OCR识别需要一定时间,所以在用户搜索时进行OCR识别的效率需要快,因此整体搜索流程将基于后台闲时识别与搜索时的快速识别相结合,在保证能够快速输出搜索结果的同时,确保了图片OCR识别的准确性与完整性。本申请实施例中,可以将简略识别策略与深度识别策略进行结合来实现图片搜索方法,其中,简略识别策略对应本申请实施例中的低维OCR识别处理,深度识别策略对应本申请实施例中的高维OCR识别处理。简略识别策略与深度识别策略的详细流程与它们之间的调度关系将在下文详细阐述。
图10是本申请实施例提供的图片搜索方法的详细流程示意图,如图10所示,该图片搜索方法包括S1001至S1013,下面对各步骤分别进行说明。
S1001,获取搜索关键字。
需要说明的是,客户端开始进行图片搜索时,响应于用户操作,获取搜索关键字。
S1002,判断图片是否全部扫描完成。
这里,如果对预设图片库已经进行了一次全量图片的简略识别(称为全部扫描完成),则可以利用该识别结果进行查询;如果对预设图片库尚未完成全量图片的简略识别,则进入简略识别流程。也就是说,如果判断结果为是,则利用该识别结果进行查询,并执行S1003;如果判断结果为否,则执行S1004。
S1003,输出搜索结果。
本申请实施例中,当利用OCR识别内容进行搜索时,如果对预设图片 库已经进行了一次全量的简略识别,则可以利用该搜索关键字对识别结果进行搜索,并同时输出包含了搜索关键字的搜索结果以结束。
S1004,使用简略识别策略(称为低维OCR识别处理)。
S1005,遍历预设图片库中的图片。
S1006,对图片中的文字进行OCR识别。
需要说明的是,这里的图片为未完成低维OCR识别处理、且未完成高维OCR识别处理的图片。
S1007,判断是否进行深度扫描(即是否采用深度识别策略进行后台闲时深度识别)。
如果判断结果为是,则进行深度扫描,执行S1009;如果判断结果为否,则执行S1008。
S1008,判断是否遍历完成。
如果判断结果为是,则返回继续执行S1003;如果判断结果为否,则返回继续执行S1005。
当确定使用深度识别策略进行处理时,深度识别策略包括S1009至S1013,下面对各步骤分别进行说明。
S1009,使用深度识别策略(称为高维OCR识别处理)。
S1010,图片4等分并放大。
S1011,对分割后的图片(称为子图片)进行OCR识别。
S1012,判断识别结果(称为子识别结果)是否与已有结果(称为子识别结果)重复。
这里,已有结果是指当前在对任一图片中的任一等分部分(称为子图片)进行识别时,历史过程中对该图片的其他等分部分识别到的结果。
本申请实施例中,判断当前对任一图片中的任一等分部分识别到的结果,与历史过程中对该图片的其他等分部分识别到的结果是否有重叠内容。如果判断结果为是,则执行S1013;如果判断结果为否,则返回继续执行S1010,继续进行分割并识别。
S1013,记录识别结果。
以下对简略识别流程进行详细说明:
图11是本申请实施例提供的简略识别流程的流程示意图,如图11所示,在简略识别流程中,利用OCR简单识别一张图片上的文字,如图12所示,是本申请实施例中简略识别流程的示意图,在简略识别流程中,对于大字、正体字,例如图12中的文字121,将会是主动识别的目标,而小字、异体字等,例如图12中的文字122,由于需要较多时间和资源进行识别,因此,对于这一类字体将会放弃识别,确保单张图片的识别时间能够控制在10毫秒内。
请继续参照图11,简略识别策略的流程包括S111至S117,下面对各步骤分别进行说明。
S111,遍历预设图片库中的图片。
当开始执行简略识别策略时,客户端遍历预设图片库中的图片,以获取未完成低维OCR识别处理、且未完成高维OCR识别处理的图片。
S112,对图片中的文字进行OCR识别。
需要说明的是,这里的图片为未完成低维OCR识别处理、且未完成高维OCR识别处理的图片。
S113,判断识别结果的可信度是否大于80%。
这里,选取可信度高的识别结果。在简略识别过程中,将对可信度低的结果进行排除,因为在简略识别触发的时机正是用户正在进行搜索,而深度识别尚未完成的时候,所以需要保持一定的识别的准确度,确保用户能够正常搜索,同时避免由于可信度低导致产出过多搜索干扰项。
本申请实施例中,可信度是在进行OCR识别的时候即可得到的一个值,即在对图片进行OCR识别时,不仅输出识别结果,还会输出该识别结果对应的可信度。
在S113中,如果判断结果为是,则执行S114,如果判断结果为否,则执行S116。
S114,保存识别结果。
本申请实施例中,当所有图片识别完成后,对应图片的OCR结果将会保存在数据库中。
S115,判断是否遍历完成。
如果判断结果为是,则结束流程;如果判断结果为否,则返回继续执行S111。
S116,丢弃识别结果。
S117,新增图片。
需要说明的是,当新增图片时,则继续执行S112对新增图片中的文字进行OCR识别。
本申请实施例中,当有新的图片增加时,不需要再进行全量识别,只需要对新增的图片进行一次识别,并将识别结果保存到数据库(称为预设存储单元)即可。
以下对深度识别策略进行详细说明。
图13是本申请实施例提供的深度识别策略的流程示意图,如图13所示,深度识别策略包括S131至S138,下面对各步骤分别进行说明。
S131,遍历预设图片库中的图片。
需要说明的是,本申请实施例中,深度识别将会在闲时(例如,深夜、充电中且应用程序未在使用时)进行。
S132,图片四等分并放大。
在深度识别中,将分割图片并放大,确保能够识别到更多的信息,例如,在深度识别时,可以对图片进行四等分,该处理的目的是为了在图片 中识别更多的文字。图14A是本申请实施例提供的对图片进行等分并放大前的示意图,图14B是本申请实施例提供的对图片进行等分并放大后的示意图,如图14A和图14B所示,在对图片进行等分并放大前的原始图片区域141中,文字较小,很难识别,而在对图片进行等分并放大后的局部放大图片区域142中,文字被放大,容易识别;其中,图片区域142是图片区域141的放大结果。
本申请实施例中,分割后对图片进行识别,如果某一张分割后的图片中没有包含有任何文字信息,则将抛弃该分割图片,不再对该分割图片区域进行识别。
S133,对分割后的图片进行OCR识别。
S134,判断是否有OCR识别结果。
如果判断结果为是,则执行S135,如果判断结果为否,则结束对该图片的继续分割,并执行S136。
S135,判断识别结果的可信度是否低于阈值。
本申请实施例中,可以针对可信度低的结果进行再次分割识别。
在本申请实施例中,可能将图片四等分后仍然在图片中包含了过多的文字信息(如全景图片、长截图等),这时候识别出来的结果可信度将会偏低,对于这部分的图片,将对分割后的图片再次进行分割,并对二次分割的图片同样进行识别,如果图片之前已经能够识别出可信度高(例如,可信度大于70%)的内容,或者图片中并未包含文字信息,则不需要进行再次分割。
在S135中,如果判断结果为是,则返回继续执行S132,对图片进行继续分割和识别;如果判断结果为否,则执行S136。
S136,跳过该分割图片。
S137,判断是否遍历完所有分割图片。
如果判断结果为是,则执行S138;如果判断结果为否,则继续遍历已分割图片,并返回继续执行S133。
S138,判断预设图片库中的图片是否遍历完成。
如果判断结果为是,则结束流程;如果判断结果为否,则返回S131继续遍历图片。
本申请实施例中,可以将识别结果保存到数据库中,如果数据库中已包含该图片的简略识别流程的结果,则将简略识别流程的结果替换为深度识别的结果。同样的,如果有新图片加入时,也可以直接对新图片进行增量识别,即对新图片进行深度识别。
本申请实施例提供的图片搜索方法,在搜索照片时,能够更准确地搜索照片中的文字信息,并提供更多维度的照片搜索,能够利用到更多的搜索场景,如:笔记搜索、聊天记录截图搜索等,且无需后台云端识别就有较高的准确率,并可以离线使用。
下面继续说明本申请实施例提供的图片搜索装置354实施为软件模块的示例性结构,在一些实施例中,如图3所示,存储在存储器350的图片搜索装置354,包括:
获取模块3541,配置为获取图片搜索请求,所述图片搜索请求中包括关键字符串;
响应模块3542,配置为响应于所述图片搜索请求,获取预设图片库中每一图片的OCR识别结果;其中,所述OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果,所述低维OCR识别处理的识别精度小于所述高维OCR识别处理的识别精度;
处理模块3543,配置为遍历所述预设图片库中未完成所述低维OCR识别处理、且未完成所述高维OCR识别处理的图片,并对遍历到的每一图片进行所述低维OCR识别处理,得到每一对应图片的低维OCR识别结果;
第一确定模块3544,配置为根据每一图片的所述低维OCR识别结果和所述高维OCR识别结果中的至少一种,在所述预设图片库中确定与所述关键字符串匹配的目标图片;
第二确定模块3545,配置为将所述目标图片确定为所述图片搜索请求的搜索结果,并显示所述搜索结果。
在一些实施例中,所述处理模块3543还配置为:当确定出所述预设图片库中包括至少一张图片未完成所述低维OCR识别处理、且未完成所述高维OCR识别处理时,遍历未完成所述低维OCR识别处理和所述高维OCR识别处理的图片,并对遍历到的每一图片进行所述低维OCR识别处理。
在一些实施例中,所述OCR识别阈值包括识别速度阈值,所述处理模块3543还配置为:确定针对于遍历到的所述图片中的每一文字的识别速度;对所述识别速度大于所述识别速度阈值的文字进行OCR识别,所述OCR识别用于对所述图片进行所述低维OCR识别处理。
在一些实施例中,所述OCR识别阈值包括字体大小阈值,所述处理模块3543还配置为:确定遍历到的所述图片中的每一文字的字体大小;对所述字体大小大于所述字体大小阈值的文字进行OCR识别,所述OCR识别用于对所述图片进行所述低维OCR识别处理。
在一些实施例中,所述处理模块3543还配置为:当遍历到的所述图片中包括异体字时,结束对所述异体字进行所述低维OCR识别处理的流程。
在一些实施例中,所述第一确定模块3544还配置为:当所述图片同时具有所述低维OCR识别结果和所述高维OCR识别结果时,将所述高维OCR识别结果确定为所述图片的所述OCR识别结果;当所述图片仅具有所述低维OCR识别结果时,将所述低维OCR识别结果确定为所述图片的所述OCR识别结果;根据所述图片的所述OCR识别结果,在所述预设图片库中确定与所述关键字符串匹配的所述目标图片。
在一些实施例中,所述图片搜索装置354还包括:第三确定模块,配置为在获取所述图片搜索请求之前,或者,在完成对所述图片搜索请求的响应之后,或者,在所述图片搜索请求响应中断时,确定所述预设图片库中未完成所述高维OCR识别处理的图片为未处理图片;对每一所述未处理图片进行所述高维OCR识别处理,得到每一所述未处理图片的所述高维OCR识别结果。
在一些实施例中,所述图片处理模块还配置为:对所述未处理图片进行文本清晰化处理,得到文本清晰化处理后的图片;对所述文本清晰化处理后的图片中的文字进行OCR识别,以得到每一所述未处理图片的所述高维OCR识别结果。
在一些实施例中,所述图片处理模块还配置为:对所述未处理图片进行分割,得到至少两个子图片;对每一所述子图片进行放大处理,得到放大后的子图片;对所述放大后的子图片中的文字进行OCR识别,以得到每一所述子图片对应的子识别结果;对所述至少两个子图片对应的至少两个所述子识别结果进行融合,得到所述未处理图片的所述高维OCR识别结果。
在一些实施例中,所述图片处理模块还配置为:当所述至少两个子图片对应的至少两个所述子识别结果之间包括重叠内容时,确定至少两个所述子识别结果中的非重叠内容和所述重叠内容;将所述非重叠内容与所述重叠内容进行融合,得到所述未处理图片的所述高维OCR识别结果;当所述至少两个子图片对应的至少两个所述子识别结果之间不包括所述重叠内容时,对每一所述子图片进行再次分割、所述放大处理、所述OCR识别和所述子识别结果的融合,以得到每一所述子图片的识别结果;根据每一所述子图片的所述识别结果,确定所述未处理图片的所述高维OCR识别结果。
在一些实施例中,所述图片搜索装置354还包括:存储模块,配置为在对遍历到的每一图片进行所述低维OCR识别处理,得到每一对应图片的低维OCR识别结果之后,将所述低维OCR识别结果存储至预设存储单元中;以及,在对每一所述未处理图片进行所述高维OCR识别处理,得到每一所述未处理图片的所述高维OCR识别结果之后,将所述高维OCR识别结果存储至所述预设存储单元中,并删除对应未处理图片的所述低维OCR识别结果。
在一些实施例中,所述图片搜索装置354还包括:第四确定模块,配置为确定每一所述图片的低维OCR识别结果对应的可信度;删除模块,配置为删除可信度低于阈值的低维OCR识别结果。
在一些实施例中,所述图片搜索装置354还包括:OCR识别处理模块,配置为当所述预设图片库中增加新的图片时,对所述新的图片进行所述低维OCR识别处理或所述高维OCR识别处理。
需要说明的是,本申请实施例提供的图片搜索装置的描述,与本申请实施例提供的图片搜索方法的描述是类似的,具有相似的有益效果。
本申请实施例提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备(用于图片搜索的电子设备)的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行本申请实施例上述的图片搜索方法。
本申请实施例提供一种存储有可执行指令的计算机可读存储介质,其中存储有可执行指令,当可执行指令被处理器执行时,将引起处理器执行本申请实施例提供的图片搜索方法,例如,如图4示出的图片搜索方法。
在一些实施例中,计算机可读存储介质,例如,铁电存储器(FRAM,Ferromagnetic Random Access Memory)、只读存储器(ROM,Read Only Memory)、可编程只读存储器(PROM,Programmable Read Only Memory)、可擦除可编程只读存储器(EPROM,Erasable Programmable Read Only Memory)、带电可擦可编程只读存储器(EEPROM,Electrically Erasable Programmable Read Only Memory)、闪存、磁表面存储器、光盘、或光盘只读存储器(CD-ROM,Compact Disk-Read Only Memory)等存储器;也可以是包括上述存储器之一或任意组合的各种设备。
在一些实施例中,可执行指令可以采用程序、软件、软件模块、脚本或代码的形式,按任意形式的编程语言(包括编译或解释语言,或者声明性或过程性语言)来编写,并且其可按任意形式部署,包括被部署为独立的程序或者被部署为模块、组件、子例程或者适合在计算环境中使用的其它单元。
作为示例,可执行指令可以但不一定对应于文件系统中的文件,可以可被存储在保存其它程序或数据的文件的一部分,例如,存储在超文本标记语言(HTML,Hyper Text Markup Language)文档中的一个或多个脚本中,存储在专用于所讨论的程序的单个文件中,或者,存储在多个协同文件(例如,存储一个或多个模块、子程序或代码部分的文件)中。作为示例,可执行指令可被部署为在一个计算设备上执行,或者在位于一个地点的多个计算设备上执行,又或者,在分布在多个地点且通过通信网络互连的多个计算设备上执行。
以上所述,仅为本申请的实施例而已,并非用于限定本申请的保护范围。凡在本申请的精神和范围之内所作的任何修改、等同替换和改进等,均包含在本申请的保护范围之内。

Claims (16)

  1. 一种图片搜索方法,所述方法由电子设备执行,包括:
    获取图片搜索请求,所述图片搜索请求中包括关键字符串;
    响应于所述图片搜索请求,获取预设图片库中每一图片的OCR识别结果;其中,所述OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果,所述低维OCR识别处理的识别精度小于所述高维OCR识别处理的识别精度;
    遍历所述预设图片库中未完成所述低维OCR识别处理、且未完成所述高维OCR识别处理的图片,并对遍历到的每一图片进行所述低维OCR识别处理,得到每一对应图片的低维OCR识别结果;
    根据每一图片的所述低维OCR识别结果和所述高维OCR识别结果中的至少一种,在所述预设图片库中确定与所述关键字符串匹配的目标图片;
    将所述目标图片确定为所述图片搜索请求的搜索结果,并显示所述搜索结果。
  2. 根据权利要求1所述的方法,其中,所述遍历所述预设图片库中未完成所述低维OCR识别处理、且未完成所述高维OCR识别处理的图片,并对遍历到的每一图片进行所述低维OCR识别处理,包括:
    当确定出所述预设图片库中包括至少一张图片未完成所述低维OCR识别处理、且未完成所述高维OCR识别处理时,遍历未完成所述低维OCR识别处理和所述高维OCR识别处理的图片,并对遍历到的每一图片进行所述低维OCR识别处理。
  3. 根据权利要求1所述的方法,其中,所述OCR识别阈值包括识别速度阈值,所述对遍历到的每一图片进行所述低维OCR识别处理,包括:
    确定针对于遍历到的所述图片中的每一文字的识别速度;
    对所述识别速度大于所述识别速度阈值的文字进行OCR识别,所述OCR识别用于对所述图片进行所述低维OCR识别处理。
  4. 根据权利要求1所述的方法,其中,所述OCR识别阈值包括字体大小阈值,所述对遍历到的每一图片进行所述低维OCR识别处理,包括:
    确定遍历到的所述图片中的每一文字的字体大小;
    对所述字体大小大于所述字体大小阈值的文字进行OCR识别,所述OCR识别用于对所述图片进行所述低维OCR识别处理。
  5. 根据权利要求1所述的方法,其中,所述对遍历到的每一图片进行所述低维OCR识别处理之前,所述方法还包括:
    当遍历到的所述图片中包括异体字时,结束对所述异体字进行所述低维OCR识别处理的流程。
  6. 根据权利要求1所述的方法,其中,所述根据每一图片的所述低维OCR识别结果和所述高维OCR识别结果中的至少一种,在所述预设图片库中确定与所述关键字符串匹配的目标图片,包括:
    当所述图片同时具有所述低维OCR识别结果和所述高维OCR识别结果时,将所述高维OCR识别结果确定为所述图片的所述OCR识别结果;
    当所述图片仅具有所述低维OCR识别结果时,将所述低维OCR识别结果确定为所述图片的所述OCR识别结果;
    根据所述图片的所述OCR识别结果,在所述预设图片库中确定与所述关键字符串匹配的所述目标图片。
  7. 根据权利要求1至6任一项所述的方法,其中,所述方法还包括:
    在获取所述图片搜索请求之前,或者,在完成对所述图片搜索请求的响应之后,或者,在所述图片搜索请求响应中断时,确定所述预设图片库中未完成所述高维OCR识别处理的图片为未处理图片;
    对每一所述未处理图片进行所述高维OCR识别处理,得到每一所述未处理图片的所述高维OCR识别结果。
  8. 根据权利要求7所述的方法,其中,所述对每一所述未处理图片进行所述高维OCR识别处理,得到每一所述未处理图片的所述高维OCR识别结果,包括:
    对所述未处理图片进行文本清晰化处理,得到文本清晰化处理后的图片;
    对所述文本清晰化处理后的图片中的文字进行OCR识别,得到每一所述未处理图片的所述高维OCR识别结果。
  9. 根据权利要求8所述的方法,其中,所述对所述未处理图片进行文本清晰化处理,得到文本清晰化处理后的图片,包括:
    对所述未处理图片进行分割,得到至少两个子图片;
    对每一所述子图片进行放大处理,得到放大后的子图片;
    所述对所述文本清晰化处理后的图片中的文字进行OCR识别,得到每一所述未处理图片的所述高维OCR识别结果,包括:
    对所述放大后的子图片中的文字进行OCR识别,得到每一所述子图片对应的子识别结果;
    对所述至少两个子图片对应的至少两个所述子识别结果进行融合,得到所述未处理图片的所述高维OCR识别结果。
  10. 根据权利要求9所述的方法,其中,所述对所述至少两个子图片对应的至少两个所述子识别结果进行融合,得到所述未处理图片的所述高维OCR识别结果,包括:
    当所述至少两个子图片对应的至少两个所述子识别结果之间包括重叠内容时,确定至少两个所述子识别结果中的非重叠内容和所述重叠内容;
    将所述非重叠内容与所述重叠内容进行融合,得到所述未处理图片的所述高维OCR识别结果;
    当所述至少两个子图片对应的至少两个所述子识别结果之间不包括所述重叠内容时,对每一所述子图片进行再次分割、所述放大处理、所述OCR识别和所述子识别结果的融合,得到每一所述子图片的识别结果;
    根据每一所述子图片的所述识别结果,确定所述未处理图片的所述高维OCR识别结果。
  11. 根据权利要求7所述的方法,其中,所述方法还包括:
    在对遍历到的每一图片进行所述低维OCR识别处理,得到每一对应图片的低维OCR识别结果之后,将所述低维OCR识别结果存储至预设存储单元中;
    在对每一所述未处理图片进行所述高维OCR识别处理,得到每一所述未处理图片的所述高维OCR识别结果之后,将所述高维OCR识别结果存储至所述预设存储单元中,并删除对应未处理图片的所述低维OCR识别结果。
  12. 根据权利要求1至6任一项所述的方法,其中,所述方法还包括:
    当所述预设图片库中增加新的图片时,对所述新的图片进行所述低维OCR识别处理或所述高维OCR识别处理。
  13. 一种图片搜索装置,包括:
    获取模块,配置为获取图片搜索请求,所述图片搜索请求中包括关键字符串;
    响应模块,配置为响应于所述图片搜索请求,获取预设图片库中每一图片的OCR识别结果;其中,所述OCR识别结果包括以下至少之一:采用基于OCR识别阈值的低维OCR识别处理所得到的低维OCR识别结果和基于深度识别的高维OCR识别处理所得到的高维OCR识别结果,所述低维OCR识别处理的识别精度小于所述高维OCR识别处理的识别精度;
    处理模块,配置为遍历未所述预设图片库中完成所述低维OCR识别处理、且未完成所述高维OCR识别处理的图片,并对遍历到的每一图片进行所述低维OCR识别处理,得到每一对应图片的低维OCR识别结果;
    第一确定模块,配置为根据每一图片的所述低维OCR识别结果和所述高维OCR识别结果中的至少一种,在所述预设图片库中确定与所述关键字符串匹配的目标图片;
    第二确定模块,配置为将所述目标图片确定为所述图片搜索请求的搜索结果,并显示所述搜索结果。
  14. 一种用于图片搜索的电子设备,包括:
    存储器,用于存储可执行指令;处理器,用于执行所述存储器中存储的可执行指令时,实现权利要求1至12任一项所述的图片搜索方法。
  15. 一种计算机程序产品,包括计算机程序或指令,所述计算机程序或指令被处理器执行时,实现权利要求1至12任一项所述的图片搜索方法。
  16. 一种计算机可读存储介质,存储有可执行指令,用于被处理器执行所述可执行指令时,实现权利要求1至12任一项所述的图片搜索方法。
PCT/CN2021/123256 2020-11-10 2021-10-12 一种图片搜索方法、装置、电子设备、计算机可读存储介质及计算机程序产品 WO2022100338A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21890866.3A EP4184383A4 (en) 2020-11-10 2021-10-12 IMAGE SEARCH METHOD AND APPARATUS, ELECTRONIC DEVICE, COMPUTER READABLE STORAGE MEDIUM, AND COMPUTER PROGRAM PRODUCT
US17/951,824 US20230082638A1 (en) 2020-11-10 2022-09-23 Picture search method and apparatus, electronic device, computer-readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011248141.7A CN112347948A (zh) 2020-11-10 2020-11-10 图片搜索方法、装置、设备及计算机程序产品
CN202011248141.7 2020-11-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/951,824 Continuation US20230082638A1 (en) 2020-11-10 2022-09-23 Picture search method and apparatus, electronic device, computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2022100338A1 true WO2022100338A1 (zh) 2022-05-19

Family

ID=74362454

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/123256 WO2022100338A1 (zh) 2020-11-10 2021-10-12 一种图片搜索方法、装置、电子设备、计算机可读存储介质及计算机程序产品

Country Status (4)

Country Link
US (1) US20230082638A1 (zh)
EP (1) EP4184383A4 (zh)
CN (1) CN112347948A (zh)
WO (1) WO2022100338A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347948A (zh) * 2020-11-10 2021-02-09 腾讯科技(深圳)有限公司 图片搜索方法、装置、设备及计算机程序产品
CN113032641B (zh) * 2021-04-23 2021-12-07 赛飞特工程技术集团有限公司 一种智能搜索方法和设备
CN113221901A (zh) * 2021-05-06 2021-08-06 中国人民大学 一种面向不成熟自检系统的图片识字转化方法及系统
CN115022268B (zh) * 2022-06-24 2023-05-12 深圳市六度人和科技有限公司 一种会话识别方法及装置、可读存储介质、计算机设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101464903A (zh) * 2009-01-09 2009-06-24 江阴明伦科技有限公司 一种利用web方式进行OCR图文识别检索方法和系统
US20090169131A1 (en) * 2007-12-26 2009-07-02 Oscar Nestares Ocr multi-resolution method and apparatus
CN101937438A (zh) * 2009-06-30 2011-01-05 富士通株式会社 网页内容提取方法和装置
CN109033261A (zh) * 2018-07-06 2018-12-18 北京旷视科技有限公司 图像处理方法、装置、处理设备及其存储介质
CN112347948A (zh) * 2020-11-10 2021-02-09 腾讯科技(深圳)有限公司 图片搜索方法、装置、设备及计算机程序产品

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4240859B2 (ja) * 2001-09-05 2009-03-18 株式会社日立製作所 携帯端末装置及び通信システム

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090169131A1 (en) * 2007-12-26 2009-07-02 Oscar Nestares Ocr multi-resolution method and apparatus
CN101464903A (zh) * 2009-01-09 2009-06-24 江阴明伦科技有限公司 一种利用web方式进行OCR图文识别检索方法和系统
CN101937438A (zh) * 2009-06-30 2011-01-05 富士通株式会社 网页内容提取方法和装置
CN109033261A (zh) * 2018-07-06 2018-12-18 北京旷视科技有限公司 图像处理方法、装置、处理设备及其存储介质
CN112347948A (zh) * 2020-11-10 2021-02-09 腾讯科技(深圳)有限公司 图片搜索方法、装置、设备及计算机程序产品

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4184383A4 *

Also Published As

Publication number Publication date
EP4184383A4 (en) 2023-12-27
CN112347948A (zh) 2021-02-09
US20230082638A1 (en) 2023-03-16
EP4184383A1 (en) 2023-05-24

Similar Documents

Publication Publication Date Title
WO2022100338A1 (zh) 一种图片搜索方法、装置、电子设备、计算机可读存储介质及计算机程序产品
US11205100B2 (en) Edge-based adaptive machine learning for object recognition
US11327978B2 (en) Content authoring
US11899681B2 (en) Knowledge graph building method, electronic apparatus and non-transitory computer readable storage medium
CN110020411B (zh) 图文内容生成方法及设备
US10140261B2 (en) Visualizing font similarities for browsing and navigation using a font graph
WO2019200783A1 (zh) 动态图表类页面数据爬取方法、装置、终端及存储介质
KR20190142288A (ko) 학습 컨텐츠 생성 방법
US20150170333A1 (en) Grouping And Presenting Images
CN106294798A (zh) 一种基于缩略图的图像分享方法和终端
US10789284B2 (en) System and method for associating textual summaries with content media
US10929478B2 (en) Filtering document search results using contextual metadata
CN110765301B (zh) 图片处理方法、装置、设备及存储介质
JP2015162244A (ja) 発話ワードをランク付けする方法、プログラム及び計算処理システム
JP7242994B2 (ja) ビデオイベント識別方法、装置、電子デバイス及び記憶媒体
CN112765387A (zh) 图像检索方法、图像检索装置和电子设备
US20190227634A1 (en) Contextual gesture-based image searching
US9940320B2 (en) Plugin tool for collecting user generated document segmentation feedback
US11558471B1 (en) Multimedia content differentiation
US20220292173A1 (en) Systems, Methods and Apparatuses For Integrating A Service Application Within An Existing Application
Khademi et al. Deep Learning from History: Unlocking Historical Visual Sources Through Artificial Intelligence
US20140032583A1 (en) Multi-Resolution Exploration of Large Image Datasets
US7909238B2 (en) User-created trade cards
Li Document Layout Analysis for Historical Documents
Girdhar et al. Mobile Visual Search for Digital Heritage Applications

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21890866

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021890866

Country of ref document: EP

Effective date: 20230217

NENP Non-entry into the national phase

Ref country code: DE