WO2016187888A1 - Keyword notification method and device based on character recognition, and computer program product - Google Patents

Keyword notification method and device based on character recognition, and computer program product

Info

Publication number
WO2016187888A1
WO2016187888A1 (PCT/CN2015/080127)
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
matching
character
image
recognized
Prior art date
Application number
PCT/CN2015/080127
Other languages
English (en)
French (fr)
Inventor
周舒畅
周昕宇
吴育昕
姚聪
Original Assignee
北京旷视科技有限公司
北京小孔科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京旷视科技有限公司 and 北京小孔科技有限公司
Priority to CN201580000345.XA priority Critical patent/CN105518712B/zh
Priority to PCT/CN2015/080127 priority patent/WO2016187888A1/zh
Publication of WO2016187888A1 publication Critical patent/WO2016187888A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/768 Arrangements using pattern recognition or machine learning using context analysis, e.g. recognition aided by known co-occurring patterns

Definitions

  • the present disclosure relates to the field of information discovery and prompting technologies, and more particularly to a keyword recognition method and device based on character recognition, and a computer program product.
  • With optical character recognition (OCR) technology, characters or words contained in an image (including pictures and videos) can be recognized.
  • OCR technology can run on a mobile terminal, which may include a smart phone, a tablet, a wearable device, and the like.
  • OCR technology can be applied in real time on the mobile terminal. For example, optical character recognition of one frame of image per second (i.e., a real-time processing speed of one frame per second) can be achieved on the mobile terminal.
  • Typically, a user first finds a character of interest (the target recognition character) with a mobile terminal such as a smart phone, and then starts an OCR application on the mobile terminal to recognize it.
  • Such an optical character recognition process obviously relies on the user first finding the target recognition character and issuing an explicit instruction to the mobile terminal requesting optical character recognition of it. In the case where the user has not yet found the characters of interest, however, such a process cannot help the user discover the character content of interest.
  • Embodiments of the present disclosure provide a keyword notification method and device based on character recognition, and a computer program product, which preset a target keyword and filter the character recognition result based on the target keyword, so that the user is prompted that the target keyword has been found when the character recognition result matches it.
  • According to one aspect of the present disclosure, a keyword notification method based on character recognition comprises: capturing an image to be recognized; performing character recognition in the image to be recognized; and, in the case where a character recognized from the image to be recognized matches a preset keyword, generating and outputting a matching notification message.
  • According to another aspect of the present disclosure, a keyword notification device based on character recognition comprises: an image capture device for capturing an image to be recognized; a notification device for outputting a matching notification message; one or more processors; one or more memories; and computer program instructions stored in the memory. The computer program instructions, when executed by the processor, perform the steps of: performing character recognition in the image to be recognized; and generating the matching notification message in the case where a character recognized from the image to be recognized matches a preset keyword.
  • According to another aspect of the present disclosure, a computer program product for keyword notification based on character recognition comprises one or more computer readable storage media on which computer program instructions are stored.
  • The computer program instructions, when executed by a computer, perform the steps of: performing character recognition in the image to be recognized; and generating a matching notification message in the case where a character recognized from the image to be recognized matches a preset keyword.
  • With the character-recognition-based keyword notification method and device and the computer program product, by setting the target keyword in advance and filtering the character recognition result based on the target keyword, the user can be prompted that the target keyword has been found when the character recognition result matches the target keyword. Since a captured image can be optically recognized on the electronic terminal at a real-time processing speed of, for example, one frame per second, when the electronic terminal captures images in real time it can perform optical character recognition on the currently captured image in real time and, when the optical character recognition result matches the target keyword, notify the user in real time that the target keyword has been found, thereby advantageously using OCR technology to assist the user in character discovery.
  • FIG. 1 is a schematic block diagram of an exemplary electronic terminal for implementing a character recognition based keyword notification method and apparatus of an embodiment of the present disclosure
  • FIG. 3 is an example of an image to be recognized according to an embodiment of the present disclosure
  • FIG. 4A is another example of an image to be recognized according to an embodiment of the present disclosure.
  • 4B is a schematic diagram of superimposing a video cue on an image to be recognized according to an embodiment of the present disclosure
  • FIG. 5A is a schematic diagram of image region division according to an embodiment of the present disclosure.
  • 5B is a schematic diagram of a two-dimensional coordinate system of an image in accordance with an embodiment of the present disclosure
  • FIG. 6 is a schematic block diagram of a character recognition based keyword notification device according to an embodiment of the present disclosure.
  • The electronic terminal 100 includes one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image capture device 110, which are interconnected through a bus system 112 and/or other forms of connection mechanisms (not shown). It should be noted that the components and structures of the electronic terminal 100 shown in FIG. 1 are merely exemplary and not limiting, and the electronic terminal 100 may have other components and structures as needed.
  • the processor 102 can be a central processing unit (CPU) or other form of processing unit with data processing capabilities and/or instruction execution capabilities, and can control other components in the electronic terminal 100 to perform desired functions.
  • the storage device 104 can include one or more computer program products, which can include various forms of computer readable storage media, such as volatile memory and/or nonvolatile memory.
  • the volatile memory may include, for example, a random access memory (RAM) and/or a cache or the like.
  • the nonvolatile memory may include, for example, a read only memory (ROM), a hard disk, a flash memory, or the like.
  • One or more computer program instructions may be stored on the computer readable storage medium. Various applications and various data may also be stored there, such as image data collected by the image capture device 110, the preset (target) keywords, and various data generated by the use of the applications.
  • the input device 106 can be a device used by a user to input an instruction, and can include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
  • the instructions are, for example, an instruction to perform target keyword discovery using the electronic terminal 100, or an instruction to capture an image to be recognized using the image capture device 110, or an instruction to activate an optical character recognition (OCR) application.
  • the output device 108 may output various information (eg, images, sounds, or vibrations) to the outside (eg, a user), and may include one or more of a display, a speaker, a vibration generator, and the like.
  • the image capture device 110 can capture images (eg, photos, videos, etc.) desired by the user and store the captured images in the storage device 104 for use by other components.
  • the exemplary electronic terminal 100 for implementing the character recognition-based keyword notification method and apparatus of the embodiments of the present disclosure may be a mobile terminal such as a smartphone, a tablet, a wearable device, or the like.
  • The electronic terminal 100 may also be a fixed electronic terminal, and the image capture device 110 in the electronic terminal 100 may be installed together with the processor 102 or may be installed at a position remote from the processor 102.
  • For example, the image capture device 110 of the electronic terminal 100 may be installed in a place such as a square or a meeting venue.
  • the mobile device can include a smartphone, a tablet, a wearable device, and the like.
  • FIG. 2 is a schematic flowchart of a character recognition based keyword notification method according to an embodiment of the present disclosure.
  • step S210 an image to be recognized is captured.
  • Specifically, the image capture device of the electronic terminal 100, or another image acquisition device that transmits images to the electronic terminal 100, captures an image of the user-selected scene at the position where the user is located as the image to be recognized.
  • the image to be identified may be a photo or a frame in the video.
  • the photo may include one or more photos of a single scene, or may be a panoramic photo.
  • For example, the image capture device in the electronic terminal may capture a photo of the scene selected by the user, or capture a video of the scene selected by the user, or change the shooting direction or viewing range of the image capture device at a speed lower than a predetermined moving speed threshold so as to capture a video of a wider range of the user-selected scene.
  • the image to be identified may reflect the environment in which the user is located, and may accordingly include characters present in the environment in which the user is located, which may include, but is not limited to, building identification, store identification, street identification, billboard characters, and the like.
  • FIG. 3 shows an example of an image to be recognized taken at the location where the user is located. In this example the image is a photo; the user is located near the Red Star Laundry, and the photo contains the character string "Red Star Laundry" as well as strings such as "Li shop", "shop", "13 stores", "Midea", "beautiful", and the like.
  • FIG. 4A shows another example of a photo taken at a location where the user is located, in this example, the image is a photo, and the user wants to find information about the flight CA3856 he is about to ride in front of the flight information display screen of the airport.
  • the photo contains fields such as "flight number”, “plan”, “terminal/stop”, “counter number” and "level of processing”.
  • step S220 character recognition is performed in the image to be recognized. After the captured image to be recognized is obtained, characters appearing in the image to be recognized may be identified.
  • the image to be recognized may be pre-processed to facilitate the character recognition.
  • the pre-processing may include scaling the photo
  • the pre-processing may include extracting a key frame of the video.
  • the character recognized from the image to be recognized may include at least one character
  • the character recognition result may include the at least one character and the position of each character.
  • For the image to be recognized shown in FIG. 3, characters such as "red", "star", "wash", "clothing", "shop", "reason", and "store" can be identified from the image.
  • the at least one character in the character recognition result may be organized into a character string in the order of rows or columns.
  • the at least one character may be combined to form at least one character string according to a position of each of the at least one character in the image to be recognized.
  • For example, the at least one character may be arranged into one or more lines according to the position of each character included in the character recognition result, and the characters may then be organized into at least one character string in line order; alternatively, the at least one character may be arranged into one or more columns according to those positions and then organized into at least one character string in column order.
  • it may be determined according to the writing habits of the country and/or region where the user is located, whether the identified at least one character should be arranged in a row or a column.
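As an illustrative sketch only (the disclosure does not specify a grouping algorithm, and all names below are hypothetical), recognized characters carrying positions can be organized into line strings by clustering on one coordinate and ordering by the other:

```python
# Hypothetical sketch: organize recognized (char, x, y) tuples into
# strings, row-wise or column-wise, following the description above.

def characters_to_strings(chars, line_tol=10, by_rows=True):
    """Group (char, x, y) tuples into strings by position."""
    if not chars:
        return []
    # Primary axis: y for rows, x for columns.
    key = (lambda c: (c[2], c[1])) if by_rows else (lambda c: (c[1], c[2]))
    items = sorted(chars, key=key)
    lines, current = [], [items[0]]
    for c in items[1:]:
        prev = current[-1]
        # Same line/column if the primary coordinates are close enough.
        gap = abs((c[2] - prev[2]) if by_rows else (c[1] - prev[1]))
        if gap <= line_tol:
            current.append(c)
        else:
            lines.append(current)
            current = [c]
    lines.append(current)
    # Within each line, order by the secondary axis and join.
    result = []
    for line in lines:
        line.sort(key=(lambda c: c[1]) if by_rows else (lambda c: c[2]))
        result.append("".join(ch for ch, _, _ in line))
    return result

chars = [("R", 0, 0), ("e", 12, 1), ("d", 24, 0),
         ("S", 0, 40), ("t", 12, 41)]
print(characters_to_strings(chars))  # → ['Red', 'St']
```

The `line_tol` tolerance absorbs the small vertical jitter that real OCR bounding boxes exhibit within a single text line.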
  • The character string recognized from the image to be recognized may include at least one character string, and the string recognition result may include the at least one character string and the position of each string. Still taking the image to be recognized shown in FIG. 3 as an example, strings such as "Red Star Laundry", "Li shop", "Store", "Thirteen Stores", "Midea", "Beauty", and the like can be identified.
  • For example, the image to be recognized may be scanned to detect the positions of all character boxes in the image that may contain characters; the characters in each character box may then be identified, the content of each character box may be taken as a character string, and a string recognition result may be generated.
  • the string recognition result may include the recognized character string and the position of the character string.
  • For example, the image to be recognized may be divided into 9 blocks as shown in FIG. 5A, and the values 1-9 may be used to indicate the position of a character string: "1" indicates that the string is located in the upper-left block of the image to be recognized, "2" indicates the upper-middle block, "3" indicates the upper-right block, and so on.
  • the image to be identified may be divided into fewer or more blocks, for example, 4 blocks, 16 blocks, and the like.
  • a two-dimensional coordinate system may be established with one of a center point, an upper left vertex, a lower left vertex, an upper right vertex, and a lower right vertex of the image to be recognized as an origin, and for each pixel, the pixel point is The number of pixels between the origins is taken as a two-dimensional coordinate value of the pixel.
  • each of the string recognition results may also be analyzed to determine one or more words with specific meanings.
  • The character string can be analyzed using methods well known in the art, and can be segmented to divide it into one or more words with specific meanings. Since such analysis and segmentation methods are well known in the field, detailed descriptions thereof are omitted here for brevity.
  • For example, the "Red Star Laundry" character string can be divided into substrings such as "Red Star", "Laundry", "Store", and the like.
  • step S230 it is determined whether the character recognized from the image to be recognized matches the preset keyword.
  • the preset keyword may include at least one preset keyword. Specifically, it is determined whether a character string recognized from the image to be recognized matches one of the at least one preset keyword.
  • For example, a matching degree threshold may be set in advance; for each of the at least one character string, the matching degree of the string with one of the at least one preset keyword is determined, and when the matching degree is higher than the matching degree threshold, the string is determined to match the preset keyword. For example, if the string completely contains a certain preset keyword, it can be determined that the string exactly matches that keyword.
  • For example, suppose a preset keyword is "laundry". The character string recognized in the image to be recognized shown in FIG. 3 includes "Red Star Laundry", which completely contains the preset keyword "laundry", so it is determined that the string matches the preset keyword. In this case, the position of the character string "Red Star Laundry" in the image to be recognized may be used as the string position presented to the user.
  • As another example, suppose a preset keyword is "Red Star Dry Cleaner" and the matching degree threshold is 60%. The character string recognized in the image shown in FIG. 3 includes "Red Star Laundry", which does not exactly match the preset keyword "Red Star Dry Cleaner"; however, the matching degree between them may be calculated as, for example, 70% or 80%. Since this matching degree is higher than the 60% threshold, the string is determined to match the preset keyword. In this case, the position of the character string "Red Star Laundry" in the image to be recognized may be used as the string position presented to the user.
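A minimal sketch of such a matching-degree judgment, assuming that containment counts as a full match and using a generic similarity ratio for the rest (the disclosure leaves the matching degree calculation algorithm unspecified; `matching_degree` and `matches` are hypothetical names):

```python
# Hypothetical matching-degree check. Substring containment is treated
# as a 100% match, per the example above; otherwise a generic sequence
# similarity ratio stands in for the unspecified matching algorithm.
from difflib import SequenceMatcher

def matching_degree(string, keyword):
    if keyword in string:          # string completely contains keyword
        return 1.0
    return SequenceMatcher(None, string, keyword).ratio()

def matches(string, keyword, threshold=0.6):
    return matching_degree(string, keyword) > threshold

print(matches("Red Star Laundry", "Laundry"))      # → True (contained)
print(matching_degree("CA5856", "CA3856") > 0.6)   # → True
```

With a 60% threshold, this sketch reproduces both behaviors described above: containment matches immediately, and a near-miss like "CA5856" vs. "CA3856" still scores well above the threshold.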
  • As a further example, suppose a preset keyword is "CA3856" and that the character string "CA3856" exists in the image to be recognized as shown in FIG. 4A. If "CA3856" in the image is recognized as the character string "CA5856" due to a recognition error, the recognized string "CA5856" does not exactly match the preset keyword "CA3856". Suppose that, according to a predetermined matching degree calculation algorithm, the matching degree between the recognized string "CA5856" and the preset keyword "CA3856" is determined to be 85%. If the matching degree threshold is set above 85%, the character-recognition-based keyword notification method determines that the recognized string does not match the preset keyword; if the matching degree threshold is set to 80%, the method may determine that the recognized string matches the preset keyword.
  • The matching degree threshold can thus be set by weighing the character recognition accuracy against the false alarm rate. For example, if a preset keyword is "CA3856" and the matching degree threshold is set as low as 50%, both the strings "CA3856" and "CA3448" in FIG. 4A may be determined to match the preset keyword, which obviously increases the false alarm rate.
  • an edit distance of the string and one of the at least one preset keyword may be calculated, and the string and the preset are determined when the edit distance is lower than a predetermined edit distance threshold Keyword matching.
  • the edit distance may represent the minimum number of edit operations required to convert the first string to the second string, and the permitted edit operation may include, for example, replacing one character in the first string with another character, in the first string Insert a character and delete a character in the first string.
  • When the edit distance between the character string and one of the at least one keyword is 0, the match is exact; the larger the edit distance between the string and the keyword, the lower the degree of match.
  • the predetermined edit distance threshold can be set as needed.
  • the predetermined edit distance threshold may be set by weighing the character recognition accuracy and the false positive rate. The higher the predetermined edit distance threshold, the lower the required character recognition accuracy but the higher the false positive rate; the lower the predetermined edit distance threshold, the lower the false positive rate but the higher the required character recognition accuracy.
  • If the predetermined edit distance threshold is set to 0, an exact match is required, which reduces the false alarm rate; in this case, however, the user cannot be alerted if the character recognition is incorrect. If the edit distance threshold is set to 1, the character string may differ from a specific keyword by one character, for example by one extra character, one missing character, or one substituted character.
  • For example, suppose a preset keyword is "CA3856" and the character string "CA3856" exists in the image to be recognized as shown in FIG. 4A. However, due to an error of the character recognition algorithm, "CA3856" is recognized as "CA5856", so the recognized string "CA5856" does not exactly match the preset keyword "CA3856". Specifically, the recognized string "CA5856" differs from the preset keyword "CA3856" by one character, and its edit distance, calculated for example according to a predetermined edit distance calculation algorithm, is 1.
  • If the predetermined edit distance threshold is set to 0, it is determined that the recognized string "CA5856" does not match the preset keyword "CA3856"; if the predetermined edit distance threshold is set to 1 or greater, it can be determined that the recognized string "CA5856" matches the target keyword "CA3856".
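The edit distance described above is the classic Levenshtein distance. A minimal sketch with threshold-based matching follows; the function names are illustrative, and the threshold is treated as inclusive so that a threshold of 1 admits a one-character difference, as in the example above:

```python
# Levenshtein edit distance: the minimum number of single-character
# substitutions, insertions, and deletions needed to turn one string
# into the other, computed with a rolling DP row.

def edit_distance(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete from a
                            curr[j - 1] + 1,            # insert into a
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]

def matches(string, keyword, threshold=1):
    return edit_distance(string, keyword) <= threshold

# "CA5856" differs from "CA3856" by one substituted character:
print(edit_distance("CA5856", "CA3856"))   # → 1
print(matches("CA5856", "CA3856", 0))      # → False (exact match required)
print(matches("CA5856", "CA3856", 1))      # → True
```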
  • step S240 in a case where the character recognized from the image to be recognized matches the preset keyword, a matching notification message is generated and output.
  • the matching notification message may be output in visual information, audio information, tactile information, or the like.
  • the match notification message may indicate that a character matching the preset keyword is found in the to-be-identified image.
  • the electronic terminal can generate a vibration, and the user perceives the vibration, whereby it can be determined that there is a character matching the preset keyword in the currently captured image.
  • the electronic terminal may output audio to notify the user that there is a character matching the preset keyword in the currently captured image.
  • the electronic terminal may output video information such as image blinking, alarm identification, alarm identification blinking, etc. on its display screen to notify the user of There are characters matching the preset keyword in the currently captured image.
  • The matching notification message may not only indicate that a character matching the preset keyword has been found in the image to be recognized, but may further indicate the position, in the image to be recognized, of the found character that matches the preset keyword.
  • For example, in the case where the recognized character matches a preset keyword, the electronic terminal may output audio to tell the user the position, in the image to be recognized, of the found character that matches the preset keyword. Alternatively, the electronic terminal may output a video cue to show the user that position.
  • the video cue may be superimposed on the identified image for display.
  • the position of the found character string matching the preset keyword in the image to be recognized may be indicated in the manner of the image block in FIG. 5A, or may be determined in a two-dimensional coordinate manner.
  • the video indication can be a box superimposed on the image to identify the location of the found character that matches the preset keyword in the image.
  • For example, a mobile device can capture the image to be recognized in real time through its built-in image capture device, and can display on the mobile device's screen, in real time, the video indication superimposed on the currently captured image (i.e., the image to be recognized) at or near the position of the matched string in the image to be recognized.
  • As another example, the image to be recognized may be captured in real time by an image capture device in a glasses-type wearable device, and the video indication may be displayed in real time through the augmented reality lens of the glasses-type wearable device at or near the position, in the image to be recognized, of the string matching the preset keyword, thereby performing keyword notification to the user through augmented reality technology.
  • By capturing the image to be recognized in real time, performing character recognition on it in real time, and notifying the user in real time when the target keyword is found in the image, the character-recognition-based keyword notification method according to an embodiment of the present disclosure can thus perform target keyword discovery and notification in real time based on the image captured in real time.
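Steps S210-S240 can be sketched end to end as follows; the capture, recognition, and notification back-ends are stubbed placeholders, not APIs defined by the disclosure, and the matching judgment uses the simplest containment case described above:

```python
# Hypothetical end-to-end sketch of steps S210-S240. All function
# names are illustrative stubs standing in for a camera, an OCR
# engine, and a notification channel.

def capture_image():                       # S210: capture image to be recognized
    return "frame-with-text"

def recognize_strings(image):              # S220: character recognition (OCR)
    # A real implementation would return (string, position) pairs
    # produced by an OCR engine; these values are placeholders.
    return [("Red Star Laundry", (0, 0)), ("Midea", (2, 1))]

def notify(keyword, position):             # S240: output matching notification
    print(f"Found keyword {keyword!r} at {position}")

def keyword_notification_step(keywords):
    """One pass of S210-S240; returns the (keyword, position) hits."""
    image = capture_image()
    hits = []
    for string, position in recognize_strings(image):
        for kw in keywords:                # S230: matching judgment
            if kw in string:               # simplest case: containment
                hits.append((kw, position))
                notify(kw, position)
    return hits

print(keyword_notification_step(["Laundry"]))
```

Running this step in a loop, one captured frame at a time, gives the real-time behavior described above; swapping the containment test for the matching-degree or edit-distance judgment changes only the S230 line.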
  • user feedback for the match notification message may be received.
  • The user feedback may include ignoring the matching notification message, reducing the matching degree threshold, increasing the matching degree threshold, adding a preset keyword, modifying a certain preset keyword, deleting a certain preset keyword, or filtering a certain preset keyword.
  • Based on the user feedback, the preset keywords and/or the parameters used for the matching judgment can be adjusted in real time.
  • the adjusting the preset keyword may include changing a preset keyword list, and the changing the preset keyword list may include adding a preset keyword, modifying a preset keyword, or deleting a certain preset. Set keywords.
  • the adjusting the preset keyword may further include filtering, in real time, a preset keyword used by the matching determination when the matching notification message is generated.
  • The parameters for the matching judgment may include the matching degree threshold or the edit distance threshold.
  • Steps S250 and S260 are shown with dashed lines in FIG. 2 to distinguish them from steps S210-S240: steps S210-S240 are necessary steps of the character-recognition-based keyword notification method according to an embodiment of the present disclosure, while steps S250 and S260 are optional steps.
  • The keyword notification device may be a mobile device carried by the user and may perform the method described above. Since the details of the respective operations performed by the keyword notification device are substantially the same as those described above with respect to FIG. 2, only a brief description of the keyword notification device is given below, and the repeated details are omitted to avoid duplication.
  • the character recognition based keyword notification apparatus 600 includes an image collection device 610, a character recognition device 620, a keyword matching device 630, a notification device 640, and a storage device 650.
  • the image capture device 610 can be implemented by the image capture device 110 shown in FIG. 1
  • the character recognition device 620 and the keyword matching device 630 can be implemented by the processor 102 shown in FIG. 1
  • A portion of the notification device 640 may also be implemented by the processor 102 shown in FIG. 1, and the storage device 650 may be implemented by the storage device 104 shown in FIG. 1.
  • the image capture device 610 can capture an image to be recognized, and specifically can be used to capture an image to be recognized of a user selected scene at a location where the user is located.
  • the image to be identified may be a photo or a frame in the video.
  • the photo may include one or more photos of a single scene, or may be a panoramic photo.
  • For example, the image capture device 610 may capture a photo of a scene selected by the user, or capture a video of the scene selected by the user, or change the shooting direction or viewing range of the image capture device at a speed lower than a predetermined moving speed threshold so as to capture a video of a wider range of the user-selected scene.
  • the image to be identified may reflect the environment in which the user is located, and may accordingly include characters present in the environment in which the user is located, which may include, but is not limited to, building identification, store identification, street identification, billboard characters, and the like.
  • the image to be recognized may also be captured by other photographing devices, and the photographed image may be transmitted to the keyword notification device 600, in which case the image capture device 610 may be omitted.
  • the character recognition device 620 can perform character recognition on the captured image to be recognized to recognize characters in the image to be recognized.
  • The character recognition device 620 may preprocess the image to be recognized to facilitate character recognition before performing character recognition on the image.
  • the pre-processing may include scaling the photo, and in the case where the image is a video, the pre-processing may include extracting a key frame of the video.
  • the character recognized from the image to be recognized may include at least one character
  • the character recognition result may include the at least one character and the position of each character.
  • the at least one character in the character recognition result may be organized into a character string in the order of rows or columns.
  • the at least one character may be combined to form at least one character string according to a position of each of the at least one character in the image to be recognized.
  • For example, the at least one character may be arranged into one or more lines according to the position of each character included in the character recognition result, and the characters may then be organized into at least one character string in line order (or, analogously, into columns and then into strings in column order).
  • the character strings recognized from the image to be recognized may include at least one character string, and the string recognition result may include the at least one character string and the position of each character string.
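The row-wise grouping just described can be sketched in a few lines; the (character, x, y) tuples and the row tolerance below are illustrative assumptions, not data structures defined by the disclosure:

```python
def group_into_strings(chars, row_tol=10):
    """Group (character, x, y) tuples into strings by row, reading left to right.

    chars: list of (character, x, y) positions as produced by recognition.
    row_tol: vertical distance (pixels) within which characters share a row.
    """
    rows = []  # each row is a list of (character, x, y)
    for ch in sorted(chars, key=lambda c: c[2]):  # top to bottom
        for row in rows:
            if abs(row[0][2] - ch[2]) <= row_tol:
                row.append(ch)
                break
        else:
            rows.append([ch])
    # within each row, order characters left to right and join them
    return ["".join(c[0] for c in sorted(row, key=lambda c: c[1]))
            for row in rows]
```

With a column-major script, the same idea applies with the roles of x and y swapped.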
  • the character recognition device 620 may use an optical character recognition (OCR) technique to identify characters in the image to be recognized.
  • the character recognition device 620 may scan the image to be recognized to detect the positions of all character boxes in the image that may contain characters, then recognize the characters in each character box, treat the content of each box as a character string, and produce a string recognition result.
  • the string recognition result may include the recognized character string and the position of the character string.
  • the character recognition device 620 may further analyze each character string in the string recognition result to determine one or more words having specific meanings.
  • the character string can be analyzed using methods well known in the art, and the character string can be segmented to divide the string into one or more words with specific meanings.
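The segmentation step can be illustrated with forward maximum matching, one of the well-known methods alluded to above; the vocabulary here is a made-up example, not one shipped with the device:

```python
def forward_max_match(text, vocab, max_len=4):
    """Split text into words greedily, longest dictionary match first."""
    words, i = [], 0
    while i < len(text):
        # try the longest candidate first, fall back to a single character
        for length in range(min(max_len, len(text) - i), 0, -1):
            cand = text[i:i + length]
            if length == 1 or cand in vocab:
                words.append(cand)
                i += length
                break
    return words
```

For instance, with the vocabulary {"红星", "洗衣", "洗衣店"}, the string "红星洗衣店" is divided into the sub-strings "红星" and "洗衣店" mentioned in the text.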
  • the keyword matching device 630 can determine whether the character recognized from the image to be recognized matches the preset keyword.
  • the preset keyword may include at least one preset keyword. Specifically, for each of the at least one character string recognized by the character recognition device 620, the keyword matching device 630 may determine whether that character string matches one of the at least one preset keyword, and if so, determine that the character string matches the preset keyword.
  • a matching degree threshold may be preset; when the matching degree between a character string and a preset keyword is higher than this threshold, the character string is determined to match the preset keyword.
  • the keyword matching device 630 may calculate the matching degree between the character string and one of the at least one preset keyword, and determine that the character string matches the preset keyword if the matching degree is higher than the matching degree threshold.
  • the matching degree threshold can be set as needed. For example, the matching degree threshold can be set by weighing the character recognition accuracy and the false positive rate. The higher the matching degree threshold, the higher the required character recognition accuracy but the lower the false positive rate; the lower the matching degree threshold, the higher the false positive rate but the lower the required character recognition accuracy.
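As a concrete sketch of such a matching degree, a similarity ratio can be computed between the recognized string and a keyword; difflib is just one possible measure, and the 0.6 default mirrors the 60% threshold used in the examples of this description:

```python
from difflib import SequenceMatcher

def matches(candidate, keyword, threshold=0.6):
    """Return (degree, matched): degree in [0, 1], matched if above threshold.

    A candidate that fully contains the keyword counts as a complete match.
    """
    if keyword in candidate:
        return 1.0, True
    degree = SequenceMatcher(None, candidate, keyword).ratio()
    return degree, degree > threshold
```

On the example from this description, matches("红星洗衣店", "红星干洗店") gives a degree of 0.8, above the 60% threshold, while a string fully containing "洗衣店" counts as a complete match.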
  • an edit distance threshold may be preset; the edit distance between the character string and one of the at least one preset keyword may be calculated, and the character string is determined to match the preset keyword when the edit distance is lower than the edit distance threshold.
  • the predetermined edit distance threshold can be set as needed.
  • the predetermined edit distance threshold may be set by weighing the character recognition accuracy and the false positive rate. The higher the predetermined edit distance threshold, the lower the required character recognition accuracy but the higher the false positive rate; the lower the predetermined edit distance threshold, the lower the false positive rate but the higher the required character recognition accuracy.
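The edit-distance variant can be sketched with the standard Levenshtein dynamic program; treating a distance at or below the threshold as a match follows the examples in this description, where a threshold of 1 tolerates one misrecognized character:

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

def matches_by_edit_distance(candidate, keyword, threshold=1):
    """Match when the edit distance does not exceed the threshold."""
    return edit_distance(candidate, keyword) <= threshold
```

edit_distance("CA5856", "CA3856") is 1, so the misrecognized flight number still matches its keyword when the threshold is 1, but not when it is 0.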
  • the notification device 640 is configured to generate and output a matching notification message in the case where the character recognized from the image to be recognized matches the preset keyword.
  • the matching notification message may be output in visual information, audio information, tactile information, or the like.
  • the matching notification message may indicate that a character matching the preset keyword has been found in the image to be recognized.
  • the notification device 640 may be a vibration device that generates a vibration if the recognized character matches a preset keyword; the user perceives the vibration and can thereby determine that a character matching the preset keyword is present in the currently captured image.
  • the notification device 640 may be a voice interaction device (including an audio output device) that outputs audio if the recognized character matches the preset keyword, to notify the user that a character matching the preset keyword is present in the currently captured image.
  • the notification device 640 may be a display device that displays visual cue information if the recognized character matches the preset keyword, such as blinking the displayed image to be recognized, showing an alarm indicator, or blinking the alarm indicator, to notify the user that a character matching the preset keyword is present in the currently captured image.
  • the matching notification message may not only indicate that a character matching the preset keyword has been found in the image to be recognized, but may further indicate the position, in the image to be recognized, of the found character matching the preset keyword.
  • the notification device 640 may be an audio output device that outputs audio when the recognized character matches the preset keyword, to prompt the user with the position, in the recognized image, of the found character matching the preset keyword.
  • the notification device 640 may be a display device that displays video prompt information when the recognized characters match the preset keywords, to prompt the user with the position, in the image to be recognized, of the found character matching the preset keyword.
  • the video cue information may be superimposed and displayed on the identified image.
  • when the keyword notification device is a wearable device, the image capture device captures the image to be recognized in real time, and the notification device outputs the matching notification message in real time.
  • in the case where the character recognition based keyword notification device 600 is a mobile device such as a smartphone or tablet, the image to be recognized may be captured in real time by its built-in image capture device, and the video indication may be superimposed in real time on the currently captured image (i.e., the image to be recognized) on the display screen of the mobile device, at or near the position in the image of the character string matching the preset keyword.
  • in the case where the character recognition based keyword notification device 600 is a glasses-type wearable device, the notification device 640 may be an augmented reality display lens of the wearable device; the image to be recognized may be captured in real time by the image capture device of the wearable device, and the video indication may be displayed in real time on the augmented reality lens at or near the position, in the image to be recognized, of the character string matching the preset keyword, thereby notifying the user of the keyword through augmented reality technology.
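The superimposed video indication amounts to drawing a box outline at the matched string's position. A real implementation would draw on the camera frame with an image library; the pure-Python sketch below shows the idea on a 2D grid of pixels:

```python
def draw_box(image, x, y, w, h, value=1):
    """Mark the outline of a w-by-h box with top-left (x, y) on a 2D pixel grid."""
    for dx in range(w):
        image[y][x + dx] = value          # top edge
        image[y + h - 1][x + dx] = value  # bottom edge
    for dy in range(h):
        image[y + dy][x] = value          # left edge
        image[y + dy][x + w - 1] = value  # right edge
    return image
```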
  • the storage device 650 is configured to store the preset keyword and the image to be recognized, and may also store the matching degree threshold and/or the edit distance threshold. Moreover, the storage device 650 is further configured to store computer program code for implementing a character recognition based keyword notification method in accordance with an embodiment of the present disclosure.
  • the character recognition based keyword notification device 600 may further include a feedback device (not shown) for receiving user feedback for the matching notification message.
  • the user feedback may include ignoring the matching notification message, decreasing the matching degree threshold, increasing the matching degree threshold, adding a preset keyword, modifying a certain preset keyword, deleting a certain preset keyword, or filtering a certain preset keyword.
  • the feedback device may be a touch detection device, a voice detection device, or the like.
  • the speech detection device and the speech output device may be integrated together and are generally referred to as a speech interaction device, and the touch detection device and the display device may also be integrated together and collectively referred to as a video interaction device.
  • the character recognition based keyword notification apparatus 600 may further include adjustment means (not shown) for adjusting, in real time according to the user feedback, the preset keyword and/or the parameters used for matching judgment.
  • adjusting the preset keyword may include changing a preset keyword list, and changing the preset keyword list may include adding a preset keyword, modifying a certain preset keyword, or deleting a certain preset keyword.
  • adjusting the preset keyword may further include filtering, in real time, the preset keywords used in the matching judgment when the matching notification message is generated.
  • the parameters used for matching judgment may include the matching degree threshold or the edit distance threshold.
  • a computer program product comprising a computer-readable storage medium on which computer program instructions are stored.
  • the computer program instructions, when executed by a computer, may implement a character recognition based keyword notification method according to an embodiment of the present disclosure, and/or may implement all or part of the functions of the character recognition device, the keyword matching device, the notification device, and the adjustment device in a character recognition based keyword notification device according to an embodiment of the present disclosure.
  • when the optical character recognition result matches the target keyword, the user is prompted that the target keyword has been found. Since a captured image can be optically recognized on the electronic terminal at a real-time processing speed of, for example, one frame per second, the electronic terminal can perform optical character recognition on the currently captured image in real time while capturing images, and when the optical character recognition result matches the target keyword, notify the user in real time that the target keyword has been found, thereby advantageously using OCR technology to assist the user in character discovery.
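Taken together, the real-time behaviour described here is a capture-recognize-match-notify loop at roughly one frame per second. In the sketch below, capture_frame, recognize_strings, and notify stand in for the device's camera, OCR engine, and notification hardware; they are placeholders, not APIs defined by the disclosure:

```python
import time

def keyword_notification_loop(capture_frame, recognize_strings, notify,
                              keywords, period=1.0, frames=None):
    """Run the capture -> OCR -> match -> notify cycle.

    capture_frame():    returns the current image to be recognized
    recognize_strings:  image -> list of (string, position) results
    notify(kw, pos):    outputs the matching notification message
    frames:             stop after this many frames (None = run forever)
    """
    n = 0
    while frames is None or n < frames:
        image = capture_frame()
        for text, pos in recognize_strings(image):
            for kw in keywords:
                if kw in text:  # exact containment for brevity
                    notify(kw, pos)
        n += 1
        if frames is None or n < frames:
            time.sleep(period)  # ~one frame per second
```

Substring containment is used for brevity; a matching-degree or edit-distance test could be dropped in instead.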

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Character Discrimination (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A character recognition based keyword notification method and device, and a computer program product, belonging to the technical field of information discovery and prompting. The character recognition based keyword notification method comprises: capturing an image to be recognized; performing character recognition in the image to be recognized; and, in the case where a character recognized from the image to be recognized matches a preset keyword, generating and outputting a matching notification message. By presetting target keywords and filtering the character recognition results based on the target keywords, the user can be prompted that a target keyword has been found when a character recognition result matches it.

Description

基于字符识别的关键词通知方法及设备、计算机程序产品 技术领域
本公开涉及信息发现与提示技术领域,更具体地涉及一种基于字符识别的关键词通知方法及设备、以及计算机程序产品。
背景技术
通过采用光学字符识别(OCR)技术,可以从图像(包括图片和视频)中识别出该图像中包含的字符或文字。目前,OCR技术可以在移动终端上运行,所述移动终端可以包括智能手机、平板电脑、穿戴设备等。随着各种移动终端处理能力的大幅提升,在移动终端上已经能够实时地应用OCR技术,例如在移动终端上可以实现每秒完成一帧图像的光学字符识别(即每秒一帧的实时处理速度)。
通常,在用户发现感兴趣的字符时,该用户使用移动终端(诸如智能手机)拍摄包含该字符的待识别图像,并利用该移动终端上的OCR应用程序对该待识别图像进行光学字符识别。这样的光学字符识别过程显然依赖于用户首先发现目标识别字符,并且用户向该移动终端发出显式指令要求对目标识别字符进行光学字符识别,然而,这样的光学字符识别过程在用户尚未发现感兴趣的字符的情况下无法帮助用户发现感兴趣的字符内容。
因此,需要一种能够帮助用户进行字符发现的技术。
发明内容
鉴于上述问题而提出了本公开。本公开实施例提供了一种基于字符识别的关键词通知方法及设备、以及计算机程序产品,其通过预先设定目标关键词,并且基于目标关键词来筛选字符识别结果,从而能够在字符识别结果与目标关键词匹配时向用户提示发现了目标关键词。
根据本公开实施例的一个方面,提供了一种基于字符识别的关键词通知方法,包括:拍摄待识别图像;在所述待识别图像中进行字符识别;以及在从所述待识别图像中识别出的字符与预设关键词匹配的情况下,产生并输出匹配通知消息。
根据本公开实施例的另一方面,提供了一种基于字符识别的关键词通知 设备,包括:图像采集装置,用于拍摄待识别图像;通知装置,用于输出匹配通知消息;一个或多个处理器;一个或多个存储器;存储在所述存储器中的计算机程序指令,在所述计算机程序指令被所述处理器运行时执行以下步骤:在所述待识别图像中进行字符识别;以及在从所述待识别图像中识别出的字符与预设关键词匹配的情况下,产生所述匹配通知消息。
根据本公开实施例的又一方面,提供了一种基于字符识别进行关键词通知的计算机程序产品,包括一个或多个计算机可读存储介质,所述计算机可读存储介质上存储了计算机程序指令,所述计算机程序指令在被计算机运行时执行以下步骤:在待识别图像中进行字符识别;以及在从所述待识别图像中识别出的字符与预设关键词匹配的情况下,产生所述匹配通知消息。
根据本公开实施例的基于字符识别的关键词通知方法及设备、以及计算机程序产品,通过预先设定目标关键词,并且基于目标关键词来筛选字符识别结果,能够在字符识别结果与目标关键词匹配时向用户提示发现了目标关键词。由于在电子终端上可以以例如每秒一帧的实时处理速度对拍摄图像进行光学字符识别,因此在电子终端实时拍摄图像时,电子终端可以实时地对当前拍摄图像进行光学字符识别,并且在光学字符识别结果与目标关键词匹配时可以实时地向用户通知发现了目标关键词,从而将OCR技术有利地用于帮助用户进行字符发现。
本公开的其它特征和优点将在随后的说明书中阐述,并且,部分地从说明书中变得显而易见,或者通过实施本公开而了解。本公开的目的和其他优点可通过在说明书、权利要求书以及附图中所特别指出的结构来实现和获得。
附图说明
通过结合附图对本公开实施例进行更详细的描述,本公开的上述以及其它目的、特征和优势将变得更加明显。附图用来提供对本公开实施例的进一步理解,并且构成说明书的一部分,与本公开实施例一起用于解释本公开,并不构成对本公开的限制。在附图中,相同的参考标号通常代表相同装置或步骤。
图1是用于实现本公开实施例的基于字符识别的关键词通知方法和设备的示例性电子终端的示意性框图;
图2是根据本公开实施例的基于字符识别的关键词通知方法的示意性流 程图;
图3是根据本公开实施例的待识别图像的示例;
图4A是根据本公开实施例的待识别图像的另一示例;
图4B是根据本公开实施例的在待识别图像上叠加视频提示的示意图;
图5A是根据本公开实施例的图像区域划分的示意图;
图5B是根据本公开实施例的图像的二维坐标系统的示意图;以及
图6是根据本公开实施例的基于字符识别的关键词通知设备的示意性框图。
具体实施方式
为了使得本公开的目的、技术方案和优点更为明显,下面将参照附图详细描述根据本公开的示例实施例。显然,所描述的实施例仅仅是本公开的一部分实施例,而不是本公开的全部实施例,应理解,本公开不受这里描述的示例实施例的限制。基于本公开中描述的本公开实施例,本领域技术人员在没有付出创造性劳动的情况下所得到的所有其它实施例都应落入本公开的保护范围之内。
首先,参照图1来描述用于实现本公开实施例的基于字符识别的关键词通知方法和设备的示例性电子终端100。
如图1所示,电子终端100包括一个或多个处理器102、一个或多个存储装置104、输入装置106、输出装置108、以及图像采集装置110,这些组件通过总线系统112和/或其它形式的连接机构(未示出)互连。应当注意,图1所示的电子终端100的组件和结构只是示例性的,而非限制性的,根据需要,所述电子终端100也可以具有其他组件和结构。
所述处理器102可以是中央处理单元(CPU)或者具有数据处理能力和/或指令执行能力的其它形式的处理单元,并且可以控制所述电子终端100中的其它组件以执行期望的功能。
所述存储装置104可以包括一个或多个计算机程序产品,所述计算机程序产品可以包括各种形式的计算机可读存储介质,例如易失性存储器和/或非易失性存储器。所述易失性存储器例如可以包括随机存取存储器(RAM)和/或高速缓冲存储器(cache)等。所述非易失性存储器例如可以包括只读存储器(ROM)、硬盘、闪存等。在所述计算机可读存储介质上可以存储一个或 多个计算机程序指令,处理器102可以运行所述程序指令,以实现下文所述的本发明实施例中(由处理器实现)的功能以及/或者其它期望的功能。在所述计算机可读存储介质中还可以存储各种应用程序和各种数据,例如所述图像采集装置110采集的图像数据、预设(目标)关键词等以及所述应用程序使用和/或产生的各种数据等。
所述输入装置106可以是用户用来输入指令的装置,并且可以包括键盘、鼠标、麦克风和触摸屏等中的一个或多个。所述指令例如是使用所述电子终端100进行目标关键词发现的指令,或者是使用所述图像采集装置110拍摄待识别图像的指令,或者是启动光学字符识别(OCR)应用程序的指令。
所述输出装置108可以向外部(例如用户)输出各种信息(例如图像、声音或振动),并且可以包括显示器、扬声器、振动发生器等中的一个或多个。
所述图像采集装置110可以拍摄用户期望的图像(例如照片、视频等),并且将所拍摄的图像存储在所述存储装置104中以供其它组件使用。
优选地,用于实现本公开实施例的基于字符识别的关键词通知方法和设备的示例性电子终端100可以为诸如智能手机、平板电脑、穿戴设备等移动终端。然而,本公开不限于此,电子终端100也可以是固定的电子终端,并且电子终端100中的图像采集装置110可以与处理器102安装在一起,或者可以与处理器102分别安装在距离较远的位置。在此情况下,电子终端100中的图像采集装置110例如可以是安装于广场、会场等场所内。
下面,将以移动设备为例来描述根据本公开实施例的基于字符识别的关键词通知方法及设备。所述移动设备可以包括智能手机、平板电脑、穿戴设备等。
图2是根据本公开实施例的基于字符识别的关键词通知方法的示意性流程图。
如图2所示,在步骤S210,拍摄待识别图像。具体地,可以利用如图1所示的用于实现本公开实施例的基于字符识别的关键词通知方法的电子终端100中的图像采集装置110或者独立于所述电子终端100的可以向所述电子终端100传送图像的其它图像采集装置,在用户所在的位置处拍摄用户选定场景的图像作为待识别图像。
所述待识别图像可以是照片，也可以是视频中的一帧。所述照片可以包括一幅或多幅单一场景的照片，也可以是全景照片。具体地，可以利用所述电子终端中的图像采集装置拍摄用户选定场景的一张照片，或拍摄用户选定场景的一段视频，或者以低于预定移动速度阈值的速度改变所述图像采集装置的拍摄方向或取景范围从而拍摄更大范围的用户选定场景的视频。所述待识别图像可以反映用户所处的环境，并且相应地可以包含用户所处环境中存在的字符，所述字符可以包括但不限于建筑物标识、店铺标识、街道标识、广告牌字符等。
图3示出了在用户所在的位置处拍摄的待识别图像的示例,在该示例中,所述图像是照片,用户位于红星洗衣店附近,该照片包含“红星洗衣店”字符串,并且还包括“理店”、“店”、“十三店”、“Midea”、“美的”等字符串。
图4A示出了在用户所在的位置处拍摄的照片的另一示例,在该示例中,所述图像是照片,用户站在机场的航班信息显示屏前希望找到他即将乘坐的航班CA3856的信息,该照片包含了“航班号”、“计划”、“终点站/经停站”、“柜台号”和“办理等级时间”等字段。
在步骤S220,在所述待识别图像中进行字符识别。在获得所拍摄的待识别图像之后,可以识别所述待识别图像中出现的字符。
可选地,在识别所述待识别图像中的字符之前,可以对所述待识别图像进行预处理,以利于所述字符识别。例如,在所述图像是照片的情况下,所述预处理可以包括对照片进行缩放,在所述图像是视频的情况下,所述预处理可以包括提取视频的关键帧。
根据本公开实施例,从所述待识别图像中识别出的字符可以包括至少一个字符,并且字符识别结果可以包括所述至少一个字符以及每个字符的位置。例如,在如图3所示的待识别图像中,从所述待识别图像可以识别出“红”、“星”、“洗”、“衣”、“店”、“理”、“店”、“十”、“三”、“店”、“M”、“i”、“d”、“e”、“a”、“美”、“的”等字符。
更进一步,对于所述字符识别结果中的所述至少一个字符,可以将其按照行或列的顺序组织为字符串。具体地,在对所述待识别图像进行字符识别时,可以按照所述至少一个字符中每个字符在所述待识别图像中的位置,将所述至少一个字符进行组合以形成至少一个字符串。例如,可以根据包含在所述字符识别结果中的所述至少一个字符中每个字符的位置,将所述至少一个字符排列成一行或多行,然后将所述字符按照行的顺序组织为至少一个字 符串;或者,可以根据包含在所述字符识别结果中的所述至少一个字符中每个字符的位置,将所述至少一个字符排列成一列或多列,然后将所述字符按照列的顺序组织为至少一个字符串。在本公开实施例中,可以根据用户所在国家和/或地区的书写习惯,确定应当将所识别的至少一个字符排列成行还是列。在此情况下,根据本公开实施例,从所述待识别图像中识别出的字符串可以包括至少一个字符串,并且字符串识别结果可以包括所述至少一个字符串以及每个字符串的位置。仍针对如图3所示的待识别图像,可以识别出“红星洗衣店”、“理店”、“店”、“十三店”、“Midea”、“美的”等字符串。
具体地,可以使用光学字符识别(OCR)技术来识别所述待识别图像中的字符。可以扫描所述待识别图像以检测所述待识别图像中所有可能含有字符的字符框的位置,然后可以识别每个字符框中的字符并将每个字符框中的内容作为一个字符串,并且产生字符串识别结果。所述字符串识别结果可以包括所识别的字符串以及所述字符串的位置。
例如,可以将所述待识别图像划分为如图5A所示的9个块,并且可以采用数值1-9来表示所述字符串的位置,“1”表示所述字符串位于所述待识别图像的左上块中,“2”表示所述字符串位于所述待识别图像的中上块中,“3”表示所述字符串位于所述待识别图像的右上块中,依此类推。当然,根据实际需要,可以将所述待识别图像划分为更少或更多块,例如4块、16块等。
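The 9-block position encoding of Fig. 5A can be sketched as a simple mapping from a point to a block number; this is an illustrative sketch, since the disclosure does not prescribe an implementation:

```python
def block_index(x, y, width, height):
    """Map a point in a width-by-height image to its 3x3 block number (1-9).

    Block 1 is the upper-left block, 2 the upper-middle, 3 the upper-right,
    and so on row by row, matching the numbering of Fig. 5A.
    """
    col = min(3 * x // width, 2)    # 0, 1 or 2
    row = min(3 * y // height, 2)
    return 3 * row + col + 1
```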
再例如,可以以所述待识别图像的中心点、左上顶点、左下顶点、右上顶点、右下顶点之一为原点建立二维坐标系,并且对于每个像素点而言,以该像素点与所述原点之间的像素点数量作为该像素点的二维坐标值。
光学字符识别(OCR)技术已经是本领域中公知的技术,在本公开中不对现有的OCR技术展开描述。此外,应了解,本公开实施例不限于采用现有的OCR技术进行字符识别,而且应涵盖采用将来开发的字符识别技术进行字符识别并继而进行关键词通知的任何应用。
可选地,还可以分析所述字符串识别结果中的每个字符串,以确定一个或多个具有具体含义的词语。具体地,可以使用本领域公知的方法对所述字符串进行分析,并且对所述字符串进行分词,从而将所述字符串划分为一个或多个具有具体含义的词语。对所述字符串进行分析和分词的方法是本领域 公知的,在这里为了简单起见而省略其详细描述。在图3所示的示例中,通过上述分析和分词操作,例如可以将“红星洗衣店”字符串划分为以下词语“红星”、“洗衣”、“店”、“洗衣店”等子字符串。
在步骤S230,判断从所述待识别图像中识别出的字符与预设关键词是否匹配。所述预设关键词可以包括至少一个预设关键词。具体地,判断从所述待识别图像中识别出的字符串与所述至少一个预设关键词之一是否匹配。
具体地,可以预先设置匹配程度阈值,对于所述至少一个字符串中的每个字符串而言,确定该字符串与所述至少一个预设关键词之一的匹配程度,以及在所述匹配程度高于匹配程度阈值时确定该字符串与所述预设关键词匹配。例如,在该字符串完全包含某个预设关键词的情况下,可以确定该字符串与所述预设关键词完全匹配。
例如,一个预设关键词为“洗衣店”,在如图3所示的待识别图像中识别出的字符串包括“红星洗衣店”,因此识别出的字符串“红星洗衣店”完全包含预设关键词“洗衣店”,可以确定该字符串与该预设关键词匹配。在此情况下,可以将该字符串“红星洗衣店”在所述待识别图像中的位置作为向用户提示的字符串位置。
可选地,如上所述,在将如图3所示的待识别图像中识别出的字符串“红星洗衣店”划分为“红星”、“洗衣”、“洗衣店”子字符串的情况下,“洗衣店”子字符串与预设关键词“洗衣店”完全匹配。在此情况下,可以将该字符串“洗衣店”在所述待识别图像中的位置作为向用户提示的字符串位置。
再例如，一个预设关键词为“红星干洗店”，匹配程度阈值为60%，在如图3所示的待识别图像中识别出的字符串包括“红星洗衣店”，识别出的字符串“红星洗衣店”与预设关键词“红星干洗店”不能完全匹配，例如可以计算其匹配程度为70%或80%，该匹配程度高于所述匹配程度阈值60%，则可以确定该字符串与该预设关键词匹配。在此情况下，可以将该字符串“红星洗衣店”在所述待识别图像中的位置作为向用户提示的字符串位置。
再例如，一个预设关键词为“CA3856”。在如图4A所示的待识别图像中存在字符串“CA3856”。然而，由于字符识别算法的错误将所述待识别图像中的“CA3856”识别为字符串“CA5856”，所识别到的字符串“CA5856”与所述预设关键词“CA3856”没有完全匹配，例如根据预定的匹配程度计算算法确定所识别到的字符串“CA5856”与所述预设关键词“CA3856”之间的匹配程度为85%。在将匹配程度阈值设置为100%时，根据本公开实施例的基于字符识别的关键词通知方法确定所识别到的字符串与所述预设关键词不匹配；在将匹配程度阈值设置为80%时，根据本公开实施例的基于字符识别的关键词通知方法可以确定所识别到的字符串与所述预设关键词匹配。
因此,可以权衡字符识别精度以及误报率来设置所述匹配程度阈值。所述匹配程度阈值越高,所要求的字符识别精度越高但误报率越低;所述匹配程度阈值越低,误报率越高但所要求的字符识别精度越低。例如,在如图4A所示的待识别图像中,一个预设关键词为“CA3856”,预设的匹配程度阈值为50%,图4A中的字符串“CA3856”和“CA3448”都可能被确定为与预设关键词匹配,这显然提高了误报率。
可选地,可以计算该字符串与所述至少一个预设关键词之一的编辑距离(edit distance),以及在所述编辑距离低于预定编辑距离阈值时确定该字符串与所述预设关键词匹配。编辑距离可以表示由第一字符串转成第二字符串所需的最少编辑操作次数,许可的编辑操作例如可以包括将第一字符串中的一个字符替换成另一个字符,在第一字符串中插入一个字符,以及在第一字符串中删除一个字符。在此情况下,在该字符串与所述至少一个关键词之一的编辑距离为零时,匹配程度为完全匹配;而该字符串与所述至少一个关键词之一的编辑距离越大,匹配程度越小。
可以根据需要设置所述预定编辑距离阈值。例如,可以权衡字符识别精度以及误报率来设置所述预定编辑距离阈值。所述预定编辑距离阈值越高,所要求的字符识别精度越低但误报率越高;所述预定编辑距离阈值越低,误报率越低但所要求的字符识别精度越高。具体地,在将所述预定编辑距离阈值设置为0时,表示要求完全匹配,从而降低误报率,在此情况下,如果字符识别有误,则无法向用户发出提醒;在将所述预定编辑距离阈值设置为1时,表示该字符串与特定关键词可以有一个字符的区别,例如比特定关键词多一个字符、少一个字符或者有一个字符不同。
例如,一个预设关键词为“CA3856”。在如图4A所示的待识别图像中存在字符串“CA3856”。然而,由于字符识别算法的错误将字符串“CA3856”识别为“CA5856”,所识别到的字符串“CA5856”与所述预设关键词“CA3856”没有完全匹配。具体地,所识别到的字符串“CA5856”与所述预设关键词“CA3856”有一个字符不同并且例如根据预定的编辑距离计算算法计算得到 其编辑距离为1。在所述预定编辑距离阈值被设置为0时,则确定所识别到的字符串“CA5856”与所述预设关键词“CA3856”不匹配;而在将所述预定编辑距离阈值设置为1或者大于1时,可以确定识别到的字符串“CA5856”与目标关键词“CA3856”匹配。
在步骤S240,在从所述待识别图像中识别出的字符与预设关键词匹配的情况下,产生并输出匹配通知消息。可以以视觉信息、音频信息、触觉信息等来输出匹配通知消息。
所述匹配通知消息可以指示在所述待识别图像中发现了与所述预设关键词匹配的字符。例如,在所识别出的字符与预设关键词匹配的情况下,所述电子终端可以产生振动,用户感知振动,由此可以确定在当前拍摄图像中存在与预设关键词匹配的字符。替代地,在所识别出的字符与预设关键词匹配的情况下,所述电子终端可以输出音频,以向用户通知在当前拍摄图像中存在与预设关键词匹配的字符。替代地,在所识别出的字符与预设关键词匹配的情况下,所述电子终端可以在其显示屏幕上输出视频信息,例如图像闪烁、警报标识、警报标识闪烁等,以向用户通知在当前拍摄图像中存在与预设关键词匹配的字符。
可选地,所述匹配通知消息不仅可以指示在所述待识别图像中发现了与所述预设关键词匹配的字符,而且还更进一步指示所发现的与所述预设关键词匹配的字符在所述待识别图像中的位置。例如,在所识别出的字符与预设关键词匹配的情况下,所述电子终端可以输出音频,以向用户提示所发现的与所述预设关键词匹配的字符在所识别的图像中的位置。替代地,在所识别出的字符与预设关键词匹配的情况下,所述电子终端可以输出视频提示,以向用户提示所发现的与所述预设关键词匹配的字符在所述待识别图像中的位置。可选地,所述视频提示可以叠加在所识别的图像上显示。
如前所述,可以以图5A中的图像块的方式来指示所发现的与所述预设关键词匹配的字符串在所述待识别图像中的位置,或者可以以二维坐标方式确定所发现的与所述预设关键词匹配的字符串在所述待识别图像中的位置。
如图4B所示,所述视频指示可以为方框,该方框叠加在所述图像上以标识出所发现的与所述预设关键词匹配的字符在所述图像中的位置。
例如,在所述移动设备为智能手机和平板电脑的情况下,可以通过其内置的图像采集装置实时地拍摄所述待识别图像,并且可以实时地在所述移动 设备的显示屏上在与所述预设关键词匹配的字符串在所述待识别图像中的位置处或附近将所述视频指示叠加在当前拍摄图像(即所述待识别图像)上显示;在所述移动设备为眼镜式穿戴设备的情况下,可以通过所述眼镜式穿戴设备中的图像采集装置实时地拍摄所述待识别图像,并且可以实时地在所述眼镜式穿戴设备的增强现实显示镜片上在与所述预设关键词匹配的字符串在所述待识别图像中的位置处或附近显示所述视频指示,从而通过增强现实技术来向用户进行关键词通知。
根据本公开实施例,通过实时地拍摄待识别图像,实时地对所述待识别图像进行字符识别,并且在所述待识别图像中发现了目标关键词的情况下实时地向用户进行通知。因此,根据本公开实施例的基于字符识别的关键词通知方法基于实时地拍摄的待识别图像,能够实时地进行目标关键词发现与通知。
返回图2,可选地,在步骤S250,可以接收对于所述匹配通知消息的用户反馈。所述用户反馈可以包括忽略所述匹配通知消息,减小匹配程度匹配阈值,增大匹配程度匹配阈值,增加预设关键词、修改某个预设关键词、删除某个预设关键词、或者过滤某个预设关键词。
然后，在步骤S260，根据所述用户反馈，可以实时地调节预设关键词以及/或者用于匹配判断的参数。其中，所述调节所述预设关键词可以包括改变预设关键词列表，并且所述改变预设关键词列表可以包括增加预设关键词、修改某个预设关键词、或者删除某个预设关键词。此外，所述调节所述预设关键词还可以包括实时地过滤在产生所述匹配通知消息时匹配判断所使用的预设关键词。所述用于匹配判断的参数可以包括所述匹配程度阈值、或者所述编辑距离阈值。
在图2中用虚线示出步骤S250和S260以便表示其与步骤S210-S240之间的区别,步骤S210-S240是根据本公开实施例的基于字符识别的关键词通知方法的必须步骤,而步骤S250和S260则是可选步骤。
下面,将参考图6描述根据本公开实施例的基于字符识别的关键词通知设备。该关键词通知设备可以是用户携带的移动设备,并且可以执行上述方法。由于该关键词通知设备执行的各个操作的细节与在上文中针对图2描述的方法基本相同,因此为了避免重复,在下文中仅对所述关键词通知设备进行简要的描述,而省略对相同细节的描述。
如图6所示,根据本公开实施例的基于字符识别的关键词通知设备600包括图像采集装置610、字符识别装置620、关键词匹配装置630、通知装置640、以及存储装置650。图像采集装置610可以由图1所示的图像采集装置110实现,所述字符识别装置620以及关键词匹配装置630可以由图1所示的处理器102实现,并且所述通知装置640的一部分也可以由图1所述处理器102实现,所述存储装置650可以由图1所示的存储装置104实现。
所述图像采集装置610可以拍摄待识别图像，具体地可以用于在用户所在的位置处拍摄用户选定场景的图像作为待识别图像。如上文所述，所述待识别图像可以是照片，也可以是视频中的一帧。所述照片可以包括一幅或多幅单一场景的照片，也可以是全景照片。具体地，所述图像采集装置610可以拍摄用户选定场景的一张照片，或拍摄用户选定场景的一段视频，或者以低于预定移动速度阈值的速度改变所述图像采集装置的拍摄方向或取景范围从而拍摄更大范围的用户选定场景的视频。所述待识别图像可以反映用户所处的环境，并且相应地可以包含用户所处环境中存在的字符，所述字符可以包括但不限于建筑物标识、店铺标识、街道标识、广告牌字符等。当然，也可以利用其它拍摄设备拍摄所述待识别图像，并且将拍摄的图像发送给所述关键词通知设备600，在此情况下，可以省略图像采集设备610。
字符识别装置620可以对所拍摄的待识别图像进行字符识别，以识别出所述待识别图像中的字符。可选地，所述字符识别装置620在对所述待识别图像进行字符识别之前，可以对所述待识别图像进行预处理，以利于所述字符识别。例如，在所述图像是照片的情况下，所述预处理可以包括对照片进行缩放，在所述图像是视频的情况下，所述预处理可以包括提取视频的关键帧。
根据本公开实施例,从所述待识别图像中识别出的字符可以包括至少一个字符,并且字符识别结果可以包括所述至少一个字符以及每个字符的位置。对于所述字符识别结果中的所述至少一个字符,可以将其按照行或列的顺序组织为字符串。具体地,在对所述待识别图像进行字符识别时,可以按照所述至少一个字符中每个字符在所述待识别图像中的位置,将所述至少一个字符进行组合以形成至少一个字符串。例如,可以根据包含在所述字符识别结果中的所述至少一个字符中每个字符的位置,将所述至少一个字符排列成一行或多行,然后将所述字符按照行的顺序组织为至少一个字符串;或者,可 以根据包含在所述字符识别结果中的所述至少一个字符中每个字符的位置,将所述至少一个字符排列成一列或多列,然后将所述字符按照列的顺序组织为至少一个字符串。根据本公开实施例,从所述待识别图像中识别出的字符串可以包括至少一个字符串,并且字符串识别结果可以包括所述至少一个字符串以及每个字符串的位置。
具体地,在本公开实施例中,所述字符识别装置620可以使用光学字符识别(OCR)技术来识别所述待识别图像中的字符。具体地,所述字符识别装置620可以扫描所述待识别图像以检测所述待识别图像中所有可能含有字符的字符框的位置,然后可以识别每个字符框中的字符并将每个字符框中的内容作为一个字符串,并且产生字符串识别结果。如上文所述,所述字符串识别结果可以包括所识别的字符串以及所述字符串的位置。
可选地,所述字符识别装置620还可以分析所述字符串识别结果中的每个字符串,以确定一个或多个具有具体含义的词语。具体地,可以使用本领域公知的方法对所述字符串进行分析,并且对所述字符串进行分词,从而将所述字符串划分为一个或多个具有具体含义的词语。
所述关键词匹配装置630可以判断从所述待识别图像中识别出的字符与预设关键词是否匹配。所述预设关键词可以包括至少一个预设关键词。具体地,对于所述字符识别装置620识别出的至少一个字符串中的每个字符串,所述关键词匹配装置630可以判断该字符串是否与所述至少一个预设关键词之一匹配,并且在判断该字符串与所述至少一个预设关键词之一匹配的情况下,确定该字符串与所述预设关键词匹配。
可选地,可以预先设置匹配程度阈值,在一个字符串与一个预设关键词之间的匹配程度高于该匹配程度阈值时,确定该字符串与所述预设关键词匹配。具体地,对于所述字符识别装置620识别出的至少一个字符串中的每个字符串,所述关键词匹配装置630可以计算该字符串与所述至少一个预设关键词之一的匹配程度,并且在所述匹配程度高于所述匹配程度阈值的情况下,确定该字符串与所述预设关键词匹配。可以根据需要设置所述匹配程度阈值。例如,可以权衡字符识别精度以及误报率来设置所述匹配程度阈值。所述匹配程度阈值越高,所要求的字符识别精度越高但误报率越低;所述匹配程度阈值越低,误报率越高但所要求的字符识别精度越低。
可选地，可以预先设置编辑距离阈值，可以计算该字符串与所述至少一个预设关键词之一的编辑距离，以及在所述编辑距离低于所述编辑距离阈值时确定该字符串与所述预设关键词匹配。可以根据需要设置所述预定编辑距离阈值。例如，可以权衡字符识别精度以及误报率来设置所述预定编辑距离阈值。所述预定编辑距离阈值越高，所要求的字符识别精度越低但误报率越高；所述预定编辑距离阈值越低，误报率越低但所要求的字符识别精度越高。
通知装置640用于在从所述待识别图像中识别出的字符与预设关键词匹配的情况下,产生并输出匹配通知消息。可以以视觉信息、音频信息、触觉信息等来输出匹配通知消息。
所述匹配通知消息可以指示在所述待识别图像中发现了与所述预设关键词匹配的字符。例如,所述通知装置640可以为振动装置,其在所识别出的字符与预设关键词匹配的情况下可以产生振动,用户感知振动,由此可以确定在当前拍摄图像中存在与预设关键词匹配的字符。替代地,所述通知装置640可以为语音交互装置(包括音频输出装置),其在所识别出的字符与预设关键词匹配的情况下可以输出音频,以向用户通知在当前拍摄图像中存在与预设关键词匹配的字符。替代地,所述通知装置640可以为显示装置,其在所识别出的字符与预设关键词匹配的情况下显示视频提示信息,例如在显示装置上显示的所述待识别图像闪烁、警报标识、警报标识闪烁等,以向用户通知在当前拍摄图像中存在与预设关键词匹配的字符。
可选地，所述匹配通知消息不仅可以指示在所述待识别图像中发现了与所述预设关键词匹配的字符，而且还更进一步指示所发现的与所述预设关键词匹配的字符在所述待识别图像中的位置。例如，所述通知装置640可以为音频输出装置，其在所识别出的字符与预设关键词匹配的情况下可以输出音频，以向用户提示所发现的与所述预设关键词匹配的字符在所识别的图像中的位置。替代地，所述通知装置640可以为显示装置，其在所识别出的字符与预设关键词匹配的情况下显示视频提示信息，以向用户提示所发现的与所述预设关键词匹配的字符在所述待识别图像中的位置。可选地，所述视频提示信息可以叠加在所识别的图像上显示。
根据本公开实施例,所述关键词通知设备为穿戴设备,所述图像采集装置实时地拍摄所述待识别图像;以及所述通知装置实时地输出所述匹配通知消息。
例如，在所述基于字符识别的关键词通知设备600为智能手机和平板电脑的情况下，可以通过其内置的图像采集装置实时地拍摄所述待识别图像，并且可以实时地在所述移动设备的显示屏上在与所述预设关键词匹配的字符串在所述待识别图像中的位置处或附近将所述视频指示叠加在当前拍摄图像（即所述待识别图像）上显示；在所述基于字符识别的关键词通知设备600为眼镜式穿戴设备的情况下，所述通知装置640可以为所述眼镜式穿戴设备中的增强现实显示镜片，可以通过所述眼镜式穿戴设备中的图像采集装置实时地拍摄所述待识别图像，并且可以实时地在所述眼镜式穿戴设备的增强现实显示镜片上在与所述预设关键词匹配的字符串在所述待识别图像中的位置处或附近显示所述视频指示，从而通过增强现实技术来向用户进行关键词通知。
所述存储装置650用于存储所述预设关键词以及所述待识别图像,并且还可以存储所述匹配程度阈值和/或所述编辑距离阈值。此外,所述存储装置650还用于存储用于实现根据本公开实施例的基于字符识别的关键词通知的方法的计算机程序代码。
此外,根据本公开实施例的基于字符识别的关键词通知设备600还可以包括反馈装置(未示出),用于接收对于所述匹配通知消息的用户反馈。所述用户反馈可以包括忽略所述匹配通知消息,减小匹配程度匹配阈值,增大匹配程度匹配阈值,增加预设关键词、修改某个预设关键词、删除某个预设关键词、或者过滤某个预设关键词。所述反馈设备可以是触摸检测装置、语音检测装置等。所述语音检测装置和所述语音输出装置可以集成在一起并且通称为语音交互装置,所述触摸检测装置和所述显示装置也可以集成在一起并且通称为视频交互装置。
此外，根据本公开实施例的基于字符识别的关键词通知设备600还可以包括调节装置（未示出），用于根据所述用户反馈实时地调节所述预设关键词以及/或者用于匹配判断的参数。其中，所述调节所述预设关键词可以包括改变预设关键词列表，并且所述改变预设关键词列表可以包括增加预设关键词、修改某个预设关键词、或者删除某个预设关键词。此外，所述调节所述预设关键词还可以包括实时地过滤在产生所述匹配通知消息时匹配判断所使用的预设关键词。所述用于匹配判断的参数可以包括所述匹配程度阈值、或者所述编辑距离阈值。
此外,根据本公开实施例,还提供了一种计算机程序产品,其包括计算 机可读存储介质,在所述计算机可读存储介质上存储了计算机程序指令。所述计算机程序指令在被计算机运行时可以实现根据本公开实施例的基于字符识别的关键词通知方法,并且/或者可以实现根据本公开实施例的基于字符识别的关键词通知设备中的字符识别装置、关键词匹配装置、通知装置、调节装置的全部或部分功能。
根据本公开实施例的基于字符识别的关键词通知方法及设备、以及计算机程序产品,通过预先设定目标关键词,并且基于目标关键词来筛选字符识别结果,能够在光学字符识别结果与目标关键词匹配时向用户提示发现了目标关键词。由于在电子终端上可以以例如每秒一帧的实时处理速度对拍摄图像进行光学字符识别,因此在电子终端实时拍摄图像时,电子终端可以实时地对当前拍摄图像进行光学字符识别,并且在光学字符识别结果与目标关键词匹配时可以实时地向用户通知发现了目标关键词,从而将OCR技术有利地用于帮助用户进行字符发现。
在上面详细描述的本公开的示例实施例仅仅是说明性的,而不是限制性的。本领域技术人员应该理解,在不脱离本公开的原理和精神的情况下,可对这些实施例进行各种修改,组合或子组合,并且这样的修改应落入本公开的范围内。

Claims (20)

  1. 一种基于字符识别的关键词通知方法,包括:
    拍摄待识别图像;
    在所述待识别图像中进行字符识别;以及
    在从所述待识别图像中识别出的字符与预设关键词匹配的情况下,产生并输出匹配通知消息。
  2. 如权利要求1所述的关键词通知方法,其中,
    通过穿戴设备中的图像采集装置实时地拍摄所述待识别图像;以及
    通过穿戴设备中的通知装置实时地输出所述匹配通知消息。
  3. 如权利要求2所述的关键词通知方法,其中,从所述待识别图像中识别出的字符包括至少一个字符,
    其中,在所述待识别图像中进行字符识别包括:按照所述至少一个字符中每个字符在所述待识别图像中的位置,将所述至少一个字符进行组合以形成至少一个字符串;以及
    其中,在从所述待识别图像中识别出的字符与预设关键词匹配的情况下产生并输出匹配通知消息包括:对于所述至少一个字符串中的每个字符串,确定该字符串与所述预设关键词是否匹配,并且在该字符串与所述预设关键词匹配的情况下,产生并输出匹配通知消息。
  4. 如权利要求3所述的关键词通知方法,其中,所述预设关键词包括至少一个关键词,
    其中,确定该字符串与所述预设关键词是否匹配包括:确定该字符串与所述至少一个关键词中之一的匹配程度,以及在所述匹配程度高于预定匹配程度阈值时确定该字符串与所述预设关键词匹配。
  5. 如权利要求4所述的关键词通知方法,其中,确定该字符串与所述至少一个关键词中之一的匹配程度,以及在所述匹配程度高于预定匹配程度阈值时确定该字符串与所述预设关键词匹配包括:
    计算该字符串与所述至少一个关键词中之一的编辑距离;以及
    在所述编辑距离低于预定编辑距离阈值时确定该字符串与所述预设关键词匹配,
    其中,在该字符串与所述至少一个关键词之一的编辑距离为零时,匹配 程度为完全匹配;而该字符串与所述至少一个关键词之一的编辑距离越大,匹配程度越小。
  6. 如权利要求5所述的关键词通知方法,其中,所述穿戴设备为眼镜式穿戴设备,并且所述通知装置为所述眼镜式穿戴设备中的增强现实显示镜片,
    其中,通过所述增强现实显示镜片实时地输出所述匹配通知消息,所述匹配通知消息指示与所述预设关键词匹配的字符串在所述待识别图像中的位置。
  7. 如权利要求5所述的关键词通知方法,其中,所述匹配通知装置为所述穿戴设备中的语音交互装置,
    其中,通过所述语音交互装置实时地输出所述匹配通知消息,所述匹配通知消息指示与所述预设关键词匹配的字符串在所述待识别图像中的位置。
  8. 如权利要求4所述的关键词通知方法,还包括:
    接收对于所述匹配通知消息的用户反馈;以及
    根据所述用户反馈,实时地调节所述预设关键词以及/或者用于匹配判断的参数。
  9. 一种基于光学字符识别的关键词通知设备,包括:
    图像采集装置,用于拍摄待识别图像;
    通知装置,用于输出匹配通知消息;
    一个或多个处理器;
    一个或多个存储器;以及
    存储在所述存储器中的计算机程序指令,在所述计算机程序指令被所述处理器运行时执行以下步骤:
    在所述待识别图像中进行字符识别;以及
    在从所述待识别图像中识别出的字符与预设关键词匹配的情况下,产生所述匹配通知消息。
  10. 如权利要求9所述的关键词通知设备,其中,所述关键词通知设备为穿戴设备,其中,
    所述图像采集装置实时地拍摄所述待识别图像;以及
    所述通知装置实时地输出所述匹配通知消息。
  11. 如权利要求10所述的关键词通知设备,其中,从所述待识别图像中识别出的字符包括至少一个字符,
    其中,在所述待识别图像中进行字符识别包括:按照所述至少一个字符中每个字符在所述待识别图像中的位置,将所述至少一个字符进行组合以形成至少一个字符串;以及
    其中,在从所述待识别图像中识别出的字符与预设关键词匹配的情况下产生匹配通知消息包括:对于所述至少一个字符串中的每个字符串,确定该字符串与所述预设关键词是否匹配,并且在该字符串与所述预设关键词匹配的情况下,产生匹配通知消息。
  12. 如权利要求11所述的关键词通知设备,其中,所述预设关键词包括至少一个关键词,
    其中,确定该字符串与所述预设关键词是否匹配包括:确定该字符串与所述至少一个关键词中之一的匹配程度,以及在所述匹配程度高于预定匹配程度阈值时确定该字符串与所述预设关键词匹配。
  13. 如权利要求12所述的关键词通知设备,其中,确定该字符串与所述至少一个关键词中之一的匹配程度,以及在所述匹配程度高于预定匹配程度阈值时确定该字符串与所述预设关键词匹配包括:
    计算该字符串与所述至少一个关键词中之一的编辑距离;以及
    在所述编辑距离低于预定编辑距离阈值时确定该字符串与所述预设关键词匹配,
    其中,在该字符串与所述至少一个关键词之一的编辑距离为零时,匹配程度为完全匹配;而该字符串与所述至少一个关键词之一的编辑距离越大,匹配程度越小。
  14. 如权利要求13所述的关键词通知设备,其中,所述穿戴设备为眼镜式穿戴设备,并且所述通知装置为所述眼镜式穿戴设备中的增强现实显示镜片,
    其中,通过所述增强现实显示镜片实时地输出所述匹配通知消息,所述匹配通知消息指示与所述预设关键词匹配的字符串在所述待识别图像中的位置。
  15. 如权利要求13所述的关键词通知设备,其中,所述匹配通知装置为所述穿戴设备中的语音交互装置,
    其中,通过所述语音交互装置实时地输出所述匹配通知消息,所述匹配通知消息指示与所述预设关键词匹配的字符串在所述待识别图像中的位置。
  16. 如权利要求12所述的关键词通知设备,还包括:
    反馈装置,用于接收对于所述匹配通知消息的用户反馈;以及
    调节装置,用于根据所述用户反馈,实时地调节所述预设关键词以及/或者用于匹配判断的参数。
  17. 一种用于基于字符识别进行关键词通知的计算机程序产品,包括一个或多个计算机可读存储介质,所述计算机可读存储介质上存储了计算机程序指令,所述计算机程序指令可由处理器执行以使得所述处理器:
    在待识别图像中进行字符识别;以及
    在从所述待识别图像中识别出的字符与预设关键词匹配的情况下,产生所述匹配通知消息。
  18. 如权利要求17所述的计算机程序产品,其中,所述待识别图像是由眼镜式穿戴设备的图像采集装置实时地拍摄的,所述匹配通知消息由眼镜式穿戴设备中的增强现实显示镜片实时地在与所述预设关键词匹配的字符串在所述待识别图像中的位置处或附近显示。
  19. 如权利要求17所述的计算机程序产品,其中,从所述待识别图像中识别出的字符包括至少一个字符,所述预设关键词包括至少一个关键词,
    其中,在所述待识别图像中进行字符识别包括:按照所述至少一个字符中每个字符在所述待识别图像中的位置,将所述至少一个字符进行组合以形成至少一个字符串;以及
    其中,在从所述待识别图像中识别出的字符与预设关键词匹配的情况下产生匹配通知消息包括:对于所述至少一个字符串中的每个字符串,确定该字符串与所述至少一个关键词中之一的匹配程度,以及在所述匹配程度高于预定匹配程度阈值时确定该字符串与所述预设关键词匹配,并且产生匹配通知消息。
  20. 如权利要求19所述的计算机程序产品,其中,所述计算机程序指令可由处理器执行还使得所述处理器:
    从反馈装置接收对于所述匹配通知消息的用户反馈;以及
    根据所述用户反馈,实时地调节所述预设关键词以及/或者用于匹配判断的参数。
PCT/CN2015/080127 2015-05-28 2015-05-28 基于字符识别的关键词通知方法及设备、计算机程序产品 WO2016187888A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201580000345.XA CN105518712B (zh) 2015-05-28 2015-05-28 基于字符识别的关键词通知方法及设备
PCT/CN2015/080127 WO2016187888A1 (zh) 2015-05-28 2015-05-28 基于字符识别的关键词通知方法及设备、计算机程序产品

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/080127 WO2016187888A1 (zh) 2015-05-28 2015-05-28 基于字符识别的关键词通知方法及设备、计算机程序产品

Publications (1)

Publication Number Publication Date
WO2016187888A1 true WO2016187888A1 (zh) 2016-12-01

Family

ID=55725026

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/080127 WO2016187888A1 (zh) 2015-05-28 2015-05-28 基于字符识别的关键词通知方法及设备、计算机程序产品

Country Status (2)

Country Link
CN (1) CN105518712B (zh)
WO (1) WO2016187888A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766552A (zh) * 2019-01-08 2019-05-17 安徽省泰岳祥升软件有限公司 一种基于公告信息的指代消解方法及装置
CN112449057A (zh) * 2019-08-15 2021-03-05 腾讯科技(深圳)有限公司 消息的提示方法和装置、存储介质及电子装置
CN113420549A (zh) * 2021-07-02 2021-09-21 珠海金山网络游戏科技有限公司 异常字符串识别方法及装置
CN116229973A (zh) * 2023-03-16 2023-06-06 润芯微科技(江苏)有限公司 一种基于ocr的可见即可说功能的实现方法

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203425B (zh) * 2016-07-01 2020-02-04 北京旷视科技有限公司 字符识别方法及装置
CN107798004B (zh) * 2016-08-29 2022-09-30 中兴通讯股份有限公司 关键词查找方法、装置及终端
CN106846008B (zh) * 2016-12-27 2021-06-29 北京五八信息技术有限公司 营业执照验证方法及装置
CN111191640B (zh) * 2017-03-30 2023-06-20 成都汇亿诺嘉文化传播有限公司 一种三维场景呈现方法、装置及系统
CN107357865A (zh) * 2017-06-30 2017-11-17 北京小米移动软件有限公司 信息提示方法及装置
CN107958212A (zh) * 2017-11-20 2018-04-24 珠海市魅族科技有限公司 一种信息提示方法、装置、计算机装置及计算机可读存储介质
CN109979012A (zh) * 2017-12-27 2019-07-05 北京亮亮视野科技有限公司 展示消息通知的方法及装置
CN108830126B (zh) * 2018-06-20 2021-08-27 上海凌脉网络科技股份有限公司 一种基于图像智能识别的产品营销互动方法
CN110059572B (zh) * 2019-03-22 2021-08-10 中国科学院自动化研究所 基于单字匹配的文档图像中文关键词检测方法、系统
CN112445450A (zh) * 2019-08-30 2021-03-05 比亚迪股份有限公司 基于语音控制终端的方法、装置、存储介质和电子设备
CN110992139B (zh) * 2019-11-28 2022-03-08 珠海采筑电子商务有限公司 竞标价格实现方法及相关产品
CN111563514B (zh) * 2020-05-14 2023-12-22 广东小天才科技有限公司 一种三维字符的显示方法及装置、电子设备、存储介质
CN112199545B (zh) * 2020-11-23 2021-09-07 湖南蚁坊软件股份有限公司 基于图片文字定位的关键词显示方法、装置及存储介质
CN113468023A (zh) * 2021-07-09 2021-10-01 中国电信股份有限公司 监控方法、装置、介质及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101520783A (zh) * 2008-02-29 2009-09-02 富士通株式会社 基于图像内容的关键词搜索方法和装置
CN101571921A (zh) * 2008-04-28 2009-11-04 富士通株式会社 关键字识别方法和装置
US20100008582A1 (en) * 2008-07-10 2010-01-14 Samsung Electronics Co., Ltd. Method for recognizing and translating characters in camera-based image
CN101751433A (zh) * 2008-12-22 2010-06-23 汉王科技股份有限公司 名片字符条目分类方法与装置
CN104090970A (zh) * 2014-07-17 2014-10-08 百度在线网络技术(北京)有限公司 兴趣点的展现方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103176999A (zh) * 2011-12-21 2013-06-26 上海博路信息技术有限公司 一种基于ocr的阅读辅助系统
KR102013329B1 (ko) * 2012-08-23 2019-08-22 삼성전자 주식회사 광학식 문자 판독기를 이용한 데이터 처리 방법 및 장치
CN103116752A (zh) * 2013-02-25 2013-05-22 新浪网技术(中国)有限公司 图片审核方法和系统

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766552A (zh) * 2019-01-08 2019-05-17 安徽省泰岳祥升软件有限公司 Coreference resolution method and device based on announcement information
CN109766552B (zh) * 2019-01-08 2023-01-31 安徽省泰岳祥升软件有限公司 Coreference resolution method and device based on announcement information
CN112449057A (zh) * 2019-08-15 2021-03-05 腾讯科技(深圳)有限公司 Message prompting method and apparatus, storage medium, and electronic apparatus
CN112449057B (zh) * 2019-08-15 2022-07-29 腾讯科技(深圳)有限公司 Message prompting method and apparatus, storage medium, and electronic apparatus
CN113420549A (zh) * 2021-07-02 2021-09-21 珠海金山网络游戏科技有限公司 Abnormal character string recognition method and device
CN116229973A (zh) * 2023-03-16 2023-06-06 润芯微科技(江苏)有限公司 OCR-based method for implementing a "what you see is what you can say" function
CN116229973B (zh) * 2023-03-16 2023-10-17 润芯微科技(江苏)有限公司 OCR-based method for implementing a "what you see is what you can say" function

Also Published As

Publication number Publication date
CN105518712A (zh) 2016-04-20
CN105518712B (zh) 2021-05-11

Similar Documents

Publication Publication Date Title
WO2016187888A1 (zh) Character-recognition-based keyword notification method and device, and computer program product
US10832069B2 (en) Living body detection method, electronic device and computer readable medium
WO2017185630A1 (zh) Emotion-recognition-based information recommendation method, apparatus, and electronic device
US9667860B2 (en) Photo composition and position guidance in a camera or augmented reality system
JP2023018021A (ja) Techniques for identifying skin color in images under uncontrolled lighting conditions
CN105373768B (zh) Method and device for providing image content
JP6154075B2 (ja) Object detection and segmentation method, apparatus, and computer program product
KR102087882B1 (ko) Apparatus and method for identifying media streams based on visual image matching
WO2020134238A1 (zh) Liveness detection method, apparatus, and storage medium
US9807300B2 (en) Display apparatus for generating a background image and control method thereof
US20170109912A1 (en) Creating a composite image from multi-frame raw image data
JP2010103980A (ja) Image processing method, image processing apparatus, and system
KR20190120106A (ko) Method for determining a representative image of a video and electronic device for processing the method
CN107977636B (zh) Face detection method and apparatus, terminal, and storage medium
WO2022193911A1 (zh) Instruction information acquisition method and apparatus, readable storage medium, and electronic device
US10354161B2 (en) Detecting font size in a digital image
CN111881740A (zh) Face recognition method and apparatus, electronic device, and medium
US10699145B1 (en) Systems and methods for augmented reality assisted form data capture
KR20160046399A (ko) Method and apparatus for generating a texture map, and method for generating a database
US9286707B1 (en) Removing transient objects to synthesize an unobstructed image
JP2013195725A (ja) Image display system
CN113297416A (zh) Video data storage method, apparatus, electronic device, and readable storage medium
US11755758B1 (en) System and method for evaluating data files
KR20140134844A (ko) Object-based photographing method and apparatus
KR20210120599A (ko) Method and system for providing avatar services

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 15892969; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 EP: PCT application non-entry in European phase
    Ref document number: 15892969; Country of ref document: EP; Kind code of ref document: A1