CN110781886A - Keyword acquisition method based on image and OCR recognition - Google Patents
Keyword acquisition method based on image and OCR recognition
- Publication number
- CN110781886A
- Authority
- CN
- China
- Prior art keywords
- image
- interest
- frame
- keyword
- ocr recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
Abstract
The invention relates to the technical field of image recognition, and in particular to a keyword acquisition method based on image and OCR recognition. The method comprises the following steps. Step one: continuously acquire frame pictures from a video, and perform interest point identification on each frame picture in the order in which the frame pictures are acquired. Step two: obtain a target recognition area from the position information of the interest image, recognize the text in the target recognition area with an OCR recognition algorithm, and extract the keyword. The invention can automatically identify and acquire a user-specified keyword.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a keyword acquisition method based on image and OCR recognition.
Background
With the rapid development of modern science and technology, image recognition has become one of the key basic technologies of modern society and is widely applied in everyday scenes, and image recognition algorithms are advancing rapidly.
However, in some special scenes the application of image recognition is still not user-friendly enough and needs further improvement. For example, in the field of intelligent reading, a camera combined with OCR recognition technology is used to recognize, synchronously and quickly, the content a user is currently reading. An algorithm then extracts high-frequency keywords, which are used for associated retrieval to provide intelligent reading services such as associative reading and keyword paraphrasing. However, readers' interests differ. In this scenario a reader may be interested not in the high-frequency keywords but in certain low-frequency words that the reader specifies, and the prior art has not developed a relevant algorithm to meet this need.
Disclosure of Invention
The object of the invention is to provide a keyword acquisition method based on image and OCR recognition that can automatically recognize and acquire a user-specified keyword.
The invention is realized by the following technical scheme: a keyword acquisition method based on image and OCR recognition is characterized by comprising the following steps:
Step one: continuously acquire frame pictures from a video, and perform interest point identification processing on each frame picture in the order in which the frame pictures are acquired;
the interest point identification processing comprises using an image recognition technology to identify whether a preset interest image exists in a frame picture; if no interest image exists, identification continues with the next frame picture; if the interest image exists, coordinate information is initialized for the frame picture containing the interest image, and position information of the interest image is acquired;
the position information of the interest image is recorded continuously for all consecutive frame pictures, from the initial frame picture in which the interest image appears to the last frame picture before the interest image is no longer detected, to form a position information data group, and the action of the interest image over these frame pictures is calculated from the position information data group;
if the action of the interest image is inconsistent with the preset action, interest point identification processing continues on the remaining frame pictures in sequence; if the action of the interest image is consistent with the preset action, the method proceeds to the next step;
Step two: obtain a target recognition area from the position information of the interest image, recognize the text in the target recognition area with an OCR recognition algorithm, and extract the keyword.
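The control flow of the two steps above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the names `position_groups`, `acquire_keywords`, `matches_preset`, `region_of`, and `ocr` are hypothetical, and the per-frame detector is assumed to have already produced either a position or `None` for each frame picture.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Point = Tuple[float, float]


def position_groups(detections: Iterable[Optional[Point]]) -> Iterator[List[Point]]:
    """Yield one position data group per continuous appearance of the interest image.

    Each item of `detections` is the interest image position in one frame
    picture, or None when no interest image was found in that frame.
    """
    group: List[Point] = []
    for pos in detections:
        if pos is not None:
            group.append(pos)   # interest image present: keep recording
        elif group:
            yield group         # interest image just disappeared: close the group
            group = []
    if group:                   # video ended while the interest image was visible
        yield group


def acquire_keywords(
    detections: Iterable[Optional[Point]],
    matches_preset: Callable[[List[Point]], bool],  # hypothetical action check
    region_of: Callable[[List[Point]], object],     # hypothetical region builder
    ocr: Callable[[object], str],                   # hypothetical OCR callback
) -> List[str]:
    """Step one: group positions and check the action. Step two: region and OCR."""
    keywords = []
    for group in position_groups(detections):
        if matches_preset(group):
            keywords.append(ocr(region_of(group)))
    return keywords
```

Passing the detector output, the action check, the region builder, and the OCR engine as callables keeps the sketch independent of any particular image recognition or OCR library.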
For better implementation of the scheme, the following optimization scheme is also provided:
further, the interest image is a finger image or a marker pen image.
Further, the preset action is stopping, marking, or circling.
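The source does not say how the action is calculated from the position information data group. One possible sketch, under the assumption that "marking" means drawing a roughly straight line, compares the travelled path length with the distance between the first and last positions. The function name `classify_action` and all thresholds are illustrative assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def classify_action(track: List[Point], dwell_size: float = 5.0,
                    close_ratio: float = 0.2, straight_ratio: float = 0.9) -> str:
    """Label a position data group as 'dwell', 'line', 'circle' or 'unknown'.

    All thresholds are illustrative assumptions, in pixels or as ratios.
    """
    if len(track) < 2:
        return "unknown"
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]
    # Dwell: every recorded position stays inside a small box.
    if max(xs) - min(xs) <= dwell_size and max(ys) - min(ys) <= dwell_size:
        return "dwell"
    # Compare the travelled path length with the start-to-end distance.
    path = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    chord = math.dist(track[0], track[-1])
    if chord >= straight_ratio * path:
        return "line"    # almost no detour: a straight stroke
    if chord <= close_ratio * path:
        return "circle"  # the track returns close to where it started
    return "unknown"
```

A result of "unknown" corresponds to the case where the action is inconsistent with the preset action, so identification simply continues with the remaining frame pictures.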
Compared with the prior art, the invention has the following beneficial effects: by combining image recognition with an OCR recognition algorithm, the method solves the problem that existing image recognition approaches cannot obtain user-specified keywords, and the algorithm is simple and efficient to run.
Detailed Description
This embodiment describes one application of the method in a reading scene, in which a fixed camera faces the book being read by a reader. The steps for extracting keywords are as follows:
Step one: continuously acquire frame pictures from the video captured by the camera, and perform interest point identification processing on each frame picture in the order in which the frame pictures are acquired. Here the interest image is preset as the image of the reader's extended right index finger.
The interest point identification processing comprises using an image recognition technology to identify whether the preset interest image exists in a frame picture; if no interest image exists, identification continues with the next frame picture; if the interest image exists, coordinate information is initialized for the frame picture containing the interest image, and position information of the interest image is acquired;
the position information of the interest image is recorded continuously for all consecutive frame pictures, from the initial frame picture in which the interest image appears to the last frame picture before the interest image is no longer detected, to form a position information data group, and the action of the interest image over these frame pictures is calculated from the position information data group;
if the action of the interest image is inconsistent with the preset action, interest point identification processing continues on the remaining frame pictures in sequence; if the action of the interest image is consistent with the preset action, the method proceeds to the next step. Here the preset action is dwelling.
Step two: obtain a target recognition area from the position information of the interest image, recognize the text in the target recognition area with an OCR recognition algorithm, and extract the keyword. The target recognition area here is a rectangular area located above the straight-line track. The length of the rectangular area equals the length of the straight-line track, and its width is set to the height of one character.
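The geometry of the target recognition area described in step two can be sketched as follows. The sketch assumes image coordinates with the origin at the top-left corner and y increasing downward, so "above the track" means smaller y values. The function name `region_above_line` and the parameter `char_height` (the height of one character, in pixels) are illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def region_above_line(track: List[Point], char_height: float) -> Box:
    """Return the rectangle that sits directly above a roughly horizontal track.

    The rectangle is as long as the track and one character high.
    """
    if not track:
        raise ValueError("track must contain at least one position")
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]
    left, right = min(xs), max(xs)
    bottom = min(ys)                      # highest point reached by the track
    top = max(0, bottom - char_height)    # clamp at the top edge of the frame
    return (left, top, right, bottom)
```

For example, a track running from x = 100 to x = 220 at about y = 200 with `char_height=30` gives the box `(100, 170, 220, 200)`, which could then be cropped from the frame picture and passed to an OCR engine.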
For better understanding, the functions implemented by the method are further described below with reference to specific scene behaviors. In the first step, when the reader's extended right index finger is detected, the second step is entered. In the second step, the preset track is a straight line; that is, when the reader is detected drawing a horizontal line with the extended right index finger, the action is judged consistent with the preset action and the third step is entered. In the third step, the characters in the target area above the track drawn by the right index finger are recognized, and the keyword is extracted.
Through this method, the following scene can be realized: when a reader comes across a keyword of interest in the book, such as "Song Dynasty", the reader only needs to extend a finger of the right hand and hold it still beneath the words "Song Dynasty" on the page. The keyword "Song Dynasty" is then captured and passed on to subsequent extended services.
In other embodiments, circling may be used as the preset action, with the area enclosed by the circling action set as the target area. For example, when a reader comes across a keyword of interest such as "Song Dynasty", the reader only needs to extend a finger of the right hand and circle the words "Song Dynasty" on the page. OCR recognition performed on the target area determined by the circle then captures the keyword "Song Dynasty" for subsequent extended services.
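For the circling embodiment, one simple way to derive a target area is to take the bounding box of the circled track. This is an assumption made for illustration; the source does not specify how the area is computed. The function name `region_inside_circle` and the optional `margin` parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def region_inside_circle(track: List[Point], margin: float = 0) -> Box:
    """Return the bounding box of a circling track, shrunk by `margin` per side.

    A positive margin trims the border so the fingertip itself is less
    likely to fall inside the area passed to OCR.
    """
    if not track:
        raise ValueError("track must contain at least one position")
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]
    left, top = min(xs) + margin, min(ys) + margin
    right, bottom = max(xs) - margin, max(ys) - margin
    if left >= right or top >= bottom:
        raise ValueError("margin leaves no area inside the circled track")
    return (left, top, right, bottom)
```

A bounding box is coarser than the circled outline itself, but a rectangular crop is usually the most convenient input for an OCR engine.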
While the invention has been illustrated and described with respect to specific embodiments and alternatives thereof, it will be understood that various changes and modifications can be made without departing from the spirit and scope of the invention. It is understood, therefore, that the invention is not to be in any way limited except by the appended claims and their equivalents.
Claims (5)
1. A keyword acquisition method based on image and OCR recognition is characterized by comprising the following steps:
Step one: continuously acquire frame pictures from a video, and perform interest point identification processing on each frame picture in the order in which the frame pictures are acquired;
the interest point identification processing comprises using an image recognition technology to identify whether a preset interest image exists in a frame picture; if no interest image exists, identification continues with the next frame picture; if the interest image exists, coordinate information is initialized for the frame picture containing the interest image, and position information of the interest image is acquired;
the position information of the interest image is recorded continuously for all consecutive frame pictures, from the initial frame picture in which the interest image appears to the last frame picture before the interest image is no longer detected, to form a position information data group, and the action of the interest image over these frame pictures is calculated from the position information data group;
if the action of the interest image is inconsistent with the preset action, interest point identification processing continues on the remaining frame pictures in sequence; if the action of the interest image is consistent with the preset action, the method proceeds to the next step;
Step two: obtain a target recognition area from the position information of the interest image, recognize the text in the target recognition area with an OCR recognition algorithm, and extract the keyword.
2. A keyword acquisition method based on image and OCR recognition according to claim 1, characterized in that: the interest image is a finger image or a marker pen image.
3. A keyword acquisition method based on image and OCR recognition according to claim 1, characterized in that: the preset action is stopping, marking, or circling.
4. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of a method for obtaining keywords based on image and OCR recognition according to any one of claims 1 to 3.
5. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the image and OCR recognition based keyword acquisition method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911021828.4A CN110781886A (en) | 2019-10-25 | 2019-10-25 | Keyword acquisition method based on image and OCR recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911021828.4A CN110781886A (en) | 2019-10-25 | 2019-10-25 | Keyword acquisition method based on image and OCR recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110781886A (en) | 2020-02-11 |
Family
ID=69387538
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911021828.4A (CN110781886A, pending) | Keyword acquisition method based on image and OCR recognition | 2019-10-25 | 2019-10-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110781886A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101907923A (en) * | 2010-06-29 | 2010-12-08 | 汉王科技股份有限公司 | Information extraction method, device and system |
CN105590486A (en) * | 2014-10-21 | 2016-05-18 | 黄小曼 | Machine vision-based pedestal-type finger reader, related system device and related method |
CN108256504A (en) * | 2018-02-11 | 2018-07-06 | 苏州笛卡测试技术有限公司 | A kind of Three-Dimensional Dynamic gesture identification method based on deep learning |
CN108345387A (en) * | 2018-03-14 | 2018-07-31 | 百度在线网络技术(北京)有限公司 | Method and apparatus for output information |
CN109325464A (en) * | 2018-10-16 | 2019-02-12 | 上海翎腾智能科技有限公司 | A kind of finger point reading character recognition method and interpretation method based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200211 |