CN110019661A - Text search method, apparatus and electronic equipment based on office documents - Google Patents
Text search method, apparatus and electronic equipment based on office documents
- Publication number
- CN110019661A (application number CN201710818141.8A)
- Authority
- CN
- China
- Prior art keywords
- text
- search
- picture
- video
- audio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/43—Querying
Abstract
The invention discloses a text search method, apparatus and electronic equipment based on office documents. The text search method based on office documents includes: determining target content according to the to-be-searched text entered by a user; determining a search range; determining the search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and searching for the target content in the determined search objects. The method, apparatus and electronic equipment of the invention can find target content in data other than text data, with high processing efficiency and more accurate results.
Description
Technical field
The present invention relates to information processing technology, and in particular to a text search method, apparatus and electronic equipment based on office documents.
Background
In existing office software such as MS Office and WPS Office, the "Find and Replace" function can locate the positions where target content appears in a document. However, it searches only within the text data of the document. It cannot determine whether data other than text data contains the target content, for example text in a picture or the text spoken in a passage of video/audio.
Summary of the invention
The present invention provides a text search method, apparatus and electronic equipment based on office documents, to solve the problem that the prior art cannot find target content in data other than text data.
To achieve this object, the present invention provides a text search method based on office documents, comprising: determining target content according to the to-be-searched text entered by a user; determining a search range; determining the search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and searching for the target content in the determined search objects.
Further, the search range is one of the current document, the current page and the current text box.
Further, in the case where the determined search objects include a text box object and the background fill of the text box object includes a picture object: the target content is searched for in the text box object; and the target content is searched for in the picture object serving as the background fill of the text box object.
Further, in the case where the determined search objects include a picture object, image text recognition is performed on the picture object to obtain the picture recognition text corresponding to the picture object, so that the target content can be searched for in the picture recognition text.
Further, the step of performing image text recognition on the picture object includes: when the picture object includes a bitmap, recognizing the text in the bitmap using a picture text recognition module interface, as the picture recognition text corresponding to the picture object; and/or, when the picture object includes a vector graphic, extracting the text in the vector graphic using a vector graphic data parsing function module interface, as the picture recognition text corresponding to the picture object.
Further, in the case where the determined search objects include an audio/video object, speech-to-text recognition is performed on the audio/video object to obtain the speech recognition text corresponding to the audio/video object, so that the target content can be searched for in the speech recognition text.
Further, in the case where the audio/video objects among the search objects include a video object, image text recognition is performed on some or all of the frames of the video object to obtain the video recognition text of those frames, so that the target content can be searched for in the video recognition text.
Further, the audio/video object includes an audio object and/or a video object.
The present invention also provides a text search device based on office documents, comprising: a search content input unit, adapted to determine target content according to the to-be-searched text entered by a user; a search range determination unit, adapted to determine a search range; a search object determination unit, adapted to determine the search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and a search processing unit, adapted to search for the target content in the determined search objects.
In addition, the present invention also provides electronic equipment that includes the text search device based on office documents described above.
Compared with the prior art, the present invention can search for target content not only within the text data of a document. When target content needs to be found in pictures or audio/video objects in the document, the search range of the find function can be extended to the pictures and video/audio objects in the document. This makes the find function more complete, overcomes the defect of the text find function in existing office software, and offers higher processing efficiency and accurate search results.
Further, the present invention can use a picture text recognition function module to extend the search range of the text find function to bitmap picture objects, and can use a data parsing function module for wmf/emf vector graphics to extend the search range of the text find function to picture objects in the wmf/emf vector formats.
Further, the present invention can use a speech recognition function module to extend the search range of the text find function to the voice data of audio/video objects.
Further, the present invention can use an image text recognition function to extend the search range of the text find function to the image frame data of audio/video objects.
Further, the present invention can perform text recognition on both the image frame data and the voice data of a video and search for the target content in the resulting text, so that the search results are more accurate and match better.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description, or be understood by practicing the invention. The objects and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims and the accompanying drawings.
Brief description of the drawings
The accompanying drawings provide a further understanding of the technical solution of the present invention and form part of the description. Together with the embodiments of this application they serve to explain the technical solution of the present invention and do not limit it.
Fig. 1 is a flowchart of an example process of the text search method based on office documents according to the present invention;
Fig. 2 is a structural schematic diagram of the text search device based on office documents according to the present invention;
Fig. 3 is a schematic diagram of the search interface in the preferred embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.
The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one given here.
An embodiment of the present invention provides a text search method based on office documents, comprising: determining target content according to the to-be-searched text entered by a user; determining a search range; determining the search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and searching for the target content in the determined search objects.
Fig. 1 shows a flowchart of an example process of the text search method based on office documents according to the present invention.
As shown in Fig. 1, step S101 is executed after the method starts.
In step S101, target content is determined according to the to-be-searched text entered by the user. Step S102 is then executed.
In step S102, a search range is determined. Step S103 is then executed.
The search range is, for example, one of the current document, the current page and the current text box.
For example, the corresponding part of the current document may be determined as the search range according to the user's selection. If the user does not select a search range, the current document may, for example, be used as the default search range.
In step S103, the search objects within the search range are determined. The search objects include at least one of a text box object, a picture object and an audio/video object. Step S104 is then executed.
The audio/video object includes, for example, at least one of an audio object and a video object.
In step S104, the target content is searched for in the determined search objects.
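Steps S101 to S104 can be sketched as a small pipeline. This is only an illustrative sketch and not the patent's implementation: the `SearchObject` type, the `EXTRACTORS` table and the `search` function are hypothetical names, and each object's payload is assumed to be already available as text (real picture or audio/video objects would need recognition first).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SearchObject:
    kind: str      # "text_box", "picture" or "audio_video"
    payload: str   # stand-in for the object's real data (assumption)


# Hypothetical extractors that turn an object into searchable text.
# Real OCR or speech recognition would replace the last two entries.
EXTRACTORS: Dict[str, Callable[[SearchObject], str]] = {
    "text_box": lambda obj: obj.payload,
    "picture": lambda obj: obj.payload,
    "audio_video": lambda obj: obj.payload,
}


def search(query: str, search_range: List[SearchObject], kinds: List[str]) -> List[int]:
    """Return the indices of the objects in the range that contain the target."""
    target = query.strip()                       # S101: determine target content
    if not target:
        return []
    # S102: the caller passes the search range (document, page or text box)
    candidates = [(i, obj) for i, obj in enumerate(search_range)
                  if obj.kind in kinds]          # S103: determine search objects
    return [i for i, obj in candidates
            if target in EXTRACTORS[obj.kind](obj)]   # S104: look up the target
```

Passing only `["picture"]` as `kinds` restricts the lookup to picture objects, which mirrors the "at least one of" wording of step S103.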
According to one implementation, in the case where the determined search objects include a text box object and the background fill of that text box object includes a picture object, the target content may, for example, be searched for in the text box object, and may also be searched for in the picture object that serves as the background fill of the text box object. The processing of "searching for the target content in the picture object serving as the background fill of the text box object" may, for example, be executed after the processing of "searching for the target content in the text box object" (for example, after the search in the text box object has been executed, when the user clicks the "Find Next" button, the search in the picture object serving as the background fill of the text box object is executed).
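The ordering described above (body text first, background fill on a later "Find Next" click) can be sketched with a generator. The `TextBox` type and `hits_in_text_box` function are hypothetical, and the background picture is assumed to have been recognized into text already.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class TextBox:
    text: str
    # Text recognized in the background-fill picture, if there is one (assumption).
    background_text: Optional[str] = None


def hits_in_text_box(box: TextBox, target: str) -> Iterator[Tuple[str, int]]:
    """Yield one hit per "Find Next" click: body text first, then the background fill."""
    pos = box.text.find(target)
    while pos != -1:
        yield ("text", pos)
        pos = box.text.find(target, pos + 1)
    if box.background_text is not None:
        bg_pos = box.background_text.find(target)
        if bg_pos != -1:
            yield ("background_picture", bg_pos)
```

Each call to `next()` on the generator corresponds to one click of "Find Next" within the same text box.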
In addition, according to one implementation, in the case where the determined search objects include a picture object (either only picture objects, or picture objects together with other objects), image text recognition is performed on the picture object to obtain the picture recognition text corresponding to the picture object, so that the target content can be searched for in the picture recognition text.
For example, image text recognition may be performed on a picture object as follows: when the picture object includes a bitmap, the text in the bitmap is recognized using a picture text recognition module interface and used as the picture recognition text corresponding to the picture object; and/or, when the picture object includes a vector graphic, the text in the vector graphic is extracted using a vector graphic data parsing function module interface and used as the picture recognition text corresponding to the picture object.
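The bitmap versus vector graphic branch can be sketched as a dispatch on the file extension. The function name `picture_text` is hypothetical, and the `ocr` and `parse_vector` callables are placeholders for the picture text recognition module interface and the vector graphic data parsing module interface; the text does not specify either interface, so they are passed in as parameters here.

```python
from pathlib import PurePath
from typing import Callable

# Formats named in the text: png, jpg, bmp (bitmaps) and wmf, emf (vector graphics).
BITMAP_EXTS = {".png", ".jpg", ".bmp"}
VECTOR_EXTS = {".wmf", ".emf"}


def picture_text(filename: str, data: bytes,
                 ocr: Callable[[bytes], str],
                 parse_vector: Callable[[bytes], str]) -> str:
    """Return the picture recognition text for one picture object."""
    ext = PurePath(filename).suffix.lower()
    if ext in BITMAP_EXTS:
        return ocr(data)            # recognize text drawn as pixels
    if ext in VECTOR_EXTS:
        return parse_vector(data)   # read text records stored in the vector data
    return ""                       # unknown format: nothing to search
```

Treating unknown formats as empty text is a design choice of this sketch; an implementation could equally fall back to OCR on a rendered image.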
According to another implementation, in the case where the determined search objects include an audio/video object (either only audio/video objects, or audio/video objects together with other objects), speech-to-text recognition is performed on the audio/video object to obtain the speech recognition text corresponding to the audio/video object, so that the target content can be searched for in the speech recognition text.
For example, in the case where the audio/video objects among the search objects include a video object, image text recognition is performed on some or all of the frames of the video object to obtain the video recognition text of those frames, so that the target content can be searched for in the video recognition text.
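Searching a video object through both its frames and its speech can be sketched as follows. The names `sample_frames` and `video_contains`, the fixed sampling `step`, and the `ocr` and `stt` callables are assumptions made for illustration; the text only says that some or all frames are recognized.

```python
from typing import Callable, List, Sequence


def sample_frames(frame_count: int, step: int) -> List[int]:
    """Indices of the frames to recognize; step=1 means every frame."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(range(0, frame_count, step))


def video_contains(target: str, frames: Sequence[bytes], audio: bytes,
                   ocr: Callable[[bytes], str],
                   stt: Callable[[bytes], str],
                   step: int = 1) -> bool:
    """True if the target appears in the sampled frame text or in the speech text."""
    frame_texts = [ocr(frames[i]) for i in sample_frames(len(frames), step)]
    speech_text = stt(audio)
    return any(target in t for t in frame_texts) or target in speech_text
```

A larger `step` trades recall for speed: text that is visible only in skipped frames will not be found.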
In addition, the present invention also provides a text search device based on office documents, comprising: a search content input unit, adapted to determine target content according to the to-be-searched text entered by a user; a search range determination unit, adapted to determine a search range; a search object determination unit, adapted to determine the search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and a search processing unit, adapted to search for the target content in the determined search objects.
Fig. 2 shows a structural schematic diagram of the above text search device based on office documents.
As shown in Fig. 2, the text search device based on office documents may include a search content input unit 201, a search range determination unit 202, a search object determination unit 203 and a search processing unit 204.
The search content input unit 201 is adapted to determine target content according to the to-be-searched text entered by the user.
The search range determination unit 202 is adapted to determine a search range.
The search object determination unit 203 is adapted to determine the search objects within the search range. The search objects include, for example, at least one of a text box object, a picture object and an audio/video object.
The search processing unit 204 is adapted to search for the target content in the determined search objects.
The search content input unit 201, the search range determination unit 202, the search object determination unit 203 and the search processing unit 204 may, for example, respectively execute the same processing as steps S101, S102, S103 and S104 of the text search method based on office documents described above with reference to Fig. 1, and can achieve similar functions and technical effects, which are not repeated here.
In addition, the present invention also provides electronic equipment that includes the text search device based on office documents described above. The electronic equipment may be, for example, a smartphone, a tablet computer, a laptop or a desktop computer.
Preferred embodiment
The preferred embodiment improves on the Find and Replace function of the WPP presentation software. A picture text recognition module is integrated into the WPP presentation software to search for target content in the pictures of a document, and a speech-to-text module is also integrated to search for target content in the video/audio objects of a document (that is, the audio/video objects described above).
To accommodate diverse user needs, check-box options similar to "Match whole word only" can be added at the bottom of the dialog box for the "Find" function of the WPP presentation software, as shown in Fig. 3: "Search in picture objects" and "Search in video/audio objects". When "Search in picture objects" is checked, the search range includes the picture objects in the document. When "Search in video/audio objects" is checked, the search range includes the video/audio objects in the document.
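The effect of the two check boxes on the search range can be sketched as a mapping from dialog options to the kinds of object that are searched. `FindOptions` and `enabled_kinds` are hypothetical names, and the assumption that text boxes are always searched follows from the original find function being retained.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class FindOptions:
    search_in_pictures: bool = False   # "Search in picture objects" check box
    search_in_media: bool = False      # "Search in video/audio objects" check box


def enabled_kinds(opts: FindOptions) -> Set[str]:
    """Object kinds covered by the search for the given dialog settings."""
    kinds = {"text_box"}               # the original text search always applies
    if opts.search_in_pictures:
        kinds.add("picture")
    if opts.search_in_media:
        kinds.add("audio_video")
    return kinds
```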
After the "Find Next" button is clicked, the software searches according to the configuration of the Find dialog box. The original find function starts from the first object of the current slide and each time moves on to the object after the current position (when the search has not yet started, the next object is the first object).
If the object is a text box object, the text in the text box is searched using the original method. In particular, if the background fill of the text box is a picture fill, then when the "Find Next" button is clicked again, the search is performed in the picture that fills the text box (see the description below); otherwise the search moves on to the next object.
If the object is a picture object, the search is performed in the picture. If the target content is found, the object containing the picture is selected.
For example, the target content may be searched for in a picture in the manner described below.
Pictures fall into two classes. One class is bitmaps (a bitmap image, also called a raster image or painted image, is made up of individual dots called pixels, or picture elements), such as png, jpg and bmp. The other class is vector graphics, whose common formats in office documents are wmf and emf. For a bitmap, the text in the bitmap is first recognized using the picture text recognition module interface, and the target text is then searched for in that text. For vector graphics in the wmf and emf formats, the text in the vector graphic is first extracted using the vector graphic data parsing function module interface that already exists in the software, and the target text is then searched for in that text.
In addition, if the object is a video/audio object, the search is performed in that video/audio object. If the target content is found, the video/audio object is selected. For example, the target content may be searched for in the video/audio object as follows: the text in the speech is first recognized using a speech recognition module interface, and the target text is then searched for in that text.
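The "Find Next" traversal over the objects of a slide can be sketched as a cursor that remembers the last selected object. `FindCursor` is a hypothetical name, each object is reduced to a `(kind, searchable_text)` pair on the assumption that recognition has already produced the text, and resetting the cursor when nothing more is found is an assumption of this sketch, since the text does not describe that case.

```python
from typing import List, Optional, Set, Tuple


class FindCursor:
    """Walks the objects of one slide in order, one hit per "Find Next" click."""

    def __init__(self, objects: List[Tuple[str, str]]):
        self.objects = objects   # (kind, searchable text) pairs
        self.position = -1       # -1 means the search has not started yet

    def find_next(self, target: str, kinds: Set[str]) -> Optional[int]:
        """Return the index of the next matching object and select it, or None."""
        for i in range(self.position + 1, len(self.objects)):
            kind, text = self.objects[i]
            if kind in kinds and target in text:
                self.position = i
                return i
        self.position = -1       # assumption: start over after the last object
        return None
```

For example, with picture search disabled, a picture that contains the target is skipped and the cursor moves straight to the next text box or video/audio object that matches.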
The present invention improves the text find function of existing office software. The search range of the improved find function is not limited to the text data of a document, but also extends to the pictures and video/audio data in the document.
The picture text recognition function module extends the search range of the text find function to bitmap picture objects. The data parsing function module for wmf/emf vector graphics extends the search range of the text find function to picture objects in the wmf/emf vector formats. The speech recognition function module extends the search range of the text find function to video/audio objects.
Although the embodiments disclosed herein are as described above, their content is provided only to facilitate understanding of the present invention and is not intended to limit it. Any person skilled in the art to which the present invention pertains may make modifications and variations in the form and details of implementation without departing from the spirit and scope disclosed herein, but the scope of patent protection of the present invention shall still be as defined by the appended claims.
Claims (10)
1. A text search method based on office documents, characterized by comprising:
determining target content according to the to-be-searched text entered by a user;
determining a search range;
determining the search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object;
searching for the target content in the determined search objects.
2. The text search method based on office documents according to claim 1, characterized in that the search range is one of the current document, the current page and the current text box.
3. The text search method based on office documents according to claim 1, characterized in that, in the case where the determined search objects include a text box object and the background fill of the text box object includes a picture object:
the target content is searched for in the text box object;
the target content is searched for in the picture object serving as the background fill of the text box object.
4. The text search method based on office documents according to any one of claims 1-3, characterized in that, in the case where the determined search objects include a picture object, image text recognition is performed on the picture object to obtain the picture recognition text corresponding to the picture object, so that the target content can be searched for in the picture recognition text.
5. The text search method based on office documents according to claim 4, characterized in that the step of performing image text recognition on the picture object includes:
when the picture object includes a bitmap, recognizing the text in the bitmap using a picture text recognition module interface, as the picture recognition text corresponding to the picture object; and/or
when the picture object includes a vector graphic, extracting the text in the vector graphic using a vector graphic data parsing function module interface, as the picture recognition text corresponding to the picture object.
6. The text search method based on office documents according to any one of claims 1-5, characterized in that, in the case where the determined search objects include an audio/video object, speech-to-text recognition is performed on the audio/video object to obtain the speech recognition text corresponding to the audio/video object, so that the target content can be searched for in the speech recognition text.
7. The text search method based on office documents according to claim 6, characterized in that, in the case where the audio/video objects among the search objects include a video object, image text recognition is performed on some or all of the frames of the video object to obtain the video recognition text of those frames, so that the target content can be searched for in the video recognition text.
8. The text search method based on office documents according to any one of claims 1-7, characterized in that the audio/video object includes an audio object and/or a video object.
9. A text search device based on office documents, characterized by comprising:
a search content input unit, adapted to determine target content according to the to-be-searched text entered by a user;
a search range determination unit, adapted to determine a search range;
a search object determination unit, adapted to determine the search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object;
a search processing unit, adapted to search for the target content in the determined search objects.
10. Electronic equipment, characterized by including the text search device based on office documents according to claim 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710818141.8A CN110019661A (en) | 2017-09-12 | 2017-09-12 | Text search method, apparatus and electronic equipment based on office documents |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710818141.8A CN110019661A (en) | 2017-09-12 | 2017-09-12 | Text search method, apparatus and electronic equipment based on office documents |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110019661A true CN110019661A (en) | 2019-07-16 |
Family
ID=67186282
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710818141.8A (Pending; published as CN110019661A) | 2017-09-12 | 2017-09-12 | Text search method, apparatus and electronic equipment based on office documents |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110019661A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112329409A (en) * | 2019-07-30 | 2021-02-05 | 珠海金山办公软件有限公司 | Cell color conversion method and device and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1581172A (en) * | 2003-08-08 | 2005-02-16 | 富士通株式会社 | Multimedia object searching device and method |
CN103914486A (en) * | 2013-01-08 | 2014-07-09 | 邓寅生 | Document search and display system |
CN104246678A (en) * | 2012-02-15 | 2014-12-24 | 苹果公司 | Device, method, and graphical user interface for sharing a content object in a document |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1581172A (en) * | 2003-08-08 | 2005-02-16 | 富士通株式会社 | Multimedia object searching device and method |
CN104246678A (en) * | 2012-02-15 | 2014-12-24 | 苹果公司 | Device, method, and graphical user interface for sharing a content object in a document |
CN103914486A (en) * | 2013-01-08 | 2014-07-09 | 邓寅生 | Document search and display system |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112329409A (en) * | 2019-07-30 | 2021-02-05 | 珠海金山办公软件有限公司 | Cell color conversion method and device and electronic equipment |
CN112329409B (en) * | 2019-07-30 | 2024-03-22 | 珠海金山办公软件有限公司 | Cell color conversion method and device and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110458918B (en) | Method and device for outputting information | |
CN108073910B (en) | Method and device for generating human face features | |
CN110446063B (en) | Video cover generation method and device and electronic equipment | |
CN109583952B (en) | Advertisement case processing method, device, equipment and computer readable storage medium | |
CN109919244B (en) | Method and apparatus for generating a scene recognition model | |
CN112507706B (en) | Training method and device for knowledge pre-training model and electronic equipment | |
CN113159010B (en) | Video classification method, device, equipment and storage medium | |
CN110222330B (en) | Semantic recognition method and device, storage medium and computer equipment | |
US20170115853A1 (en) | Determining Image Captions | |
JP2020149686A (en) | Image processing method, device, server, and storage medium | |
CN111696176A (en) | Image processing method, image processing device, electronic equipment and computer readable medium | |
WO2015026750A1 (en) | Presenting fixed format documents in reflowed format | |
CN111783508A (en) | Method and apparatus for processing image | |
CN107958078A (en) | Information generating method and device | |
US11750547B2 (en) | Multimodal named entity recognition | |
CA3166742A1 (en) | Method of generating text plan based on deep learning, device and electronic equipment | |
CN112749695A (en) | Text recognition method and device | |
CN115982376A (en) | Method and apparatus for training models based on text, multimodal data and knowledge | |
CN108256523B (en) | Identification method and device based on mobile terminal and computer readable storage medium | |
CN113255377A (en) | Translation method, translation device, electronic equipment and storage medium | |
CN111695518A (en) | Method and device for labeling structured document information and electronic equipment | |
CN111126372B (en) | Logo region marking method and device in video and electronic equipment | |
CN111881900B (en) | Corpus generation method, corpus translation model training method, corpus translation model translation method, corpus translation device, corpus translation equipment and corpus translation medium | |
CN115661846A (en) | Data processing method and device, electronic equipment and storage medium | |
CN110019661A (en) | Text search method, apparatus and electronic equipment based on office documents |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190716 |