CN110019661A - Text search method, apparatus and electronic equipment based on office documents - Google Patents

Info

Publication number
CN110019661A
Authority
CN
China
Prior art keywords
text
search
picture
video
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710818141.8A
Other languages
Chinese (zh)
Inventor
王峰
区钺坚
黄志军
高延平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Guangzhou Kingsoft Mobile Technology Co Ltd
Guangzhou Jinshan Mobile Technology Co Ltd
Original Assignee
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Guangzhou Jinshan Mobile Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Office Software Inc, Zhuhai Kingsoft Office Software Co Ltd, Guangzhou Jinshan Mobile Technology Co Ltd
Priority to CN201710818141.8A
Publication of CN110019661A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying

Abstract

The invention discloses a text search method, apparatus and electronic equipment based on office documents. The text search method based on office documents comprises: determining target content according to the to-be-searched text content input by a user; determining a search range; determining search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and searching for the search target in the determined search objects. The text search method, apparatus and electronic equipment based on office documents of the invention can find target content in data other than text data, with high processing efficiency and more accurate results.

Description

Text search method, apparatus and electronic equipment based on office documents
Technical field
The present invention relates to information processing technology, and in particular to a text search method, apparatus and electronic equipment based on office documents.
Background
In existing office software such as MS Office and WPS Office, the "Find and Replace" function can locate where target content appears in a document. However, it searches only within the document's text data. Target content contained in data other than text data, such as text in a picture or text spoken in a passage of video/audio, cannot be found.
Summary of the invention
The present invention provides a text search method, apparatus and electronic equipment based on office documents, to solve the problem that the prior art cannot find target content in data other than text data.
To achieve the object of the invention, the present invention provides a text search method based on office documents, comprising: determining target content according to the to-be-searched text content input by a user; determining a search range; determining search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and searching for the search target in the determined search objects.
Further, the search range is one of the current document, the current page and the current text box.
Further, in the case where the determined search objects include a text box object and the background fill of the text box object includes a picture object: the search target is searched for in the text box object; and the search target is searched for in the picture object serving as the background fill of the text box object.
Further, in the case where the determined search objects include a picture object, image text recognition is performed on the picture object to obtain the picture recognition text corresponding to the picture object, so that the target content can be searched for in the picture recognition text.
Further, the step of performing image text recognition on the picture object comprises: when the picture object includes a bitmap, recognizing the text in the bitmap using an image text recognition module interface, as the picture recognition text corresponding to the picture object; and/or, when the picture object includes a vector graphic, extracting the text in the vector graphic using a vector graphic data parsing function module interface, as the picture recognition text corresponding to the picture object.
Further, in the case where the determined search objects include an audio/video object, speech-to-text recognition is performed on the audio/video object to obtain the speech recognition text corresponding to the audio/video object, so that the target content can be searched for in the speech recognition text.
Further, in the case where the audio/video objects among the search objects include a video object, image text recognition is performed on some or all of the frames of the video object to obtain the video recognition text of those frames, so that the target content can be searched for in the video recognition text.
Further, the audio/video object includes an audio object and/or a video object.
The present invention also provides a text search apparatus based on office documents, comprising: a search content input unit, adapted to determine target content according to the to-be-searched text content input by a user; a search range determination unit, adapted to determine a search range; a search object determination unit, adapted to determine search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and a search processing unit, adapted to search for the search target in the determined search objects.
In addition, the present invention also provides electronic equipment including the text search apparatus based on office documents described above.
Compared with the prior art, the present invention can search for target content not only within the text data of a document: when target content needs to be found in pictures or audio/video objects in the document, the search range of the find function can be extended to the pictures and video/audio objects in the document. This makes the find function more complete, overcomes the shortcomings of the text find function in existing office software, and offers high processing efficiency and accurate search results.
Further, the present invention can use an image text recognition function module to extend the search range of the text find function to bitmap picture objects, and use a data parsing function module for wmf/emf vector graphics to extend the search range of the text find function to picture objects in the wmf/emf vector graphic formats.
Further, the present invention can use a speech recognition function module to extend the search range of the text find function to the speech data of audio/video objects.
Further, the present invention can use an image text recognition function to extend the search range of the text find function to the image frame data of audio/video objects.
Further, the present invention can perform text recognition on both the image frame data and the speech data of a video and search for the target content in the resulting text, so that the search results are more accurate and the degree of matching is higher.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the invention. The objects and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims and the drawings.
Brief description of the drawings
The drawings are provided for further understanding of the technical solution of the present invention and constitute a part of the description. Together with the embodiments of the present application, they serve to explain the technical solution of the present invention and do not limit it.
Fig. 1 is a flowchart of an exemplary process of the text search method based on office documents of the present invention;
Fig. 2 is a structural schematic diagram of the text search apparatus based on office documents of the present invention;
Fig. 3 is a schematic diagram of the search interface in a preferred embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the drawings. It should be noted that, provided there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.
The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one given here.
An embodiment of the present invention provides a text search method based on office documents, comprising: determining target content according to the to-be-searched text content input by a user; determining a search range; determining search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and searching for the search target in the determined search objects.
Fig. 1 shows a flowchart of a processing example of the text search method based on office documents of the present invention.
As shown in Fig. 1, step S101 is executed after the method starts.
In step S101, target content is determined according to the to-be-searched text content input by the user. Then step S102 is executed.
In step S102, the search range is determined. Then step S103 is executed.
The search range is, for example, one of the current document, the current page and the current text box.
For example, the corresponding part of the current document can be determined as the search range according to the user's selection. If the user does not select a search range, the current document can, for example, be used as the default search range.
In step S103, the search objects within the search range are determined, the search objects including at least one of a text box object, a picture object and an audio/video object. Then step S104 is executed.
The audio/video object includes, for example, at least one of an audio object and a video object.
In step S104, the search target is searched for in the determined search objects.
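Steps S101 to S104 can be sketched roughly as follows. This is only an illustrative sketch and not the implementation described in the patent: the `SearchObject` type, the `find_in_document` function and the per-kind `extractors` mapping are hypothetical names introduced here, and the search range is simplified to "whole document" or "one page".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class SearchObject:
    kind: str      # hypothetical labels: "text_box", "picture", "audio_video"
    payload: str   # stand-in for the object's raw data
    page: int      # page (slide) on which the object sits


# An extractor turns one search object into searchable text (assumed interface).
Extractor = Callable[[SearchObject], str]


def find_in_document(query: str,
                     objects: List[SearchObject],
                     extractors: Dict[str, Extractor],
                     page: Optional[int] = None) -> List[SearchObject]:
    # S101: determine the target content from the user's input.
    target = query.strip()
    if not target:
        return []
    # S102: determine the search range (None means the whole document).
    in_range = [o for o in objects if page is None or o.page == page]
    # S103: determine the search objects, i.e. the kinds we can extract text from.
    candidates = [o for o in in_range if o.kind in extractors]
    # S104: search for the target in the text obtained from each object.
    return [o for o in candidates if target in extractors[o.kind](o)]
```

Passing `page=None` mirrors the default described above, where the current document is used when the user selects no range.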
According to one implementation, in the case where the determined search objects include a text box object and the background fill of the text box object includes a picture object, the search target can, for example, be searched for in the text box object, and can also be searched for in the picture object serving as the background fill of the text box object. The processing of "searching for the search target in the picture object serving as the background fill of the text box object" can, for example, be executed after the processing of "searching for the search target in the text box object" (for example, after the search in the text box object has been executed, when the user clicks the "Find Next" button, the search in the picture object serving as the background fill of the text box object is then executed).
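The two-stage handling of a text box with a picture background can be sketched as a generator that yields one result per "Find Next" click. This is an assumption-laden sketch: `text_box_hits` and the `recognize_picture` callable are hypothetical names, and the recognizer is injected rather than tied to any real OCR library.

```python
from typing import Callable, Iterator, Optional


def text_box_hits(box_text: str,
                  background_picture: Optional[bytes],
                  target: str,
                  recognize_picture: Callable[[bytes], str]) -> Iterator[str]:
    """Yield where the target was found, text first, then background fill."""
    # First stage: the text box's own text.
    if target in box_text:
        yield "text"
    # Second stage: only reached on a later "Find Next", and only if the
    # text box actually has a picture as its background fill.
    if background_picture is not None:
        if target in recognize_picture(background_picture):
            yield "background"
```

Because a generator is lazy, the picture recognizer runs only when the caller asks for the next hit, which matches the order described above.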
In addition, according to one implementation, in the case where the determined search objects include a picture object (whether they include only picture objects or include picture objects together with other objects), image text recognition is performed on the picture object to obtain the picture recognition text corresponding to the picture object, so that the target content can be searched for in the picture recognition text.
For example, image text recognition can be performed on a picture object as follows: when the picture object includes a bitmap, the text in the bitmap is recognized using the image text recognition module interface, as the picture recognition text corresponding to the picture object; and/or, when the picture object includes a vector graphic, the text in the vector graphic is extracted using the vector graphic data parsing function module interface, as the picture recognition text corresponding to the picture object.
According to another implementation, in the case where the determined search objects include an audio/video object (whether they include only audio/video objects or include audio/video objects together with other objects), speech-to-text recognition is performed on the audio/video object to obtain the speech recognition text corresponding to the audio/video object, so that the target content can be searched for in the speech recognition text.
For example, in the case where the audio/video objects among the search objects include a video object, image text recognition is performed on some or all of the frames of the video object to obtain the video recognition text of those frames, so that the target content can be searched for in the video recognition text.
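Combining speech recognition with recognition on some or all frames might look like the sketch below. The `video_text` function, its `frame_step` sampling parameter and the two recognizer callables are hypothetical; the patent does not say how the "partial frames" are chosen, so taking every n-th frame is purely an assumption for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Audio = TypeVar("Audio")


def video_text(frames: Sequence[Frame],
               audio: Audio,
               recognize_frame: Callable[[Frame], str],
               recognize_speech: Callable[[Audio], str],
               frame_step: int = 1) -> str:
    """Return the speech text followed by the text of each sampled frame."""
    # frame_step == 1 processes all frames; larger values process a subset.
    sampled = frames[::frame_step]
    parts: List[str] = [recognize_speech(audio)]
    parts.extend(recognize_frame(f) for f in sampled)
    return "\n".join(parts)


def video_contains(target: str, text: str) -> bool:
    # The target content is searched for in the combined recognition text.
    return target in text
```

A larger `frame_step` trades recall for speed: text that appears only in skipped frames will not be found.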
In addition, the present invention also provides a text search apparatus based on office documents, comprising: a search content input unit, adapted to determine target content according to the to-be-searched text content input by a user; a search range determination unit, adapted to determine a search range; a search object determination unit, adapted to determine search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object; and a search processing unit, adapted to search for the search target in the determined search objects.
Fig. 2 shows a structural schematic diagram of the above text search apparatus based on office documents.
As shown in Fig. 2, the text search apparatus based on office documents may include a search content input unit 201, a search range determination unit 202, a search object determination unit 203 and a search processing unit 204.
The search content input unit 201 is adapted to determine target content according to the to-be-searched text content input by the user.
The search range determination unit 202 is adapted to determine the search range.
The search object determination unit 203 is adapted to determine the search objects within the search range, the search objects including, for example, at least one of a text box object, a picture object and an audio/video object.
The search processing unit 204 is adapted to search for the search target in the determined search objects.
The search content input unit 201, the search range determination unit 202, the search object determination unit 203 and the search processing unit 204 can, for example, respectively perform the same processing as steps S101, S102, S103 and S104 of the text search method based on office documents described above with reference to Fig. 1, and can achieve similar functions and technical effects, which are not repeated here.
In addition, the present invention also provides electronic equipment including the text search apparatus based on office documents described above. The electronic equipment can be, for example, a smartphone, a tablet computer, a laptop computer or a desktop computer.
Preferred embodiment
The preferred embodiment improves on the find and replace function of the WPP presentation software. An image text recognition module is integrated into the WPP presentation software to search for target content in pictures in a document, and a speech-to-text module is also integrated to search for target content in video/audio objects in the document (that is, the audio/video objects described above).
Considering the diversity of user needs, in the dialog box of the "Find" function of the WPP presentation software, as shown in Fig. 3, checkbox options similar to "Match whole word only" can be added at the bottom: "Search in picture objects" and "Search in video/audio objects". When "Search in picture objects" is checked, the search range includes the picture objects in the document; when "Search in video/audio objects" is checked, the search range includes the video/audio objects in the document.
After the "Find Next" button is clicked, the software searches according to the configuration of the Find dialog box. The original find function starts from the first object of the current slide and each time finds the next object after the current position (when the search has not yet started, the next object is the first object).
If the object is a text box object, the text in the text box is searched using the original method. In particular, if the background fill of the text box is a picture fill, then when the "Find Next" button is clicked again, the search is performed in the fill picture of the text box (see the description below); otherwise the next object is found.
If the object is a picture object, the search is performed in the picture; if the target content is found, the object containing the picture is selected. For example, the target content can be searched for in the picture in the manner described below.
Pictures fall into two classes. One class is bitmaps (a bitmap image, also called a dot-matrix image or drawn image, is composed of individual points called pixels (picture elements)), such as png, jpg and bmp. The other class is vector graphics, whose common formats in office documents are wmf and emf. For a bitmap, the text in the bitmap is first recognized using the image text recognition module interface, and the target text is then searched for in that text. For vector graphics in the wmf and emf formats, the text in the vector graphic is first extracted using the vector graphic data parsing function module interface already present in the software, and the target text is then searched for in that text.
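The bitmap/vector split can be sketched as a small dispatcher. The format names come from the paragraph above; everything else (`picture_text`, `picture_contains`, and the injected `ocr` and `parse_vector` callables) is hypothetical and stands in for the recognition module interface and the vector data parsing interface, whose real signatures the text does not give.

```python
from typing import Callable

BITMAP_FORMATS = {"png", "jpg", "bmp"}   # bitmap examples named in the text
VECTOR_FORMATS = {"wmf", "emf"}          # vector formats named in the text


def picture_text(fmt: str,
                 data: bytes,
                 ocr: Callable[[bytes], str],
                 parse_vector: Callable[[bytes], str]) -> str:
    """Return the picture recognition text for one picture object."""
    fmt = fmt.lower()
    if fmt in BITMAP_FORMATS:
        # Bitmaps hold only pixels, so text must be recognized.
        return ocr(data)
    if fmt in VECTOR_FORMATS:
        # Vector graphics are parsed and their text extracted directly.
        return parse_vector(data)
    # Formats outside the two classes are treated as containing no text.
    return ""


def picture_contains(target: str, fmt: str, data: bytes,
                     ocr: Callable[[bytes], str],
                     parse_vector: Callable[[bytes], str]) -> bool:
    return target in picture_text(fmt, data, ocr, parse_vector)
```

Returning an empty string for unknown formats is a design choice of this sketch, not something the text specifies.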
In addition, if the object is a video/audio object, the search is performed in the video/audio object; if the target content is found, the video/audio object is selected. For example, the target content can be searched for in the video/audio object as follows: for a video/audio object, the text in the speech is first recognized using the speech recognition module interface, and the target text is then searched for in that text.
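The "Find Next" traversal over the objects of a slide, gated by the two dialog options, can be sketched as follows. All names (`SlideObject`, `find_next`, the `extract` mapping and the option flags) are hypothetical, and the sketch deliberately omits the extra step for a text box whose background fill is a picture.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class SlideObject:
    kind: str   # hypothetical labels: "text_box", "picture", "media"
    data: str   # stand-in for the object's raw data


def find_next(objects: List[SlideObject],
              target: str,
              position: Optional[int],
              extract: Dict[str, Callable[[str], str]],
              search_pictures: bool = False,
              search_media: bool = False) -> Optional[int]:
    """Return the index of the next matching object, or None if there is none.

    position is the index of the current object; None means the search has
    not started, so the next object is the first one.
    """
    enabled = {"text_box"}               # plain text is always searched
    if search_pictures:
        enabled.add("picture")           # "Search in picture objects" checked
    if search_media:
        enabled.add("media")             # "Search in video/audio objects" checked
    start = 0 if position is None else position + 1
    for i in range(start, len(objects)):
        obj = objects[i]
        if obj.kind in enabled and target in extract[obj.kind](obj.data):
            return i                     # the caller would select this object
    return None
```

Keeping the current position outside the function lets each button click be a single call with the previously returned index.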
The present invention improves the text find function in existing office software. The search range of the improved find function is not limited to the text data of a document but also extends to the pictures and video/audio data in the document.
An image text recognition function module is used to extend the search range of the text find function to bitmap picture objects. A data parsing function module for wmf/emf vector graphics is used to extend the search range of the text find function to picture objects in the wmf/emf vector graphic formats. A speech recognition function module is used to extend the search range of the text find function to video/audio objects.
Although the embodiments disclosed herein are as described above, they are given only to facilitate understanding of the present invention and are not intended to limit it. Any person skilled in the art to which the present invention pertains may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed by the present invention, but the scope of patent protection of the present invention shall still be as defined by the appended claims.

Claims (10)

1. A text search method based on office documents, characterized by comprising:
determining target content according to the to-be-searched text content input by a user;
determining a search range;
determining search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object;
searching for the search target in the determined search objects.
2. The text search method based on office documents according to claim 1, characterized in that the search range is one of the current document, the current page and the current text box.
3. The text search method based on office documents according to claim 1, characterized in that, in the case where the determined search objects include a text box object and the background fill of the text box object includes a picture object:
the search target is searched for in the text box object;
the search target is searched for in the picture object serving as the background fill of the text box object.
4. The text search method based on office documents according to any one of claims 1-3, characterized in that, in the case where the determined search objects include a picture object, image text recognition is performed on the picture object to obtain the picture recognition text corresponding to the picture object, so that the target content can be searched for in the picture recognition text.
5. The text search method based on office documents according to claim 4, characterized in that the step of performing image text recognition on the picture object comprises:
when the picture object includes a bitmap, recognizing the text in the bitmap using an image text recognition module interface, as the picture recognition text corresponding to the picture object; and/or
when the picture object includes a vector graphic, extracting the text in the vector graphic using a vector graphic data parsing function module interface, as the picture recognition text corresponding to the picture object.
6. The text search method based on office documents according to any one of claims 1-5, characterized in that, in the case where the determined search objects include an audio/video object, speech-to-text recognition is performed on the audio/video object to obtain the speech recognition text corresponding to the audio/video object, so that the target content can be searched for in the speech recognition text.
7. The text search method based on office documents according to claim 6, characterized in that, in the case where the audio/video objects among the search objects include a video object, image text recognition is performed on some or all of the frames of the video object to obtain the video recognition text of those frames, so that the target content can be searched for in the video recognition text.
8. The text search method based on office documents according to any one of claims 1-7, characterized in that the audio/video object includes an audio object and/or a video object.
9. A text search apparatus based on office documents, characterized by comprising:
a search content input unit, adapted to determine target content according to the to-be-searched text content input by a user;
a search range determination unit, adapted to determine a search range;
a search object determination unit, adapted to determine search objects within the search range, the search objects including at least one of a text box object, a picture object and an audio/video object;
a search processing unit, adapted to search for the search target in the determined search objects.
10. Electronic equipment, characterized by comprising the text search apparatus based on office documents according to claim 9.
CN201710818141.8A 2017-09-12 2017-09-12 Text search method, apparatus and electronic equipment based on office documents Pending CN110019661A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710818141.8A CN110019661A (en) 2017-09-12 2017-09-12 Text search method, apparatus and electronic equipment based on office documents

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710818141.8A CN110019661A (en) 2017-09-12 2017-09-12 Text search method, apparatus and electronic equipment based on office documents

Publications (1)

Publication Number Publication Date
CN110019661A true CN110019661A (en) 2019-07-16

Family

ID=67186282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710818141.8A Pending CN110019661A (en) 2017-09-12 2017-09-12 Text search method, apparatus and electronic equipment based on office documents

Country Status (1)

Country Link
CN (1) CN110019661A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1581172A (en) * 2003-08-08 2005-02-16 富士通株式会社 Multimedia object searching device and methoed
CN103914486A (en) * 2013-01-08 2014-07-09 邓寅生 Document search and display system
CN104246678A (en) * 2012-02-15 2014-12-24 苹果公司 Device, method, and graphical user interface for sharing a content object in a document

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329409A (en) * 2019-07-30 2021-02-05 珠海金山办公软件有限公司 Cell color conversion method and device and electronic equipment
CN112329409B (en) * 2019-07-30 2024-03-22 珠海金山办公软件有限公司 Cell color conversion method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN110458918B (en) Method and device for outputting information
CN108073910B (en) Method and device for generating human face features
CN110446063B (en) Video cover generation method and device and electronic equipment
CN109583952B (en) Advertisement case processing method, device, equipment and computer readable storage medium
CN109919244B (en) Method and apparatus for generating a scene recognition model
CN112507706B (en) Training method and device for knowledge pre-training model and electronic equipment
CN113159010B (en) Video classification method, device, equipment and storage medium
CN110222330B (en) Semantic recognition method and device, storage medium and computer equipment
US20170115853A1 (en) Determining Image Captions
JP2020149686A (en) Image processing method, device, server, and storage medium
CN111696176A (en) Image processing method, image processing device, electronic equipment and computer readable medium
WO2015026750A1 (en) Presenting fixed format documents in reflowed format
CN111783508A (en) Method and apparatus for processing image
CN107958078A (en) Information generating method and device
US11750547B2 (en) Multimodal named entity recognition
CA3166742A1 (en) Method of generating text plan based on deep learning, device and electronic equipment
CN112749695A (en) Text recognition method and device
CN115982376A (en) Method and apparatus for training models based on text, multimodal data and knowledge
CN108256523B (en) Identification method and device based on mobile terminal and computer readable storage medium
CN113255377A (en) Translation method, translation device, electronic equipment and storage medium
CN111695518A (en) Method and device for labeling structured document information and electronic equipment
CN111126372B (en) Logo region marking method and device in video and electronic equipment
CN111881900B (en) Corpus generation method, corpus translation model training method, corpus translation model translation method, corpus translation device, corpus translation equipment and corpus translation medium
CN115661846A (en) Data processing method and device, electronic equipment and storage medium
CN110019661A (en) Text search method, apparatus and electronic equipment based on office documents

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190716