CN115374061A - Optimization processing method and device for document search and electronic equipment - Google Patents


Publication number
CN115374061A
Authority
CN
China
Prior art keywords
document
search
target
path
searching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211031844.3A
Other languages
Chinese (zh)
Inventor
申亚坤
谭莹坤
周慧婷
陶威
程璐
刘烨敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN202211031844.3A
Publication of CN115374061A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/168 Details of user interfaces specifically adapted to file systems, e.g. browsing and visualisation, 2D or 3D GUIs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an optimization processing method and apparatus for document search, and an electronic device, which can be applied to the field of big data. The method comprises: obtaining a document search request containing a search keyword; obtaining a document search result in a document library according to the search keyword, the document search result comprising a plurality of target documents; outputting document information corresponding to the target documents through search pages, wherein a target page among the search pages is in a visible state; obtaining, in a path library, a target search path matching the search keyword, wherein the path library comprises a plurality of document search paths, and each document search path at least represents the document position, in a historical search result, of the document information corresponding to a historical document; and adjusting at least the search page on which the document information corresponding to the target documents is located according to the document position represented by the target search path, so that the document information of a first document, namely the target document corresponding to the target search path, is displayed in the target page.

Description

Optimization processing method and device for document search and electronic equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to an optimization processing method and apparatus for document search, and an electronic device.
Background
When different clients use a document library for retrieval, they may input the same or different retrieval conditions. If the conditions a client inputs are insufficient, the search results are inevitably inaccurate, so the client has to turn through a large number of pages to find the required document, and document search efficiency is low.
Therefore, a technical solution capable of improving the document searching efficiency is needed.
Disclosure of Invention
In view of this, the present application provides an optimization processing method and apparatus for document search, and an electronic device, so as to solve the technical problem of low document search efficiency. The technical solutions are as follows:
an optimization processing method for document searching, the method comprising:
obtaining a document search request, wherein the document search request comprises at least one search keyword;
obtaining a document search result in a document library according to the search keyword; the document search result comprises a plurality of target documents;
outputting document information corresponding to the target document through at least two search pages, wherein the target page in the at least two search pages is in a visible state;
obtaining a target search path matched with the search keyword in a path library; the path library comprises a plurality of document searching paths, and the document searching paths at least represent document positions of document information corresponding to corresponding historical documents in historical searching results;
and at least adjusting the search page where the document information corresponding to the target document is located according to the document position represented by the target search path, so that the document information corresponding to the first document corresponding to the target search path in the target document is displayed in the target page.
The above method, preferably, further comprises:
obtaining historical keywords and historical documents hit by the historical keywords;
generating the historical keywords and the document searching path corresponding to the historical documents at least according to the page turning times of the searching pages corresponding to the historical keywords, the searching pages where the document information corresponding to the historical documents is located and the document positions of the document information corresponding to the historical documents in the searching pages where the document information is located;
and adding the document searching path to the path library.
The above method, preferably, further comprises:
obtaining document keywords of the historical documents;
obtaining keyword supplementary items corresponding to the historical keywords in the document keywords;
and adding the keyword supplement item into a document search path corresponding to the historical document.
Preferably, the adjusting, according to the document position represented by the target search path, at least the search page where the document information corresponding to the target document is located includes:
determining a first document corresponding to the target search path in the target document according to the document position represented by the target search path;
and adjusting the document information corresponding to the first document from the current page where the first document is located to the first position on the target page, so that the document corresponding to the first position and the document ranked after the first position are ranked after the first document in the document search result.
In the above method, preferably, after adjusting the document information corresponding to the first document from the current page to the first position on the target page, the method further includes:
monitoring whether a hit determination operation for the first document is received;
and under the condition that a hit determining operation aiming at the first document is received, generating a document searching path corresponding to the searching keyword and the first document according to the searching keyword, the first document and the first position.
The above method, preferably, further comprises:
monitoring whether a hit-discard operation for the first document is received;
under the condition that a hit-discard operation for the first document is received, obtaining a second document which meets the association relation with the first document in the target document;
and adjusting the document information corresponding to the second document from the current page where the document information is located to a second position associated with the first position.
In the above method, preferably, the second position is a position in the document search result after the first position and adjacent to the first position, so that the document corresponding to the second position and the document ranked after the second position are ranked after the second document in the document search result;
wherein the method further comprises:
monitoring whether a hit determination operation for the second document is received;
and under the condition that a hit determining operation aiming at the second document is received, generating a document searching path corresponding to the searching keyword and the second document according to the searching keyword, the second document and the second position.
Preferably, the association relationship includes:
the document identification corresponding to the first document comprises the document identification corresponding to the second document.
An optimization processing apparatus of document search, the apparatus comprising:
a request obtaining unit, configured to obtain a document search request, wherein the document search request comprises at least one search keyword;
a result obtaining unit, configured to obtain a document search result in a document library according to the search keyword; the document search result comprises a plurality of target documents;
the page output unit is used for outputting the document information corresponding to the target document through at least two search pages, and the target page in the at least two search pages is in a visible state;
a path obtaining unit, configured to obtain, in a path library, a target search path matching the search keyword; the path library comprises a plurality of document searching paths, and the document searching paths at least represent document positions of document information corresponding to corresponding historical documents in historical searching results;
and the page adjusting unit is used for at least adjusting the search page where the document information corresponding to the target document is located according to the document position represented by the target search path, so that the document information corresponding to the first document corresponding to the target search path in the target document is displayed in the target page.
An electronic device, comprising:
a memory for storing a computer program and data generated by the execution of the computer program;
a processor for executing the computer program to implement: obtaining a document search request, wherein the document search request comprises at least one search keyword; obtaining a document search result in a document library according to the search keyword; the document search result comprises a plurality of target documents; outputting document information corresponding to the target document through at least two search pages, wherein the target page in the at least two search pages is in a visible state; obtaining a target search path matched with the search keyword in a path library; the path library comprises a plurality of document searching paths, and the document searching paths at least represent document positions of document information corresponding to corresponding historical documents in historical searching results; and at least adjusting the search page where the document information corresponding to the target document is located according to the document position represented by the target search path, so that the document information corresponding to the first document corresponding to the target search path in the target document is displayed in the target page.
According to the technical solution above, after a document search request is obtained, a plurality of target documents are obtained from the document library according to the search keyword in the request, and the document information corresponding to the target documents is displayed across a plurality of search pages, of which only the target page is in a visible state. A document search path corresponding to the search keyword is then obtained. This path represents the position, in a historical search result, of the document information of a document hit in a historical search, and that position is used to adjust the search page on which the document information of the target documents is located, so that the document information of the first document, which the search keyword hit in the historical search, is in a visible state. In other words, document information in the current document search result is located by the position that the hit document's information occupied in the historical search result, and the located document information is ranked into the visible search page, so that the user can find the document information of the historically hit document promptly and document search efficiency is improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of an optimization processing method for document search according to an embodiment of the present disclosure;
FIG. 2, FIG. 3, FIG. 4, FIG. 5, and FIG. 6 are partial flow charts of a method for optimizing a document search according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an apparatus for optimizing document search according to a second embodiment of the present application;
FIG. 8 is a schematic structural diagram of another optimization processing apparatus for document search according to the second embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device according to a third embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
FIG. 1 is a flowchart of an implementation of an optimization processing method for document search according to an embodiment of the present application. The method may be applied to an electronic device capable of data processing, such as a computer or a server. The technical solution in this embodiment is mainly used to improve document search efficiency.
Specifically, the method in this embodiment may include the following steps:
Step 101: obtain a document search request.
The document search request comprises at least one search keyword. A search keyword may be the name of the document to be searched, or a character or a word in its title. There may be one or more search keywords.
For example, the document search request includes search keywords such as "interest rate", "interest", "bank", "business", and "help".
Step 102: obtain a document search result in the document library according to the search keyword.
Wherein the document search result comprises a plurality of target documents.
Specifically, the document library is a database in which a plurality of documents are stored, and after receiving a document search request, documents matched with the search keyword can be searched in the document library according to the search keyword, so that a plurality of target documents are obtained. The matching of the target document with the search keyword means: the target document contains the search keyword, or the similarity between the document characters in the target document and the search keyword is higher than a similarity threshold.
For example, according to search keywords such as "interest rate", "interest", "bank", "business", and "help", 100 target documents are retrieved from the document repository to match the search keywords.
Step 103: output the document information corresponding to the target documents through at least two search pages, wherein the target page among the at least two search pages is in a visible state.
The document information may be a jump link of the corresponding document. The jump link points to the storage location of the corresponding document, so that when a user clicks the jump link, the corresponding document can be read and output to the user.
In addition, because the output area of the electronic device is limited, the plurality of target documents can be displayed in pages, and only one target page among the resulting search pages is in a visible state. Correspondingly, each search page has a page turning control; when a page turning control is clicked, the search pages are switched so that the selected search page becomes the target page and is in the visible state.
For example, in the present embodiment, the retrieved 100 target documents are output in a display manner of 10 jump links per page, whereby jump links of the 100 target documents are output through 10 search pages, and only the jump links of the 10 target documents on one of the pages are in a visible state at any time.
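The paging in this example can be pictured with a short sketch. The function names `paginate` and `visible_links` and the 1-based page numbering are illustrative assumptions, not part of the application.

```python
def paginate(links: list, page_size: int = 10) -> list:
    """Split the jump links of the target documents into search pages."""
    return [links[i:i + page_size] for i in range(0, len(links), page_size)]


def visible_links(pages: list, target_page: int = 1) -> list:
    """Return the jump links on the target page, the only visible page.

    target_page is 1-based; selecting another value models a click on
    that page's page turning control.
    """
    return pages[target_page - 1]
```

For 100 jump links and 10 links per page, `paginate` yields 10 search pages, and `visible_links(pages, 2)` returns the 10 links shown after the user turns to the second page.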
Step 104: obtain a target search path matching the search keyword in the path library.
The path library comprises a plurality of document search paths, and each document search path at least represents the document position, in the history search result, of the document information corresponding to a history document. The document position here may include: the page order, within the history search result, of the search page that contained the document information of the history document when it was hit, and the order of that document information within that search page.
For example, if the jump link of document A was in the third row of the second page when document A was hit in the history search, then the document search path corresponding to document A indicates that the jump link of document A is in the third row of the second search page in the history search result.
Based on this, in this embodiment, a document search path, i.e., a target search path, matching the search keyword may be found in the path library according to the search keyword.
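One possible shape for a document search path and for the lookup in Step 104 is sketched below. The field names, the `SearchPath` class, and the rule "pick the path sharing the most keywords with the request" are assumptions made for illustration; the source only states that the target search path matches the search keyword.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SearchPath:
    """A document search path (hypothetical layout)."""
    keywords: frozenset   # lower-cased keywords that lead to this path
    document_id: str      # the history document that was hit
    page: int             # 1-based page order in the history search result
    row: int              # 1-based row on that search page


def find_target_path(path_library: Iterable,
                     search_keywords: Iterable) -> Optional[SearchPath]:
    """Return the path sharing the most keywords with the request, or None."""
    wanted = {keyword.lower() for keyword in search_keywords}
    best, best_overlap = None, 0
    for path in path_library:
        overlap = len(wanted & path.keywords)
        if overlap > best_overlap:
            best, best_overlap = path, overlap
    return best
```

A path for document A in the third row of the second page would be `SearchPath(frozenset({"interest rate", "bank"}), "A", page=2, row=3)`; a request for "interest rate" would return it, while a request with no shared keyword returns `None`.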
Step 105: adjust at least the search page on which the document information corresponding to the target documents is located according to the document position represented by the target search path, so that the document information of the first document, namely the target document corresponding to the target search path, is displayed in the target page.
Specifically, in this embodiment, the document information corresponding to the first document corresponding to the target search path in the target document may be adjusted from the current page where the first document is located to the target page, so that the document information of the first document may be in a visible state.
As can be seen from the foregoing technical solution, in the optimization processing method for document search provided in this embodiment of the present application, after a document search request is obtained, a plurality of target documents are obtained from the document library according to the search keyword in the request, and the document information corresponding to the target documents is displayed across a plurality of search pages, of which only the target page is in a visible state. A document search path corresponding to the search keyword is then obtained. This path represents the position, in a history search result, of the document information of a document hit in a history search, and that position is used to adjust the search page on which the document information of the target documents is located, so that the document information of the first document, which the search keyword hit in the history search, is in a visible state. In this embodiment, therefore, document information in the current document search result is located by the position that the hit document's information occupied in the history search result, and the located document information is ranked into the visible search page, so that the user can find the document information of the historically hit document promptly and document search efficiency is improved.
In an implementation manner, the present embodiment may further include the following steps to establish a path library, as shown in fig. 2:
Step 201: obtain the history keywords and the history documents hit by the history keywords.
The history keywords are original input keywords of users in history search, and the history documents hit by the history keywords are documents determined by searching in a document library by using the history keywords in the history search.
Specifically, in this embodiment, history search records of the document may be read, and history keywords and history documents hit by the history keywords may be obtained according to the history search records.
Step 202: generate a document search path corresponding to the history keywords and the history document at least according to the number of page turns of the search pages corresponding to the history keywords, the search page on which the document information corresponding to the history document is located, and the document position of that document information within that search page.
In this embodiment, each history search records information such as the user's history keywords, document information click records, number of search page turns, and target result (the hit history document), and this information forms the history search record. The document information click record can be understood as follows: before hitting the target result, the user selects one or more pieces of document information to check whether the corresponding document is the required one, and finally selects the history document as the target result. On this basis, the document information click record, the number of search page turns, and the hit history document corresponding to a history keyword can be obtained from the history search record and analyzed to determine the page order, among all search pages, of the search page containing the document information of the hit history document, as well as the document position of that document information within that search page.
Step 203: the document search path is added to the path library.
Therefore, as the number of historical search times increases, the path library stores the document search paths corresponding to the historical keywords and the hit historical documents corresponding to the historical keywords.
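Steps 201 to 203 can be sketched as turning each history search record into a path entry. The record layout and the function names are hypothetical, and the sketch assumes that the user starts on the first search page and that every page turn moves one page forward, so the page order is the number of page turns plus one; the source does not state how the page order is derived.

```python
from dataclasses import dataclass


@dataclass
class HistoryRecord:
    """One history search record (hypothetical layout)."""
    keywords: list          # history keywords the user originally typed
    page_turns: int         # page turns before the target result was hit
    hit_document_id: str    # the history document chosen as target result
    row_on_page: int        # 1-based row of its jump link on that page


def build_search_path(record: HistoryRecord) -> dict:
    """Step 202: derive a document search path from a history record."""
    return {
        "keywords": sorted({k.lower() for k in record.keywords}),
        "document_id": record.hit_document_id,
        # Assumption: start on page 1, each turn advances one page.
        "page": record.page_turns + 1,
        "row": record.row_on_page,
    }


def add_to_path_library(path_library: list, record: HistoryRecord) -> None:
    """Step 203: add the generated document search path to the path library."""
    path_library.append(build_search_path(record))
```

A record with one page turn and a hit in the third row produces a path pointing at the third row of the second search page, which is the document A example used in this embodiment.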
Based on the above implementation, after step 202, the following steps may be further included in this embodiment, as shown in fig. 3:
Step 204: obtain the document keywords of the history document.
In this embodiment, the detailed document contents of the history document, such as its title, keyword list, and document tags, may be extracted, and keywords are then extracted from these contents to obtain the document keywords.
Step 205: obtain, among the document keywords, the keyword supplement items corresponding to the history keywords.
In this embodiment, the similarity between each document keyword and the history keywords may be calculated, and the document keywords whose similarity to the history keywords is lower than a threshold are used as the keyword supplement items corresponding to the history keywords. For example, all document keywords that differ from the history keywords are used as keyword supplement items.
Step 206: add the keyword supplement items to the document search path corresponding to the history document.
That is, the document search path includes, in addition to the history keyword, another keyword different from the history keyword in the history document, and thus the document search path can be retrieved by more keywords.
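The selection of keyword supplement items in Step 205 might look like the sketch below. The similarity measure (`difflib.SequenceMatcher`), the threshold value, and the function name `supplement_items` are assumptions; the source only requires that the similarity to the history keywords be lower than a threshold.

```python
from difflib import SequenceMatcher


def supplement_items(document_keywords: list, history_keywords: list,
                     threshold: float = 0.5) -> list:
    """Return the document keywords that are dissimilar to every history keyword.

    A document keyword is kept when its best similarity to any history
    keyword is below the (assumed) threshold.
    """
    items = []
    for doc_keyword in document_keywords:
        best = max(
            (SequenceMatcher(None, doc_keyword.lower(), h.lower()).ratio()
             for h in history_keywords),
            default=0.0,
        )
        if best < threshold:
            items.append(doc_keyword)
    return items
```

If a history document carries the keywords "interest rate", "deposit", and "loan", and the user originally searched for "interest rate", then "deposit" and "loan" become supplement items, so a later search for either of them can also reach this document search path.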
In one implementation manner, when the search page where the document information corresponding to the target document is located is adjusted in step 105, the following manner may be specifically implemented, as shown in fig. 4:
Step 401: determine, among the target documents, the first document corresponding to the target search path according to the document position represented by the target search path.
Step 402: move the document information corresponding to the first document from the page on which it is currently located to a first position on the target page, so that the document originally at the first position and the documents ranked after it are ranked after the first document in the document search result.
The first position may be any position in the target page, or the first position may be the first N positions in the target page, where N is a positive integer greater than or equal to 1, or the first position may be the last N positions in the target page.
Therefore, in the embodiment, the document information corresponding to the first document is moved forward to the first position in the target page, so that the document information corresponding to the other documents at the first position and the later positions are all moved backward in sequence, that is, in the document search result, the first document is moved forward, and the corresponding other documents are correspondingly moved backward, so as to optimize the output position of the document information corresponding to the first document, so that the document information corresponding to the first document is in a visible state, and a user can view the document information corresponding to the first document in time.
For example, the target search path matched by the search keyword indicates that the jump link of document A is in the third row of the second search page in the historical search result. On this basis, the jump link of document A is located in the current document search result according to the target search path. Because that jump link is in the third row of the second search page and is not in a visible state, in this embodiment it is moved from the third row of the second search page to the second row of the first search page, which is currently in the visible state. The jump link in the first row of the first search page stays in place, and the jump links of the documents that follow each move back by one position. The jump link of document A is thus moved forward to a position visible to the user, and the user does not need to perform a page turning operation.
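The move described in this example can be sketched as a reordering of the flat result list. The function name `promote`, the default first position (second row of the first page), and the page size of 10 are illustrative assumptions taken from the example rather than fixed by the method.

```python
def promote(results: list, document_id: str, target_page: int = 1,
            target_row: int = 2, page_size: int = 10) -> list:
    """Return a new ranking with document_id moved to the first position.

    Pages and rows are 1-based. Documents at and after the first position
    shift back by one; documents before it keep their place.
    """
    if document_id not in results:
        return list(results)
    reordered = [doc for doc in results if doc != document_id]
    index = (target_page - 1) * page_size + (target_row - 1)
    reordered.insert(index, document_id)
    return reordered
```

With 100 results numbered `doc1` to `doc100`, the document in the third row of the second page is `doc13`; after `promote(results, "doc13")` it sits in the second row of the first page, `doc1` is unchanged, and `doc2` through `doc12` have each moved back by one position.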
In one implementation, after step 402, the following steps may be further included in this embodiment, as shown in fig. 5:
Step 403: monitor whether a hit determination operation for the first document is received; if a hit determination operation for the first document is received, perform step 404.
The hit determination operation may be understood as an operation of reading at least part of the document content of the first document, for example, clicking the document information of the first document and copying some of the text in the first document, clicking the document information of the first document and downloading the first document, or an operation of downloading the first document. The hit determination operation indicates that the first document is the document required by the current search.
Step 404: generate a document search path corresponding to the search keyword and the first document according to the search keyword, the first document, and the first position.
The manner of generating the document search path in step 404 may refer to the corresponding content in the foregoing.
That is, in this embodiment, after the output position of the document information corresponding to the first document is optimized, the document search path corresponding to the first document is optimized as well, and the optimized document search path corresponding to the search keyword and the first document is added to the path library.
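A minimal sketch of Steps 403 and 404 is given below, assuming the path library is a dictionary keyed by the keyword set and the document, so that a confirmed hit at a new position overwrites the older path. The key layout and the name `record_hit` are hypothetical; the source does not specify how paths are stored or replaced.

```python
def record_hit(path_library: dict, search_keywords: list,
               document_id: str, page: int, row: int) -> dict:
    """Store or refresh the document search path after a hit determination.

    The path is keyed by (keywords, document), so a later confirmed hit at a
    new position replaces the earlier entry (an assumed storage policy).
    """
    key = (frozenset(k.lower() for k in search_keywords), document_id)
    path_library[key] = {"page": page, "row": row}
    return path_library
```

For instance, if document A was first recorded in the third row of the second page and the user later confirms it at the second row of the first page, calling `record_hit` again with the same keyword leaves a single entry that points at the new position.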
In one implementation, after step 402, the method in this embodiment may further include the following steps, as shown in fig. 6:
Step 405: monitor whether a hit-discard operation for the first document is received; if a hit-discard operation for the first document is received, perform step 406.
The hit-discard operation may be an operation that ignores the first document, for example: closing the document page of the first document after clicking its document information; clicking the document information of the first document without then performing any browsing or closing operation on the first document; or browsing the search page where the first document is located without clicking the first document. The hit-discard operation indicates that the first document is not the document required by the current search.
Step 406: obtaining, among the target documents, a second document that satisfies the association relation with the first document.
The association relation may be: the document identification corresponding to the first document includes the document identification corresponding to the second document, or the similarity between the document identification corresponding to the first document and the document identification corresponding to the second document is higher than an association threshold. Based on this, a second document associated with the first document is retrieved from the target documents; the second document can be understood as a document that is highly related to the first document and is presumed to be required for the current search.
Specifically, the document identification may be understood as the path identifier of the document, such as the document ID or the document name.
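The association relation can be sketched as follows. The threshold value, the use of `difflib.SequenceMatcher` as the similarity measure, and the names `is_associated` and `find_second_document` are assumptions made for illustration; the application does not specify a similarity function or a threshold.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional

ASSOCIATION_THRESHOLD = 0.8  # assumed value; the source does not fix one


def is_associated(first_id: str, second_id: str,
                  threshold: float = ASSOCIATION_THRESHOLD) -> bool:
    """True if first_id contains second_id, or the two are similar enough."""
    if second_id and second_id in first_id:
        return True
    similarity = SequenceMatcher(None, first_id, second_id).ratio()
    return similarity > threshold


def find_second_document(first: Dict[str, str],
                         targets: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return the first target document associated with `first`, if any."""
    for candidate in targets:
        if candidate is first:
            continue  # the first document is not its own second document
        if is_associated(first["name"], candidate["name"]):
            return candidate
    return None
```

Here the document name serves as the document identification; a document ID could be compared in the same way.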
Step 407: adjusting the document information corresponding to the second document from the current page where it is located to a second position associated with the first position.
In one implementation, the second position may be the position in the document search result that is after and adjacent to the first position, so that the document at the second position and the documents ranked after it are ranked after the second document in the document search result.
Specifically, the manner of adjusting the second document may refer to the manner of adjusting the first document in the foregoing.
Based on the above implementation, in this embodiment, whether a hit determination operation for the second document is received may be further monitored, and in a case where the hit determination operation for the second document is received, a document search path corresponding to the search keyword and the second document is generated according to the search keyword, the second document, and the second position.
That is, if the second document is hit in the present embodiment, a corresponding new document search path is generated and added to the path library for the next use.
In another implementation, the second position may be the position in the document search result that is before and adjacent to the first position, so that the document at the first position and the documents ranked after it are ranked after the second document in the document search result.
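The two placements of the second document (immediately after or immediately before the first document) can be sketched on a flat result list. The function `move_adjacent` and its `after` flag are hypothetical names; the sketch assumes both documents are present in the result list.

```python
from typing import List


def move_adjacent(results: List[str], first: str, second: str,
                  after: bool = True) -> List[str]:
    """Return a new ordering with `second` placed next to `first`.

    after=True  -> second sits directly after first (first implementation)
    after=False -> second sits directly before first (second implementation)
    """
    reordered = [doc for doc in results if doc != second]
    anchor = reordered.index(first)
    reordered.insert(anchor + 1 if after else anchor, second)
    return reordered


results = ["A", "B", "C", "D", "E"]
print(move_adjacent(results, first="A", second="D"))
print(move_adjacent(results, first="A", second="D", after=False))
```

In both cases the documents previously at and after the insertion point are shifted back by one place, which matches the ranking effect described above.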
It should be noted that, because the position of the first document is adjusted, the document search path corresponding to the first document in the path library may be updated accordingly; that is, a new document search path is generated according to the current new position of the first document and the search keyword.
Referring to fig. 7, a schematic structural diagram of an optimization processing apparatus for document search according to the second embodiment of the present application is provided, where the apparatus may be configured in an electronic device capable of performing data processing, such as a computer or a server. The technical scheme in the embodiment is mainly used for improving the document searching efficiency.
Specifically, the apparatus in this embodiment may include the following units:
a request obtaining unit 701, configured to obtain a document search request, where the document search request includes at least one search keyword;
a result obtaining unit 702, configured to obtain a document search result in a document library according to the search keyword; the document search result comprises a plurality of target documents;
a page output unit 703, configured to output document information corresponding to the target document through at least two search pages, where a target page in the at least two search pages is in a visible state;
a path obtaining unit 704, configured to obtain, in a path library, a target search path matching the search keyword; the path library comprises a plurality of document searching paths, and the document searching paths at least represent document positions of document information corresponding to corresponding historical documents in historical searching results;
a page adjusting unit 705, configured to at least adjust the search page where the document information corresponding to the target document is located according to the document position represented by the target search path, so that the document information corresponding to the first document corresponding to the target search path in the target document is displayed in the target page.
It can be seen from the foregoing technical solution that, in the optimization processing apparatus for document search provided in the second embodiment of the present application, after the document search request is obtained, a plurality of target documents are obtained in the document library according to the search keyword in the request, and the document information corresponding to the target documents is displayed through a plurality of search pages, with only the document information in the target page being in a visible state. A document search path corresponding to the search keyword is then obtained, and the document position that the document search path represents, namely the position in the historical search result of the document information of a document hit in a historical search, is used to adjust the search page where the document information corresponding to the target documents is located, so that the document information corresponding to the first document hit by the search keyword in the historical search is in a visible state. Therefore, in this embodiment, the document information in the current document search result is located according to the document position of the document information corresponding to the hit document in the historical search result, and the located document information is placed in the search page that is in the visible state, so that the user can find the document information corresponding to the historically hit document in time, and the document search efficiency is improved.
In one implementation, the apparatus in this embodiment may further include the following units, as shown in fig. 8:
a path processing unit 706, configured to: obtain a historical keyword and a historical document hit by the historical keyword; generate a document search path corresponding to the historical keyword and the historical document at least according to the number of page turns of the search pages corresponding to the historical keyword, the search page where the document information corresponding to the historical document is located, and the document position of that document information within the search page; and add the document search path to the path library.
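As a rough illustration of what the path processing unit records, the following sketch builds a document search path from a historical result list. The `SearchPath` fields, the `PathLibrary` class, and the function `build_path` are hypothetical; the application names the three inputs (page turns, search page, position within the page) but not a concrete data layout.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SearchPath:
    """Hypothetical layout of one document search path."""
    keyword: str
    document_id: str
    page_turns: int        # how many times the user turned pages
    page_number: int       # 1-based search page holding the document
    position_in_page: int  # 1-based position within that page


class PathLibrary:
    """Minimal in-memory path library keyed by search keyword."""

    def __init__(self) -> None:
        self._paths: Dict[str, List[SearchPath]] = {}

    def add(self, path: SearchPath) -> None:
        self._paths.setdefault(path.keyword, []).append(path)

    def match(self, keyword: str) -> List[SearchPath]:
        return list(self._paths.get(keyword, []))


def build_path(keyword: str, document_id: str, result_ids: List[str],
               page_size: int, page_turns: int) -> SearchPath:
    """Derive page number and in-page position from the historical ranking."""
    index = result_ids.index(document_id)
    return SearchPath(keyword, document_id, page_turns,
                      index // page_size + 1, index % page_size + 1)
```

Matching by exact keyword is a simplification; a real path library might match keywords more loosely.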
In one implementation, the path processing unit 706 is further configured to: obtaining document keywords of the historical documents; obtaining keyword supplementary items corresponding to the historical keywords in the document keywords; and adding the keyword supplement item into a document search path corresponding to the historical document.
In an implementation manner, the page adjusting unit 705 is specifically configured to: determining a first document corresponding to the target search path in the target document according to the document position represented by the target search path; and adjusting the document information corresponding to the first document to a first position on the target page from the current page where the document information is located, so that the document corresponding to the first position and the document ranked after the first position are ranked after the first document in the document search result.
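The adjustment performed by the page adjusting unit can be sketched as follows, assuming that the target search path stores a 1-based page number and in-page position and that the first position is the top of the first (visible) page. The function names and the default page size of 5 are illustrative assumptions.

```python
from typing import List


def paginate(ids: List[str], page_size: int) -> List[List[str]]:
    """Split a ranked result list into search pages."""
    return [ids[i:i + page_size] for i in range(0, len(ids), page_size)]


def apply_search_path(result_ids: List[str], path_page: int, path_position: int,
                      page_size: int = 5, first_index: int = 0) -> List[List[str]]:
    """Move the document found at the path's position to the first position."""
    index = (path_page - 1) * page_size + (path_position - 1)
    if index >= len(result_ids):
        # The historical position does not exist in the current results.
        return paginate(list(result_ids), page_size)
    ordered = list(result_ids)
    first_document = ordered.pop(index)
    # Documents at and after first_index shift back by one place.
    ordered.insert(first_index, first_document)
    return paginate(ordered, page_size)
```

With 12 results and a path pointing at page 3, position 2, the twelfth document is brought to the top of the first page and every other document moves back by one.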
In one implementation, the page adjusting unit 705, after adjusting the document information corresponding to the first document from the current page to the first position on the target page, is further configured to: monitor whether a hit determination operation for the first document is received; and, in a case where a hit determination operation for the first document is received, trigger the path processing unit 706 to generate a document search path corresponding to the search keyword and the first document according to the search keyword, the first document, and the first position.
In one implementation, the page adjusting unit 705 is further configured to: monitor whether a hit rejection operation for the first document is received; in a case where a hit rejection operation for the first document is received, obtain, among the target documents, a second document that satisfies the association relation with the first document; and adjust the document information corresponding to the second document from the current page where it is located to a second position associated with the first position.
In one implementation, the second position is the position in the document search result that is after and adjacent to the first position, so that the document at the second position and the documents ranked after it are ranked after the second document in the document search result;
wherein the page adjusting unit 705 is further configured to: monitor whether a hit determination operation for the second document is received; and, in a case where a hit determination operation for the second document is received, trigger the path processing unit 706 to generate a document search path corresponding to the search keyword and the second document according to the search keyword, the second document, and the second position.
In a preferred embodiment, the association relationship includes: the document identification corresponding to the first document comprises the document identification corresponding to the second document.
It should be noted that, for the specific implementation of each unit in the present embodiment, reference may be made to the corresponding content in the foregoing, and details are not described here.
Referring to fig. 9, a schematic structural diagram of an electronic device according to a third embodiment of the present application is provided, where the electronic device may be an electronic device capable of performing data processing, such as a computer or a server. The technical scheme in the embodiment is mainly used for improving the document searching efficiency.
Specifically, the electronic device in this embodiment may include the following structure:
a memory 901 for storing a computer program and data generated by the computer program;
a processor 902 for executing the computer program to implement: obtaining a document search request, wherein the document search request comprises at least one search keyword; obtaining a document search result in a document library according to the search keyword; the document search result comprises a plurality of target documents; outputting document information corresponding to the target document through at least two search pages, wherein the target page in the at least two search pages is in a visible state; obtaining a target search path matched with the search keyword in a path library; the path library comprises a plurality of document searching paths, and the document searching paths at least represent document positions of document information corresponding to historical documents corresponding to the document searching paths in historical searching results; and at least adjusting the search page where the document information corresponding to the target document is located according to the document position represented by the target search path, so that the document information corresponding to the first document corresponding to the target search path in the target document is displayed in the target page.
It can be seen from the foregoing technical solutions that, in an electronic device provided in the third embodiment of the present application, after a document search request is obtained, a plurality of target documents are obtained in a document library according to a search keyword therein, document information corresponding to the plurality of target documents is displayed through a plurality of search pages, and only document information in one of the target pages is in a visible state, and then, a document search path corresponding to the search keyword is obtained, so that a document position of document information corresponding to a document hit in a history search represented by the document search path in a history search result is used to adjust a search page where the document information corresponding to the target document is located, so that the document information corresponding to a first document hit by the search keyword in the history search is in a visible state. Therefore, in the embodiment, the document information in the current document search result is positioned according to the document position of the document information corresponding to the hit document in the history search result, and the positioned document information is sequenced in the search page in the visible state in the document search result, so that the user can find the document information corresponding to the history hit document in time, and the document search efficiency is improved.
Taking a scenario in which clients search a bank's document library as an example: when different clients use the document library for retrieval, they may input the same or different retrieval conditions. When a client's input conditions are insufficient, the search results are inevitably inaccurate, and the client has to turn a large number of pages to find the required document. The existing document library cannot dynamically record the user's page-turning process or the process of entering supplementary keywords, and therefore cannot provide support for subsequent retrievals by similar clients, which hinders improvement of the client experience.
In view of this, based on the present application, this embodiment provides a search optimization scheme based on document library playback. After a client completes a search, the search process is played back offline using the client's search history and operation records. After the playback is completed, the document result path is optimized by dynamically analyzing the client's input and target result set, and a path identifier is generated for each optimized path. These path identifiers are matched during a client's online retrieval, so that the client's hit document set is optimized automatically, which helps the client locate the desired document quickly and reduces the number of invalid operations.
The scheme in this embodiment mainly comprises two parts:
An offline playback module: dynamically optimizes the client's click path according to the client's input and hit results, and generates path identifiers and association relations according to document sampling.
An online application module: optimizes the hit result set during a client search by using an identification-matching fast approximation method, helping the client approach the target result quickly.
Based on this, this embodiment establishes a document library playback search optimization method based on search history. The method can automatically play back the client's search history, optimize and associate the client's click paths, and thereby optimize the client's document hit set. The specific scheme is as follows:
When a client uses the document library to search, the system records the client's original input keywords, document click records, number of page turns, target result, page number of the target result, and so on, and transmits the records to the offline playback module, which generates an optimized path table. When a client subsequently uses the document library to search, matching and click optimization of the result set are performed automatically according to the optimized path table.
The offline playback module is responsible for click record playback and search optimization for the client. The specific process is as follows. First, an original keyword M and a target result D are extracted from the search information recorded by the system, and the client's retrieval path is generated from the retrieval record as the original path of the pair (M, D). Second, for the target result D, the detailed content of the document is obtained, including element items such as the title, the keyword list, and the labels; the similarity between each element item and M is calculated; the element item value with the maximum similarity difference is taken as a supplementary item of M; the path from M to D is calculated again; and the result is recorded as a supplementary keyword path p.
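The selection of a supplementary item can be sketched under two assumptions that the application does not confirm: similarity is measured as word-level Jaccard overlap, and "maximum similarity difference" is read as the element item whose content differs most from M (largest value of 1 minus similarity). The names `jaccard` and `pick_supplement` are hypothetical.

```python
from typing import Dict


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings (assumed measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def pick_supplement(keyword: str, elements: Dict[str, str]) -> str:
    """Return the element value that differs most from the keyword M."""
    return max(elements.values(), key=lambda value: 1.0 - jaccard(keyword, value))


# Element items of a hypothetical target result D.
elements = {
    "title": "loan rate",
    "keywords": "loan",
    "label": "mortgage prepayment",
}
print(pick_supplement("loan", elements))
```

Other readings are possible, for instance choosing the most similar element that is not identical to M; the sketch only makes one reading concrete.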
In the online application module, when the document element item to be retrieved is similar to the keyword result M, optimizing page turning results in a hit result set, advancing the paging item where D is located, and recording the results as an optimized path q.
In the process of generating the optimized paths p and q, the hit document set is sampled; for example, the (5 × n + 1)-th document is sampled, where n is the page number and each page holds the information of 5 documents. The ID and the document name of each sampled document are used as its path identifier, and the path identifiers are compared with all document identifiers in p and q. If the identifier of optimized path 1 (i.e., the sampled document) includes the identifier of optimized path 2 (e.g., the document corresponding to p), an association relation is established between path 1 and path 2; that is, when optimized path 2 is applied (the document is moved forward) and path 1 changes as a result, optimized path 1 is applied as well.
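The sampling and identifier comparison can be sketched as follows. The sketch assumes that n starts at 0, so that the (5 × n + 1)-th document is the first document of each page, and that "includes" means substring containment of either the ID or the name. The dictionary layout and function names are hypothetical.

```python
from typing import Dict, List, Tuple

Doc = Dict[str, str]  # assumed layout: {"id": ..., "name": ...}


def sample_documents(results: List[Doc], page_size: int = 5) -> List[Doc]:
    """Sample the (page_size * n + 1)-th document for n = 0, 1, 2, ..."""
    return results[::page_size]


def identifier_contains(path1: Doc, path2: Doc) -> bool:
    """True if path 1's identifier (ID or name) contains path 2's."""
    return path2["id"] in path1["id"] or path2["name"] in path1["name"]


def build_associations(sampled: List[Doc],
                       optimized: List[Doc]) -> List[Tuple[str, str]]:
    """Pair each sampled document with the optimized documents it contains."""
    return [(s["id"], o["id"])
            for s in sampled
            for o in optimized
            if s["id"] != o["id"] and identifier_contains(s, o)]
```

For example, a sampled document named "Personal Loan Guide 2022" would be associated with an optimized document named "Personal Loan Guide", but not with one named "Card Fees".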
When a client uses the system, the generated optimized path table can be applied online. The online application optimizes the hit document set by identification matching and fast approximation, as follows. First, the document library is searched with the native matching method. During the client's clicking process, the documents are dynamically sampled and their identifiers are matched; when a document hits an identifier during clicking, the result set is automatically adjusted according to the hit optimized path, which facilitates the client's subsequent selection from the result set. If the required document is not hit, hit optimization continues along the associated optimized paths according to the association relation until the result set required by the client is hit.
The optimization processing method and apparatus for document search and the electronic device provided by the present invention can be used in the big data field or in other fields; for example, they can be used in massive-data search scenarios in the big data field. Other fields are any fields other than the financial field, for example, the distributed field, the cloud computing field, the artificial intelligence field, and the internet of things field. The above description is only an example and does not limit the application fields of the method, the apparatus, and the electronic device provided by the present invention.
In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and for relevant details reference may be made to the description of the method.
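The online fast approximation can be sketched as a loop that promotes the document of the matched optimized path and, on rejection, follows the association relation to the next candidate. The function names, the `associations` mapping, and the `accepts` callback (standing in for the client's hit determination) are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


def promote(results: List[str], doc_id: str, index: int) -> List[str]:
    """Return a new ordering with doc_id moved to the given index."""
    reordered = [doc for doc in results if doc != doc_id]
    reordered.insert(index, doc_id)
    return reordered


def fast_approximate(results: List[str], start_doc: str,
                     associations: Dict[str, str],
                     accepts: Callable[[str], bool]) -> Tuple[List[str], Optional[str]]:
    """Promote candidates along the association chain until one is accepted."""
    ordered = list(results)
    candidate: Optional[str] = start_doc
    slot = 0            # next free position at the top of the result list
    seen = set()        # guards against cycles in the association chain
    while candidate is not None and candidate in ordered and candidate not in seen:
        seen.add(candidate)
        ordered = promote(ordered, candidate, slot)
        if accepts(candidate):
            return ordered, candidate
        slot += 1       # a rejected document stays put; the next goes after it
        candidate = associations.get(candidate)
    return ordered, None
```

A rejected candidate keeps its promoted place and the associated document is placed directly after it, which corresponds to the "after and adjacent" placement described earlier; the "before and adjacent" variant would insert at `slot - 1` instead.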
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An optimization processing method for document searching is characterized by comprising the following steps:
obtaining a document search request, wherein the document search request comprises at least one search keyword;
obtaining a document search result in a document library according to the search keyword; the document search result comprises a plurality of target documents;
outputting document information corresponding to the target document through at least two search pages, wherein the target page in the at least two search pages is in a visible state;
obtaining a target search path matched with the search keyword in a path library; the path library comprises a plurality of document searching paths, and the document searching paths at least represent document positions of document information corresponding to corresponding historical documents in historical searching results;
and at least adjusting the search page where the document information corresponding to the target document is located according to the document position represented by the target search path, so that the document information corresponding to the first document corresponding to the target search path in the target document is displayed in the target page.
2. The method of claim 1, further comprising:
obtaining historical keywords and historical documents hit by the historical keywords;
generating the historical keywords and the document searching path corresponding to the historical documents at least according to the page turning times of the searching pages corresponding to the historical keywords, the searching pages where the document information corresponding to the historical documents is located and the document positions of the document information corresponding to the historical documents in the searching pages where the document information is located;
and adding the document searching path to the path library.
3. The method of claim 2, further comprising:
obtaining document keywords of the historical document;
obtaining keyword supplementary items corresponding to the historical keywords in the document keywords;
and adding the keyword supplement item into a document search path corresponding to the historical document.
4. The method according to claim 1 or 2, wherein the adjusting at least the search page where the document information corresponding to the target document is located according to the document position represented by the target search path includes:
determining a first document corresponding to the target search path in the target document according to the document position represented by the target search path;
and adjusting the document information corresponding to the first document to a first position on the target page from the current page where the document information is located, so that the document corresponding to the first position and the document ranked after the first position are ranked after the first document in the document search result.
5. The method of claim 4, wherein after adjusting document information corresponding to the first document from the current page to the first location on the target page, the method further comprises:
monitoring whether a hit determination operation for the first document is received;
and under the condition that a hit determining operation aiming at the first document is received, generating a document searching path corresponding to the searching keyword and the first document according to the searching keyword, the first document and the first position.
6. The method of claim 4, further comprising:
monitoring whether a hit rejection operation for the first document is received;
under the condition that a hit rejection operation aiming at the first document is received, obtaining a second document which meets the association relation with the first document in the target document;
and adjusting the document information corresponding to the second document from the current page where the document information is located to a second position associated with the first position.
7. The method of claim 6, wherein the second location is a location in the document search results that is after the first location and adjacent to the first location, such that documents corresponding to the second location and documents ranked after the second location are ranked after the second document in the document search results;
wherein the method further comprises:
monitoring whether a hit determination operation for the second document is received;
and under the condition that a hit determining operation aiming at the second document is received, generating a document searching path corresponding to the searching keyword and the second document according to the searching keyword, the second document and the second position.
8. The method of claim 6, wherein the association comprises:
the document identification corresponding to the first document comprises the document identification corresponding to the second document.
9. An optimization processing apparatus for document search, the apparatus comprising:
a request obtaining unit, configured to obtain a document search request, where the document search request includes at least one search keyword;
a result obtaining unit, configured to obtain a document search result in a document library according to the search keyword; the document search result comprises a plurality of target documents;
a page output unit, configured to output document information corresponding to the target document through at least two search pages, where a target page in the at least two search pages is in a visible state;
a path obtaining unit, configured to obtain, in a path library, a target search path matching the search keyword; the path library comprises a plurality of document searching paths, and the document searching paths at least represent document positions of document information corresponding to corresponding historical documents in historical searching results;
and a page adjusting unit, configured to at least adjust the search page where the document information corresponding to the target document is located according to the document position represented by the target search path, so that the document information corresponding to the first document corresponding to the target search path in the target document is displayed in the target page.
10. An electronic device, comprising:
a memory for storing a computer program and data generated by the execution of the computer program;
a processor for executing the computer program to implement: obtaining a document search request, wherein the document search request comprises at least one search keyword; obtaining a document search result in a document library according to the search keyword; the document search result comprises a plurality of target documents; outputting document information corresponding to the target document through at least two search pages, wherein the target page in the at least two search pages is in a visible state; obtaining a target search path matched with the search keyword in a path library; the path library comprises a plurality of document searching paths, and the document searching paths at least represent document positions of document information corresponding to historical documents corresponding to the document searching paths in historical searching results; and at least adjusting the search page where the document information corresponding to the target document is located according to the document position represented by the target search path, so that the document information corresponding to the first document corresponding to the target search path in the target document is displayed in the target page.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211031844.3A CN115374061A (en) 2022-08-26 2022-08-26 Optimization processing method and device for document search and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211031844.3A CN115374061A (en) 2022-08-26 2022-08-26 Optimization processing method and device for document search and electronic equipment

Publications (1)

Publication Number Publication Date
CN115374061A true CN115374061A (en) 2022-11-22

Family

ID=84067204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211031844.3A Pending CN115374061A (en) 2022-08-26 2022-08-26 Optimization processing method and device for document search and electronic equipment

Country Status (1)

Country Link
CN (1) CN115374061A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116013296A (en) * 2023-03-28 2023-04-25 国网浙江省电力有限公司营销服务中心 Searching method based on computer natural language processing
CN116013296B (en) * 2023-03-28 2023-05-30 国网浙江省电力有限公司营销服务中心 Searching method based on computer natural language processing

Similar Documents

Publication Publication Date Title
US9031885B2 (en) Technologies for encouraging search engine switching based on behavior patterns
RU2378680C2 (en) Determination of user intention
CN109086394B (en) Search ranking method and device, computer equipment and storage medium
CN107992585B (en) Universal label mining method, device, server and medium
CN102193973B (en) Present answer
US8396865B1 (en) Sharing search engine relevance data between corpora
EP1587009A2 (en) Content propagation for enhanced document retrieval
US7827172B2 (en) “Query-log match” relevance features
US20110307542A1 (en) Active Image Tagging
CN105453122A (en) Contextual mobile application advertisements
KR20080086868A (en) Dynamic search with implicit user intention mining
US20100191758A1 (en) System and method for improved search relevance using proximity boosting
CN110888990A (en) Text recommendation method, device, equipment and medium
WO2010015068A1 (en) Topic word generation method and system
CN101685448A (en) Method and device for establishing association between query operation of user and search result
JP2015525929A (en) Weight-based stemming to improve search quality
US20110184940A1 (en) System and method for detecting changes in the relevance of past search results
CN111400586A (en) Group display method, terminal, server, system and storage medium
CN112364126A (en) Keyword prompting method and device, computer equipment and storage medium
CN113297457A (en) High-precision intelligent information resource pushing system and pushing method
US20040034635A1 (en) Method and system for identifying and matching companies to business event information
CN115374061A (en) Optimization processing method and device for document search and electronic equipment
US20120059786A1 (en) Method and an apparatus for matching data network resources
CN112989179A (en) Model training and multimedia content recommendation method and device
CN114282119B (en) Scientific and technological information resource retrieval method and system based on heterogeneous information network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination