CN114064849A - Data retrieval method and operation system thereof - Google Patents


Info

Publication number
CN114064849A
Authority
CN
China
Prior art keywords
database
document
module
keyword
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111194738.2A
Other languages
Chinese (zh)
Inventor
戴井之
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-10-13
Filing date
2021-10-13
Publication date
2022-02-18
Application filed by Individual
Priority to CN202111194738.2A
Publication of CN114064849A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3343 Query execution using phonetics
    • G06F16/338 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data retrieval method and system in the field of databases. The system comprises external databases and their search engines, internal databases and their search engines, a document comparison module, a data presentation module, and a registration login module. A user logs in to the registration login module, which automatically logs in to all external and internal databases. When the user enters keywords in the registration login module and confirms the search, the search engines of all external and internal databases are started automatically and search with those keywords. The document comparison module extracts all documents found in the external and internal databases, compares the documents found in each database, and sends the final search result to the data presentation module for presentation, such that each identical document appears only once and in accordance with a preset priority order. Because the method and system search the internal and external databases with the same keywords at the same time, they reduce the workload of searching each database separately.

Description

Data retrieval method and operation system thereof
Technical Field
The invention relates to the field of databases, in particular to a data retrieval method and a data retrieval system.
Background
To search the literature comprehensively, users often need to query several different database systems, such as China National Knowledge Infrastructure (CNKI), the US FDA Orange Book, and the China CDE drug substance registration platform. Logging in to each database separately and entering keywords in each one is time-consuming and laborious.
CN112487014A discloses an SQL data processing method, apparatus, and background server. The system comprises a background server, a client, and a plurality of databases; the background server is communicatively connected to the client and is logged in to the plurality of databases at the same time. The SQL data processing method comprises: acquiring a plurality of pieces of SQL data submitted by the client, each piece carrying an identifier of the database it targets; identifying, from the identifier, the database targeted by each piece of SQL data; and sending each piece of SQL data to the corresponding database to obtain the corresponding query result. The user therefore does not need to start multiple clients: starting only the client that corresponds to the background server is enough to send SQL data to the different databases, which reduces the cost of developing and debugging SQL to some extent.
However, that method, apparatus, and background server only solve the problem of running SQL queries against a plurality of SQL databases at the same time. They do not solve the problem of searching a plurality of databases of different types with the same keyword at the same time.
Disclosure of Invention
To solve the above problem, the invention discloses a data retrieval method that involves at least one external database and its search engine, an internal database and its search engine, a document comparison module, a data presentation module, and a registration login module:
the registration login module is used for user registration and login;
a user logs in to the registration login module, and the registration login module automatically logs in to all external databases and all internal databases;
while the user is logged in, the user enters a keyword in the registration login module and confirms the search, and the registration login module automatically starts the search engines of all external and internal databases to search with the keyword;
the document comparison module extracts all documents found in the external and internal databases, compares the documents found in each database, and sends the final search result to the data presentation module for presentation, such that each identical document appears only once and in accordance with a preset priority order.
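The fan-out step, in which one keyword is sent to every database's search engine, can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `SearchEngine` type, `search_all`, `make_engine`, and the in-memory document lists are hypothetical stand-ins for real database connectors.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical search-engine type: takes a keyword, returns document titles.
SearchEngine = Callable[[str], List[str]]


def search_all(keyword: str, engines: Dict[str, SearchEngine]) -> Dict[str, List[str]]:
    """Run the same keyword against every registered database in parallel."""
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(engine, keyword) for name, engine in engines.items()}
        return {name: future.result() for name, future in futures.items()}


# Toy in-memory stand-ins for one external and one internal database.
EXTERNAL_DOCS = ["Aspirin stability study", "Ibuprofen synthesis"]
INTERNAL_DOCS = ["Aspirin stability study", "Aspirin impurity profile"]


def make_engine(docs: List[str]) -> SearchEngine:
    """Build a case-insensitive substring search over a fixed document list."""
    return lambda kw: [d for d in docs if kw.lower() in d.lower()]


results = search_all("aspirin", {
    "external": make_engine(EXTERNAL_DOCS),
    "internal": make_engine(INTERNAL_DOCS),
})
```

Here `results` maps each database name to its own hit list; merging and deduplication are left to the document comparison module.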
The at least one external database is a database not controlled by the organization. It includes databases that require authorization to log in, that is, databases that require registration and can be accessed only with a user name and password, such as China National Knowledge Infrastructure (CNKI) and the patent search and analysis database of the China National Intellectual Property Administration, and/or databases that can be accessed directly without authorization, that is, without a user name and password, such as the US FDA Orange Book database and the drug substance registration platform of the Chinese drug regulator.
The internal database likewise includes databases that require authorization to log in and/or databases that can be accessed without authorization.
The registration login module automatically remembers the user name and password, and automatically captures the verification code and completes verification. For example, it automatically recognizes the characters, symbols, and digits in the verification code area of the display and enters them into the verification code dialog box; or it automatically recognizes the digits and arithmetic operators in the verification code area, performs the calculation, and enters the result into the verification code dialog box.
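The calculation step for an arithmetic verification code can be sketched as below, assuming the on-screen text has already been recognized as a string such as `3 + 5 = ?`. The function name `solve_arithmetic_code` and the accepted operator set are assumptions made for illustration; character recognition itself is outside this sketch.

```python
import re


def solve_arithmetic_code(recognized: str) -> int:
    """Evaluate a recognized two-operand expression such as '3 + 5 = ?'."""
    match = re.fullmatch(r"\s*(\d+)\s*([+\-*x×])\s*(\d+)\s*(=\s*\??)?\s*", recognized)
    if match is None:
        raise ValueError(f"unrecognized verification code: {recognized!r}")
    left, op, right = int(match.group(1)), match.group(2), int(match.group(3))
    if op == "+":
        return left + right
    if op == "-":
        return left - right
    return left * right  # '*', 'x' and '×' are all treated as multiplication
```

Anything that does not match the expected pattern raises an error rather than producing a guessed answer.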
Keywords can be entered in the registration login module by text input, voice input, translation input (for example, automatically translating Chinese into English or English into Japanese), and so on.
Voice input of keywords in the registration login module, after the user has logged in, comprises: 1. the user speaks a passage, and the registration login module extracts one or more keywords from it; 2. the user checks the text displayed in the keyword input dialog box and confirms it if the keywords are correct; if they are incorrect, the user edits the text in the dialog box or speaks again until the keywords are correct; 3. the registration login module stores the mapping between the user's speech and the keywords.
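The third step, keeping the mapping between what the user said and the keywords the user confirmed, might look like the sketch below. It assumes speech has already been transcribed to text; `VoiceKeywordMemory` and `naive_extract` are hypothetical names, and the length-based extractor is only a placeholder for real keyword extraction.

```python
from typing import Callable, Dict, List


class VoiceKeywordMemory:
    """Remembers which keywords the user confirmed for a spoken phrase."""

    def __init__(self) -> None:
        self._confirmed: Dict[str, List[str]] = {}

    @staticmethod
    def _key(transcript: str) -> str:
        # Normalize case and whitespace so repeated phrases map to one entry.
        return " ".join(transcript.lower().split())

    def suggest(self, transcript: str, extract: Callable[[str], List[str]]) -> List[str]:
        """Reuse a confirmed mapping if one exists, otherwise extract afresh."""
        remembered = self._confirmed.get(self._key(transcript))
        return list(remembered) if remembered is not None else extract(transcript)

    def confirm(self, transcript: str, keywords: List[str]) -> None:
        """Store the keywords the user accepted or corrected in the dialog box."""
        self._confirmed[self._key(transcript)] = list(keywords)


def naive_extract(transcript: str) -> List[str]:
    # Placeholder extractor: keep words longer than five letters.
    return [w for w in transcript.lower().split() if len(w) > 5]
```

Once a correction is confirmed, the same phrase spoken again yields the corrected keywords without another round of editing.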
Translation input of keywords in the registration login module, after the user has logged in, comprises: 1. storing the language of each database in advance; 2. identifying the language of a keyword entered as text or by voice; 3. if the language of the entered keyword differs from the language of the database, automatically translating the keyword into the language of the database's documents before searching.
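The three translation steps can be sketched as follows. Everything named here is an assumption for illustration: the database names in `DATABASE_LANGUAGE`, the two-language detection rule, and the tiny `GLOSSARY` that stands in for a real translation service.

```python
from typing import Dict, Tuple

# Step 1: hypothetical per-database language table, stored in advance.
DATABASE_LANGUAGE: Dict[str, str] = {"chinese_db": "zh", "english_db": "en"}

# Toy bilingual glossary standing in for a real translation service.
GLOSSARY: Dict[Tuple[str, str], Dict[str, str]] = {
    ("zh", "en"): {"阿司匹林": "aspirin"},
    ("en", "zh"): {"aspirin": "阿司匹林"},
}


def detect_language(keyword: str) -> str:
    """Step 2, very roughly: any CJK Unified Ideograph means Chinese."""
    return "zh" if any("\u4e00" <= ch <= "\u9fff" for ch in keyword) else "en"


def keyword_for(database: str, keyword: str) -> str:
    """Step 3: return the keyword in the database's language."""
    source, target = detect_language(keyword), DATABASE_LANGUAGE[database]
    if source == target:
        return keyword
    # Fall back to the original keyword when the glossary has no entry.
    return GLOSSARY.get((source, target), {}).get(keyword.lower(), keyword)
```

Each database receives the keyword in its own language, so the same user input can drive a Chinese-language and an English-language search in one pass.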
The document comparison module compares the documents found in each database. When the same document is held in both the external database and the internal database, the internal database copy is extracted automatically and the external database copy is ignored.
To avoid repeatedly collecting invalid documents, a user can mark a document in the internal database as invalid. The document comparison module compares the documents found in the external databases, one by one, with the documents marked invalid in the internal database and removes the invalid documents from the final search result.
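The two rules above (prefer the internal copy, drop documents marked invalid) can be combined in one pass. The sketch assumes each hit already carries a stable identity key; the `Hit` type, its `doc_id` field, and `merge_hits` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Hit:
    doc_id: str   # hypothetical identity key shared by copies of one document
    source: str   # "internal" or "external"


def merge_hits(hits: Iterable[Hit], invalid_ids: Set[str]) -> List[Hit]:
    """Keep one copy per document, prefer internal copies, drop invalid ones."""
    chosen: Dict[str, Hit] = {}
    for hit in hits:
        if hit.doc_id in invalid_ids:
            continue  # marked invalid in the internal database
        current = chosen.get(hit.doc_id)
        if current is None or (current.source == "external" and hit.source == "internal"):
            chosen[hit.doc_id] = hit
    return list(chosen.values())
```

Because dictionaries preserve insertion order, a document keeps the position of its first appearance even when its external copy is later replaced by the internal one.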
While logged in, the user can subscribe to keywords, as follows: the user enters keywords in the keyword dialog box; at a set interval, such as every month, the system automatically logs in to the registration login module; the registration login module automatically starts the search engines of all external and internal databases to search, with those keywords, for documents from that period; the document comparison module compares the documents found in each database and sends the final search result to the data presentation module for presentation, such that each identical document appears only once and in accordance with a preset priority order.
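For a monthly subscription, each run needs the date window of the period it covers. A minimal sketch, assuming the run covers the calendar month before the run date (the text does not fix this convention); `previous_month_window` and `in_window` are hypothetical helper names.

```python
from datetime import date, timedelta
from typing import Tuple


def previous_month_window(run_day: date) -> Tuple[date, date]:
    """Return the first and last day of the month before run_day."""
    first_of_this_month = run_day.replace(day=1)
    last_of_previous = first_of_this_month - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous


def in_window(published: date, window: Tuple[date, date]) -> bool:
    """True if a document's publication date falls inside the window."""
    start, end = window
    return start <= published <= end
```

A scheduler would call `previous_month_window` at each run and keep only the hits for which `in_window` is true.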
The document comparison module determines whether documents found in different databases are the same as follows: it first compares coarse features such as file name and file size, and if these differ, the documents are immediately determined to be different; if they match, the remaining items are compared, and the documents are determined to be the same only after their specific contents have been compared. This saves computation in document comparison.
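The staged comparison can be sketched as below: cheap checks first, full content comparison only when they pass. The `Doc` type and its fields are assumptions; a real system might compare stored digests instead of raw bytes.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    name: str
    content: bytes


def same_document(a: Doc, b: Doc) -> bool:
    """Decide identity in stages, cheapest test first."""
    # Stage 1: coarse features. Any mismatch settles the question at once.
    if a.name != b.name:
        return False
    if len(a.content) != len(b.content):
        return False
    # Stage 2: only now pay for the full content comparison.
    return a.content == b.content
```

Most non-matching pairs are rejected by the name or size check, so the byte-by-byte comparison runs only for likely duplicates.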
The document comparison module sorts the final search results by the strength of the association between each document and the search terms. The strength of association is determined by computing a feature profile of the documents the user has downloaded in the past, for example a composite score based on the publication medium, author, and language (such as English) recorded for the documents, together with the user's research direction.
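One simple way to turn past downloads into a ranking is to count feature values and score each result by how often its values occurred. The sketch below assumes equal weights and three hypothetical fields (`medium`, `author`, `language`); the text does not specify the scoring formula, so this is only one possible instance.

```python
from collections import Counter
from typing import Dict, List

Document = Dict[str, str]  # hypothetical fields: "medium", "author", "language"
FEATURES = ("medium", "author", "language")


def build_profile(downloads: List[Document]) -> Counter:
    """Count how often each (feature, value) pair occurs in past downloads."""
    return Counter((f, d[f]) for d in downloads for f in FEATURES if f in d)


def score(doc: Document, profile: Counter) -> int:
    """Composite score: sum of profile counts for the document's features."""
    return sum(profile[(f, doc[f])] for f in FEATURES if f in doc)


def rank(results: List[Document], profile: Counter) -> List[Document]:
    # sorted() is stable, so documents with equal scores keep their order.
    return sorted(results, key=lambda d: score(d, profile), reverse=True)
```

Documents that share a journal, author, or language with earlier downloads rise to the top; unseen values contribute zero.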
Finally, if the number of documents in the search result is above or below a set threshold, the data presentation module prompts the user to change the keywords or to search further within the results.
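The threshold check is a small function. In this sketch the bounds `low` and `high`, and the pairing of "too few" with changing keywords and "too many" with refining within the results, are assumptions; the text only says that a reminder is issued when the count crosses a set threshold.

```python
from typing import Optional


def result_count_hint(count: int, low: int = 5, high: int = 200) -> Optional[str]:
    """Return a reminder for the user, or None if the count is acceptable."""
    if count < low:
        return "Few results: consider changing the keywords."
    if count > high:
        return "Many results: consider searching within the results."
    return None
```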
If the user finally chooses to download a document from an external database, the data presentation module sends the document to the internal database, and the internal database automatically records it and creates keywords for it.
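Recording a downloaded document and creating its keywords could be sketched as follows. The internal database is modeled as a plain dictionary, and keywords are derived from the title plus the query that found the document; `build_keywords`, `archive`, and the stopword list are hypothetical.

```python
import re
from typing import Dict, List

STOPWORDS = {"a", "an", "and", "of", "the", "in", "for", "on"}


def build_keywords(title: str) -> List[str]:
    """Derive keywords from a title: lowercase words, no stopwords, no repeats."""
    keywords: List[str] = []
    for word in re.findall(r"[a-z0-9]+", title.lower()):
        if word not in STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords


def archive(internal_db: Dict[str, dict], doc_id: str, title: str, query: str) -> None:
    """Record a downloaded external document in the internal database."""
    keywords = build_keywords(title)
    if query.lower() not in keywords:
        keywords.append(query.lower())  # remember the term that found it
    internal_db[doc_id] = {"title": title, "keywords": keywords}
```

After archiving, later searches can be answered from the internal copy, which is the preferred source when a document exists in both places.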
The data presentation module may be a designated mailbox, a data link set, or the like. A data link set is a data set that collects many hyperlinks (such as hyperlinks to PDF, video, and 3D format files).
In another aspect, the invention also provides a system for operating the data retrieval method, comprising at least one external database and its search engine, an internal database and its search engine, a document comparison module, a data presentation module, and a registration login module.
The data retrieval method and its operation system have the following beneficial effects: 1. the same keywords are used to search the internal and external databases at the same time, which reduces the workload of searching each database separately; 2. keywords can be entered by text, voice, and translation, which improves search efficiency; 3. required documents from external databases are moved automatically to the internal database, which reduces dependence on the external databases; 4. invalid documents are removed and only one copy of each duplicated document is kept, which reduces the workload of cleaning documents manually; 5. the final search results are sorted by the strength of association between documents and keywords, which reduces the workload of screening documents manually; 6. translation input automatically matches the language of each database, which reduces the workload of manual translation.
Description of the Drawings
FIG. 1 is a schematic diagram of the data retrieval system of the invention, which comprises an external database and its search engine, an internal database and its search engine, a document comparison module, a data presentation module, and a registration login module.
Detailed Description
As shown in FIG. 1, the data retrieval system disclosed in the invention comprises an external database and its search engine, an internal database and its search engine, a document comparison module, a data presentation module, and a registration login module; the registration login module is used for user registration and login.
The external database is a database not controlled by the organization. It includes databases that require authorization to log in, that is, databases that require registration and can be accessed only with a user name and password, such as China National Knowledge Infrastructure (CNKI) and the patent search and analysis database of the China National Intellectual Property Administration, and databases that can be accessed directly without authorization, that is, without a user name and password, such as the US FDA Orange Book database, the drug substance registration platform of the Chinese drug regulator, EMA website search, Baidu search, and the like.
The retrieval method comprises the following steps:
1. The user logs in to the registration login module, and the registration login module automatically logs in to all external databases and all internal databases.
The registration login module automatically remembers the user name and password and, if a verification code is required, automatically captures the verification code and completes verification. For example, it automatically recognizes the characters, symbols, and digits in the verification code area of the display and enters them into the verification code dialog box; or it automatically recognizes the digits and arithmetic operators in the verification code area, performs the calculation, and enters the result into the verification code dialog box.
2. While the user is logged in, the user enters a keyword in the registration login module by text and/or voice and confirms the search, and the registration login module automatically starts the search engines of all external and internal databases to search with the keyword.
2.1 Voice input of keywords in the registration login module, after the user has logged in, comprises: first, the user speaks a passage, and the registration login module extracts one or more keywords from it; second, the user checks the text displayed in the keyword input dialog box and confirms it if the keywords are correct, or, if they are incorrect, edits the text in the dialog box or speaks again until the keywords are correct; third, the registration login module stores the mapping between the user's speech and the keywords.
2.2 Translation input of keywords in the registration login module, after the user has logged in, comprises: first, storing the language of each database in advance; second, identifying the language of a keyword entered as text or by voice; third, if the language of the entered keyword differs from the language of the database, automatically translating the keyword into the language of the database's documents before searching.
3. The document comparison module extracts all documents found in the external and internal databases, compares the documents found in each database, and sends the final search result to the data presentation module for presentation, such that each identical document appears only once and in accordance with a preset priority order.
The document comparison module compares the documents found in each database. When the same document is held in both an external database and an internal database, the internal database copy is extracted automatically and the external database copy is ignored.
To avoid repeatedly collecting invalid documents, a user can mark a document in the internal database as invalid. The document comparison module compares the documents found in the external databases, one by one, with the documents marked invalid in the internal database and removes the invalid documents from the final search result.
The document comparison module determines whether documents found in different databases are the same by first comparing coarse features, such as document name and document size, and carrying out a detailed comparison of the specific contents only when those features match. This saves computation in document comparison.
The preset priority order is as follows: the document comparison module sorts the final search results by the strength of the association between each document and the search terms. The strength of association is determined by computing a feature profile of the documents the user has downloaded in the past, for example a composite score based on the publication medium, author, and language (such as English) recorded for the documents, together with the user's research direction.
4. While logged in, the user can subscribe to keywords, as follows: the user enters keywords in the keyword dialog box; at a set interval, such as every month, the system automatically logs in to the registration login module; the registration login module automatically starts the search engines of all external and internal databases to search, with those keywords, for documents from that period; the document comparison module compares the documents found in each database and sends the final search result to the data presentation module for presentation, such that each identical document appears only once and in accordance with a preset priority order.
5. Finally, if the number of documents in the search result is above or below a set threshold, the data presentation module prompts the user to change the keywords or to search further within the results.
6. If the user finally chooses to download a document from an external database, the data presentation module sends the document to the internal database, and the internal database automatically records it and creates keywords for it.
The documents described in the invention include text, audio, video, and 3D works. The data presentation module may be a designated mailbox, a data link set, or the like. A data link set is a data set that collects many hyperlinks (such as hyperlinks to PDF, video, and 3D format files).
The keywords described in the invention are not keywords in the narrow sense of those defined within a document, such as the author-defined keywords of a thesis. They are to be understood broadly as features of the document, including its title, author-defined keywords, subject, content, and the like.
The above describes only one embodiment of the invention. Although the description is specific and detailed, it is not to be construed as limiting the scope of the invention. A person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the scope of the invention.

Claims (10)

1. A data retrieval method involving at least one external database and its search engine, an internal database and its search engine, a document comparison module, a data presentation module, and a registration login module, wherein:
the registration login module is used for user registration and login;
a user logs in to the registration login module, and the registration login module automatically logs in to all external databases and all internal databases;
while the user is logged in, the user enters a keyword in the registration login module and confirms the search, and the registration login module automatically starts the search engines of all external and internal databases to search with the keyword;
the document comparison module extracts all documents found in the external and internal databases, compares the documents found in each database, and sends the final search result to the data presentation module for presentation, such that each identical document appears only once and in accordance with a preset priority order.
2. The data retrieval method according to claim 1, wherein the keyword is entered in the registration login module by text input, voice input, or translation input.
3. The data retrieval method according to claim 2, wherein voice input of the keyword in the registration login module, after the user has logged in, comprises: 1. the user speaks a passage, and the registration login module extracts one or more keywords from it; 2. the user checks the text displayed in the keyword input dialog box and confirms it if the keywords are correct; if they are incorrect, the user edits the text in the dialog box or speaks again until the keywords are correct; 3. the registration login module stores the mapping between the user's speech and the keywords.
4. The data retrieval method according to claim 2, wherein translation input of the keyword in the registration login module, after the user has logged in, comprises: 1. storing the language of each database in advance; 2. identifying the language of a keyword entered as text or by voice; 3. if the language of the entered keyword differs from the language of the database, automatically translating the keyword into the language of the database's documents before searching.
5. The data retrieval method according to claim 1 or 2, wherein the document comparison module compares the documents found in each database and, when the same document is held in both the external database and the internal database, automatically extracts the internal database copy and ignores the external database copy.
6. The data retrieval method according to claim 1 or 2, wherein the user can mark a document in the internal database as invalid, and the document comparison module compares the documents found in the external databases, one by one, with the documents marked invalid in the internal database and removes the invalid documents from the final search result.
7. The data retrieval method according to claim 1 or 2, wherein the user, while logged in, can subscribe to keywords, as follows: the user enters keywords in the keyword dialog box; at a set interval, such as every month, the system automatically logs in to the registration login module; the registration login module automatically starts the search engines of all external and internal databases to search, with those keywords, for documents from that period; the document comparison module compares the documents found in each database and sends the final search result to the data presentation module for presentation, such that each identical document appears only once and in accordance with a preset priority order.
8. The data retrieval method according to claim 1 or 7, wherein the document comparison module sorts the final search results by the strength of the association between each document and the search terms, the strength of association being determined by computing a feature profile of the documents the user has downloaded in the past.
9. The data retrieval method according to claim 1 or 2, wherein, when the user finally chooses to download a document from the external database, the data presentation module sends the document to the internal database, and the internal database automatically records it and creates keywords for it.
10. A system for operating the data retrieval method according to any one of claims 1 to 9, comprising at least one external database and its search engine, an internal database and its search engine, a document comparison module, a data presentation module, and a registration login module.
CN202111194738.2A 2021-10-13 2021-10-13 Data retrieval method and operation system thereof Withdrawn CN114064849A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111194738.2A CN114064849A (en) 2021-10-13 2021-10-13 Data retrieval method and operation system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111194738.2A CN114064849A (en) 2021-10-13 2021-10-13 Data retrieval method and operation system thereof

Publications (1)

Publication Number Publication Date
CN114064849A (en) 2022-02-18

Family

ID=80234354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111194738.2A Withdrawn CN114064849A (en) 2021-10-13 2021-10-13 Data retrieval method and operation system thereof

Country Status (1)

Country Link
CN (1) CN114064849A (en)

Similar Documents

Publication Publication Date Title
CN109992645B (en) Data management system and method based on text data
US6353840B2 (en) User-defined search template for extracting information from documents
CN106649778B (en) Interaction method and device based on deep question answering
CN102053991B (en) Method and system for multi-language document retrieval
CN110929125B (en) Search recall method, device, equipment and storage medium thereof
US20090192996A1 (en) Method and apparatus for collecting entity aliases
WO2007143914A1 (en) Method, device and inputting system for creating word frequency database based on web information
CN110678860A (en) System and method for word-by-word text mining
US20140330866A1 (en) Systems and methods for parsing search queries
CN111639156B (en) Query method, device, equipment and storage medium based on hierarchical label
WO2020155749A1 (en) Method and apparatus for constructing personal knowledge graph, computer device, and storage medium
CN111581367A (en) Method and system for inputting questions
CN112052317A (en) Medical knowledge base intelligent retrieval system and method based on deep learning
CN110941702A (en) Retrieval method and device for laws and regulations and laws and readable storage medium
CN109948154B (en) Character acquisition and relationship recommendation system and method based on mailbox names
WO2020133186A1 (en) Document information extraction method, storage medium, and terminal
CN113220821A (en) Index establishing method and device for test question retrieval and electronic equipment
CN117171650A (en) Document data processing method, system and medium based on web crawler technology
CN110737677B (en) Data searching system and method
CN116542676A (en) Intelligent customer service system based on big data analysis and method thereof
CN114064849A (en) Data retrieval method and operation system thereof
CN110888894A (en) Patent search method, server and computer readable medium
CN109284364B (en) Interactive vocabulary updating method and device for voice microphone-connecting interaction
CN112988972A (en) Administrative penalty file evaluation and checking method and system based on data model
KR100753779B1 (en) Method for executing initial sound letter search of mixed form and system for executing the method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
Application publication date: 20220218