CN111291168A

CN111291168A - Book retrieval method and device and readable storage medium

Info

Publication number: CN111291168A
Application number: CN201811492676.1A
Authority: CN
Inventors: 景少玲; 李亚博; 谢海华
Original assignee: Pku Founder Information Industry Group Co ltd; Peking University Founder Group Co Ltd
Current assignee: Peking University
Priority date: 2018-12-07
Filing date: 2018-12-07
Publication date: 2020-06-16

Abstract

The invention provides a book retrieval method, a book retrieval device and a readable storage medium. Text recognition is performed on the retrieval voice input by the user to obtain a retrieval text; semantic analysis is performed on the retrieval text to obtain a semantic word segmentation vector; the semantic word segmentation vector is processed according to the trained problem template model to obtain a problem label corresponding to the retrieval text; and the retrieval text is searched according to a preset book map and the problem label, and the retrieval result is obtained and fed back to the user. This improves the accuracy of the retrieval result and meets the user's real retrieval requirement.

Description

Book retrieval method and device and readable storage medium

Technical Field

The present invention relates to computer technologies, and in particular, to a book retrieval method and apparatus, and a readable storage medium.

Background

Book retrieval means that a computer automatically retrieves and returns book information required by readers from a book database according to information input by the readers.

Most existing book retrieval is based on keyword matching. For example, for the user query "Please help me find some books that comment on Cao Xueqin," the existing book retrieval method uses "Cao Xueqin" or "comment on Cao Xueqin" as the retrieval keyword and searches the database. The returned results therefore include works written by Cao Xueqin as well as works written by other people, and the works written by Cao Xueqin himself clearly do not match what the user expects.

That is, in the prior art, book retrieval relies on keywords alone, so the retrieved results are inaccurate and do not match the user's real retrieval requirement.

Disclosure of Invention

The invention provides a book retrieval method, a book retrieval device and a readable storage medium, aiming at the problems that in the prior art, book retrieval is realized only by keywords, so that the accuracy of a retrieval result is poor and the retrieval result is inconsistent with the real retrieval requirement of a user.

In one aspect, the present invention provides a book retrieval method, including:

performing text recognition on the retrieval voice input by the user to obtain a retrieval text;

performing semantic analysis on the retrieval text to obtain a semantic word segmentation vector;

processing the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text;

and searching the search text according to a preset book map and the problem label to obtain and feed back a search result to a user.

In an optional implementation manner, the performing semantic analysis on the search text to obtain a semantic word segmentation vector includes:

performing word segmentation on the retrieval text according to a preset semantic dictionary, and converting the resulting segmented words into semantic word segmentation vectors; wherein each semantic segmented word represents a different meaning.

In an optional implementation manner, before performing text recognition on the search speech entered by the user and obtaining the search text, the method further includes:

collecting a plurality of labeled texts to obtain a training set and a test set; wherein each labeled text comprises segmented text words and corresponding question labels;

performing semantic analysis on each labeled text to obtain each semantic word segmentation vector;

training a Bayes classifier using the semantic word segmentation vectors in the training set, and testing the trained Bayes classifier using the semantic word segmentation vectors in the test set;

and obtaining a trained Bayes classifier, wherein the trained Bayes classifier is the trained problem template model.

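In an optional implementation manner, before performing text recognition on the search speech entered by the user and obtaining the search text, the method further includes:

establishing a book map according to book information of each book in a book library; the book map comprises book information of each book under different information types and association relations among the different book information.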

In an optional implementation manner, the retrieving the retrieval text according to a preset book map and the question label to obtain and feed back a retrieval result to a user includes:

retrieving the retrieval text according to the preset book map and the question label to obtain a retrieval result; wherein the retrieval result comprises book information of the book corresponding to the question label and push information of other books associated with that book.

In an optional implementation manner, after the feedback of the retrieval result to the user, the method further includes:

adjusting each association relationship in the book map according to the user's feedback on the retrieval result.

In an optional implementation manner, the feeding back the search result to the user includes:

sorting the book information in the retrieval result according to its similarity to the problem label of the retrieval text, and displaying the retrieval result to the user in the sorted order.

In another aspect, the present invention provides a book retrieval apparatus, comprising:

the voice recognition module is used for performing text recognition on the retrieval voice input by the user to obtain a retrieval text;

the text processing module is used for carrying out semantic analysis on the retrieval text to obtain a semantic word segmentation vector; processing the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text;

the retrieval module is used for retrieving the retrieval text according to a preset book map and the problem label to obtain a retrieval result;

and the display module is used for feeding back the retrieval result to the user.

In another aspect, the present invention provides a book retrieval apparatus, including: a speech collector, a display, a memory, a processor connected to the memory, and a computer program stored on the memory and executable on the processor,

the voice collector is used for collecting retrieval voice of a user;

the processor, when executing the computer program, performing the method of any of the preceding claims;

the display is used for displaying the retrieval result.

In a final aspect, the invention provides a readable storage medium comprising a program which, when run on a terminal, causes the terminal to perform the method of any of the preceding claims.

Drawings

The drawings listed below show specific embodiments of the disclosure, which are described in more detail later. These drawings and the written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.

FIG. 1 is a schematic diagram of a network architecture on which the present invention is based;

FIG. 2 is a schematic flow chart of a book retrieval method according to a first embodiment of the present invention;

FIG. 3 is a schematic flow chart of a book retrieval method according to a second embodiment of the present invention;

FIG. 4 is a schematic flow chart of a book retrieval method according to a third embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a book retrieval apparatus according to a fourth embodiment of the present invention;

FIG. 6 is a schematic diagram of a hardware structure of a book retrieval apparatus according to a fifth embodiment of the present invention.

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.

It should be noted that the book retrieval method, the book retrieval device and the readable storage medium provided by the application can be applied to various scenes in which books need to be retrieved or queried, such as book retrieval management of a library or a bookstore, a book information retrieval engine and the like.

Fig. 1 is a schematic diagram of the network architecture on which the present invention is based. As shown in fig. 1, the network architecture at least includes a book retrieval device 1 and a data server 2. The book retrieval device 1 may be a desktop computer, a tablet computer, a smartphone or other equipment capable of receiving a user's voice information; the data server 2 may be a server cluster that stores book information provided by a book operator, a book information manager and the like. The book retrieval device 1 can be connected to the data server 2 through wireless or wired communication for information interaction. In addition, plug-ins or programs for implementing the method for processing access requests are loaded or installed in the book retrieval device 1 and the data server 2 respectively; they can be written in languages such as C/C++, Java, Shell or Python.

Fig. 2 is a schematic flow chart of a book retrieval method according to an embodiment of the present invention.

As shown in fig. 2, the book retrieval method includes:

Step 101, performing text recognition on the retrieval voice input by the user to obtain a retrieval text.

Step 102, performing semantic analysis on the retrieval text to obtain a semantic word segmentation vector.

Step 103, processing the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text.

Step 104, retrieving the retrieval text according to a preset book map and the question label, and obtaining and feeding back a retrieval result to the user.

It should be noted that the book retrieval method provided by the present invention may be executed by the book retrieval device 1 shown in fig. 1.
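The four steps above can be sketched as a simple pipeline. This is only an illustrative outline, not the patented implementation: the function name `retrieve_books` and the four stage parameters are hypothetical, and each stage is passed in as a callable so that any speech recognizer, semantic analyzer, problem template model or book map search could be plugged in.

```python
from typing import Callable, Dict, List


def retrieve_books(
    voice: bytes,
    recognize: Callable[[bytes], str],
    analyze: Callable[[str], List[int]],
    classify: Callable[[List[int]], Dict[str, str]],
    search: Callable[[str, Dict[str, str]], List[str]],
) -> List[str]:
    """Chain the four steps; every stage is a hypothetical pluggable callable."""
    text = recognize(voice)      # step 101: retrieval voice -> retrieval text
    vector = analyze(text)       # step 102: retrieval text -> word segmentation vector
    labels = classify(vector)    # step 103: vector -> question labels
    return search(text, labels)  # step 104: book map lookup with text and labels


# Stub stages for illustration only; real stages would do the actual work.
result = retrieve_books(
    b"\x00\x01",
    recognize=lambda voice: "books about economy",
    analyze=lambda text: [len(word) for word in text.split()],
    classify=lambda vector: {"subject": "economy"},
    search=lambda text, labels: [f"{labels['subject']} book"],
)
```

Passing the stages as parameters keeps the sketch independent of any particular recognition or classification technology, which the embodiment likewise leaves open.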

In the prior art, automatic book retrieval means that a computer automatically retrieves and returns the book information required by the user from a book database according to information input by the user. Most traditional book retrieval is based on keyword matching, and the returned information suffers from poor accuracy, excessive related information and similar problems. In particular, queries that contain a logical relationship cannot be retrieved accurately. For example, for the query "Please help me find some books that comment on Cao Xueqin," "Cao Xueqin" or "comment on Cao Xueqin" would be used as the search keyword, and the results would include works written by Cao Xueqin as well as works written by others. Such results do not accurately meet the reader's real request, and the user still has to spend time manually screening them a second time. To address these problems, the invention uses semantic analysis technology, combined with an intelligent model and a book map based on knowledge graph technology, to retrieve books, which effectively improves the accuracy of the retrieval result and matches it to the user's requirement.

In this embodiment, the book retrieval device first receives the retrieval voice entered by the user and performs text recognition on it to obtain the corresponding retrieval text, for example: "Can you help me query books about economic topics published by Lude in the last 3 years?" The speech-to-text recognition may be implemented by various existing technologies, which is not limited in this embodiment. In addition, in this step the book retrieval device may also directly receive retrieval text input by the user; that is, the retrieval text may be obtained through handwriting input or keyboard input.

Then, the book retrieval device performs semantic analysis on the retrieval text to obtain a semantic word segmentation vector. Specifically, the book retrieval device performs word segmentation on the retrieval text according to a preset semantic dictionary and converts the resulting segmented words into semantic word segmentation vectors, wherein each semantic segmented word represents a different meaning. For example, for the aforementioned query "Can you help me query books about economic topics published by Lude in the last 3 years?", the possible segmented words are "last 3 years", "Lude", "economic topics" and "book". Generally, a semantic dictionary contains a large number of words together with their grammatical properties and meanings. Using the semantic dictionary, the semantic segmented words that carry retrieval meaning can be extracted from the retrieval text; these are generally content words. The conversion of the segmented words into semantic word segmentation vectors, that is, the conversion of word text into word vectors, may use existing techniques and is not limited in this embodiment.
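As a minimal sketch of dictionary-based segmentation followed by vectorization, the example below uses greedy longest-match lookup and a bag-of-words count vector. Both techniques are assumptions chosen for illustration, since the embodiment leaves the concrete method to existing technology; the dictionary contents and the names `SEMANTIC_DICT`, `segment` and `to_vector` are hypothetical.

```python
from typing import Dict, List

# Hypothetical semantic dictionary: entry -> semantic category (illustrative only).
SEMANTIC_DICT: Dict[str, str] = {
    "last 3 years": "time",
    "Lude": "person",
    "economic topics": "subject",
    "book": "resource",
}


def segment(text: str, dictionary: Dict[str, str]) -> List[str]:
    """Greedy longest-match segmentation; text not in the dictionary is skipped."""
    entries = sorted(dictionary, key=len, reverse=True)  # try longer entries first
    lowered = text.lower()
    words: List[str] = []
    i = 0
    while i < len(lowered):
        for entry in entries:
            if lowered.startswith(entry.lower(), i):
                words.append(entry)
                i += len(entry)
                break
        else:
            i += 1  # no dictionary entry starts here; advance one character
    return words


def to_vector(words: List[str], vocabulary: List[str]) -> List[int]:
    """Bag-of-words count vector over a fixed vocabulary."""
    return [words.count(term) for term in vocabulary]


query = "Can you help me query a book about economic topics published by Lude in the last 3 years?"
words = segment(query, SEMANTIC_DICT)
vector = to_vector(words, list(SEMANTIC_DICT))
```

The matcher works on raw substrings and ignores word boundaries, which suits text without spaces between words but would need refinement for English input.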

Then, the book retrieval device processes the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text. Specifically, the trained problem template model can be implemented based on a machine learning algorithm, and can be used for classifying each semantic word segmentation vector to obtain a problem label to which each semantic word segmentation vector belongs.

Taking the segmented words "last 3 years", "Lude", "economic topics" and "book" obtained above as an example, the problem labels output by the problem template model are: year: 2016; author: Lude; subject word labels: economy, Chinese economy, finance, economics.

Finally, the book retrieval device retrieves the retrieval text according to a preset book map and the problem label, and obtains and feeds back a retrieval result to the user. The book map is based on knowledge graph technology. In the field of library and information science, a knowledge graph is also called knowledge domain visualization or a knowledge domain map: a series of graphs that display the development process and structural relationships of knowledge, describe knowledge resources and their carriers using visualization technology, and mine, analyze, construct, draw and display knowledge and the interrelations among them.

In this embodiment, the preset book map may specifically include, for each book, its International Standard Book Number (ISBN), book name, author list, other books published by the author, publication year, publisher, catalog, link, content, score, comments, book labels and the like, as well as the structural association relationships between these pieces of information.

Therefore, in the embodiment, the problem labels obtained by semantic analysis and trained problem template model processing and the book atlas can be used for searching book information with higher matching degree with the search voice input by the user, and the accuracy of book search is effectively improved.
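One way to picture the book map and the label-based lookup is a list of (book, relation, value) triples that is filtered by every label type. The sketch below is an assumption-laden illustration: the triple representation, the made-up ISBN identifiers, the names `BOOK_MAP` and `retrieve`, and the expansion of the year label into a set of candidate years are all hypothetical and not taken from the source.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (book identifier, relation, value)

# Hypothetical book map entries; identifiers and values are made up.
BOOK_MAP: List[Triple] = [
    ("isbn:0001", "author", "Lude"),
    ("isbn:0001", "year", "2017"),
    ("isbn:0001", "subject", "economy"),
    ("isbn:0002", "author", "Lude"),
    ("isbn:0002", "year", "2010"),
    ("isbn:0002", "subject", "economy"),
    ("isbn:0003", "author", "Someone Else"),
    ("isbn:0003", "year", "2017"),
    ("isbn:0003", "subject", "finance"),
]


def books_matching(book_map: List[Triple], relation: str, values: Set[str]) -> Set[str]:
    """Books that have at least one of the given values for the relation."""
    return {book for book, rel, value in book_map if rel == relation and value in values}


def retrieve(book_map: List[Triple], labels: Dict[str, Set[str]]) -> List[str]:
    """Return the books that satisfy every label type (set intersection)."""
    candidates: Optional[Set[str]] = None
    for relation, values in labels.items():
        matched = books_matching(book_map, relation, values)
        candidates = matched if candidates is None else candidates & matched
    return sorted(candidates or set())


# Assumed label set for the running example; the year range is an illustration.
labels = {
    "author": {"Lude"},
    "year": {"2016", "2017", "2018"},
    "subject": {"economy", "finance"},
}
results = retrieve(BOOK_MAP, labels)
```

Because each label type narrows the candidate set, a book by another author on the same subject is excluded, which is the behavior keyword matching alone cannot provide.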

Optionally, in other embodiments, the feeding back the search result to the user includes: sorting the book information in the retrieval result according to its similarity to the problem label of the retrieval text, and displaying the retrieval result to the user in the sorted order. Specifically, because the book atlas contains a large amount of information, the retrieval results can be sorted by similarity from high to low, and the top several entries are fed back to the user as the retrieval result.

Optionally, in other embodiments, after the retrieval result is fed back to the user, the user may further be asked whether related information such as news or papers should be shown. If so, the related news is searched; if not, the retrieval stops. In this way, better book retrieval information is provided to the user.
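The ranking step can be sketched as follows. The source does not name a similarity measure, so Jaccard similarity between the set of question labels and each book's information set is used here purely as an assumption; the book names, label values and the function names `jaccard` and `rank` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(labels: Set[str], books: Dict[str, Set[str]], top_k: int = 2) -> List[Tuple[str, float]]:
    """Sort books by similarity, highest first, and keep the top_k entries."""
    scored = [(book, jaccard(labels, info)) for book, info in books.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))  # ties broken by name
    return scored[:top_k]


labels = {"Lude", "economy", "finance"}
books = {
    "book A": {"Lude", "economy", "finance"},
    "book B": {"Lude", "history"},
    "book C": {"economy", "finance", "investment"},
}
ranked = rank(labels, books)
```

Only the `top_k` best matches are returned, mirroring the idea of feeding back a limited number of entries from a large book map.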

The book retrieval method provided by the invention is characterized in that text recognition is carried out on retrieval voice input by a user to obtain a retrieval text; performing semantic analysis on the retrieval text to obtain a semantic word segmentation vector; processing the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text; and searching the search text according to a preset book map and the problem label to obtain and feed back a search result to the user, so that the accuracy of the search result is improved, and the real search requirement of the user is met.

On the basis of the first embodiment, in order to further improve the retrieval accuracy, a second embodiment of the present invention provides a book retrieval method, and fig. 3 is a schematic flow chart of the book retrieval method provided by the second embodiment of the present invention.

As shown in fig. 3, the book retrieval method includes:

Step 201, collecting a plurality of labeled texts to obtain a training set and a test set; wherein each labeled text comprises segmented text words and corresponding question labels.

Step 202, performing semantic analysis on each labeled text to obtain each semantic word segmentation vector.

Step 203, training a Bayes classifier using the semantic word segmentation vectors in the training set, and testing the trained Bayes classifier using the semantic word segmentation vectors in the test set.

Step 204, obtaining a trained Bayes classifier, wherein the trained Bayes classifier is the trained problem template model.

Step 205, performing text recognition on the retrieval voice input by the user to obtain a retrieval text.

Step 206, performing semantic analysis on the retrieval text to obtain a semantic word segmentation vector.

Step 207, processing the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text.

Step 208, retrieving the retrieval text according to a preset book map and the question label, and obtaining and feeding back a retrieval result to the user.

Unlike the first embodiment, this embodiment further includes a process of building the problem template model.

First, the book retrieval device acquires a plurality of labeled texts and divides them into a training set and a test set, wherein each labeled text comprises segmented text words and corresponding question labels. The labeling is generally performed manually, although it can also be done with an existing labeling algorithm, which is not limited in this embodiment.

Then, the book retrieval device performs semantic analysis on each labeled text in the manner described in the first embodiment to obtain each semantic word segmentation vector. A Bayesian classifier is then trained using the semantic word segmentation vectors in the training set, and the trained Bayesian classifier is tested using the semantic word segmentation vectors in the test set. A Bayesian classifier is an algorithmic model that performs classification on the basis of a learned network model; in this embodiment it serves as the problem template model of the application and classifies the semantic word segmentation vectors. Finally, a trained Bayesian classifier is obtained, and this trained Bayesian classifier is the trained problem template model.

After the training of the problem template model is completed, as in the first embodiment, the book retrieval device receives the retrieval voice entered by the user and performs text recognition on it to obtain the corresponding retrieval text, for example: "Can you help me query books about economic topics published by Lude in the last 3 years?" The speech-to-text recognition may be implemented by various existing technologies, which is not limited in this embodiment. In addition, in this step the book retrieval device may also directly receive retrieval text input by the user; that is, the retrieval text may be obtained through handwriting input or keyboard input.
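The training and testing procedure can be illustrated with a small multinomial naive Bayes classifier using add-one smoothing. This is a sketch under stated assumptions rather than the classifier of the embodiment: the source only says "Bayes classifier", the tiny labeled samples and the label names `author` and `subject` are invented, and the classifier counts segmented words directly, which is equivalent to working on bag-of-words count vectors.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Sample = Tuple[List[str], str]  # (segmented words, question label)


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples: List[Sample]) -> "NaiveBayes":
        self.label_counts = Counter(label for _, label in samples)
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        for words, label in samples:
            self.word_counts[label].update(words)
        self.vocab: Set[str] = {w for words, _ in samples for w in words}
        self.total = len(samples)
        return self

    def predict(self, words: List[str]) -> str:
        def log_posterior(label: str) -> float:
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.label_counts[label] / self.total)
            for w in words:
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            return score

        return max(self.label_counts, key=log_posterior)


def accuracy(model: NaiveBayes, samples: List[Sample]) -> float:
    """Fraction of test samples whose predicted label matches the annotation."""
    hits = sum(model.predict(words) == label for words, label in samples)
    return hits / len(samples)


# Hypothetical labeled texts, already segmented and annotated by hand.
train_set: List[Sample] = [
    (["books", "written by", "person"], "author"),
    (["works", "written by", "person"], "author"),
    (["books", "comment on", "person"], "subject"),
    (["books", "about", "topic"], "subject"),
]
test_set: List[Sample] = [
    (["works", "about", "topic"], "subject"),
    (["books", "written by"], "author"),
]

model = NaiveBayes().fit(train_set)
test_accuracy = accuracy(model, test_set)
```

The example separates "written by a person" from "commenting on a person", which is the distinction that keyword matching misses in the Cao Xueqin query from the background section.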

On the basis of the foregoing embodiments, a book retrieval method is provided in the third embodiment of the present invention, and fig. 4 is a flowchart illustrating the book retrieval method provided in the third embodiment of the present invention.

As shown in fig. 4, the book retrieval method includes:

Step 301, performing text recognition on the retrieval voice input by the user to obtain a retrieval text.

Step 302, performing semantic analysis on the retrieval text to obtain a semantic word segmentation vector.

Step 303, processing the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text.

Step 304, retrieving the retrieval text according to a preset book map and the question tag to obtain a retrieval result; wherein the retrieval result comprises book information of the book corresponding to the question tag and push information of other books associated with that book.

Step 305, adjusting each association relationship in the book atlas according to the user's feedback on the retrieval result.

As in the previous embodiments, the book retrieval device first receives the retrieval voice entered by the user and performs text recognition on it to obtain the corresponding retrieval text, for example: "Can you help me query books about economic topics published by Lude in the last 3 years?" The speech-to-text recognition may be implemented by various existing technologies, which is not limited in this embodiment. In addition, in this step the book retrieval device may also directly receive retrieval text input by the user; that is, the retrieval text may be obtained through handwriting input or keyboard input.

Unlike the foregoing embodiments, in this embodiment the book retrieval device retrieves the retrieval text according to a preset book map and the question tag to obtain a retrieval result, wherein the retrieval result comprises book information of the book corresponding to the question tag and push information of other books associated with that book. The book atlas is established according to the book information of each book in the book library; it comprises the book information of each book under different information types and the association relationships among the different pieces of book information. The book information includes, but is not limited to, author, year of publication, abstract, catalog, full text, book and the like.

Specifically, taking the foregoing as an example, the obtained problem labels are: year: 2016; author: Lude; subject word labels: economy, Chinese economy, finance, economics. For these problem labels, the book retrieval device first acquires the book information of books consistent with the labels. It then takes into account that pieces of book information are associated with one another; for example, investment, economy and finance are related to each other, and Chinese science and Chinese culture are related. Therefore, the book retrieval device also pushes the associated book information to the user by using the association relationships among books in the book map. That is, in addition to economy-related book information, finance-related and investment-related book information may be pushed to the user.
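A minimal sketch of pushing associated books is shown below. It assumes, purely for illustration, that the association relationships in the book map carry numeric weights and that only associations above a threshold are pushed; the weights, book names and the names `ASSOCIATIONS`, `BOOKS_BY_SUBJECT` and `push_candidates` are hypothetical.

```python
from typing import Dict, List

# Hypothetical association weights between subject labels in the book map.
ASSOCIATIONS: Dict[str, Dict[str, float]] = {
    "economy": {"finance": 0.8, "investment": 0.6, "Chinese culture": 0.1},
}

# Hypothetical index from subject label to books carrying that label.
BOOKS_BY_SUBJECT: Dict[str, List[str]] = {
    "economy": ["book A"],
    "finance": ["book B"],
    "investment": ["book C"],
    "Chinese culture": ["book D"],
}


def push_candidates(subject: str, threshold: float = 0.5) -> List[str]:
    """Books on subjects associated with `subject`, strongest association first."""
    related = ASSOCIATIONS.get(subject, {})
    ordered = sorted(related.items(), key=lambda pair: -pair[1])
    pushed: List[str] = []
    for other, weight in ordered:
        if weight >= threshold:  # weakly associated subjects are not pushed
            pushed.extend(BOOKS_BY_SUBJECT.get(other, []))
    return pushed


pushed_books = push_candidates("economy")
```

With these assumed weights, a query on economy also surfaces finance and investment books, while the weakly associated subject is left out.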

Finally, the book retrieval device can also adjust each association relationship in the book atlas according to the user's feedback on the retrieval result.

Specifically, the satisfaction expressed in each user's feedback on the retrieval result is counted to determine whether associated book information needs to be pushed, or which associated book information to push the next time.
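The feedback statistics could be kept per association relationship, as in the sketch below. The decision rule (push when the share of satisfied feedback reaches a threshold, and keep pushing while there is too little feedback to judge) is an assumption, as are the class name `FeedbackStats` and its parameters; the source only states that satisfaction is counted to decide what to push.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple

Relation = Tuple[str, str]  # (subject of the retrieved book, subject of the pushed book)


class FeedbackStats:
    """Counts user feedback per association relationship."""

    def __init__(self) -> None:
        self.shown: DefaultDict[Relation, int] = defaultdict(int)
        self.satisfied: DefaultDict[Relation, int] = defaultdict(int)

    def record(self, relation: Relation, satisfied: bool) -> None:
        self.shown[relation] += 1
        if satisfied:
            self.satisfied[relation] += 1

    def satisfaction(self, relation: Relation) -> float:
        shown = self.shown.get(relation, 0)
        return self.satisfied.get(relation, 0) / shown if shown else 0.0

    def should_push(self, relation: Relation, threshold: float = 0.5, min_feedback: int = 3) -> bool:
        if self.shown.get(relation, 0) < min_feedback:
            return True  # too little feedback to judge; keep pushing
        return self.satisfaction(relation) >= threshold


stats = FeedbackStats()
for _ in range(3):
    stats.record(("economy", "finance"), satisfied=True)
    stats.record(("economy", "Chinese culture"), satisfied=False)
```

After three rounds of feedback, the well-received association stays active and the poorly received one is dropped from future pushes.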

Fig. 5 is a schematic structural diagram of a book retrieval device according to a fourth embodiment of the present invention, as shown in fig. 5, the book retrieval device includes:

the voice recognition module 10 is configured to perform text recognition on a retrieval voice input by a user to obtain a retrieval text;

the text processing module 20 is configured to perform semantic analysis on the search text to obtain a semantic word segmentation vector; processing the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text;

the retrieval module 30 is configured to retrieve the retrieval text according to a preset book map and the question tag to obtain a retrieval result;

and the display module 40 is used for feeding back the retrieval result to the user.

In an optional implementation manner, the text processing module 20 is specifically configured to:

In an optional implementation manner, the apparatus further includes a training module, which is specifically configured to, before text recognition is performed on the retrieval voice entered by the user to obtain the retrieval text:

In an optional implementation manner, the apparatus further comprises a map establishing module, configured to, before text recognition is performed on the retrieval voice entered by the user to obtain the retrieval text, establish a book map according to book information of each book in the book library; the book map comprises book information of each book under different information types and association relations among the different book information.

In an optional implementation manner, the retrieving module is specifically configured to:

In an optional implementation manner, the map establishing module is further configured to, after the retrieval result is fed back to the user, adjust each association relationship in the book map according to the user's feedback on the retrieval result.

In one optional implementation manner, the display module is specifically configured to sort according to similarity between each book information in the search result and the question label of the search text, and display the search result to the user according to the sorting result.

The book retrieval device provided by the invention carries out text recognition on the retrieval voice input by the user to obtain the retrieval text; performing semantic analysis on the retrieval text to obtain a semantic word segmentation vector; processing the semantic word segmentation vector according to the trained problem template model to obtain a problem label corresponding to the retrieval text; and searching the search text according to a preset book map and the problem label to obtain and feed back a search result to the user, so that the accuracy of the search result is improved, and the real search requirement of the user is met.

Fig. 6 is a schematic diagram of a hardware structure of a book retrieval apparatus according to a fifth embodiment of the present invention. As shown in fig. 6, the book retrieval apparatus includes a voice collector 43, a display 44, a memory 41, a processor 42, and a computer program stored on the memory 41 and executable on the processor 42. When the processor 42 runs the computer program, the method of any one of the first to third embodiments is executed; the voice collector 43 is used for collecting the retrieval voice of the user; and the display 44 is used for displaying the retrieval result.

The present invention also provides a readable storage medium comprising a program which, when run on a terminal, causes the terminal to perform the method of any of the first to third embodiments.

Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A book retrieval method is characterized by comprising the following steps:

2. The book retrieval method of claim 1, wherein the performing semantic analysis on the retrieved text to obtain semantic segmentation vectors comprises:

3. The book retrieval method of claim 1, wherein before performing text recognition on the retrieval voice entered by the user to obtain the retrieval text, the method further comprises:

4. The book retrieval method of claim 1, wherein before performing text recognition on the retrieval voice entered by the user to obtain the retrieval text, the method further comprises:

5. The book retrieval method of claim 4,

6. The book retrieval method of claim 5, wherein after the feedback of the retrieval result to the user, further comprising:

7. The book retrieval method of any one of claims 1-6, wherein the feeding back the retrieval result to the user comprises:

8. A book retrieval apparatus, comprising:

9. A book retrieval apparatus, comprising: a speech collector, a display, a memory, a processor connected to the memory, and a computer program stored on the memory and executable on the processor,

the voice collector is used for collecting retrieval voice of a user;

the processor, when executing the computer program, performs the method of any of claims 1-7;

the display is used for displaying the retrieval result.

10. A readable storage medium, characterized by comprising a program which, when run on a terminal, causes the terminal to perform the method of any one of claims 1-7.