CN112307073A - Information query method, device, equipment and storage medium - Google Patents

Publication number
CN112307073A
Authority
CN
China
Prior art keywords
user
corpus
word
retrieval result
word frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910818419.0A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN201910818419.0A
Publication of CN112307073A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The embodiment of the disclosure discloses a query method, a query device, equipment and a storage medium, wherein the method comprises the following steps: constructing a corpus based on pre-collected data, and counting the word frequency of each word in the corpus; acquiring a voice query instruction input by a user, and identifying user intention of the voice query instruction to obtain a character object corresponding to the user intention; searching in a corpus according to the pinyin and the tone of the character object to obtain at least one search result; reading the word frequency corresponding to each retrieval result, and sequencing at least one retrieval result according to the word frequency; displaying the at least one retrieval result according to the sorting result for selection by a user; and responding to the triggering operation of the user on a certain retrieval result, and navigating to the next level of page to perform information query. The embodiment of the invention realizes the purpose of inquiring the characters through voice, and simultaneously displays the characters with the same pronunciation to the user for selection according to the order of word frequency, thereby improving the inquiring efficiency.

Description

Information query method, device, equipment and storage medium
Technical Field
The embodiment of the disclosure relates to the technical field of computers, and in particular, to an information query method, an information query device, information query equipment and a storage medium.
Background
In daily life, people often encounter unfamiliar words or words they have forgotten how to write, and usually look them up by typing them manually into a dictionary. In Chinese, however, many characters have multiple pronunciations or multiple meanings. When a user needs to find out how such a character is written, a dictionary lookup returns many candidate characters or words, so the target character required by the user cannot be identified quickly and accurately, and the identification efficiency is low.
BRIEF SUMMARY OF THE PRESENT DISCLOSURE
The embodiment of the disclosure provides an information query method, an information query device, information query equipment and a storage medium, so as to achieve the purpose of quickly and accurately identifying characters required by a user.
In a first aspect, an embodiment of the present disclosure provides an information query method, where the method includes:
constructing a corpus based on pre-collected data, and counting the word frequency of each word in the corpus;
acquiring a voice query instruction input by a user, and identifying user intention of the voice query instruction to obtain a character object corresponding to the user intention;
searching in the corpus according to the pinyin and the tone of the character object to obtain at least one search result, wherein the search result is a word with the same pronunciation as the character object;
reading the word frequency corresponding to each retrieval result, and sequencing the at least one retrieval result according to the word frequency;
displaying the at least one retrieval result according to the sorting result for selection by a user;
and responding to the triggering operation of the user on a certain retrieval result, and navigating to the next level of page to perform information query.
In a second aspect, an embodiment of the present disclosure further provides an information query apparatus, where the apparatus includes:
the building module is used for building a corpus based on pre-collected data and counting the word frequency of each word in the corpus;
the acquisition and identification module is used for acquiring a voice query instruction input by a user and identifying user intention of the voice query instruction to obtain a character object corresponding to the user intention;
the retrieval module is used for retrieving in the corpus according to the pinyin and the tone of the character object to obtain at least one retrieval result, wherein the retrieval result is a word with the same pronunciation as the character object;
the sequencing module is used for reading the word frequency corresponding to each retrieval result and sequencing the at least one retrieval result according to the word frequency;
the display module is used for displaying the at least one retrieval result according to the sorting result for the user to select;
and the response module is used for responding to the triggering operation of a user on a certain retrieval result, navigating to the next level of page to carry out information query.
In a third aspect, an embodiment of the present disclosure further provides an apparatus, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the information query method according to any embodiment of the disclosure.
In a fourth aspect, the embodiments of the present disclosure further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the information query method according to any embodiment of the present disclosure.
After the query voice of the user is obtained, the pinyin and the tone of the characters intended to be queried by the user are determined through voice recognition, all words with the same pronunciation are retrieved from the corpus according to the pinyin and the tone, and the words are sequentially displayed to the user for selection according to the word frequency. Therefore, the purpose of inquiring through voice is achieved, all words with the same pronunciation are displayed for the user to select, and the character recognition efficiency is improved.
Drawings
FIG. 1 is a flow chart of a method of querying information in an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of an information query device in an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an apparatus in an embodiment of the disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the disclosure and are not limiting of the disclosure. It should be further noted that, for the convenience of description, only some of the structures relevant to the present disclosure are shown in the drawings, not all of them.
It should be noted that the terms "system" and "network" are often used interchangeably in this disclosure. Reference to "and/or" in embodiments of the present disclosure is intended to include any and all combinations of one or more of the associated listed items. The terms "first", "second", and the like in the description and claims of the present disclosure and in the drawings are used for distinguishing between different objects and not for limiting a particular order.
It should also be noted that the following embodiments of the present disclosure may be implemented individually, or may be implemented in combination with each other, and the embodiments of the present disclosure are not limited specifically.
Referring to fig. 1, which shows a flow chart of an information query method provided by an embodiment of the present disclosure, the method disclosed by the embodiment of the present disclosure is mainly applicable to a case of querying information by voice, for example, querying a writing method of a certain chinese character by voice, the method can be executed by a corresponding information query device, the device can be implemented by software and/or hardware, and can be configured on a device having a voice recognition function and a display device, for example, on a mobile terminal.
As shown in fig. 1, the method specifically includes the following steps:
S101, a corpus is constructed based on data collected in advance, and the word frequency of each word in the corpus is counted.
When the corpus is established, data such as texts from a preset number of primary and secondary school textbooks, compositions and web texts are collected. For example, the data may be collected by manual downloading or by a web crawler. Word segmentation is performed on the collected data, and stop words or meaningless words included in the data, such as conjunctions or modal particles, are removed to obtain the corpus. When the corpus is constructed, a function for combined query by pinyin and tone is added to the corpus.
In the embodiment of the disclosure, after the corpus is established, word frequency statistics are performed on each word in the corpus, that is, the frequency with which each character or word appears in the corpus is determined. For example, the word frequency statistics may be performed through a TF-IDF (term frequency-inverse document frequency) algorithm. Alternatively, in order to reduce the amount of calculation, the number of times each word appears in the corpus may be used directly as the word frequency of the word. The word frequency statistical result is stored in the established corpus in the form of a data list.
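As a minimal sketch of step S101 under stated assumptions, the following Python example removes stop words from documents that have already been segmented into words and then uses the raw occurrence count as the word frequency, which is the low-cost option mentioned above. The stop-word set, the sample documents and the function names are hypothetical and are not taken from the disclosure; a real system would also need a word segmenter, which is assumed to have run beforehand.

```python
from collections import Counter

# Hypothetical stop-word list (conjunctions, modal particles, etc.).
STOP_WORDS = {"的", "了", "吗", "和"}


def build_corpus(segmented_docs):
    """Drop stop words from documents that are already segmented into words."""
    return [[w for w in doc if w not in STOP_WORDS] for doc in segmented_docs]


def count_word_frequency(corpus):
    """Use the raw number of occurrences in the corpus as the word frequency."""
    freq = Counter()
    for doc in corpus:
        freq.update(doc)
    return freq


# Illustrative, already-segmented sample data.
docs = [
    ["奇异", "的", "景色"],
    ["他", "和", "我", "看到", "奇异", "的", "光"],
    ["起义", "了"],
]
corpus = build_corpus(docs)
word_freq = count_word_frequency(corpus)
```

The resulting `word_freq` mapping plays the role of the "data list" of word frequency statistics that later steps read from.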
S102, a voice query instruction input by a user is obtained, user intention recognition is carried out on the voice query instruction, and a character object corresponding to the user intention is obtained.
In the embodiment of the disclosure, after the voice query instruction of the user is acquired, the voice query instruction is recognized to obtain the text information corresponding to it, and the obtained text information is matched against a pre-stored intention list to determine the user intention and the character object corresponding to the user intention. For example, if the text information recognized from the voice query instruction is a question asking how the word "strange" is written, matching it against the intention list shows that the user intention is to query how a word is written, and the character object corresponding to the intention is "strange". Furthermore, after the character object corresponding to the user intention is obtained, the pinyin and the tone of the character object are identified and stored in a word slot together with the character object, so that subsequent queries can be performed directly based on the pinyin and the tone in the word slot. In the embodiment of the disclosure, the digits 1-4 can be used to represent the four tones of Chinese pinyin (yinping, yangping, shangsheng and qusheng, that is, the first to fourth tones), respectively.
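The matching of recognized text against a pre-stored intention list, followed by filling a word slot with the character object, its pinyin and its tone digits, might be sketched as follows. This is only an illustration: the regular-expression intent patterns, the `WordSlot` type and the small pronunciation table are assumptions made for the example, and speech recognition is assumed to have already produced the text.

```python
import re
from dataclasses import dataclass


@dataclass
class WordSlot:
    text: str    # the character object the user asked about
    pinyin: str  # toneless pinyin, e.g. "qiyi"
    tones: str   # one digit (1-4) per syllable, e.g. "24"


# Hypothetical intention list: intent name -> pattern with an "obj" group.
INTENT_PATTERNS = [
    ("query_how_to_write",
     re.compile(r"(?:请问)?(?P<obj>.+?)(?:这个词|这个字)?怎么写[？?]?")),
    ("query_meaning",
     re.compile(r"(?P<obj>.+?)是什么意思[？?]?")),
]

# Hypothetical pronunciation table used to fill the word slot.
PRONUNCIATION = {"奇异": ("qiyi", "24")}


def recognize_intent(text):
    """Return (intent, WordSlot) for the first matching pattern, else (None, None)."""
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            obj = match.group("obj")
            pron = PRONUNCIATION.get(obj)
            slot = WordSlot(obj, *pron) if pron else None
            return intent, slot
    return None, None


intent, slot = recognize_intent("请问奇异怎么写？")
```

Keeping the pinyin and tone digits in the slot means the later retrieval step does not need the original text again.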
S103, retrieving in the corpus according to the pinyin and the tone of the character object to obtain at least one retrieval result, wherein the retrieval result is a word with the same pronunciation as the character object.
In the embodiment of the present disclosure, after the corpus is established, the corpus may be stored in the local device or in a network server. If the corpus is stored in the local device, retrieval can be performed directly according to the pinyin and the tone of the character object in the word slot to obtain at least one word with the same pronunciation as the character object. If the corpus is stored in the network server, the pinyin and the tone in the word slot can be sent to the network server, and the network server sends the retrieval result back to the device after retrieving. Illustratively, when the character object recognized by the device is the word "singularity", the query is performed using the pinyin "qiyi" of the word and its corresponding tones "24", and all words with the same pronunciation as "singularity" are obtained. It should be noted that, because the tone is included in the query, the number of retrieval results is reduced and the accuracy of the retrieval is improved.
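One way to picture the combined pinyin-and-tone lookup is an index keyed by the pair (pinyin, tone digits). The sketch below assumes a small hand-written pronunciation table and hypothetical function names; it is not the storage format of the disclosure. It also shows how adding the tone narrows the result set compared with matching on pinyin alone.

```python
from collections import defaultdict

# Hypothetical pronunciation data: word -> (toneless pinyin, tone digits).
PRONUNCIATIONS = {
    "奇异": ("qiyi", "24"),
    "歧义": ("qiyi", "24"),
    "起义": ("qiyi", "34"),
    "气压": ("qiya", "41"),
}


def build_index(pronunciations):
    """Group words by their (pinyin, tones) key."""
    index = defaultdict(list)
    for word, key in pronunciations.items():
        index[key].append(word)
    return index


def retrieve(index, pinyin, tones):
    """Return every word whose pinyin and tones both match."""
    return list(index.get((pinyin, tones), []))


index = build_index(PRONUNCIATIONS)
same_sound = retrieve(index, "qiyi", "24")

# For comparison: matching on pinyin alone returns more candidates.
pinyin_only = [w for (p, _), words in index.items() if p == "qiyi" for w in words]
```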
And S104, reading the word frequency corresponding to each retrieval result, and sequencing the at least one retrieval result according to the word frequency.
Through steps S101-S103, all words with the same pronunciation as the character object can be retrieved. To ensure that the user can accurately find the required word, all retrieval results are displayed together on the display device of the device (for example, on the touch screen), so that the user can select the required retrieval result.
Furthermore, in order to improve the efficiency and accuracy of obtaining the query information required by the user, after the retrieval results are obtained, the word frequency of each retrieval result can be read by matching the retrieval result against the data list that stores the word frequency statistical result. The retrieval results can then be sorted according to their word frequencies, optionally in order of frequency from high to low.
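The sorting in step S104 can be illustrated with a short Python sketch. The frequency values and the function name are made up for the example; words missing from the frequency list are assumed to count as frequency 0.

```python
def rank_by_frequency(results, word_freq):
    """Sort retrieval results from highest to lowest word frequency."""
    return sorted(results, key=lambda w: word_freq.get(w, 0), reverse=True)


# Hypothetical word frequency list for three homophones.
word_freq = {"奇异": 120, "歧义": 45, "奇艺": 3}
ranked = rank_by_frequency(["歧义", "奇艺", "奇异"], word_freq)
```

Because Python's `sorted` is stable, results with equal frequency keep their retrieval order.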
And S105, displaying the at least one retrieval result according to the sorting result for the user to select.
The higher the frequency of a word, the greater the probability that it is the word the user wants. Therefore, displaying the at least one retrieval result according to the sorting result allows the user to quickly select the required word. Further, to draw the user's attention, the retrieval result ranked first may be highlighted, or the top N ranked retrieval results may all be highlighted, where the value of N may be preset. Furthermore, if there are many retrieval results, a scroll bar control is also provided on the display interface of the device so that the user can slide the scroll bar to browse all the retrieval results quickly.
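Highlighting the top N ranked results could be sketched as below, using a plain-text marker in place of real visual highlighting. The `render_results` function and the "*" marker are assumptions for illustration only; an actual device would apply highlighting through its UI toolkit.

```python
def render_results(ranked, top_n=1):
    """Return one display line per result, marking the top_n entries with '*'."""
    lines = []
    for position, word in enumerate(ranked):
        marker = "*" if position < top_n else " "
        lines.append(f"{marker} {position + 1}. {word}")
    return lines


lines = render_results(["奇异", "歧义", "奇艺"], top_n=1)
```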
And S106, responding to the trigger operation of the user on a certain retrieval result, and navigating to the next level of page to perform information query.
After determining the required retrieval result, the user may select it through a trigger operation, where the trigger operation may be a single click, a double click, or another trigger operation, and is not specifically limited herein. The device responds to the trigger operation of the user on a certain retrieval result and navigates to the next-level page to perform information query, for example, to query detailed information about the retrieval result, such as the interpretation and usage of the word.
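A minimal sketch of the response step might map the selected word to the content of a detail page. The `DETAILS` table, the page names and the handler name are hypothetical; the disclosure does not specify how the next-level page is built.

```python
# Hypothetical detail data shown on the next-level page.
DETAILS = {
    "奇异": {"pinyin": "qí yì", "meaning": "strange; unusual"},
}


def on_result_selected(word, details=DETAILS):
    """Handle a click on a retrieval result and describe the page to open."""
    entry = details.get(word)
    if entry is None:
        return {"page": "not_found", "word": word}
    return {"page": "detail", "word": word, **entry}


page = on_result_selected("奇异")
```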
After the query voice of the user is obtained, the pinyin and the tone of the characters intended to be queried by the user are determined through voice recognition, all characters or words with the same pronunciation are retrieved from the corpus according to the pinyin and the tone, and are displayed to the user for selection. Therefore, the purpose of inquiring through voice is achieved, and the retrieval result with high word frequency is preferentially displayed to the user so as to be convenient for the user to select, and therefore the character recognition efficiency is improved.
Fig. 2 is a schematic structural diagram of an information query apparatus in an embodiment of the present disclosure. As shown in fig. 2, the apparatus includes:
the building module 201 is configured to build a corpus based on pre-collected data, and count word frequency of each word in the corpus;
the acquisition and recognition module 202 is configured to acquire a voice query instruction input by a user, and perform user intention recognition on the voice query instruction to obtain a text object corresponding to the user intention;
the retrieval module 203 is configured to perform retrieval in the corpus according to the pinyin and the tone of the literal object to obtain at least one retrieval result, where the retrieval result is a word having the same pronunciation as the literal object;
the sorting module 204 is configured to read a word frequency corresponding to each search result, and sort the at least one search result according to the word frequency;
a display module 205, configured to display the at least one search result according to the sorting result, so as to be selected by a user;
and the response module 206 is configured to navigate to a next-level page for information query in response to a user triggering operation on a certain retrieval result.
After the query voice of the user is obtained, the pinyin and the tone of the characters intended to be queried by the user are determined through voice recognition, all characters or words with the same pronunciation are retrieved from the corpus according to the pinyin and the tone, and are displayed to the user for selection. Therefore, the purpose of inquiring through voice is achieved, all words with the same pronunciation are displayed for a user to select according to the sequence of word frequency from high to low, and the character recognition efficiency is improved.
On the basis of the above embodiment, the building module includes:
the building unit is used for performing word segmentation processing on the acquired data, removing stop words or nonsense words included in the data and obtaining a corpus;
and the counting unit is used for carrying out word frequency counting based on the TF-IDF algorithm and storing the word frequency counting result in a corpus in a data list form.
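For the TF-IDF option, a small self-contained sketch is given here, assuming the common definitions tf(t, d) = count(t, d) / |d| and idf(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing t. The disclosure does not state which TF-IDF variant is used or how per-document scores are combined into a single per-word value, so both remain open in this example.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: tf-idf score} dict per document in the corpus."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))  # count each term once per document

    scores = []
    for doc in corpus:
        if not doc:
            scores.append({})
            continue
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Illustrative, already-segmented sample data.
corpus = [["奇异", "景色"], ["奇异", "光"], ["起义", "军"]]
scores = tf_idf(corpus)
```

In this sample, "奇异" appears in two of three documents and therefore receives a lower score than "起义", which appears in only one.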
On the basis of the above embodiment, the apparatus further includes:
and the highlight processing module is used for highlighting the retrieval result ranked at the first place.
On the basis of the above embodiment, the acquiring and identifying module includes:
the voice recognition unit is used for recognizing a voice query instruction of a user to obtain character information corresponding to the voice query instruction;
and the intention matching unit is used for matching the character information with a pre-stored intention list so as to determine the user intention and the character object corresponding to the user intention.
The information inquiry device provided by the embodiment of the disclosure can execute the information inquiry method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 3 is a schematic structural diagram of an apparatus provided in an embodiment of the present disclosure, and as shown in fig. 3, a schematic structural diagram of an apparatus suitable for implementing an embodiment of the present disclosure is shown. The device shown in fig. 3 is only an example and should not bring any limitation to the function and use range of the embodiments of the present disclosure.
As shown in FIG. 3, the apparatus 300 may include a processor (e.g., a central processing unit, a graphics processor, etc.) 301, which may perform various suitable actions and processes according to a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage device 308 into a Random Access Memory (RAM) 303, for example, implementing a query method provided by the embodiments of the present disclosure, wherein the method includes:
constructing a corpus based on pre-collected data, and counting the word frequency of each word in the corpus; acquiring a voice query instruction input by a user, and identifying user intention of the voice query instruction to obtain a character object corresponding to the user intention; searching in the corpus according to the pinyin and the tone of the character object to obtain at least one search result, wherein the search result is a word with the same pronunciation as the character object; reading the word frequency corresponding to each retrieval result, and sequencing the at least one retrieval result according to the word frequency; displaying the at least one retrieval result according to the sorting result for selection by a user; and responding to the triggering operation of the user on a certain retrieval result, and navigating to the next level of page to perform information query.
In the RAM 303, various programs and data necessary for the operation of the apparatus 300 are also stored. The processor 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 308 including, for example, magnetic tape, hard disk, etc.; and a communication device 309. The communication means 309 may allow the device 300 to communicate wirelessly or by wire with other devices to exchange data. While fig. 3 illustrates an apparatus 300 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication means 309, or installed from the storage means 308, or installed from the ROM 302. The computer program, when executed by the processor 301, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the apparatus; or may be separate and not incorporated into the device.
The computer readable medium carries one or more programs, and when the one or more programs are executed by the apparatus, the apparatus executes the query method provided by the embodiment, where the method includes: constructing a corpus based on pre-collected data, and counting the word frequency of each word in the corpus; acquiring a voice query instruction input by a user, and identifying user intention of the voice query instruction to obtain a character object corresponding to the user intention; searching in the corpus according to the pinyin and the tone of the character object to obtain at least one search result, wherein the search result is a word with the same pronunciation as the character object; reading the word frequency corresponding to each retrieval result, and sequencing the at least one retrieval result according to the word frequency; displaying the at least one retrieval result according to the sorting result for selection by a user; and responding to the triggering operation of the user on a certain retrieval result, and navigating to the next level of page to perform information query.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. The name of the module does not in some cases constitute a limitation of the module itself, and for example, the display module may be further described as a "module for displaying at least one search result".
In accordance with one or more embodiments of the present disclosure, the following is also disclosed:
A1, an information query method, comprising:
constructing a corpus based on pre-collected data, and counting the word frequency of each word in the corpus;
acquiring a voice query instruction input by a user, and identifying user intention of the voice query instruction to obtain a character object corresponding to the user intention;
searching in the corpus according to the pinyin and the tone of the character object to obtain at least one search result, wherein the search result is a word with the same pronunciation as the character object;
reading the word frequency corresponding to each retrieval result, and sequencing the at least one retrieval result according to the word frequency;
displaying the at least one retrieval result according to the sorting result for selection by a user;
and responding to the triggering operation of the user on a certain retrieval result, and navigating to the next level of page to perform information query.
A2. The method of A1, wherein constructing a corpus based on pre-collected data and counting the word frequency of each word in the corpus comprises:
performing word segmentation on the collected data, and removing stop words or meaningless words from the data to obtain the corpus; and
performing word frequency statistics based on the TF-IDF algorithm, and storing the word frequency statistics in the corpus in the form of a data list.
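As a rough illustration of the corpus construction in A2, the sketch below removes stop words and computes a TF-IDF score per word. Several points are assumptions rather than details from the disclosure: the input documents are taken to be already segmented into space-separated words (a real pipeline would use a Chinese word segmenter), the stop-word set is a small hypothetical sample, the per-document scores are summed into one score per word, and the "data list" is modeled as a list of dictionaries.

```python
import math
from collections import Counter

# Hypothetical stop-word sample; a real list would be much larger.
STOP_WORDS = {"的", "了", "是"}


def tokenize(doc):
    """Split a pre-segmented document on spaces and drop stop words."""
    return [tok for tok in doc.split() if tok not in STOP_WORDS]


def tfidf_table(docs):
    """Return a list of {"word", "score"} rows sorted by descending score."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each word.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Sum tf * idf over all documents (an aggregation chosen for this sketch).
    scores = Counter()
    for tokens in tokenized:
        if not tokens:
            continue
        for word, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[word])
            scores[word] += tf * idf

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [{"word": word, "score": score} for word, score in ranked]


docs = ["权力 的 游戏", "权利 是 法律 的 概念", "权力 和 法律"]
table = tfidf_table(docs)
```

Other TF-IDF variants (smoothed IDF, per-document rather than aggregated scores) would fit the same description; the choice here is only for brevity.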
A3. The method of A1, further comprising:
highlighting the first-ranked retrieval result.
A4. The method of A1, wherein performing user intention recognition on the voice query instruction to obtain a text object corresponding to the user intention comprises:
recognizing the voice query instruction of the user to obtain text information corresponding to the voice query instruction; and
matching the text information against a pre-stored intention list to determine the user intention and the text object corresponding to the user intention.
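The intention matching step of A4 might look like the following sketch, which starts from text that speech recognition has already produced. The disclosure does not specify the format of the pre-stored intention list, so modeling it as a list of (intention name, regular expression) pairs is an assumption; `INTENT_LIST`, `match_intent`, the intention names and the trigger phrases are all hypothetical.

```python
import re

# Hypothetical pre-stored intention list: each entry pairs an intention name
# with a pattern whose "obj" group captures the text object.
INTENT_LIST = [
    ("search", re.compile(r"^(?:搜索|查询|查一下)(?P<obj>.+)$")),
    ("open", re.compile(r"^打开(?P<obj>.+)$")),
]


def match_intent(text):
    """Return (intention, text_object) for the first matching entry, else None."""
    text = text.strip()
    for intent, pattern in INTENT_LIST:
        match = pattern.match(text)
        if match:
            return intent, match.group("obj")
    return None


result = match_intent("查询权利")
```

The extracted text object (here "权利") is what the retrieval step would then look up by pinyin and tone.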
B1. An information query apparatus, comprising:
a building module, configured to construct a corpus based on pre-collected data and count the word frequency of each word in the corpus;
an acquisition and identification module, configured to acquire a voice query instruction input by a user and perform user intention recognition on the voice query instruction to obtain a text object corresponding to the user intention;
a retrieval module, configured to search the corpus according to the pinyin and tone of the text object to obtain at least one retrieval result, wherein each retrieval result is a word with the same pronunciation as the text object;
a sorting module, configured to read the word frequency corresponding to each retrieval result and sort the at least one retrieval result by word frequency;
a display module, configured to display the at least one retrieval result in sorted order for selection by the user; and
a response module, configured to navigate, in response to a triggering operation by the user on one of the retrieval results, to the next-level page to perform the information query.
B2. The apparatus of B1, wherein the building module comprises:
a building unit, configured to perform word segmentation on the collected data and remove stop words or meaningless words from the data to obtain the corpus; and
a counting unit, configured to perform word frequency statistics based on the TF-IDF algorithm and store the word frequency statistics in the corpus in the form of a data list.
B3. The apparatus of B1, further comprising:
a highlight processing module, configured to highlight the first-ranked retrieval result.
B4. The apparatus of B1, wherein the acquisition and identification module comprises:
a voice recognition unit, configured to recognize the voice query instruction of the user to obtain text information corresponding to the voice query instruction; and
an intention matching unit, configured to match the text information against a pre-stored intention list to determine the user intention and the text object corresponding to the user intention.
C. An apparatus, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the information query method of any one of A1-A4.
D. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the information query method of any one of A1-A4.
The foregoing description covers only the preferred embodiments of the disclosure and illustrates the principles of the technology employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to the particular combination of features described above, but also encompasses other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure. For example, a technical solution may be formed by replacing the above features with features having similar functions that are disclosed in (but not limited to) this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (10)

1. An information query method, comprising:
constructing a corpus based on pre-collected data, and counting the word frequency of each word in the corpus;
acquiring a voice query instruction input by a user, and performing user intention recognition on the voice query instruction to obtain a text object corresponding to the user intention;
searching the corpus according to the pinyin and tone of the text object to obtain at least one retrieval result, wherein each retrieval result is a word with the same pronunciation as the text object;
reading the word frequency corresponding to each retrieval result, and sorting the at least one retrieval result by word frequency;
displaying the at least one retrieval result in sorted order for selection by the user; and
in response to a triggering operation by the user on one of the retrieval results, navigating to the next-level page to perform the information query.
2. The method of claim 1, wherein constructing a corpus based on pre-collected data and counting the word frequency of each word in the corpus comprises:
performing word segmentation on the collected data, and removing stop words or meaningless words from the data to obtain the corpus; and
performing word frequency statistics based on the TF-IDF algorithm, and storing the word frequency statistics in the corpus in the form of a data list.
3. The method of claim 1, further comprising:
highlighting the first-ranked retrieval result.
4. The method of claim 1, wherein performing user intention recognition on the voice query instruction to obtain a text object corresponding to the user intention comprises:
recognizing the voice query instruction of the user to obtain text information corresponding to the voice query instruction; and
matching the text information against a pre-stored intention list to determine the user intention and the text object corresponding to the user intention.
5. An information query apparatus, comprising:
a building module, configured to construct a corpus based on pre-collected data and count the word frequency of each word in the corpus;
an acquisition and identification module, configured to acquire a voice query instruction input by a user and perform user intention recognition on the voice query instruction to obtain a text object corresponding to the user intention;
a retrieval module, configured to search the corpus according to the pinyin and tone of the text object to obtain at least one retrieval result, wherein each retrieval result is a word with the same pronunciation as the text object;
a sorting module, configured to read the word frequency corresponding to each retrieval result and sort the at least one retrieval result by word frequency;
a display module, configured to display the at least one retrieval result in sorted order for selection by the user; and
a response module, configured to navigate, in response to a triggering operation by the user on one of the retrieval results, to the next-level page to perform the information query.
6. The apparatus of claim 5, wherein the building module comprises:
a building unit, configured to perform word segmentation on the collected data and remove stop words or meaningless words from the data to obtain the corpus; and
a counting unit, configured to perform word frequency statistics based on the TF-IDF algorithm and store the word frequency statistics in the corpus in the form of a data list.
7. The apparatus of claim 5, further comprising:
a highlight processing module, configured to highlight the first-ranked retrieval result.
8. The apparatus of claim 5, wherein the acquisition and identification module comprises:
a voice recognition unit, configured to recognize the voice query instruction of the user to obtain text information corresponding to the voice query instruction; and
an intention matching unit, configured to match the text information against a pre-stored intention list to determine the user intention and the text object corresponding to the user intention.
9. An apparatus, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the information query method of any one of claims 1-4.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the information query method of any one of claims 1-4.
CN201910818419.0A 2019-08-30 2019-08-30 Information query method, device, equipment and storage medium Pending CN112307073A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910818419.0A CN112307073A (en) 2019-08-30 2019-08-30 Information query method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112307073A (en) 2021-02-02

Family

ID=74485584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910818419.0A Pending CN112307073A (en) 2019-08-30 2019-08-30 Information query method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112307073A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104424342A (en) * 2013-09-11 2015-03-18 携程计算机技术(上海)有限公司 Method for keyword matching, and device, server and system of method
CN107992587A (en) * 2017-12-08 2018-05-04 北京百度网讯科技有限公司 A kind of voice interactive method of browser, device, terminal and storage medium
CN108037837A (en) * 2017-11-07 2018-05-15 朗坤智慧科技股份有限公司 A kind of intelligent prompt method of search term
CN108509547A (en) * 2018-03-20 2018-09-07 中国长城科技集团股份有限公司 A kind of approaches to IM, information management system and electronic equipment
CN109040486A (en) * 2018-08-30 2018-12-18 中通天鸿(北京)通信科技股份有限公司 A kind of position system of call center
CN109101604A (en) * 2018-08-01 2018-12-28 深圳市元征科技股份有限公司 Vehicle brand knows method for distinguishing and vehicle brand identification device
CN109446217A (en) * 2018-09-17 2019-03-08 平安科技(深圳)有限公司 Data method, electronic device and computer readable storage medium
CN109817210A (en) * 2019-02-12 2019-05-28 百度在线网络技术(北京)有限公司 Voice writing method, device, terminal and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113903342A (en) * 2021-10-29 2022-01-07 镁佳(北京)科技有限公司 Voice recognition error correction method and device
CN113903342B (en) * 2021-10-29 2022-09-13 镁佳(北京)科技有限公司 Voice recognition error correction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination