WO2024036616A1 - A terminal-based question and answer method and device - Google Patents

Info

Publication number
WO2024036616A1
WO2024036616A1 PCT/CN2022/113668 CN2022113668W WO2024036616A1 WO 2024036616 A1 WO2024036616 A1 WO 2024036616A1 CN 2022113668 W CN2022113668 W CN 2022113668W WO 2024036616 A1 WO2024036616 A1 WO 2024036616A1
Authority
WO
WIPO (PCT)
Prior art keywords
answer
query
database
question
text
Prior art date
Application number
PCT/CN2022/113668
Other languages
English (en)
French (fr)
Inventor
石瑞枫
张鑫宇
童新
黄宏运
王琪皓
张衡
曹朝
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2022/113668
Publication of WO2024036616A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/40: Data acquisition and logging

Definitions

  • the present application relates to the field of information processing technology, and in particular to a terminal-based question and answer method and device.
  • the amount of information on the device side is exploding and it is private, and the demand for device-side search is becoming increasingly prominent.
  • the user has received a vaccination SMS, but has forgotten which SMS on which day it is, and hopes to get the required answer by directly querying it on the device; or the user needs to query all the delivery tracking numbers of a certain express delivery in the past week.
  • the current end-side search cannot filter and explicitly extract the order number list based on specific time.
  • Embodiments of the present application provide a terminal-based question and answer method and device, which realizes terminal-side question and answer by understanding the user's query intention and matching corresponding query results for the user based on the query intention.
  • this application provides a terminal-based question and answer method, which is applied to the terminal device.
  • the method includes: obtaining the query text; determining the query intention based on the query text; and retrieving the answer corresponding to the query text from the first question and answer database based on the query intention, wherein the first question and answer database is obtained based on the data stored in the terminal device.
  • the above method of determining the query intention based on the query text includes: using the query text as the input of the semantic model and outputting the semantic text; performing word segmentation processing on the semantic text to obtain several word segments; and using the several word segments as the input of the intent classification model and outputting the category of query intention.
  • the category of query intention includes the first category of query intention and the second category of query intention.
  • the category of query intent can also be called query category.
  • the meaning of query category can be understood as classifying according to the type of answer corresponding to the query text. For example, if the answer corresponding to the query text takes the form of long and short answers, the query text can be classified into one query category; if the answer corresponding to the query text is of a structured data type, the query text can be classified into another query category.
  • deep semantic technology is used to understand the user's query intention, and then achieve accurate retrieval based on the query intention.
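The three-stage intent determination described above (semantic model, word segmentation, intent classification) can be sketched as follows. This is a minimal illustration only: the source does not specify any of the models, so `semantic_model`, `segment`, and `classify_intent` are hypothetical rule-based stand-ins, and the category names and cue words are assumptions.

```python
from typing import List

FIRST_CATEGORY = "first_category"    # assumed label: answered with long/short answers
SECOND_CATEGORY = "second_category"  # assumed label: answered with structured data


def semantic_model(query_text: str) -> str:
    """Stand-in for the semantic model: here it only normalizes the text."""
    return " ".join(query_text.lower().strip().rstrip("?").split())


def segment(semantic_text: str) -> List[str]:
    """Stand-in for word segmentation: a plain whitespace split."""
    return semantic_text.split()


def classify_intent(segments: List[str]) -> str:
    """Stand-in for the intent classification model (hypothetical keyword rule)."""
    list_cues = {"all", "list", "numbers"}
    return SECOND_CATEGORY if list_cues & set(segments) else FIRST_CATEGORY


def determine_query_intent(query_text: str) -> str:
    """Query text -> semantic text -> word segments -> intent category."""
    semantic_text = semantic_model(query_text)
    segments = segment(semantic_text)
    return classify_intent(segments)
```

In a real deployment each stand-in would be replaced by a trained model; the sketch only shows how the output of one stage feeds the next.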
  • the first question and answer database includes an unstructured database and a structured database, and the unstructured database includes multiple question and answer pairs; retrieving the answer corresponding to the query text from the first question and answer database based on the query intention includes: if the query intent category is the first category of query intent, retrieving the question and answer pairs corresponding to the query text from the unstructured database; if the query intent category is the second category of query intent, retrieving the answers corresponding to the query text from the structured database.
  • an unstructured database and a structured database are deployed in the terminal.
  • the unstructured database includes multiple pairs of question and answer pairs
  • the structured database includes multiple items of structured data.
  • the answer in the question and answer pair includes the long answer and the short answer
  • the content of the long answer includes the content of the short answer
  • the long answer includes more content than the short answer.
  • the short answer includes less content than the long answer and displays the query results of the query text more concisely: only the key information in the query results is displayed, so that users can get the information they want more quickly.
  • displaying the query results to users in the form of long and short answers combines the advantages of both and gives users a better experience.
  • the unstructured database includes an inverted index structure; retrieving question and answer pairs corresponding to the query text from the unstructured database includes: determining the query language corresponding to the query text based on the semantic text and several word segments; and, based on the query language and the inverted index structure, retrieving the question and answer pairs corresponding to the query text from the unstructured database.
  • the unstructured database includes an inverted index structure and a vector index structure; retrieving question and answer pairs corresponding to the query text from the unstructured database includes: using the inverted index structure and the vector index structure respectively to search the unstructured database and obtain a candidate answer set corresponding to the query text; the candidate answer set includes several first answers retrieved using the inverted index structure and several second answers retrieved using the vector index structure; and selecting the target answer corresponding to the query text from the candidate answer set.
  • the candidate answer set may include one first answer and one second answer, or may include one first answer and multiple second answers, or may include multiple third answers.
  • by building an index library from two aspects, inverted index and vector index, multi-channel recall of end-side Q&A is supported, making the retrieved answers more comprehensive.
  • the selection of the target answer is related to one or more of a preset strategy, an inverted retrieval score, a vector retrieval score, a question matching degree, a question answer matching degree, answer text quality, and answer timeliness;
  • the question matching degree represents the matching degree of the query text and the question in the question and answer pair;
  • the question answer matching degree represents the matching degree of the query text and the answer in the question and answer pair.
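The two-channel recall and target-answer selection described above can be sketched as follows. This is an illustrative toy, not the patented implementation: the bag-of-words `embed` function stands in for a learned encoder, the weights `w_inverted` and `w_vector` are assumed, and only two of the listed selection signals (inverted retrieval score and vector retrieval score) are fused.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

QAPair = Tuple[str, str]  # (question, answer)

QA_PAIRS: List[QAPair] = [  # hypothetical sample data
    ("when is my vaccination appointment", "Your vaccination is booked for Friday at 9:00."),
    ("what is the pickup code for my parcel", "The pickup code is 4417."),
    ("where is the team dinner", "The team dinner is at the riverside restaurant."),
]


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def build_inverted_index(pairs: List[QAPair]) -> Dict[str, Set[int]]:
    """Map each question token to the ids of the pairs whose question contains it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, (question, _) in enumerate(pairs):
        for token in tokenize(question):
            index[token].add(doc_id)
    return index


def embed(text: str, vocab: List[str]) -> List[float]:
    """Toy bag-of-words vector; a real system would use a learned encoder."""
    tokens = tokenize(text)
    return [float(tokens.count(word)) for word in vocab]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, pairs: List[QAPair],
             w_inverted: float = 0.5, w_vector: float = 0.5) -> Optional[QAPair]:
    index = build_inverted_index(pairs)
    vocab = sorted(index)
    query_tokens = set(tokenize(query))
    if not query_tokens:
        return None
    # Channel 1: inverted index; score is the fraction of query tokens matched.
    inverted_scores: Dict[int, float] = defaultdict(float)
    for token in query_tokens:
        for doc_id in index.get(token, ()):
            inverted_scores[doc_id] += 1.0 / len(query_tokens)
    # Channel 2: vector index; score is the cosine similarity to each question.
    query_vector = embed(query, vocab)
    vector_scores = {doc_id: cosine(query_vector, embed(question, vocab))
                     for doc_id, (question, _) in enumerate(pairs)}
    # Candidate set is the union of both channels; select by weighted score.
    candidates = set(inverted_scores) | {d for d, s in vector_scores.items() if s > 0.0}
    if not candidates:
        return None
    best = max(candidates,
               key=lambda d: w_inverted * inverted_scores.get(d, 0.0) + w_vector * vector_scores[d])
    return pairs[best]
```

The other signals named in the text (question answer matching degree, answer text quality, answer timeliness, preset strategy) could be added as further weighted terms in the final `max` key.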
  • retrieving answers corresponding to query information from a structured database includes: identifying several slots corresponding to the query text; using the query text and several slots as inputs to the query language generation model, and outputting the query The query language corresponding to the text; based on the query language, the answer corresponding to the query text is retrieved from the structured database.
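The slot-to-query-language flow for structured data can be sketched as follows. The source does not name the query language or the schema, so this sketch assumes SQL over a hypothetical `parcels` table in SQLite, and replaces the slot recognizer and the query language generation model with simple rules.

```python
import sqlite3
from datetime import date, timedelta
from typing import Dict, List, Tuple


def identify_slots(query_text: str, today: date) -> Dict[str, str]:
    """Stand-in slot recognizer: looks for a courier name and a time range."""
    tokens = query_text.lower().replace("?", "").split()
    slots: Dict[str, str] = {}
    for courier in ("yunda", "sf"):  # hypothetical courier vocabulary
        if courier in tokens:
            slots["courier"] = courier
    if "week" in tokens:  # assumed rule: "this week" means the last 7 days
        slots["since"] = (today - timedelta(days=7)).isoformat()
    return slots


def generate_query(query_text: str, slots: Dict[str, str]) -> Tuple[str, List[str]]:
    """Stand-in for the query language generation model.

    This rule-based version uses only the slots; a learned model would also
    read query_text. It emits parameterized SQL.
    """
    clauses: List[str] = []
    params: List[str] = []
    if "courier" in slots:
        clauses.append("courier = ?")
        params.append(slots["courier"])
    if "since" in slots:
        clauses.append("received_on >= ?")
        params.append(slots["since"])
    sql = "SELECT tracking_number FROM parcels"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY received_on", params


def answer_structured(query_text: str, conn: sqlite3.Connection, today: date) -> List[str]:
    sql, params = generate_query(query_text, identify_slots(query_text, today))
    return [row[0] for row in conn.execute(sql, params)]


def demo_connection() -> sqlite3.Connection:
    """Build a small in-memory structured database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parcels (courier TEXT, tracking_number TEXT, received_on TEXT)")
    conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", [
        ("yunda", "YD000", "2023-12-01"),
        ("yunda", "YD001", "2024-01-10"),
        ("sf", "SF123", "2024-01-11"),
    ])
    return conn
```

Dates are stored as ISO 8601 strings so that string comparison in the `WHERE` clause matches chronological order.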
  • before the query text is used as the input of the semantic model to output the semantic text, the method further includes: determining that the query text has question-answering intent.
  • a lightweight database and a full database are deployed in the terminal device, where the full database includes text data extracted from data in multiple formats stored in the terminal device; the lightweight database includes the full database Text data that conforms to preset rules; the data volume of the lightweight database is less than or equal to one percent of the data volume of the full database; the first question and answer database includes multiple question and answer pairs generated based on the text data in the lightweight database, and/ Or structured data generated based on text data in a lightweight database.
  • there are multiple terminal devices, and the lightweight databases in the multiple terminal devices are synchronized, so that the lightweight database of each terminal device includes the contents of the lightweight databases of the other terminal devices. This widens the scope of question and answer search and enables cross-end search of high-quality content without going out of the way.
  • a second question and answer database is also deployed in the terminal device.
  • the second question and answer database includes multiple question and answer pairs generated based on the text data in the full database, and/or structured data generated based on the text data in the full database; the method also includes: if the answer corresponding to the query text is not retrieved from the first question and answer database of the terminal device, retrieving, based on the query intention, the answer corresponding to the query text from the second question and answer database of the terminal device.
  • the search is continued from the second question and answer database to achieve full depth retrieval and meet the user's question and answer needs.
  • multiple question and answer pairs in the second question and answer database may form an unstructured database, and the structured data in the second question and answer database may form a structured database.
  • the terminal-based question and answer method also includes: if the answer corresponding to the query text is not retrieved from the first question and answer database of the terminal device, retrieving, based on the query intention, the answer corresponding to the query text from the second question and answer databases of multiple terminal devices.
  • the terminal-based question and answer method also includes updating the answers retrieved from the second question and answer database to the first question and answer database, which further improves the content of the first question and answer database and increases its hit rate.
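The fallback from the first question and answer database to the second, followed by the update step, behaves like a two-tier lookup with write-back. The sketch below is a simplification under stated assumptions: both databases are modeled as plain dictionaries keyed by query text, and the intent-based routing is omitted; the class name `TieredQA` is hypothetical.

```python
from typing import Dict, Optional


class TieredQA:
    """Look up the first (lightweight-derived) database, then the second (full-derived) one."""

    def __init__(self, first_db: Dict[str, str], second_db: Dict[str, str]) -> None:
        self.first_db = first_db
        self.second_db = second_db

    def answer(self, query_text: str) -> Optional[str]:
        hit = self.first_db.get(query_text)
        if hit is not None:
            return hit
        # Not found in the first database: continue in the second one.
        hit = self.second_db.get(query_text)
        if hit is not None:
            # Write the answer back so that the next identical query hits the first database.
            self.first_db[query_text] = hit
        return hit
```

For example, a query answered only by the second database is copied into the first database on its first lookup, so a repeat of the same query no longer needs the deeper search.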
  • this application provides a terminal interaction method, which is applied to the terminal device.
  • the method includes: receiving the query content input in the search box; in response to the search request, searching for the answer corresponding to the query content in the terminal device; and displaying the query results for the query content, where the content displayed in the query results includes the searched answers.
  • when the answer corresponding to the query content is a first-category answer, the query results are displayed in the form of long and short answers; when the answer corresponding to the query content is a second-category answer, the query results are displayed in the form of a table.
  • the categories of answers can be divided according to the format of the answers. For example, when the answer is a long or short answer, it is divided into the first category, and when the answer is structured data, it is divided into the second category.
  • the answer includes a long answer and a short answer
  • the content of the long answer includes the content of the short answer
  • the content displayed by the query results also includes questions corresponding to the query content.
  • the content displayed by the query results also includes the data source of the answer, so that the user knows the source of the search answer.
  • the answer is structured data
  • the structured data includes several pieces of structured data searched in the terminal device for the query content.
  • the content displayed in the query results also includes information source controls corresponding to each of the several pieces of structured data; the terminal interaction method also includes: when it is detected that an information source control is triggered, displaying the details of the structured data corresponding to that information source control.
  • the content displayed in the query results also includes the question text corresponding to the query content, and the slot keywords in the question text are displayed in different colors.
  • the structured data is displayed in tabular form.
  • the table here can be understood as structured data distributed in the form of rows and columns. One row distributes a piece of structured data, and one column distributes the same type of data in the structured data.
  • the table also includes a table header, located in the first row of the table, which indicates the content and meaning of each column of the table. It should be noted that the table in the embodiments of the present application is not limited to having border lines: as long as the distribution of data conforms to the form of a table, it can be considered to be displayed in the form of a table (see Figure 12).
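The borderless tabular layout described above (one row per piece of structured data, one column per data type, header in the first row) can be illustrated with a small text renderer. The function name `render_table` and the sample column names are assumptions for illustration; the actual UI is graphical.

```python
from typing import List, Sequence


def render_table(header: Sequence[str], rows: List[Sequence[str]]) -> str:
    """Render a header row plus data rows as aligned columns without border lines."""
    all_rows = [header, *rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(str(cell)) for cell in column) for column in zip(*all_rows)]
    lines = [
        "  ".join(str(cell).ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in all_rows
    ]
    return "\n".join(lines)
```

Calling `render_table(["Courier", "Tracking number"], [["Yunda", "YD001"]])` yields two aligned lines with no borders, which still counts as a table in the sense used above.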
  • query results are displayed in the form of cards.
  • the search box is located on the negative screen of the terminal device, or is located at a search entrance in an application installed on the terminal device.
  • this application provides a terminal-based question and answer device, including: an acquisition module, a determination module and a retrieval module, wherein the acquisition module is used to acquire the query text; the determination module is used to determine the query intention based on the query text; The retrieval module is used to retrieve the answer corresponding to the query text from a first question and answer database based on the query intention; the first question and answer database is obtained based on the data stored in the terminal device.
  • the determination module is specifically used to: use the query text as the input of the semantic model and output the semantic text; perform word segmentation processing on the semantic text to obtain several word segments; and use the several word segments as the input of the intent classification model and output the category of query intention, where the categories of query intention include first category query intentions and second category query intentions.
  • the first question and answer database includes an unstructured database and a structured database, and the unstructured database includes multiple question and answer pairs; the retrieval module is specifically used to: if the query intent category is the first category of query intent, retrieve the question-answer pairs corresponding to the query text from the unstructured database; if the query intent category is the second category of query intent, retrieve the answers corresponding to the query text from the structured database.
  • retrieving the question-answer pairs corresponding to the query text from the unstructured database includes: determining the query language corresponding to the query text based on the semantic text and several word segments; and, based on the query language, retrieving the question-answer pairs corresponding to the query text from the unstructured database.
  • the unstructured database includes an inverted index structure and a vector index structure; based on the query language, retrieving the question and answer pairs corresponding to the query text from the unstructured database includes: based on the query language, using the inverted index structure and the vector index structure respectively to search the unstructured database and obtain the candidate answer set corresponding to the query text; the candidate answer set includes several first answers retrieved using the inverted index structure and several second answers retrieved using the vector index structure; and selecting the target answer corresponding to the query text from the candidate answer set.
  • the selection of the target answer is related to one or more of the preset strategy, inverted search score, vector search score, question matching degree, question answer matching degree, answer text quality and answer timeliness.
  • the question matching degree represents the matching degree of the query text and the question in the question and answer pair
  • the question answer matching degree represents the matching degree of the query text and the answer in the question and answer pair.
  • retrieving the answer corresponding to the query information from the structured database includes: identifying several slots corresponding to the query text; using the query text and the several slots as input to the query language generation model, The query language corresponding to the query text is output; based on the query language, the answer corresponding to the query text is retrieved from the structured database.
  • the answer in the question-answer pair includes a long answer and a short answer
  • the content of the long answer includes the content of the short answer
  • the determining module is also used to determine that the query text has question-answering intent.
  • a lightweight database and a full database are deployed in the terminal device, where the full database includes text data extracted from data in multiple formats stored in the terminal device; the lightweight database includes the full database Text data that conforms to preset rules; the data volume of the lightweight database is less than or equal to one percent of the data volume of the full database; the first question and answer database includes multiple question and answer pairs generated based on the text data in the lightweight database, and/ Or structured data generated based on text data in a lightweight database.
  • the terminal device includes multiple terminal devices, and the lightweight databases in the multiple terminal devices are synchronized so that the lightweight database of each terminal device includes the contents of the lightweight databases of other terminal devices.
  • lightweight databases in multiple terminal devices are synchronized when the terminal device is in an idle state.
  • a second question and answer database is also deployed in the terminal device.
  • the second question and answer database includes multiple question and answer pairs generated based on the text data in the full database, and/or structured data generated based on the text data in the full database; the retrieval module is also used to: if the answer corresponding to the query text is not retrieved from the first question and answer database of the terminal device, retrieve, based on the query intention, the answer corresponding to the query text from the second question and answer database of the terminal device.
  • the retrieval module is also configured to: if the answer corresponding to the query text is not retrieved from the first question and answer database of the terminal device, retrieve, based on the query intention, the answer corresponding to the query text from the second question and answer databases of multiple terminal devices.
  • the device further includes an update module, configured to update the answer retrieved from the second question and answer database into the first question and answer database.
  • the present application provides a terminal device, including a memory and a processor.
  • the memory stores executable code
  • the processor executes the executable code to implement the method provided in the first aspect and/or the second aspect of the present application.
  • the present application provides a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed in a computer, the computer is caused to execute the method provided in the first aspect and/or the second aspect of the present application.
  • the present application provides a computer program or computer program product.
  • the computer program or computer program product includes instructions. When the instructions are executed, the method provided by the first aspect and/or the second aspect of the present application is implemented.
  • Figure 1 is a schematic diagram of a question and answer query under the first solution, in which the terminal device cannot find the corresponding answer;
  • Figure 2 is a schematic structural block diagram of a mobile phone provided by an embodiment of the present application.
  • Figure 3 is a flow chart of a terminal-based question and answer method provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of using inverted index and vector index to retrieve answers from an unstructured database
  • Figure 5 is a schematic diagram of retrieving answers from a structured database
  • Figure 6 is a schematic diagram of the implementation process of a terminal-based question and answer method
  • Figure 7 is a schematic diagram of the construction process of full database and lightweight database
  • Figure 8 is a schematic diagram of the construction process of a lightweight database
  • Figure 9 is a schematic diagram of cross-end data update synchronization and search when there are multiple terminal devices
  • Figure 10 is a schematic diagram of the generation process of the question and answer database
  • Figure 11 is a schematic diagram of a UI interface for short answers to daily information
  • Figure 12 is a schematic diagram of a UI interface for daily information aggregation
  • Figure 13 is a schematic diagram of the interface changes after the user clicks button 2;
  • Figure 14 is a schematic diagram of several forms of search entrance settings for question and answer
  • Figure 15 is a schematic structural diagram of a terminal-based question and answer device provided by an embodiment of the present application.
  • Figure 16 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • the first solution is to directly read the SMS text information into the memory with a simple storage method.
  • the user query (query) is string matched with the stored SMS text. If the user's search is particularly simple, this solution can quickly find accurate answers. For example, if you search for "Zhang San" in text messages, all the text messages whose recipient/sender is "Zhang San" or whose message text contains "Zhang San" can be displayed.
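The first solution amounts to substring matching over stored messages, which can be sketched as follows. The message fields `sender`, `recipient`, and `text` are assumed names; the source only says that the query is string matched against stored SMS text and the recipient/sender.

```python
from typing import Dict, List


def search_sms(messages: List[Dict[str, str]], query: str) -> List[Dict[str, str]]:
    """Return every message whose sender, recipient, or text contains the query string."""
    needle = query.lower()
    return [
        message for message in messages
        if needle in message["sender"].lower()
        or needle in message["recipient"].lower()
        or needle in message["text"].lower()
    ]
```

A natural-language question such as "which day is my vaccination" contains no literal substring of the target message, so this approach returns nothing for it, which illustrates the limitation that motivates intent-based retrieval.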
  • the second solution is to first perform a matching query on the client side. If there is no matching result on the client side, the user parameter information obtained on the client side, such as time and location, is sent to the cloud side together with the user query, relying on the cloud side to improve the query capabilities of the search engine and increase the probability of getting answers.
  • the drawback of this solution is that the user parameter information is sent to the cloud side, so the user's privacy is not protected.
  • this method can only answer simple daily questions, such as "Hangzhou Epidemic Prevention Telephone Number", but cannot answer questions that are strongly related to client-side information, such as "What are the tracking numbers of Yunda Express this week?", because a large amount of client-side information, such as galleries, memos, screenshots, and documents, cannot be used, so true client-side queries cannot be realized.
  • embodiments of the present application provide a terminal-based question and answer method, which realizes terminal-side question and answer by understanding the user's query intention and matching corresponding query results for the user based on the query intention.
  • the terminal-based question and answer method can be applied to terminal devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted terminals, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), and personal digital assistants (PDA). The embodiments of this application do not limit the specific type of terminal device.
  • FIG. 2 shows a schematic structural block diagram of a mobile phone.
  • the mobile phone includes components such as a radio frequency (RF) circuit 210, a memory 220, an input unit 230, a display unit 240, a sensor 250, an audio circuit 260, a wireless fidelity (WiFi) module 270, a processor 280, and a power supply 290.
  • the structure of the mobile phone shown in FIG. 2 does not limit the mobile phone, and may include more or fewer components than shown in the figure, or combine certain components, or arrange different components.
  • the RF circuit 210 can be used to receive and transmit information or signals during a call. In particular, after receiving downlink information from the base station, it is processed by the processor 280; in addition, the designed uplink data is sent to the base station.
  • RF circuits include, but are not limited to, antennas, at least one amplifier, transceivers, couplers, low noise amplifiers (LNA), duplexers, etc. Additionally, RF circuitry 210 may communicate with networks and other devices through wireless communications.
  • the above wireless communication can use any communication standard or protocol, including but not limited to global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, short messaging service (SMS), etc.
  • the memory 220 can be used to store software programs and modules.
  • the processor 280 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 220 .
  • the memory 220 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playback function, an image playback function, etc.), etc.; the storage data area may store data created according to the use of the mobile phone (such as audio data, phone books, etc.), etc.
  • the memory 220 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
  • the input unit 230 may be used to receive input numeric or character information, and generate key signal input related to user settings and function control of the mobile phone 200 .
  • the input unit 230 may include a touch panel 231 and other input devices 232 .
  • the touch panel 231, also known as a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 231 with a finger, stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program.
  • the touch panel 231 may include two parts: a touch detection device and a touch controller.
  • the touch detection device detects the user's touch orientation, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact point coordinates, sends them to the processor 280, and can receive and execute commands sent by the processor 280.
  • the touch panel 231 can be implemented using various types such as resistive, capacitive, infrared, and surface acoustic wave.
  • the input unit 230 may also include other input devices 232.
  • other input devices 232 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), trackball, mouse, joystick, etc.
  • the display unit 240 may be used to display information input by the user or information provided to the user as well as various menus of the mobile phone.
  • the display unit 240 may include a display panel 241.
  • the display panel 241 may be configured in the form of a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (OLED), etc.
  • the touch panel 231 may cover the display panel 241. When the touch panel 231 detects a touch operation on or near it, it transmits the operation to the processor 280 to determine the type of the touch event, and the processor 280 then provides the corresponding visual output on the display panel 241 according to the type of the touch event.
  • although the touch panel 231 and the display panel 241 are described as two independent components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 231 and the display panel 241 can be integrated to implement the input and output functions of the mobile phone.
  • the mobile phone 200 may also include at least one sensor 250, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor.
  • the ambient light sensor may adjust the brightness of the display panel 241 according to the brightness of the ambient light.
  • the proximity sensor may turn off the display panel 241 and/or the backlight when the mobile phone is moved to the ear.
  • the acceleration sensor can detect the magnitude of acceleration in various directions (usually three axes), detect the magnitude and direction of gravity at a stationary moment, and can be used to identify applications such as mobile phone posture (such as horizontal and vertical screen switching, related games and Magnetometer attitude calibration), vibration recognition related functions (such as pedometer, knock), etc.; as for the mobile phone, it can also be equipped with other sensors such as gyroscope, barometer, hygrometer, thermometer, infrared sensor, etc., which will not be described here.
  • the audio circuit 260, speaker 261, and microphone 262 can provide an audio interface between the user and the mobile phone.
  • on the one hand, the audio circuit 260 can transmit the electrical signal converted from the received audio data to the speaker 261, and the speaker 261 converts it into a sound signal for output; on the other hand, the microphone 262 converts the collected sound signal into an electrical signal, which the audio circuit 260 receives and converts into audio data; the audio data is then output to the processor 280 for processing and sent to, for example, another mobile phone through the RF circuit 210, or output to the memory 220 for further processing.
  • WiFi is a short-distance wireless transmission technology.
  • the mobile phone can help users send and receive emails, browse web pages, and access streaming media through the WiFi module 270. It provides users with wireless broadband Internet access.
  • although FIG. 2 shows the WiFi module 270, it can be understood that it is not a necessary component of the mobile phone 200 and can be omitted as needed without changing the essence of the invention.
  • the processor 280 is the control center of the mobile phone. It uses various interfaces and lines to connect various parts of the entire mobile phone, and performs overall monitoring of the mobile phone by running or executing various functions of the mobile phone and processing data.
  • The processor 280 may include one or more processing units. Preferably, the processor 280 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communications. It can be understood that the modem processor may also not be integrated into the processor 280.
  • the mobile phone 200 also includes a power supply 290 (such as a battery) that supplies power to various components.
  • the power supply 290 can be logically connected to the processor 280 through a power management system, thereby realizing functions such as charging, discharging, and power consumption management through the power management system.
  • the mobile phone 200 may also include a camera.
  • the position of the camera on the mobile phone 200 can be front or rear, which is not limited in the embodiment of the present application.
  • the mobile phone 200 may include a single camera, dual cameras, or three cameras, which are not limited in the embodiments of the present application.
  • the mobile phone 200 may include three cameras, including one main camera, one wide-angle camera, and one telephoto camera.
  • The multiple cameras may all be front-facing, or all rear-facing, or some may be front-facing and the others rear-facing. This is not limited in the embodiments of the present application.
  • OCR optical character recognition
  • Question-answer generation means that, given a context (a paragraph or sentence), a deep semantic model generates a fluent, context-consistent question-answer pair from the context, such that the generated question can be answered by the answer.
  • MRC Machine reading comprehension
  • NLP natural language processing
  • Natural language understanding is the study of using computers to simulate the human process of language communication, so that computers can understand and use the natural languages of human society, such as Chinese and English, to achieve natural language communication between humans and machines and to take over part of human mental work, including looking up information, answering questions, excerpting documents, compiling materials, and all other processing of natural language information.
  • Query understanding interprets the query words entered by the user, breaks down the intent behind the user's search terms, and quickly locates core words, attribute words, and so on.
  • Qq matching: search the questions (q) in the question and answer database for matches to the user query (Q), and recall the top-K q-a pairs with the highest similarity to the query. The corresponding answers are the answer candidates to be returned.
  • Recall methods include text recall, vector recall, etc.
  • Inverted index: also commonly called a reverse index, postings file, or inverted file, it is an indexing method used in full-text search to store a mapping from a word to its storage locations in a document or a group of documents. It is the most commonly used data structure in document retrieval systems.
  • Inverted search refers to retrieval based on the inverted index, generally suitable for fast full-text search.
  • Vector retrieval: calculate the similarity between the given question vector and each document (class) vector, then sort the documents whose similarity exceeds a certain threshold (or the required number of documents) in descending order of similarity and output them.
  • Slots and slot values: slots represent clearly defined attributes of an entity, such as the departure location, destination, and departure time in taxi hailing. Slot values are the specific values filled into the slots, such as Hangzhou, Shanghai, and 20220510.
  • a terminal-based question and answer method provided by the embodiment of the present application is introduced in detail below.
  • Figure 3 is a flow chart of a terminal-based question and answer method according to an embodiment of the present application. This method can be implemented on the terminal device shown in Figure 2. As shown in Figure 3, the terminal-based question and answer method provided by the embodiment of the present application at least includes steps S301 to S303.
  • step S301 the query text is obtained.
  • Users can manually enter a natural-language query sentence as the query text in the question and answer input box of a terminal (such as a mobile phone). For example, the user taps the question and answer input box of the question and answer interface displayed on the touch screen of the mobile phone to bring up the virtual keyboard, enters the query text "Where should the third shot of the vaccine be given?" through the virtual keyboard, and taps the "OK" button to submit the query text. In response to this operation, the mobile phone obtains the query text and displays it on the display interface.
  • users can also input query text via voice.
  • the user operates the voice input icon on the Q&A interface displayed on the touch screen of the mobile phone, and then inputs voice through the microphone.
  • the mobile phone obtains the voice input by the user, then recognizes the voice content, and then converts it into query text.
  • The specific solution for converting voice content into text content on a mobile phone may follow existing technology and is not described in detail in the embodiments of this application.
  • step S302 the query intention is determined based on the query text.
  • the query intent includes a first category of query intent and a second category of query intent.
  • the first category of query intent can also be called daily information short answer query intent
  • the second category of query intent can also be called daily information aggregation query intent.
  • Before the query text is input into the semantic model, the method further includes determining whether the query text has a question and answer intention. If it does, the query text is input into the semantic model and the subsequent query intention identification is performed; if it does not, the process exits without performing further steps. That is, before the query intention is determined, non-question-and-answer filtering is performed on the query text.
  • In step S303, based on the query intention, the answer corresponding to the query text is retrieved from the first question and answer database, where the first question and answer database is obtained based on the data stored in the terminal device.
  • the first question and answer database includes an unstructured database
  • the unstructured database includes a plurality of question and answer pairs
  • the plurality of question and answer pairs are generated based on data stored in the terminal device.
  • The question and answer pairs corresponding to the query text are retrieved from the unstructured database. For example, based on the query intent, an index is used to retrieve the question and answer pairs corresponding to the query text from the unstructured database.
  • The index structure in the unstructured database includes an inverted index.
  • In this case, using the index to retrieve the question and answer pairs corresponding to the query text is implemented as follows: based on the semantic text and the word segments obtained in step S302, determine the query language corresponding to the query text.
  • The query language can be a domain-specific language (DSL); the DSL corresponding to the query text is then used to retrieve the question and answer pairs corresponding to the query text from the unstructured database based on the inverted index.
  • the inverted search engine can use Elasticsearch, which uses the structure of the inverted index and is suitable for fast full-text search.
  • When creating an inverted index, Elasticsearch (ES) first splits the text of each document into separate words, called terms or tokens, and then creates a sorted list containing all unique terms. For each term, there is a list of the documents containing it.
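  • The index-building step described above can be illustrated with a minimal pure-Python sketch. It is not Elasticsearch code: the names `tokenize`, `build_inverted_index`, and `search`, and the sample documents, are hypothetical, and the whitespace-style tokenizer is an assumption that would have to be replaced by a word segmenter for Chinese text.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens (a stand-in for a real analyzer)."""
    return re.findall(r"\w+", text.lower())


def build_inverted_index(docs):
    """Map each unique token to the sorted list of document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            postings[token].add(doc_id)
    # Sorted list of unique terms, each with its document list.
    return {token: sorted(ids) for token, ids in sorted(postings.items())}


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        result &= set(index.get(token, []))
    return sorted(result)


# Hypothetical on-device messages used only for illustration.
docs = {
    1: "Your vaccine appointment is at the community clinic",
    2: "Pickup code for your express parcel",
    3: "Vaccine reminder: bring your ID card",
}
index = build_inverted_index(docs)
print(search(index, "your vaccine"))
```

  • Intersecting the document lists of the query tokens is only one possible matching rule; a full-text engine would typically also score and rank the matches.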
  • the index structure in unstructured data includes vector indexes.
  • In this case, using the index to retrieve the question and answer pairs corresponding to the query text from the unstructured database is implemented as follows: first, vectorize the query text input by the user to obtain its vector representation; then, based on the vector index, retrieve several question and answer pairs corresponding to the query text from the unstructured database and use them as the candidate answer set.
  • the vector search engine can use FAISS.
  • The embeddings of the questions/answers are obtained through the semantic model and loaded into the FAISS similarity vector retrieval library to build an index; in the online stage, the embedding of the query is obtained first, the nearest question/answer embeddings are searched in FAISS, and the corresponding answers are returned as the recall result.
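  • The build-then-search flow above can be sketched with a brute-force cosine-similarity index. This is an illustrative stand-in, not FAISS: the class `BruteForceIndex`, its methods, and the hand-written two-dimensional embeddings are assumptions, whereas a real system would obtain embeddings from the semantic model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class BruteForceIndex:
    """Hypothetical stand-in for a vector index: exhaustive nearest-neighbor search."""

    def __init__(self):
        self._vectors = {}
        self._answers = {}

    def add(self, qa_id, embedding, answer):
        """Index-building step: store the embedding and its answer."""
        self._vectors[qa_id] = embedding
        self._answers[qa_id] = answer

    def search(self, query_embedding, top_k=3):
        """Online step: return the top_k (id, answer, score) by similarity."""
        scored = [
            (cosine(query_embedding, vec), qa_id)
            for qa_id, vec in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(qa_id, self._answers[qa_id], score) for score, qa_id in scored[:top_k]]


# Toy embeddings, assumed for illustration only.
index = BruteForceIndex()
index.add("qa1", [1.0, 0.0], "Community clinic A")
index.add("qa2", [0.0, 1.0], "Pickup code 1234")
index.add("qa3", [0.8, 0.6], "Community clinic B")

hits = index.search([1.0, 0.1], top_k=2)
print([qa_id for qa_id, _, _ in hits])
```

  • Exhaustive search is only practical for small collections; an approximate nearest-neighbor library trades some accuracy for much lower latency on larger ones.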
  • The index structure in the unstructured database includes both an inverted index and a vector index, and both are used to retrieve the question and answer pairs corresponding to the query text from the unstructured database.
  • The implementation, as shown in Figure 4, is as follows: the inverted index and the vector index are each used to search the unstructured database to obtain the candidate answer set corresponding to the query text; the candidate answer set includes several first answers retrieved using the inverted index and several second answers retrieved using the vector index.
  • Two index structures, an inverted index and a vector index, are established in the unstructured database.
  • The inverted index and the vector index are used to perform inverted retrieval and vector retrieval on the unstructured database in parallel, supporting multi-channel recall for end-side question answering and making the retrieved answers more comprehensive.
  • The candidate answer set consists of the answers whose matching degree is greater than a threshold, or the top-k answers.
  • For the answers retrieved through both indexes, the matching degree between the query text and the answer in each question and answer pair and the matching degree between the query text and the question in each pair are calculated in a unified way; the retrieved answers are sorted by matching degree, and the top-k answers or the answers whose matching degree is greater than the threshold form the candidate answer set.
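  • A minimal sketch of merging the two recall channels into one candidate set follows. It assumes that each channel already reports a matching degree on a common scale and that a pair found by both channels keeps its higher score; the function name `merge_recall` and the sample scores are hypothetical.

```python
def merge_recall(inverted_hits, vector_hits, top_k=None, threshold=None):
    """Merge two {qa_id: matching_degree} dicts into one ranked candidate list."""
    merged = {}
    for hits in (inverted_hits, vector_hits):
        for qa_id, score in hits.items():
            # Assumption: keep the best score when both channels return a pair.
            merged[qa_id] = max(score, merged.get(qa_id, float("-inf")))
    # Sort by descending matching degree; break ties by id for determinism.
    ranked = sorted(merged.items(), key=lambda pair: (-pair[1], pair[0]))
    if threshold is not None:
        ranked = [(qa_id, s) for qa_id, s in ranked if s > threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked


# Hypothetical matching degrees from the two channels.
inverted_hits = {"qa1": 0.9, "qa2": 0.4}
vector_hits = {"qa2": 0.7, "qa3": 0.2}

print(merge_recall(inverted_hits, vector_hits, top_k=2))
print(merge_recall(inverted_hits, vector_hits, threshold=0.5))
```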
  • the target answer corresponding to the query text is selected from the candidate answer set.
  • The factors considered when selecting the target answer corresponding to the query text from the candidate answer set include one or more of the following: the preset strategy, inverted retrieval score, vector retrieval score, question matching degree, question-answer matching degree, answer text quality, and answer timeliness.
  • the question matching degree represents the matching degree of the query text and the question in the question and answer pair
  • the question answer matching degree represents the matching degree of the query text and the answer in the question and answer pair.
  • The timeliness of the answer can be understood as the time of the data source corresponding to the answer: the older the data source, the worse the timeliness; the more recent the data source, the better the timeliness.
  • The quality of the answer text can be measured by the quality of the text information contained in the data source corresponding to the answer, and the quality of text information can be measured in many ways. For example, the more information the text contains, the better the quality; or the better the semantic coherence of the text, the better the quality.
  • The inverted retrieval score, vector retrieval score, question matching degree, question-answer matching degree, answer text quality, answer timeliness, and so on of each answer in the candidate answer set are normalized, and the normalized scores are used as the score of each answer. The ranking of the answers is then adjusted through preset strategies, such as raising the weight of high-quality libraries and lowering the weight of political or pornographic content.
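  • The normalize-then-intervene ranking can be sketched as below. Several details are assumptions not stated in the text: min-max normalization is used, the normalized features are averaged into one score, and the preset strategy is modeled as a multiplicative weight per tag. The names `min_max`, `rank_candidates`, and the tags are hypothetical.

```python
def min_max(values):
    """Min-max normalize a list of numbers into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_candidates(candidates, features, strategy_weights):
    """Score candidates from normalized features, then apply preset-strategy weights."""
    columns = {f: min_max([c[f] for c in candidates]) for f in features}
    ranked = []
    for i, cand in enumerate(candidates):
        # Assumption: the per-answer score is the mean of its normalized features.
        base = sum(columns[f][i] for f in features) / len(features)
        # Assumption: strategies boost or demote by a multiplicative factor.
        weight = strategy_weights.get(cand.get("tag"), 1.0)
        ranked.append((cand["id"], base * weight))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical candidates with raw, unnormalized feature values.
candidates = [
    {"id": "a", "inverted": 2.0, "vector": 0.9, "tag": None},
    {"id": "b", "inverted": 4.0, "vector": 0.5, "tag": "high_quality"},
    {"id": "c", "inverted": 3.0, "vector": 0.7, "tag": "sensitive"},
]
strategy_weights = {"high_quality": 1.5, "sensitive": 0.1}

ranked = rank_candidates(candidates, ["inverted", "vector"], strategy_weights)
print([answer_id for answer_id, _ in ranked])
```

  • Without the strategy weights the three candidates would score almost identically, so the example shows the intervention step deciding the final order.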
  • the answers in the question and answer pairs in the unstructured database include long answers and short answers.
  • the long answers are more detailed answers to the questions, and the short answers are relatively simple answers to the questions.
  • For example, the question of a certain question and answer pair in the unstructured database is "Community epidemic prevention telephone number?"
  • the long answer is "Community epidemic consultation telephone number of Bantian Street, Longgang District: 0755-8960****"
  • the short answer is "0755-8960****".
  • the first question and answer database includes a structured database, and there are multiple pieces of structured data in the structured database.
  • the query intent category is daily information aggregation
  • the answer corresponding to the query text is retrieved from the structured database.
  • The query text and the slot dictionary are used to identify the slots in the query text, and all identified slots are recorded. For example, if the query text is "Pickup codes of all Yunda Express in the past week", the slots identified according to the slot dictionary include "time", "express type", and "pickup code". The identified slots and the query text are then input into the DSL generation model to generate a DSL that uses the identified slots as filtering attributes.
  • The slot dictionary, which covers a rich set of scenarios, is obtained by mining structured data in the terminal device, for example, the multiple pieces of structured data in the structured database.
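  • Slot identification and DSL construction can be sketched as follows. The text uses a DSL generation model; the rule-based `build_dsl` below is only a stand-in, and `SLOT_DICTIONARY`, its keywords and normalized values, and the split into `filter` and `select` parts are all hypothetical assumptions.

```python
# Hypothetical slot dictionary: slot name -> {keyword: normalized slot value}.
# A value of None marks a slot that is requested rather than constrained.
SLOT_DICTIONARY = {
    "time": {"past week": "last_7_days", "today": "today"},
    "express_type": {"yunda": "Yunda Express"},
    "pickup_code": {"pickup code": None},
}


def identify_slots(query, slot_dictionary):
    """Return {slot: value} for every slot whose keyword occurs in the query."""
    text = query.lower()
    found = {}
    for slot, keywords in slot_dictionary.items():
        for keyword, value in keywords.items():
            if keyword in text:
                found[slot] = value
                break
    return found


def build_dsl(slots):
    """Build a simple dict-shaped query from the identified slots."""
    return {
        "filter": {s: v for s, v in slots.items() if v is not None},
        "select": [s for s, v in slots.items() if v is None],
    }


query = "Pickup codes of all Yunda Express in the past week"
slots = identify_slots(query, SLOT_DICTIONARY)
dsl = build_dsl(slots)
print(dsl)
```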
  • the first question and answer database includes an unstructured database and a structured database
  • the unstructured database includes multiple question and answer pairs
  • the structured database includes multiple pieces of structured data.
  • a search is performed from the unstructured database to obtain a set of candidate answers.
  • a search is performed from the structured database to obtain a number of answers corresponding to the query text.
  • Structured data is filtered and sorted through preset strategies to obtain the final answer, and the answer is displayed to the user in the form of a card.
  • the terminal device After the user inputs the query text, the terminal device responds to the query text and first performs non-question and answer filtering on the query text. If the query text contains a question and answer intention, the subsequent process continues, otherwise it exits.
  • The basic NLU model and QU perform semantic analysis and understanding of the query text to obtain the semantic text and the word segments corresponding to the query text.
  • The semantic text and word segments are input into the intent classification model (also called the end-side question answering NLU) to determine whether the query intention of the query text is "daily information short answer" or "daily information aggregation".
  • If the query intention is "daily information short answer", the daily information short answer process is entered.
  • This process is mainly based on question and answer pair retrieval and can obtain the long and short answers to the question. If the query intention is "daily information aggregation", the daily information aggregation process is entered; this process is mainly based on numerical reasoning and can obtain multiple accurate structured answers.
  • Several retrieved answers form a candidate answer set.
  • Under the rule strategy (which can also be called a preset strategy), manual strategies are used to intervene in the sorting of answers, such as raising the weight of high-quality libraries and lowering the weight of political or pornographic content.
  • The answer ranked first in the candidate answer set is displayed as a question and answer card.
  • A lightweight database and a full database are deployed in the terminal device. The full database includes text data extracted from data in multiple formats stored in the terminal device; the lightweight database includes the text data in the full database that conforms to preset rules; the data volume of the lightweight database is less than or equal to one percent of the data volume of the full database; and the first question and answer database includes multiple question and answer pairs generated based on the text data in the lightweight database, and/or structured data generated based on the text data in the lightweight database.
  • Figure 7 shows the construction process of full database and lightweight database.
  • Information extraction, document extraction, OCR, and other technologies are used to extract text information from SMS messages, memos, documents, pictures, and other information on the terminal side; this text constitutes the full database on the device.
  • The full database (the Main library) includes all the filtered text data, or only the text data that meets certain conditions, for example, text data whose length is greater than a preset value; text that is too short is discarded to ensure the validity of the content.
  • a lightweight database is built on the basis of the Main library, which can also be called a Lite library.
  • The storage size of the Lite library is generally 1% of that of the Main library, and its data mainly comes from three sources. First, high-priority information source data is filtered out of the Main library through high-quality information filtering.
  • High-priority information sources can be user-defined, such as text messages and notepads. Second, for query text that returned no results when the user requested it online, the end-side question and answer system processes it offline.
  • Third, the Lite library synchronizes information updates, allowing a single device to access high-quality data from other devices.
  • The Lite library prunes data based on storage time, size, and priority to ensure that it stays at one percent of the size of the Main library. For example, each piece of data in the Lite library is scored separately on storage time, size, and priority to obtain its storage time score, size score, and priority score: the longer the storage time, the lower the score; the larger the occupied storage space, the lower the score; and the lower the priority, the lower the score. The three scores of each piece of data are then weighted to obtain a comprehensive score, for example with a weight of 30% for the storage time score, 30% for the size score, and 40% for the priority score. Data whose comprehensive score is lower than the threshold is pruned.
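  • The weighted pruning rule can be sketched as below, using the example weights of 30%, 30%, and 40%. How each sub-score is computed is not specified in the text, so the linear scaling against `max_age_days`, `max_size_bytes`, and `max_priority`, as well as the field names and sample items, are assumptions.

```python
WEIGHTS = {"age": 0.3, "size": 0.3, "priority": 0.4}


def comprehensive_score(item, max_age_days, max_size_bytes, max_priority):
    """Weighted score in [0, 1]; older, larger, lower-priority data scores lower."""
    age_score = 1.0 - min(item["age_days"] / max_age_days, 1.0)
    size_score = 1.0 - min(item["size_bytes"] / max_size_bytes, 1.0)
    priority_score = min(item["priority"] / max_priority, 1.0)
    return (
        WEIGHTS["age"] * age_score
        + WEIGHTS["size"] * size_score
        + WEIGHTS["priority"] * priority_score
    )


def prune(items, threshold, **limits):
    """Keep only items whose comprehensive score is not below the threshold."""
    return [it for it in items if comprehensive_score(it, **limits) >= threshold]


# Hypothetical Lite-library entries; a larger priority number means higher priority.
items = [
    {"id": "sms", "age_days": 1, "size_bytes": 200, "priority": 3},
    {"id": "old_pdf", "age_days": 90, "size_bytes": 9000, "priority": 1},
]
limits = {"max_age_days": 100, "max_size_bytes": 10000, "max_priority": 3}

kept = prune(items, threshold=0.5, **limits)
print([it["id"] for it in kept])
```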
  • The query is performed only on the question and answer database generated from the Lite library (that is, the first question and answer database).
  • Only the question and answer database generated from the Lite library needs to be kept in memory, so the memory overhead of the terminal device is greatly reduced.
  • The retrieval time is also greatly reduced, thereby reducing the user request latency.
  • A full search can additionally be performed on the question and answer database built from the Main library to improve the answer hit rate. This both speeds up question and answer responses and increases user satisfaction with the answers.
  • There are multiple terminal devices, and the lightweight databases in the multiple terminal devices are synchronized so that the lightweight database of each terminal device includes the contents of the lightweight databases of the other terminal devices. This expands the scope of question and answer search and enables cross-device search of high-quality content without the data leaving the devices; at the same time, cross-device multi-source question and answer information can be obtained on the device side with a small performance overhead.
  • The multiple terminal devices are multiple terminal devices configured by the user, or multiple terminal devices under the same account, for example, the smartphone, laptop, tablet, and smart wearable devices used by the same user.
  • the lightweight databases in multiple terminal devices are synchronized, making the synchronization process of the lightweight database invisible to the user and not affecting the user experience of the terminal device.
  • a second question and answer database is also deployed in the terminal device.
  • the second question and answer database includes multiple question and answer pairs generated based on the text data in the full database, and/or structured data generated based on the text data in the full database;
  • The terminal-based question and answer method provided by the embodiments of the present application also includes: if the answer corresponding to the query text is not retrieved from the first question and answer database of the terminal device, retrieving, based on the query intention, the answer corresponding to the query text from the second question and answer database of the terminal device. In other words, when a satisfactory result cannot be retrieved from the first question and answer database, the search continues in the second question and answer database to achieve full, in-depth retrieval and meet the user's question and answer needs.
  • the answers retrieved from the second question and answer database are updated to the first question and answer database to further improve the content of the first question and answer database and increase the hit rate of the first question and answer database.
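  • The two-tier lookup with write-back can be sketched as below. Exact-match dictionaries stand in for the real retrieval over the first and second question and answer databases; the function name `answer_query` and the sample entries are hypothetical.

```python
def answer_query(query, first_db, second_db):
    """Look in the first database; on a miss, fall back to the second and write back."""
    answer = first_db.get(query)
    if answer is not None:
        return answer, "first"
    answer = second_db.get(query)
    if answer is not None:
        # Update the first database so the next identical query hits it directly.
        first_db[query] = answer
        return answer, "second"
    return None, "miss"


# Hypothetical contents: the first database is the small one built from the Lite library.
first_db = {"vaccine site": "Community clinic A"}
second_db = {
    "vaccine site": "Community clinic A",
    "pickup code": "1234",
}

print(answer_query("pickup code", first_db, second_db))
print(answer_query("pickup code", first_db, second_db))
```

  • The second call is served from the first database, which illustrates how the write-back raises its hit rate over time.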
  • Figure 9 shows cross-end data update synchronization and search when there are multiple terminal devices.
  • the Lite libraries of different terminal devices such as the requesting device, search device 1, and search device 2 in Figure 9
  • In a quick search, the requesting device queries only its own Lite library, that is, only its own first question and answer database, to match answers to the query text.
  • the requesting device will simultaneously search the Main library of its own device and the Main library of the search device (such as Search Device 1 and Search Device 2 in Figure 9), and synchronize the search results to the requesting device.
  • This data synchronization process runs during the idle period of the device and is invisible to the user.
  • the end-side Q&A system can realize cross-end all-device full content search; when other devices are not online, through the pre-synchronized Lite library, cross-end all-device high-quality content search can be realized.
  • the implementation plan for generating the question and answer database is introduced.
  • Figure 10 shows the generation process of the question and answer database.
  • the Lite library is used as the data source of the question and answer database.
  • the Lite library includes multiple pieces of text data, and each piece of text data is separately used to generate unstructured question and answer pairs and extract structured data.
  • Unstructured question and answer pairs constitute an unstructured database
  • structured data constitute a structured database.
  • Generating the unstructured database includes question and answer pair data generation, which uses deep semantic technology to generate high-quality question and answer pairs from end-side data.
  • the question and answer pair data generation mainly includes the question and answer generation stage, the question and answer determination stage, and the long/short answer extraction stage.
  • the implementation plan of question and answer generation is as follows: input a piece of text in the Lite library into the question and answer generation model to obtain several questions corresponding to the text. This text is combined with several questions output by the question and answer generation model to form several question and answer pairs.
  • The training phase includes the following. First, a crawler tool is used to collect a large number of high-quality Chinese question and answer pairs from the existing network as the training corpus, and data preprocessing is performed.
  • The preprocessing includes: verifying that the question and answer pairs are in Chinese, converting traditional Chinese in the question and answer pairs into simplified Chinese, replacing consecutive characters that are neither letters nor digits with periods, handling extra spaces and step numbers, and filtering out junk words that start with strings in the junk word list.
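  • Part of this preprocessing can be sketched with regular expressions. The sketch omits traditional-to-simplified conversion, which needs a conversion table or a dedicated library. The 50% Chinese-character ratio, the step-number pattern, the junk prefixes, and the choice to drop a whole pair that starts with a junk prefix are all assumptions, and the function names are hypothetical.

```python
import re

CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")
JUNK_PREFIXES = ("广告", "点击")  # hypothetical junk word list


def is_chinese(text, min_ratio=0.5):
    """Treat text as Chinese if enough of its non-space characters are CJK."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return False
    cjk = sum(1 for c in chars if CJK_CHAR.match(c))
    return cjk / len(chars) >= min_ratio


def clean_text(text):
    """Collapse spaces, strip a leading step number, merge punctuation runs."""
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"^\d+[.、)]\s*", "", text)   # leading step number such as "1. "
    text = re.sub(r"[^\w\s]{2,}", "。", text)    # run of non-letter, non-digit symbols
    return text


def preprocess(pairs):
    """Clean each (question, answer) pair and drop non-Chinese or junk pairs."""
    kept = []
    for question, answer in pairs:
        question, answer = clean_text(question), clean_text(answer)
        if not (is_chinese(question) and is_chinese(answer)):
            continue
        if question.startswith(JUNK_PREFIXES) or answer.startswith(JUNK_PREFIXES):
            continue
        kept.append((question, answer))
    return kept


raw_pairs = [
    ("1. 如何  预约疫苗？？？", "打开应用！！！选择接种点"),
    ("How to book?", "Open the app"),
    ("广告：低价疫苗", "点击这里"),
]
print(preprocess(raw_pairs))
```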
  • the loss value of the model is determined based on the label of the question and the text, and the question and answer generation model is trained by minimizing the loss value until the model converges, that is, the training of the question and answer generation model is completed.
  • The inference stage includes inputting a text from the Lite library into the question and answer generation model, which outputs several questions; the text is combined with each of these questions to form several question and answer pairs.
  • the question and answer generation model can be the mT5 model.
  • This model uses a different task guide (prefix) for each downstream NLP task.
  • In the training stage, the model input is a piece of Chinese text (the answer) and the label is the question corresponding to that text; the input is concatenated with the task guide "question generation:" to form the training corpus, and the mT5 model is fine-tuned on it.
  • In the model inference stage, only the standardized, featurized text needs to be input to generate a question that matches the text.
  • Question and answer determination is implemented as follows: after the question and answer generation stage, a decision model is used for a further relevance judgment in order to reduce the noisy data introduced by the question and answer generation model and improve the quality of the question and answer pairs stored in the database.
  • The input of the model is the question and answer pairs produced in the question and answer generation stage.
  • The output of the model is the matching degree value of the answer to the question. This value can be used to filter out low-quality question and answer pairs whose answer does not address the question.
  • The decision model is an SHP model, which constructs pre-training data based on symmetric hyperlink relationships and further pre-trains BERT; the SHP model is then fine-tuned on a large Chinese question and answer corpus.
  • The adversarial training idea is applied: random small perturbations are applied to the word embedding (word_embedding) parameters, the forward loss and backward gradient are computed after the perturbation, and the adversarial gradients are accumulated on top of the normal gradients.
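  • The perturb-and-accumulate step can be illustrated on a toy model. This sketch swaps in the fast gradient method (FGM), which perturbs along the gradient direction rather than randomly, and it uses a single embedding vector with a fixed linear classifier instead of the SHP/BERT model; all names, the epsilon, and the learning rate are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(embedding, classifier, label):
    """Binary cross-entropy loss and its gradient with respect to the embedding."""
    p = sigmoid(sum(e * w for e, w in zip(embedding, classifier)))
    loss = -(label * math.log(p) + (1 - label) * math.log(1 - p))
    grad = [(p - label) * w for w in classifier]
    return loss, grad


def adversarial_step(embedding, classifier, label, epsilon=0.1, lr=0.5):
    """One update that accumulates the clean gradient and the adversarial gradient."""
    clean_loss, clean_grad = loss_and_grad(embedding, classifier, label)
    norm = math.sqrt(sum(g * g for g in clean_grad))
    if norm == 0:
        perturbed = list(embedding)
    else:
        # FGM: small step of size epsilon in the direction that increases the loss.
        perturbed = [e + epsilon * g / norm for e, g in zip(embedding, clean_grad)]
    adv_loss, adv_grad = loss_and_grad(perturbed, classifier, label)
    # Accumulate the adversarial gradient on top of the normal gradient,
    # then update the original (unperturbed) embedding.
    total = [g + a for g, a in zip(clean_grad, adv_grad)]
    updated = [e - lr * t for e, t in zip(embedding, total)]
    return updated, clean_loss, adv_loss


embedding = [0.0, 0.0]
classifier = [1.0, -1.0]
updated, clean_loss, adv_loss = adversarial_step(embedding, classifier, label=1)
new_loss, _ = loss_and_grad(updated, classifier, 1)
print(clean_loss, adv_loss, new_loss)
```

  • In the example the perturbed input has a higher loss than the clean one, and the accumulated update lowers the loss on the clean input.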
  • Long/short answer extraction is implemented as follows: using MRC technology, a deep semantic model (the long/short answer extraction model) makes a span prediction on the original paragraph and extracts a core sentence of a certain length from it as the short answer displayed in the UI; the original paragraph serves as the long answer.
  • The backbone of the long/short answer extraction model is the ELECTRA model, which adds an RTD (Replaced Token Detection) process to BERT's MLM (Masked Language Model) training idea.
  • the model consists of two parts. The first part is the generator, which replaces some words in the sentence. The second part is the discriminator. The discriminator is used to determine whether each word in a sentence has been replaced. The training process will predict all words, which is more efficient than BERT.
  • the overall loss function of the model is defined as the joint loss of the MLM task and the RTD task.
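  • One way to write such a joint loss is the weighted sum used in the published ELECTRA formulation. The text states only that the two losses are combined, so the weighted-sum form, the weight $\lambda$, and the symbols $\theta_G$ (generator parameters) and $\theta_D$ (discriminator parameters) are assumptions here.

```latex
\mathcal{L}(x;\theta_G,\theta_D)
  = \mathcal{L}_{\mathrm{MLM}}(x;\theta_G)
  + \lambda\,\mathcal{L}_{\mathrm{RTD}}(x;\theta_D)
```

  • Here $\mathcal{L}_{\mathrm{MLM}}$ is the generator's masked-token prediction loss and $\mathcal{L}_{\mathrm{RTD}}$ is the discriminator's per-token replaced-or-original classification loss.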
  • Generating the structured database includes using the slot dictionary to identify the slots in each piece of text data in the Lite library and then filling the identified slots with slot values. The slots and their corresponding slot values form a piece of structured data; multiple pieces of structured data constitute the structured database, and an inverted index is then built for each piece of structured data to facilitate subsequent retrieval from the structured database.
  • Embodiments of the present application also disclose a terminal interaction method, which includes: receiving query content input in a search box; responding to the search request, searching for answers corresponding to the query content in the terminal device; and displaying query results based on the query content.
  • the displayed content includes the searched answers.
  • Searching for answers corresponding to the query content in the terminal device can be implemented based on the terminal-based question and answer method provided above in the embodiments of the present application; for the specific solution, refer to the description above, which is not repeated here.
  • the answer includes a long answer and a short answer.
  • the long answer is a more detailed answer to the query question, and the short answer is a more concise answer to the query question.
  • the content displayed in the query results also includes questions corresponding to the query content.
  • the content displayed in the query results also includes the data source of the answer, so that the user can know the source of the search answer.
  • Figure 11 shows a UI interface for short answers to daily information.
  • Label 1 is the query entered by the user. After the end-side question and answer retrieval, a question and answer card is displayed.
  • The question and answer card contains four items: label 2 is the question of the question and answer pair in the unstructured database; label 3 is the short answer; label 4 is the long answer; label 5 is the data source of the answer.
  • When the answer is structured data, it includes several pieces of structured data found in the terminal device for the query content.
  • The content displayed in the query results also includes an information source control corresponding to each of the several pieces of structured data; the terminal interaction method also includes: when it is detected that an information source control is triggered, displaying the details of the structured data corresponding to that information source control.
  • the content displayed in the query results also includes the question text corresponding to the query content, and the slot keywords in the question text are displayed in different colors.
  • the structured data is displayed in tabular form.
  • Figure 12 shows a UI interface for daily information aggregation.
  • label 1 is the query entered by the user.
  • a question and answer card will be displayed.
  • the question-and-answer card contains three items: label 2 is the query entered by the user, with the identified slot keywords displayed in a different color; label 3 is the activated slots; label 4 is the slot values corresponding to the activated slots, displayed in the form of a table.
  • the Q&A card only displays part of the slots and slot values. If the user wants to know more information, click the button to view the information source, and the corresponding details will be displayed dynamically. As shown in Figure 13, after the user clicks button 2, the more detailed content of serial number 2 is displayed to the user.
  • query results are displayed in the form of cards.
  • the search box is located on the minus-one screen of the terminal device (the page to the left of the home screen), or at a search entrance in an application installed on the terminal device.
  • for example, the search box can be located on the minus-one screen of the terminal device and at the search entrance of an application installed on the terminal (such as the search entrance of the built-in browser or of the smart voice assistant).
  • the embodiment of the present application also provides a terminal-based question and answer device 1500.
  • the terminal-based question and answer device 1500 includes units or modules for implementing each step of the terminal-based question and answer method shown in Figures 3-10.
  • Figure 15 is a schematic structural diagram of a terminal-based question and answer device provided by an embodiment of the present application. This device is applied to terminal equipment. As shown in Figure 15, the terminal-based question and answer device 1500 at least includes:
  • Acquisition module 1501, used to acquire query text;
  • Determining module 1502, used to determine query intent based on the query text;
  • the retrieval module 1503 is configured to retrieve the answer corresponding to the query text from a first question and answer database based on the query intention; the first question and answer database is obtained based on the data stored in the terminal device.
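  • the determining module 1502 is specifically used to: take the query text as the input of a semantic model and output semantic text; perform word segmentation on the semantic text to obtain a plurality of word segments; and take the plurality of word segments as the input of the intention classification model and output the category of the query intention.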
  • the category of the query intention includes a first category of query intention and a second category of query intention.
  • the first question and answer database includes an unstructured database and a structured database, and the unstructured database includes multiple pairs of question and answer pairs;
  • the retrieval module 1503 is specifically used for:
  • if the query intent category is the first category of query intent, retrieve the question and answer pairs corresponding to the query text from the unstructured database;
  • if the query intent category is the second category of query intent, retrieve the answer corresponding to the query text from the structured database.
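  • retrieving the question and answer pairs corresponding to the query text from the unstructured database includes: determining a query language corresponding to the query text based on the semantic text and the plurality of word segments; and retrieving the question and answer pairs corresponding to the query text from the unstructured database based on the query language.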
  • the unstructured database includes an inverted index and a vector index
  • retrieving the question and answer pairs corresponding to the query text from the unstructured database based on the query language includes:
  • based on the query language, retrieving from the unstructured database using the inverted index and the vector index respectively, to obtain a candidate answer set corresponding to the query text; the candidate answer set includes a number of first answers retrieved using the inverted index and a number of second answers retrieved using the vector index;
  • selecting a target answer corresponding to the query text from the candidate answer set;
  • the selection of the target answer is related to one or more of: the preset strategy, inverted search score, vector search score, question matching degree, question answer matching degree, answer text quality and answer timeliness;
  • the question matching degree represents the matching degree between the query text and the questions in the question and answer pair
  • the question answer matching degree represents the matching degree between the query text and the answers in the question and answer pair.
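  • retrieving the answer corresponding to the query text from the structured database includes: identifying several slots corresponding to the query text; taking the query text and the several slots as the input of a query language generation model and outputting the query language corresponding to the query text; and retrieving the answer corresponding to the query text from the structured database based on the query language.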
  • the answer in the question-answer pair includes a long answer and a short answer
  • the content of the long answer includes the content of the short answer
  • the determining module 1502 is further configured to determine that the query text has a question-and-answer intention.
  • a lightweight database and a full database are deployed in the terminal device, wherein the full database includes text data extracted from data in multiple formats stored in the terminal device;
  • the lightweight database includes text data that conforms to preset rules in the full database;
  • the data volume of the lightweight database is less than or equal to one percent of the data volume of the full database;
  • the first question and answer database includes a plurality of question and answer pairs generated based on the text data in the lightweight database, and/or structured data generated based on the text data in the lightweight database.
  • there are multiple terminal devices, and the lightweight databases in the multiple terminal devices are synchronized so that the lightweight database of each terminal device includes the contents of the lightweight databases of the other terminal devices.
  • a second question and answer database is also deployed in the terminal device; the second question and answer database includes a plurality of question and answer pairs generated based on text data in the full database, and/or structured data generated based on text data in the full database;
  • the retrieval module 1503 is also used to: if no answer corresponding to the query text is retrieved from the first question and answer database of the terminal device, retrieve the answer corresponding to the query text from the second question and answer database of the terminal device based on the query intention.
  • the retrieval module 1503 is also used to: if no answer corresponding to the query text is retrieved from the first question and answer database of the terminal device, retrieve the answer corresponding to the query text from the second question and answer databases of multiple terminal devices based on the query intention.
  • the device 1500 further includes an update module 1504, configured to update the answer retrieved from the second question and answer database to the first question and answer database.
  • the terminal-based question and answer device 1500 may correspond to performing the method described in the embodiments of the present application, and the above and other operations and/or functions of each module in the device 1500 are intended to implement the corresponding processes of the methods in Figures 3-10, respectively; for brevity, they are not repeated here.
  • An embodiment of the present application also provides a terminal device, including at least one processor, a memory, and a communication interface.
  • the processor is configured to execute the method described in Figures 3-14.
  • Figure 16 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • the terminal device 1600 includes at least one processor 1601, a memory 1602 and a communication interface 1603.
  • the processor 1601, the memory 1602 and the communication interface 1603 are communicatively connected, and the communication connection can be implemented in a wired manner (such as a bus) or in a wireless manner.
  • the communication interface 1603 is used to receive data sent by other devices; the memory 1602 stores computer instructions, and the processor 1601 executes the computer instructions to perform the method in the foregoing method embodiment.
  • the processor 1601 can be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor can be a microprocessor or any conventional processor, etc.
  • the memory 1602 may include read-only memory and random access memory and provides instructions and data to the processor 1601 .
  • Memory 1602 may also include non-volatile random access memory.
  • the memory 1602 may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • non-volatile memory can be read-only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory.
  • Volatile memory can be random access memory (RAM), which is used as an external cache. Many forms of RAM are available, for example:
  • static random access memory (static RAM, SRAM)
  • dynamic random access memory (DRAM)
  • synchronous dynamic random access memory (synchronous DRAM, SDRAM)
  • double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM)
  • enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM)
  • synchronous link dynamic random access memory (synchlink DRAM, SLDRAM)
  • direct rambus random access memory (direct rambus RAM, DR RAM)
  • terminal device 1600 can implement the method shown in Figures 3-14 in the embodiment of the present application. Please refer to the above for a detailed description of the implementation of the method. For the sake of brevity, the details will not be described again.
  • Embodiments of the present application provide a computer-readable storage medium on which a computer program is stored. When the computer instructions are executed by a processor, the above-mentioned method is implemented.
  • Embodiments of the present application provide a chip, which includes at least one processor and an interface; the at least one processor determines program instructions or data through the interface, and is used to execute the program instructions to implement the methods mentioned above.
  • Embodiments of the present application provide a computer program or computer program product that includes instructions that, when executed, cause the computer to perform the above-mentioned method.
  • random access memory (RAM), read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, removable disks, CD-ROMs, or any other form of storage medium known in the technical field.

Abstract

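A terminal-based question answering method and device, applied to a terminal device. The method includes: obtaining query text; determining, based on the query text, the query category corresponding to the query text; when the query category is a first query category, retrieving the answer corresponding to the query text from a first unstructured database; and when the query category is a second query category, retrieving the answer corresponding to the query text from a first structured database. The first unstructured database includes multiple question-answer pairs, the first structured database includes multiple items of structured data, and both the question-answer pairs and the structured data are derived from data stored in the terminal device. By identifying the user's query category and performing the query steps that correspond to that category, the method matches the user with appropriate query results and thereby implements on-device question answering.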

Description

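A terminal-based question answering method and device
Technical Field
The present application relates to the field of information processing technology, and in particular to a terminal-based question answering method and device.
Background
The amount of information on the device side is currently exploding, and that information is private, so the demand for on-device search is increasingly prominent. For example, a user has received a vaccination SMS but has forgotten which message on which day it was, and hopes to obtain the answer by querying directly on the device; or a user needs to look up all tracking numbers from a certain courier over the past week, but current on-device search cannot filter by a specific time range and explicitly extract a list of tracking numbers.
Summary
Embodiments of the present application provide a terminal-based question answering method and device which, by understanding the user's query intent and matching query results based on that intent, implement on-device question answering.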
第一方面,本申请提供了一种基于终端的问答方法,应用于终端设备,该方法包括获取查询文本;基于查询文本,确定查询意图;基于查询意图,从第一问答数据库中检索查询文本对应的答案;其中,第一问答数据库基于终端设备中存储的数据得到。
在一个可能的实现中,上述基于查询文本,确定查询意图,包括:将查询文本作为语义模型的输入,输出语义文本;将语义文本进行分词处理,得到若干分词;将若干分词作为意图分类模型的输入,输出查询意图的类别,查询意图的类别包括第一类别查询意图和第二类别查询意图。
可以理解的是,查询意图的类别也可称之为查询类别,查询类别的含义可以理解为:按照查询文本对应的答案的类型进行分类,例如查询文本对应的答案类型为长短答案形式的类型,则可以将查询文本对应的查询类别归为一个类别,查询文本对应的答案类型为结构化数据的类型,则可以将查询文本对应的查询类别归为另一类别。
在该可能的实现中,通过深度语义技术实现对用户的查询意图的理解,进而根据查询意图实现精准检索。
在另一个可能的实现中,第一问答数据库包括非结构化数据库和结构化数据库,非结构化数据库包括多对问答对;基于所述查询意图,从问答数据库中检索查询文本对应的答案,包括:当查询意图类别为第一类别查询意图时,从非结构化数据库中检索查询文本对应的问答对;当查询意图类别为第二类别查询意图时,从结构化数据库中检索查询文本对应的答案。
在该可能的实现中,终端中部署非结构化数据库和结构化数据库,非结构化数据库包括多对问答对,结构化数据库包括多项结构化数据,当查询意图识别为第一类别查询意图时,从非结构化数据库中为查询文本匹配问答对,当查询意图识别为第二类别查询意图时,则从结构化数据库中匹配相应的结构化数据,精准的回答用户的查询问题。也就是说,针对不同的查询意图设置了不同的答案检索流程,使检索到的答案更加准确。
可选的,问答对中的答案包括长答案和短答案,长答案的内容包括短答案的内容。
容易理解的是,长答案包括了比短答案更多的内容,更详细的展示了查询文本的查询结果,将查询结果的所有相关信息进行展示,使用户得到更多信息,而短答案包括的内容比长答案更少,更为简洁的展示了查询文本的查询结果,只为用户展示查询结果中的关键信息,使用户更为快速的得到想要查询的信息,将查询结果以长短答案的形式展示给用户,兼具两者的优势,给用户更好的体验。
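In another possible implementation, the unstructured database includes an inverted index structure, and retrieving the question-answer pair corresponding to the query text from the unstructured database includes: determining the query language corresponding to the query text based on the semantic text and the word segments; and retrieving the question-answer pair corresponding to the query text from the unstructured database based on the query language and the inverted index structure.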
在另一个可能的实现中,非结构化数据库中包括倒排索引结构和向量索引结构;从非结构化数据库中检索查询文本对应的问答对,包括:分别利用倒排索引结构和向量索引结构从非结构化数据库中检索,以得到查询文本对应的候选答案集;候选答案集包括利用倒排索引结构检索到的若干第一答案和利用向量索引结构检索到的若干第二答案;从候选答案集中选择出查询文本对应的目标答案。
若干可以理解为一个或多个的意思,也就是说,候选答案集中可以包括一个第一答案和一个第二答案,也可以包括一个第一答案和多个第二答案,也可以包括多个第一答案和一个第二答案,也可以包括多个第一答案和多个第二答案。
在该可能的实现中,通过从倒排索引和向量索引两方面进行索引库的构建,支撑端侧问答的多路召回,使得检索的答案更加全面。
在另一个可能的实现中,目标答案的选择与预设策略、倒排检索分数、向量检索分数、问题匹配度、问题答案匹配度、答案文本质量和答案时效性中的一种或多种相关;其中,问题匹配度表征查询文本与问答对中的问题的匹配程度;问题答案匹配度表征查询文本与问答对中的答案的匹配程度。目标答案的确定考虑了多方面的因素,进一步实现了端侧问答的高准确率。
在另一个可能的实现中,从结构化数据库中检索出查询信息对应的答案,包括:识别出查询文本对应的若干槽位;将查询文本和若干槽位作为查询语言生成模型的输入,输出查询文本对应的查询语言;基于查询语言,从结构化数据库中检索查询文本对应的答案。
在另一个可能的实现中,将查询文本作为语义模型的输入,输出语义文本之前包括:确定查询文本具有问答意图。
在另一个可能的实现中,终端设备中部署轻量数据库和全量数据库,其中,全量数据库包括对终端设备中存储的多种格式的数据中进行提取得到的文本数据;轻量数据库包括全量数据库中符合预设规则的文本数据;轻量数据库的数据量小于或等于全量数据库的数据量的百分之一;第一问答数据库包括基于轻量数据库中的文本数据生成的多条问答对,和/或基于轻量数据库中的文本数据生成的结构化数据。
在该可能的实现中,通过在全量数据库的基础上构建轻量数据库,在轻量数据库的基础上构建第一问答数据库,在线搜索时,仅需将第一问答数据库存入内存,极大降低了对终端设备的内存开销,由于第一问答数据库的数据量的减少,也大大减少了检索时间,进而减少用户请求时延。
在另一个可能的实现中,终端设备包括多个,多个终端设备中的轻量数据库进行同步, 以使每个终端设备的轻量数据库包括其他终端设备的轻量数据库中的内容,增加了问答搜索的范围,在不出端的情况下,实现跨端的优质内容搜索。
在另一个可能的实现中,当终端设备处于空闲状态时,多个终端设备中的轻量数据库进行同步,使得轻量数据库的同步过程用户无感知,不影响用户对终端设备的使用体验。
在另一个可能的实现中,终端设备中还部署第二问答数据库,第二问答数据库包括基于全量数据库中的文本数据生成的多条问答对,和/或基于全量数据库中的文本数据生成的结构化数据;该方法还包括:从终端设备的第一问答数据库中未检索到查询文本对应的答案,则基于查询意图,从终端设备的第二问答数据库中检索查询文本对应的答案。
在该可能的实现中,当从第一问答数据库中检索不到理想的问答结果时,从第二问答数据库中继续进行检索,实现全量深度检索,满足用户的问答需求。
在另一个示例中,第二问答数据库中的多条问答对可以形成非结构化数据库,第二问答数据库中的结构化数据可以形成结构化数据库。
在另一个可能的实现中,基于终端的问答方法还包括:从终端设备的第一问答数据库中未检索到查询文本对应的答案,则基于查询意图从多个终端设备的第二问答数据库中检索查询文本对应的答案。
在另一个可能的实现中,基于终端的问答方法还包括,将从第二问答数据库中检索到的答案更新至第一问答数据库中,进一步完善第一问答数据库的内容,增加第一问答数据库的命中率。
第二方面,本申请提供了一种终端的交互方法,应用于终端设备,该方法包括:接收搜索框输入的查询内容;响应搜索请求,在终端设备内搜索查询内容对应的答案;针对查询内容,进行查询结果显示,查询结果显示的内容包括搜索到的答案。
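In one possible implementation, when the answer corresponding to the query content is a first-category answer, the query result is displayed in the form of long and short answers; when the answer corresponding to the query content is a second-category answer, the query result is displayed in the form of a table. Answer categories can be divided according to the format of the answer: for example, an answer consisting of long and short answers falls into the first category, and an answer consisting of structured data falls into the second category.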
在一个可能的实现中,答案包括长答案和短答案,长答案的内容包括短答案的内容。
在另一个可能的实现中,查询结果显示的内容还包括查询内容对应的问题。
在另一个可能的实现中,查询结果显示的内容还包括答案的数据来源,以使用户得知搜索答案的来源。
在另一个可能的实现中,答案为结构化数据,结构化数据包括针对查询内容在终端设备中搜索到的若干条结构化数据。
在另一个可能的实现中,查询结果显示的内容还包括对应若干条结构化数据中各条结构化数据的信息源控件;终端的交互方法还包括:当检测到信息源控件被触发,则显示信息源控件对应的结构化数据的详细内容。
在另一个可能的实现中,查询结果显示的内容还包括查询内容对应的问题文本,问题文本中的槽位关键字以不同颜色进行显示。
在另一个可能的实现中,结构化数据以表格的形式进行显示。这里的表格可以理解为结构化数据按照行和列的形式进行分布,一行分布一条结构化数据,一列分布结构化数据中的同一类型的数据,在一个示例中,表格还包括表头,位于表格的第一行,指明表格每 一列的内容和含义。需要解释说明的是,本申请实施例中的表格并不局限于一定要具有边框线,只要数据的分布符合表格的形式即可认为是以表格的形式进行显示(可参见图12)。
在另一个可能的实现中,查询结果以卡片的形式进行显示。
在另一个可能的实现中,搜索框位于终端设备的负一屏,或位于终端设备上安装的应用中的搜索入口。
第三方面,本申请提供了一种基于终端的问答装置,包括:获取模块、确定模块和检索模块,其中,获取模块用于获取查询文本;确定模块用于基于所述查询文本确定查询意图;检索模块用于基于所述查询意图,从第一问答数据库中检索所述查询文本对应的答案;所述第一问答数据库基于所述终端设备中存储的数据得到。
在一个可能的实现中,所述确定模块具体用于:将查询文本作为语义模型的输入,输出语义文本;将语义文本进行分词处理,得到若干分词;将若干分词作为意图分类模型的输入,输出查询意图的类别,查询意图的类别包括第一类别查询意图和第二类别查询意图。
在另一个可能的实现中,第一问答数据库包括非结构化数据库和结构化数据库,非结构化数据库包括多对问答对;检索模块具体用于:查询意图类别为第一类别查询意图,则从非结构化数据库中检索查询文本对应的问答对;查询意图类别为第二类别查询意图,则从结构化数据库中检索查询文本对应的答案。
在另一个可能的实现中,从非结构化数据库中检索查询文本对应的所述问答对,包括:基于语义文本和若干分词,确定查询文本对应的查询语言;基于查询语言,从非结构化数据库中检索查询文本对应的问答对。
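In another possible implementation, the unstructured database includes an inverted index structure and a vector index structure; retrieving the question-answer pair corresponding to the query text from the unstructured database based on the query language includes: based on the query language, retrieving from the unstructured database using the inverted index structure and the vector index structure respectively, to obtain a candidate answer set corresponding to the query text, where the candidate answer set includes several first answers retrieved using the inverted index structure and several second answers retrieved using the vector index structure; and selecting the target answer corresponding to the query text from the candidate answer set.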
在另一个可能的实现中,目标答案的选择与所述预设策略、倒排检索分数、向量检索分数、问题匹配度、问题答案匹配度、答案文本质量和答案时效性中的一种或多种相关;其中,问题匹配度表征查询文本与问答对中的问题的匹配程度;问题答案匹配度表征查询文本与问答对中的答案的匹配程度。
在另一个可能的实现中,从结构化数据库中检索出查询信息对应的答案,包括:识别出查询文本对应的若干槽位;将查询文本和所述若干槽位作为查询语言生成模型的输入,输出所述查询文本对应的查询语言;基于查询语言,从结构化数据库中检索查询文本对应的答案。
在另一个可能的实现中,问答对中的答案包括长答案和短答案,长答案的内容包括短答案的内容。
在另一个可能的实现中,确定模块还用于:确定查询文本具有问答意图。
在另一个可能的实现中,终端设备中部署轻量数据库和全量数据库,其中,全量数据库包括对终端设备中存储的多种格式的数据中进行提取得到的文本数据;轻量数据库包括全量数据库中符合预设规则的文本数据;轻量数据库的数据量小于或等于全量数据库的数据量的百分之一;第一问答数据库包括基于轻量数据库中的文本数据生成的多条问答对, 和/或基于轻量数据库中的文本数据生成的结构化数据。
在另一个可能的实现中,终端设备包括多个,多个终端设备中的轻量数据库进行同步,以使每个终端设备的轻量数据库包括其他终端设备的轻量数据库中的内容。
在另一个可能的实现中,当终端设备处于空闲状态时,多个终端设备中的轻量数据库进行同步。
在另一个可能的实现中,终端设备中还部署第二问答数据库,第二问答数据库包括基于全量数据库中的文本数据生成的多条问答对,和/或基于全量数据库中的文本数据生成的结构化数据;检索模块还包括:从终端设备的第一问答数据库中未检索到查询文本对应的答案,则基于查询意图,从终端设备的第二问答数据库中检索查询文本对应的答案。
在另一个可能的实现中,检索模块还用于:从终端设备的第一问答数据库中未检索到查询文本对应的答案,则基于查询意图从多个所述终端设备的第二问答数据库中检索查询文本对应的答案。
在另一个可能的实现中,装置还包括更新模块,用于将从第二问答数据库中检索到的答案更新至第一问答数据库中。
第四方面,本申请提供了一种终端设备,包括存储器和处理器,所述存储器中存储有可执行代码,所述处理器执行所述可执行代码,实现本申请第一方面和/或第二方面提供的方法。
第五方面,本申请提供了一种计算机可读存储介质,其上存储有计算机程序,当所述计算机程序在计算机中执行时,令计算机执行本申请第一方面和/或第二方面提供的方法。
第六方面,本申请提供了一种计算机程序或计算机程序产品,所述计算机程序或计算机程序产品包括指令,当所述指令被执行时,实现本申请第一方面和/或第二方面提供的方法。
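Brief Description of the Drawings
Figure 1 is a schematic diagram of a question-and-answer query performed under the first solution, in which the terminal device cannot find the corresponding answer;
Figure 2 is a schematic structural block diagram of a mobile phone provided by an embodiment of the present application;
Figure 3 is a flow chart of a terminal-based question answering method provided by an embodiment of the present application;
Figure 4 is a schematic diagram of retrieving answers from the unstructured database using the inverted index and the vector index;
Figure 5 is a schematic diagram of retrieving answers from the structured database;
Figure 6 is a schematic diagram of the implementation process of a terminal-based question answering method;
Figure 7 is a schematic diagram of the construction process of the full database and the lightweight database;
Figure 8 is a schematic diagram of the construction process of the lightweight database;
Figure 9 is a schematic diagram of cross-device data update synchronization and search when there are multiple terminal devices;
Figure 10 is a schematic diagram of the generation process of the question-and-answer database;
Figure 11 is a schematic diagram of a UI for short answers to daily information;
Figure 12 is a schematic diagram of a UI for daily information aggregation;
Figure 13 is a schematic diagram of the interface change after the user clicks button 2;
Figure 14 is a schematic diagram of several forms in which the question-and-answer search entrance can be provided;
Figure 15 is a schematic structural diagram of a terminal-based question answering device provided by an embodiment of the present application;
Figure 16 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.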
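Detailed Description
The technical solutions of the present application are described in further detail below with reference to the drawings and embodiments.
The following solutions exist for the problems described above:
The first solution reads SMS text directly into memory, which is a simple storage approach, and at retrieval time performs string matching between the user's query and the stored SMS text. If the search content is very simple, this solution quickly finds an accurate answer. For example, searching for "Zhang San" in SMS returns every message whose sender or recipient is "Zhang San", or whose text contains "Zhang San".
The drawback of this solution is that search quality depends heavily on the keywords the user enters; if the input is relatively complex, the retrieval result is usually empty. Users therefore often have to revise the query repeatedly, and the experience is poor. String matching is simple, but it cannot understand the user's intent well. For example, as shown in Figure 1, the user enters "Where should I get the third vaccine dose?". Although the inbox contains a relevant SMS, the device has no good matching and query mechanism, so it cannot match the corresponding result, let alone display the vaccination location directly.
The second solution first performs a matching query on the device. If that yields no result, the user parameter information obtained on the device, such as time and location, is sent to the cloud together with the user's query, relying on the query capability of a cloud search engine to raise the probability of producing an answer.
The drawback of this solution is that user parameter information is sent to the cloud, so the user's privacy cannot be protected. In addition, this approach can only answer simple everyday questions, such as "Hangzhou epidemic prevention hotline"; it cannot answer questions that depend strongly on on-device information, such as "What are this week's Yunda Express tracking numbers?", because the rich information on the device, such as the gallery, memos, screenshots and documents, cannot be used. It therefore does not achieve on-device query in the true sense.
To solve the problems of the above solutions, embodiments of the present application provide a terminal-based question answering method that understands the user's query intent and matches query results based on that intent, implementing on-device question answering.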
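The terminal-based question answering method provided by the embodiments of the present application can be applied to terminal devices such as mobile phones, tablet computers, wearable devices, in-vehicle terminals, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), and personal digital assistants (PDA); the embodiments of the present application do not limit the specific type of terminal device.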
以终端设备为手机为例。图2示出了一种手机的结构示意框图。参考图2,手机包括,射频(radio frequency,RF)电路210、存储器220、输入单元230、显示单元240、传感器250、音频电路260、无线保真(wireless fidelity,WiFi)模块270、处理器280以及电源290等部件。
本领域技术人员可以理解,图2中示出的手机结构并不构成对手机的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面是结合图2对手机的各个构成部件进行具体的介绍。
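The RF circuit 210 can be used to receive and send signals while sending and receiving information or during a call. In particular, it receives downlink information from a base station and passes it to the processor 280 for processing, and it sends uplink data to the base station. Typically, an RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 210 can communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to global system of mobile communication (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, short messaging service (SMS), and the like.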
存储器220可用于存储软件程序以及模块,处理器280通过运行存储在存储器220的软件程序以及模块,从而执行手机的各种功能应用以及数据处理。存储器220可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据手机的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器220可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件或其他易失性固态存储器件。
输入单元230可用于接收输入的数字或字符信息,以及产生与手机200的用户设置以及功能控制有关的键信号输入。具体地,输入单元230可包括触控面板231以及其他输入设备232。触控面板231,也可称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板231上或在触摸面板231附件的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板231可包括触控检测装置和触控控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再发送给处理器280,并能接收处理器280发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板231。除了触控面板231,输入单元230还可以包括其他输入设备232。具体的,其他输入设备232可以包括但不限于物理键盘、功能键(比如音量控制键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
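The display unit 240 can be used to display information entered by the user or information provided to the user, as well as the various menus of the mobile phone. The display unit 240 may include a display panel 241, which can optionally be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 231 may cover the display panel 241. When the touch panel 231 detects a touch operation on or near it, it passes the operation to the processor 280 to determine the type of touch event, and the processor 280 then provides the corresponding visual output on the display panel 241 according to the type of touch event. Although in Figure 2 the touch panel 231 and the display panel 241 are shown as two separate components implementing the input and output functions of the mobile phone, in some embodiments the touch panel 231 and the display panel 241 can be integrated to implement the input and output functions.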
手机200还可包括至少一种传感器250,比如光传感器、运动传感器以及其他传感器。具体的,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板241的亮度,接近传感器可在手机移动到耳边时,关闭显示面板241和/或背光。作为运动传感器的一种,加速度传感器可检测各个方向 上(一般为三轴)加速度的大小,静止时刻检测出重力的大小及方向,可用于识别手机姿态的应用(比如横竖屏切换、相关游戏和磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于手机还可配置陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路260、扬声器261、传声器262可提供用户与手机之间的音频接口。音频电路260可将接收到的音频数据转换后的电信号,传输到扬声器261,由扬声器261转换为声音信号输出;另一方面,传声器262将收集的声音信号转换为电信号,由音频电路260接收后转换为音频数据,再将音频数据输出处理器280处理后,经RF电路210发送给比如另一部手机,或者将音频数据输出至存储器220以便进一步处理。
WiFi属于短距离无线传输技术,手机通过WiFi模块270可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图2示出了WiFi模块270,但是可以理解的是,其并不属于手机200的必须构成,完全可以根据需要在不改变发明的本质的范围内而省略。
处理器280是手机的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行手机的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器280可包括一个或多个处理单元;优选的,处理器280可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器280中。
手机200还包括给各个部件供电的电源290(比如电池),优选的,电源290可以通过电源管理系统与处理器280逻辑相连,从而通过电源管理系统实现管理充电、放电以及功耗管理等功能。
尽管未示出,手机200还可以包括摄像头。可选的,摄像头在手机200上的位置可以前置也可以后置,本申请实施例对此不作限定。
可选的,手机200可以包括单摄像头、双摄像头或三摄像头等,本申请实施例对此不作限定。
例如,手机200可以包括三个摄像头,其中,一个为主摄像头,一个为广角摄像头,一个为长焦摄像头。
可选的,当手机200包括多个摄像头时,这多个摄像头可以全部前置,或者全部后置,或者一部分前置,另一部分后置,本申请实施例对此不作限定。
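For a clearer understanding of the terminal-based question answering method provided by the embodiments of the present application, the technical elements involved are briefly explained below.
Optical character recognition (OCR) is a technology that analyzes and recognizes image files containing text in order to obtain the text and layout information.
Question-answer generation (QAG): given a context (a paragraph or sentence), a deep semantic model generates a fluent question-answer pair that fits the topic of the context, such that the generated question can be answered by the answer.
Machine reading comprehension (MRC) is a popular research direction in natural language processing (NLP). It uses a machine to understand and analyze the text in a data set and to answer the questions posed, which allows the machine's ability to understand language to be evaluated as fully as possible.
Natural language understanding (NLU) studies the use of computers to simulate human language communication, so that computers can understand and use natural languages such as Chinese and English. It enables natural-language communication between humans and machines and replaces part of human mental work, including looking up information, answering questions, excerpting documents, compiling material, and all other processing of natural-language information.
Query understanding (QU) interprets the query terms entered by the user: it breaks down the intent behind the search terms and quickly locates core words, attribute words, and so on.
Qq matching: the questions (q) in the question-and-answer library are retrieved and matched against the user's query (Q), recalling the top-K q-a pairs whose questions are most similar to the query; the corresponding answers are the candidate answers to return. Recall methods include text recall, vector recall, and so on.
Inverted index: also often called a reverse index, postings file, or inverted file. It is an indexing method used in full-text search to store a mapping from a word to its locations in a document or a set of documents. It is the most commonly used data structure in document retrieval systems, and this data structure can be called an inverted index structure.
Inverted retrieval: retrieval based on an inverted index, generally suited to fast full-text search.
Vector retrieval: the similarity between a given query vector and document (class) vectors is computed, and the documents whose similarity exceeds a threshold (or a predetermined number of documents) are output in descending order of similarity.
Slots and slot values: a slot represents a well-defined attribute of an entity, for example the departure location, destination and departure time in ride hailing; a slot value is the specific value filled into a slot, for example Hangzhou, Shanghai, 20220510.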
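A terminal-based question answering method provided by the embodiments of the present application is described in detail below.
Figure 3 is a flow chart of a terminal-based question answering method provided by an embodiment of the present application. The method can be executed on the terminal device shown in Figure 2. As shown in Figure 3, the method includes at least steps S301 to S303.
In step S301, the query text is obtained.
The user can manually type a natural-language query into the question-and-answer input box of the terminal (for example, a mobile phone) as the query text. For example, the user taps the input box of the question-and-answer interface shown on the phone's touch screen to bring up the virtual keyboard, types the query text "Where should I get the third vaccine dose?", and taps the "OK" button to submit it. In response, the phone obtains the query text and shows it on its display.
In another example, the user can also enter the query text by voice. For example, the user taps the voice input icon on the question-and-answer interface shown on the touch screen and then speaks into the microphone. In response to the user's operation, the phone captures the speech, recognizes its content, and converts it into query text. For the specific way a phone converts speech into text, refer to the prior art; it is not described further here.
It can be understood that the ways of obtaining the query text described above are only examples and do not limit the scope of protection of the embodiments of the present application.
In step S302, the query intent is determined based on the query text.
The query text is input into a semantic model to obtain semantic text. The semantic text is then segmented into words (that is, QU processing) to obtain several word segments that characterize the query intent; these are the core words, attribute words and so on of the semantic text, and they allow the user's query intent to be understood better. The word segments are then input into an intent classification model to obtain the category of the query intent. In one example, the query intent includes a first category and a second category; the first category can also be called the daily-information short-answer intent, and the second category can also be called the daily-information aggregation intent.
In another example, before the query text is input into the semantic model, the method also includes determining whether the query text has question-and-answer intent. If it does, the query text is input into the semantic model for the subsequent query intent recognition; if it does not, the process exits and the subsequent steps are skipped. In other words, before the query intent is determined, the query text first passes through a non-question filter: if the query text has question-and-answer intent, the subsequent steps continue; otherwise the process exits.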
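The order of operations in step S302 (non-question filter, semantic normalization, word segmentation, intent classification) can be sketched as follows. This is a minimal sketch under loud assumptions: the patent uses trained models for each stage, whereas every function here is a hypothetical rule-based stand-in, and the cue-word lists and category names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# All word lists below are hypothetical stand-ins for trained models.
QUESTION_WORDS = {"where", "when", "which", "what", "how", "who"}
AGGREGATION_CUES = {"all", "every", "list", "week", "month"}
STOPWORDS = {"the", "a", "an", "of", "in", "to", "should", "i", "be", "my", "is"}


@dataclass
class IntentResult:
    has_qa_intent: bool
    tokens: List[str] = field(default_factory=list)
    category: Optional[str] = None  # "short_answer" or "aggregation"


def normalize(query: str) -> str:
    """Stand-in for the semantic model: lowercase and drop a trailing '?'."""
    return query.lower().strip().rstrip("?").strip()


def segment(text: str) -> List[str]:
    """Stand-in for QU word segmentation: whitespace split minus stopwords."""
    return [t for t in text.split() if t not in STOPWORDS]


def has_qa_intent(query: str) -> bool:
    """Non-question filter: keep queries that look like questions or requests."""
    tokens = set(normalize(query).split())
    return query.strip().endswith("?") or bool(tokens & (QUESTION_WORDS | AGGREGATION_CUES))


def classify(tokens: List[str]) -> str:
    """Stand-in for the intent classification model."""
    return "aggregation" if AGGREGATION_CUES & set(tokens) else "short_answer"


def determine_intent(query: str) -> IntentResult:
    if not has_qa_intent(query):
        return IntentResult(False)  # exit early, as the filter step describes
    tokens = segment(normalize(query))
    return IntentResult(True, tokens, classify(tokens))


print(determine_intent("Where should I get the third vaccine dose?"))
print(determine_intent("all Yunda pickup codes from the last week"))
print(determine_intent("Zhang San"))
```

The early return mirrors the filter step: a plain keyword such as a contact name never reaches the classifier, so an ordinary search path can handle it instead.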
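In step S303, the answer corresponding to the query text is retrieved from the first question-and-answer database based on the query intent; the first question-and-answer database is derived from data stored in the terminal device.
In one example, the first question-and-answer database includes an unstructured database, which contains multiple question-answer pairs generated from the data stored in the terminal device.
When the query intent is daily-information short answer, the question-answer pair corresponding to the query text is retrieved from the unstructured database. For example, based on the query intent, an index is used to retrieve the question-answer pair corresponding to the query text from the unstructured database.
When the index structure of the unstructured database includes an inverted index, index-based retrieval is implemented as follows: based on the semantic text and word segments obtained in step S302, the query language corresponding to the query text is determined; optionally, this query language can be a domain specific language (DSL). The DSL is then used to retrieve, via the inverted index, several question-answer pairs corresponding to the query text from the unstructured database, and these question-answer pairs form the candidate answer set.
Optionally, the inverted search engine may use Elasticsearch (ES), which uses an inverted index structure and is suited to fast full-text search. When creating an inverted index, ES first splits the body of each document into individual words, called terms or tokens, then creates a sorted list of all distinct terms, and for each term keeps a list of the documents that contain it.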
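The index construction described above (split each document into terms, keep a sorted list of distinct terms, attach a document list to each term) can be sketched in a few lines. This is an illustrative toy in plain Python, not the Elasticsearch API; whitespace tokenization is an assumption that only suits the English sample strings, and the sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each distinct term to the sorted list of ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # assumption: whitespace tokenization
            postings[term].add(doc_id)
    # Sorted term dictionary, each term with its sorted posting list.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def search(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    term_sets = [set(index.get(term, [])) for term in query.lower().split()]
    if not term_sets:
        return []
    return sorted(set.intersection(*term_sets))


# Made-up sample documents standing in for on-device text.
docs = {
    1: "vaccine third dose at community clinic",
    2: "yunda parcel pickup code 1234",
    3: "third parcel from yunda arrived",
}
index = build_inverted_index(docs)
print(search(index, "yunda parcel"))  # [2, 3]
print(search(index, "third"))         # [1, 3]
```

Because lookup cost depends on the posting lists of the query terms rather than on the number of stored documents, this structure suits the fast full-text search the paragraph mentions.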
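When the index structure of the unstructured database includes a vector index, index-based retrieval is implemented as follows: the query text entered by the user is first vectorized to obtain its vector representation; the query vector is then used to retrieve, via the vector index, several question-answer pairs corresponding to the query text from the unstructured database, and these question-answer pairs form the candidate answer set.
Optionally, the vector search engine may use FAISS. In the offline stage of semantic vector recall, the embeddings of the questions/answers are obtained through the semantic model and loaded into the FAISS similarity search library to build an index. In the online stage, the embedding of the query is obtained first, the nearest question/answer embeddings are looked up in FAISS, and the corresponding answers are returned as the recall results.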
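The offline and online stages of vector recall can be illustrated with a self-contained sketch. Two substitutions are made here and should be read as assumptions: a bag-of-words counter stands in for the semantic model's embedding, and an exhaustive cosine-similarity scan stands in for the FAISS index. The second stored answer is made-up sample data.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for the semantic model: a sparse bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def build_vector_index(qa_pairs):
    """Offline stage: embed every stored question and keep it with its answer."""
    return [(embed(question), answer) for question, answer in qa_pairs]


def vector_recall(index, query, k=2):
    """Online stage: embed the query and return the k nearest (answer, score) pairs."""
    query_vec = embed(query)
    scored = sorted(
        ((cosine(query_vec, vec), answer) for vec, answer in index),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [(answer, score) for score, answer in scored[:k]]


qa_pairs = [
    ("community epidemic hotline", "0755-8960****"),
    ("third vaccine dose location", "community clinic (made-up sample)"),
]
vector_index = build_vector_index(qa_pairs)
print(vector_recall(vector_index, "where is the third vaccine dose", k=1))
```

A dense embedding model and an approximate nearest-neighbour index would replace `embed` and the linear scan in practice; the two-stage shape stays the same.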
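In another example, the index structure of the unstructured database includes both an inverted index and a vector index, and index-based retrieval is implemented as shown in Figure 4: the inverted index and the vector index are each used to retrieve from the unstructured database, yielding the candidate answer set for the query text. The candidate answer set includes several first answers retrieved with the inverted index and several second answers retrieved with the vector index. In other words, the unstructured database has two index structures; when the query intent is determined to be daily-information short answer, inverted retrieval and vector retrieval are run in parallel over the unstructured database, supporting multi-channel recall for on-device question answering and making the retrieved answers more comprehensive.
In another example, the candidate answer set consists of the answers whose matching degree exceeds a threshold, or of the top-k answers. For example, after several first answers and second answers have been retrieved from the unstructured database using the inverted index and the vector index respectively, the matching degree between the query text and the answer in each question-answer pair and the matching degree between the query text and the question in each question-answer pair are computed uniformly. The retrieved answers are then sorted by matching degree, and the top-k answers, or the answers whose matching degree exceeds the threshold, form the candidate answer set.
After retrieval, the target answer corresponding to the query text is selected from the candidate answer set. The factors considered in this selection include one or more of: the preset policy, inverted retrieval score, vector retrieval score, question matching degree, question-answer matching degree, answer text quality, and answer timeliness. The question matching degree represents how well the query text matches the question in a question-answer pair; the question-answer matching degree represents how well the query text matches the answer in a question-answer pair.
Answer timeliness can be understood as the time of the data source behind the answer: the older the data source, the worse the timeliness; the more recent the data source, the better the timeliness. Answer text quality can be measured by the quality of the text contained in the answer's data source, which can itself be measured in several ways; for example, the more information the text contains, the better its quality, or the more semantically coherent the text, the better its quality.
For example, the inverted retrieval score, vector retrieval score, question matching degree, question-answer matching degree, answer text quality and answer timeliness of each answer in the candidate answer set are normalized, and the normalized score is used as the score of each answer. The answer ranking is then adjusted by the preset policy, for example boosting answers from high-priority sources and demoting political or pornographic content.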
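The ranking step can be sketched as below. The source only says that the factors are normalized and that a preset policy then intervenes; the min-max normalization, the unweighted average, and the per-source multiplier used as the policy are all assumptions chosen to keep the sketch small, and the candidate values are made up.

```python
FEATURES = (
    "inverted_score", "vector_score", "question_match",
    "answer_match", "text_quality", "timeliness",
)


def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant list maps to all ones."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_candidates(candidates, policy_weight=None):
    """Return (score, answer) pairs sorted best first.

    policy_weight maps a data source to a multiplier: >1 boosts, <1 demotes.
    """
    policy_weight = policy_weight or {}
    normalized = {f: min_max([c[f] for c in candidates]) for f in FEATURES}
    ranked = []
    for i, cand in enumerate(candidates):
        base = sum(normalized[f][i] for f in FEATURES) / len(FEATURES)
        ranked.append((base * policy_weight.get(cand["source"], 1.0), cand["answer"]))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


# Made-up candidates from the two recall channels.
candidates = [
    {"answer": "A", "source": "sms", "inverted_score": 0.9, "vector_score": 0.2,
     "question_match": 0.8, "answer_match": 0.7, "text_quality": 0.6, "timeliness": 0.9},
    {"answer": "B", "source": "screenshot", "inverted_score": 0.4, "vector_score": 0.9,
     "question_match": 0.6, "answer_match": 0.5, "text_quality": 0.9, "timeliness": 0.3},
]
print(rank_candidates(candidates)[0][1])                       # A
print(rank_candidates(candidates, {"screenshot": 3.0})[0][1])  # B
```

Keeping the policy as a separate multiplier means the learned or computed scores stay untouched while manual rules can still reorder the final list.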
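Optionally, the answer in a question-answer pair in the unstructured database includes a long answer and a short answer; the long answer is a more detailed answer to the question, and the short answer is a more concise one. For example, for a question-answer pair in the unstructured database whose question is "Community epidemic prevention hotline?", the long answer is "Bantian Subdistrict community epidemic inquiry hotline, Longgang District: 0755-8960****" and the short answer is "0755-8960****".
In another example, the first question-and-answer database includes a structured database, which contains multiple items of structured data. When the query intent category is daily-information aggregation, the answer corresponding to the query text is retrieved from the structured database.
Specifically, as shown in Figure 5, the query text and a slot dictionary are first used to recognize the slots in the query text, and all recognized slots are recorded. For example, for the query text "all Yunda Express pickup codes from the last week", the slots recognized with the slot dictionary include "time", "courier type" and "pickup code". The recognized slots and the query text are input into the DSL generation model, which generates a DSL that uses the recognized slots as filter attributes; for the example above, the DSL contains filter conditions such as "current time - time < one week", "courier type == Yunda" and "pickup code != None". Based on the generated DSL, the answer corresponding to the query text is retrieved from the structured database using the inverted index; the answer consists of the several items of structured data retrieved from the structured database for the query text.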
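The three filter conditions in the example above can be turned into a small runnable sketch. The slot dictionary, the record fields and the sample records are hypothetical, and the predicates built here merely imitate what a generated DSL would express; the patent's DSL generation model is a trained model, not a lookup table.

```python
from datetime import datetime, timedelta

# Hypothetical slot dictionary: trigger phrase -> (slot name, slot value).
SLOT_DICTIONARY = {
    "last week": ("time", timedelta(days=7)),
    "yunda": ("courier", "Yunda"),
    "pickup code": ("pickup_code", None),
}


def recognize_slots(query):
    """Return the slots whose trigger phrase occurs in the query."""
    q = query.lower()
    return {slot: value for phrase, (slot, value) in SLOT_DICTIONARY.items() if phrase in q}


def build_filters(slots, now):
    """Translate recognized slots into predicates over structured records."""
    filters = []
    if "time" in slots:
        window = slots["time"]
        filters.append(lambda r: now - r["time"] < window)         # current time - time < one week
    if "courier" in slots:
        courier = slots["courier"]
        filters.append(lambda r: r.get("courier") == courier)       # courier type == Yunda
    if "pickup_code" in slots:
        filters.append(lambda r: r.get("pickup_code") is not None)  # pickup code != None
    return filters


def query_structured(records, query, now):
    filters = build_filters(recognize_slots(query), now)
    return [r for r in records if all(f(r) for f in filters)]


now = datetime(2022, 5, 10)
records = [  # made-up structured data
    {"time": datetime(2022, 5, 8), "courier": "Yunda", "pickup_code": "3-1-204"},
    {"time": datetime(2022, 4, 20), "courier": "Yunda", "pickup_code": "5-2-118"},
    {"time": datetime(2022, 5, 9), "courier": "SF", "pickup_code": "7-4-003"},
    {"time": datetime(2022, 5, 9), "courier": "Yunda", "pickup_code": None},
]
hits = query_structured(records, "all Yunda pickup codes from the last week", now)
print([r["pickup_code"] for r in hits])  # ['3-1-204']
```

Each recognized slot contributes one independent predicate, so a query that mentions fewer slots simply applies fewer filters.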
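In one example, a slot dictionary covering a rich set of scenarios is obtained by mining the structured data in the terminal device, for example from the multiple items of structured data in the structured database.
In another example, the first question-and-answer database includes both an unstructured database, containing multiple question-answer pairs, and a structured database, containing multiple items of structured data. When the query intent is daily-information short answer, the unstructured database is searched to obtain the candidate answer set; when the query intent is daily-information aggregation, the structured database is searched to obtain the several items of structured data corresponding to the query text. The candidate answer set is filtered and ranked by the preset policy to obtain the final answer, which is shown to the user in the form of a card.
For the schemes for retrieving from the structured database and from the unstructured database, see the description above; for brevity, they are not repeated here.
As an example, referring to Figure 6, after the user enters the query text, the terminal device first applies the non-question filter: if the query text contains question-and-answer intent, the subsequent flow continues; otherwise the process exits. The basic NLU model and QU perform semantic analysis and understanding of the query text to obtain the corresponding semantic text and word segments, which are input into the intent classification model (also called the on-device question-and-answer NLU) to determine whether the query intent is "daily-information short answer" or "daily-information aggregation". If it is "daily-information short answer", the short-answer flow is entered; this flow is mainly based on question-answer pair retrieval and can obtain the long and short answers to the question. If it is "daily-information aggregation", the aggregation flow is entered; this flow is mainly based on numerical reasoning and can obtain multiple precise structured answers.
The several answers retrieved form the candidate answer set. After the candidate answer set passes through the rule policy (also called the preset policy), the answer ranking is adjusted by manually defined policies, for example boosting high-priority sources and demoting political or pornographic content. Finally, the top-ranked answer in the candidate answer set is displayed as a question-and-answer card.
For the generation and ranking of the candidate answer set, see the description above; for brevity, they are not repeated here.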
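In another possible implementation, a lightweight database and a full database are deployed in the terminal device. The full database includes text data extracted from data in multiple formats stored in the terminal device; the lightweight database includes the text data in the full database that conforms to preset rules; the data volume of the lightweight database is less than or equal to one percent of the data volume of the full database; and the first question-and-answer database includes multiple question-answer pairs generated from the text data in the lightweight database, and/or structured data generated from the text data in the lightweight database.
Figure 7 shows the construction process of the full database and the lightweight database. As shown in Figure 7, techniques such as information extraction, document extraction and OCR are used to extract text from the SMS messages, memos, documents, pictures and other information on the device, forming the full database on this device. For example, if a user browsing a web page on a mobile phone finds an important piece of text, the user can keep it by taking a screenshot; later, when the full database is built, OCR can recognize the text in that picture and convert it into text data.
The extracted text data is then screened and filtered. User-defined settings can control whether some or all data sources may be collected, as well as the filtering logic for sensitive words, pornographic words, spam words and low-information content. Finally, all or part of the filtered data is used to build the full database of this device, also called the Main database. For example, the Main database may include all of the filtered text data, or only the text data that meets certain conditions, such as text whose length exceeds a preset value; text that is too short is discarded to ensure that the content of the text data is useful.
As shown in Figure 8, the lightweight database, also called the Lite database, is built on the basis of the Main database. The storage size of the Lite database is generally 1% of that of the Main database, and its data comes mainly from three parts. The first part is high-priority information source data selected from the Main database by high-quality information filtering; the high-priority information sources can be defined by the user, for example SMS or notes. The second part covers query texts whose online requests returned no result: the on-device question answering system performs a full deep search offline, and if an answer is returned, the question and the returned answer are saved in the Lite database, or in the first question-and-answer database, so that the correct answer can be returned on the next request. The third part is information synchronized from the Lite databases of other devices through cross-device data synchronization, so that a single device can access the high-quality data on the other devices.
At the same time, to keep the Lite database lightweight, its data is pruned according to storage time, size and priority so that it stays at one percent of the Main database. For example, each item in the Lite database is scored on storage time, size and priority: the longer an item has been stored, the lower its score; the more storage space it occupies, the lower its score; and the lower its priority, the lower its score. The three scores of each item are then weighted to obtain a composite score, for example with a weight of 30% for the storage-time score, 30% for the size score and 40% for the priority score, and items whose composite score is below a threshold are pruned.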
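The weighted pruning rule can be written out directly. The 30/30/40 weights come from the example in the text; how each individual score is computed is not specified there, so the linear scaling against the oldest and largest item, the priority range of 0 to 1, and the threshold of 0.5 are assumptions made for this sketch, as are the sample entries.

```python
def prune_lite(entries, threshold=0.5, weights=(0.3, 0.3, 0.4)):
    """Keep entries whose weighted composite score reaches the threshold.

    Each entry has age_days, size_bytes and a priority in [0, 1].
    Older and larger entries score lower; higher priority scores higher.
    """
    w_age, w_size, w_priority = weights
    max_age = max(e["age_days"] for e in entries) or 1
    max_size = max(e["size_bytes"] for e in entries) or 1
    kept = []
    for entry in entries:
        age_score = 1 - entry["age_days"] / max_age
        size_score = 1 - entry["size_bytes"] / max_size
        composite = w_age * age_score + w_size * size_score + w_priority * entry["priority"]
        if composite >= threshold:
            kept.append(entry)
    return kept


entries = [  # made-up Lite database items
    {"id": "a", "age_days": 1, "size_bytes": 100, "priority": 1.0},
    {"id": "b", "age_days": 100, "size_bytes": 1000, "priority": 0.2},
    {"id": "c", "age_days": 50, "size_bytes": 500, "priority": 0.9},
]
print([e["id"] for e in prune_lite(entries)])  # ['a', 'c']
```

Raising the threshold, or running the pruning until a size budget is met, would be two ways to hold the Lite database near the one-percent target; the text does not say which is used.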
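When the user performs a question-and-answer query, retrieval is carried out only in the question-and-answer database generated from the Lite database (that is, the first question-and-answer database). Only this database needs to be loaded into memory, which greatly reduces the memory overhead on the terminal device; and because the first question-and-answer database holds less data, retrieval time and therefore request latency are also greatly reduced. When no answer can be retrieved from the first question-and-answer database, a full retrieval is carried out in the question-and-answer database built from the Main database to raise the hit rate. This both speeds up question-and-answer responses and increases the proportion of user questions that are satisfied.
In another possible implementation, there are multiple terminal devices, and the lightweight databases in the multiple terminal devices are synchronized so that the lightweight database of each terminal device includes the contents of the lightweight databases of the other terminal devices. This widens the scope of question-and-answer search and enables cross-device search of high-quality content without the data leaving the devices, while cross-device, multi-source question-and-answer information can be obtained on the device at a small performance cost. The multiple terminal devices are devices configured by the user, or devices under the same account, for example the smartphone, notebook computer, tablet computer, smart wearable device and other terminal devices used by the same user.
When the terminal devices are idle, the lightweight databases in the multiple terminal devices are synchronized, so that the synchronization is imperceptible to the user and does not affect the user's experience of the terminal device.
A second question-and-answer database is also deployed in the terminal device. It includes multiple question-answer pairs generated from the text data in the full database, and/or structured data generated from the text data in the full database. The terminal-based question answering method provided by the embodiments of the present application also includes: if no answer corresponding to the query text is retrieved from the first question-and-answer database of the terminal device, retrieving the answer corresponding to the query text from the second question-and-answer database of the terminal device based on the query intent. In other words, when no satisfactory result can be retrieved from the first question-and-answer database, retrieval continues in the second question-and-answer database, achieving a full deep search and meeting the user's question-and-answer needs.
Optionally, the answer retrieved from the second question-and-answer database is written back to the first question-and-answer database, further improving the content of the first question-and-answer database and increasing its hit rate.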
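The fallback and write-back behaviour reads like a two-tier cache and can be sketched as such. Plain dictionaries keyed by the query string stand in for the two question-and-answer databases, which is an assumption for brevity; in the described method each lookup is an intent-based retrieval rather than an exact-key match.

```python
def answer_query(query, first_db, second_db):
    """Look in the small first database, fall back to the full second one.

    Returns (answer, tier), where tier is "first", "second" or "miss".
    An answer found only in the second database is copied into the first.
    """
    answer = first_db.get(query)
    if answer is not None:
        return answer, "first"
    answer = second_db.get(query)
    if answer is not None:
        first_db[query] = answer  # write-back raises the future hit rate
        return answer, "second"
    return None, "miss"


first_db = {}
second_db = {"community epidemic hotline": "0755-8960****"}
print(answer_query("community epidemic hotline", first_db, second_db))  # served by the second tier
print(answer_query("community epidemic hotline", first_db, second_db))  # now served by the first tier
```

The design trades a slower first request for faster repeats, while the memory-resident tier stays small.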
图9示出了当终端设备具有多个时,跨端数据更新同步和搜索。如图9所示,离线数据同步时,不同终端设备(例如图9中的,请求设备、搜索设备1和搜索设备2)的Lite库进行数据同步更新,一般是在夜晚充电或设备建立且电量满足90%时。一般快速搜索为请求设备只查询本设备的Lite库,或者只查询本设备的第一问答数据库,也就是说,只在本设备的第一问答数据库中进行检索,为查询文本匹配答案。全量深度搜索时,请求设备会同时检索本设备Main库和搜索设备(例如图9中的搜索设备1和搜索设备2)的Main库,并将检索结果同步到请求设备。
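This data synchronization runs while the device is idle and is imperceptible to the user. When the other devices are online, the on-device question answering system can search all content across all devices; when the other devices are offline, the pre-synchronized Lite database still allows the high-quality content of all devices to be searched.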
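The generation of the question-and-answer database is described in detail below.
The generation of the first question-and-answer database is taken as an example.
Figure 10 shows the generation process of the question-and-answer database. As shown in Figure 10, the Lite database serves as the data source. It contains multiple items of text data, and each item goes through both unstructured question-answer pair generation and structured data extraction. The unstructured question-answer pairs form the unstructured database, and the structured data forms the structured database. Generating the unstructured database involves question-answer pair generation, which uses deep semantic technology to produce high-quality question-answer pairs from on-device data. Question-answer pair generation consists mainly of a question generation stage, a question-answer judgment stage, and a long/short answer extraction stage.
Question generation is implemented as follows: a piece of text from the Lite database is input into the question generation model, which outputs several questions for that text. The text is paired with each of the questions output by the model to form several question-answer pairs.
The training stage is as follows. A crawler is first used to collect a large number of high-quality Chinese question-answer pairs from the live network as the training corpus, and the data is preprocessed. Preprocessing includes: checking that a question-answer pair is in Chinese, converting Traditional Chinese in the pair to Simplified Chinese, replacing runs of characters that are neither letters nor digits with a full stop, handling redundant spaces and step numbering, and filtering out junk words that begin with a string from the junk-word list. The preprocessed text is input into the question generation model to be trained, which outputs a corresponding question; the loss of the model is determined from this question and the label of the text, and the model is trained with minimizing the loss as the objective until it converges, which completes the training of the question generation model.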
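Several of the preprocessing steps listed above can be expressed with the standard `re` module, as in the sketch below. The regular expressions, the junk-prefix list and the 50% Chinese-character ratio are assumptions made for illustration; Traditional-to-Simplified conversion is left out because it needs an external conversion table, and redundant whitespace is removed outright on the assumption that the corpus is Chinese.

```python
import re

CJK = re.compile(r"[\u4e00-\u9fff]")
GARBAGE_PREFIXES = ("广告", "转发")  # hypothetical junk-word list


def is_chinese(text, min_ratio=0.5):
    """Treat text as Chinese if enough of its non-space characters are CJK."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return False
    return sum(bool(CJK.match(c)) for c in chars) / len(chars) >= min_ratio


def clean(text):
    """Apply the junk-prefix, step-number, symbol-run and whitespace steps."""
    text = text.strip()
    for prefix in GARBAGE_PREFIXES:
        if text.startswith(prefix):
            text = text[len(prefix):]
    # Drop a leading step number such as "1." or "步骤1:".
    text = re.sub(r"^\s*(?:\d+[.、)]|步骤\d+[:：]?)\s*", "", text)
    # Replace a run of two or more symbols with a single Chinese full stop.
    text = re.sub(r"[^\w\s]{2,}", "。", text)
    # Remove redundant whitespace (Chinese text needs no spaces between words).
    return re.sub(r"\s+", "", text)


sample = "步骤1:打开设置!!!然后  点击  确定"
print(is_chinese(sample), clean(sample))
```

Running the cleaner before training keeps formatting noise out of the generated questions; the exact rules would need tuning against the real corpus.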
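In the inference stage, text from the Lite database is input into the question generation model, which outputs several questions; the text is paired with each of the questions to form several question-answer pairs.
Optionally, the question generation model can be an mT5 model. This model uses different task prompts for different downstream NLP tasks. For the question generation task, the model input is a piece of Chinese text, that is, the answer, and the label is the question corresponding to that text; these are concatenated with the task prompt "question generation:" to form the training corpus on which the mT5 model is fine-tuned. In the inference stage, it is only necessary to input the normalized text to generate a question that matches it.
Question-answer judgment is implemented as follows. After the question generation stage, a judgment model is used for a further relevance check in order to reduce the noisy data introduced by the question generation model and improve the quality of the question-answer pairs in the database. The input of this model is a generated question-answer pair and the output is a value for how well the answer matches the question; this value is used to filter out low-quality pairs in which the answer does not address the question.
Optionally, the judgment model is an SHP model obtained by constructing pre-training data from symmetric hyperlink relations and further pre-training BERT, and the SHP model is fine-tuned on a large Chinese question-and-answer corpus. A contrastive learning method is first used to find the N negative samples closest to each positive sample; within a batch, negative samples are sampled according to the similarity scores of the contrastive learning model, so that each positive sample is paired with several negative samples of varying similarity, which strengthens the classification performance of the model. During training, adversarial training is also used: small random perturbations are applied to the word embedding parameters (word_embedding), the forward loss and backward gradient after perturbation are computed, and the adversarial gradient is accumulated on top of the normal gradient.
Long/short answer extraction is implemented with MRC technology: a deep semantic model (the long/short answer extraction model) predicts a span over the original paragraph and extracts a core sentence of a certain length from it as the short answer shown in the UI, while the original paragraph serves as the long answer. Incorporating MRC significantly improves how the on-device question answering system presents answers: with long and short answers shown together, the user can quickly get the key information of the answer.
Optionally, the backbone of the long/short answer extraction model is the ELECTRA model, which adds an RTD (Replaced Token Detection) process to the MLM (Masked Language Model) training idea of BERT. The model has two parts. The first is the generator, which replaces some of the words in a sentence. The second is the discriminator, which judges whether each word in a sentence has been replaced; training makes predictions over all words, which is more efficient than BERT. The overall loss function of the model is defined as the joint loss of the MLM task and the RTD task. In the fine-tuning stage, a high-quality Chinese question-and-answer corpus containing long and short answers is constructed, sequence labeling is used to output the best long/short answer spans in the original text, and the output answers are further refined by policy to remove noisy fields, improving the overall display of long and short answers.
A vector index and an inverted index are built over the generated question-answer pairs to support later retrieval from the unstructured database.
Generating the structured database involves using the slot dictionary to recognize slots in each item of text data in the Lite database and then filling in the slot values for the recognized slots. A set of slots with their slot values forms one item of structured data, and multiple items of structured data form the structured database. An inverted index is then built over the items of structured data to support later retrieval from the structured database.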
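Slot recognition, slot filling and indexing of the resulting records can be sketched as follows. The regular-expression patterns play the role of the slot dictionary here and are hypothetical, as are the slot names and the sample messages; the source does not say how slot values are extracted.

```python
import re

# Hypothetical slot dictionary: slot name -> pattern that captures its value.
SLOT_PATTERNS = {
    "courier": re.compile(r"\b(Yunda|SF|ZTO)\b"),
    "pickup_code": re.compile(r"pickup code[:\s]+([\w-]+)"),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}


def extract_record(text):
    """Recognize slots in one text item and fill in their values."""
    record = {}
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            record[slot] = match.group(1)
    return record


def build_structured_db(texts):
    """Keep one structured record per text item that yields at least one slot."""
    return [rec for rec in (extract_record(t) for t in texts) if rec]


def index_records(records):
    """Inverted index over records: (slot, value) -> list of row ids."""
    index = {}
    for row_id, record in enumerate(records):
        for slot, value in record.items():
            index.setdefault((slot, value), []).append(row_id)
    return index


texts = [  # made-up messages
    "[Yunda] 2022-05-08 parcel arrived, pickup code: 3-1-204",
    "Lunch at noon?",
]
structured_db = build_structured_db(texts)
slot_index = index_records(structured_db)
print(structured_db)
print(slot_index[("courier", "Yunda")])  # [0]
```

Indexing on (slot, value) pairs lets equality filters such as a courier name be answered by a direct lookup, while range filters such as a time window still need a scan or a sorted structure.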
本申请实施例还公开了一种终端的交互方法,包括:接收搜索框输入的查询内容;响应搜索请求,在终端设备内搜索查询内容对应的答案;针对查询内容,进行查询结果显示,查询结果显示的内容包括搜索到的答案。
可选的,在终端设备内搜索查询内容对应的答案的实现方案,可以基于本申请实施例上文提供的一种基于终端的问答方法实现,具体方案参见上文描述,这里不再赘述。
在一个示例,答案包括长答案和短答案,长答案的为针对查询问题较为详细的回答,短答案为针对查询问题较为简练的回答。
可选的,查询结果显示的内容还包括查询内容对应的问题。
可选的,查询结果显示的内容还包括答案的数据来源,以使用户得知搜索答案的来源。
图11示出了一种针对日常信息简答的UI界面。如图11所示,其中标签1是用户输入的query,经过端侧问答检索后会展示一个问答卡,问答卡包含4个内容:标签2是非结构化数据库中问答对的question;标签3是短答案;标签4是长答案;标签5是答案的数据来源。
在一个示例中,当答案为结构化数据时,结构化数据包括针对查询内容在终端设备中搜索到的若干条结构化数据。
在另一个可能的实现中,查询结果显示的内容还包括对应若干条结构化数据中各条结构化数据的信息源控件;终端的交互方法还包括:当检测到信息源控件被触发,则显示信息源控件对应的结构化数据的详细内容。
在另一个可能的实现中,查询结果显示的内容还包括查询内容对应的问题文本,问题文本中的槽位关键字以不同颜色进行显示。
在另一个可能的实现中,结构化数据以表格的形式进行显示。
图12示出了针对日常信息聚合的UI界面。如图12所示,其中标签1是用户输入的 query,经过端侧问答检索后会展示一个问答卡,问答卡包含3个内容:标签2是用户输入的query,同时对识别的槽位关键字进行其他颜色的展示;标签3是激活的槽位;标签4是激活的槽位对应的槽位值,使用表格的形式进行展示。问答卡只展示了部分的槽位和槽位值,如果用户想了解更多信息,则点击查看信息源的按钮,会动态的展示相应的详细内容。如图13所示,用户点击按钮2后,则向用户展示序号2更详细的内容。
可选的,查询结果以卡片的形式进行显示。
在另一个可能的实现中,搜索框位于终端设备的负一屏,或位于终端设备上安装的应用中的搜索入口。例如,参见图14,搜索框可以位于终端设备的负一屏和终端上安装的应用的搜索入口(例如内置浏览器的搜索入口、智能语音助手的搜索入口等)。
与前述一种基于终端的问答方法的实施例基于相同的构思,本申请实施例中还提供了一种基于终端的问答装置1500,该基于终端的问答装置1500包括用以实现图3-10所示的基于终端的问答方法中的各个步骤的单元或模块。
图15为本申请实施例提供的一种基于终端的问答装置的结构示意图。该装置应用于终端设备,如图15所示,该一种基于终端的问答装置1500至少包括:
获取模块1501,用于获取查询文本;
确定模块1502,用于基于所述查询文本,确定查询意图;
检索模块1503,用于基于所述查询意图,从第一问答数据库中检索所述查询文本对应的答案;所述第一问答数据库基于所述终端设备中存储的数据得到。
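In one possible implementation, the determination module 1502 is specifically configured to:
take the query text as input to a semantic model and output a semantic text;
perform word segmentation on the semantic text to obtain several segmented words; and
take the several segmented words as input to an intent classification model and output the category of the query intent, where the categories of query intent include a first-category query intent and a second-category query intent.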
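In another possible implementation, the first question answering database includes an unstructured database and a structured database, and the unstructured database includes multiple question-answer pairs;
the retrieval module 1503 is specifically configured to:
if the query intent category is the first-category query intent, retrieve the question-answer pair corresponding to the query text from the unstructured database; and
if the query intent category is the second-category query intent, retrieve the answer corresponding to the query text from the structured database.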
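The routing performed by the determination and retrieval modules can be sketched as below. The keyword rule is only a stand-in for the intent classification model, and the cue words, the `Intent` names, and the two search callbacks are hypothetical.

```python
from enum import Enum


class Intent(Enum):
    QA = 1           # first-category intent: answered from question-answer pairs
    AGGREGATION = 2  # second-category intent: answered from structured data


# Hypothetical cue words; a trained intent classifier would replace this rule.
AGGREGATION_CUES = {"all", "list", "every"}


def classify(tokens):
    """Stand-in intent classifier over the segmented words."""
    return Intent.AGGREGATION if AGGREGATION_CUES & set(tokens) else Intent.QA


def route(query, qa_search, structured_search):
    """Send the query to the database that matches its intent category."""
    tokens = query.lower().split()
    intent = classify(tokens)
    if intent is Intent.QA:
        return intent, qa_search(tokens)
    return intent, structured_search(tokens)


# Stub back ends so the routing itself can be exercised.
qa_stub = lambda tokens: "answer from unstructured database"
table_stub = lambda tokens: "rows from structured database"

intent_a, result_a = route("when is my vaccination", qa_stub, table_stub)
intent_b, result_b = route("list all my tracking numbers", qa_stub, table_stub)
```

Keeping the classifier separate from the two search callbacks mirrors the split between module 1502 and module 1503.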
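In another possible implementation, retrieving the question-answer pair corresponding to the query text from the unstructured database includes:
determining a query language corresponding to the query text based on the semantic text and the several segmented words; and
retrieving, based on the query language, the question-answer pair corresponding to the query text from the unstructured database.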
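In another possible implementation, the unstructured database includes an inverted index and a vector index;
retrieving, based on the query language, the question-answer pair corresponding to the query text from the unstructured database includes:
searching the unstructured database based on the query language using the inverted index and the vector index separately, to obtain a candidate answer set corresponding to the query text, where the candidate answer set includes several first answers retrieved using the inverted index and several second answers retrieved using the vector index; and
selecting a target answer corresponding to the query text from the candidate answer set.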
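The two-path retrieval can be sketched as follows. This is a toy under stated assumptions: bag-of-words count vectors stand in for learned embeddings, the sample question-answer pairs are invented, and `top_k` is a hypothetical parameter.

```python
import math
from collections import Counter, defaultdict

QA_PAIRS = [
    ("When is my vaccination appointment?", "Your vaccination is on May 3 at 9 am."),
    ("Where do I pick up my parcel?", "Pick it up at locker 12."),
    ("What time is the vaccine shot?", "The shot is scheduled for 9 am."),
]


def tokenize(text):
    return [w.strip("?.,!").lower() for w in text.split()]


# Inverted index: token -> ids of QA pairs whose question contains it.
inverted = defaultdict(set)
for pair_id, (question, _) in enumerate(QA_PAIRS):
    for token in tokenize(question):
        inverted[token].add(pair_id)

# Toy "vector index": one bag-of-words vector per question.
vectors = [Counter(tokenize(question)) for question, _ in QA_PAIRS]


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def candidate_set(query, top_k=2):
    """Return (first answers, second answers, merged candidate set) as id sets."""
    tokens = tokenize(query)
    first = set()
    for token in tokens:
        first |= inverted.get(token, set())
    query_vec = Counter(tokens)
    scored = [(cosine(query_vec, vec), pair_id) for pair_id, vec in enumerate(vectors)]
    ranked = sorted((s for s in scored if s[0] > 0), reverse=True)
    second = {pair_id for _, pair_id in ranked[:top_k]}
    return first, second, first | second


first, second, merged = candidate_set("What time is my vaccination")
```

The merged set is what the selection step then ranks to pick the target answer.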
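In another possible implementation, the selection of the target answer is related to one or more of a preset strategy, an inverted retrieval score, a vector retrieval score, a question matching degree, a question-answer matching degree, answer text quality, and answer timeliness;
where the question matching degree represents the degree of match between the query text and the question in the question-answer pair, and the question-answer matching degree represents the degree of match between the query text and the answer in the question-answer pair.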
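One simple way to combine these factors is a weighted sum, sketched below. The application does not say how the factors are combined, so the linear form, the weights, and the feature values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer: str
    inverted_score: float   # inverted retrieval score
    vector_score: float     # vector retrieval score
    question_match: float   # query vs. question of the QA pair
    answer_match: float     # query vs. answer of the QA pair
    quality: float          # answer text quality
    freshness: float        # answer timeliness


# Hypothetical weights; a real system would tune or learn them.
WEIGHTS = {
    "inverted_score": 0.20,
    "vector_score": 0.20,
    "question_match": 0.25,
    "answer_match": 0.15,
    "quality": 0.10,
    "freshness": 0.10,
}


def score(candidate):
    """Weighted sum of the candidate's feature values."""
    return sum(weight * getattr(candidate, name) for name, weight in WEIGHTS.items())


def select_target(candidates):
    """Pick the candidate with the highest combined score."""
    return max(candidates, key=score)


candidates = [
    Candidate("May 3 at 9 am", 0.9, 0.8, 0.9, 0.7, 0.8, 0.2),
    Candidate("Locker 12", 0.5, 0.6, 0.4, 0.5, 0.9, 0.9),
]
target = select_target(candidates)
```

A preset strategy, such as always preferring the most recent message for time-sensitive queries, could be applied before or after this scoring step.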
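In another possible implementation, retrieving the answer corresponding to the query text from the structured database includes:
recognizing several slots corresponding to the query text;
taking the query text and the several slots as input to a query language generation model and outputting the query language corresponding to the query text; and
retrieving, based on the query language, the answer corresponding to the query text from the structured database.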
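The slot-to-query step can be sketched with SQL as the query language. This is an assumption: the application does not name the query language, and the template function below is only a stand-in for the query language generation model. The table name, column names, and rows are hypothetical.

```python
import sqlite3

ALLOWED_COLUMNS = {"courier", "date"}  # hypothetical slot names


def build_query(slots):
    """Map recognized {slot: value} pairs to a parameterized SQL query."""
    keys = sorted(k for k in slots if k in ALLOWED_COLUMNS)
    where = " AND ".join(f"{k} = ?" for k in keys) or "1 = 1"
    sql = f"SELECT tracking_number FROM parcels WHERE {where} ORDER BY tracking_number"
    return sql, [slots[k] for k in keys]


# In-memory table standing in for the structured database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (courier TEXT, tracking_number TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [
        ("SF Express", "SF1234567890", "2022-08-15"),
        ("JD Logistics", "JD0987654321", "2022-08-16"),
        ("SF Express", "SF1111111111", "2022-08-17"),
    ],
)

sql, params = build_query({"courier": "SF Express"})
tracking_numbers = [row[0] for row in conn.execute(sql, params)]
```

Slot values are passed as bound parameters and slot names are checked against a whitelist, so user-entered text never becomes part of the SQL string.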
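In another possible implementation, the answer in the question-answer pair includes a long answer and a short answer, and the content of the long answer includes the content of the short answer.
In another possible implementation, the determination module 1502 is further configured to determine that the query text has a question answering intent.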
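In another possible implementation, a lightweight database and a full database are deployed in the terminal device, where the full database includes text data extracted from data in multiple formats stored in the terminal device; the lightweight database includes the text data in the full database that satisfies a preset rule; and the data volume of the lightweight database is less than or equal to one percent of the data volume of the full database;
the first question answering database includes multiple question-answer pairs generated based on the text data in the lightweight database, and/or structured data generated based on the text data in the lightweight database.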
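A minimal sketch of deriving the lightweight database from the full database is shown below. The preset rule is not specified in this passage, so the keyword rule, the newest-first ordering, and the record layout are assumptions; only the one-percent cap comes from the text.

```python
def build_lite(full_db, rule):
    """Keep rule-matching records, newest first, capped at 1% of the full database."""
    limit = len(full_db) // 100  # at most one percent of the full data volume
    matching = sorted(
        (record for record in full_db if rule(record)),
        key=lambda record: record["ts"],
        reverse=True,
    )
    return matching[:limit]


# Hypothetical full database: every tenth record is a verification-code message.
full_db = [
    {"ts": i, "text": "verification code 1234" if i % 10 == 0 else "chat message"}
    for i in range(1000)
]
lite_db = build_lite(full_db, rule=lambda record: "code" in record["text"])
```

Here 100 records satisfy the rule, but only the 10 most recent are kept so that the size bound holds.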
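In another possible implementation, there are multiple terminal devices, and the lightweight databases in the multiple terminal devices are synchronized so that the lightweight database of each terminal device includes the content of the lightweight databases of the other terminal devices.
In another possible implementation, the lightweight databases in the multiple terminal devices are synchronized when the terminal devices are idle.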
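The synchronization can be sketched as a set union written back to each participating device. The device records, the `idle` flag, and the choice to skip busy devices until they become idle are assumptions; the application does not describe the synchronization protocol.

```python
def sync_lite_databases(devices):
    """Merge the lightweight databases of all idle devices and write the union back."""
    idle = [device for device in devices if device["idle"]]
    merged = set().union(*(device["lite"] for device in idle))
    for device in idle:
        device["lite"] = set(merged)
    return merged


# Hypothetical devices belonging to one user.
phone = {"name": "phone", "idle": True, "lite": {"sms: vaccination May 3", "sms: parcel SF1234567890"}}
tablet = {"name": "tablet", "idle": True, "lite": {"note: flight CA1234"}}
watch = {"name": "watch", "idle": False, "lite": {"reminder: gym 7 pm"}}

merged = sync_lite_databases([phone, tablet, watch])
```

After the call, the phone and the tablet hold the same three entries, while the busy watch is left untouched until a later idle period.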
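In another possible implementation, a second question answering database is further deployed in the terminal device, where the second question answering database includes multiple question-answer pairs generated based on the text data in the full database, and/or structured data generated based on the text data in the full database;
the retrieval module 1503 is further configured to:
if no answer corresponding to the query text is retrieved from the first question answering database of the terminal device, retrieve, based on the query intent, the answer corresponding to the query text from the second question answering database of the terminal device.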
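In another possible implementation, the retrieval module 1503 is further configured to:
if no answer corresponding to the query text is retrieved from the first question answering database of the terminal device, retrieve, based on the query intent, the answer corresponding to the query text from the second question answering databases of the multiple terminal devices.
In another possible implementation, the apparatus 1500 further includes an update module 1504, configured to update the answer retrieved from the second question answering database into the first question answering database.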
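The fallback and update behavior can be sketched as a two-tier lookup. Plain dictionaries keyed by the query stand in for the two question answering databases, which is a simplification; real retrieval would go through the indexes described earlier.

```python
def answer(query, first_db, second_db):
    """Look in the first database, fall back to the second, and promote any hit."""
    hit = first_db.get(query)
    if hit is None:
        hit = second_db.get(query)
        if hit is not None:
            first_db[query] = hit  # role of update module 1504
    return hit


# Hypothetical contents: the first database is built from the lightweight data,
# the second from the full data.
first_db = {"vaccination time": "May 3 at 9 am"}
second_db = {
    "vaccination time": "May 3 at 9 am",
    "gym locker code": "4821",
}

first_result = answer("gym locker code", first_db, second_db)
missing = answer("unknown question", first_db, second_db)
```

After the first call, the same query is served from the first database, so later lookups avoid the larger second database.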
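The terminal-based question answering apparatus 1500 according to the embodiments of the present application may correspondingly perform the methods described in the embodiments of the present application, and the foregoing and other operations and/or functions of the modules in the apparatus 1500 are respectively intended to implement the corresponding procedures of the methods in FIGS. 3-10. For brevity, details are not repeated here.
An embodiment of the present application further provides a terminal device, including at least one processor, a memory, and a communication interface, where the processor is configured to perform the methods described in FIGS. 3-14.
FIG. 16 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
As shown in FIG. 16, the terminal device 1600 includes at least one processor 1601, a memory 1602, and a communication interface 1603. The processor 1601, the memory 1602, and the communication interface 1603 are communicatively connected, either in a wired manner (for example, via a bus) or wirelessly. The communication interface 1603 is configured to receive data sent by other devices; the memory 1602 stores computer instructions, and the processor 1601 executes the computer instructions to perform the methods in the foregoing method embodiments.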
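It should be understood that, in the embodiments of the present application, the processor 1601 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 1602 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1601. The memory 1602 may further include a non-volatile random access memory.
The memory 1602 may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).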
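It should be understood that the terminal device 1600 according to the embodiments of the present application may perform the methods shown in FIGS. 3-14 of the embodiments of the present application. For a detailed description of how the methods are implemented, refer to the description above; for brevity, details are not repeated here.
An embodiment of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the methods mentioned above are implemented.
An embodiment of the present application provides a chip, including at least one processor and an interface, where the at least one processor determines program instructions or data through the interface, and the at least one processor is configured to execute the program instructions to implement the methods mentioned above.
An embodiment of the present application provides a computer program or computer program product, including instructions that, when executed, cause a computer to perform the methods mentioned above.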
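A person of ordinary skill in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present application.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The specific implementations described above further explain in detail the objectives, technical solutions, and beneficial effects of the present application. It should be understood that the foregoing is merely specific implementations of the present application and is not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.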

Claims (20)

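  1. A terminal-based question answering method, characterized in that it is applied to a terminal device, the method comprising:
    acquiring a query text;
    when the query category corresponding to the query text is a first query category, retrieving an answer corresponding to the query text from a first database;
    when the query category corresponding to the query text is a second query category, retrieving an answer corresponding to the query text from a second database;
    wherein the first database stores multiple question-answer pairs, the second database stores multiple items of structured data, and the first database and the second database are stored in the terminal device.
  2. The method according to claim 1, characterized in that retrieving the answer corresponding to the query text from the first database when the query category is the first query category comprises:
    searching the first database to obtain a candidate answer set corresponding to the query text, wherein the candidate answer set comprises several first answers and several second answers, the several first answers being retrieved based on an inverted index structure and the several second answers being retrieved based on a vector index structure; and
    selecting the answer corresponding to the query text from the candidate answer set.
  3. The method according to claim 2, characterized in that the selection of the answer corresponding to the query text is related to one or more of a preset strategy, an inverted retrieval score, a vector retrieval score, a question matching degree, a question-answer matching degree, answer text quality, and answer timeliness;
    wherein the question matching degree represents the degree of match between the query text and the question in the question-answer pair, and the question-answer matching degree represents the degree of match between the query text and the answer in the question-answer pair.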
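  4. The method according to any one of claims 1-3, characterized in that, when the query category corresponding to the query text is the second query category, the answer corresponding to the query text comprises a long answer and a short answer, and the content of the long answer comprises the content of the short answer.
  5. The method according to any one of claims 1-4, characterized in that retrieving the answer corresponding to the query text from the second database when the query category is the second query category comprises:
    recognizing several slots corresponding to the query text based on a slot dictionary; and
    retrieving, based on the several slots, the answer corresponding to the query text from the second database, wherein the slot dictionary is obtained based on the multiple items of structured data in the second database.
  6. The method according to any one of claims 1-5, characterized in that a lightweight database and a full database are deployed in the terminal device, wherein the full database comprises text data extracted from data in multiple formats stored in the terminal device; the lightweight database comprises the text data in the full database that satisfies a preset rule; and the data volume of the lightweight database is less than or equal to one percent of the data volume of the full database;
    the multiple question-answer pairs in the first database and the multiple items of structured data in the second database are both obtained based on the text data in the lightweight database.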
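  7. The method according to claim 6, characterized by further comprising:
    synchronizing the lightweight database in the terminal device with the lightweight databases in other terminal devices, so that the lightweight database in the terminal device includes the content of the lightweight databases of the other terminal devices, wherein the terminal device and the other terminal devices are all configured by the same user.
  8. The method according to claim 6 or 7, characterized in that a third database and a fourth database are further deployed in the terminal device, and the multiple question-answer pairs in the third database and the multiple items of structured data in the fourth database are both obtained based on the text data in the full database;
    the method further comprises:
    when the query category is the first query category and no answer corresponding to the query text is retrieved from the first database, retrieving the answer corresponding to the query text from the third database; and
    when the query category is the second query category and no answer corresponding to the query text is retrieved from the second database, retrieving the answer corresponding to the query text from the fourth database.
  9. The method according to claim 8, characterized in that the method further comprises: updating the answer retrieved from the third database into the first database; and/or updating the answer retrieved from the fourth database into the second database.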
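  10. A terminal interaction method, characterized in that it is applied to a terminal device, the method comprising:
    receiving query content entered in a search box;
    in response to a search request, searching the terminal device for an answer corresponding to the query content;
    when the answer is a first-category answer, displaying the content of the first-category answer in the form of long and short answers; and
    when the answer is a second-category answer, displaying the content of the second-category answer in the form of a table.
  11. The method according to claim 10, characterized in that the content of the first-category answer comprises long and short answers corresponding to the query content, the long and short answers comprising a long answer and a short answer, and the content of the long answer comprising the content of the short answer.
  12. The method according to claim 11, characterized in that the content of the first-category answer further comprises the data source of the long and short answers.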
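  13. The method according to any one of claims 10-12, characterized in that the content of the second-category answer comprises several items of structured data corresponding to the query text.
  14. The method according to claim 13, characterized in that the content of the second-category answer further comprises an information source control for each of the several items of structured data;
    the method further comprises:
    when it is detected that the information source control is triggered, displaying the detailed content of the structured data corresponding to the information source control.
  15. The method according to claim 13 or 14, characterized in that the content of the second-category answer further comprises the question text corresponding to the query content, and the slot keywords in the question text are displayed in a different color.
  16. The method according to any one of claims 10-15, characterized in that the content of the first-category answer and/or the content of the second-category answer is displayed in the form of a card.
  17. The method according to any one of claims 10-16, characterized in that the search box is located on the minus-one screen of the terminal device, or at a search entry in an application installed on the terminal device.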
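  18. A terminal device, comprising a memory and a processor, characterized in that the memory stores executable code, and the processor executes the executable code to implement the method according to any one of claims 1-9 and/or any one of claims 10-17.
  19. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed in a computer, the computer is caused to perform the method according to any one of claims 1-9 and/or any one of claims 10-17.
  20. A computer program or computer program product, characterized in that the computer program or computer program product comprises instructions that, when executed, implement the method according to any one of claims 1-9 and/or any one of claims 10-17.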
PCT/CN2022/113668 2022-08-19 2022-08-19 Terminal-based question answering method and apparatus WO2024036616A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/113668 WO2024036616A1 (zh) 2022-08-19 2022-08-19 Terminal-based question answering method and apparatus

Publications (1)

Publication Number Publication Date
WO2024036616A1 true WO2024036616A1 (zh) 2024-02-22

Family

ID=89940425


Country Link
WO (1) WO2024036616A1 (zh)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22955396

Country of ref document: EP

Kind code of ref document: A1