WO2020093720A1 - Information query method and device based on voice recognition - Google Patents

Information query method and device based on voice recognition

Info

Publication number
WO2020093720A1
WO2020093720A1 PCT/CN2019/095013 CN2019095013W WO2020093720A1 WO 2020093720 A1 WO2020093720 A1 WO 2020093720A1 CN 2019095013 W CN2019095013 W CN 2019095013W WO 2020093720 A1 WO2020093720 A1 WO 2020093720A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
medical insurance
label
information query
document
Application number
PCT/CN2019/095013
Other languages
English (en)
French (fr)
Inventor
车惯红
Original Assignee
平安医疗健康管理股份有限公司
Application filed by 平安医疗健康管理股份有限公司
Publication of WO2020093720A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • the present application relates to the medical technology field, and in particular to an information query method and device based on voice recognition.
  • Medical insurance is a social insurance system established to compensate workers for economic losses caused by disease risks.
  • users who purchase medical insurance, or users who manage medical insurance information, can query medical insurance policies, purchased medical insurance information, and other medical-insurance-related information on medical insurance management platforms based on Internet technology.
  • the embodiments of the present application provide an information query method and device based on voice recognition to solve the problem that users who have difficulty in using the search function of the medical insurance management platform cannot obtain medical insurance information.
  • a method for querying information based on speech recognition including:
  • the content of the first medical insurance file contains the keyword, and the medical insurance file storage system is used to store medical insurance documents;
  • an information query device based on voice recognition including:
  • a request receiving module configured to receive an information query request sent by a voice collection terminal, where the information query request includes voice data collected by the voice collection terminal;
  • a voice recognition module used to perform voice recognition on the voice data to obtain an information query sentence;
  • a file query module used to search for the first medical insurance file in the medical insurance file storage system using the information query statement as a keyword, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files;
  • a file sending module is used to send the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
  • another information query device based on voice recognition including a processor, a memory, and a communication interface, where the processor, memory, and communication interface are connected to each other, wherein the communication interface is used to receive or send data,
  • the memory is used to store application code for the voice recognition-based information query device to execute the above method, and the processor is configured to execute the method of the first aspect.
  • a computer non-volatile readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method of the first aspect.
  • the function of searching for medical insurance files based on the user's voice is realized by means of voice recognition.
  • the user only needs to use voice to obtain the medical insurance information they want to query, so that users who have difficulty operating the medical insurance platform can also obtain medical insurance information, which improves the user experience.
  • FIG. 1 is a schematic structural diagram of a medical insurance information query system based on voice recognition provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of an information query method based on voice recognition provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of a medical insurance document displayed by a voice collection terminal provided by an embodiment of the present application
  • FIG. 4 is a schematic flowchart of another information query method based on voice recognition provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a medical insurance document displayed by a voice collection terminal provided by an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of an information query device based on voice recognition provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of another information query device based on voice recognition provided by an embodiment of the present application.
  • FIG. 1 is a schematic structural diagram of a voice recognition-based medical insurance information query system provided by an embodiment of the present application.
  • the system includes a medical insurance information query server 101 and one or more voice collection terminals 102.
  • the voice collection terminal 102 is used to collect the user's voice data and send the collected voice data to the medical insurance information query server 101 for voice recognition.
  • the voice collection terminal 102 may be a computer, a tablet computer, an intelligent terminal device, and so on.
  • the voice collection terminal 102 may be a self-service query machine provided by a medical insurance institution (such as a social security bureau, an insurance company, etc.) for users to query medical insurance information.
  • the medical insurance information query server 101 is used to receive the query request sent by the voice collection terminal 102, and query the medical insurance information corresponding to the query request according to the query request and send it to the voice collection terminal.
  • the solution of the embodiment of the present application can be implemented. Next, the solution of the embodiment of the present application is introduced.
  • FIG. 2 is a schematic flowchart of an information query method based on voice recognition provided by an embodiment of the application.
  • the method can be implemented on the medical insurance information query server shown in FIG. 1. As shown in the figure, the method includes:
  • the information query request may also carry the terminal identification of the voice collection terminal, which is used to uniquely identify the voice collection terminal in the medical insurance information query system; the terminal identification may be the Internet protocol (IP) address of the voice collection terminal, its media access control (MAC) address, an identification assigned to the voice collection terminal by the medical insurance information query server, and so on.
  • IP Internet protocol
  • MAC media access control
  • the voice collection terminal may issue the information query request when no other voice data is collected within a period of time after the voice data is collected, where the period of time is greater than or equal to a first duration threshold; the first duration threshold may be set to 30 seconds, 1 minute, and so on.
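The silence-based trigger described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `should_send_request`, the use of timestamps in seconds, and the default of 30 seconds are assumptions chosen for the example.

```python
from typing import Optional

# Assumed default; the text mentions values such as 30 seconds or 1 minute.
FIRST_DURATION_THRESHOLD_S = 30.0


def should_send_request(last_voice_time: Optional[float],
                        now: float,
                        threshold_s: float = FIRST_DURATION_THRESHOLD_S) -> bool:
    """Return True once no new voice data has arrived for at least threshold_s seconds.

    last_voice_time is the timestamp (in seconds) of the most recent voice data,
    or None if nothing has been collected yet.
    """
    if last_voice_time is None:
        # Nothing was collected, so there is nothing to query.
        return False
    return (now - last_voice_time) >= threshold_s
```

A terminal loop would call this check periodically and package the buffered voice data into an information query request once it returns True.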
  • S202 Perform speech recognition on the speech data collected by the speech collection terminal to obtain an information query sentence.
  • speech recognition may be performed on the voice data collected by the voice collection terminal using a method based on a statistical model, a method based on a vocal tract model and speech knowledge, a method based on standard template matching, or a neural network-based method, to obtain the information query sentence.
  • the voice data collected by the voice collection terminal can be preprocessed to obtain multiple voice segments corresponding to the voice data.
  • the voice data collected by the voice collection terminal can be sampled with a preset sampling period, converting the continuous voice data into a discrete voice signal S(n).
  • the sampling period can be determined according to the Nyquist sampling theorem.
  • the discrete voice signal can then be pre-emphasized, where the pre-emphasis coefficient is greater than 0.9 and less than 1.
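As an illustration of pre-emphasis, the sketch below applies the commonly used first-order filter y(n) = S(n) - c * S(n-1). The exact filter used in the application is not recoverable from this text, so the filter form, the function name `pre_emphasize`, and the default coefficient 0.97 are assumptions; only the range of the coefficient (between 0.9 and 1) comes from the text.

```python
from typing import List, Sequence


def pre_emphasize(signal: Sequence[float], coeff: float = 0.97) -> List[float]:
    """Apply a first-order pre-emphasis filter: y[n] = s[n] - coeff * s[n-1].

    The coefficient must lie strictly between 0.9 and 1, as stated in the text.
    The first sample is passed through unchanged because it has no predecessor.
    """
    if not 0.9 < coeff < 1.0:
        raise ValueError("pre-emphasis coefficient must be in (0.9, 1)")
    if not signal:
        return []
    out = [float(signal[0])]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out
```

Pre-emphasis of this kind boosts the high-frequency part of the signal, which is usually attenuated in recorded speech.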
  • a window function can be used to frame discrete speech information to obtain multiple speech segments, where the window function can be any window function in a rectangular window, a Hamming window, or a Hanning window.
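Framing with a window function can be sketched as below, using a Hamming window as one of the options named in the text. The function names, the frame length, and the hop length are hypothetical; the application does not specify them here.

```python
import math
from typing import List, Sequence


def hamming(length: int) -> List[float]:
    """Return Hamming window coefficients w[n] = 0.54 - 0.46*cos(2*pi*n/(length-1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_signal(signal: Sequence[float], frame_len: int, hop_len: int) -> List[List[float]]:
    """Split a discrete signal into overlapping frames and window each frame.

    Trailing samples that do not fill a whole frame are dropped in this sketch.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames
```

Each returned frame corresponds to one voice segment; a rectangular window would simply use coefficients of 1.0 throughout.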
  • endpoint detection can be performed by means of endpoint detection based on energy, endpoint detection based on information entropy, or endpoint detection based on band variance.
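Of the endpoint detection options listed, the energy-based variant is the simplest to illustrate. The sketch below marks a frame as speech when its short-time energy exceeds a threshold; the function names and the single fixed threshold are assumptions, and practical detectors typically add smoothing or a second threshold.

```python
from typing import List, Optional, Sequence, Tuple


def short_time_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def detect_endpoints(frames: Sequence[Sequence[float]],
                     energy_threshold: float) -> Optional[Tuple[int, int]]:
    """Return (first, last) indices of frames whose energy exceeds the threshold.

    Returns None when no frame is loud enough to count as speech.
    """
    active: List[int] = [i for i, frame in enumerate(frames)
                         if short_time_energy(frame) > energy_threshold]
    if not active:
        return None
    return active[0], active[-1]
```

Frames outside the returned range can be discarded as leading or trailing silence before feature extraction.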
  • feature extraction can be performed on each voice segment to obtain an observation sequence of M rows * N columns, where M is the dimension of the acoustic feature and N is the number of voice segments.
  • Mel-scale frequency cepstral coefficient (MFCC) feature extraction can be performed on each speech segment separately to convert each speech segment into an M-dimensional feature vector; the M-dimensional feature vectors of the multiple speech segments constitute an observation sequence of M rows * N columns.
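The layout of the observation sequence can be made concrete with a small sketch. It does not compute acoustic features; it only shows how N per-segment feature vectors of dimension M are arranged into M rows and N columns. The function name `build_observation_sequence` is hypothetical.

```python
from typing import List, Sequence


def build_observation_sequence(segment_features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Arrange N feature vectors of dimension M into an M-row, N-column matrix.

    segment_features[j] is the feature vector of the j-th voice segment, so
    column j of the result holds that segment's features.
    """
    if not segment_features:
        return []
    m = len(segment_features[0])
    if any(len(vector) != m for vector in segment_features):
        raise ValueError("all feature vectors must have the same dimension")
    return [[vector[row] for vector in segment_features] for row in range(m)]
```

With three segments and two features per segment, the result has two rows and three columns.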
  • the observation sequence is sent to a pre-trained state network based on a Hidden Markov Model (HMM), and a target path whose matching degree with the observation sequence is greater than a preset threshold is found in the state network.
  • HMM Hidden Markov Model
  • the state network includes an acoustic model, a language model, a dictionary model, and a decoder that are pre-trained with a large amount of speech data.
  • after the observation sequence is sent to the state network, the decoder combines the acoustic model, the language model, and the dictionary model to find the path with the highest probability, and that path is determined as the target path that best matches the observation sequence.
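Finding the highest-probability path through an HMM is classically done with the Viterbi algorithm. The sketch below is a toy version over discrete observations; a real decoder would combine acoustic, language, and dictionary models over a much larger network. The text does not name the decoding algorithm, so using Viterbi here is an assumption, as are all names and the requirement that every probability be strictly positive.

```python
import math
from typing import Dict, List, Sequence, Tuple


def viterbi(observations: Sequence[str],
            states: Sequence[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]]) -> Tuple[List[str], float]:
    """Return the most probable state path and its probability.

    Works in log space to avoid underflow; all probabilities must be > 0.
    """
    # best[t][s] = (log-probability of the best path ending in s at time t, previous state)
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            score, prev = max((best[-1][p][0] + math.log(trans_p[p][s]), p)
                              for p in states)
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)
    # Pick the best final state, then follow the back-pointers to recover the path.
    last = max(states, key=lambda s: best[-1][s][0])
    log_prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, math.exp(log_prob)
```

In the thresholded variant described for the state network, the returned probability (or a normalized score derived from it) would be compared against the preset threshold before the path's text is accepted as the information query sentence.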
  • S203 Use the information query statement as a keyword to search for the first medical insurance file in the medical insurance file storage system, where the content of the first medical insurance file contains the keyword, and the medical insurance file storage system is used to store the medical insurance file.
  • the medical insurance file storage system is a system that stores multiple medical insurance files.
  • the medical insurance file storage system can be a local storage area of the medical insurance information query server; the medical insurance file storage system can also be queried by multiple medical insurance information
  • the medical insurance documents stored in the medical insurance document storage system may be medical insurance policy documents, personal medical insurance documents, treatment documents for insured persons, medical insurance coverage drug documents, medical insurance designated pharmacy management agreement documents, medical insurance drug limited payment basis documents, bed medical insurance payment standard documents, medical insurance diagnosis and treatment project agreement service hospital documents, chronic disease management policy documents, medical insurance fund payment method documents, and other documents related to medical insurance information.
  • for example, the information query statement is "hypertension" and the medical insurance files stored in the medical insurance file storage system are medical insurance file 1 to medical insurance file 100; if the content of medical insurance documents 3 to 10 contains the keyword "hypertension", medical insurance documents 3 to 10 are determined as the first medical insurance documents.
  • the file operation plug-in may be used to sequentially close each opened medical insurance file.
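The first way (open each file in turn, search its content for the keyword, then close it) can be sketched as follows. The directory layout, the `.txt` extension, the UTF-8 encoding, and the function name `search_files_by_keyword` are assumptions for illustration; the text only refers to a "file operation plug-in" without further detail.

```python
from pathlib import Path
from typing import List


def search_files_by_keyword(storage_dir: Path, keyword: str) -> List[str]:
    """Return the names of all text files under storage_dir whose content contains keyword.

    Each file is opened, scanned, and closed again before the next one is opened.
    """
    matches = []
    for path in sorted(storage_dir.glob("*.txt")):
        # The with-block closes the file as soon as it has been searched.
        with path.open(encoding="utf-8") as handle:
            if keyword in handle.read():
                matches.append(path.name)
    return matches
```

Because every file is read on every query, the cost grows with the size of the storage system, which is the motivation for the label-based second way.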
  • the second way is to extract the medical insurance file of the medical insurance file storage system to obtain the file label of each medical insurance file, and then save the file label of each medical insurance file and the correspondence between the file label and the medical insurance file to
  • the file label of each medical insurance file is the content in the medical insurance file.
  • proper nouns (such as chronic diseases, payment methods, etc.), disease types (such as high blood pressure, diabetes, etc.), drug names (such as Conlight, etc.), or other nouns or words related to drug names can be extracted from each medical insurance document as the file labels of that document.
  • for example, the medical insurance files of the medical insurance file storage system are medical insurance file 1 to medical insurance file 100, the file labels extracted from medical insurance file 1 to medical insurance file 100 are file label 1 to file label 100, and the correspondence between the file labels and the medical insurance files is shown in Table 1.
  • if no target file label that is the same as the information query statement or contains the information query statement is found, the first medical insurance file may be searched for in the medical insurance file storage system according to the first way described above.
  • the information query statement can then be used as a file label, and the information query statement and the correspondence between the information query statement and the first medical insurance file are saved in the file label data table.
  • the next time the same information query statement is received, the name of the first medical insurance file containing the information query statement can be determined directly from the correspondence between the information query statement and the medical insurance file, and the first medical insurance file can then be acquired from the medical insurance file storage system by name, without opening the medical insurance files one by one for keyword search, thereby improving the efficiency of finding the medical insurance file.
  • for example, the file label data table is shown in Table 1, and the information query statement differs from file label 1 to file label 100. If the first medical insurance document found by the first way above is medical insurance document 95, then after the information query statement and the correspondence between the information query statement and medical insurance document 95 are saved in the file label data table, the file label data table may be as shown in Table 2.
  • Table 2
      No.    File label                     Medical insurance file name
      1      File label 1                   Medical insurance document 1, medical insurance document 3, medical insurance document 7, ...
      2      File label 2                   Medical insurance document 5, medical insurance document 8, medical insurance document 10, ...
      ...    ...                            ...
      100    File label 100                 Medical insurance document 4, medical insurance document 9, medical insurance document 25, ...
      101    Information query statement    Medical insurance document 95
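The label lookup with a fallback to full-text search, followed by saving the query as a new label, can be sketched as below. The file label data table is modeled as a plain dictionary from label to file names, and the full-text search is passed in as a callable; both are assumptions made to keep the example self-contained.

```python
from typing import Callable, Dict, List


def find_first_files(query: str,
                     label_table: Dict[str, List[str]],
                     full_text_search: Callable[[str], List[str]]) -> List[str]:
    """Look the query up in the label table, falling back to full-text search.

    A label matches when it equals the query or contains it. When the fallback
    finds files, the query is saved as a new label so that later lookups hit
    the table directly.
    """
    matched: List[str] = []
    for label, files in label_table.items():
        if query in label:  # equality is a special case of containment
            for name in files:
                if name not in matched:
                    matched.append(name)
    if matched:
        return matched
    found = full_text_search(query)
    if found:
        label_table[query] = list(found)
    return found
```

After one fallback for a given query, the table contains a row analogous to row 101 of Table 2, and repeating the query no longer requires opening any file.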
  • the information query statement may be saved, and a first statistical count may be kept of the number of times within a first time period that the information query statement is used as a keyword to search for the first medical insurance file in the medical insurance file storage system; when the first statistical count is greater than a preset first number, the information query statement is used as a file label of the first medical insurance file, and the information query statement and the correspondence between the information query statement and the first medical insurance file are saved in the file label data table.
  • the first time period may be a length of time such as a week or a month, and the first number may be 20 times, 30 times, and so on.
  • for example, the information query statement is "drug payment basis", the preset time period is one month, and the threshold is 50 times. If the number of times the information query statement "drug payment basis" is used as a keyword to search for the first medical insurance document in the medical insurance file storage system within one month exceeds 50, then "drug payment basis" and the file names of the medical insurance documents containing "drug payment basis" are saved in the file label data table.
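The count-based variant, in which a query becomes a file label only after it has been searched more than a preset number of times within a time window, might look like the sketch below. The class name `QueryPromoter`, the sliding-window bookkeeping with timestamps in seconds, and the dictionary model of the label table are all assumptions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class QueryPromoter:
    """Promote a query to a file label once it is searched often enough."""

    def __init__(self, window_s: float, min_count: int) -> None:
        self.window_s = window_s    # length of the first time period, in seconds
        self.min_count = min_count  # the preset first number
        self._times: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, query: str, files: List[str], now: float,
               label_table: Dict[str, List[str]]) -> bool:
        """Record one keyword search; return True if the query was just promoted."""
        times = self._times[query]
        times.append(now)
        # Drop searches that fall outside the time window.
        while times and now - times[0] > self.window_s:
            times.popleft()
        if len(times) > self.min_count and query not in label_table:
            label_table[query] = list(files)
            return True
        return False
```

With a window of one month and a threshold of 50, the 51st search for "drug payment basis" inside the window would add it to the table.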
  • each file label in the file label data table can also be used to determine the second count of the first medical insurance file in the second time period, if the first file label is used in the second time period.
  • the second time period can be 1 month, 2 months, etc.; the second number can be 3 times, 5 times, etc.
  • the second time period is 3 months, and the second number is 1 time.
  • the file label data table is shown in Table 1; if the second statistical count of the times file label 1 is used to determine the first medical insurance file within 3 months is 0, the row of file label 1 is deleted from Table 1.
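Removing rarely used labels can be sketched as a simple pruning pass. The text here is incomplete about the exact deletion condition, so the rule used below (delete a label when its usage count in the second time period is less than the second number) is an assumption inferred from the example with a count of 0 and a second number of 1. The names `prune_labels` and `usage_counts` are hypothetical.

```python
from typing import Dict, List


def prune_labels(label_table: Dict[str, List[str]],
                 usage_counts: Dict[str, int],
                 min_uses: int) -> List[str]:
    """Delete labels used fewer than min_uses times; return the deleted labels.

    usage_counts maps each label to the number of times it was used to determine
    the first medical insurance file during the second time period. Labels
    missing from usage_counts are treated as unused.
    """
    stale = [label for label in label_table
             if usage_counts.get(label, 0) < min_uses]
    for label in stale:
        del label_table[label]
    return stale
```

Running such a pass at the end of each second time period keeps the label table from accumulating labels that no longer match what users ask for.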
  • S204 Send the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
  • the first medical insurance file may be sent to the voice collection terminal according to the terminal identification of the voice collection terminal in the information query request.
  • the voice collection terminal may display the name of the first medical insurance file in the form of icons, lists, etc., or may display the content of the first medical insurance file after opening the first medical insurance file.
  • the first medical insurance document displayed by the voice collection terminal may be as shown in A or B in FIG. 3.
  • the information query sentence is obtained by recognizing the voice data collected by the voice collection terminal, the information query sentence is used as a keyword to find a medical insurance file containing the keyword in the medical insurance file storage system, and the medical insurance file is sent to the voice collection terminal so that the voice collection terminal displays it, which realizes the function of searching for and displaying medical insurance files by voice.
  • the user only needs to use voice to obtain the medical insurance information to be queried, so that users who do not know how to search the medical insurance platform can also obtain medical insurance information, which improves the user experience.
  • FIG. 4 is a schematic flowchart of another information query method based on voice recognition provided by an embodiment of the present application. As shown in the figure, the method includes:
  • S302 Perform speech recognition on the speech data collected by the speech collection terminal to obtain an information query sentence.
  • S303 Use the information query statement as a keyword to search for the first medical insurance file in the medical insurance file storage system, where the content of the first medical insurance file contains the keyword, and the medical insurance file storage system is used to store the medical insurance file.
  • for steps S301 to S303, reference may be made to the description of the above steps S201 to S203, which will not be repeated here.
  • S304 Determine the file label of the first medical insurance file according to the file label data table.
  • for example, the file labels in the file label data table include file label 1 to file label 8, the medical insurance files stored in the medical insurance file storage system are medical insurance file 1 to medical insurance file 10, and the file label data table is shown in Table 3.
  • the file labels of the first medical insurance document are file label 1, file label 5, and file label 8.
  • the third medical insurance document may be determined according to the file label of the first medical insurance document, and the medical insurance document whose file label is the file label of the first medical insurance document may be determined as the third medical insurance document.
  • for example, if the first medical insurance document is medical insurance document 1 in Table 3 above, and the file labels of medical insurance document 1 are file label 1, file label 5, and file label 8, then the medical insurance documents with file label 1 are medical insurance document 3 and medical insurance document 8, the medical insurance documents with file label 5 are medical insurance document 2 and medical insurance document 5, and the medical insurance documents with file label 8 are medical insurance document 5 and medical insurance document 9; medical insurance document 3, medical insurance document 8, medical insurance document 2, medical insurance document 5, and medical insurance document 9 are therefore determined as the third medical insurance documents.
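The associated-file lookup in this example can be reproduced with a short sketch. The file label data table is modeled as a dictionary from label to file names, using only the three labels that the example mentions; the function name `find_associated_files` is hypothetical.

```python
from typing import Dict, List


def find_associated_files(first_file: str,
                          label_table: Dict[str, List[str]]) -> List[str]:
    """Return every other file that shares at least one label with first_file."""
    first_labels = [label for label, files in label_table.items()
                    if first_file in files]
    associated: List[str] = []
    for label in first_labels:
        for name in label_table[label]:
            if name != first_file and name not in associated:
                associated.append(name)
    return associated


# Data taken from the example in the text (only the labels of document 1 are shown).
EXAMPLE_TABLE = {
    "file label 1": ["document 1", "document 3", "document 8"],
    "file label 5": ["document 1", "document 2", "document 5"],
    "file label 8": ["document 1", "document 5", "document 9"],
}
```

Calling `find_associated_files("document 1", EXAMPLE_TABLE)` yields documents 3, 8, 2, 5, and 9, matching the third medical insurance documents listed above.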
  • S306 Send the first medical insurance file and the third medical insurance file to the voice collection terminal, so that the voice collection terminal displays the third medical insurance file in association with the first medical insurance file.
  • all of the third medical insurance documents may be sent to the voice collection terminal; or a fourth medical insurance document may be determined among the third medical insurance documents, and the fourth medical insurance document is sent to the voice collection terminal.
  • the fourth medical insurance document is the third medical insurance document that shares the largest number of file labels with the first medical insurance document.
  • for example, medical insurance document 3 and medical insurance document 1 share file label 1; medical insurance document 8 and medical insurance document 1 share file label 1; medical insurance document 2 and medical insurance document 1 share file label 5; medical insurance document 5 and medical insurance document 1 share file label 5 and file label 8; and medical insurance document 9 and medical insurance document 1 share file label 8. Medical insurance document 5 shares the most file labels with medical insurance document 1, so medical insurance document 5 is determined as the fourth medical insurance document.
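Selecting the fourth medical insurance document amounts to counting shared labels and taking the maximum. The sketch below reuses the label data from the example; the function name `pick_most_related` and the tie-breaking rule (the earliest candidate wins) are assumptions, since the text does not say how ties are resolved.

```python
from typing import Dict, List, Optional


def pick_most_related(first_file: str,
                      candidates: List[str],
                      label_table: Dict[str, List[str]]) -> Optional[str]:
    """Return the candidate sharing the most labels with first_file, or None if empty."""
    def shared_label_count(name: str) -> int:
        return sum(1 for files in label_table.values()
                   if first_file in files and name in files)

    if not candidates:
        return None
    # max() keeps the first candidate when several share the same count.
    return max(candidates, key=shared_label_count)


# Data taken from the example in the text.
LABELS = {
    "file label 1": ["document 1", "document 3", "document 8"],
    "file label 5": ["document 1", "document 2", "document 5"],
    "file label 8": ["document 1", "document 5", "document 9"],
}
THIRD_FILES = ["document 3", "document 8", "document 2", "document 5", "document 9"]
```

Here `pick_most_related("document 1", THIRD_FILES, LABELS)` selects document 5, which shares two labels with document 1 while every other candidate shares one.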
  • the voice collection terminal may display the name of the third medical insurance file in the form of an icon or list while displaying the name of the first medical insurance file. If the voice collection terminal displays the content of the first medical insurance file after opening the first medical insurance file, the voice collection terminal may display the name of the third medical insurance file in the form of a floating ball, a pop-up box, etc., in an area that does not display the content of the first medical insurance file.
  • for example, if the first medical insurance document is a chronic disease management policy document and the third medical insurance document is a medical insurance drug limited payment basis document, the first medical insurance document displayed on the voice collection terminal in association with the third medical insurance document may be as shown in A or B in FIG. 5.
  • the information query sentence is recognized from the voice data collected by the voice collection terminal, the first medical insurance document containing the information query sentence is searched for in the medical insurance file storage system, the third medical insurance document having the same file label as the first medical insurance document is determined according to the file label of the first medical insurance document, and the first medical insurance document and the third medical insurance document are sent to the voice collection terminal so that the voice collection terminal can display both. This realizes the associated search and display of medical insurance documents based on voice: the user can obtain the medical insurance information they want to query, together with the medical insurance information associated with it, using voice alone, thereby improving the user experience.
  • FIG. 6 is a schematic structural diagram of an information query device based on voice recognition provided by an embodiment of the present application.
  • the device may be the medical insurance information query server shown in FIG. 1 or a part of the medical insurance information query server
  • the device 40 includes:
  • the request receiving module 401 is configured to receive an information query request sent by a voice collection terminal, where the information query request includes voice data collected by the voice collection terminal;
  • the voice recognition module 402 is used for performing voice recognition on the voice data to obtain an information query sentence;
  • the file query module 403 is used to search for the first medical insurance file in the medical insurance file storage system using the information query statement as a keyword, where the content of the first medical insurance file includes the keyword and the medical insurance file storage system is used to store medical insurance files;
  • the file sending module 404 is configured to send the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
  • the file query module 403 is specifically used to:
  • the label data table is used to store the file label of each medical insurance file obtained by performing label extraction on each medical insurance file in the medical insurance file storage system;
  • the medical insurance file corresponding to the target file label is determined as the first medical insurance file according to the correspondence between the file label and the medical insurance file.
  • the file query module 403 is also used to:
  • the medical insurance document containing the keyword is determined as the first medical insurance document.
  • the file query module 403 is also used to:
  • the keyword is used as the file label of the first medical insurance file and saved in the file label data table.
  • the device 40 further includes:
  • a file label determination module 405, configured to determine the file label of the first medical insurance file according to the file label data table
  • the associated file search module 406 is configured to search for a third medical insurance file in the medical insurance file storage system according to the correspondence between the file label and the medical insurance file, where at least one of the file labels of the third medical insurance file is the same as a file label of the first medical insurance file;
  • the file sending module 404 is further configured to send the third medical insurance file to the voice collection terminal, so that the voice collection terminal displays the third medical insurance file in association with the first medical insurance file.
  • the speech recognition module 402 is specifically used to:
  • the information query sentence is obtained by performing speech recognition on the speech data through a method based on a statistical model, or a method based on a vocal tract model and speech knowledge, or a method based on a standard template matching, or a neural network-based method.
  • the speech recognition module 402 is specifically used to:
  • the observation sequence is sent to a pre-trained state network based on a hidden Markov model, a target path whose matching degree with the observation sequence is greater than a preset threshold is found in the state network, and the text content corresponding to the target path is determined to be the information query sentence.
  • the file query module 403 is also used to:
  • the file query module 403 is also used to:
  • the first file label in the file label data table is used as a keyword to search for the second count of the first medical insurance file, the first file label is the first 1.
  • the information query device based on voice recognition obtains an information query sentence by recognizing the voice data collected by a voice collection terminal, uses the information query sentence as a keyword to search the medical insurance file storage system for a medical insurance file whose content contains the keyword, and sends the medical insurance file to the voice collection terminal so that the voice collection terminal can display it, which realizes the function of searching for and displaying medical insurance files by voice.
  • the user only needs to use voice to obtain the medical insurance information to be queried, so that users who do not know how to search the medical insurance platform can also obtain medical insurance information, which improves the user experience.
  • FIG. 7 is a schematic structural diagram of another information query device based on voice recognition provided by an embodiment of the present application.
  • the device 50 includes a processor 501, a memory 502, and a communication interface 503.
  • the processor 501 is connected to the memory 502 and the communication interface 503.
  • the processor 501 may be connected to the memory 502 and the communication interface 503 through a bus.
  • the processor 501 is configured to support the voice recognition-based information query device to perform the corresponding functions in the voice recognition-based information query method described in FIGS. 2-5.
  • the processor 501 may be a central processing unit (CPU), a network processor (NP), a hardware chip, or any combination thereof.
  • the above-mentioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the memory 502 is used to store program codes and the like.
  • the memory 502 may include volatile memory (VM), such as random access memory (RAM); the memory 502 may also include non-volatile memory (NVM), such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 502 may also include a combination of the aforementioned types of memory.
  • the memory 502 is used to store medical insurance files, file label data tables, and the like.
  • the communication interface 503 is used to send or receive data.
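  • the processor 501 may call the program code to perform the following operations:
  • receiving, through the communication interface 503, an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal;
  • performing voice recognition on the voice data to obtain an information query sentence;
  • using the information query sentence as a keyword, searching a medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword, and the medical insurance file storage system is used to store medical insurance files;
  • sending the first medical insurance file to the voice collection terminal through the communication interface 503, so that the voice collection terminal displays the first medical insurance file.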
  • each operation may also correspond to the corresponding description of the method embodiment shown in FIGS. 2 to 5; the processor 501 may also cooperate with the communication interface 503 to perform other operations in the above method embodiment.
  • An embodiment of the present application further provides a computer non-volatile readable storage medium that stores a computer program; the computer program includes program instructions which, when executed by a computer, cause the computer to perform the method described in the foregoing embodiments.
  • the computer may be a part of the information query device based on voice recognition mentioned above, for example the processor 501 described above.
  • the non-volatile readable storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.

Abstract

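The present application provides an information query method and device based on voice recognition. The method includes: receiving an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal; performing voice recognition on the voice data to obtain an information query sentence; using the information query sentence as a keyword, searching a medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files; and sending the first medical insurance file to the voice collection terminal so that the voice collection terminal displays the first medical insurance file. This solution helps users who are unable to use the search function of a medical insurance platform to obtain medical insurance information, which improves the user experience.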

Description

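Information query method and device based on voice recognition
This application claims priority to Chinese patent application No. 2018113232950, entitled "Information query method and device based on voice recognition", filed with the Chinese Patent Office on November 7, 2018, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of medical technology, and in particular to an information query method and device based on voice recognition.
Background
Medical insurance is a social insurance system established to compensate workers for economic losses caused by the risk of illness. With the development of Internet technology, users who purchase medical insurance and users who manage medical insurance information can use an Internet-based medical insurance management platform to look up information related to medical insurance, such as the medical insurance policies they want to understand or the details of the insurance they have purchased.
However, some users with a lower level of education or of advanced age cannot type or do not know how to use the search function of the medical insurance management platform, and therefore cannot use the platform to query medical insurance information.
Summary
Embodiments of the present application provide an information query method and device based on voice recognition, which solve the problem that users who have difficulty using the search function of a medical insurance management platform cannot obtain medical insurance information.
In a first aspect, an information query method based on voice recognition is provided, including:
receiving an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal;
performing voice recognition on the voice data to obtain an information query sentence;
using the information query sentence as a keyword, searching a medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files;
sending the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
In a second aspect, an information query device based on voice recognition is provided, including:
a request receiving module, configured to receive an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal;
a voice recognition module, configured to perform voice recognition on the voice data to obtain an information query sentence;
a file query module, configured to use the information query sentence as a keyword and search a medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files;
a file sending module, configured to send the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
In a third aspect, another information query device based on voice recognition is provided, including a processor, a memory, and a communication interface that are connected to one another, where the communication interface is used to receive or send data, the memory is used to store the application program code with which the device performs the above method, and the processor is configured to perform the method of the first aspect.
In a fourth aspect, a computer non-volatile readable storage medium is provided. The computer non-volatile readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to perform the method of the first aspect.
In the embodiments of the present application, voice recognition is used to search medical insurance files according to the user's voice. The user only needs to speak to obtain the medical insurance information they want to query, so that users who have difficulty operating a medical insurance platform can also obtain medical insurance information, which improves the user experience.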
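Brief description of the drawings
FIG. 1 is a schematic structural diagram of a medical insurance information query system based on voice recognition provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of an information query method based on voice recognition provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a voice collection terminal displaying a medical insurance file provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of another information query method based on voice recognition provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a voice collection terminal displaying medical insurance files provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an information query device based on voice recognition provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of another information query device based on voice recognition provided by an embodiment of the present application.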
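Detailed description
The information query method and device based on voice recognition provided by the embodiments of the present application are described below with reference to FIG. 1 to FIG. 7.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a medical insurance information query system based on voice recognition provided by an embodiment of the present application. As shown in the figure, the system includes a medical insurance information query server 101 and one or more voice collection terminals 102. The voice collection terminal 102 is used to collect the user's voice data and send the collected voice data to the medical insurance information query server 101 for voice recognition. The voice collection terminal 102 may be a computer, a tablet computer, a smart terminal device, and so on; it may also be a self-service query kiosk that a medical insurance institution (such as a social security bureau or an insurance company) provides for users to query medical insurance information. The medical insurance information query server 101 is used to receive the query request sent by the voice collection terminal 102, look up the medical insurance information corresponding to the query request, and send it to the voice collection terminal.
The solution of the embodiments of the present application can be implemented on the voice-recognition-based medical insurance information query system shown in FIG. 1, and is introduced next.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of an information query method based on voice recognition provided by an embodiment of the present application. The method may be implemented on the medical insurance information query server shown in FIG. 1. As shown in the figure, the method includes:
S201: receive an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal.
Optionally, the information query request may also carry the terminal identifier of the voice collection terminal, which uniquely identifies the voice collection terminal within the medical insurance information query system. The terminal identifier may be the Internet protocol (IP) address of the voice collection terminal, its media access control (MAC) address, an identifier assigned to it by the medical insurance information query server, and so on.
The voice collection terminal may send the information query request when, after collecting the voice data, it collects no further voice data for a period of time whose length is greater than or equal to a first duration threshold. The first duration threshold may be set to 30 seconds, 1 minute, or a similar length.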
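S202: perform voice recognition on the voice data collected by the voice collection terminal to obtain an information query sentence.
Specifically, voice recognition may be performed on the voice data collected by the voice collection terminal using any one of the following methods: a method based on a statistical model, a method based on a vocal tract model and speech knowledge, a method based on standard template matching, or a method based on a neural network.
The following describes the specific process of obtaining the information query sentence, taking the method based on a statistical model as an example.
1. First, the voice data collected by the voice collection terminal may be preprocessed to obtain multiple voice segments corresponding to the voice data.
Specifically, the voice data collected by the voice collection terminal may be sampled at a preset sampling period to convert the continuous voice data into a discrete speech signal S(n), where the sampling period may be determined according to the Nyquist sampling theorem. The discrete speech signal is then filtered by a digital filter with transfer function $H(Z) = 1 - \alpha Z^{-1}$ to boost the high-frequency resolution of the speech signal, where α is the pre-emphasis coefficient, greater than 0.9 and less than 1. Finally, a window function may be used to split the discrete speech signal into frames, yielding multiple voice segments; the window function may be any one of a rectangular window, a Hamming window, or a Hanning window.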
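The pre-emphasis and framing steps can be illustrated with a minimal Python sketch. It assumes the signal has already been sampled into a list of floats; the function names, the value α = 0.97, and the frame and hop lengths are illustrative choices, not values taken from the application. The filter $H(Z) = 1 - \alpha Z^{-1}$ corresponds to the time-domain difference y[n] = x[n] − α·x[n−1].

```python
import math


def pre_emphasize(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], the time-domain form of H(Z) = 1 - alpha * Z^-1."""
    if not samples:
        return []
    out = [samples[0]]  # the first sample has no predecessor and is passed through
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def hamming(length):
    """Hamming window coefficients for one frame."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]


def split_into_frames(samples, frame_len, hop_len):
    """Cut the signal into overlapping frames and apply a Hamming window to each frame."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# Illustrative input: 100 samples of a sine wave (hypothetical data).
signal = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
frames = split_into_frames(pre_emphasize(signal), frame_len=20, hop_len=10)
```

Each element of `frames` plays the role of one voice segment in the text. A rectangular window would simply skip the multiplication by `window`.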
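Optionally, endpoint detection may also be used to remove noise and interference from the voice segments. Endpoint detection may be energy-based, information-entropy-based, or based on frequency-band variance.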
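As a rough illustration of energy-based endpoint detection, the sketch below keeps only the run of frames between the first and last frame whose short-time energy exceeds a threshold. The function names and the threshold value are hypothetical; the application does not specify how the threshold is chosen.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def detect_endpoints(frames, threshold):
    """Return (first, last) indices of frames whose energy exceeds the threshold, or None."""
    active = [i for i, frame in enumerate(frames) if short_time_energy(frame) > threshold]
    if not active:
        return None
    return active[0], active[-1]


def trim_silence(frames, threshold):
    """Drop leading and trailing low-energy frames."""
    endpoints = detect_endpoints(frames, threshold)
    if endpoints is None:
        return []
    first, last = endpoints
    return frames[first:last + 1]


# Hypothetical frames: quiet, loud, loud, silent.
example_frames = [[0.0, 0.01], [0.5, -0.4], [0.3, 0.2], [0.0, 0.0]]
speech_frames = trim_silence(example_frames, threshold=0.05)
```

Entropy-based or band-variance-based detection would replace `short_time_energy` with a different per-frame statistic while keeping the same overall structure.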
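2. After the multiple voice segments corresponding to the voice data are obtained, feature extraction may be performed on each voice segment to obtain an observation sequence of M rows × N columns, where M is the dimension of the acoustic features and N is the number of voice segments.
Specifically, linear prediction cepstral coefficient (LPCC) feature extraction or Mel-scale frequency cepstral coefficient (MFCC) feature extraction may be performed on each voice segment, converting each segment into an M-dimensional feature vector; the M-dimensional feature vectors of the voice segments together form the observation sequence of M rows × N columns.
3. After the observation sequence is obtained, it is fed into a pre-trained state network based on a hidden Markov model (HMM). A target path whose degree of match with the observation sequence is greater than a preset threshold is searched for in the state network, and the text corresponding to the target path is determined to be the information query sentence.
The state network includes an acoustic model, a language model, a dictionary model, and a decoder, obtained in advance by training on a large amount of voice data. After the observation sequence is fed into the state network, the decoder combines the acoustic model, the language model, and the dictionary model to find the path with the highest probability, and that path is determined to be the target path that best matches the observation sequence.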
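Finding the highest-probability path through an HMM is commonly done with the Viterbi algorithm. The sketch below is a toy version under strong simplifying assumptions: observations are discrete symbols rather than M-dimensional acoustic feature vectors, and the two states, their probabilities, and the observation labels are invented for illustration. It is not the decoder described in the application, which also combines language and dictionary models.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log_probability, best_state_path) for a discrete-observation HMM."""
    # best[t][s] = (log-probability of the best path ending in state s at time t, previous state)
    best = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    log_prob = best[-1][last][0]
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return log_prob, path


# Hypothetical two-state model: silence vs. speech, observed as low/high energy.
states = ["sil", "speech"]
start_p = {"sil": 0.8, "speech": 0.2}
trans_p = {"sil": {"sil": 0.6, "speech": 0.4}, "speech": {"sil": 0.3, "speech": 0.7}}
emit_p = {"sil": {"low": 0.9, "high": 0.1}, "speech": {"low": 0.2, "high": 0.8}}

log_prob, path = viterbi(["low", "high", "high", "low"], states, start_p, trans_p, emit_p)
```

A caller could compare `log_prob` against a preset threshold and accept the path only when the score is high enough, which mirrors the match threshold mentioned in step 3. All probabilities must be nonzero here because the sketch takes logarithms directly.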
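S203: using the information query sentence as a keyword, search the medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files.
Here, the medical insurance file storage system is a system that stores multiple medical insurance files. It may be a local storage area of the medical insurance information query server, or it may be a distributed storage system composed of multiple medical insurance information query servers, in which the medical insurance files are distributed across those servers. The medical insurance files stored in the system may be files related to medical insurance information, such as medical insurance policy files, personal medical insurance files, insured-person benefit files, files listing drugs covered by medical insurance, management agreement files for designated medical insurance pharmacies, files setting out the basis for restricted payment of medical insurance drugs, hospital bed payment standard files, files on hospitals contracted for medical insurance diagnosis and treatment items, management policy files for chronic and special diseases, and medical insurance fund payment method files.
In the embodiments of the present application, there are two ways to search for the first medical insurance file in the medical insurance file storage system:
In the first way, a file operation plug-in opens the medical insurance files in the storage system one by one and, using the information query sentence as the search keyword, looks for the information query sentence in each opened file. If the information query sentence is found in a second medical insurance file, the second medical insurance file is determined to contain the information query sentence and is determined to be the first medical insurance file.
For example, suppose the information query sentence is "hypertension" and the medical insurance files stored in the storage system are medical insurance file 1 to medical insurance file 100. The file operation plug-in opens medical insurance files 1 to 100 one by one and, using "hypertension" as the search keyword, looks for it in each of them. If the keyword "hypertension" is found in medical insurance files 3 to 10, then medical insurance files 3 to 10 are determined to be the first medical insurance file.
Optionally, after the file operation plug-in has searched every medical insurance file in the storage system for the information query sentence, it may also close the opened medical insurance files one by one.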
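The first way amounts to a sequential full-text scan. The following sketch assumes, purely for illustration, that the medical insurance files are plain UTF-8 text files in a single directory; the function name `search_files_by_keyword` is hypothetical and the application does not describe the file operation plug-in at this level of detail.

```python
from pathlib import Path


def search_files_by_keyword(directory, keyword, encoding="utf-8"):
    """Open each file in turn, look for the keyword, and return the matching file names."""
    matches = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # The with-block closes each file once it has been searched.
        with path.open(encoding=encoding) as handle:
            if keyword in handle.read():
                matches.append(path.name)
    return matches
```

Because every file is read on every query, the cost grows with the total size of the stored files, which is the motivation for the label-based second way.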
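In the second way, labels may be extracted from the medical insurance files in the storage system to obtain the file labels of each medical insurance file, and the file labels, together with the correspondence between file labels and medical insurance files, are saved in a file label data table; the file labels of a medical insurance file are taken from the content of that file. To find the first medical insurance file containing the information query sentence, the file label data table is traversed until a file label that is identical to or contains the information query sentence is found. That file label is determined to be the target file label, and the medical insurance file that corresponds to the target file label, according to the correspondence stored in the file label data table, is determined to be the first medical insurance file.
Specifically, proper nouns (such as chronic and special diseases, or payment methods), disease types (such as hypertension or diabetes) or nouns and words related to disease types, and drug names (such as Kanglaite) or nouns and words related to drug names may be extracted from each medical insurance file as its file labels.
As an example, suppose the medical insurance files in the storage system are medical insurance file 1 to medical insurance file 100, and the file labels extracted from them are file label 1 to file label 100, with the correspondence between file labels and medical insurance files shown in Table 1.
Label No. | File label     | Medical insurance file names
1         | File label 1   | Medical insurance file 1, medical insurance file 3, medical insurance file 8, …
2         | File label 2   | Medical insurance file 5, medical insurance file 8, medical insurance file 10, …
100       | File label 100 | Medical insurance file 4, medical insurance file 8, medical insurance file 25, …
Table 1
Suppose the information query sentence is file label 2. Table 1 is traversed; when the traversal reaches file label 2, file label 2 is identical to the information query sentence, so file label 2 is determined to be the target file label, and medical insurance file 5, medical insurance file 8, and medical insurance file 10, which correspond to file label 2, are determined to be the first medical insurance file.
Optionally, if no target file label identical to or containing the information query sentence is found after the whole file label data table has been traversed, the first medical insurance file may be searched for in the medical insurance file storage system using the first way described above.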
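The label lookup with a fallback to the full-text scan can be sketched as follows. The file label data table is modeled as an ordinary dictionary from label to file names, and file contents are held in memory; both are simplifying assumptions, and the names `find_by_label` and `find_first_files` are hypothetical.

```python
def find_by_label(query, label_table):
    """Return the files of the first label that equals or contains the query, or None."""
    for label, file_names in label_table.items():
        # Substring containment also covers the case where the label equals the query.
        if query in label:
            return list(file_names)
    return None


def find_first_files(query, label_table, file_contents):
    """Try the label table first; fall back to scanning file contents if no label matches."""
    hit = find_by_label(query, label_table)
    if hit is not None:
        return hit
    return [name for name, text in file_contents.items() if query in text]


# Hypothetical data.
label_table = {
    "hypertension": ["file 1", "file 3"],
    "drug payment basis": ["file 5"],
}
file_contents = {
    "file 1": "hypertension outpatient policy",
    "file 3": "hypertension drug list",
    "file 5": "drug payment basis and bed payment standard",
}
```

With this data, `find_first_files("payment basis", label_table, file_contents)` is answered from the label table, while `find_first_files("bed payment", label_table, file_contents)` matches no label and is answered by the scan.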
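Optionally, when the first medical insurance file is not found in the second way but is found in the first way, the information query sentence may be used as a file label of the first medical insurance file, and the information query sentence and its correspondence with the first medical insurance file are saved in the file label data table. In this way, when the same information query sentence is later used as a keyword to query medical insurance files, the name of the first medical insurance file containing it can be determined directly from the stored correspondence, and the file can then be retrieved from the storage system by name, without opening the medical insurance files one by one for a keyword search. This improves the efficiency of finding medical insurance files.
For example, suppose the file label data table is as shown in Table 1, the information query sentence differs from every one of file label 1 to file label 100, and the first medical insurance file found in the first way is medical insurance file 95. After the information query sentence and its correspondence with medical insurance file 95 are saved, the file label data table may be as shown in Table 2.
Label No. | File label                 | Medical insurance file names
1         | File label 1               | Medical insurance file 1, medical insurance file 3, medical insurance file 8, …
2         | File label 2               | Medical insurance file 5, medical insurance file 8, medical insurance file 10, …
100       | File label 100             | Medical insurance file 4, medical insurance file 8, medical insurance file 25, …
101       | Information query sentence | Medical insurance file 95
Table 2
Optionally, when the first medical insurance file is not found in the second way but is found in the first way, the information query sentence may be saved, and a first count may be kept of the number of times within a first time period that the information query sentence is used as a keyword to search for the first medical insurance file in the storage system. When the first count is greater than a preset first number, the information query sentence is used as a file label of the first medical insurance file, and the information query sentence and its correspondence with the first medical insurance file are saved in the file label data table. The first time period may be one week, one month, and so on; the first number may be 20, 30, and so on. In this way, only information query sentences that are frequently used as keywords are saved to the file label data table: on the one hand this limits the growth in the number of file labels, and on the other hand storing common keywords in the table still improves the efficiency of finding medical insurance files.
For example, suppose the information query sentence is "drug payment basis", the first time period is one month, and the preset first number is 50. If, within one month, "drug payment basis" is used as a keyword to search for the first medical insurance file in the storage system more than 50 times, then "drug payment basis" and the names of the medical insurance files found to contain it are saved together in the file label data table.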
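The count-based promotion of a frequent query to a file label might look like the sketch below. The class name `LabelPromoter`, the use of numeric timestamps, and the sliding-window interpretation of the first time period are assumptions made for illustration; the application only states that the count is taken within a first time period.

```python
from collections import defaultdict


class LabelPromoter:
    """Track fallback searches per query and promote frequent queries to file labels."""

    def __init__(self, window_seconds, min_count):
        self.window_seconds = window_seconds  # length of the first time period
        self.min_count = min_count            # the preset first number
        self._hits = defaultdict(list)        # query -> timestamps of fallback searches

    def record(self, query, matched_files, label_table, now):
        """Record one fallback search; return True if the query was added as a label."""
        recent = [t for t in self._hits[query] if now - t < self.window_seconds]
        recent.append(now)
        self._hits[query] = recent
        if len(recent) > self.min_count and query not in label_table:
            label_table[query] = list(matched_files)
            return True
        return False
```

For instance, with `LabelPromoter(window_seconds=100, min_count=2)`, the third call to `record` for the same query within 100 seconds adds that query to `label_table`. Setting `min_count=0` reproduces the simpler variant in which a query is saved as a label the first time the fallback search succeeds.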
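Optionally, a second count may also be kept, for each file label in the file label data table, of the number of times within a second time period that the label is used to determine the first medical insurance file. If the second count for a first file label is less than a second number, the first file label and its correspondence with medical insurance files are deleted from the file label data table. The second time period may be 1 month, 2 months, and so on; the second number may be 3, 5, and so on. Deleting file labels in this way keeps the number of labels in the table at a balanced level: on the one hand it saves storage space, and on the other hand fewer labels means faster traversal and therefore a faster search for the first medical insurance file.
For example, suppose the second time period is 3 months, the second number is 1, and the file label data table is as shown in Table 1. If file label 1 was used to determine the first medical insurance file 0 times within the 3 months, the row for file label 1 is deleted from Table 1.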
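Pruning rarely used labels can be sketched in a few lines. The sketch assumes the per-label usage counts for the second time period have already been collected into a dictionary; the function name `prune_labels` and the data are hypothetical.

```python
def prune_labels(label_table, usage_counts, min_uses):
    """Delete labels used fewer than min_uses times in the period; return the removed labels."""
    stale = [label for label in label_table if usage_counts.get(label, 0) < min_uses]
    for label in stale:
        del label_table[label]
    return stale


# Hypothetical table and counts for one period.
labels = {"file label 1": ["file 1", "file 3"], "file label 2": ["file 5", "file 8"]}
removed = prune_labels(labels, usage_counts={"file label 2": 4}, min_uses=1)
```

Here `file label 1` has no recorded use, so it is removed along with its file correspondence, matching the example in the text.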
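S204: send the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
Specifically, the first medical insurance file may be sent to the voice collection terminal according to the terminal identifier carried in the information query request. The voice collection terminal may display the name of the first medical insurance file as an icon, a list entry, or a similar form, or it may open the first medical insurance file and display its content.
For example, if the first medical insurance file is a management policy file for chronic and special diseases, the first medical insurance file displayed by the voice collection terminal may be as shown in A or B of FIG. 3.
In the embodiments of the present application, the voice data collected by the voice collection terminal is recognized to obtain an information query sentence; the information query sentence is used as a keyword to search the medical insurance file storage system for a medical insurance file whose content contains the keyword; and that file is sent to the voice collection terminal so that the terminal can display it. This realizes searching for and displaying medical insurance files by voice. The user only needs to speak to obtain the medical insurance information they want to query, so that users who are unable to use the search function of a medical insurance platform can also obtain medical insurance information, which improves the user experience.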
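In some possible scenarios, when a file label data table is used to store the file labels of each medical insurance file and the correspondence between file labels and medical insurance files, then in addition to sending the medical insurance file for the information the user wants to query to the voice collection terminal for display, the medical insurance files for related medical insurance information may also be sent to the voice collection terminal for display. Referring to FIG. 4, FIG. 4 is a schematic flowchart of another information query method based on voice recognition provided by an embodiment of the present application. As shown in the figure, the method includes:
S301: receive an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal.
S302: perform voice recognition on the voice data collected by the voice collection terminal to obtain an information query sentence.
S303: using the information query sentence as a keyword, search the medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files.
Here, for the specific implementation of steps S301 to S303, refer to the description of steps S201 to S203 above; details are not repeated here.
S304: determine the file labels of the first medical insurance file according to the file label data table.
For example, suppose the file labels in the file label data table are file label 1 to file label 8, the medical insurance files stored in the medical insurance file storage system are medical insurance file 1 to medical insurance file 10, and the file label data table is as shown in Table 3.
Label No. | File label   | Medical insurance file names
1         | File label 1 | Medical insurance file 1, medical insurance file 3, medical insurance file 8
2         | File label 2 | Medical insurance file 5, medical insurance file 8, medical insurance file 10
3         | File label 3 | Medical insurance file 3, medical insurance file 4, medical insurance file 5
4         | File label 4 | Medical insurance file 4, medical insurance file 8, medical insurance file 9
5         | File label 5 | Medical insurance file 1, medical insurance file 2, medical insurance file 5
6         | File label 6 | Medical insurance file 3, medical insurance file 6, medical insurance file 9
7         | File label 7 | Medical insurance file 2, medical insurance file 4, medical insurance file 7
8         | File label 8 | Medical insurance file 1, medical insurance file 5, medical insurance file 9
Table 3
Suppose the first medical insurance file is medical insurance file 1. According to Table 3, the file labels of the first medical insurance file are file label 1, file label 5, and file label 8.
S305: according to the correspondence between file labels and medical insurance files, search the medical insurance file storage system for a third medical insurance file, where at least one of the file labels of the third medical insurance file is the same as a file label of the first medical insurance file.
Here, the third medical insurance file may be determined from the file labels of the first medical insurance file: any other medical insurance file that has one of the file labels of the first medical insurance file is determined to be a third medical insurance file.
For example, suppose the first medical insurance file is medical insurance file 1 in Table 3, whose file labels are file label 1, file label 5, and file label 8. The other medical insurance files with file label 1 are medical insurance file 3 and medical insurance file 8; those with file label 5 are medical insurance file 2 and medical insurance file 5; and those with file label 8 are medical insurance file 5 and medical insurance file 9. Medical insurance file 3, medical insurance file 8, medical insurance file 2, medical insurance file 5, and medical insurance file 9 are therefore determined to be the third medical insurance files.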
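The related-file lookup in S304 and S305 can be reproduced on the data of Table 3 with a short sketch. File names are reduced to integers and the table to a dictionary for brevity; the function names are hypothetical.

```python
# Table 3, with files identified by number.
TABLE_3 = {
    "file label 1": [1, 3, 8],
    "file label 2": [5, 8, 10],
    "file label 3": [3, 4, 5],
    "file label 4": [4, 8, 9],
    "file label 5": [1, 2, 5],
    "file label 6": [3, 6, 9],
    "file label 7": [2, 4, 7],
    "file label 8": [1, 5, 9],
}


def labels_of(file_id, label_table):
    """S304: every label whose file list contains file_id."""
    return [label for label, files in label_table.items() if file_id in files]


def related_files(file_id, label_table):
    """S305: files other than file_id that share at least one label with it."""
    related = []
    for label in labels_of(file_id, label_table):
        for other in label_table[label]:
            if other != file_id and other not in related:
                related.append(other)
    return related
```

For medical insurance file 1 this yields labels 1, 5, and 8 and the related files 3, 8, 2, 5, and 9, in agreement with the worked example.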
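S306: send the first medical insurance file and the third medical insurance file to the voice collection terminal, so that the voice collection terminal displays the third medical insurance file in association with the first medical insurance file.
Optionally, when there are multiple third medical insurance files, all of them may be sent to the voice collection terminal; alternatively, a fourth medical insurance file may be determined among the third medical insurance files and sent to the voice collection terminal, the fourth medical insurance file being the third medical insurance file that has the largest number of file labels in common with the first medical insurance file.
For example, among the third medical insurance files determined in step S305 above, medical insurance file 3 shares file label 1 with medical insurance file 1; medical insurance file 8 shares file label 1; medical insurance file 2 shares file label 5; medical insurance file 5 shares file label 5 and file label 8; and medical insurance file 9 shares file label 8. Medical insurance file 5 is therefore determined to be the fourth medical insurance file.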
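Selecting the related file with the most shared labels can be sketched with a counter. The table below restates Table 3 with files identified by number, and the function name `most_related_file` is hypothetical; how ties should be broken is not stated in the application, so the sketch simply returns the first file to reach the highest count.

```python
from collections import Counter

# Table 3, with files identified by number.
LABEL_TABLE = {
    "file label 1": [1, 3, 8],
    "file label 2": [5, 8, 10],
    "file label 3": [3, 4, 5],
    "file label 4": [4, 8, 9],
    "file label 5": [1, 2, 5],
    "file label 6": [3, 6, 9],
    "file label 7": [2, 4, 7],
    "file label 8": [1, 5, 9],
}


def most_related_file(file_id, label_table):
    """Return the file sharing the most labels with file_id, or None if there is none."""
    shared = Counter()
    for files in label_table.values():
        if file_id in files:
            for other in files:
                if other != file_id:
                    shared[other] += 1
    if not shared:
        return None
    return shared.most_common(1)[0][0]


fourth_file = most_related_file(1, LABEL_TABLE)
```

For medical insurance file 1, file 5 shares two labels (5 and 8) while every other related file shares one, so `fourth_file` is 5.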
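Here, if the voice collection terminal displays the name of the first medical insurance file as an icon, a list entry, or a similar form, it may display the name of the third medical insurance file in the same form at the same time. If the voice collection terminal opens the first medical insurance file and displays its content, it may display the name of the third medical insurance file in a floating ball, a pop-up box, or a similar form, in an area that does not show the content of the first medical insurance file.
For example, if the first medical insurance file is a management policy file for chronic and special diseases and the third medical insurance file is a file setting out the basis for restricted payment of medical insurance drugs, the voice collection terminal may display the third medical insurance file in association with the first medical insurance file as shown in A or B of FIG. 5.
In the embodiments of the present application, the information query sentence obtained by recognizing the voice data collected by the voice collection terminal is used to search the medical insurance file storage system for a first medical insurance file containing the information query sentence; a third medical insurance file that shares a file label with the first medical insurance file is determined from the file labels of the first medical insurance file; and the first and third medical insurance files are sent to the voice collection terminal so that the terminal can display both. This realizes associated search and display of medical insurance files by voice: using voice alone, the user can obtain the medical insurance information they want to query together with related medical insurance information, which improves the user experience.
The method of the embodiments of the present application has been introduced above; the device of the embodiments of the present application is introduced below.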
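Referring to FIG. 6, FIG. 6 is a schematic structural diagram of an information query device based on voice recognition provided by an embodiment of the present application. The device may be the medical insurance information query server shown in FIG. 1 or a part of that server. The device 40 includes:
a request receiving module 401, configured to receive an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal;
a voice recognition module 402, configured to perform voice recognition on the voice data to obtain an information query sentence;
a file query module 403, configured to use the information query sentence as a keyword and search a medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files;
a file sending module 404, configured to send the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
In a possible design, the file query module 403 is specifically configured to:
use the information query sentence as a keyword, traverse a file label data table, and search the file label data table for a target file label, where the target file label is identical to or contains the keyword, and the file label data table is used to store the file labels of each medical insurance file obtained by extracting labels from each medical insurance file in the medical insurance file storage system;
when the target file label is found, determine the medical insurance file corresponding to the target file label to be the first medical insurance file according to the correspondence between file labels and medical insurance files.
In a possible design, the file query module 403 is further configured to:
when the target file label is not found, open the medical insurance files in the medical insurance file storage system one by one through a file operation plug-in;
search for the keyword in the medical insurance files through the file operation plug-in, using the keyword as the search keyword;
determine a medical insurance file containing the keyword to be the first medical insurance file.
In a possible design, the file query module 403 is further configured to:
use the keyword as a file label of the first medical insurance file and save it in the file label data table.
In a possible design, the device 40 further includes:
a file label determination module 405, configured to determine the file labels of the first medical insurance file according to the file label data table;
an associated file search module 406, configured to search the medical insurance file storage system for a third medical insurance file according to the correspondence between file labels and medical insurance files, where at least one of the file labels of the third medical insurance file is the same as a file label of the first medical insurance file;
the file sending module 404 is further configured to send the third medical insurance file to the voice collection terminal, so that the voice collection terminal displays the third medical insurance file in association with the first medical insurance file.
In a possible design, the voice recognition module 402 is specifically configured to:
perform voice recognition on the voice data to obtain the information query sentence using a method based on a statistical model, a method based on a vocal tract model and speech knowledge, a method based on standard template matching, or a method based on a neural network.
In a possible design, the voice recognition module 402 is specifically configured to:
preprocess the voice data to obtain multiple voice segments corresponding to the voice data;
perform acoustic feature extraction on each of the multiple voice segments to obtain an observation sequence of M rows × N columns, where M is the dimension of the acoustic features and N is the number of voice segments;
feed the observation sequence into a pre-trained state network based on a hidden Markov model, search the state network for a target path whose degree of match with the observation sequence is greater than a preset threshold, and determine the text corresponding to the target path to be the information query sentence.
In a possible design, the file query module 403 is further configured to:
keep a first count of the number of times, within a first time period, that the information query sentence is used as a keyword to search for the first medical insurance file in the medical insurance file storage system;
when the first count is greater than a preset first number, use the information query sentence as a file label of the first medical insurance file, and save the information query sentence and the correspondence between the information query sentence and the first medical insurance file in the file label data table.
In a possible design, the file query module 403 is further configured to:
keep a second count of the number of times, within a second time period, that a first file label in the file label data table is used as a keyword to find the first medical insurance file, the first file label being a file label corresponding to the first medical insurance file;
when the second count is less than a preset second number threshold, delete from the file label data table the first file label and the correspondence between the first file label and the first medical insurance file.
It should be noted that, for content not mentioned in the embodiment corresponding to FIG. 6, refer to the description of the method embodiments; details are not repeated here.
In the embodiments of the present application, the information query device based on voice recognition recognizes the voice data collected by the voice collection terminal to obtain an information query sentence, uses the information query sentence as a keyword to search the medical insurance file storage system for a medical insurance file whose content contains the keyword, and sends that file to the voice collection terminal so that the terminal can display it. This realizes searching for and displaying medical insurance files by voice. The user only needs to speak to obtain the medical insurance information they want to query, so that users who are unable to use the search function of a medical insurance platform can also obtain medical insurance information, which improves the user experience.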
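Referring to FIG. 7, FIG. 7 is a schematic structural diagram of another information query device based on voice recognition provided by an embodiment of the present application. The device 50 includes a processor 501, a memory 502, and a communication interface 503. The processor 501 is connected to the memory 502 and the communication interface 503; for example, the processor 501 may be connected to the memory 502 and the communication interface 503 through a bus.
The processor 501 is configured to support the information query device based on voice recognition in performing the corresponding functions of the information query method based on voice recognition described in FIG. 2 to FIG. 5. The processor 501 may be a central processing unit (CPU), a network processor (NP), a hardware chip, or any combination thereof. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
The memory 502 is used to store program code and the like. The memory 502 may include volatile memory (VM), such as random access memory (RAM); the memory 502 may also include non-volatile memory (NVM), such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 502 may also include a combination of the above types of memory. In the embodiments of the present application, the memory 502 is used to store medical insurance files, the file label data table, and the like.
The communication interface 503 is used to send or receive data.
The processor 501 may call the program code to perform the following operations:
receiving, through the communication interface 503, an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal;
performing voice recognition on the voice data to obtain an information query sentence;
using the information query sentence as a keyword, searching a medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files;
sending the first medical insurance file to the voice collection terminal through the communication interface 503, so that the voice collection terminal displays the first medical insurance file.
It should be noted that, for the implementation of each operation, reference may also be made to the corresponding description of the method embodiments shown in FIG. 2 to FIG. 5; the processor 501 may also cooperate with the communication interface 503 to perform other operations in the above method embodiments.
An embodiment of the present application further provides a computer non-volatile readable storage medium. The computer non-volatile readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a computer, cause the computer to perform the method described in the foregoing embodiments. The computer may be a part of the information query device based on voice recognition mentioned above, for example the processor 501 described above.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable non-volatile storage medium and, when executed, may include the processes of the method embodiments described above. The non-volatile readable storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely a preferred embodiment of the present application and of course cannot be used to limit the scope of rights of the present application; equivalent changes made according to the claims of the present application therefore still fall within the scope covered by the present application.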

Claims (20)

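  1. An information query method based on voice recognition, characterized by comprising:
    receiving an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal;
    performing voice recognition on the voice data to obtain an information query sentence;
    using the information query sentence as a keyword, searching a medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files;
    sending the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
  2. The method according to claim 1, characterized in that using the information query sentence as a keyword to search the medical insurance file storage system for the first medical insurance file comprises:
    using the information query sentence as a keyword, traversing a file label data table, and searching the file label data table for a target file label, where the target file label is identical to or contains the keyword, and the file label data table is used to store the file labels of each medical insurance file obtained by extracting labels from each medical insurance file in the medical insurance file storage system;
    when the target file label is found, determining the medical insurance file corresponding to the target file label to be the first medical insurance file according to the correspondence between file labels and medical insurance files.
  3. The method according to claim 1 or 2, characterized in that the method further comprises:
    when the target file label is not found, opening the medical insurance files in the medical insurance file storage system one by one through a file operation plug-in;
    searching for the keyword in the medical insurance files through the file operation plug-in, using the keyword as the search keyword;
    determining a medical insurance file containing the keyword to be the first medical insurance file.
  4. The method according to claim 3, characterized in that, after determining the medical insurance file containing the keyword to be the first medical insurance file, the method further comprises:
    using the keyword as a file label of the first medical insurance file and saving it in the file label data table.
  5. The method according to any one of claims 2-4, characterized in that, after using the information query sentence as a keyword to search the medical insurance file storage system for the first medical insurance file, the method further comprises:
    determining the file labels of the first medical insurance file according to the file label data table;
    searching the medical insurance file storage system for a third medical insurance file according to the correspondence between file labels and medical insurance files, where at least one of the file labels of the third medical insurance file is the same as a file label of the first medical insurance file;
    sending the third medical insurance file to the voice collection terminal, so that the voice collection terminal displays the third medical insurance file in association with the first medical insurance file.
  6. The method according to any one of claims 1-5, characterized in that performing voice recognition on the voice data to obtain the information query sentence comprises:
    performing voice recognition on the voice data to obtain the information query sentence using a method based on a statistical model, a method based on a vocal tract model and speech knowledge, a method based on standard template matching, or a method based on a neural network.
  7. The method according to claim 6, characterized in that performing voice recognition on the voice data using the method based on a statistical model to obtain the information query sentence comprises:
    preprocessing the voice data to obtain multiple voice segments corresponding to the voice data;
    performing acoustic feature extraction on each of the multiple voice segments to obtain an observation sequence of M rows × N columns, where M is the dimension of the acoustic features and N is the number of voice segments;
    feeding the observation sequence into a pre-trained state network based on a hidden Markov model, searching the state network for a target path whose degree of match with the observation sequence is greater than a preset threshold, and determining the text corresponding to the target path to be the information query sentence.
  8. The method according to any one of claims 1-7, characterized in that the method further comprises:
    keeping a first count of the number of times, within a first time period, that the information query sentence is used as a keyword to search for the first medical insurance file in the medical insurance file storage system;
    when the first count is greater than a preset first number, using the information query sentence as a file label of the first medical insurance file, and saving the information query sentence and the correspondence between the information query sentence and the first medical insurance file in the file label data table.
  9. The method according to any one of claims 1-8, characterized in that the method further comprises:
    keeping a second count of the number of times, within a second time period, that a first file label in the file label data table is used as a keyword to find the first medical insurance file, the first file label being a file label corresponding to the first medical insurance file;
    when the second count is less than a preset second number threshold, deleting from the file label data table the first file label and the correspondence between the first file label and the first medical insurance file.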
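  10. An information query device based on voice recognition, characterized by comprising:
    a request receiving module, configured to receive an information query request sent by a voice collection terminal, the information query request including voice data collected by the voice collection terminal;
    a voice recognition module, configured to perform voice recognition on the voice data to obtain an information query sentence;
    a file query module, configured to use the information query sentence as a keyword and search a medical insurance file storage system for a first medical insurance file, where the content of the first medical insurance file contains the keyword and the medical insurance file storage system is used to store medical insurance files;
    a file sending module, configured to send the first medical insurance file to the voice collection terminal, so that the voice collection terminal displays the first medical insurance file.
  11. The device according to claim 10, characterized in that the file query module is specifically configured to:
    use the information query sentence as a keyword, traverse a file label data table, and search the file label data table for a target file label, where the target file label is identical to or contains the keyword, and the file label data table is used to store the file labels of each medical insurance file obtained by extracting labels from each medical insurance file in the medical insurance file storage system;
    when the target file label is found, determine the medical insurance file corresponding to the target file label to be the first medical insurance file according to the correspondence between file labels and medical insurance files.
  12. The device according to claim 10 or 11, characterized in that the file query module is further configured to:
    when the target file label is not found, open the medical insurance files in the medical insurance file storage system one by one through a file operation plug-in;
    search for the keyword in the medical insurance files through the file operation plug-in, using the keyword as the search keyword;
    determine a medical insurance file containing the keyword to be the first medical insurance file.
  13. The device according to claim 12, characterized in that the file query module is further configured to:
    use the keyword as a file label of the first medical insurance file and save it in the file label data table.
  14. The device according to any one of claims 10-13, characterized in that the device further comprises:
    a file label determination module, configured to determine the file labels of the first medical insurance file according to the file label data table;
    an associated file search module, configured to search the medical insurance file storage system for a third medical insurance file according to the correspondence between file labels and medical insurance files, where at least one of the file labels of the third medical insurance file is the same as a file label of the first medical insurance file;
    the file sending module is further configured to send the third medical insurance file to the voice collection terminal, so that the voice collection terminal displays the third medical insurance file in association with the first medical insurance file.
  15. The device according to any one of claims 10-14, characterized in that the voice recognition module is specifically configured to:
    perform voice recognition on the voice data to obtain the information query sentence using a method based on a statistical model, a method based on a vocal tract model and speech knowledge, a method based on standard template matching, or a method based on a neural network.
  16. The device according to claim 15, characterized in that the voice recognition module is specifically configured to:
    preprocess the voice data to obtain multiple voice segments corresponding to the voice data;
    perform acoustic feature extraction on each of the multiple voice segments to obtain an observation sequence of M rows × N columns, where M is the dimension of the acoustic features and N is the number of voice segments;
    feed the observation sequence into a pre-trained state network based on a hidden Markov model, search the state network for a target path whose degree of match with the observation sequence is greater than a preset threshold, and determine the text corresponding to the target path to be the information query sentence.
  17. The device according to any one of claims 10-16, characterized in that the file query module is further configured to:
    keep a first count of the number of times, within a first time period, that the information query sentence is used as a keyword to search for the first medical insurance file in the medical insurance file storage system;
    when the first count is greater than a preset first number, use the information query sentence as a file label of the first medical insurance file, and save the information query sentence and the correspondence between the information query sentence and the first medical insurance file in the file label data table.
  18. The device according to any one of claims 10-17, characterized in that the file query module is further configured to:
    keep a second count of the number of times, within a second time period, that a first file label in the file label data table is used as a keyword to find the first medical insurance file, the first file label being a file label corresponding to the first medical insurance file;
    when the second count is less than a preset second number threshold, delete from the file label data table the first file label and the correspondence between the first file label and the first medical insurance file.
  19. An information query device based on voice recognition, comprising a processor, a memory, and a communication interface that are connected to one another, where the communication interface is used to transmit data, the memory is used to store program code, and the processor is used to call the program code to perform the method according to any one of claims 1-9.
  20. A computer non-volatile readable storage medium, characterized in that the computer non-volatile readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to perform the method according to any one of claims 1-9.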
PCT/CN2019/095013 2018-11-07 2019-07-08 基于语音识别的信息查询方法和装置 WO2020093720A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811323295.0 2018-11-07
CN201811323295.0A CN109299227B (zh) 2018-11-07 2018-11-07 基于语音识别的信息查询方法和装置

Publications (1)

Publication Number Publication Date
WO2020093720A1 true WO2020093720A1 (zh) 2020-05-14

Family

ID=65145149

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/095013 WO2020093720A1 (zh) 2018-11-07 2019-07-08 基于语音识别的信息查询方法和装置

Country Status (2)

Country Link
CN (1) CN109299227B (zh)
WO (1) WO2020093720A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299227B (zh) * 2018-11-07 2023-06-02 平安医疗健康管理股份有限公司 基于语音识别的信息查询方法和装置
CN110059224B (zh) * 2019-03-11 2020-08-07 深圳市橙子数字科技有限公司 投影仪设备的视频检索方法、装置、设备及存储介质
CN110597952A (zh) * 2019-08-20 2019-12-20 深圳壹账通智能科技有限公司 信息处理方法、服务器及计算机存储介质
CN111046154A (zh) * 2019-11-20 2020-04-21 泰康保险集团股份有限公司 信息检索方法、装置、介质及电子设备
CN113360459A (zh) * 2021-07-08 2021-09-07 国网能源研究院有限公司 文件半自动标注与存储的方法、系统及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW200928806A (en) * 2007-12-25 2009-07-01 Quanta Storage Inc Mothod for producing and searching voice tag of file
CN101996195A (zh) * 2009-08-28 2011-03-30 中国移动通信集团公司 音频文件中语音信息的搜索方法、装置及设备
CN108038114A (zh) * 2017-10-17 2018-05-15 广东欧珀移动通信有限公司 一种路径查询方法、终端、计算机可读存储介质
CN108428446A (zh) * 2018-03-06 2018-08-21 北京百度网讯科技有限公司 语音识别方法和装置
CN108717516A (zh) * 2018-05-18 2018-10-30 云易天成(北京)安全科技开发有限公司 文件标签方法、终端及介质
CN109299227A (zh) * 2018-11-07 2019-02-01 平安医疗健康管理股份有限公司 基于语音识别的信息查询方法和装置

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101145149A (zh) * 2006-09-11 2008-03-19 千兆科技(深圳)有限公司 基于下载引擎的二进制文件相关搜索方法及系统
CN103778919B (zh) * 2014-01-21 2016-08-17 南京邮电大学 基于压缩感知和稀疏表示的语音编码方法
CN103956166A (zh) * 2014-05-27 2014-07-30 华东理工大学 一种基于语音关键词识别的多媒体课件检索系统
CN106021531A (zh) * 2016-05-25 2016-10-12 北京云知声信息技术有限公司 通过语音实现图书查询的方法、系统及装置
CN107798032B (zh) * 2017-02-17 2020-05-19 平安科技(深圳)有限公司 自助语音会话中的应答消息处理方法和装置
CN107679060B (zh) * 2017-07-25 2019-02-05 平安科技(深圳)有限公司 电子保单的状态查询方法、装置、用户终端及存储介质

Also Published As

Publication number Publication date
CN109299227A (zh) 2019-02-01
CN109299227B (zh) 2023-06-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19881899

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19881899

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 221221)
