WO2018214663A1 - Method and apparatus for processing voice data, and electronic device


Info

Publication number: WO2018214663A1
Authority: WIPO (PCT)
Prior art keywords: data, voice, text, segment, text data
Application number: PCT/CN2018/082702
Other languages: English (en), Chinese (zh)
Inventors: 李明修, 银磊, 卜海亮
Original Assignee: 北京搜狗科技发展有限公司
Application filed by 北京搜狗科技发展有限公司
Publication of WO2018214663A1

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/26 - Speech to text systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 - Speaker identification or verification techniques
    • G10L 17/02 - Preprocessing operations, e.g. segment selection; pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; feature selection or extraction

Definitions

  • The present invention relates to the field of voice processing technology, and in particular, to a voice-based data processing method and apparatus, and an electronic device.
  • Speech recognition usually converts speech into text.
  • Traditional speech recognition and recording tools can only convert speech data into corresponding text and cannot distinguish between speakers; therefore, in multi-speaker scenarios, speech recognition alone cannot produce an effective record.
  • Embodiments of the present invention provide a voice-based data processing method to completely record a consultation process.
  • Embodiments of the present invention further provide a voice-based data processing apparatus, an electronic device, and a readable storage medium, to ensure the implementation and application of the foregoing method.
  • The embodiment of the present invention discloses a voice-based data processing method, including: obtaining consultation process data, where the consultation process data is determined according to voice data collected during a consultation process; performing recognition according to the consultation process data to acquire corresponding first text data and second text data, where the first text data belongs to a target user and the second text data belongs to users other than the target user; and obtaining consultation information according to the first text data and the second text data.
  • Optionally, the consultation process data is voice data, and performing recognition according to the consultation process data to acquire the corresponding first text data and second text data includes: separating first voice data and second voice data from the voice data according to voiceprint features; and performing voice recognition on the first voice data and the second voice data respectively, to acquire the corresponding first text data and second text data.
  • Optionally, separating the first voice data and the second voice data from the voice data according to the voiceprint features includes: dividing the voice data into multiple voice segments; and determining the first voice data and the second voice data from the voice segments according to the voiceprint features.
  • Optionally, determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: matching each voice segment against a reference voiceprint feature, where the reference voiceprint feature is a voiceprint feature of the target user; collecting the voice segments that match the reference voiceprint feature to obtain the corresponding first voice data; and collecting the voice segments that do not match the reference voiceprint feature to obtain the corresponding second voice data.
  • Optionally, determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: identifying the voiceprint feature of each voice segment; counting the number of voice segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of voice segments; generating the first voice data from the voice segments corresponding to that voiceprint feature; and generating the second voice data from the voice segments that do not belong to the first voice data.
  • Optionally, performing voice recognition on the first voice data and the second voice data respectively to acquire the corresponding first text data and second text data includes: performing speech recognition on each voice segment in the first voice data and generating the first text data from the recognized text segments; and performing speech recognition on each voice segment in the second voice data and generating the second text data from the recognized text segments.
  • Optionally, obtaining the consultation information according to the first text data and the second text data includes: sorting the text segments in the first text data and the second text data according to the chronological order of their corresponding voice segments, to obtain the consultation information.
  • Optionally, the consultation process data is a text recognition result obtained by recognizing the voice data, and performing recognition according to the consultation process data to acquire the corresponding first text data and second text data includes: performing feature recognition on the text recognition result, and separating the first text data and the second text data according to language features.
  • Optionally, performing feature recognition on the text recognition result and separating the first text data and the second text data according to the language features includes: dividing the text recognition result into corresponding text segments; identifying the text segments using a preset model to determine the language feature of each text segment, where the language features include a target-user language feature and a non-target-user language feature; generating the first text data from the text segments having the target-user language feature; and generating the second text data from the text segments having the non-target-user language feature.
  • The embodiment of the invention further discloses a voice-based data processing apparatus, including: a data acquisition module, configured to obtain consultation process data, where the consultation process data is determined according to voice data collected during the consultation process; a text recognition module, configured to perform recognition according to the consultation process data and acquire corresponding first text data and second text data, where the first text data belongs to a target user and the second text data belongs to users other than the target user; and an information determining module, configured to obtain consultation information according to the first text data and the second text data.
  • Optionally, the consultation process data is voice data, and the text recognition module includes: a separation module, configured to separate first voice data and second voice data from the voice data according to voiceprint features; and a voice recognition module, configured to perform voice recognition on the first voice data and the second voice data respectively, to acquire the corresponding first text data and second text data.
  • Optionally, the separation module is configured to divide the voice data into multiple voice segments, and determine the first voice data and the second voice data from the voice segments according to the voiceprint features.
  • Optionally, the separation module is configured to match each voice segment against a reference voiceprint feature, where the reference voiceprint feature is a voiceprint feature of the target user; collect the voice segments that match the reference voiceprint feature to obtain the corresponding first voice data; and collect the voice segments that do not match the reference voiceprint feature to obtain the corresponding second voice data.
  • Optionally, the separation module is configured to identify the voiceprint feature of each voice segment; count the voice segments having the same voiceprint feature; generate the first voice data from the voice segments of the voiceprint feature with the largest count, where that voiceprint feature is the voiceprint feature of the target user; and generate the second voice data from the remaining voice segments.
  • Optionally, the voice recognition module is configured to perform speech recognition on each voice segment in the first voice data and generate the first text data from the recognized text segments, and to perform speech recognition on each voice segment in the second voice data and generate the second text data from the recognized text segments.
  • Optionally, the information determining module is configured to sort the text segments in the first text data and the second text data according to the chronological order of their corresponding voice segments, to obtain the consultation information.
  • Optionally, the consultation process data is a text recognition result obtained by recognizing the voice data, and the text recognition module is configured to perform feature recognition on the text recognition result and separate the first text data and the second text data according to language features.
  • Optionally, the text recognition module includes: a segment dividing module, configured to divide the text recognition result into corresponding text segments; a segment identification module, configured to identify the text segments using a preset model and determine the language feature of each text segment, where the language features include a first language feature and a second language feature; and a text generation module, configured to generate the first text data from the text segments having the first language feature, and generate the second text data from the text segments having the second language feature.
  • Embodiments of the present invention also disclose a readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the voice-based data processing method according to one or more of the embodiments of the present invention.
  • Embodiments of the present invention also disclose an electronic device, including a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for: obtaining consultation process data, where the consultation process data is determined according to voice data collected during the consultation process; performing recognition according to the consultation process data to acquire corresponding first text data and second text data, where the first text data belongs to a target user and the second text data belongs to users other than the target user; and obtaining consultation information according to the first text data and the second text data.
  • Optionally, the consultation process data is voice data, and performing recognition according to the consultation process data to acquire the corresponding first text data and second text data includes: separating first voice data and second voice data from the voice data according to voiceprint features; and performing voice recognition on the first voice data and the second voice data respectively, to acquire the corresponding first text data and second text data.
  • Optionally, separating the first voice data and the second voice data from the voice data according to the voiceprint features includes: dividing the voice data into multiple voice segments; and determining the first voice data and the second voice data from the voice segments according to the voiceprint features.
  • Optionally, determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: matching each voice segment against a reference voiceprint feature, where the reference voiceprint feature is a voiceprint feature of the target user; collecting the voice segments that match the reference voiceprint feature to obtain the corresponding first voice data; and collecting the voice segments that do not match the reference voiceprint feature to obtain the corresponding second voice data.
  • Optionally, determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: identifying the voiceprint feature of each voice segment; counting the number of voice segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of voice segments; generating the first voice data from the voice segments corresponding to that voiceprint feature; and generating the second voice data from the voice segments that do not belong to the first voice data.
  • Optionally, performing voice recognition on the first voice data and the second voice data respectively to acquire the corresponding first text data and second text data includes: performing speech recognition on each voice segment in the first voice data and generating the first text data from the recognized text segments; and performing speech recognition on each voice segment in the second voice data and generating the second text data from the recognized text segments.
  • Optionally, obtaining the consultation information according to the first text data and the second text data includes: sorting the text segments in the first text data and the second text data according to the chronological order of their corresponding voice segments, to obtain the consultation information.
  • Optionally, the consultation process data is a text recognition result obtained by recognizing the voice data, and performing recognition according to the consultation process data to acquire the corresponding first text data and second text data includes: performing feature recognition on the text recognition result, and separating the first text data and the second text data according to language features.
  • Optionally, performing feature recognition on the text recognition result and separating the first text data and the second text data according to the language features includes: dividing the text recognition result into corresponding text segments; identifying the text segments using a preset model to determine the language feature of each text segment, where the language features include a target-user language feature and a non-target-user language feature; generating the first text data from the text segments having the target-user language feature; and generating the second text data from the text segments having the non-target-user language feature.
  • In the embodiment of the present invention, for consultation process data determined from voice collected during the consultation process, the first text data and the second text data can be identified according to different users, where the first text data belongs to a target user and the second text data belongs to users other than the target user; that is, the doctor's and the patient's sentences during the consultation can be distinguished automatically. The consultation information is then obtained according to the first text data and the second text data, so the consultation process can be recorded completely and the medical records and other content can be organized automatically, saving the time spent organizing consultation records.
  • FIG. 1 is a flow chart showing the steps of an embodiment of a voice-based data processing method of the present invention.
  • FIG. 2 is a flow chart showing the steps of another embodiment of the voice-based data processing method of the present invention.
  • FIG. 3 is a flow chart showing the steps of yet another embodiment of the voice-based data processing method of the present invention.
  • FIG. 4 is a structural block diagram of an embodiment of a voice-based data processing apparatus of the present invention.
  • FIG. 5 is a structural block diagram of another embodiment of the voice-based data processing apparatus of the present invention.
  • FIG. 6 is a structural block diagram of an electronic device for voice-based data processing according to an exemplary embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an electronic device for voice-based data processing according to another exemplary embodiment of the present invention.
  • Referring to FIG. 1, there is shown a flow chart of the steps of an embodiment of the voice-based data processing method of the present invention, which may include the following steps:
  • Step 102: Obtain consultation process data, where the consultation process data is determined according to voice data collected during the consultation process.
  • Voice can be collected during the consultation process by various electronic devices, and the consultation process data is obtained based on the collected voice data; that is, the consultation process data may be the collected voice data, or a text recognition result converted from the collected voice data.
  • Embodiments of the present invention can perform recognition on the data collected from various consultation processes.
  • Step 104: Perform recognition according to the consultation process data, and acquire corresponding first text data and second text data, where the first text data belongs to a target user, and the second text data belongs to users other than the target user.
  • The consultation process data can be recognized, with different recognition methods adopted for different data types: voice data can be processed using voiceprint features, speech recognition, and the like, while text data can be recognized using text features. On this basis, the first text data and the second text data belonging to different users are distinguished.
  • In the consultation process, at least two users communicate and interact: one user is a doctor, and the other users are patients, family members, and the like. For example, a doctor's one-day clinic will include the doctor and multiple patients, and possibly one or more family members.
  • The doctor can be taken as the target user, so the first text data is the consultation text data corresponding to the doctor, and the text data of at least one other user is taken as the second text data, that is, the consultation text data corresponding to the patients and family members.
  • Step 106: Obtain consultation information according to the first text data and the second text data.
  • Since a consultation is usually a question-and-answer process, the first text data and the second text data may each be composed of multiple text segments, so the consultation information can be obtained based on the time of each text segment and the corresponding user.
  • For example, the consultation information may record entries such as "Patient B: I am not comfortable with XXX."
  • the patient information can also be obtained in combination with the outpatient records of the hospital, thereby distinguishing different patients and the like in the consultation information.
  • In the embodiment of the present invention, for consultation process data determined by collecting voice during the consultation process, the first text data and the second text data can be identified from the consultation process data, where the first text data belongs to a target user and the second text data belongs to users other than the target user; that is, the doctor's and the patient's sentences during the consultation can be distinguished automatically. The consultation information is then obtained according to the first text data and the second text data, so the consultation process can be recorded completely, medical records and other content can be organized automatically, and the time for organizing consultation records is saved.
  • The consultation process data includes the voice data and/or a text recognition result obtained by recognizing the voice data.
  • The recognition methods for different types of consultation process data differ; therefore, the embodiments of the present invention discuss the processing of each type of consultation process data separately.
  • In another embodiment of the present invention, the consultation process data is voice data, and the method may include the following steps:
  • Step 202: Obtain consultation process data, where the consultation process data is voice data collected during the consultation process.
  • Voice data can be collected through various electronic devices, for example by recording audio with a recording pen, a mobile phone, a computer, and the like, to obtain the voice data collected during the consultation process. The voice data may be collected from a single outpatient visit or from multiple visits by one doctor; this is not limited in the embodiments of the present invention. The voice data therefore includes the voice data of a doctor and of at least one patient, and may also include the voice data of at least one patient's family member.
  • The above step 104, performing recognition according to the consultation process data and acquiring the corresponding first text data and second text data, may include the following steps 204-206.
  • Step 204: Separate first voice data and second voice data from the voice data according to voiceprint features.
  • Voiceprint refers to the spectrum of sound waves carrying speech information, as displayed by electro-acoustic instruments. Voiceprints are specific and stable: after adulthood, a person's voiceprint remains relatively stable for a long time, so different people can be identified by their voiceprints. For the voice data, therefore, voiceprint features can be identified to determine the voice segments corresponding to different users (voiceprint features) in the voice data, thereby obtaining the first voice data of the target user and the second voice data of the other users.
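As an illustration of what a "voiceprint feature" can look like in practice, the sketch below computes a crude fixed-length feature vector for an audio file. This is only a toy stand-in (mean MFCCs via librosa), not the method prescribed by the patent; a real system would typically use a dedicated speaker-embedding model, and the 16 kHz sample rate and 20 coefficients are assumed tuning values.

```python
# Toy voiceprint extraction: mean MFCC vector as a crude speaker feature.
# Assumption: librosa is available; a production system would use a
# dedicated speaker-embedding model instead of raw MFCC statistics.
import numpy as np
import librosa

def voiceprint_feature(wav_path: str) -> np.ndarray:
    """Return a crude fixed-length voiceprint vector for one audio file."""
    y, sr = librosa.load(wav_path, sr=16000)            # mono audio at 16 kHz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)  # shape (20, n_frames)
    return mfcc.mean(axis=1)                            # average over time
```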
  • In the embodiment of the present invention, separating the first voice data and the second voice data from the voice data according to the voiceprint features includes: dividing the voice data into multiple voice segments; and determining the first voice data and the second voice data from the voice segments according to the voiceprint features.
  • The voice data can first be divided into multiple voice segments according to a voice division rule, for example by the pause intervals between sound segments, or according to voiceprint features, that is, by determining the voiceprint feature corresponding to each sound and dividing the voice segments by changes in voiceprint feature. One piece of voice data can thus be divided into multiple voice segments; the segments have a chronological order, and different segments may have the same or different voiceprint features. Whether each voice segment belongs to the first voice data or the second voice data is then determined based on the voiceprint features: the voiceprint feature of each voice segment is determined, the voice segments having the target user's voiceprint feature constitute the first voice data, and the remaining voice segments constitute the second voice data.
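A minimal sketch of the segment-division step described above, assuming pause-based division: librosa.effects.split finds non-silent intervals, and the 30 dB threshold is an assumed tuning value. Segments keep their start and end times so that chronological order is preserved for the later steps.

```python
# Divide one recording into voice segments at pause intervals.
import librosa

def divide_into_segments(wav_path: str):
    y, sr = librosa.load(wav_path, sr=16000)
    intervals = librosa.effects.split(y, top_db=30)  # (start, end) in samples
    # Keep chronological order so text can later be re-assembled by time.
    return [(start / sr, end / sr, y[start:end]) for start, end in intervals]
```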
  • In the embodiment of the present invention, before the voice data of the consultation process is collected, a piece of the doctor's (target user's) voice may first be collected as reference data, so that the doctor's voiceprint feature, that is, the reference voiceprint feature, can be identified from the reference data.
  • A voice recognition model may also be set up: after the voice data is input into the model, the voice segments matching the reference voiceprint feature can be separated from the voice segments with other voiceprint features, thereby obtaining the target user's voice segments and the other users' voice segments.
  • Medical record information usually involves only one doctor, while there may be more than one patient, so a correspondingly large number of medical samples can be obtained for a specific doctor in the above manner.
  • In the embodiment of the present invention, the voiceprint feature of the target user may be collected in advance as a reference voiceprint feature for dividing the voice data. That is, determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: matching each voice segment against the reference voiceprint feature, where the reference voiceprint feature is the voiceprint feature of the target user; collecting the voice segments that match the reference voiceprint feature to obtain the corresponding first voice data; and collecting the voice segments that do not match the reference voiceprint feature to obtain the corresponding second voice data.
  • Specifically, the target user's voice may be collected in advance to extract a voiceprint feature, and this voiceprint feature of the target user is used as the reference voiceprint feature. For voice data containing the target user, each voice segment can then be matched against the reference voiceprint feature to determine whether the segment's voiceprint feature is consistent with it. If it is consistent, the voice segment is considered to match the reference voiceprint feature and is added to the first voice data (that is, the voice data of the target user); if it is not consistent, the voice segment does not match the reference voiceprint feature and is added to the second voice data (that is, the voice data corresponding to the non-target users). The first voice data and the second voice data are thus each composed of the corresponding voice segments, where the segments retain their chronological order, which facilitates the subsequent accurate determination of the consultation information.
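A hedged sketch of this reference-matching strategy, assuming each segment already carries a voiceprint feature vector (for example from the voiceprint_feature helper sketched earlier). The cosine-similarity test and the 0.75 threshold are illustrative assumptions, not values from the patent.

```python
# Split segments into target-user (first) and other-user (second) voice data
# by matching each segment's voiceprint against a reference voiceprint.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def split_by_reference(segments, reference: np.ndarray, threshold: float = 0.75):
    """segments: list of (start_time, feature_vector, audio) tuples."""
    first_voice, second_voice = [], []
    for segment in segments:
        _, feature, _ = segment
        if cosine(feature, reference) >= threshold:   # consistent with reference
            first_voice.append(segment)               # target user's segment
        else:
            second_voice.append(segment)              # other users' segment
    return first_voice, second_voice
```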
  • The division of the voice data may also be performed according to the number of voice segments corresponding to the same voiceprint feature in the voice data. That is, determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: identifying the voiceprint feature of each voice segment; counting the number of voice segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of voice segments and generating the first voice data from the voice segments corresponding to that voiceprint feature, where the voiceprint feature with the largest count is the voiceprint feature of the target user; and generating the second voice data from the voice segments that do not belong to the first voice data.
  • The consultation process data may be the recorded data of a doctor's multiple outpatient visits. In such recordings the doctor usually spends the most time communicating with different patients and their families, that is, the doctor (target user) has the largest number of voice segments in the voice data, so the target user and the other users can be distinguished by the number of voice segments corresponding to each user, obtaining the first voice data and the second voice data.
  • Specifically, the voiceprint features in the voice segments can be identified to determine the voiceprint feature contained in each voice segment; the number of voice segments corresponding to each voiceprint feature is then counted separately, and the voiceprint feature with the largest number of voice segments is determined as the voiceprint feature of the target user, the other voiceprint features being those of other users. The voice segments having the target user's voiceprint feature then constitute the first voice data in chronological order, and the other voice segments, that is, the voice segments not belonging to the first voice data, constitute the second voice data.
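The counting strategy can be sketched as follows, under two simplifying assumptions that are not from the patent: segment voiceprints are grouped into exactly two clusters, and scikit-learn's agglomerative clustering stands in for whatever voiceprint-identification method is actually used. The larger cluster is treated as the doctor (target user).

```python
# Majority-count speaker split: the voiceprint with the most segments is
# assumed to belong to the target user (the doctor).
from collections import Counter
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def split_by_majority(segments):
    """segments: list of (start_time, feature_vector, audio) tuples."""
    features = np.stack([feature for _, feature, _ in segments])
    labels = AgglomerativeClustering(n_clusters=2).fit_predict(features)
    target = Counter(labels).most_common(1)[0][0]      # largest segment count
    first_voice = [s for s, l in zip(segments, labels) if l == target]
    second_voice = [s for s, l in zip(segments, labels) if l != target]
    return first_voice, second_voice
```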
  • A voice segment may contain the voiceprint features of multiple users. When multiple voiceprint features are identified in one voice segment, appearing at different times: if the voiceprint features all belong to other users, the voice segment may be added to the second voice data; if they include both the target user's voiceprint feature and other users' voiceprint features, the voice segment may be further divided into sub-segments that are added to the corresponding voice data, or the segment may be assigned as required, for example classified with the target user's voice segments into the first voice data, classified with the other users' voice segments into the second voice data, or added to both users' voice data separately.
  • Step 206: Perform voice recognition on the first voice data and the second voice data respectively, and acquire the corresponding first text data and second text data.
  • After the voice data of different users is separated, the two pieces of voice data can be recognized separately, thereby obtaining the first text data of the target user and the second text data of the other users.
  • That is, performing voice recognition on the first voice data and the second voice data respectively and acquiring the corresponding first text data and second text data includes: performing speech recognition on each voice segment in the first voice data and generating the first text data from the recognized text segments; and performing speech recognition on each voice segment in the second voice data and generating the second text data from the recognized text segments.
  • In other words, by recognizing each voice segment of the first voice data, the text segment corresponding to each voice segment is obtained, and the first text data is formed according to the order of the voice segments; the second text data is obtained in the corresponding manner.
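The per-segment recognition step can be sketched as below. The recognize callable is a placeholder for any speech-to-text engine, not a specific API; the point is that each voice segment yields one text segment and keeps its timestamp so segment order is preserved.

```python
# Recognize each voice segment separately and keep the segments in order.
def voice_to_text(voice_data, recognize, sr: int = 16000):
    """voice_data: list of (start_time, feature, audio) tuples in time order.
    recognize: any callable mapping (audio, sample_rate) -> text."""
    text_data = []
    for start_time, _, audio in voice_data:
        text = recognize(audio, sr)            # one text segment per voice segment
        text_data.append((start_time, text))   # keep the segment's timestamp
    return text_data
```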
  • Step 208: Obtain consultation information according to the first text data and the second text data.
  • The text segments in the first text data and the text segments in the second text data may be sorted in a corresponding order, such as chronological order, thereby obtaining the corresponding consultation information. The consultation information can record the doctor's questions in a consultation and the corresponding patient's (or family member's) answers, as well as the doctor's diagnosis, medical advice, and other information.
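Assembling the consultation information then amounts to tagging each text segment with its speaker and merging both lists chronologically; a minimal sketch follows, where the "Doctor"/"Patient" labels mirror the example above and are assumptions about the display format.

```python
# Merge the two text-data lists into one time-ordered consultation record.
def build_consultation(first_text, second_text):
    """first_text / second_text: lists of (start_time, text) tuples."""
    tagged = [(t, "Doctor", s) for t, s in first_text] + \
             [(t, "Patient", s) for t, s in second_text]
    tagged.sort(key=lambda item: item[0])      # chronological order
    return "\n".join(f"{who}: {what}" for _, who, what in tagged)
```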
  • Step 210: Analyze the consultation information to obtain a corresponding analysis result, where the analysis result is related to disease diagnosis.
  • The embodiment of the present invention can also analyze the consultation information as required and obtain a corresponding analysis result; since the consultation is related to disease diagnosis, the analysis result is also related to disease diagnosis and is determined according to the analysis requirements.
  • In another embodiment of the present invention, the consultation process data is a text recognition result obtained by recognizing the voice data, and the method may specifically include the following steps:
  • Step 302: Acquire the text recognition result obtained by recognizing the voice data.
  • In this embodiment, voice data is collected during the consultation process and converted into a text recognition result by speech recognition, so the text recognition result can be obtained directly.
  • The above step 104, performing recognition according to the consultation process data and acquiring the corresponding first text data and second text data, may include the following step 304.
  • Step 304: Perform feature recognition on the text recognition result, and separate the first text data and the second text data according to language features.
  • For the text recognition result converted from the voice data, the embodiment of the present invention recognizes the words of different users from the text recognition result and organizes them into the consultation information. During a consultation, the doctor usually asks about symptoms, the user replies with the symptoms, and the doctor then gives the diagnosis, the required examinations, the required medicines, and so on. Based on these characteristics, feature recognition can be performed on the text recognition result to separate the doctor's and the patient's statements, thereby separating the first text data and the second text data.
  • The embodiment of the present invention can collect in advance the texts of doctors' consultations and of patients' consultations and analyze the collected information, thereby learning the language features of the doctor (that is, the target user) and of the patients and their families (that is, the other users); a preset model can be established by determining the language features of different users by means of machine learning, probability statistics, and the like.
  • The embodiment of the present invention can obtain a large number of separated medical texts as training data, where the separated medical texts identify the consultation content of the target user and of the other users, such as historically obtained text information labeled in this way. The doctor content data (the target user's first text data) and the patient content data (the other users' second text data) may be trained separately to obtain a doctor content model and a patient content model; of course, the two models may also be combined into one preset model, based on which the doctor's statements and the patient's statements are recognized.
  • For example, in case information obtained from consultations, the doctor's content is generally questions with symptom vocabulary, such as "How do you feel?", "What symptoms do you have?", or "What is uncomfortable?"; the patient's content is generally questions about the suspected disease, for example "Is it a cold?" or "Is it XX disease?"; and the doctor's content also usually contains statements with symptoms and medicines, for example "You have a cold, you can take XX medicine." Both the doctor's sentence content and the patient's sentence content therefore have relatively significant language features, so the doctor content model and the patient content model can be trained from the separated case information.
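As one concrete (and deliberately simplified) reading of training such content models, the sketch below fits a single combined classifier over separated doctor and patient sentences. TF-IDF plus naive Bayes is a stand-in for whatever machine-learning or probability-statistics method an implementation actually chooses, and the tiny inline corpus only echoes the examples above.

```python
# Train a toy "preset model" that labels a sentence as doctor or patient content.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Illustrative training data; a real system would use large separated corpora.
doctor_texts = ["How do you feel?", "What symptoms do you have?",
                "You have a cold, you can take XX medicine."]
patient_texts = ["Is it a cold?", "Is it XX disease?",
                 "I am not comfortable with XXX."]

texts = doctor_texts + patient_texts
labels = ["doctor"] * len(doctor_texts) + ["patient"] * len(patient_texts)

preset_model = make_pipeline(TfidfVectorizer(), MultinomialNB())
preset_model.fit(texts, labels)
```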
  • In the embodiment of the present invention, performing feature recognition on the text recognition result and separating the first text data and the second text data according to the language features includes: dividing the text recognition result into corresponding text segments; identifying the text segments using a preset model to determine the language feature of each text segment, where the language features include a first language feature and a second language feature; generating the first text data from the text segments having the first language feature; and generating the second text data from the text segments having the second language feature.
  • Specifically, the text recognition result may first be divided, for example into sentences according to Chinese sentence features; the text recognition result may also be divided into multiple text segments in other manners.
  • Each text segment is then input into the preset model in order, and the preset model identifies the language feature of each text segment.
  • The preset model can also be set to assign each text segment to a user based on the recognized language feature. With the target user's language feature as the first language feature and the other users' language feature as the second language feature, the preset model can determine whether a text segment has the first language feature or the second language feature. The text segments having the first language feature can then be assembled into the first text data according to the division order of the text segments, and the text segments having the second language feature into the second text data.
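Putting step 304 together, the sketch below divides a text recognition result into sentence segments and routes each segment by its predicted language feature. It assumes the preset_model trained in the previous sketch; the regular-expression split on Chinese and Western sentence punctuation is a simplification of the sentence-division rule.

```python
# Separate a transcript into first (doctor) and second (patient) text data.
import re

def separate_text(recognition_result: str, preset_model):
    segments = [s.strip() for s in re.split(r"[。？！.?!]\s*", recognition_result)
                if s.strip()]
    first_text, second_text = [], []
    for segment in segments:                    # preserve the division order
        label = preset_model.predict([segment])[0]
        (first_text if label == "doctor" else second_text).append(segment)
    return first_text, second_text
```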
  • Step 306: Obtain consultation information according to the first text data and the second text data.
  • Step 308: Analyze the consultation information to obtain a corresponding analysis result, where the analysis result is related to disease diagnosis.
  • each text segment in the first text data and each text segment in the second text data may be sorted according to a corresponding order, thereby obtaining corresponding consultation information.
  • the consultation information can record the doctor's question in a consultation and the answer of the corresponding patient (family), as well as the doctor's diagnosis, medical advice and other information.
  • The embodiment of the present invention can also analyze the consultation information as required and obtain a corresponding analysis result; since the consultation is related to disease diagnosis, the analysis result is also related to disease diagnosis and is determined according to the analysis requirements.
  • In this way, the doctor's communication with the patient can be recorded as audio, and the doctor's and the patient's statements can then be separated, distinguished, and organized, and provided to the doctor in dialogue form as a medical case, which can effectively reduce the time doctors spend on medical records.
  • Referring to FIG. 4, a structural block diagram of an embodiment of a voice-based data processing apparatus of the present invention is shown, which may specifically include the following modules:
  • The data acquisition module 402 is configured to obtain consultation process data, where the consultation process data is determined according to voice data collected during the consultation process.
  • The text recognition module 404 is configured to perform recognition according to the consultation process data and acquire corresponding first text data and second text data, where the first text data belongs to a target user, and the second text data belongs to users other than the target user.
  • the information determining module 406 is configured to obtain the consultation information according to the first text data and the second text data.
  • In the consultation process, at least two users communicate and interact: one user is a doctor, and the other users are patients, family members, and the like.
  • The doctor can be taken as the target user, so the first text data is the consultation text data corresponding to the doctor, and the text data of at least one other user is taken as the second text data, that is, the consultation text data corresponding to the patients and family members. Since a consultation is usually a question-and-answer process, the first text data and the second text data may each be composed of multiple text segments, so the consultation information can be obtained based on the time of each text segment and the corresponding user.
  • the patient information can also be obtained in combination with the outpatient records of the hospital, thereby distinguishing different patients and the like in the consultation information.
  • In the embodiment of the present invention, the first text data and the second text data can be identified from the consultation process data according to different users, where the first text data belongs to a target user and the second text data belongs to users other than the target user; that is, the doctor's and the patient's sentences during the consultation can be distinguished automatically. The consultation information is then obtained according to the first text data and the second text data, so the consultation process can be recorded completely, medical records and other content can be organized automatically, and the time for organizing consultation records is saved.
  • Referring to FIG. 5, a structural block diagram of another embodiment of the voice-based data processing apparatus of the present invention is shown, which may specifically include the following modules:
  • The consultation process data includes the voice data and/or a text recognition result obtained by recognizing the voice data.
  • In an optional embodiment, the consultation process data is voice data, and the text recognition module 404 may include:
  • The separation module 40402 is configured to separate first voice data and second voice data from the voice data according to voiceprint features.
  • the voice recognition module 40404 is configured to separately perform voice recognition on the first voice data and the second voice data, and acquire corresponding first text data and second text data.
  • Optionally, the separation module 40402 is configured to divide the voice data into multiple voice segments, and determine the first voice data and the second voice data from the voice segments according to the voiceprint features.
  • Optionally, the separation module 40402 is configured to match each voice segment against a reference voiceprint feature, where the reference voiceprint feature is the voiceprint feature of the target user; collect the voice segments that match the reference voiceprint feature to obtain the corresponding first voice data; and collect the voice segments that do not match the reference voiceprint feature to obtain the corresponding second voice data.
  • In the embodiment of the present invention, before the voice data of the consultation process is collected, a piece of the doctor's (target user's) voice may first be collected as reference data, so that the doctor's voiceprint feature, that is, the reference voiceprint feature, can be identified from the reference data.
  • A voice recognition model may also be set up: after the voice data is input into the model, the voice segments matching the reference voiceprint feature can be separated from the voice segments with other voiceprint features, thereby obtaining the target user's voice segments and the other users' voice segments.
  • Medical record information usually involves only one doctor, while there may be more than one patient, so a correspondingly large number of medical samples can be obtained for a specific doctor in the above manner.
  • Optionally, the separation module 40402 is configured to identify the voiceprint feature of each voice segment; count the number of voice segments corresponding to each voiceprint feature; determine the voiceprint feature with the largest number of voice segments and generate the first voice data from the voice segments corresponding to that voiceprint feature, where the voiceprint feature with the largest count is the voiceprint feature of the target user; and generate the second voice data from the voice segments that do not belong to the first voice data.
  • The consultation process data may be the recorded data of a doctor's multiple outpatient visits. In such recordings the doctor usually spends the most time communicating with different patients and their families, that is, the doctor (target user) has the largest number of voice segments in the voice data, so the target user and the other users can be distinguished by the number of voice segments corresponding to each user, obtaining the first voice data and the second voice data.
  • a voice segment may include voiceprint features of a plurality of users.
  • The separation module 40402 may perform the following processing when multiple voiceprint features are identified in one voice segment, appearing at different times: if the voiceprint features all belong to other users, the voice segment may be added to the second voice data; if they include both the target user's voiceprint feature and other users' voiceprint features, the voice segment may be subdivided into sub-segments that are added to the corresponding voice data, or the segment may be assigned as required, for example classified with the target user's voice segments into the first voice data, classified with the other users' voice segments into the second voice data, or added to both users' voice data separately.
  • Optionally, the voice recognition module 40404 is configured to perform speech recognition on each voice segment in the first voice data and generate the first text data from the recognized text segments, and to perform speech recognition on each voice segment in the second voice data and generate the second text data from the recognized text segments.
  • Optionally, the information determining module 406 is configured to sort the text segments in the first text data and the second text data according to the chronological order of their corresponding voice segments, to obtain the consultation information.
  • In another optional embodiment, the consultation process data is a text recognition result obtained by recognizing the voice data, and the text recognition module 404 is configured to perform feature recognition on the text recognition result and separate the first text data and the second text data according to language features.
  • the text recognition module 404 includes:
  • The segment dividing module 40406 is configured to divide the text recognition result to obtain corresponding text segments.
  • The segment identification module 40408 is configured to identify the text segments using a preset model and determine the language feature of each text segment, where the language features include a first language feature and a second language feature.
  • The embodiment of the present invention can obtain a large number of separated medical texts as training data, where the separated medical texts identify the consultation content of the target user and of the other users, such as historically obtained text information labeled in this way. The doctor content data (the target user's first text data) and the patient content data (the other users' second text data) may be trained separately to obtain a doctor content model and a patient content model; of course, the two models may also be combined into one preset model, based on which the doctor's statements and the patient's statements are recognized. For example, in case information obtained from consultations, the doctor's content is generally questions with symptom vocabulary, such as "How do you feel?", "What symptoms do you have?", or "What is uncomfortable?", while the patient's content generally concerns the suspected disease.
  • the text generating module 40410 is configured to generate the first text data by using the text segment having the first language feature, and generate the second text data by using the text segment having the second language feature.
  • Optionally, the apparatus further includes an analysis module 408, configured to analyze the consultation information and obtain a corresponding analysis result, where the analysis result is related to disease diagnosis.
  • each text segment in the first text data and each text segment in the second text data may be sorted according to a corresponding order, thereby obtaining corresponding consultation information.
  • the consultation information can record the doctor's question in a consultation and the answer of the corresponding patient (family), as well as the doctor's diagnosis, medical advice and other information.
  • The embodiment of the present invention can also analyze the consultation information as required and obtain a corresponding analysis result; since the consultation is related to disease diagnosis, the analysis result is also related to disease diagnosis and is determined according to the analysis requirements.
  • In this way, the doctor's communication with the patient can be recorded as audio, and the doctor's and the patient's statements can then be separated, distinguished, and organized, and provided to the doctor in dialogue form as a medical case, which can effectively reduce the time doctors spend on medical records.
  • Since the apparatus embodiments are basically similar to the method embodiments, their description is relatively brief; for relevant details, refer to the description of the method embodiments.
  • FIG. 6 is a structural block diagram of an electronic device 600 for voice-based data processing, according to an exemplary embodiment.
  • The electronic device 600 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like, or a server-side device such as a server.
  • The electronic device 600 can include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
  • Processing component 602 typically controls the overall operation of electronic device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 602 can include one or more processors 620 to execute instructions to perform all or part of the steps described above.
  • In addition, processing component 602 can include one or more modules to facilitate interaction between processing component 602 and other components. For example, processing component 602 can include a multimedia module to facilitate interaction between multimedia component 608 and processing component 602.
  • Memory 604 is configured to store various types of data to support operation at device 600. Examples of such data include instructions for any application or method operating on electronic device 600, contact data, phone book data, messages, pictures, videos, and the like.
  • the memory 604 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Disk or Optical Disk.
  • Power component 606 provides power to the various components of electronic device 600. Power component 606 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for electronic device 600.
  • The multimedia component 608 includes a screen that provides an output interface between the electronic device 600 and the user.
  • the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
  • the multimedia component 608 includes a front camera and/or a rear camera. When the electronic device 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 610 is configured to output and/or input an audio signal.
  • the audio component 610 includes a microphone (MIC) that is configured to receive an external audio signal when the electronic device 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
  • the received audio signal may be further stored in memory 604 or transmitted via communication component 616.
  • audio component 610 also includes a speaker for outputting an audio signal.
  • The I/O interface 612 provides an interface between processing component 602 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor assembly 614 includes one or more sensors for providing electronic device 600 with a status assessment of various aspects.
  • Sensor component 614 can detect an open/closed state of device 600 and the relative positioning of components, such as the display and keypad of electronic device 600; sensor component 614 can also detect a change in position of electronic device 600 or a component of electronic device 600, the presence or absence of user contact with electronic device 600, the orientation or acceleration/deceleration of electronic device 600, and temperature changes of electronic device 600.
  • Sensor assembly 614 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor assembly 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor component 614 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 616 is configured to facilitate wired or wireless communication between electronic device 600 and other devices.
  • The electronic device 600 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof.
  • In an exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • In an exemplary embodiment, electronic device 600 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
  • a non-transitory computer readable storage medium comprising instructions is also provided, such as the memory 604 comprising instructions executable by the processor 620 of the electronic device 600 to perform the above method.
  • the non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
  • a non-transitory computer readable storage medium, when instructions in the storage medium are executed by a processor of an electronic device, enables the electronic device to perform a voice-based data processing method, the method comprising: obtaining consultation process data, where the consultation process data is determined according to voice data collected during the consultation process; performing recognition according to the consultation process data, and acquiring corresponding first text data and second text data, where the first text data belongs to a target user and the second text data belongs to users other than the target user; and obtaining consultation information according to the first text data and the second text data.
  • the consultation process data includes the voice data and/or a text recognition result obtained by recognizing the voice data.
  • the consultation process data is voice data; performing recognition according to the consultation process data and acquiring the corresponding first text data and second text data includes: separating first voice data and second voice data from the voice data according to voiceprint features; and performing voice recognition on the first voice data and the second voice data respectively to acquire the corresponding first text data and second text data.
  • separating the first voice data and the second voice data from the voice data according to the voiceprint features includes: dividing the voice data into multiple voice segments; and determining the first voice data and the second voice data from those voice segments according to the voiceprint features. A sketch of the segmentation step follows.
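How the voice data is divided into segments is not prescribed above; one common realization is frame-energy voice activity detection. The sketch below splits a mono recording on silences; the frame length, energy threshold, and minimum silence gap are illustrative assumptions, not values taken from this disclosure.

```python
# Hypothetical segment splitter: frame-energy voice activity detection.
# All thresholds are illustrative assumptions, not part of the disclosure.
import numpy as np

def split_into_segments(samples: np.ndarray, rate: int, frame_ms: int = 30,
                        energy_threshold: float = 0.01,
                        min_silence_frames: int = 10) -> list:
    """Split mono audio (floats in [-1, 1]) into voiced segments."""
    frame_len = int(rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    voiced = [np.mean(samples[i * frame_len:(i + 1) * frame_len] ** 2)
              > energy_threshold for i in range(n_frames)]
    segments, start, silence = [], None, 0
    for i, v in enumerate(voiced):
        if v:
            if start is None:
                start = i       # a new voiced segment begins
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence_frames:  # long pause: close the segment
                segments.append(samples[start * frame_len:(i - silence + 1) * frame_len])
                start, silence = None, 0
    if start is not None:       # trailing voiced audio
        segments.append(samples[start * frame_len:])
    return segments
```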
  • determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: matching each voice segment against a reference voiceprint feature, where the reference voiceprint feature is a voiceprint feature of the target user; collecting the voice segments that match the reference voiceprint feature to obtain the corresponding first voice data; and collecting the voice segments that do not match the reference voiceprint feature to obtain the corresponding second voice data, as sketched below.
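A minimal sketch of this reference-matching variant, assuming a speaker-embedding function `embed_segment` (for example an i-vector or d-vector extractor) and a cosine-similarity threshold; both are assumptions, since the disclosure does not name a particular voiceprint feature or matching rule.

```python
# Hypothetical reference-voiceprint matcher. `embed_segment` and the
# similarity threshold are assumptions; any speaker-embedding model fits.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def separate_by_reference(segments, reference_embedding, embed_segment,
                          threshold: float = 0.7):
    first, second = [], []
    for seg in segments:
        if cosine(embed_segment(seg), reference_embedding) >= threshold:
            first.append(seg)   # matches the target user's voiceprint
        else:
            second.append(seg)  # any other speaker
    return first, second
```

In a consultation setting the enrolled reference would typically be the doctor's voiceprint, so the first voice data collects the doctor's speech and the second voice data everything else.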
  • alternatively, determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: identifying the voiceprint feature of each voice segment; counting the number of voice segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of voice segments; generating the first voice data from the voice segments corresponding to that voiceprint feature; and generating the second voice data from the voice segments not belonging to the first voice data. A sketch of this majority heuristic follows.
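When no enrolled voiceprint is available, the counting variant can be realized by clustering segment embeddings and taking the largest cluster as the target user. The sketch below uses agglomerative clustering from scikit-learn (version >= 1.2 for the `metric` argument); the clustering method and distance threshold are assumptions, not the disclosure's mandated technique.

```python
# Hypothetical majority-speaker separation via embedding clustering.
from collections import Counter
import numpy as np
from sklearn.cluster import AgglomerativeClustering  # scikit-learn >= 1.2

def separate_by_majority(segments, embed_segment, distance_threshold: float = 0.5):
    X = np.stack([embed_segment(s) for s in segments])
    labels = AgglomerativeClustering(
        n_clusters=None, distance_threshold=distance_threshold,
        metric="cosine", linkage="average").fit_predict(X)
    target = Counter(labels).most_common(1)[0][0]  # voiceprint with most segments
    first = [s for s, l in zip(segments, labels) if l == target]
    second = [s for s, l in zip(segments, labels) if l != target]
    return first, second
```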
  • performing voice recognition on the first voice data and the second voice data to obtain the corresponding first text data and second text data includes: performing voice recognition on each voice segment in the first voice data and generating the first text data from the recognized text segments; and performing voice recognition on each voice segment in the second voice data and generating the second text data from the recognized text segments (see the sketch below).
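Per-segment recognition can then be stitched into the two text streams. In this sketch `recognize` stands in for any speech-to-text call (a cloud API or a local model) and is an assumption.

```python
# Hypothetical per-segment transcription; `recognize` is any ASR function.
def transcribe_streams(first_segments, second_segments, recognize):
    first_text = " ".join(recognize(seg) for seg in first_segments)
    second_text = " ".join(recognize(seg) for seg in second_segments)
    return first_text, second_text
```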
  • the consultation process data is a text recognition result obtained by recognizing the voice data; performing recognition according to the consultation process data and acquiring the corresponding first text data and second text data includes: performing feature recognition on the text recognition result, and separating the first text data and the second text data according to language features.
  • performing feature recognition on the text recognition result and separating the first text data and the second text data according to the language features includes: dividing the text recognition result to obtain corresponding text segments; identifying the text segments with a preset model to determine the language feature of each text segment, the language features comprising a first language feature and a second language feature; generating the first text data from the text segments having the first language feature; and generating the second text data from the text segments having the second language feature. A sketch of this text-only path follows.
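The "preset model" is left open above; any text classifier trained to tell the two roles apart would serve. The sketch below uses a bag-of-words Naive Bayes classifier as a stand-in, with a tiny English training set that is purely illustrative.

```python
# Hypothetical language-feature classifier for the text-only path.
# The model choice and training sentences are illustrative assumptions.
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_sentences = [
    "How long have you had the cough?",   # first language feature (doctor)
    "Any fever or chest pain?",           # first language feature (doctor)
    "It started about three days ago.",   # second language feature (patient)
    "My chest hurts when I breathe.",     # second language feature (patient)
]
train_labels = ["first", "first", "second", "second"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_sentences, train_labels)

def separate_text(recognition_result: str):
    # Divide the recognition result into sentence-level text segments.
    segments = [s.strip() for s in re.split(r"[.?!]", recognition_result) if s.strip()]
    labels = model.predict(segments)
    first = [s for s, l in zip(segments, labels) if l == "first"]
    second = [s for s, l in zip(segments, labels) if l == "second"]
    return first, second
```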
  • the method further includes: analyzing the consultation information to obtain a corresponding analysis result, where the analysis result is related to disease diagnosis.
  • FIG. 7 is a schematic structural diagram of an electronic device 700 for voice-based data processing according to another exemplary embodiment of the present invention.
  • the electronic device 700 can be a server, which can vary considerably in configuration or performance, and can include one or more central processing units (CPUs) 722 (e.g., one or more processors), memory 732, and one or more storage media 730 (e.g., one or more mass storage devices) storing an application 742 or data 744.
  • the memory 732 and the storage medium 730 may be short-term storage or persistent storage.
  • the program stored on the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations on the server.
  • the central processing unit 722 can be configured to communicate with the storage medium 730 and execute, on the server, the series of instruction operations stored in the storage medium 730.
  • the server may also include one or more power sources 726, one or more wired or wireless network interfaces 750, one or more input/output interfaces 758, one or more keyboards 756, and/or one or more operating systems 741, for example Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
  • the server is configured to execute, by the one or more central processing units 722, one or more programs including instructions for: obtaining consultation process data, where the consultation process data is determined according to voice data collected during the consultation process; performing recognition according to the consultation process data, and acquiring corresponding first text data and second text data, where the first text data belongs to a target user and the second text data belongs to users other than the target user; and obtaining consultation information according to the first text data and the second text data.
  • the consultation process data includes the voice data and/or a text recognition result obtained by recognizing the voice data.
  • the consultation process data is voice data; performing recognition according to the consultation process data and acquiring the corresponding first text data and second text data includes: separating first voice data and second voice data from the voice data according to voiceprint features; and performing voice recognition on the first voice data and the second voice data respectively to acquire the corresponding first text data and second text data.
  • separating the first voice data and the second voice data from the voice data according to the voiceprint features includes: dividing the voice data into multiple voice segments; and determining the first voice data and the second voice data from those voice segments according to the voiceprint features.
  • determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: matching each voice segment against a reference voiceprint feature, where the reference voiceprint feature is a voiceprint feature of the target user; collecting the voice segments that match the reference voiceprint feature to obtain the corresponding first voice data; and collecting the voice segments that do not match the reference voiceprint feature to obtain the corresponding second voice data.
  • alternatively, determining the first voice data and the second voice data from the voice segments according to the voiceprint features includes: identifying the voiceprint feature of each voice segment; counting the number of voice segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of voice segments; generating the first voice data from the voice segments corresponding to that voiceprint feature; and generating the second voice data from the voice segments not belonging to the first voice data.
  • performing voice recognition on the first voice data and the second voice data to obtain the corresponding first text data and second text data includes: performing voice recognition on each voice segment in the first voice data and generating the first text data from the recognized text segments; and performing voice recognition on each voice segment in the second voice data and generating the second text data from the recognized text segments.
  • the consultation process data is a text recognition result obtained by recognizing the voice data; performing recognition according to the consultation process data and acquiring the corresponding first text data and second text data includes: performing feature recognition on the text recognition result, and separating the first text data and the second text data according to language features.
  • performing feature recognition on the text recognition result and separating the first text data and the second text data according to the language features includes: dividing the text recognition result to obtain corresponding text segments; identifying the text segments with a preset model to determine the language feature of each text segment, the language features comprising a first language feature and a second language feature; generating the first text data from the text segments having the first language feature; and generating the second text data from the text segments having the second language feature.
  • the one or more programs also include instructions for performing the following operation: analyzing the consultation information to obtain a corresponding analysis result, where the analysis result is related to disease diagnosis.
  • the embodiments of the invention may be provided as a method, an apparatus, or a computer program product.
  • embodiments of the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware.
  • embodiments of the invention may take the form of a computer program product embodied on one or more computer usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • Embodiments of the invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general purpose computer, a special purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a voice data processing method and apparatus, and an electronic device, used to fully record a consultation process. The method comprises: acquiring consultation process data, the consultation process data being determined according to voice data collected during a consultation process (102); performing recognition according to the consultation process data to acquire corresponding first text data and second text data, the first text data belonging to a target user and the second text data belonging to a user other than the target user (104); and obtaining consultation information according to the first text data and the second text data (106). By means of the method, the apparatus, and the electronic device, the statements of the doctor and the patient during a consultation can be distinguished automatically, and the consultation process is fully recorded and automatically organized into content such as a medical record, saving the time otherwise spent organizing consultation records.
PCT/CN2018/082702 2017-05-26 2018-04-11 Procédé et appareil de traitement de données vocales, et dispositif électronique WO2018214663A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710384412.3A CN108962253A (zh) 2017-05-26 2017-05-26 一种基于语音的数据处理方法、装置和电子设备
CN201710384412.3 2017-05-26

Publications (1)

Publication Number Publication Date
WO2018214663A1 true WO2018214663A1 (fr) 2018-11-29

Family

ID=64395285

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/082702 WO2018214663A1 (fr) 2017-05-26 2018-04-11 Procédé et appareil de traitement de données vocales, et dispositif électronique

Country Status (2)

Country Link
CN (1) CN108962253A (fr)
WO (1) WO2018214663A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582708A (zh) * 2020-04-30 2020-08-25 北京声智科技有限公司 医疗信息的检测方法、系统、电子设备及计算机可读存储介质
CN112118415B (zh) * 2020-09-18 2023-02-10 瑞然(天津)科技有限公司 远程诊疗方法、装置和患者侧终端、医生侧终端
CN114520062B (zh) * 2022-04-20 2022-07-22 杭州马兰头医学科技有限公司 一种基于ai和信创的医疗云通信系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268279A (zh) * 2014-10-16 2015-01-07 魔方天空科技(北京)有限公司 语料数据的查询方法和装置
CN104427292A (zh) * 2013-08-22 2015-03-18 中兴通讯股份有限公司 会议纪要的提取方法及装置
CN105469790A (zh) * 2014-08-29 2016-04-06 上海联影医疗科技有限公司 会诊信息处理方法及装置
CN106328124A (zh) * 2016-08-24 2017-01-11 安徽咪鼠科技有限公司 一种基于用户行为特征的语音识别方法
CN106326640A (zh) * 2016-08-12 2017-01-11 上海交通大学医学院附属瑞金医院卢湾分院 一种医疗语音控制系统及其控制方法


Also Published As

Publication number Publication date
CN108962253A (zh) 2018-12-07

Similar Documents

Publication Publication Date Title
CN108899037B (zh) 动物声纹特征提取方法、装置及电子设备
US20200075011A1 (en) Sign Language Information Processing Method and Apparatus, Electronic Device and Readable Storage Medium
US10270736B2 (en) Account adding method, terminal, server, and computer storage medium
WO2017028416A1 (fr) Procédé d'apprentissage de classificateur, procédé de reconnaissance de type, et appareil
WO2018120447A1 (fr) Procédé, dispositif et équipement de traitement d'informations de dossier médical
JP7166294B2 (ja) オーディオ処理方法、装置及び記憶媒体
WO2018214663A1 (fr) Procédé et appareil de traitement de données vocales, et dispositif électronique
CN109558599B (zh) 一种转换方法、装置和电子设备
CN106202150A (zh) 信息显示方法及装置
CN107168958A (zh) 一种翻译方法及装置
CN109585001A (zh) 一种数据分析方法、装置、电子设备和存储介质
CN108628819A (zh) 处理方法和装置、用于处理的装置
CN105447109A (zh) 关键字词搜索方法及装置
CN105550643A (zh) 医学术语识别方法及装置
WO2021208531A1 (fr) Procédé et appareil de traitement de la parole, et dispositif électronique
CN109002184A (zh) 一种输入法候选词的联想方法和装置
CN112836058A (zh) 医疗知识图谱建立方法及装置、医疗知识图谱查询方法及装置
CN111898382A (zh) 一种命名实体识别方法、装置和用于命名实体识别的装置
CN108665889A (zh) 语音信号端点检测方法、装置、设备及存储介质
JP2022510660A (ja) データ処理方法及びその装置、電子機器、並びに記憶媒体
CN116166843A (zh) 基于细粒度感知的文本视频跨模态检索方法和装置
WO2018018912A1 (fr) Procédé et appareil de recherche, et dispositif électronique
CN112133295B (zh) 语音识别方法、装置及存储介质
CN110634570A (zh) 一种诊断仿真方法及相关装置
CN109785941B (zh) 一种医生的推荐方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18806292

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18806292

Country of ref document: EP

Kind code of ref document: A1