CN111737992B - Three-way text information processing method, computer equipment and storage medium - Google Patents

Info

Publication number
CN111737992B
Authority
CN
China
Prior art keywords
text information
professional
terminal
sequence
way
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010638463.6A
Other languages
Chinese (zh)
Other versions
CN111737992A (en)
Inventor
周赞和
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Heyu Health Technology Co ltd
Original Assignee
Heyu Health Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Heyu Health Technology Co ltd
Priority to CN202010638463.6A
Publication of CN111737992A
Application granted
Publication of CN111737992B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to the technical field of text processing and discloses a three-way text information processing method, computer equipment and a storage medium. Common words and professional words in text information are labeled separately; the text information is divided into a first sequence consisting of a plurality of ordinary sentence segments and a plurality of professional sentence segments; a preset replacement sentence segment group database is called; some of the professional sentence segments in the first sequence are replaced with short sentences from the corresponding replacement sentence segment groups to obtain a plurality of non-repeated second sequences, each composed of common vocabulary and a small amount of professional vocabulary; all of the professional sentence segments in the first sequence are replaced with short sentences from the corresponding replacement sentence segment groups to obtain a plurality of non-repeated third sequences composed entirely of common vocabulary. The text information is thereby processed into three text information forms for audience groups at three different professional levels, so that each audience group can understand the text information while reading fluency and efficiency are improved.

Description

Three-way text information processing method, computer equipment and storage medium
Technical Field
The present application relates to the field of text processing technologies, and in particular, to a three-way text information processing method, a computer device, and a storage medium.
Background
With the rapid development of computer and network technology, text information processing technology has become practical. Some domain-specific text information is so specialized that only practitioners in the domain can fully understand it; "semi-professionals" who have some expertise in the domain may understand only part of it, and people without any expertise in the domain may not understand it at all. Current text information processing technology is generally based on simple replacement of professional words. The result of such replacement is hard to read, wastes resources when serving the needs of semi-professionals, and is inefficient.
Disclosure of Invention
The application provides a three-way text information processing method, which comprises the following steps:
S1, labeling the common words and the professional words in the text information separately;
S2, dividing the text information into a first sequence consisting of a plurality of ordinary sentence segments and a plurality of professional sentence segments, wherein the ordinary sentence segments contain no professional vocabulary, each professional sentence segment contains at least one and at most three professional words, and each professional sentence segment contains at least five words;
S3, calling a preset replacement sentence segment group database, wherein the database records a plurality of replacement sentence segment groups, each replacement sentence segment group consists of one professional sentence segment and a plurality of short sentences, and the professional sentence segment and the short sentences in the same replacement sentence segment group all have the same meaning;
S4, replacing some of the professional sentence segments in the first sequence with short sentences from the corresponding replacement sentence segment groups, thereby obtaining a plurality of non-repeated second sequences, each composed of common vocabulary and a small amount of professional vocabulary;
S5, calculating a second similarity between the first sequence and each second sequence according to a preset similarity calculation method, thereby obtaining a plurality of second similarity values respectively corresponding to the plurality of second sequences;
S6, obtaining the largest of the plurality of second similarity values, and taking the second sequence corresponding to it as the second-way text information form of the three-way text information;
S7, replacing all of the professional sentence segments in the first sequence with short sentences from the corresponding replacement sentence segment groups, thereby obtaining a plurality of non-repeated third sequences composed entirely of common vocabulary;
S8, calculating the similarity between the first sequence and each third sequence according to a preset similarity calculation method, thereby obtaining a plurality of similarity values respectively corresponding to the plurality of non-repeated third sequences;
S9, obtaining the largest of the similarity values, and taking the third sequence corresponding to it as the third-way text information form of the three-way text information;
S10, recording the text information of step S1 as the first-way text information form of the three-way text information.
Further, the three-way text information processing method further comprises the following steps:
S11, receiving a request for inquiring text information sent by a terminal;
S12, judging whether the terminal is one of a first audience group terminal, a second audience group terminal and a third audience group terminal;
S13, if the terminal is a first audience group terminal, sending the three-way text information in the first-way text information form to the first audience group terminal;
S14, if the terminal is a second audience group terminal, sending the three-way text information in the second-way text information form to the second audience group terminal;
S15, if the terminal is a third audience group terminal, sending the three-way text information in the third-way text information form to the third audience group terminal.
Further, the step S4 of replacing some of the professional sentence segments in the first sequence with short sentences from the corresponding replacement sentence segment groups, thereby obtaining a plurality of non-repeated second sequences each composed of common vocabulary and a small amount of professional vocabulary, includes:
S41, sorting the professional vocabulary found in all text information of a preset text information base in ascending order to obtain an ascending list of professional vocabulary;
S42, obtaining the ranking of each professional word of the first sequence in the ascending list of professional vocabulary;
S43, performing a secondary ascending sort of the professional vocabulary in the first sequence according to its ranking in the ascending list of professional vocabulary, to obtain a secondary ascending list of the professional vocabulary in the first sequence;
S44, recording the professional words whose ranking in the secondary ascending list is greater than a preset value as vocabulary to be replaced, and replacing some of the professional sentence segments in the first sequence with short sentences from the corresponding replacement sentence segment groups, thereby obtaining a plurality of non-repeated second sequences each composed of common vocabulary and a small amount of professional vocabulary; wherein each of the replaced professional sentence segments contains at least one word to be replaced.
Further, the step S8 of calculating the similarity between the first sequence and the third sequence according to a preset similarity calculation method includes:
S81, mapping the first sequence and the third sequence into a first passage vector and a second passage vector respectively according to a preset vector mapping method;
S82, according to the formula:
[formula image not reproduced in the source text]
calculating the similarity Z between the first passage vector and the second passage vector, wherein P is the first passage vector, Pi is the i-th component of the first passage vector, T is the second passage vector, Ti is the i-th component of the second passage vector, and both vectors have n components.
Further, the step S11 of receiving the request for inquiring text information sent by the terminal further includes:
S111, judging whether requests for inquiring text information sent by two terminals are received within a preset time;
S112, if requests for inquiring text information sent by two terminals are received, judging whether the two terminals are respectively two of a first audience group terminal, a second audience group terminal and a third audience group terminal;
S113, if the two terminals are respectively two of a first audience group terminal, a second audience group terminal and a third audience group terminal, sending three-way text information in the corresponding forms to the two terminals at the same time, and requiring the two terminals to return reading opinions;
S114, receiving the reading opinions returned by the two terminals, and judging whether the reading opinions returned by the two terminals are both satisfied;
S115, if only one of the reading opinions returned by the two terminals is satisfied, constructing a communication channel between the two terminals so that the two terminals can ask and answer each other's questions about the text information.
Further, after the step S12 of judging whether the terminal is one of a first audience group terminal, a second audience group terminal and a third audience group terminal, the method includes:
S121, if it cannot be judged that the terminal is one of the first audience group terminal, the second audience group terminal and the third audience group terminal, making three copies of the three-way text information;
S122, converting the three copies into the first-way text information form, the second-way text information form and the third-way text information form respectively;
S123, simultaneously sending the three-way text information in the first-way text information form, in the second-way text information form and in the third-way text information form to the terminal.
The present application further provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method according to any one of the above when the processor executes the computer program.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method of any of the above.
According to the three-way text information processing method, the computer equipment and the storage medium, common words and professional words in the text information are labeled separately; the text information is divided into a first sequence consisting of a plurality of ordinary sentence segments and a plurality of professional sentence segments, wherein the ordinary sentence segments contain no professional vocabulary, each professional sentence segment contains at least one and at most three professional words, and each professional sentence segment contains at least five words; a preset replacement sentence segment group database is called, wherein the database records a plurality of replacement sentence segment groups, each replacement sentence segment group consists of one professional sentence segment and a plurality of short sentences, and the professional sentence segment and the short sentences in the same replacement sentence segment group all have the same meaning; some of the professional sentence segments in the first sequence are replaced with short sentences from the corresponding replacement sentence segment groups to obtain a plurality of non-repeated second sequences, each composed of common vocabulary and a small amount of professional vocabulary; a second similarity between the first sequence and each second sequence is calculated according to a preset similarity calculation method, giving a plurality of second similarity values respectively corresponding to the plurality of second sequences; the largest of the second similarity values is obtained, and the second sequence corresponding to it is taken as the second-way text information form of the three-way text information; all of the professional sentence segments in the first sequence are replaced with short sentences from the corresponding replacement sentence segment groups to obtain a plurality of non-repeated third sequences composed entirely of common vocabulary; the similarity between the first sequence and each third sequence is calculated according to a preset similarity calculation method, giving a plurality of similarity values respectively corresponding to the plurality of third sequences; the largest of the similarity values is obtained, and the third sequence corresponding to it is taken as the third-way text information form of the three-way text information; and the text information itself is recorded as the first-way text information form of the three-way text information. The text information is thereby processed into three text information forms for audience groups at three different professional levels, so that each audience group can understand the text information while reading fluency and efficiency are improved.
Drawings
Fig. 1 is a schematic flowchart of a three-way text information processing method according to an embodiment of the present application;
Fig. 2 is a block diagram illustrating the structure of a computer device according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Through the design of the three-way text information, the present application improves the utilization efficiency of professional text information in a specific field (such as education, medical treatment or finance). The three-way text information has three forms: a first-way text information form intended for a first audience group (professionals in the specific field), a second-way text information form intended for a second audience group (semi-professionals in the specific field), and a third-way text information form intended for a third audience group (laypersons in the specific field). As a result, not only the first audience group but also the second and third audience groups can make use of the text information, which improves the efficiency with which the text information is used as information data.
Referring to fig. 1, an embodiment of the present application provides a three-way text information processing method, including the following steps:
S1, labeling the common words and the professional words in the text information separately;
S2, dividing the text information into a first sequence consisting of a plurality of ordinary sentence segments and a plurality of professional sentence segments, wherein the ordinary sentence segments contain no professional vocabulary, each professional sentence segment contains at least one and at most three professional words, and each professional sentence segment contains at least five words;
S3, calling a preset replacement sentence segment group database, wherein the database records a plurality of replacement sentence segment groups, each replacement sentence segment group consists of one professional sentence segment and a plurality of short sentences, and the professional sentence segment and the short sentences in the same replacement sentence segment group all have the same meaning;
S4, replacing some of the professional sentence segments in the first sequence with short sentences from the corresponding replacement sentence segment groups, thereby obtaining a plurality of non-repeated second sequences, each composed of common vocabulary and a small amount of professional vocabulary;
S5, calculating a second similarity between the first sequence and each second sequence according to a preset similarity calculation method, thereby obtaining a plurality of second similarity values respectively corresponding to the plurality of second sequences;
S6, obtaining the largest of the plurality of second similarity values, and taking the second sequence corresponding to it as the second-way text information form of the three-way text information;
S7, replacing all of the professional sentence segments in the first sequence with short sentences from the corresponding replacement sentence segment groups, thereby obtaining a plurality of non-repeated third sequences composed entirely of common vocabulary;
S8, calculating the similarity between the first sequence and each third sequence according to a preset similarity calculation method, thereby obtaining a plurality of similarity values respectively corresponding to the plurality of non-repeated third sequences;
S9, obtaining the largest of the similarity values, and taking the third sequence corresponding to it as the third-way text information form of the three-way text information;
S10, recording the text information of step S1 as the first-way text information form of the three-way text information.
As described above, the text information is divided into a first sequence consisting of a plurality of ordinary sentence segments and a plurality of professional sentence segments, and the professional sentence segments are then replaced with short sentences, rather than the professional vocabulary simply being replaced with its defining words and phrases. This is a feature of the present application. Simply replacing the professional vocabulary with its definitions does not make for easy reading. The present application instead uses professional sentence segments, each containing at least one and at most three professional words and at least five words in total. First, this makes the resulting text read smoothly. Second, because the ways in which a professional word can be collocated with other words are very limited, the number of professional sentence segments obtained in this way is not large, which makes it feasible to build the replacement sentence segment group database. (If professional words could be collocated in a great many ways, the number of replacement sentence segment groups would grow exponentially and the database could not be built.) Therefore, only in the particular setting of the present application can professional sentence segments be used to generate the second-way and third-way text information forms. Through the above steps, professional text information is converted into forms that each audience group can understand and read efficiently, which improves reading fluency and efficiency.
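The constraints that step S2 places on sentence segments can be expressed as a small check. The following is a sketch only: the function name `classify_segment` and the representation of a segment as a list of words are assumptions made for illustration, and how the text is actually cut into segments is not specified here.

```python
def classify_segment(tokens, professional_vocab):
    """Classify one sentence segment according to the constraints of step S2.

    tokens: list of words in the segment (tokenization is assumed done elsewhere).
    professional_vocab: set of words labeled as professional in step S1.
    """
    n_professional = sum(1 for t in tokens if t in professional_vocab)
    if n_professional == 0:
        # Ordinary sentence segments contain no professional vocabulary.
        return "ordinary"
    if 1 <= n_professional <= 3 and len(tokens) >= 5:
        # Professional segments: one to three professional words, at least five words.
        return "professional"
    # Anything else violates S2 and would need to be re-segmented.
    return "invalid"


vocab = {"aspirin", "angioplasty"}
print(classify_segment("the patient was given aspirin daily".split(), vocab))
print(classify_segment("the patient went home".split(), vocab))
```

A segment reported as "invalid" (for example, a two-word fragment containing a professional word) signals that the segment boundaries should be moved until the constraints hold.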
The technical solution of the present application can be applied to text information processing in a specific field. The processing can also be understood as a translation of the text information, the result being forms that audience groups at three different professional levels can each read and understand efficiently. For example, for ancient Chinese texts, the text information is processed into the first-way text information form for university students majoring in ancient Chinese, into the second-way text information form for middle school students, and into the third-way text information form for primary school students. As another example, in the medical field, an electronic medical record written by a doctor can be understood by doctors; among non-doctors, the patient can often understand part of it, while ordinary people who do not have the disease described in the record may not understand it at all. These three kinds of readers correspond to three different audience groups. The medical and health information in the electronic medical record can be processed with the method of the present application to obtain three-way medical and health information, which ensures that different audience groups can read the electronic medical record fluently and efficiently.
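Steps S4 to S9 amount to enumerating candidate sequences and keeping the one most similar to the first sequence. The following minimal Python sketch illustrates that selection under stated assumptions: the replacement sentence segment group database is modeled as a plain dictionary, the names `candidates` and `best_form` are hypothetical, and `token_overlap` is only a placeholder for the preset similarity calculation method, not the formula of the application.

```python
from itertools import product


def token_overlap(seq_a, seq_b):
    """Placeholder similarity (assumption): Jaccard overlap of the word sets."""
    a = set(" ".join(seq_a).split())
    b = set(" ".join(seq_b).split())
    return len(a & b) / len(a | b) if (a | b) else 1.0


def candidates(first_seq, db, replace_idx):
    """Yield each non-repeated sequence obtained by replacing the chosen segments.

    db maps a professional sentence segment to its list of equivalent short sentences.
    replace_idx is the set of positions in first_seq that should be replaced.
    """
    options = [db[seg] if i in replace_idx else [seg]
               for i, seg in enumerate(first_seq)]
    seen = set()
    for combo in product(*options):
        if combo not in seen:  # keep the candidates non-repeated
            seen.add(combo)
            yield list(combo)


def best_form(first_seq, db, replace_idx, similarity=token_overlap):
    """Return the candidate with the largest similarity to the first sequence."""
    return max(candidates(first_seq, db, replace_idx),
               key=lambda cand: similarity(first_seq, cand))


first = ["the patient has", "acute myocardial infarction"]
db = {"acute myocardial infarction": ["a sudden heart attack",
                                      "an acute heart attack"]}
# Third-way form: replace every professional segment (S7 to S9).
all_professional = {i for i, seg in enumerate(first) if seg in db}
print(best_form(first, db, all_professional))
# prints ['the patient has', 'an acute heart attack']
```

For the second-way form, the same functions would be called with `replace_idx` restricted to the subset of professional segments chosen in step S4.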
In one embodiment, the three-way text information processing method further includes:
S11, receiving a request for inquiring text information sent by a terminal;
S12, judging whether the terminal is one of a first audience group terminal, a second audience group terminal and a third audience group terminal;
S13, if the terminal is a first audience group terminal, sending the three-way text information in the first-way text information form to the first audience group terminal;
S14, if the terminal is a second audience group terminal, sending the three-way text information in the second-way text information form to the second audience group terminal;
S15, if the terminal is a third audience group terminal, sending the three-way text information in the third-way text information form to the third audience group terminal.
As described above, a request for inquiring text information sent by a terminal is received, and it is judged whether the terminal is one of a first audience group terminal, a second audience group terminal and a third audience group terminal. The user of a terminal that requests text information is not necessarily a professional in the field, so the type of the terminal is determined and the corresponding text information form is sent, which ensures that the text can be read fluently. The type of the terminal may be determined in any feasible way, for example by verifying an electronic certificate of the terminal. If the terminal is a first audience group terminal, the three-way text information in the first-way text information form is sent to the first audience group terminal; if the terminal is a second audience group terminal, the three-way text information in the second-way text information form is sent to the second audience group terminal; and if the terminal is a third audience group terminal, the three-way text information in the third-way text information form is sent to the third audience group terminal.
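The dispatch in steps S12 to S15 can be sketched as a lookup keyed by the audience group of the requesting terminal. In this illustration the `Audience` enum, the `identify_terminal` helper and the `"audience"` field of an already verified certificate are all hypothetical; the text only says that the terminal type may be determined in any feasible way.

```python
from enum import Enum
from typing import Dict, Optional


class Audience(Enum):
    FIRST = "first"    # professionals
    SECOND = "second"  # semi-professionals
    THIRD = "third"    # laypersons


def identify_terminal(certificate: dict) -> Optional[Audience]:
    """S12 (assumed mechanism): read the audience group from a verified certificate."""
    try:
        return Audience(certificate.get("audience"))
    except ValueError:
        return None  # the terminal could not be classified


def form_for(certificate: dict, forms: Dict[Audience, str]) -> Optional[str]:
    """S13 to S15: pick the text information form that matches the terminal."""
    audience = identify_terminal(certificate)
    return None if audience is None else forms[audience]


forms = {
    Audience.FIRST: "original professional text",
    Audience.SECOND: "partly simplified text",
    Audience.THIRD: "fully simplified text",
}
print(form_for({"audience": "second"}, forms))
```

Returning `None` for an unclassified terminal leaves room for the caller to apply a fallback policy.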
In one embodiment, the step S4 of replacing some of the professional sentence segments in the first sequence with short sentences from the corresponding replacement sentence segment groups, thereby obtaining a plurality of non-repeated second sequences each composed of common vocabulary and a small amount of professional vocabulary, includes:
S41, sorting the professional vocabulary found in all text information of a preset text information base in ascending order to obtain an ascending list of professional vocabulary;
S42, obtaining the ranking of each professional word of the first sequence in the ascending list of professional vocabulary;
S43, performing a secondary ascending sort of the professional vocabulary in the first sequence according to its ranking in the ascending list of professional vocabulary, to obtain a secondary ascending list of the professional vocabulary in the first sequence;
S44, recording the professional words whose ranking in the secondary ascending list is greater than a preset value as vocabulary to be replaced, and replacing some of the professional sentence segments in the first sequence with short sentences from the corresponding replacement sentence segment groups, thereby obtaining a plurality of non-repeated second sequences each composed of common vocabulary and a small amount of professional vocabulary; wherein each of the replaced professional sentence segments contains at least one word to be replaced.
As described above, the second audience group has the particular characteristic that its members are already sufficiently familiar with some of the professional vocabulary. Therefore, when selecting the professional sentence segments to replace, commonly used professional words are ignored and only rarely used professional words are selected. This reduces the amount of calculation while still ensuring that the second audience group can read the text information efficiently.
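Steps S41 to S44 can be sketched as follows. The sort key is not stated in the text, so this example assumes that words are ranked from most to least frequent in the text information base, which makes a larger rank mean a rarer word and is consistent with the remark that only rarely used professional words are selected. The function names and the toy corpus are hypothetical.

```python
from collections import Counter


def corpus_ranking(corpus_tokens, professional_vocab):
    """S41 (assumed key): rank professional words, most frequent first (rank 1)."""
    counts = Counter(t for t in corpus_tokens if t in professional_vocab)
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ordered, start=1)}


def words_to_replace(first_seq_words, ranking, threshold):
    """S42 to S44: re-rank the words present in the first sequence, keep the rare tail."""
    present = sorted({w for w in first_seq_words if w in ranking}, key=ranking.get)
    # Positions in this secondary list above the preset value mark words to replace.
    return [w for pos, w in enumerate(present, start=1) if pos > threshold]


def segments_to_replace(first_seq, targets):
    """Indices of segments that contain at least one word to be replaced."""
    target_set = set(targets)
    return [i for i, seg in enumerate(first_seq) if target_set & set(seg.split())]


corpus = "fever fever fever stent stent angioplasty".split()
ranking = corpus_ranking(corpus, {"fever", "stent", "angioplasty"})
first_seq = ["the patient has a fever",
             "doctors performed an urgent angioplasty"]
words = [w for seg in first_seq for w in seg.split()]
targets = words_to_replace(words, ranking, threshold=1)
print(targets, segments_to_replace(first_seq, targets))
# prints ['angioplasty'] [1]
```

With a threshold of 1, the familiar word "fever" stays in place and only the segment containing the rarer "angioplasty" is passed on for replacement.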
In one embodiment, the step S8 of calculating the similarity between the first sequence and the third sequence according to a preset similarity calculation method includes:
S81, mapping the first sequence and the third sequence into a first passage vector and a second passage vector respectively according to a preset vector mapping method;
S82, according to the formula:
[formula image not reproduced in the source text]
calculating the similarity Z between the first passage vector and the second passage vector, wherein P is the first passage vector, Pi is the i-th component of the first passage vector, T is the second passage vector, Ti is the i-th component of the second passage vector, and both vectors have n components.
As described above, the vector mapping may be implemented in any feasible manner, for example by using a word vector library, by using a hierarchical convolutional neural network (e.g., modeling each sentence with a convolutional neural network and then applying convolution and pooling again over the sentences to obtain a passage vector), or by using a hierarchical recurrent neural network (e.g., modeling each sentence with a recurrent neural network and then modeling the sequence of sentences with a recurrent neural network to obtain a passage vector). The similarity Z between the first passage vector and the second passage vector is then calculated with the above formula, which takes into account both the numerical difference and the angular difference between the vectors.
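Because the formula itself is not reproduced in this text, the following Python sketch is only an illustrative stand-in and should not be read as the formula of the application. It shows one possible way to combine an angular term (cosine similarity) with a magnitude term (ratio of the vector norms), so that the score reflects both kinds of difference mentioned above. The function name `similarity` and the particular combination are assumptions.

```python
import math


def similarity(p, t):
    """Stand-in similarity between two vectors with the same number of components.

    Multiplies the cosine of the angle between p and t by the ratio of the
    smaller norm to the larger norm. This is an assumed formula for illustration.
    """
    if len(p) != len(t):
        raise ValueError("both vectors must have n components")
    dot = sum(pi * ti for pi, ti in zip(p, t))
    norm_p = math.sqrt(sum(pi * pi for pi in p))
    norm_t = math.sqrt(sum(ti * ti for ti in t))
    if norm_p == 0 or norm_t == 0:
        return 0.0
    angle_term = dot / (norm_p * norm_t)                        # angular difference
    magnitude_term = min(norm_p, norm_t) / max(norm_p, norm_t)  # numerical difference
    return angle_term * magnitude_term


print(similarity([1.0, 0.0], [1.0, 0.0]))  # identical vectors
print(similarity([1.0, 0.0], [2.0, 0.0]))  # same direction, different magnitude
print(similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

A plain cosine similarity would score the first two pairs identically; the magnitude term is what separates them in this sketch.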
In one embodiment, the step S11 of receiving the request for inquiring text information sent by the terminal further includes:
S111, judging whether requests for inquiring text information sent by two terminals are received within a preset time;
S112, if requests for inquiring text information sent by two terminals are received, judging whether the two terminals are respectively two of a first audience group terminal, a second audience group terminal and a third audience group terminal;
S113, if the two terminals are respectively two of a first audience group terminal, a second audience group terminal and a third audience group terminal, sending three-way text information in the corresponding forms to the two terminals at the same time, and requiring the two terminals to return reading opinions;
S114, receiving the reading opinions returned by the two terminals, and judging whether the reading opinions returned by the two terminals are both satisfied;
S115, if only one of the reading opinions returned by the two terminals is satisfied, constructing a communication channel between the two terminals so that the two terminals can ask and answer each other's questions about the text information.
As described above, through these steps text information is sent to two terminals together and can be fully understood. Not every user can fully understand the text information received on a terminal, so the present application is specially designed as follows: three-way text information in the corresponding forms is sent to the two terminals at the same time, and the two terminals are required to return reading opinions; the reading opinions returned by the two terminals are received, and it is judged whether both are satisfied; if only one of the reading opinions is satisfied, a communication channel is constructed between the two terminals so that the two terminals can ask and answer each other's questions about the text information, and the party that did not understand can put questions to the other. Since the two terminals are two of a first audience group terminal, a second audience group terminal and a third audience group terminal, at least one of them is a first or second audience group terminal, so questions can be answered accurately.
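The pairing logic of steps S111 to S115 reduces to two decisions: whether two requests qualify as a pair, and whether a communication channel should be opened. The sketch below models both. The `Request` record, the field names and the reading of "two of" the three terminal types as two different audience groups are assumptions made for illustration.

```python
from dataclasses import dataclass

GROUPS = {"first", "second", "third"}


@dataclass
class Request:
    terminal_id: str
    audience: str       # "first", "second", "third", or "" when unknown
    received_at: float  # arrival time in seconds


def is_eligible_pair(a: Request, b: Request, window_seconds: float) -> bool:
    """S111 and S112: both requests arrive within the preset time, from two groups."""
    within_window = abs(a.received_at - b.received_at) <= window_seconds
    two_groups = (a.audience in GROUPS and b.audience in GROUPS
                  and a.audience != b.audience)  # assumed: groups must differ
    return within_window and two_groups


def should_open_channel(satisfied_a: bool, satisfied_b: bool) -> bool:
    """S114 and S115: open a channel only when exactly one reader is satisfied."""
    return satisfied_a != satisfied_b


doctor = Request("t1", "first", received_at=10.0)
layperson = Request("t2", "third", received_at=12.5)
if is_eligible_pair(doctor, layperson, window_seconds=5.0):
    print(should_open_channel(satisfied_a=True, satisfied_b=False))
```

When both readers are satisfied, or neither is, no channel is opened in this sketch, which matches the condition stated in step S115.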
In one embodiment, after the step S12 of judging whether the terminal is one of a first audience group terminal, a second audience group terminal and a third audience group terminal, the method includes:
S121, if it cannot be judged that the terminal is one of the first audience group terminal, the second audience group terminal and the third audience group terminal, making three copies of the three-way text information;
S122, respectively converting the three copies of the three-way text information into a first-way text information form, a second-way text information form and a third-way text information form;
S123, simultaneously sending the three-way text information in the first-way text information form, the three-way text information in the second-way text information form and the three-way text information in the third-way text information form to the terminal.
As described above, these steps send the three-way text information in the first-way text information form, in the second-way text information form and in the third-way text information form to the terminal at the same time. When identification of the terminal's class fails, the three copies of the three-way text information are respectively converted into the first-way text information form, the second-way text information form and the third-way text information form, and all three are sent to the terminal simultaneously. At the cost of some computing and network resources, this ensures that the user of the terminal can still read the text information accurately.
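A minimal sketch of this fallback, assuming the three forms are already available in a dictionary keyed by the hypothetical labels `"first"`, `"second"` and `"third"`, and that a failed classification is represented by `None` (both are illustrative choices, not taken from the application):

```python
from typing import Dict, List, Optional

FORM_KEYS = ("first", "second", "third")  # hypothetical labels for the three forms


def forms_to_send(audience: Optional[str], forms: Dict[str, str]) -> List[str]:
    """Return the form(s) of the three-way text information to send."""
    if audience in FORM_KEYS:
        # Classification succeeded: send only the matching form.
        return [forms[audience]]
    # Classification failed (S121 to S123): send all three forms at once.
    return [forms[key] for key in FORM_KEYS]
```

Sending all three forms trades extra bandwidth for the guarantee that at least one form suits the reader.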
In a specific embodiment, the three-way text information processing method is applied to the field of medical information. More specifically, the present application further provides a three-way medical and health information sending method, which includes: acquiring written medical and health information sent by a first terminal, wherein the medical and health information comprises an electronic medical record and a public health electronic health file document for an original resident; generating three-way medical and health information according to the medical and health information, wherein the three-way medical and health information has three forms, namely a first-way medical and health information form, a second-way medical and health information form and a third-way medical and health information form, the first-way form being intended for the professional medical staff group, the second-way form for the original resident, and the third-way form for persons other than the professional medical staff group and the original resident; receiving a request to query the medical and health information sent by a second terminal; judging whether the second terminal is one of a professional medical worker terminal, an original resident terminal, and a terminal of a person other than the professional medical staff group and the original resident; if the second terminal is a professional medical worker terminal, converting the three-way medical and health information into the first-way medical and health information form, and sending the three-way medical and health information in the first-way form to the second terminal; if the second terminal is an original resident terminal, converting the three-way medical and health information into the second-way medical and health information form, and sending the three-way medical and health information in the second-way form to the second terminal; and if the second terminal is a terminal of a person other than the professional medical staff group and the original resident, converting the three-way medical and health information into the third-way medical and health information form, and sending the three-way medical and health information in the third-way form to the second terminal.
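The routing in this medical embodiment can be illustrated with a small data structure. The sketch below is an assumption-laden illustration: `Role`, `ThreeWayHealthInfo` and `form_for` are hypothetical names, and how a terminal's role is determined is outside the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """Hypothetical roles of the person behind the second terminal."""
    PROFESSIONAL = "professional medical worker"
    RESIDENT = "original resident"
    OTHER = "other person"


@dataclass(frozen=True)
class ThreeWayHealthInfo:
    """One record held in three forms, one per audience."""
    first_way: str   # original wording, for professional medical staff
    second_way: str  # partly simplified, for the original resident
    third_way: str   # fully simplified, for everyone else

    def form_for(self, role: Role) -> str:
        """Select the form that matches the requesting terminal's role."""
        if role is Role.PROFESSIONAL:
            return self.first_way
        if role is Role.RESIDENT:
            return self.second_way
        return self.third_way
```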
The step of generating three-way medical and health information according to the medical and health information, wherein the three-way medical and health information has the first-way, second-way and third-way medical and health information forms, comprises the following steps: respectively labeling common vocabularies and medical professional vocabularies in the medical and health information; dividing the medical and health information into a first sequence consisting of a plurality of common sentence segments and a plurality of medical professional sentence segments, wherein the common sentence segments do not comprise medical professional vocabularies, each medical professional sentence segment comprises at least one and at most three medical professional vocabularies, and each medical professional sentence segment comprises at least five words; calling a preset replacement sentence segment group database, wherein the replacement sentence segment group database records a plurality of replacement sentence segment groups, each replacement sentence segment group consists of a medical professional sentence segment and a plurality of short sentences, and the medical professional sentence segment and each short sentence in the same replacement sentence segment group have the same meaning; replacing part of the medical professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated second sequences, each composed of common vocabularies and a small number of medical professional vocabularies; calculating a second similarity between the first sequence and each second sequence according to a preset similarity calculation method, to obtain a plurality of second similarity values respectively corresponding to the plurality of second sequences; acquiring the maximum second similarity value among the plurality of second similarity values, and taking the second sequence corresponding to the maximum second similarity value as the second-way medical and health information form of the three-way medical and health information; replacing all medical professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated third sequences, each composed entirely of common vocabularies; calculating the similarity between the first sequence and each third sequence according to the preset similarity calculation method, to obtain a plurality of similarity values respectively corresponding to the plurality of third sequences; acquiring the maximum similarity value among the similarity values, and taking the third sequence corresponding to the maximum similarity value as the third-way medical and health information form of the three-way medical and health information; and recording the medical and health information itself as the first-way medical and health information form of the three-way medical and health information.
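The replace-then-select procedure above can be sketched as follows. Several parts are assumptions made for illustration only: the replacement database is modeled as a plain dictionary, "replacing part of" the professional segments is read as replacing at least one while leaving at least one untouched, and `jaccard` is a stand-in word-overlap measure rather than the preset similarity calculation method of the application. The toy segments also ignore the five-word minimum for brevity.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Candidate = Tuple[str, ...]


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity: overlap of word sets (an assumption)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


def candidates(first_sequence: Sequence[str],
               groups: Dict[str, List[str]],
               replace_all: bool) -> List[Candidate]:
    """Enumerate non-repeated candidate sequences.

    replace_all=True  -> every professional segment is replaced (third sequences)
    replace_all=False -> some, but not all, are replaced (second sequences)
    """
    original = tuple(first_sequence)
    pro_idx = [i for i, seg in enumerate(original) if seg in groups]
    options = []
    for seg in original:
        if seg not in groups:
            options.append([seg])                       # common segment: keep
        elif replace_all:
            options.append(list(groups[seg]))           # must be replaced
        else:
            options.append([seg] + list(groups[seg]))   # may be replaced
    # dict.fromkeys removes duplicates while keeping a stable order.
    result = list(dict.fromkeys(product(*options)))
    if not replace_all:
        result = [c for c in result
                  if any(c[i] != original[i] for i in pro_idx)
                  and any(c[i] == original[i] for i in pro_idx)]
    return result


def pick_best(first_sequence: Sequence[str],
              cands: List[Candidate],
              similarity: Callable[[str, str], float] = jaccard) -> Candidate:
    """Return the candidate most similar to the first sequence.

    Raises ValueError if cands is empty, for example when a text has only
    one professional segment and a partial replacement is requested.
    """
    reference = " ".join(first_sequence)
    return max(cands, key=lambda c: similarity(reference, " ".join(c)))


# Toy data, purely illustrative.
FIRST = ["the patient presents with", "acute myocardial infarction",
         "and was given", "percutaneous coronary intervention"]
GROUPS = {
    "acute myocardial infarction": ["a heart attack"],
    "percutaneous coronary intervention": ["a procedure to open the blocked artery"],
}
second_form = pick_best(FIRST, candidates(FIRST, GROUPS, replace_all=False))
third_form = pick_best(FIRST, candidates(FIRST, GROUPS, replace_all=True))
```

Choosing the candidate with the highest similarity to the first sequence keeps the simplified text as close as possible to the original wording.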
According to the three-way text information processing method, common vocabularies and professional vocabularies in the text information are respectively labeled; the text information is divided into a first sequence consisting of a plurality of common sentence segments and a plurality of professional sentence segments, wherein the common sentence segments do not comprise professional vocabularies, each professional sentence segment comprises at least one and at most three professional vocabularies, and each professional sentence segment comprises at least five words; a preset replacement sentence segment group database is called, wherein the replacement sentence segment group database records a plurality of replacement sentence segment groups, each replacement sentence segment group consists of a professional sentence segment and a plurality of short sentences, and the professional sentence segment and each short sentence in the same replacement sentence segment group have the same meaning; part of the professional sentence segments in the first sequence are replaced with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated second sequences, each composed of common vocabularies and a small number of professional vocabularies; a second similarity between the first sequence and each second sequence is calculated according to a preset similarity calculation method, to obtain a plurality of second similarity values respectively corresponding to the plurality of second sequences; the maximum second similarity value among the plurality of second similarity values is obtained, and the second sequence corresponding to the maximum second similarity value is taken as the second-way text information form of the three-way text information; all professional sentence segments in the first sequence are replaced with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated third sequences, each composed entirely of common vocabularies; the similarity between the first sequence and each third sequence is calculated according to the preset similarity calculation method, to obtain a plurality of similarity values respectively corresponding to the plurality of third sequences; the maximum similarity value among the similarity values is obtained, and the third sequence corresponding to the maximum similarity value is taken as the third-way text information form of the three-way text information; and the text information itself is recorded as the first-way text information form of the three-way text information. The text information is thus processed into three forms aimed at audience groups of three different professional levels, so that each audience group can understand the text information while reading fluency and efficiency are improved.
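The constraints placed on sentence segments (no professional vocabulary in a common segment; one to three professional vocabularies and at least five words in a professional segment) can be checked with a short function. This is a sketch only: `classify_segment` is a hypothetical name, words are assumed to be whitespace-separated, and professional vocabularies are assumed to be single lowercase tokens held in a set.

```python
from typing import Set


def classify_segment(segment: str, professional_terms: Set[str]) -> str:
    """Classify a sentence segment against the stated constraints.

    Returns "common", "professional", or "invalid" when the segment
    contains professional vocabulary but breaks the count or length rule.
    """
    words = [w.strip(".,;:").lower() for w in segment.split()]
    n_professional = sum(1 for w in words if w in professional_terms)
    if n_professional == 0:
        return "common"
    if 1 <= n_professional <= 3 and len(words) >= 5:
        return "professional"
    return "invalid"
```

A segment reported as "invalid" would need to be re-divided before it could take part in the first sequence.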
Referring to FIG. 2, an embodiment of the present invention further provides a computer device. The computer device may be a server, and its internal structure may be as shown in the figure. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is used to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing data used by the three-way text information processing method. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the three-way text information processing method.
The processor executes the three-way text information processing method, wherein the steps of the method are respectively in one-to-one correspondence with the steps of executing the three-way text information processing method of the foregoing embodiment, and are not described herein again.
It will be understood by those skilled in the art that the structures shown in the drawings are only block diagrams of some of the structures associated with the embodiments of the present application and do not constitute a limitation on the computer apparatus to which the embodiments of the present application may be applied.
The computer device respectively labels common vocabularies and professional vocabularies in the text information; divides the text information into a first sequence consisting of a plurality of common sentence segments and a plurality of professional sentence segments, wherein the common sentence segments do not comprise professional vocabularies, each professional sentence segment comprises at least one and at most three professional vocabularies, and each professional sentence segment comprises at least five words; calls a preset replacement sentence segment group database, wherein the replacement sentence segment group database records a plurality of replacement sentence segment groups, each replacement sentence segment group consists of a professional sentence segment and a plurality of short sentences, and the professional sentence segment and each short sentence in the same replacement sentence segment group have the same meaning; replaces part of the professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated second sequences, each composed of common vocabularies and a small number of professional vocabularies; calculates a second similarity between the first sequence and each second sequence according to a preset similarity calculation method, to obtain a plurality of second similarity values respectively corresponding to the plurality of second sequences; obtains the maximum second similarity value among the plurality of second similarity values, and takes the second sequence corresponding to the maximum second similarity value as the second-way text information form of the three-way text information; replaces all professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated third sequences, each composed entirely of common vocabularies; calculates the similarity between the first sequence and each third sequence according to the preset similarity calculation method, to obtain a plurality of similarity values respectively corresponding to the plurality of third sequences; obtains the maximum similarity value among the similarity values, and takes the third sequence corresponding to the maximum similarity value as the third-way text information form of the three-way text information; and records the text information itself as the first-way text information form of the three-way text information. The text information is thus processed into three forms aimed at audience groups of three different professional levels, so that each audience group can understand the text information while reading fluency and efficiency are improved.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored thereon, and when the computer program is executed by a processor, the method for processing three-way text information is implemented, where steps included in the method are respectively in one-to-one correspondence with steps of executing the three-way text information processing method of the foregoing embodiment, and are not described herein again.
When executed, the computer program stored on the computer-readable storage medium respectively labels common vocabularies and professional vocabularies in the text information; divides the text information into a first sequence consisting of a plurality of common sentence segments and a plurality of professional sentence segments, wherein the common sentence segments do not comprise professional vocabularies, each professional sentence segment comprises at least one and at most three professional vocabularies, and each professional sentence segment comprises at least five words; calls a preset replacement sentence segment group database, wherein the replacement sentence segment group database records a plurality of replacement sentence segment groups, each replacement sentence segment group consists of a professional sentence segment and a plurality of short sentences, and the professional sentence segment and each short sentence in the same replacement sentence segment group have the same meaning; replaces part of the professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated second sequences, each composed of common vocabularies and a small number of professional vocabularies; calculates a second similarity between the first sequence and each second sequence according to a preset similarity calculation method, to obtain a plurality of second similarity values respectively corresponding to the plurality of second sequences; obtains the maximum second similarity value among the plurality of second similarity values, and takes the second sequence corresponding to the maximum second similarity value as the second-way text information form of the three-way text information; replaces all professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated third sequences, each composed entirely of common vocabularies; calculates the similarity between the first sequence and each third sequence according to the preset similarity calculation method, to obtain a plurality of similarity values respectively corresponding to the plurality of third sequences; obtains the maximum similarity value among the similarity values, and takes the third sequence corresponding to the maximum similarity value as the third-way text information form of the three-way text information; and records the text information itself as the first-way text information form of the three-way text information. The text information is thus processed into three forms aimed at audience groups of three different professional levels, so that each audience group can understand the text information while reading fluency and efficiency are improved.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, article, or method that includes the element.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (8)

1. A method for processing three-way text information, the method comprising:
S1, respectively labeling the common vocabularies and the professional vocabularies in the text information;
S2, dividing the text information into a first sequence consisting of a plurality of common sentence segments and a plurality of professional sentence segments, wherein the common sentence segments do not comprise professional vocabularies, each professional sentence segment comprises at least one and at most three professional vocabularies, and each professional sentence segment comprises at least five words;
S3, calling a preset replacement sentence segment group database, wherein the replacement sentence segment group database records a plurality of replacement sentence segment groups, each replacement sentence segment group consists of a professional sentence segment and a plurality of short sentences, and the professional sentence segment and each short sentence in the same replacement sentence segment group have the same meaning;
S4, replacing part of the professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, thereby obtaining a plurality of non-repeated second sequences, each composed of common vocabularies and a small number of professional vocabularies;
S5, calculating a second similarity between the first sequence and each second sequence according to a preset similarity calculation method, so as to obtain a plurality of second similarity values respectively corresponding to the plurality of second sequences;
S6, obtaining the maximum second similarity value among the plurality of second similarity values, and taking the second sequence corresponding to the maximum second similarity value as the second-way text information form of the three-way text information;
S7, replacing all professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, thereby obtaining a plurality of non-repeated third sequences, each composed entirely of common vocabularies;
S8, calculating the similarity between the first sequence and each third sequence according to the preset similarity calculation method, so as to obtain a plurality of similarity values respectively corresponding to the plurality of non-repeated third sequences;
S9, obtaining the maximum similarity value among the similarity values, and taking the third sequence corresponding to the maximum similarity value as the third-way text information form of the three-way text information;
S10, recording the text information in the step S1 as the first-way text information form of the three-way text information.
2. The three-way text information processing method of claim 1, further comprising:
S11, receiving a request to query the text information sent by a terminal;
S12, judging whether the terminal is one of a first audience group terminal, a second audience group terminal and a third audience group terminal;
S13, if the terminal is a first audience group terminal, sending the three-way text information in the first-way text information form to the first audience group terminal;
S14, if the terminal is a second audience group terminal, sending the three-way text information in the second-way text information form to the second audience group terminal;
S15, if the terminal is a third audience group terminal, sending the three-way text information in the third-way text information form to the third audience group terminal.
3. The three-way text information processing method according to claim 1, wherein said step S4 of replacing part of the professional sentence segments in said first sequence with short sentences of the corresponding replacement sentence segment groups respectively, to obtain a plurality of non-repeated second sequences each composed of common vocabularies and a small number of professional vocabularies, comprises:
S41, sorting the professional vocabularies in all text information in a preset text information base in ascending order to obtain an ascending list of professional vocabularies;
S42, obtaining the ranking of each professional vocabulary of the first sequence in the ascending list of professional vocabularies;
S43, performing a secondary ascending sort according to the rankings of the professional vocabularies of the first sequence in the ascending list of professional vocabularies, to obtain a secondary ascending list of the professional vocabularies in the first sequence;
S44, recording the professional vocabularies whose ranking in the secondary ascending list is greater than a preset value as vocabularies to be replaced, and replacing part of the professional sentence segments in the first sequence with short sentences of the corresponding replacement sentence segment groups respectively, thereby obtaining a plurality of non-repeated second sequences, each composed of common vocabularies and a small number of professional vocabularies; wherein each replaced professional sentence segment comprises at least one vocabulary to be replaced.
4. The three-way text information processing method according to claim 1, wherein said step S8 of calculating the similarity between the first sequence and the third sequence according to a preset similarity calculation method includes:
S81, mapping the first sequence and the third sequence into a first vector and a second vector respectively according to a preset vector mapping method;
S82, according to the formula:
[Formula image not reproduced in this text: Figure 27020DEST_PATH_IMAGE002]
calculating the similarity Z between the first vector and the second vector; wherein P is the first vector, Pi is the i-th component of the first vector, T is the second vector, Ti is the i-th component of the second vector, and the first vector and the second vector each have n components.
5. The three-way text information processing method of claim 2, wherein the step S11 of receiving the request for inquiring text information sent by the terminal further comprises:
S111, judging whether requests to query the text information sent by two terminals are received within a preset time;
S112, if requests to query the text information sent by two terminals are received, judging whether the two terminals are respectively two of a first audience group terminal, a second audience group terminal and a third audience group terminal;
S113, if the two terminals are respectively two of a first audience group terminal, a second audience group terminal and a third audience group terminal, simultaneously sending the three-way text information in the corresponding forms to the two terminals, and requiring the two terminals to return reading opinions;
S114, receiving the reading opinions returned by the two terminals, and judging whether the reading opinions returned by the two terminals are both "satisfied";
S115, if only one of the reading opinions returned by the two terminals is "satisfied", constructing a communication channel between the two terminals so that the two terminals can question and explain the text information to each other.
6. The three-way text information processing method according to claim 2, wherein after said step S12 of judging whether said terminal is one of a first audience group terminal, a second audience group terminal and a third audience group terminal, the method comprises:
S121, if it cannot be judged that the terminal is one of the first audience group terminal, the second audience group terminal and the third audience group terminal, making three copies of the three-way text information;
S122, respectively converting the three copies of the three-way text information into a first-way text information form, a second-way text information form and a third-way text information form;
S123, simultaneously sending the three-way text information in the first-way text information form, the three-way text information in the second-way text information form and the three-way text information in the third-way text information form to the terminal.
7. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202010638463.6A 2020-07-06 2020-07-06 Three-way text information processing method, computer equipment and storage medium Active CN111737992B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010638463.6A CN111737992B (en) 2020-07-06 2020-07-06 Three-way text information processing method, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111737992A CN111737992A (en) 2020-10-02
CN111737992B true CN111737992B (en) 2020-12-22

Family

ID=72653298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010638463.6A Active CN111737992B (en) 2020-07-06 2020-07-06 Three-way text information processing method, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111737992B (en)

Also Published As

Publication number Publication date
CN111737992A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN111128394B (en) Medical text semantic recognition method and device, electronic equipment and readable storage medium
CN110929515B (en) Reading understanding method and system based on cooperative attention and adaptive adjustment
WO2019029723A1 (en) Mathematical processing method, apparatus and device for text problem, and storage medium
CN110727779A (en) Question-answering method and system based on multi-model fusion
CN111651992A (en) Named entity labeling method and device, computer equipment and storage medium
CN112287089B (en) Classification model training and automatic question-answering method and device for automatic question-answering system
CN113157863B (en) Question-answer data processing method, device, computer equipment and storage medium
CN112016274B (en) Medical text structuring method, device, computer equipment and storage medium
CN112016295A (en) Symptom data processing method and device, computer equipment and storage medium
WO2021218023A1 (en) Emotion determining method and apparatus for multiple rounds of questions and answers, computer device, and storage medium
CN110322959B (en) Deep medical problem routing method and system based on knowledge
CN110929714A (en) Information extraction method of intensive text pictures based on deep learning
CN113468887A (en) Student information relation extraction method and system based on boundary and segment classification
CN110929532B (en) Data processing method, device, equipment and storage medium
CN109460541A (en) Lexical relation mask method, device, computer equipment and storage medium
CN116842036A (en) Data query method, device, computer equipment and storage medium
CN111737992B (en) Three-way text information processing method, computer equipment and storage medium
CN113505786A (en) Test question photographing and judging method and device and electronic equipment
CN112990290A (en) Sample data generation method, device, equipment and storage medium
CN116467412A (en) Knowledge graph-based question and answer method, system and storage medium
CN110147791A (en) Character recognition method, device, equipment and storage medium
CN111680148B (en) Method and device for intelligently responding to question of user
CN114048753A (en) Method, device, equipment and medium for training word sense recognition model and judging word sense
CN114691716A (en) SQL statement conversion method, device, equipment and computer readable storage medium
CN110909541A (en) Instruction generation method, system, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant