CN110517747A - Pathological data processing method, device and electronic equipment - Google Patents

Info

Publication number
CN110517747A
Authority
CN
China
Prior art keywords
pathology
corpus data
vector
pathological
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910822446.5A
Other languages
Chinese (zh)
Other versions
CN110517747B (en)
Inventor
马素芬
魏博
李力行
凌少平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Trino Invensys (beijing) Gene Technology Co Ltd
Original Assignee
Trino Invensys (beijing) Gene Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Trino Invensys (beijing) Gene Technology Co Ltd
Priority to CN201910822446.5A
Publication of CN110517747A
Application granted
Publication of CN110517747B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

This application provides a pathological data processing method, device, and electronic equipment. The pathological data processing method includes: obtaining multiple pathology reports to be processed; structuring the pathology reports to be processed to obtain multiple pieces of pathology corpus data; converting the pieces of pathology corpus data into vectors to obtain a pathology matrix of the pathology reports to be processed; and performing dimensionality reduction on the pathology matrix to determine the pathology characterization matrix corresponding to the pathology reports to be processed, where each row vector of the pathology characterization matrix is the characterization vector of one pathology report to be processed.

Description

Pathological data processing method, device and electronic equipment
Technical field
This application relates to the technical field of data processing, and in particular to a pathological data processing method, device, and electronic equipment.
Background
When diagnosing a patient, a doctor generally needs to refer to the clinical pathway of the relevant disease and the patient's actual diagnostic results, and may also refer to the implementation details of disease guidelines and to progress in evidence-based medicine to adjust medical decisions, so as to offer the patient the latest treatment options and an optimized treatment plan. At present, however, pathology-related data is mostly presented as text. The information in text is relatively scattered and gives doctors limited guidance, and text data is also poor at representing and characterizing patients.
Summary of the invention
In view of this, the purpose of the embodiments of the present application is to provide a pathological data processing method, device, and electronic equipment that can digitize text reports and thereby improve how pathological data is represented.
In a first aspect, an embodiment of the present application provides a pathological data processing method, comprising:
obtaining multiple pathology reports to be processed;
preprocessing the pathology reports to be processed to obtain multiple pieces of pathology corpus data;
converting the pieces of pathology corpus data into vectors to obtain a pathology matrix of the pathology reports to be processed;
performing dimensionality reduction on the pathology matrix to determine the pathology characterization matrix corresponding to the pathology reports to be processed, where each row vector of the pathology characterization matrix represents the characterization vector of one pathology report to be processed.
In the pathological data processing method provided by the embodiments of the present application, the pathology reports to be processed are first structured and the structured data is then converted into vectors, which gives the reports a digitized representation. The pathology matrix is then reduced in dimensionality. The resulting low-dimensional, digitized representation makes it easier for relevant personnel to see the similarity between the pathology reports to be processed, so that the pathology characterization matrix can give them pathology guidance.
With reference to the first aspect, an embodiment of the present application provides a first possible implementation of the first aspect, in which the step of converting the pieces of pathology corpus data into vectors to obtain the pathology matrix of the pathology reports to be processed comprises:
for each piece of pathology corpus data, converting each word in it into a vector using a bag-of-words model, to obtain the word vector of each word of that piece of pathology corpus data;
performing a weighted summation of all the word vectors to obtain the pathology vector of that piece of pathology corpus data;
forming the pathology matrix from the pathology vectors corresponding to the pieces of pathology corpus data.
With reference to the first possible implementation of the first aspect, an embodiment of the present application provides a second possible implementation of the first aspect, in which, before the weighted summation of all the word vectors to obtain the pathology vector of that piece of pathology corpus data, the method further comprises: determining the weight corresponding to each word vector according to the pathology reports to be processed;
the step of performing a weighted summation of all the word vectors to obtain the pathology vector of that piece of pathology corpus data comprises: performing the weighted summation of all the word vectors using the weight corresponding to each word vector, to obtain the pathology vector of that piece of pathology corpus data.
With reference to the second possible implementation of the first aspect, an embodiment of the present application provides a third possible implementation of the first aspect, in which the step of determining the weight corresponding to each word vector according to the pathology reports to be processed comprises:
calculating a target inverse document frequency from the total number of documents in the pathology reports to be processed and the number of those documents that contain the target word corresponding to a target word vector;
calculating a target term frequency from the number of times the target word occurs in each pathology report to be processed;
calculating the weight of the target word vector from the target inverse document frequency and the target term frequency.
The pathological data processing method provided by the embodiments of the present application can also compute, from the data of the pathology reports to be processed, the weights used when weighting all the word vectors of each case, so that the value determined for a word better reflects that word's importance in the matrix.
With reference to the first aspect, an embodiment of the present application provides a fourth possible implementation of the first aspect, in which
the step of converting the pieces of pathology corpus data into vectors to obtain the pathology matrix of the pathology reports to be processed comprises:
for each piece of pathology corpus data, converting that piece of pathology corpus data into a vector using a bag-of-words model, to obtain the sentence vector of that piece of pathology corpus data;
forming the pathology matrix from the sentence vectors corresponding to the pieces of pathology corpus data.
The pathological data processing method provided by the embodiments of the present application determines one vector for each sentence in the pathology corpus data, so the resulting pathology matrix can be a two-dimensional matrix, which reduces the complexity of the pathology reports to be processed.
With reference to the first aspect, an embodiment of the present application provides a fifth possible implementation of the first aspect, in which the step of converting the pieces of pathology corpus data into vectors to obtain the pathology matrix of the pathology reports to be processed comprises:
for each piece of pathology corpus data, converting each word in it into a vector using a bag-of-words model, to obtain the word vector of each word of that piece of pathology corpus data; averaging all the word vectors in that piece of pathology corpus data to obtain its pathology vector; and forming a first pathology matrix from the pathology vectors corresponding to the pieces of pathology corpus data;
for each piece of pathology corpus data, converting each word in it into a vector using a bag-of-words model, to obtain the word vector of each word of that piece of pathology corpus data; determining the weight corresponding to each word vector according to the pathology reports to be processed; performing a weighted summation of all the word vectors in that piece of pathology corpus data to obtain its pathology vector; and forming a second pathology matrix from the pathology vectors corresponding to the pieces of pathology corpus data;
for each piece of pathology corpus data, converting that piece of pathology corpus data into a vector using a bag-of-words model, to obtain its sentence vector; and forming a third pathology matrix from the sentence vectors corresponding to the pieces of pathology corpus data;
performing a weighted summation of the first pathology matrix, the second pathology matrix, and the third pathology matrix to obtain the pathology matrix.
Further, the pathological data processing method provided by the embodiments of the present application can determine the first, second, and third pathology matrices using several different calculation methods and then determine the pathology matrix from these three matrices. The pathology matrix thus combines several kinds of information and can better represent the information in each pathology report to be processed.
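The weighted summation of the three pathology matrices can be sketched as an element-wise weighted sum. This is a minimal illustration, not the patent's confirmed procedure: the function name `combine_matrices`, the example matrices, and the weights 0.5, 0.25, 0.25 are all assumptions, and the three matrices are assumed to share the same shape.

```python
def combine_matrices(matrices, weights):
    """Element-wise weighted sum of equally shaped matrices (lists of rows)."""
    if len(matrices) != len(weights):
        raise ValueError("one weight is required per matrix")
    rows, cols = len(matrices[0]), len(matrices[0][0])
    for m in matrices:
        if len(m) != rows or any(len(r) != cols for r in m):
            raise ValueError("all matrices must have the same shape")
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) for j in range(cols)]
        for i in range(rows)
    ]

# Hypothetical first, second, and third pathology matrices (2 reports, 2 columns).
first = [[1.0, 2.0], [3.0, 4.0]]
second = [[0.0, 2.0], [2.0, 0.0]]
third = [[4.0, 0.0], [0.0, 4.0]]

# Hypothetical weights; the source does not state how they are chosen.
pathology_matrix = combine_matrices([first, second, third], [0.5, 0.25, 0.25])
print(pathology_matrix)
```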
With reference to the first aspect or any possible implementation of the first aspect, an embodiment of the present application provides a sixth possible implementation of the first aspect, further comprising:
calculating the cosine values between the row vectors of the pathology characterization matrix;
determining the similarity between the pathology reports to be processed according to the cosine values.
With the pathological data processing method provided by the embodiments of the present application, the similarity between the pathology reports to be processed can also be obtained from the cosine values, which helps relevant personnel understand the information in each pathology report to be processed.
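As a minimal sketch of this similarity step, the cosine between every pair of row vectors can be computed as below. The example characterization matrix is hypothetical, and returning 0.0 for an all-zero row is an assumption made here, not something the source specifies.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: an all-zero row is treated as dissimilar
    return dot / (norm_u * norm_v)

def similarity_matrix(rows):
    """Pairwise cosine similarity between the row vectors of a matrix."""
    return [[cosine_similarity(r, s) for s in rows] for r in rows]

# Hypothetical characterization matrix: 3 reports, 2 dimensions each.
characterization = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
sims = similarity_matrix(characterization)
```

In this toy input, the first two reports point in the same direction, so their cosine is 1, while the third is orthogonal to both.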
With reference to the first aspect or any possible implementation of the first aspect, an embodiment of the present application provides a seventh possible implementation of the first aspect, further comprising:
mapping each row vector of the pathology characterization matrix onto a coordinate system with the same number of dimensions as the row vector, to obtain a pathology distribution result;
outputting and displaying the pathology distribution result.
The pathological data processing method provided by the embodiments of the present application can also output and display the pathology distribution result, so that relevant personnel can see the distribution of the pathology reports to be processed more intuitively.
In a second aspect, an embodiment of the present application also provides a pathological data processing device, comprising:
an obtaining module, for obtaining multiple pathology reports to be processed;
a processing module, for preprocessing the pathology reports to be processed to obtain multiple pieces of pathology corpus data;
a conversion module, for converting the pieces of pathology corpus data into vectors to obtain the pathology matrix of the pathology reports to be processed;
a dimensionality reduction module, for performing dimensionality reduction on the pathology matrix to determine the pathology characterization matrix corresponding to the pathology reports to be processed, where each row vector of the pathology characterization matrix represents the characterization vector of one pathology report to be processed.
In a third aspect, an embodiment of the present application also provides electronic equipment, comprising a processor and a memory. The memory stores machine-readable instructions executable by the processor. When the electronic equipment runs, the machine-readable instructions, when executed by the processor, perform the steps of the method of the first aspect or of any possible implementation of the first aspect.
In a fourth aspect, an embodiment of the present application also provides a computer-readable storage medium storing a computer program. When run by a processor, the computer program performs the steps of the method of the first aspect or of any possible implementation of the first aspect.
To make the above objects, features, and advantages of the application clearer and easier to understand, embodiments are described in detail below with reference to the accompanying drawings.
Detailed description of the invention
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the application and therefore should not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 is a block diagram of the electronic equipment provided by the embodiments of the present application.
Fig. 2 is a flow chart of the pathological data processing method provided by the embodiments of the present application.
Fig. 3 is a detailed flow chart of step 203 of the pathological data processing method provided by the embodiments of the present application.
Fig. 4 is a detailed flow chart of the process for determining the weight values in the pathological data processing method provided by the embodiments of the present application.
Fig. 5 is a flow chart of the pathological data processing method provided by another embodiment of the present application.
Fig. 6 is a functional block diagram of the pathological data processing device provided by the embodiments of the present application.
Specific embodiment
The technical solutions in the embodiments of the present application are described below with reference to the drawings in the embodiments of the present application.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings. Therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings. In addition, in the description of the application, the terms "first", "second", and so on are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Embodiment one
To make the present embodiment easier to understand, the electronic equipment that executes the pathological data processing method disclosed in the embodiments of the present application is described in detail first.
Fig. 1 is a block diagram of the electronic equipment. The electronic equipment 100 may include a memory 111, a memory controller 112, a processor 113, a peripheral interface 114, an input-output unit 115, and a display unit 116. Those of ordinary skill in the art will appreciate that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the electronic equipment 100. For example, the electronic equipment 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1.
The memory 111, memory controller 112, processor 113, peripheral interface 114, input-output unit 115, and display unit 116 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements can be electrically connected to one another through one or more communication buses or signal lines. The processor 113 is used to execute the executable modules stored in the memory.
The memory 111 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and so on. The memory 111 is used to store a program, and the processor 113 executes the program after receiving an execution instruction. The method performed by the electronic equipment 100, as defined by the process disclosed in any embodiment of the present application, can be applied in the processor 113 or implemented by the processor 113.
The processor 113 may be an integrated circuit chip with signal processing capability. It can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and so on; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic diagrams disclosed in the embodiments of the present application. The general-purpose processor can be a microprocessor or any conventional processor.
The peripheral interface 114 couples various input/output devices to the processor 113 and the memory 111. In some embodiments, the peripheral interface 114, the processor 113, and the memory controller 112 can be implemented in a single chip. In other examples, they can each be implemented by a separate chip.
The input-output unit 115 is used to let the user input data. The input-output unit 115 may be, but is not limited to, a mouse, a keyboard, and so on.
The display unit 116 provides an interactive interface (such as a user operation interface) between the electronic equipment 100 and the user, or displays image data for the user's reference. In the present embodiment, the display unit can be a liquid crystal display or a touch display. A touch display can be a capacitive or resistive touch screen that supports single-point and multi-point touch operation. Supporting single-point and multi-point touch operation means that the touch display can sense touch operations generated simultaneously at one or more positions on it and hand the sensed touch operations to the processor for calculation and processing.
The electronic equipment 100 in the present embodiment can be used to execute each step of each method provided by the embodiments of the present application. The implementation of the pathological data processing method is described in detail below through several embodiments.
Embodiment two
Fig. 2 is a flow chart of the pathological data processing method provided by the embodiments of the present application. The detailed process shown in Fig. 2 is described below.
Step 201, obtain multiple pathology reports to be processed.
The pathology reports to be processed can be the pathology reports of multiple patients. They can be stored on the electronic equipment that executes the pathological data processing method of the present embodiment, or on a server communicatively connected to that electronic equipment.
Step 202, preprocess the pathology reports to be processed to obtain multiple pieces of pathology corpus data.
The pieces of pathology corpus data can form a patient medical record corpus. They can also be multiple pieces of structured pathological data.
In one embodiment, the preprocessing may include cleaning and word segmentation of the pathology reports to be processed. Cleaning and segmenting the pathology reports to be processed yields the current patient medical record corpus. Illustratively, data cleaning may include removing spaces, line breaks, special characters, and the like from the pathology reports to be processed. Illustratively, word segmentation can be performed on the pathology reports to be processed using a custom dictionary.
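The cleaning and dictionary-based segmentation described above can be sketched as follows. This is an illustration under stated assumptions: the source does not name a segmentation algorithm, so forward maximum matching is swapped in here as one simple dictionary-driven approach, and the special-character set, the dictionary entries, and the sample report text are all hypothetical.

```python
import re

def clean_report(text):
    """Remove whitespace (spaces, line breaks, tabs) and a few special characters."""
    text = re.sub(r"\s+", "", text)
    return re.sub(r"[#*@^~|_]", "", text)  # hypothetical special-character set

def segment(text, dictionary, max_len=6):
    """Forward maximum matching: take the longest dictionary word at each position.

    Characters that start no dictionary word are emitted as single-character tokens.
    """
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical custom dictionary and raw report fragment.
custom_dictionary = {"浸润性", "导管癌", "淋巴结", "转移"}
raw = "浸润性 导管癌,\n淋巴结 转移*"
tokens = segment(clean_report(raw), custom_dictionary)
print(tokens)
```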
In another embodiment, the preprocessing may include structuring the pathology reports to be processed. Structuring the pathology reports to be processed yields multiple pieces of structured pathological data. Illustratively, the structuring may include: first cutting a pathology report to be processed into words to obtain a pathology phrase; identifying each pathology word in the pathology phrase to determine the marker words; grouping the pathology phrase according to the marker words to obtain at least one group of information phrases; and extracting key information from the at least one group of information phrases and combining the extracted key information words into structured pathological data.
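The grouping and key-information extraction steps can be sketched as below. The source does not specify how marker words or key information are identified, so this sketch assumes both are given as plain lookup sets; the function names and the sample words are hypothetical.

```python
def group_by_markers(words, markers):
    """Start a new group at every marker word; words before the first marker are dropped."""
    groups = []
    for word in words:
        if word in markers:
            groups.append([word])
        elif groups:
            groups[-1].append(word)
    return groups

def extract_key_info(groups, keywords):
    """Map each group's marker word to the key-information words that follow it."""
    return {g[0]: [w for w in g[1:] if w in keywords] for g in groups}

# Hypothetical output of the word-cutting step, with assumed marker and key-word sets.
words = ["tumor_size", "2.5", "cm", "approx", "grade", "II", "margin", "negative"]
markers = {"tumor_size", "grade", "margin"}
keywords = {"2.5", "cm", "II", "negative"}

structured = extract_key_info(group_by_markers(words, markers), keywords)
print(structured)
```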
Step 203, convert the pieces of pathology corpus data into vectors to obtain the pathology matrix of the pathology reports to be processed.
In one embodiment, as shown in Fig. 3, step 203 may include steps 2031-2033.
Step 2031, for each piece of pathology corpus data, convert each word in it into a vector using a bag-of-words model, to obtain the word vector of each word of that piece of pathology corpus data.
Illustratively, the bag-of-words model can be a Word2Vec model, which is a simplified neural network. Word2Vec includes the CBOW (Continuous Bag-of-Words) model and the Skip-Gram model. CBOW is relatively better suited to small datasets, while Skip-Gram also converts words to vectors well on large corpora. A suitable model can therefore be chosen according to the amount of vocabulary to be converted. In an optional embodiment, the Skip-Gram model can be used. Further, Hierarchical Softmax optimization can be applied on top of the Skip-Gram model.
Step 2032, perform a weighted summation of all the word vectors to obtain the pathology vector of this piece of pathology corpus data.
Illustratively, suppose a piece of pathology corpus data contains n words, each of which the bag-of-words model converts into a vector of length q. The n word vectors can be written as $V_1, V_2, V_3, \dots, V_n$, and the representation vector of the piece of pathology corpus data can be written as $(a_1, a_2, a_3, \dots, a_n)$.
The representation values of the n words can then be calculated as $a_1 = \frac{1}{q}(v_{11} + v_{21} + v_{31} + \dots + v_{q1})$; …; $a_n = \frac{1}{q}(v_{1n} + v_{2n} + v_{3n} + \dots + v_{qn})$, where $v_{kj}$ is the k-th element of the word vector $V_j$.
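A minimal sketch of this averaging step, assuming each word vector is a plain list of q numbers: the representation value of each word is the mean of its vector's elements, so a record with n words yields n values. The function name and the toy vectors (q = 4, n = 3) are hypothetical.

```python
def representation_values(word_vectors):
    """Average the q elements of each word vector, giving one value per word."""
    return [sum(vector) / len(vector) for vector in word_vectors]

# Hypothetical word vectors for a record with n = 3 words, each of length q = 4.
word_vectors = [
    [1.0, 2.0, 3.0, 2.0],
    [0.0, 4.0, 0.0, 0.0],
    [2.0, 2.0, 2.0, 2.0],
]
pathology_vector = representation_values(word_vectors)
print(pathology_vector)
```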
In another embodiment, the weight corresponding to each word vector can be calculated first, and the representation value of each word can then be obtained from the weights.
Before step 2032, the method can further include: determining the weight corresponding to each word vector according to the pathology reports to be processed.
As shown in Fig. 4, determining the weight corresponding to each word vector according to the pathology reports to be processed may include the following steps.
Step 301, calculate the target inverse document frequency from the total number of documents in the pathology reports to be processed and the number of those documents that contain the target word corresponding to the target word vector.
Illustratively, the target inverse document frequency needed to calculate the weight of the target word can be denoted $\mathrm{IDF}_{i,j}$.
Illustratively, the formula for the target inverse document frequency $\mathrm{IDF}_{i,j}$ can be expressed as:
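Assuming the standard inverse-document-frequency definition, a form consistent with the symbols defined in the next paragraph is sketched here. The logarithm and its base are assumptions; the source text does not confirm them.

```latex
\mathrm{IDF}_{i,j} = \log \frac{|D|}{\left|\{\, j : t_j \in d_j \,\}\right|}
```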
where $|D|$ denotes the total number of documents in the current patient medical record corpus obtained from the pathology reports to be processed; $|\{j : t_j \in d_j\}|$ denotes the number of pathology reports to be processed that contain the word $t_j$; and $t_j$ denotes the target word currently being processed, that is, the j-th word in the piece of pathology corpus data currently being processed. $\mathrm{IDF}_{i,j}$ characterizes the inverse document frequency of the target word in the current patient medical record corpus, a measure of the target word's general importance.
Step 302, calculate the target term frequency from the number of times the target word occurs in each pathology report to be processed.
Illustratively, the target term frequency needed to calculate the weight of the target word can be denoted $\mathrm{TF}_{i,j}$.
Illustratively, the formula for the target term frequency $\mathrm{TF}_{i,j}$ can be expressed as:
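A form consistent with the counts $n_{i,j}$ and $n_{k,j}$ defined in the next paragraph is sketched here as an assumption rather than a confirmed formula. Note that its denominator sums the target word's occurrences over all records k, whereas many common TF variants normalize by the length of a single document instead.

```latex
\mathrm{TF}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}
```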
Wherein, ni,jIndicate currently processed target word in currently processed pathology corpus data (namely in current patient I-th pathology corpus data in case history corpus) in occur number;nk,jIndicate currently processed target word in current patient The number occurred in kth pathology corpus data in case history corpus.Wherein, TFi,jTarget word can be characterized in current patient Word frequency in case history corpus refers to the frequency that target word occurs in a case.
Step 303, the power of the target term vector is calculated according to the reverse frequency of the target and target word frequency Weight.
Illustratively, the weight of target term vector can indicate are as follows: TFIDFi,j
Illustratively, the weight TFIDF of target term vectori,jCalculation formula can indicate are as follows:
TFIDFi,j=TFi,j×IDFi,j
Illustratively, by the expression vector of target word multiplied by the above-mentioned TFIDF being calculatedi,j, obtain the table of the target word Show vector.
Illustratively, the n term vectors of the i-th piece of pathology corpus data can be denoted $V_1, V_2, V_3, \ldots, V_n$.
The weights of these n term vectors can be denoted $TFIDF_{i1}, TFIDF_{i2}, \ldots, TFIDF_{in}$.
Illustratively, the pathology vector of the i-th piece of pathology corpus data is formed from the weighted terms:
$$V_1 \cdot TFIDF_{i1},\; V_2 \cdot TFIDF_{i2},\; V_3 \cdot TFIDF_{i3},\; \ldots,\; V_n \cdot TFIDF_{in}$$
That is, following the calculation of the target word's representation above, each term vector is multiplied by its corresponding weight and the weighted term vectors are summed to obtain the pathology vector of that piece of pathology corpus data.
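The weighted summation described above can be sketched as follows. This is only an illustration: the function name `pathology_vector` and the two-dimensional toy vectors are hypothetical, and real term vectors would come from whatever word-embedding model is used.

```python
def pathology_vector(term_vectors, weights):
    """Weighted sum of the term vectors of one piece of pathology corpus data."""
    if len(term_vectors) != len(weights):
        raise ValueError("need exactly one weight per term vector")
    dim = len(term_vectors[0])
    result = [0.0] * dim
    for vector, weight in zip(term_vectors, weights):
        for k in range(dim):
            result[k] += weight * vector[k]
    return result


# Hypothetical 2-dimensional term vectors V1..V3 and their TF-IDF weights.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tfidf = [0.5, 0.25, 0.0]
report_vector = pathology_vector(vectors, tfidf)
```

Calling the function once per piece of pathology corpus data and collecting the results as rows gives a matrix with one pathology vector per row.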
Step 2033: the pathology vectors corresponding to the multiple pieces of pathology corpus data form the pathology matrix.
Multiple vectors can form a matrix. Each pathology vector can be one row vector of the pathology matrix.
Illustratively, the pathology matrix determined by the approach in which the average of the elements of the calculated target term vectors is used as the representation of the target word can be denoted $D_{m \times n}$.
Illustratively, the pathology matrix determined by the approach in which the weight of each term vector is calculated first and the representation of each word is then obtained from that weight can be denoted $E_{m \times n}$.
In another embodiment, step 203 may include: for each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on that piece using a bag-of-words model to obtain its sentence vector; the sentence vectors corresponding to the multiple pieces of pathology corpus data form the pathology matrix.
Illustratively, a Doc2Vec model can be used to process a piece of pathology corpus data to obtain its sentence vector.
Each sentence vector can be used as one row vector of the pathology matrix.
The pathology matrix determined in the embodiment where the sentence vectors corresponding to the multiple pieces of pathology corpus data form the pathology matrix can be denoted $F_{m \times n}$.
The Doc2Vec model mentioned above can be understood as an extension of the Word2Vec method. Training a Doc2Vec model adds a paragraph id, i.e. each sentence in the training corpus has a unique id. Like an ordinary word, the paragraph id is first mapped to a vector, the paragraph vector. The paragraph vector and the word vectors have the same dimension but come from two different vector spaces. In the subsequent calculation, the paragraph vector and the word vectors are summed or concatenated and used as the input of the softmax output layer. During training on one sentence or document, the paragraph id stays unchanged and the same paragraph vector is shared, which means that the semantics of the whole sentence are used each time the probability of a word is predicted.
The different ways of calculating the pathology matrix may each emphasize different aspects of the data.
In one embodiment, step 203 of the present application may further include computing a weighted sum of the matrices $D_{m \times n}$, $E_{m \times n}$ and $F_{m \times n}$ obtained by the embodiments above to obtain the pathology matrix.
Illustratively, step 203 may include:
For each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on each word in that piece using a bag-of-words model to obtain the term vector of each word of that piece; averaging all term vectors in that piece to obtain its pathology vector; the pathology vectors corresponding to the multiple pieces of pathology corpus data form the first pathology matrix;
Here, the process of calculating the first pathology matrix in this embodiment can refer to the description of calculating the matrix $D_{m \times n}$ above and is not repeated here.
For each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on each word in that piece using a bag-of-words model to obtain the term vector of each word of that piece; determining the weight corresponding to each term vector according to the multiple pathology reports to be processed; computing a weighted sum of all term vectors in that piece to obtain its pathology vector; the pathology vectors corresponding to the multiple pieces of pathology corpus data form the second pathology matrix;
Here, the process of calculating the second pathology matrix in this embodiment can refer to the description of calculating the matrix $E_{m \times n}$ above and is not repeated here.
For each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on that piece using a bag-of-words model to obtain its sentence vector; the sentence vectors corresponding to the multiple pieces of pathology corpus data form the third pathology matrix;
Here, the process of calculating the third pathology matrix in this embodiment can refer to the description of calculating the matrix $F_{m \times n}$ above and is not repeated here.
Computing a weighted sum of the first pathology matrix, the second pathology matrix and the third pathology matrix to obtain the pathology matrix.
Illustratively, the first pathology matrix, the second pathology matrix and the third pathology matrix in this embodiment are denoted $A_{m \times n}$, $B_{m \times n}$ and $C_{m \times n}$ respectively.
In this embodiment, the pathology matrix can be denoted $D_{m \times n}$.
Illustratively, the pathology matrix $D_{m \times n}$ can be calculated as:
$$D_{m \times n} = a \cdot A_{m \times n} + b \cdot B_{m \times n} + c \cdot C_{m \times n}$$
Here, a denotes the weight of the first pathology matrix, b denotes the weight of the second pathology matrix, and c denotes the weight of the third pathology matrix.
Illustratively, the weights of the first pathology matrix, the second pathology matrix and the third pathology matrix can each be chosen as a positive integer.
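The matrix combination can be sketched in a few lines of Python. The helper name `combine_matrices`, the toy matrices and the weights 1, 2 and 3 are hypothetical. The sketch also makes explicit one implicit requirement: the three matrices must share the same m×n shape, so the term vectors and sentence vectors need the same dimension.

```python
def combine_matrices(matrices, weights):
    """Element-wise weighted sum of equally shaped matrices (lists of rows)."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    for matrix in matrices:
        if len(matrix) != rows or any(len(row) != cols for row in matrix):
            raise ValueError("all matrices must share the same m x n shape")
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical first, second and third pathology matrices (m = 2, n = 2).
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
C = [[2.0, 2.0], [2.0, 2.0]]
D = combine_matrices([A, B, C], weights=[1, 2, 3])  # a=1, b=2, c=3
```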
Step 204: perform dimensionality reduction on the pathology matrix to determine the pathology characterization matrix corresponding to the multiple pathology reports to be processed, where each row vector of the pathology characterization matrix is the characterization vector of one pathology report to be processed.
Optionally, PCA (principal component analysis) can be used to reduce the dimensionality of the pathology matrix.
Optionally, t-SNE (t-distributed stochastic neighbor embedding) can also be used to reduce the dimensionality of the pathology matrix.
Optionally, PCA can first be applied to the pathology matrix to obtain a preliminary reduced matrix, and t-SNE can then be applied to that preliminary reduced matrix. This two-stage dimensionality reduction can lower the computational complexity and the demand on system resources.
Illustratively, the pathology matrix can be reduced to a matrix of dimension m×2 or m×3, so that the vector corresponding to each pathology report to be processed is a two-dimensional or three-dimensional vector.
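As a rough illustration of the PCA option only, the sketch below reduces an m×n matrix to m×2 in pure Python. It is not the implementation described in this application: the function name `pca_reduce`, the power-iteration approach, the iteration count and the tolerance are all assumptions chosen to keep the example self-contained, and a numerical library would normally be used instead. t-SNE is not shown.

```python
import math
import random


def pca_reduce(matrix, n_components=2, iterations=500, seed=0):
    """Project the rows of `matrix` onto its leading principal components."""
    m, n = len(matrix), len(matrix[0])
    # Centre every column on its mean.
    means = [sum(row[j] for row in matrix) / m for j in range(n)]
    centred = [[row[j] - means[j] for j in range(n)] for row in matrix]
    # Sample covariance matrix (n x n).
    cov = [[sum(r[a] * r[b] for r in centred) / (m - 1) for b in range(n)]
           for a in range(n)]
    total_variance = sum(cov[a][a] for a in range(n))
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        # Power iteration from a fixed pseudo-random start vector.
        vec = [rng.random() + 0.1 for _ in range(n)]
        for _step in range(iterations):
            nxt = [sum(cov[a][b] * vec[b] for b in range(n)) for a in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm <= 1e-12 * total_variance:
                vec = [0.0] * n  # no variance left in any direction
                break
            vec = [x / norm for x in nxt]
        eigenvalue = sum(vec[a] * cov[a][b] * vec[b]
                         for a in range(n) for b in range(n))
        # Deflation: remove the component just found from the covariance.
        cov = [[cov[a][b] - eigenvalue * vec[a] * vec[b] for b in range(n)]
               for a in range(n)]
        components.append(vec)
    return [[sum(r[j] * comp[j] for j in range(n)) for comp in components]
            for r in centred]


# Hypothetical 4 x 3 pathology matrix; most variance lies along the first column.
pathology_matrix = [
    [12.0, 5.0, 1.0],
    [8.0, 5.0, 1.0],
    [10.0, 6.0, 1.0],
    [10.0, 4.0, 1.0],
]
reduced = pca_reduce(pathology_matrix, n_components=2)
```

Each row of `reduced` is then the two-dimensional characterization vector of one report. The sign of each component is arbitrary, as in any PCA.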
In the steps of the embodiment described above, the pathology reports to be processed are first structured and the structured data are then converted into vectors, so that each pathology report to be processed has a numerical representation. The pathology matrix is then reduced in dimensionality, and the resulting low-dimensional matrix makes it easier for the relevant personnel to recognize the similarity between the pathology reports to be processed, so that the pathology characterization matrix can give them pathological guidance.
In other embodiments, as shown in Fig. 5, on the basis of Fig. 2, the pathological data processing method may further include:
Step 205: calculate the cosine value between the row vectors of the pathology characterization matrix;
Step 206: determine the similarity between the pathology reports to be processed according to the cosine value.
Illustratively, the cosine value of two row vectors can be expressed as:
$$\cos\theta = \frac{\sum_{i=1}^{p} x_i y_i}{\sqrt{\sum_{i=1}^{p} x_i^2}\,\sqrt{\sum_{i=1}^{p} y_i^2}}$$
Here, $x_i$ denotes the i-th element of one row vector, $y_i$ denotes the i-th element of the other row vector, and p denotes the number of columns of the pathology characterization matrix, i.e. the dimension of its row vectors.
Illustratively, the closer the cosine value of two row vectors is to one, the smaller the angle between the lines represented by the two row vectors, which means that the two pathology reports to be processed represented by those row vectors are more similar.
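Steps 205 and 206 amount to a pairwise cosine computation over the rows of the pathology characterization matrix. A minimal sketch follows; the function names and the toy matrix are hypothetical, and raising an error for a zero vector is a design choice of this sketch rather than something the text specifies.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equally long row vectors."""
    if len(x) != len(y):
        raise ValueError("row vectors must have the same dimension p")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def similarity_table(rows):
    """Pairwise cosine values between all rows of a characterization matrix."""
    return [[cosine_similarity(a, b) for b in rows] for a in rows]


# Hypothetical 3 x 2 pathology characterization matrix.
characterization = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
table = similarity_table(characterization)
```

In this toy example the first two reports point in the same direction and get cosine 1, while the third is orthogonal to both and gets cosine 0.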
In other embodiments, as shown in Fig. 5, on the basis of Fig. 2, the pathological data processing method may further include:
Step 207: map each row vector of the pathology characterization matrix onto a coordinate system with the same dimension as the row vector to obtain the pathology distribution result.
Illustratively, if the row vectors of the pathology characterization matrix are three-dimensional, the row vectors can be marked one by one in a three-dimensional coordinate system.
Illustratively, if the row vectors of the pathology characterization matrix are two-dimensional, the row vectors can be marked one by one in a two-dimensional coordinate system.
Step 208: output and display the pathology distribution result.
Optionally, if each row vector of the pathology characterization matrix is a three-dimensional vector, the pathology distribution result can be displayed as a three-dimensional plot. If each row vector is a two-dimensional vector, the pathology distribution result can be displayed as a two-dimensional plot.
Displaying the pathology distribution result lets the relevant personnel see the distribution of the pathology reports to be processed more intuitively.
Embodiment three
Based on the same inventive concept, the embodiments of the present application also provide a pathological data processing device corresponding to the pathological data processing method. Because the principle by which the device solves the problem is similar to that of the pathological data processing method described above, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.
Refer to Fig. 6, which is a functional block diagram of the pathological data processing device provided by an embodiment of the present application. The modules of the pathological data processing device in this embodiment are used to perform the steps of the method embodiments above. The pathological data processing device includes an acquisition module 401, a processing module 402, a conversion module 403 and a dimensionality reduction module 404, where:
the acquisition module 401 is configured to obtain multiple pathology reports to be processed;
the processing module 402 is configured to preprocess the multiple pathology reports to be processed to obtain multiple pieces of pathology corpus data;
the conversion module 403 is configured to perform vector conversion on the multiple pieces of pathology corpus data to obtain the pathology matrix of the multiple pathology reports to be processed;
the dimensionality reduction module 404 is configured to perform dimensionality reduction on the pathology matrix to determine the pathology characterization matrix corresponding to the multiple pathology reports to be processed, where each row vector of the pathology characterization matrix represents the characterization vector of one pathology report to be processed.
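The four-module structure can be pictured as a simple pipeline. The sketch below is only one hypothetical way to wire such modules together in Python: the class name `PathologyDataProcessor`, the field names and the stand-in lambdas are assumptions for illustration and do not come from this application.

```python
from dataclasses import dataclass
from typing import Callable, List

Matrix = List[List[float]]


@dataclass
class PathologyDataProcessor:
    """Chains four pluggable stages mirroring modules 401 to 404."""
    acquire: Callable[[], List[str]]                     # acquisition module 401
    preprocess: Callable[[List[str]], List[List[str]]]   # processing module 402
    vectorize: Callable[[List[List[str]]], Matrix]       # conversion module 403
    reduce: Callable[[Matrix], Matrix]                   # dimensionality reduction module 404

    def run(self) -> Matrix:
        reports = self.acquire()
        corpus = self.preprocess(reports)
        pathology_matrix = self.vectorize(corpus)
        return self.reduce(pathology_matrix)


# Stand-in stages (hypothetical): real stages would implement steps 201 to 204.
processor = PathologyDataProcessor(
    acquire=lambda: ["tumor margin negative", "tumor grade two"],
    preprocess=lambda reports: [text.split() for text in reports],
    vectorize=lambda corpus: [
        [float(len(doc)), float(len(set(doc))), 1.0] for doc in corpus
    ],
    reduce=lambda matrix: [row[:2] for row in matrix],
)
characterization_matrix = processor.run()
```

Keeping each stage behind a plain callable means that any of the conversion or reduction variants described in the method embodiments could be swapped in without changing the surrounding pipeline.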
In one possible embodiment, the conversion module 403 in this embodiment includes:
a first obtaining unit, configured to, for each piece of pathology corpus data among the multiple pieces of pathology corpus data, perform vector conversion on each word in that piece using a bag-of-words model to obtain the term vector of each word of that piece;
a second obtaining unit, configured to compute a weighted sum of all the term vectors to obtain the pathology vector of that piece of pathology corpus data;
a composition module, configured to form the pathology matrix from the pathology vectors corresponding to the multiple pieces of pathology corpus data.
In one possible embodiment, the conversion module 403 in this embodiment further includes:
a weight calculation unit, configured to determine the weight corresponding to each term vector according to the multiple pathology reports to be processed;
the second obtaining unit is further configured to compute the weighted sum of all the term vectors using the weight corresponding to each term vector to obtain the pathology vector of that piece of pathology corpus data;
the pathology vectors corresponding to the multiple pieces of pathology corpus data form the pathology matrix.
In one possible embodiment, the weight calculation unit is further configured to:
calculate the target inverse document frequency from the total number of documents in the multiple pathology reports to be processed and the number of those documents that contain the target word corresponding to the target term vector;
calculate the target word frequency from the number of times the target word occurs in each pathology report to be processed;
calculate the weight of the target term vector from the target inverse document frequency and the target word frequency.
In one possible embodiment, the conversion module 403 in this embodiment is further configured to:
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, perform vector conversion on that piece using a bag-of-words model to obtain its sentence vector;
form the pathology matrix from the sentence vectors corresponding to the multiple pieces of pathology corpus data.
In one possible embodiment, the conversion module 403 in this embodiment is further configured to:
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, perform vector conversion on each word in that piece using a bag-of-words model to obtain the term vector of each word of that piece; average all term vectors in that piece to obtain its pathology vector; the pathology vectors corresponding to the multiple pieces of pathology corpus data form the first pathology matrix;
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, perform vector conversion on each word in that piece using a bag-of-words model to obtain the term vector of each word of that piece; determine the weight corresponding to each term vector according to the multiple pathology reports to be processed; compute a weighted sum of all term vectors in that piece to obtain its pathology vector; the pathology vectors corresponding to the multiple pieces of pathology corpus data form the second pathology matrix;
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, perform vector conversion on that piece using a bag-of-words model to obtain its sentence vector; the sentence vectors corresponding to the multiple pieces of pathology corpus data form the third pathology matrix;
compute a weighted sum of the first pathology matrix, the second pathology matrix and the third pathology matrix to obtain the pathology matrix.
In one possible embodiment, the pathological data processing device in this embodiment further includes a similarity calculation module 405, configured to:
calculate the cosine value between the row vectors of the pathology characterization matrix;
determine the similarity between the pathology reports to be processed according to the cosine value.
In one possible embodiment, the pathological data processing device in this embodiment further includes a display module 406, configured to:
map each row vector of the pathology characterization matrix onto a coordinate system with the same dimension as the row vector to obtain the pathology distribution result;
output and display the pathology distribution result.
In addition, an embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the pathological data processing method described in the method embodiments above are performed.
The computer program product of the pathological data processing method provided by the embodiments of the present application includes a computer-readable storage medium storing program code. The instructions included in the program code can be used to perform the steps of the pathological data processing method described in the method embodiments above. For details, refer to the method embodiments above, which are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architectures, functions and operations of devices, methods and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form one independent part, may each exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc. It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
The above are only preferred embodiments of the present application and are not intended to limit it; for those skilled in the art, various modifications and changes may be made to the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present application shall fall within its scope of protection. It should also be noted that similar reference numerals and letters denote similar items in the drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The above are only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (10)

1. A pathological data processing method, characterized by comprising:
obtaining multiple pathology reports to be processed;
preprocessing the multiple pathology reports to be processed to obtain multiple pieces of pathology corpus data;
performing vector conversion on the multiple pieces of pathology corpus data to obtain the pathology matrix of the multiple pathology reports to be processed;
performing dimensionality reduction on the pathology matrix to determine the pathology characterization matrix corresponding to the multiple pathology reports to be processed, wherein each row vector of the pathology characterization matrix represents the characterization vector of one pathology report to be processed.
2. The method according to claim 1, characterized in that the step of performing vector conversion on the multiple pieces of pathology corpus data to obtain the pathology matrix of the multiple pathology reports to be processed comprises:
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on each word in that piece using a bag-of-words model to obtain the term vector of each word of that piece;
computing a weighted sum of all the term vectors to obtain the pathology vector of that piece of pathology corpus data;
forming the pathology matrix from the pathology vectors corresponding to the multiple pieces of pathology corpus data.
3. The method according to claim 2, characterized in that, before computing the weighted sum of all the term vectors to obtain the pathology vector of that piece of pathology corpus data, the method further comprises: determining the weight corresponding to each term vector according to the multiple pathology reports to be processed;
the step of computing the weighted sum of all the term vectors to obtain the pathology vector of that piece of pathology corpus data comprises: computing the weighted sum of all the term vectors using the weight corresponding to each term vector to obtain the pathology vector of that piece of pathology corpus data.
4. The method according to claim 3, characterized in that the step of determining the weight corresponding to each term vector according to the multiple pathology reports to be processed comprises:
calculating the target inverse document frequency from the total number of documents in the multiple pathology reports to be processed and the number of those documents that contain the target word corresponding to the target term vector;
calculating the target word frequency from the number of times the target word occurs in each pathology report to be processed;
calculating the weight of the target term vector from the target inverse document frequency and the target word frequency.
5. The method according to claim 1, characterized in that the step of performing vector conversion on the multiple pieces of pathology corpus data to obtain the pathology matrix of the multiple pathology reports to be processed comprises:
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on that piece using a bag-of-words model to obtain its sentence vector;
forming the pathology matrix from the sentence vectors corresponding to the multiple pieces of pathology corpus data.
6. The method according to claim 1, characterized in that the step of performing vector conversion on the multiple pieces of pathology corpus data to obtain the pathology matrix of the multiple pathology reports to be processed comprises:
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on each word in that piece using a bag-of-words model to obtain the term vector of each word of that piece; averaging all term vectors in that piece to obtain its pathology vector; the pathology vectors corresponding to the multiple pieces of pathology corpus data forming the first pathology matrix;
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on each word in that piece using a bag-of-words model to obtain the term vector of each word of that piece; determining the weight corresponding to each term vector according to the multiple pathology reports to be processed; computing a weighted sum of all term vectors in that piece to obtain its pathology vector; the pathology vectors corresponding to the multiple pieces of pathology corpus data forming the second pathology matrix;
for each piece of pathology corpus data among the multiple pieces of pathology corpus data, performing vector conversion on that piece using a bag-of-words model to obtain its sentence vector; the sentence vectors corresponding to the multiple pieces of pathology corpus data forming the third pathology matrix;
computing a weighted sum of the first pathology matrix, the second pathology matrix and the third pathology matrix to obtain the pathology matrix.
7. The method according to any one of claims 1 to 6, characterized by further comprising:
calculating the cosine value between the row vectors of the pathology characterization matrix;
determining the similarity between the pathology reports to be processed according to the cosine value.
8. The method according to any one of claims 1 to 6, characterized by further comprising:
mapping each row vector of the pathology characterization matrix onto a coordinate system with the same dimension as the row vector to obtain the pathology distribution result;
outputting and displaying the pathology distribution result.
9. A pathological data processing device, characterized by comprising:
an acquisition module, configured to obtain multiple pathology reports to be processed;
a processing module, configured to preprocess the multiple pathology reports to be processed to obtain multiple pieces of pathology corpus data;
a conversion module, configured to perform vector conversion on the multiple pieces of pathology corpus data to obtain the pathology matrix of the multiple pathology reports to be processed;
a dimensionality reduction module, configured to perform dimensionality reduction on the pathology matrix to determine the pathology characterization matrix corresponding to the multiple pathology reports to be processed, wherein each row vector of the pathology characterization matrix represents the characterization vector of one pathology report to be processed.
10. An electronic device, characterized by comprising a processor and a memory, the memory storing machine-readable instructions executable by the processor, wherein, when the electronic device runs, the machine-readable instructions are executed by the processor to perform the steps of the method according to any one of claims 1 to 8.
CN201910822446.5A 2019-08-30 2019-08-30 Pathological data processing method and device and electronic equipment Active CN110517747B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910822446.5A CN110517747B (en) 2019-08-30 2019-08-30 Pathological data processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910822446.5A CN110517747B (en) 2019-08-30 2019-08-30 Pathological data processing method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110517747A true CN110517747A (en) 2019-11-29
CN110517747B CN110517747B (en) 2022-06-03

Family

ID=68630232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910822446.5A Active CN110517747B (en) 2019-08-30 2019-08-30 Pathological data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110517747B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111681749A (en) * 2020-06-22 2020-09-18 韦志永 Pathology department standardized work management and diagnosis consultation system and method
CN112185572A (en) * 2020-09-25 2021-01-05 志诺维思(北京)基因科技有限公司 Tumor specific disease database construction system, method, electronic device and medium
CN113626460A (en) * 2021-07-12 2021-11-09 武汉千屏影像技术有限责任公司 Data interaction method and device for different pathological systems and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102682132A (en) * 2012-05-18 2012-09-19 合一网络技术(北京)有限公司 Method and system for searching information based on word frequency, play amount and creation time
CN106446526A (en) * 2016-08-31 2017-02-22 北京千安哲信息技术有限公司 Electronic medical record entity relation extraction method and apparatus
CN107169259A (en) * 2016-12-12 2017-09-15 为朔生物医学有限公司 Personalized medicine based on collaborative filtering and suggestion determines support system
CN107656952A (en) * 2016-12-30 2018-02-02 青岛中科慧康科技有限公司 The modeling method of parallel intelligent case recommended models
CN107767946A (en) * 2017-09-26 2018-03-06 浙江工业大学 Breast cancer diagnosis system based on PCA and PSO KELM models
CN108763487A (en) * 2018-05-30 2018-11-06 华南理工大学 A kind of word representation method of fusion part of speech and sentence information based on Mean Shift
CN108804423A (en) * 2018-05-30 2018-11-13 平安医疗健康管理股份有限公司 Medical Text character extraction and automatic matching method and system
CN109086265A (en) * 2018-06-29 2018-12-25 厦门快商通信息技术有限公司 A kind of semanteme training method, multi-semantic meaning word disambiguation method in short text
CN109409416A (en) * 2018-09-29 2019-03-01 上海联影智能医疗科技有限公司 Feature vector dimension reduction method and medical image recognition method, apparatus and storage medium
CN109740652A (en) * 2018-12-24 2019-05-10 深圳大学 A kind of pathological image classification method and computer equipment
US20190197105A1 (en) * 2017-12-21 2019-06-27 International Business Machines Corporation Unsupervised neural based hybrid model for sentiment analysis of web/mobile application using public data sources

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102682132A (en) * 2012-05-18 2012-09-19 合一网络技术(北京)有限公司 Method and system for searching information based on word frequency, play amount and creation time
CN106446526A (en) * 2016-08-31 2017-02-22 北京千安哲信息技术有限公司 Electronic medical record entity relation extraction method and apparatus
CN107169259A (en) * 2016-12-12 2017-09-15 为朔生物医学有限公司 Personalized medicine based on collaborative filtering and suggestion determines support system
CN107656952A (en) * 2016-12-30 2018-02-02 青岛中科慧康科技有限公司 The modeling method of parallel intelligent case recommended models
CN107767946A (en) * 2017-09-26 2018-03-06 浙江工业大学 Breast cancer diagnosis system based on PCA and PSO KELM models
US20190197105A1 (en) * 2017-12-21 2019-06-27 International Business Machines Corporation Unsupervised neural based hybrid model for sentiment analysis of web/mobile application using public data sources
CN108763487A (en) * 2018-05-30 2018-11-06 华南理工大学 A kind of word representation method of fusion part of speech and sentence information based on Mean Shift
CN108804423A (en) * 2018-05-30 2018-11-13 平安医疗健康管理股份有限公司 Medical Text character extraction and automatic matching method and system
CN109086265A (en) * 2018-06-29 2018-12-25 厦门快商通信息技术有限公司 A kind of semanteme training method, multi-semantic meaning word disambiguation method in short text
CN109409416A (en) * 2018-09-29 2019-03-01 上海联影智能医疗科技有限公司 Feature vector dimension reduction method and medical image recognition method, apparatus and storage medium
CN109740652A (en) * 2018-12-24 2019-05-10 深圳大学 A kind of pathological image classification method and computer equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
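浦东旭: "Research on an intelligent liver disease auxiliary diagnosis and treatment system based on semantic analysis of medical record text", China Master's and Doctoral Theses Full-text Database (Master's), Medicine and Health Sciences *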

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111681749A (en) * 2020-06-22 2020-09-18 韦志永 Pathology department standardized work management and diagnosis consultation system and method
CN112185572A (en) * 2020-09-25 2021-01-05 志诺维思(北京)基因科技有限公司 Tumor specific disease database construction system, method, electronic device and medium
CN112185572B (en) * 2020-09-25 2024-03-01 志诺维思(北京)基因科技有限公司 Tumor specific disease database construction system, method, electronic equipment and medium
CN113626460A (en) * 2021-07-12 2021-11-09 武汉千屏影像技术有限责任公司 Data interaction method and device for different pathological systems and storage medium
CN113626460B (en) * 2021-07-12 2023-11-03 武汉千屏影像技术有限责任公司 Data interaction method, device and storage medium for different pathology systems

Also Published As

Publication number Publication date
CN110517747B (en) 2022-06-03

Similar Documents

Publication Publication Date Title
US9911211B1 (en) Lens-based user-interface for visualizations of graphs
CN104662491B (en) Automatic gesture recognition for a sensor system
CN110517747A (en) Pathological data processing method, device and electronic equipment
CN109214002A (en) Transcription comparison method, device and computer storage medium
Shen et al. Discovering the potential opportunities of scientific advancement and technological innovation: A case study of smart health monitoring technology
Adams et al. Thematic signatures for cleansing and enriching place-related linked data
Govindarajan et al. Intelligent collaborative patent mining using excessive topic generation
CN113449187A (en) Product recommendation method, device and equipment based on double portraits and storage medium
JP6247775B2 (en) Time series prediction apparatus and time series prediction method
CN107391545A (en) Method for classifying users, input method and device
CN109376270A (en) Data retrieval method and device
CN112306835A (en) User data monitoring and analyzing method, device, equipment and medium
CN108829804A (en) High-dimensional data similarity join query method and device based on distance partition tree
CN108460455A (en) Model treatment method and device
CN110033382A (en) Insurance business processing method, device and equipment
CN106991084B (en) Document evaluation method and device
CN110457707A (en) Method, device, electronic equipment and readable storage medium for extracting notional-word keywords
CN106953937A (en) Uniform resource locator (URL) conversion method and device
CN109543959A (en) Review chain generation method, device, computer equipment and storage medium
CN109144980A (en) Metadata management method, device and electronic equipment
CN110457430A (en) Text traceability detection method, device and equipment
CN107085498A (en) Method and apparatus for inputting numerical values
Agranovsky et al. A multi-resolution interpolation scheme for pathline based Lagrangian flow representations
CN108985908A (en) Real estate information sharing method, device and computer-readable storage medium
CN108846067A (en) High-dimensional data similarity join query method and device based on mapping space partitioning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant