CN109033413A - Neural-network-based method for matching requirement documents and service documents - Google Patents

Neural-network-based method for matching requirement documents and service documents (Download PDF)

Info

Publication number
CN109033413A
CN109033413A (application CN201810883232.4A)
Authority
CN
China
Prior art keywords
documents
similarity
service
requirement documents
document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810883232.4A
Other languages
Chinese (zh)
Other versions
CN109033413B (en)
Inventor
邹祥文 (Zou Xiangwen)
吴悦 (Wu Yue)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Federation Of Scientific And Technological Enterprises
University of Shanghai for Science and Technology
Original Assignee
Shanghai Federation Of Scientific And Technological Enterprises
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Federation Of Scientific And Technological Enterprises and University of Shanghai for Science and Technology
Publication of CN109033413A
Application granted
Publication of CN109033413B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/205: Parsing
    • G06F 40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs


Abstract

The present invention relates to a neural-network-based method for matching requirement documents and service documents. Exploiting the structure of requirement and service documents, the method extracts the relevant document sections, converts them to vectors with paragraph embedding, segments the text with a long short-term memory neural network, computes similarity on the segments with a convolutional neural network, and takes a weighted average of the similarities of all segments to obtain the overall similarity between the requirement document and the service document.

Description

Neural-network-based method for matching requirement documents and service documents
Technical field
The present invention relates to the field of computer natural language processing, in particular to the matching of requirement documents and service documents, and specifically to a neural-network-based method for matching requirement documents and service documents.
Background technique
With the rapid development and spread of the Internet, modern enterprise production has come to rely on technical cooperation. To find a cooperating enterprise, the demanding party writes a requirement document describing its needs and the technical party writes a service document describing its technical capability; connecting the two over the Internet accelerates the discovery of partner enterprises and reduces the time and labor cost to both.
An enterprise requirement document contains the problem the enterprise needs solved and the targets that a solution must reach. An enterprise service document contains an overview of the method for solving the problem, experience on similar projects, the technical reserves available for undertaking the project, related patents obtained, the proposed research method, the main technical indicators to be achieved, and the project schedule. How to quickly find cooperation partners for enterprises by matching requirement documents with service documents has become a current hot and difficult problem.
A commonly used document-matching method converts each text into a vector space model (VSM) representation and, on the basis of the term frequency-inverse document frequency (TF-IDF) model, computes the similarity of two documents with a distance function: the smaller the distance, the more similar the documents. However, a requirement document may contain several demands that a cooperating enterprise must satisfy simultaneously, while a service document may enumerate the technical services the enterprise can currently provide at best; a service document is a correct match only when it satisfies most or all of the demands in the requirement document, and existing matching methods fall short in this respect.
Summary of the invention
To overcome the shortcomings of current methods for matching requirement documents with service documents and to improve matching accuracy, the present invention proposes a neural-network-based method for matching requirement documents and service documents. Exploiting the particular structure of requirement and service documents, the method extracts the document content, performs matching at a finer granularity, and finally combines the fine-grained results into an overall matching score.
To achieve the above objectives, the present invention adopts the following technical solution:
Step 1: Input one requirement document and one service document as the documents to be matched. The requirement document contains the problem the enterprise needs solved and the targets a solution must reach; the service document contains an overview of the method for solving the problem, experience on similar projects, the technical reserves available for undertaking the project, related patents obtained, the proposed research method, the main technical indicators to be achieved, and the project schedule;
Step 2: Judge from the document content whether the input document is a requirement document or a service document;
Step 2.1: A document that contains the sections "problem the enterprise needs solved" and "targets a solution must reach" is a requirement document; extract these two sections;
Step 2.2: A document that contains the sections "overview of the method for solving the problem", "experience on similar projects", "technical reserves available for undertaking the project", "related patents obtained", "proposed research method", "main technical indicators to be achieved", and "project schedule" is a service document; extract these sections;
Step 2.3: The final similarity of the requirement document and the service document is computed over all extracted requirement-document sections and all extracted service-document sections; in the following, the "problem to be solved" section of the requirement document and the "overview of the solution method" section of the service document are taken as the running example;
Step 3: Apply paragraph embedding (PE) to the sentences of the requirement document's "problem to be solved" section and the service document's "overview of the solution method" section, obtaining sentence vectors;
Step 4: Detect document cut points with a long short-term memory (LSTM) network;
Step 4.1: Feed the sentence vectors into a trained LSTM network; from the network's output, judge whether the previous sentence is a cut point;
Step 4.2: Use the cut points to split each section into several passages with distinct meanings: the passages of the requirement document's problem section are the individual demands, and the passages of the service document's solution section are the individual methods.
Step 5: Construct the similarity-model input according to the processing result type;
Step 5.1: For a requirement document, pass all sentences of one demand through the PE model and assemble the resulting sentence vectors into one matrix, while assembling all sentence vectors of one method into another matrix;
Step 5.2: For a service document, pass all sentences of one method through the PE model and assemble the resulting sentence vectors into one matrix, while assembling all sentence vectors of one demand into another matrix;
Step 6: Feed the two matrices into a trained convolutional neural network (CNN) to compute their similarity. Each demand is crossed with each method to compute a similarity, and for each demand the maximum similarity is taken as that demand's final value;
Step 7: Obtain the final similarity as a weighted average of the similarity values;
Step 7.1: After the final value of every demand has been obtained, take the weighted average as the final similarity value of the requirement document's "problem to be solved" section;
Step 7.2: The steps above use the requirement document's "problem to be solved" section and the service document's "overview of the solution method" section as the example; since the requirement document also contains the "targets a solution must reach" section, the same procedure is applied to obtain that section's similarity, and the weighted average of the two section similarities is the final similarity of the requirement document and the service document;
Step 8: Compare the final similarity with a preset threshold: if it exceeds the threshold the two documents match; otherwise they do not.
Here the cut point in step 4 means that a sentence of the document and the following sentence differ in meaning; the earlier sentence is then a cut point. The history-update rule of the LSTM network is:

C_t = 0 (when h_{t-1} → 1)

where C_t is the history (cell state) of the LSTM network at time t and h_{t-1} is the output of the previous state. When updating the history, if the output obtained at the previous time indicates a cut point, C_t is reset to 0; otherwise it is left unchanged.
Compared with the prior art, the present invention has the following prominent substantive features and significant advances. The text-segmentation method splits the requirement document and the service document into concrete demands and services, and the matching degree is computed on these, which solves the problem that most or all demands must be satisfied when matching requirement and service documents. Indicator figures appearing in the documents are structured into a separate dimension appended to the input matrix, which removes the influence of indicator information on the matching result. After the similarity of every pair of segments has been obtained, cross-matching is performed and the best match is taken, which removes the influence of differing user writing habits on the matching result.
Detailed description of the invention
Fig. 1 is the flow chart of the present invention.
Fig. 2 shows the convolutional network used for similarity calculation.
Fig. 3 shows the convolution operations in the similarity calculation.
Fig. 4 shows the similarity layer in the similarity calculation.
Fig. 5 shows the cross-matching.
Specific embodiment
Embodiment 1
The technical solution of the present invention is described clearly and completely below with reference to the accompanying drawings.
The present invention proposes a method for matching requirement documents and service documents; the specific flow chart is shown in Fig. 1, and the concrete steps are as follows:
Step 1: Input one requirement document and one service document as the documents to be matched. The requirement document contains the problem the enterprise needs solved and the targets a solution must reach; the service document contains an overview of the method for solving the problem, experience on similar projects, the technical reserves available for undertaking the project, related patents obtained, the proposed research method, the main technical indicators to be achieved, and the project schedule;
Step 2: Judge from the document content whether the input document is a requirement document or a service document;
Step 2.1: A document that contains the sections "problem the enterprise needs solved" and "targets a solution must reach" is a requirement document; extract these two sections;
Step 2.2: A document that contains the sections "overview of the method for solving the problem", "experience on similar projects", "technical reserves available for undertaking the project", "related patents obtained", "proposed research method", "main technical indicators to be achieved", and "project schedule" is a service document; extract these sections;
Step 2.3: The final similarity of the requirement document and the service document is computed over all extracted requirement-document sections and all extracted service-document sections; in the following, the "problem to be solved" section of the requirement document and the "overview of the solution method" section of the service document are taken as the running example;
Step 3: Apply paragraph embedding (PE) to the sentences of the requirement document's "problem to be solved" section and the service document's "overview of the solution method" section, obtaining sentence vectors;
In the word embedding (WE) model, each word is mapped to a unique column of a matrix W, the column index being the word's position in the vocabulary; the word vectors are then concatenated and used to predict the next word in the sentence. Given a word sequence w_1, w_2, w_3, ..., w_T, the objective of the word embedding model is to maximize the average log probability, computed as in formula (I):

1/T Σ_{t=k}^{T-k} log p(w_t | w_{t-k}, ..., w_{t+k})   (I)

where the probability p is the probability of correctly predicting the next word.
The prediction task is carried out by a multi-class classifier such as softmax, computed as in formula (II):

p(w_t | w_{t-k}, ..., w_{t+k}) = exp(y_{w_t}) / Σ_i exp(y_i)   (II)

For each output word i, y_i is the unnormalized log probability, computed as in formula (III):

y = b + U·h(w_{t-k}, ..., w_{t+k}; W)   (III)

where U and b are the parameters of the softmax classifier, and h is built by concatenating or averaging the word vectors extracted from W.
The PE model is inspired by WE: paragraph embeddings are likewise used to predict the next word in a sentence. Each paragraph is mapped to a unique column of a matrix D, and each word to a unique column of a matrix W. Compared with the WE model, the only change PE makes is in formula (III): h is built by concatenating or averaging vectors extracted from both W and D instead of from W alone.
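As a concrete illustration of formulas (II) and (III), the following minimal numpy sketch computes the softmax prediction from concatenated context-word vectors and a paragraph vector. The dimensions are toy values and the parameters are random and untrained; all variable names are illustrative, not from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

V, P, d = 8, 3, 4             # vocabulary size, paragraph count, embedding dim
W = rng.normal(size=(d, V))   # word-embedding matrix: one column per word
D = rng.normal(size=(d, P))   # paragraph-embedding matrix: one column per paragraph
k = 2                         # context window: 2k context words

U = rng.normal(size=(V, d * (2 * k + 1)))  # softmax weights over [2k words + paragraph]
b = np.zeros(V)

def predict_next(context_ids, par_id):
    """Formula (III): h concatenates the context word vectors and the
    paragraph vector; formula (II): softmax over the vocabulary."""
    h = np.concatenate([W[:, i] for i in context_ids] + [D[:, par_id]])
    y = b + U @ h                 # unnormalized log probabilities
    p = np.exp(y - y.max())
    return p / p.sum()            # softmax probabilities

probs = predict_next([1, 3, 5, 2], par_id=0)
```

With trained parameters, the paragraph column of D acts as the "memory" that distinguishes otherwise identical contexts in different paragraphs.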
Step 4: Detect document cut points with a long short-term memory (LSTM) network;
Step 4.1: Feed the sentence vectors into a trained LSTM network; from the network's output, judge whether the previous sentence is a cut point;
Step 4.2: Use the cut points to split each section into several passages with distinct meanings: the passages of the requirement document's problem section are the individual demands, and the passages of the service document's solution section are the individual methods.
The LSTM network contains three kinds of gates: the forget gate, the input gate, and the output gate. Their roles are as follows:
Forget gate: the forget gate decides how to treat the stored history. It operates on the current input and the previous state and passes the result through a sigmoid layer with output range [0, 1]: an output of 0 discards the history, an output of 1 retains it. Whether to discard is judged with formula (IV):

f_t = σ(W_f·[h_{t-1}, x_t] + b_f)   (IV)

where σ is the sigmoid function, x is the vector obtained from the PE model, h is the output used to judge whether a sentence is a cut point, W_f is an LSTM connection-weight matrix, b_f is a bias, and f_t determines the information to be forgotten at time t.
Input gate: the input gate decides how the history is updated, i.e. whether the current input is written into the history. It contains a sigmoid layer and a tanh layer: the sigmoid layer decides what to update and the tanh layer generates the new candidate values, as in formulas (V) and (VI):

i_t = σ(W_i·[h_{t-1}, x_t] + b_i)   (V)
C̃_t = tanh(W_C·[h_{t-1}, x_t] + b_C)   (VI)

where i_t determines the values to update, C̃_t is the candidate history, h is the output used to judge cut points, W_i and W_C are LSTM connection-weight matrices, b_i and b_C are biases, and C_t is the history of the LSTM network at time t.
The history from the forget gate and the update candidate from the input gate are combined with formula (VII):

C_t = f_t * C_{t-1} + i_t * C̃_t   (VII)

where C is the history of the LSTM network, f_t is given by formula (IV) and determines the information forgotten at time t, and i_t is given by formula (V) and determines the values updated.
Output gate: the output gate controls the information emitted by the current node. A sigmoid layer first decides which information to output, and the result is then multiplied with the tanh of the cell state, as in formulas (VIII) and (IX):

o_t = σ(W_o·[h_{t-1}, x_t] + b_o)   (VIII)
h_t = o_t * tanh(C_t)   (IX)

where σ is the sigmoid function, x is the vector obtained from the PE model, h is the output used to judge whether a sentence is a cut point, W_o is an LSTM connection-weight matrix, and b_o is a bias.
The LSTM output is passed through a further sigmoid layer so that it lies in [0, 1]; an output close to 1 indicates that the previous node is a cut point, otherwise it is a continuation point.
When the history is updated with formula (X), if the output obtained at the previous time indicates a cut point, C_t is reset to 0; otherwise it is left unchanged:

C_t = 0 (when h_{t-1} → 1)   (X)

In formulas (IV) through (X), σ denotes the sigmoid function, x the input, h the output used to judge cut points, W the connection weights, and b the biases.
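The gate formulas (IV) through (IX) and the cut-point reset (X) can be sketched as a single numpy step function. This is a toy illustration with random, untrained weights, not the patented implementation; all names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d_in, d_h = 4, 3                     # sentence-vector and hidden sizes (toy)
rng = np.random.default_rng(1)
# One weight matrix and one bias per gate, acting on [h_{t-1}, x_t].
Wf, Wi, Wc, Wo = (rng.normal(size=(d_h, d_h + d_in)) for _ in range(4))
bf, bi, bc, bo = (np.zeros(d_h) for _ in range(4))

def lstm_step(x_t, h_prev, C_prev, prev_was_cut):
    """One step of formulas (IV)-(IX), plus the history reset of
    formula (X) when the previous sentence was judged a cut point."""
    if prev_was_cut:                 # formula (X): discard history at a cut
        C_prev = np.zeros_like(C_prev)
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(Wf @ z + bf)         # (IV)  forget gate
    i = sigmoid(Wi @ z + bi)         # (V)   input gate
    C_tilde = np.tanh(Wc @ z + bc)   # (VI)  candidate history
    C = f * C_prev + i * C_tilde     # (VII) history update
    o = sigmoid(Wo @ z + bo)         # (VIII) output gate
    h = o * np.tanh(C)               # (IX)  output
    return h, C

h, C = np.zeros(d_h), np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):        # five toy sentence vectors
    h, C = lstm_step(x, h, C, prev_was_cut=False)
```

In the patented method the final h would additionally pass through a sigmoid layer whose output near 1 marks the previous sentence as a cut point.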
Step 5: Construct the similarity-model input according to the processing result type;
Step 5.1: For a requirement document, pass all sentences of one demand through the PE model and assemble the resulting sentence vectors into one matrix, while assembling all sentence vectors of one method into another matrix;
Step 5.2: For a service document, pass all sentences of one method through the PE model and assemble the resulting sentence vectors into one matrix, while assembling all sentence vectors of one demand into another matrix;
Step 6: Feed the two matrices into a trained convolutional neural network (CNN) to compute their similarity. Each demand is crossed with each method to compute a similarity, and for each demand the maximum similarity is taken as that demand's final value;
The CNN model of the present invention is shown in Fig. 2.
The CNN consists of an input layer, an output layer, convolutional layers, and a fully connected layer.
Input layer: the input layer acts directly on the input matrices, which in the present invention are the sentence matrices of the segmented text produced by the PE model.
Output layer: the output after CNN processing, which in the present invention is the similarity of the two text segments.
Convolutional layer: extracts features from the input and consists of convolution and sampling (pooling) layers. The convolution layers extract features from the input data, different convolution kernels extracting different features. The sampling layers reduce the data while retaining the important information, which speeds up processing; sampling neurons within the same layer share weights. The sampling layers use the sigmoid function as the activation function, which gives them shift invariance.
After the segments have been obtained, the text is tokenized and the words with high TF-IDF scores are kept. Because demands and services frequently contain indicator figures, all numbers are also kept; after deduplication the numbers form a separate dimension. Each sentence of the segmented text is then processed with the PE model, and the resulting sentence vectors are assembled into a matrix.
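A pure-Python sketch of this filtering step, keeping the highest-scoring TF-IDF words plus every numeric token, might look as follows. The tokenization, scoring details, and cutoff are assumptions for illustration, not taken from the patent.

```python
import math
import re
from collections import Counter

def tfidf_filter(docs, keep_top=3):
    """Keep, per document, the keep_top words with the highest TF-IDF
    score, plus every numeric token (indicator figures), deduplicated."""
    tokenized = [re.findall(r"\w+", d.lower()) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(w for toks in tokenized for w in set(toks))
    kept = []
    for toks in tokenized:
        tf = Counter(toks)
        score = {w: (tf[w] / len(toks)) * math.log(n / df[w]) for w in tf}
        top = sorted(score, key=score.get, reverse=True)[:keep_top]
        numbers = list(dict.fromkeys(w for w in toks if w.isdigit()))
        kept.append(top + [x for x in numbers if x not in top])
    return kept

docs = ["reduce latency below 20 ms",
        "improve accuracy to 95 percent",
        "reduce cost"]
result = tfidf_filter(docs)
```

The numeric tokens survive regardless of their TF-IDF score, matching the patent's observation that indicator figures carry matching-relevant information.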
The matrices formed from the requirement document and the service document first pass through their respective convolutional layers; after convolution they are joined by a similarity layer, and the similarity is finally output through a fully connected layer.
To extract as many text features as possible, two kinds of convolution are used, as shown in Fig. 3: on the left, a window of size 2 covers entire word vectors; on the right, the window size is also 2 but only one dimension of the word vectors is covered at a time. In the actual experiments, window sizes of 1, Dim/2, and ∞ are used.
In the sampling layer, max pooling, min pooling, and mean pooling are applied to each of the two convolution results; different pooling methods collect different information, which facilitates the subsequent processing.
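The three pooling operations over a toy convolution output can be sketched in a few lines of numpy (shapes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
conv_out = rng.normal(size=(6, 5))   # toy convolution output: 6 positions x 5 filters

# Pool over the position axis with all three methods, then stack the
# three pooled vectors into a 3 x 5 matrix for the similarity layer.
pooled = np.stack([conv_out.max(axis=0),
                   conv_out.min(axis=0),
                   conv_out.mean(axis=0)])
```

Each row of `pooled` summarizes the same filters from a different angle, which is why the similarity layer compares all three against each other.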
The similarity measure used by the similarity layer is cosine similarity. Since three pooling methods (max, min, mean) were used, the pooled results are compared with one another: because each sampled result is a matrix, every row of one matrix is compared with every row of the other, and every column with every column, as shown in Fig. 4. For example, suppose the max-pooled result is an N × M matrix: row i of one matrix is compared with every row of the other, and column j with every column, and the collected results form the similarity layer. The similarity of one entire matrix with the other is also computed; since the row-wise and column-wise comparisons produce many more values than the whole-matrix comparison, the whole-matrix similarity is replicated so that the three groups carry equal weight. Finally a fully connected layer outputs the similarity result.
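Assuming the pooled results are small matrices, the row-wise, column-wise, and whole-matrix cosine comparisons of the similarity layer can be sketched as follows; the exact replication count for equal weighting is an illustrative choice, not specified in the patent.

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_features(A, B):
    """Cosine similarities of every row pair, every column pair, and of
    the flattened matrices; the whole-matrix value is replicated so the
    three groups contribute comparable weight."""
    rows = [cos(ra, rb) for ra in A for rb in B]
    cols = [cos(ca, cb) for ca in A.T for cb in B.T]
    whole = [cos(A.ravel(), B.ravel())] * ((len(rows) + len(cols)) // 2)
    return np.array(rows + cols + whole)

rng = np.random.default_rng(3)
A, B = rng.normal(size=(3, 5)), rng.normal(size=(3, 5))
feats = similarity_features(A, B)   # feature vector fed to the dense layer
```

The resulting feature vector is what the final fully connected layer would consume to emit a single similarity score.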
Fully connected layer: as in a traditional neural network; the present invention uses one fully connected layer before the output.
Step 7: Obtain the final similarity as a weighted average of the similarity values;
Step 7.1: After the final value of every demand has been obtained, take the weighted average as the final similarity value of the requirement document's "problem to be solved" section;
Step 7.2: The steps above use the requirement document's "problem to be solved" section and the service document's "overview of the solution method" section as the example; since the requirement document also contains the "targets a solution must reach" section, the same procedure is applied to obtain that section's similarity, and the weighted average of the two section similarities is the final similarity of the requirement document and the service document;
The final similarity is computed on the segmentation results of every requirement-document section against every service-document section, as shown in Fig. 5. The requirement document has only two sections, the problem to be solved and the targets a solution must reach, so after each section has been segmented its segments are crossed with the segments of every service-document section to compute similarities, and the maximum of the crossed results is taken as that section's matching value. For example, if the "problem to be solved" section of the requirement document is split into N segments and the solution-overview section of the service document into M segments, the cross-computation yields N × M matching results; for each requirement-document segment the maximum similarity is taken as its final value, and after the final values of all segments have been obtained their weighted average is the final similarity of the "problem to be solved" section. Likewise, the best crossed result is sought between the "problem to be solved" section of the requirement document and every section of the service document.
The above steps take the requirement document's "problem to be solved" section and the service document's "overview of the solution method" section as the example; the requirement document also contains the "targets a solution must reach" section, whose similarity is obtained with the same procedure, and the weighted average of the two section similarities is the final similarity of the requirement document and the service document.
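The N × M cross-matching and per-demand maximum can be sketched as follows, with cosine similarity standing in for the trained CNN and a uniform weighted average assumed:

```python
import numpy as np

def section_similarity(demand_segs, method_segs, sim):
    """Cross every demand segment with every method segment, take each
    demand's best match, and average (uniform weights assumed here)."""
    scores = np.array([[sim(d, m) for m in method_segs] for d in demand_segs])
    best_per_demand = scores.max(axis=1)   # N x M results -> N maxima
    return best_per_demand.mean()

# Toy segments as vectors; cosine as a stand-in for the CNN similarity.
cos = lambda a, b: float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
rng = np.random.default_rng(4)
demands = rng.normal(size=(4, 6))          # N = 4 demand segments
methods = rng.normal(size=(3, 6))          # M = 3 method segments
final = section_similarity(demands, methods, cos)
```

Taking the maximum per demand (rather than per method) is what enforces that every demand find some matching service, the core requirement stated in the background section.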
Step 8: Compare the final similarity with a preset threshold: if it exceeds the threshold the two documents match; otherwise they do not.
Here the cut point in step 4 means that a sentence of the document and the following sentence differ in meaning; the earlier sentence is then a cut point. The history-update rule of the LSTM network is:

C_t = 0 (when h_{t-1} → 1)

where C_t is the history of the LSTM network at time t and h_{t-1} is the output of the previous state, used to judge whether a sentence is a cut point. When updating the history, if the output obtained at the previous time indicates a cut point, C_t is reset to 0; otherwise it is left unchanged.
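Under the assumption that the segmenter and the similarity model are available as callables, the overall flow of steps 2 through 8 can be sketched end to end. The `seg` and `sim` stand-ins below (split on ";", word-overlap score) are toy substitutes for the LSTM segmenter and the CNN similarity; every name here is illustrative.

```python
def match_documents(req_sections, srv_sections, seg, sim, threshold=0.5):
    """Steps 2-8 in miniature: segment each extracted section, cross-match
    demand segments with method segments, keep each demand's best score,
    average within a section, then average over sections and threshold."""
    section_scores = []
    for req in req_sections:                 # e.g. problem, targets
        best = []
        for d in seg(req):                   # each demand segment
            scores = [sim(d, m) for srv in srv_sections for m in seg(srv)]
            best.append(max(scores))         # best matching service segment
        section_scores.append(sum(best) / len(best))
    final = sum(section_scores) / len(section_scores)
    return final, final > threshold

# Toy stand-ins: split on ';' and score by word overlap (Jaccard).
seg = lambda text: [s.strip() for s in text.split(";") if s.strip()]
sim = lambda a, b: (len(set(a.split()) & set(b.split()))
                    / max(len(set(a.split()) | set(b.split())), 1))

req = ["reduce latency; improve accuracy", "latency below 20 ms"]
srv = ["we reduce latency with caching; we improve accuracy with tuning"]
score, matched = match_documents(req, srv, seg, sim)
```

Swapping in the trained PE + LSTM segmenter for `seg`, the trained CNN for `sim`, and learned section weights for the uniform averages recovers the patented pipeline.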

Claims (2)

1. A neural-network-based method for matching requirement documents and service documents, characterized in that the operating steps are as follows:
Step 1: as document to be matched, requirement documents include that enterprise needs to solve for one requirement documents of input and a service documents Certainly the problem of and index to be achieved is needed when solving the problems, such as this, service documents then include to summarize the side for solving the problem technology Method, the experience for solving similar item, accept technological reserve, related patents obtained that this project has, it is quasi- take grind Study carefully method, the technical indicator mainly realized and project schedule plan;
Step 2: judging that input document is requirement documents or service documents according to document content;
Step 2.1: it is then demand that indexing section to be achieved is needed when including enterprise's problem to be solved and solving the problems, such as this Document extracts enterprise's problem to be solved and needs indexing section to be achieved when solving the problems, such as this;
Step 2.2: having including summarizing the method for solving the problem technology, the experience for solving similar item, accepting this project Technological reserve, related patents obtained, the quasi- research method taken, the technical indicator mainly realized and project schedule plan Part is then service documents, extracts the method for solving the problem technology of summarizing, the experience for solving similar item, accepts this project Technological reserve, related patents obtained, the quasi- research method taken, the technical indicator and project process mainly realized having Plan part;
Step 2.3: the similarity of final requirement documents and service documents will extract part and all clothes to all requirement documents Business document extracts part and calculates similarity, and the general introduction of the problem to be solved and service documents of requirement documents is taken to solve to be somebody's turn to do below For the method for problem technology;
Step 3: the method that the general introduction of problem to be solved part and service documents to requirement documents solves the problem technology Sentence in part carries out paragraph insertion processing, obtains sentence vector;
Step 4: detect document segmentation points with a long short-term memory (LSTM) network;
Step 4.1: feed the obtained sentence vectors into a trained LSTM network, and judge from its output whether the preceding sentence is a segmentation point;
Step 4.2: split each part at the segmentation points into several passages of distinct meaning; each passage of the requirement document's problem part is one demand, and each passage of the service document's solution part is one method.
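Step 4.2 can be sketched as follows, assuming the trained LSTM's per-sentence segmentation decisions are already available as a boolean list (the LSTM itself is not reproduced here):

```python
def split_at_cut_points(sentences, is_cut_point):
    """Split a list of sentences into passages of distinct meaning.

    is_cut_point[i] is the (assumed) trained LSTM's decision that
    sentence i is a segmentation point, i.e. the passage ends after it.
    """
    passages, current = [], []
    for sent, cut in zip(sentences, is_cut_point):
        current.append(sent)
        if cut:              # passage boundary: close the current passage
            passages.append(current)
            current = []
    if current:              # trailing sentences form the last passage
        passages.append(current)
    return passages

demands = split_at_cut_points(["s1", "s2", "s3", "s4"],
                              [False, True, False, False])
# demands == [["s1", "s2"], ["s3", "s4"]]
```

Applied to the requirement document's problem part this yields the demands; applied to the service document's solution part it yields the methods.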
Step 5: construct the input of the similarity model according to the type of the processed document;
Step 5.1: for a requirement document, pass all sentences of one demand through the PE (paragraph embedding) model and stack the resulting sentence vectors into one matrix, while the sentence vectors of one method form the other matrix;
Step 5.2: for a service document, pass all sentences of one method through the PE model and stack the resulting sentence vectors into one matrix, while the sentence vectors of one demand form the other matrix;
Step 6: feed the two matrices into a trained convolutional neural network to compute their similarity; every demand is paired with every method, and for each demand the maximum similarity over all methods is taken as that demand's final value;
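The cross-pairing and per-demand maximum of step 6 can be sketched as below. The trained CNN's matrix-to-matrix similarity is stood in for by cosine similarity between single summary vectors, purely so the pairing logic is runnable; the patent's actual scorer is the CNN:

```python
def cosine(u, v):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def demand_scores(demand_vecs, method_vecs):
    # Pair every demand with every method; for each demand keep the
    # maximum similarity over all methods as that demand's final value.
    return [max(cosine(d, m) for m in method_vecs) for d in demand_vecs]

scores = demand_scores([[1.0, 0.0], [0.0, 1.0]],
                       [[1.0, 0.0], [0.5, 0.5]])
# scores[0] == 1.0: the first demand matches the first method exactly
```

Taking the maximum rather than the mean means a demand is considered covered as long as at least one method in the service document addresses it.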
Step 7: obtain the final similarity by a weighted average of the similarity values;
Step 7.1: after the final value of each demand is obtained, take the weighted average of these values as the final similarity of the problem-to-be-solved part of the requirement document;
Step 7.2: the steps above take the problem-to-be-solved part of the requirement document and the technical-method overview of the service document as the example; since a requirement document also contains the indicators to be achieved in solving the problem, the same procedure yields the similarity of the indicator part, and the weighted average of the two part similarities is the final similarity between the requirement document and the service document;
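Step 7's combination is a plain weighted mean. The claim does not fix the weights, so the equal weights below are an illustrative choice only:

```python
def weighted_average(values, weights):
    # Weighted mean of per-demand (or per-part) similarity values.
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

# Combine the two parts of the requirement document; 0.8 and 0.6 are
# made-up part similarities and 0.5/0.5 are assumed weights:
problem_sim = 0.8    # problem-to-be-solved part vs. technical-method overview
indicator_sim = 0.6  # indicator part, computed by the same procedure
final_sim = weighted_average([problem_sim, indicator_sim], [0.5, 0.5])
# final_sim is approximately 0.7
```

The same function serves both step 7.1 (averaging demand values within a part) and step 7.2 (averaging the two part similarities).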
Step 8: compare the final similarity with a preset threshold; if it is greater than the threshold the two documents match, and if it is less than the threshold they do not match.
2. The neural-network-based requirement-document and service-document matching method according to claim 1, characterized in that:
a segmentation point in step 4 means that the meaning of a sentence differs from that of the sentence following it; the earlier of the two sentences is then a segmentation point. The history-update formula of the LSTM network is:
Ct = 0 (when ht-1 → 1)
where Ct is the history information of the LSTM network at time t, and ht-1 is the output of the previous state, which judges whether a segmentation point occurred;
when updating the history, if the output at the previous time step indicates a segmentation point, Ct is reset to 0; otherwise no reset is performed.
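Claim 2's reset rule amounts to a single conditional in the cell-state update. In the sketch below, c_candidate stands in for whatever value the ordinary LSTM update would produce (that update itself is not part of the claim's formula):

```python
def update_history(c_candidate, h_prev_is_cut):
    # Claim-2 rule: if the previous output flagged a segmentation point
    # (h_{t-1} -> 1), the history C_t is reset to 0 so the next passage
    # starts with no carried-over context; otherwise the ordinary LSTM
    # history update (represented by c_candidate) applies unchanged.
    return 0.0 if h_prev_is_cut else c_candidate

c_t = update_history(0.9, h_prev_is_cut=True)
# c_t == 0.0: the history resets at a segmentation point
```

Resetting the cell state at topic boundaries keeps sentences of one passage from influencing the segmentation decisions of the next.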
CN201810883232.4A 2018-03-12 2018-08-06 Neural network-based demand document and service document matching method Active CN109033413B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810200624 2018-03-12
CN2018102006246 2018-03-12

Publications (2)

Publication Number Publication Date
CN109033413A true CN109033413A (en) 2018-12-18
CN109033413B CN109033413B (en) 2022-12-23

Family

ID=64649584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810883232.4A Active CN109033413B (en) 2018-03-12 2018-08-06 Neural network-based demand document and service document matching method

Country Status (1)

Country Link
CN (1) CN109033413B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108595409A (en) * 2018-03-16 2018-09-28 上海大学 Neural-network-based requirement document and service document matching method
WO2022061833A1 (en) * 2020-09-27 2022-03-31 西门子股份公司 Text similarity determination method and apparatus and industrial diagnosis method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106502985A (en) * 2016-10-20 2017-03-15 清华大学 A neural network modeling method and device for title generation
CN106528528A (en) * 2016-10-18 2017-03-22 哈尔滨工业大学深圳研究生院 A text sentiment analysis method and device
CN107133202A (en) * 2017-06-01 2017-09-05 北京百度网讯科技有限公司 Artificial-intelligence-based text verification method and device
CN107169035A (en) * 2017-04-19 2017-09-15 华南理工大学 A text classification method combining long short-term memory networks and convolutional neural networks
CN107291871A (en) * 2017-06-15 2017-10-24 北京百度网讯科技有限公司 Artificial-intelligence-based matching degree assessment method, device and medium for multi-domain information
CN107679234A (en) * 2017-10-24 2018-02-09 上海携程国际旅行社有限公司 Customer service information providing method and device, electronic device, and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIANGWEN ZOU: "Require- documents and provide-documents matching algorithm based on topic model", 《2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING》 *
尹庆宇 (Yin Qingyu): "基于长短期记忆循环神经网络的对话文本主题分割" (Dialogue text topic segmentation based on a long short-term memory recurrent neural network), 《哈工大SCIR》 (HIT SCIR) *


Also Published As

Publication number Publication date
CN109033413B (en) 2022-12-23

Similar Documents

Publication Publication Date Title
CN108595409A (en) Neural-network-based requirement document and service document matching method
CN110609897B (en) Multi-category Chinese text classification method integrating global and local features
CN110287481B (en) Named entity corpus labeling training system
CN108984526A (en) Deep-learning-based document topic vector extraction method
CN111241294B (en) Relation extraction method using a graph convolutional network based on dependency parsing and keywords
CN111444726A (en) Method and device for extracting Chinese semantic information with a long short-term memory network based on a bidirectional lattice structure
CN109325398A (en) A face attribute analysis method based on transfer learning
CN108875809A (en) Biomedical entity relation classification method combining an attention mechanism and neural networks
CN107423442A (en) Application recommendation method and system based on user-profile behavior analysis, storage medium and computer device
Wan et al. Auxiliary demographic information assisted age estimation with cascaded structure
CN110287323B (en) Target-oriented emotion classification method
CN110502753A (en) A semantically-enhanced deep learning sentiment analysis model and its analysis method
CN108427665A (en) An automatic text generation method based on LSTM-type RNN models
CN108874783A (en) Power information operation and maintenance knowledge model construction method
CN110188175A (en) Question-answer pair extraction method, system and storage medium based on the BiLSTM-CRF model
CN111046178B (en) Text sequence generation method and system
CN110580287A (en) Emotion classification method based on transfer learning and ON-LSTM
CN114091460A (en) Multitask Chinese named entity recognition method
Qi et al. Personalized sketch-based image retrieval by convolutional neural network and deep transfer learning
CN112883722B (en) Distributed text summarization method based on cloud data center
Yu et al. Research and implementation of CNN based on TensorFlow
Wang et al. Improvement of MNIST image recognition based on CNN
CN113886562A (en) AI resume screening method, system, device and storage medium
CN109033413A (en) Neural-network-based requirement document and service document matching method
CN114662456A (en) Ancient poem generation method for images based on a Faster R-CNN detection model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant