CN110472117A - Method and device for determining a target document - Google Patents

Method and device for determining a target document

Info

Publication number
CN110472117A
Authority
CN
China
Prior art keywords
document
label
text
training
matched
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810438060.XA
Other languages
Chinese (zh)
Other versions
CN110472117B (en)
Inventor
陈欣
从国华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu - Digital Technology Co Ltd
Original Assignee
Chengdu - Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu - Digital Technology Co Ltd filed Critical Chengdu - Digital Technology Co Ltd
Priority to CN201810438060.XA priority Critical patent/CN110472117B/en
Publication of CN110472117A publication Critical patent/CN110472117A/en
Application granted granted Critical
Publication of CN110472117B publication Critical patent/CN110472117B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a method and device for determining a target document. At least one training document carrying a training label is obtained, and a neural network is trained according to the training document and the training label it carries to obtain a target neural network. The target neural network is then used to obtain the document label of each document to be matched and the text label of a target text, where a document to be matched is a document in a visible state and the target text is the text in the clipboard. The document to be matched whose document label matches the text label of the target text is determined to be the target document. Because the text label reflects the theme of the target text and the document label reflects the theme of the document to be matched, the target document can be determined fairly accurately without the user manually switching between the documents to be matched, which improves the user experience.

Description

Method and device for determining a target document
Technical field
The present application relates to the field of Internet technology, and in particular to a method and device for determining a target document.
Background
With the continuous development of the information age, the amount of information users need to process keeps growing. When editing text, a user usually opens many documents and then edits the text in them. If these documents are displayed on the screen at the same time, they tend to occlude one another, so only some of them are displayed by switching between documents: for example, the document that currently needs to be edited or viewed is shown, and the documents that do not currently need to be edited or viewed are hidden.
In this situation, if the user copies text content from a web page or a related document and wants to paste it into one of the open documents, the user cannot immediately find the target document window in which to paste, because the documents occlude one another or some of them are hidden. The user can only switch the displayed document manually until the target document window appears on the screen, and only then paste the copied text content into that window. Determining the target document by manually switching documents in order to paste text content into it costs the user considerable time and effort and degrades the user experience.
Summary of the invention
To solve the prior-art problem of poor user experience caused by determining the target document through manual document switching, the embodiments of the present application provide a method and device for determining a target document.
A method for determining a target document provided by an embodiment of the present application comprises:
obtaining at least one training document carrying a training label;
training a neural network according to the training document and the training label carried by the training document to obtain a target neural network;
obtaining, through the target neural network, a document label of a document to be matched, the document to be matched being a document in a visible state;
obtaining, through the target neural network, a text label of a target text, the target text being the text in the clipboard; and
determining that the document to be matched corresponding to a document label that matches the text label of the target text is the target document.
Optionally, training the neural network according to the training document and the training label carried by the training document comprises:
obtaining training feature words of the training document, and generating a word vector matrix corresponding to the training document according to the training feature words; and
training the neural network according to the word vector matrix of the training document and the training label carried by the training document.
Optionally, obtaining the document label of the document to be matched through the target neural network comprises:
obtaining processing feature words of the document to be matched, and generating a word vector matrix corresponding to the document to be matched according to the processing feature words; and
generating, through the target neural network, the document label of the document to be matched from the word vector matrix of the document to be matched.
Obtaining the text label of the target text through the target neural network comprises:
obtaining text feature words of the target text, and generating a word vector matrix corresponding to the target text according to the text feature words; and
generating, through the target neural network, the text label of the target text from the word vector matrix of the target text.
Optionally, the documents to be matched include a first document, and determining that the document to be matched corresponding to a document label that matches the text label of the target text is the target document comprises:
if the document label of the first document is the same as or similar to the text label, taking the first document as the target document.
Optionally, the method further comprises:
displaying the target document to the user.
A device for determining a target document provided by an embodiment of the present application comprises:
a training document acquiring unit, configured to obtain at least one training document carrying a training label;
a target neural network acquiring unit, configured to train a neural network according to the training document and the training label carried by the training document to obtain a target neural network;
a document label acquiring unit, configured to obtain, through the target neural network, a document label of a document to be matched, the document to be matched being a document in a visible state;
a text label acquiring unit, configured to obtain, through the target neural network, a text label of a target text, the target text being the text in the clipboard; and
a target document determination unit, configured to determine that the document to be matched corresponding to a document label that matches the text label of the target text is the target document.
Optionally, the target neural network acquiring unit comprises:
a first word vector matrix generation unit, configured to obtain training feature words of the training document and generate a word vector matrix corresponding to the training document according to the training feature words; and
a neural network training unit, configured to train the neural network according to the word vector matrix of the training document and the training label carried by the training document, forming the target neural network.
Optionally, the document label acquiring unit comprises:
a second word vector matrix generation unit, configured to obtain processing feature words of the document to be matched and generate a word vector matrix corresponding to the document to be matched according to the processing feature words; and
a document label obtaining subunit, configured to generate, through the target neural network, the document label of the document to be matched from the word vector matrix of the document to be matched.
The text label acquiring unit comprises:
a third word vector matrix generation unit, configured to obtain text feature words of the target text and generate a word vector matrix corresponding to the target text according to the text feature words; and
a text label obtaining subunit, configured to generate, through the target neural network, the text label of the target text from the word vector matrix of the target text.
Optionally, the documents to be matched include a first document, and the target document determination unit is specifically configured to:
if the document label of the first document is the same as or similar to the text label, take the first document as the target document.
Optionally, the device further comprises:
a display unit, configured to display the target document to the user.
In the method and device for determining a target document provided by the embodiments of the present application, at least one training document carrying a training label is obtained; a neural network is trained according to the training document and the training label it carries to obtain a target neural network; the document label of each document to be matched and the text label of the target text are obtained through the target neural network, where a document to be matched is a document in a visible state and the target text is the text in the clipboard; and the document to be matched whose document label matches the text label of the target text is determined to be the target document.
In the embodiments of the present application, the target text is the text in the clipboard, that is, the text to be pasted, and its text label reflects the theme of the target text. A document to be matched is a document in a visible state, that is, a document the user has opened, and its document label reflects the theme of that document. Therefore, if the document label of a document to be matched matches the text label of the target text, that document has the same theme as the target text or a similar one, and the user is more likely to want to paste the target text into it. Taking that document as the target document makes it possible to determine the target document fairly accurately without the user manually switching between the documents to be matched, which improves the user experience.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some of the embodiments recorded in the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for determining a target document provided by an embodiment of the present application;
Fig. 2 is a display effect diagram of a target document provided by an embodiment of the present application;
Fig. 3 is a display effect diagram of a target document provided by an embodiment of the present application;
Fig. 4 is a structural block diagram of a device for determining a target document provided by an embodiment of the present application.
Detailed description of embodiments
To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present application without creative effort fall within the protection scope of the present application.
In the prior art, a user editing text usually opens multiple documents and then edits the text in them. If these documents are displayed on the screen at the same time, they tend to occlude one another, so only some of them are usually displayed by switching between documents: for example, only the document that currently needs to be edited or viewed is shown, and the documents that do not currently need to be edited or viewed are hidden.
In this situation, if the user copies text content from a web page or a related document and wants to paste it into the target document, the user cannot immediately find the window of the target document, because the documents occlude one another or some of them are hidden. The user then has to switch the displayed document manually until the target document window appears on the screen, and only then paste the copied text content into that window. Manual switching requires displaying the open documents one by one and judging from the content of each whether it is the target document. Determining the target document in this way in order to paste text content into it therefore costs the user considerable time and effort and degrades the user experience.
To solve the above problem, the embodiments of the present application provide a method for determining a target document. At least one training document carrying a training label is obtained; a neural network is trained according to the training document and the training label it carries to obtain a target neural network; the document label of each document to be matched and the text label of the target text are obtained through the target neural network, where a document to be matched is a document in a visible state and the target text is the text in the clipboard; and the document to be matched whose document label matches the text label of the target text is determined to be the target document.
In the embodiments of the present application, the target text is the text in the clipboard, that is, the text to be pasted, and its text label reflects the theme of the target text. A document to be matched is a document in a visible state, that is, a document the user has opened, and its document label reflects the theme of that document. Therefore, if the document label of a document to be matched matches the text label of the target text, that document has the same theme as the target text or a similar one, and the user is more likely to want to paste the target text into it. Taking that document as the target document makes it possible to determine the target document fairly accurately without the user manually switching between the documents to be matched, which improves the user experience.
The method for determining a target document provided by the embodiments of the present application can be applied on a terminal. The terminal may be a communication device such as a mobile phone, a laptop, or a tablet computer. Various non-limiting embodiments of the present application are described in detail below with reference to the drawings.
Referring to Fig. 1, which is a flowchart of a method for determining a target document according to an embodiment of the present application, the method may include the following steps.
S101: Obtain at least one training document carrying a training label.
A document may be a file generated by text-editing software, for example a file in doc, docx, xls, or xlsx format generated by text-editing software such as WORD or EXCEL.
A training document is a document used to train the neural network, and it carries a training label.
A training label is a label that reflects the theme of the training document. It may be a classification label of the training document, such as a category name or a category number. Category names include, for example, life, military, food, and science and technology; as for category numbers, the life category may for example be numbered 1, the military category 2, and so on. The training label may also be any other label that reflects the theme of the training document, such as a theme label or a keyword label; these are not enumerated here.
When the classification label of a training document is used as its training label, the at least one training document carrying a training label may be obtained by crawling news-website data with a web crawler and using the category of that data as the classification label of the crawled data. Alternatively, multiple already-classified documents may be downloaded from a public website, with the category name of each document used as its classification label.
After the training documents and their classification labels are obtained, the classification labels may also be processed to make them easier to manage. Specifically, similar labels may be merged; for example, a technology category and a science category can be merged into a science and technology classification label. Classification labels may also be converted to category numbers; for example, the life category can be converted to the category numbered 1, forming a classification label whose category number is 1.
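The label processing described above (merging similar labels, then converting names to category numbers) can be sketched as follows. This is only an illustrative sketch: the `MERGE` table, the function name `normalize_labels`, and the numbering-by-first-appearance rule are assumptions, not details given in the source.

```python
# Hypothetical merge table: raw category name -> merged category name.
MERGE = {
    "science": "science and technology",
    "technology": "science and technology",
}

def normalize_labels(raw_labels):
    """Merge similar category names, then map each merged name to a category number."""
    merged = [MERGE.get(label, label) for label in raw_labels]
    numbering = {}
    for name in merged:
        # Assumption: numbers are assigned in order of first appearance, starting at 1.
        numbering.setdefault(name, len(numbering) + 1)
    return [numbering[name] for name in merged], numbering

numbers, numbering = normalize_labels(
    ["life", "military", "science", "technology", "life"]
)
print(numbers)    # category number for each training document
print(numbering)  # merged category name -> category number
```

With this toy input, the life category receives number 1 and the military category number 2, in line with the example in the text.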
S102: Train a neural network according to the training document and the training label carried by the training document to obtain a target neural network.
A neural network is a machine learning technique that simulates the neural network of the human brain in order to achieve a form of artificial intelligence. It generally includes an input layer, intermediate layers, and an output layer. Training data is fed in at the input layer, processed by the intermediate layers, and emitted from the output layer as output data, and the neural network is trained according to the training data and the output data. Training a neural network actually means updating its parameters so that the way the network processes data changes in order to achieve a specific purpose.
In the embodiments of the present application, training the neural network according to the training documents and the training labels they carry lets the network learn the correspondence between training documents and their training labels. Therefore, when a document is input into the target neural network obtained by training, the label corresponding to that document can be output.
Specifically, to train the neural network according to the training document and the training label it carries, the training feature words of the training document may be obtained, a word vector matrix corresponding to the training document may be generated from the training feature words, and the neural network may be trained according to the word vector matrix of the training document and the training label it carries.
The training feature words may be obtained by segmenting the training document into multiple words and removing the stop words from them, which yields the training feature words. Segmentation can be performed with a word segmentation tool such as jieba. Stop words are words that have no specific meaning on their own and only serve a function within a complete sentence; they are usually modal particles, adverbs, prepositions, or conjunctions.
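The feature-word extraction described above (segment, then drop stop words) might look like the sketch below. To stay self-contained it uses a simple regular-expression tokenizer on English text as a stand-in for a real segmentation tool such as jieba; the stop-word list and the function name `extract_feature_words` are hypothetical.

```python
import re

# Hypothetical stop-word list; a real list would be much larger and language-specific.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}

def extract_feature_words(text):
    """Split text into lowercase tokens and drop stop words."""
    # Stand-in tokenizer (assumption): a Chinese document would instead be
    # segmented with a dedicated word segmentation tool.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

features = extract_feature_words("The army is moving to the border in the north")
print(features)
```

The same function could be reused for the documents to be matched and for the clipboard text, since the source describes their feature words as being obtained in the same way.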
After the feature words are formed, the word vector of each feature word can be obtained using a neural network algorithm trained in advance, for example one trained with the open-source word2vec package. A word vector is a numerical representation of human language: each word can correspond to one word vector. For example, the word "I" may correspond to an n-dimensional word vector [0.3, 0.8, ..., 0.7].
The word vectors of the feature words are combined to form the word vector matrix of the training document. For example, suppose the training document has k feature words and each feature word corresponds to an n-dimensional row vector: the word vector of the first feature word is [a_{11}, a_{12}, ..., a_{1n}], that of the second is [a_{21}, a_{22}, ..., a_{2n}], ..., and that of the k-th is [a_{k1}, a_{k2}, ..., a_{kn}]. The word vector matrix of the training document can then be expressed as the k × n matrix

$$
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{k1} & a_{k2} & \cdots & a_{kn}
\end{bmatrix}
$$
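As a rough illustration of stacking word vectors into a k × n matrix, the sketch below looks each feature word up in a small table. The table `WORD_VECTORS`, its 3-dimensional values, and the zero-row handling of unknown words are all assumptions made for the example; in practice the vectors would come from a pre-trained model such as one built with word2vec.

```python
# Hypothetical 3-dimensional word vectors standing in for a pre-trained model.
WORD_VECTORS = {
    "army": [0.9, 0.1, 0.0],
    "border": [0.7, 0.2, 0.1],
    "recipe": [0.0, 0.8, 0.3],
}

def build_word_vector_matrix(feature_words, vectors, dim=3):
    """Stack one n-dimensional row per feature word into a k x n matrix."""
    # Assumption: unknown words get an all-zero row; the source does not
    # say how out-of-vocabulary words are handled.
    return [list(vectors.get(word, [0.0] * dim)) for word in feature_words]

matrix = build_word_vector_matrix(["army", "border", "unknown"], WORD_VECTORS)
for row in matrix:
    print(row)
```

Row i of the result holds the vector of the i-th feature word, matching the row layout described in the text.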
After the word vector matrix of the training document is formed, the neural network can be trained according to that matrix and the training label carried by the training document, so that the network learns the correspondence between the word vector matrix of a training document and its training label. Therefore, when the word vector matrix of a document is input into the target neural network obtained by training, the label corresponding to that document can be output.
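The source does not specify the network architecture or the training algorithm. Purely as a minimal sketch under that caveat, the example below averages the rows of each word vector matrix and fits a single softmax layer by stochastic gradient descent on cross-entropy loss. Every name, the toy data, and the hyperparameters are assumptions; a real system would likely use a deeper network and a machine learning framework.

```python
import math

def mean_pool(matrix):
    """Average the k rows of a k x n word vector matrix into one n-dimensional vector."""
    k = len(matrix)
    return [sum(row[j] for row in matrix) / k for j in range(len(matrix[0]))]

def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, biases, x):
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

def train(matrices, labels, num_classes, epochs=500, lr=0.5):
    """Fit one softmax layer with per-sample gradient descent on cross-entropy loss."""
    xs = [mean_pool(m) for m in matrices]
    n = len(xs[0])
    weights = [[0.0] * n for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        for x, y in zip(xs, labels):
            probs = predict_proba(weights, biases, x)
            for c in range(num_classes):
                # Gradient of the loss with respect to the score of class c.
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(n):
                    weights[c][j] -= lr * grad * x[j]
                biases[c] -= lr * grad
    return weights, biases

def predict(weights, biases, matrix):
    """Return the index of the most probable label for a word vector matrix."""
    probs = predict_proba(weights, biases, mean_pool(matrix))
    return probs.index(max(probs))

# Toy data (assumption): class 0 stands for "military", class 1 for "food".
train_matrices = [
    [[0.9, 0.1], [0.8, 0.2]],
    [[0.1, 0.9], [0.2, 0.7]],
]
train_labels = [0, 1]
weights, biases = train(train_matrices, train_labels, num_classes=2)
predicted = predict(weights, biases, [[0.85, 0.1]])
print(predicted)
```

Mean pooling is used here only because it lets documents with different numbers of feature words share one fixed-size input; the source does not state how variable-length matrices are handled.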
S103: Obtain the document label of a document to be matched through the target neural network.
A document to be matched is a document in a visible state, that is, a document the user has already opened, for example an open file in doc, docx, xls, or xlsx format.
Because the target neural network outputs the label corresponding to a document that is input into it, in the embodiments of the present application the document to be matched can be input into the target neural network, which then outputs the document label corresponding to that document.
The document label of a document to be matched is a label that reflects the theme of that document. It may be a classification label of the document, such as a category name or a category number, or any other label that reflects the theme of the document; these are not enumerated here.
Where the target neural network was obtained by obtaining the training feature words of the training document, generating a word vector matrix corresponding to the training document from them, and training the neural network according to that matrix and the training label carried by the training document, the document label of the document to be matched may specifically be obtained as follows: obtain the document feature words of the document to be matched, generate a word vector matrix corresponding to the document to be matched from those feature words, and generate, through the target neural network, the document label of the document to be matched from its word vector matrix.
Specifically, the way the document feature words, the word vector matrix, and the document label of the document to be matched are obtained corresponds to the way the training feature words, the word vector matrix, and the training label of the training document are obtained in S102, and is not repeated here.
S104: Obtain the text label of the target text through the target neural network.
The target text is the text in the clipboard, that is, the text the user has copied. It may be one or more phrases, or one or more paragraphs.
Because the target neural network outputs the label corresponding to a document that is input into it, in the embodiments of the present application the target text can be input into the target neural network, which then outputs the text label corresponding to the target text.
The text label of the target text is a label that reflects the theme of the target text. Corresponding to the document label of the document to be matched, the text label may also be a classification label of the target text, such as a category name or a category number, or any other label that reflects the theme of the target text.
Because the document label of the document to be matched and the text label of the target text are both obtained through the target neural network, they are labels of the same kind. For example, if the document label is a classification label of the document to be matched in the form of a category name, the text label is likewise a classification label of the target text in the form of a category name.
Where the target neural network was obtained by obtaining the training feature words of the training document, generating a word vector matrix corresponding to the training document from them, and training the neural network according to that matrix and the training label carried by the training document, the text label of the target text may specifically be obtained as follows: obtain the text feature words of the target text, generate a word vector matrix corresponding to the target text from those feature words, and generate, through the target neural network, the text label of the target text from its word vector matrix.
Specifically, the way the text feature words, the word vector matrix, and the text label of the target text are obtained corresponds to the way the training feature words, the word vector matrix, and the training label of the training document are obtained in S102, and is not repeated here.
S104 may also be executed before S103 or at the same time as S103; the order is not limited here.
S105: Determine that the document to be matched corresponding to a document label that matches the text label of the target text is the target document.
In the embodiments of the present application, the target text is the text in the clipboard, that is, the text to be pasted, and its text label reflects the theme of the target text. A document to be matched is a document in a visible state, that is, a document the user has opened, and its document label reflects the theme of that document. Therefore, if the document label of a document to be matched matches the text label of the target text, that document has the same theme as the target text or a similar one, and the user is more likely to want to paste the target text into it, so that document is taken as the target document.
To judge whether the document label of a document to be matched matches the text label of the target text, it can be judged whether the two labels are the same or similar. If they are, the document label of that document is judged to match the text label of the target text, and the document to be matched corresponding to the matching document label is taken as the target document.
Specifically, if the document label of a first document among the documents to be matched is the same as or similar to the text label, the first document is taken as the target document. For example, if the text label of the target text is 2, the target text belongs to the military category. If the document label of the first document is also 2, it is identical to the text label of the target text, and the first document can be taken as the target document. If the document label of the first document is "military", it is similar to the text label of the target text, and the first document can be taken as the target document. If the document label of the first document is "army", it is likewise similar to the text label of the target text, and the first document can be taken as the target document.
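The "same or similar" test in this example can be sketched as a normalization step followed by an equality check. The lookup tables `CATEGORY_NAMES` and `SIMILAR`, the file names, and the idea of reducing similarity to a synonym table are assumptions for illustration; the source does not define how similarity is computed.

```python
# Hypothetical lookup tables: category number -> name, and near-synonym -> name.
CATEGORY_NAMES = {1: "life", 2: "military"}
SIMILAR = {"army": "military"}

def canonical(label):
    """Map a category number, category name, or near-synonym to one canonical name."""
    if isinstance(label, int):
        label = CATEGORY_NAMES.get(label, str(label))
    label = label.strip().lower()
    return SIMILAR.get(label, label)

def labels_match(document_label, text_label):
    """Treat two labels as matching when they normalize to the same name."""
    return canonical(document_label) == canonical(text_label)

def find_target_documents(document_labels, text_label):
    """Return the open documents whose label matches the clipboard text's label."""
    return [name for name, label in document_labels.items()
            if labels_match(label, text_label)]

# Hypothetical open documents and their predicted labels.
open_documents = {
    "report.docx": 2,
    "notes.docx": "army",
    "menu.xlsx": "food",
    "diary.doc": 1,
}
targets = find_target_documents(open_documents, 2)
print(targets)
```

Here the text label 2 matches both the document labelled 2 (identical) and the one labelled "army" (similar), mirroring the cases in the text.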
After the target document is determined, it can also be displayed to the user. Specifically, if there is only one target document, its content can be shown on the display screen. Referring to Fig. 2, target document 1 is shown on the display screen; after confirming from its content that it is the right document, the user can paste the target text directly into target document 1. When there are multiple target documents, their titles or part of their content can be displayed. Referring to Fig. 3, the target documents include target document 1, target document 2, and target document 3. The user decides from the displayed titles and partial content which document to paste the target text into. If the user decides to paste the target text into target document 1, clicking the title or partial-content area of target document 1 displays its full content, as shown in Fig. 2, and the target text can then be pasted into target document 1.
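The display rule above (full content for a single target document, titles and excerpts for several) could be expressed as in the sketch below. The function name `build_display`, the returned dictionary layout, and the 20-character excerpt length are hypothetical; the source describes only the behaviour, not a data model.

```python
def build_display(target_documents, excerpt_length=20):
    """Decide what to show, given a list of (title, content) pairs."""
    if not target_documents:
        return {"mode": "none", "items": []}
    if len(target_documents) == 1:
        # A single target document: show its full content.
        title, content = target_documents[0]
        return {"mode": "full", "items": [{"title": title, "content": content}]}
    # Several target documents: show each title with a short excerpt to choose from.
    return {
        "mode": "list",
        "items": [
            {"title": title, "excerpt": content[:excerpt_length]}
            for title, content in target_documents
        ],
    }

single = build_display([("Target document 1", "Full text of the first document.")])
several = build_display([
    ("Target document 1", "Full text of the first document."),
    ("Target document 2", "Full text of the second document."),
    ("Target document 3", "Full text of the third document."),
])
print(single["mode"], several["mode"])
```

Selecting an entry from the list view would then lead back to the full view for that one document, as the text describes for Fig. 3 and Fig. 2.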
A kind of determination method of destination document provided by the embodiments of the present application carries trained mark by obtaining at least one The Training document of label is trained neural network according to the training label that the Training document and the Training document carry, Target nerve network is obtained, by target nerve network, obtains the document label of document to be matched and the text mark of target text Label, wherein document to be matched is the document in visible state, target text is the text in clipbook, determining and target text The corresponding document to be matched of the matched document label of text label be destination document.Can user not to document to be matched into It is accurate to determine destination document under the premise of row manual switching, improve user experience.
Based on the method for determining a target document provided by the above embodiments, the embodiments of the present application further provide a device for determining a target document, whose working principle is described in detail below with reference to the accompanying drawings.
Referring to Fig. 4, which is a structural block diagram of a device for determining a target document provided by an embodiment of the present application, the device includes:
a training document acquiring unit 401, configured to obtain at least one training document carrying a training label;
a target neural network acquiring unit 402, configured to train a neural network according to the training document and the training label carried by the training document, to obtain a target neural network;
a document label acquiring unit 403, configured to obtain, by means of the target neural network, the document label of a document to be matched, the document to be matched being a document in a visible state;
a text label acquiring unit 404, configured to obtain, by means of the target neural network, the text label of a target text, the target text being the text in the clipboard;
a target document determination unit 405, configured to determine that the document to be matched corresponding to a document label that matches the text label of the target text is the target document.
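How units 403 to 405 cooperate can be sketched in a few lines. The sketch below is illustrative only: `determine_target_documents` is a hypothetical name, labels are assumed to be plain strings compared for equality, and `keyword_labeller` is a keyword-voting stand-in for the trained target neural network, not a neural network itself.

```python
from typing import Callable, Dict, List


def determine_target_documents(
    visible_documents: Dict[str, str],
    clipboard_text: str,
    label_of: Callable[[str], str],
) -> List[str]:
    """Return the names of visible documents whose label equals the clipboard text's label."""
    # Role of unit 404: label the target text taken from the clipboard.
    text_label = label_of(clipboard_text)
    # Role of unit 403: label every document to be matched (the visible documents).
    document_labels = {name: label_of(body) for name, body in visible_documents.items()}
    # Role of unit 405: keep the documents whose label matches the text label.
    return [name for name, label in document_labels.items() if label == text_label]


def keyword_labeller(text: str) -> str:
    """Hypothetical stand-in for the trained network: pick the label with most keyword hits."""
    keywords = {
        "finance": {"loan", "bank", "interest"},
        "sports": {"team", "goal", "league"},
    }
    tokens = set(text.lower().split())
    return max(keywords, key=lambda label: len(tokens & keywords[label]))
```

Passing the labeller in as a parameter keeps the matching step independent of how the labels are produced, so a trained model can replace the stand-in without changing the matching code.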
Optionally, the target neural network acquiring unit includes:
a first word-vector matrix generation unit, configured to obtain the training feature words of the training document and to generate, according to the training feature words, a word-vector matrix corresponding to the training document;
a neural network training unit, configured to train the neural network according to the word-vector matrix of the training document and the training label carried by the training document, to form the target neural network.
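The two sub-units above can be illustrated with a deliberately small example. Everything specific in it is an assumption made for illustration, since this passage does not fix the details: each feature word is represented by a one-hot row, the rows are averaged into a fixed-size input, and the "network" is a single softmax layer trained by gradient descent. All function names are hypothetical.

```python
import math
import random


def build_vocab(docs_feature_words):
    """Assign an index to every distinct feature word seen in training."""
    vocab = {}
    for words in docs_feature_words:
        for word in words:
            vocab.setdefault(word, len(vocab))
    return vocab


def word_vector_matrix(feature_words, vocab):
    """One row per known feature word; each row is a one-hot vector (assumed embedding)."""
    matrix = []
    for word in feature_words:
        if word in vocab:
            row = [0.0] * len(vocab)
            row[vocab[word]] = 1.0
            matrix.append(row)
    return matrix


def pool(matrix, dim):
    """Average the rows so documents of any length give a fixed-size input."""
    if not matrix:
        return [0.0] * dim
    return [sum(column) / len(matrix) for column in zip(*matrix)]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, labels, vocab, epochs=200, lr=0.5, seed=0):
    """Train a single softmax layer on (feature words, training label) pairs."""
    classes = sorted(set(labels))
    dim = len(vocab)
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(dim)] for _ in classes]
    for _ in range(epochs):
        for words, label in zip(samples, labels):
            x = pool(word_vector_matrix(words, vocab), dim)
            probs = softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])
            for k, row in enumerate(weights):
                # Cross-entropy gradient for class k: predicted probability minus target.
                grad = probs[k] - (1.0 if classes[k] == label else 0.0)
                for j in range(dim):
                    row[j] -= lr * grad * x[j]
    return classes, weights


def predict(words, vocab, classes, weights):
    """Return the label with the highest score for the given feature words."""
    x = pool(word_vector_matrix(words, vocab), len(vocab))
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return classes[scores.index(max(scores))]
```

The same `word_vector_matrix` and `predict` functions could serve both a document to be matched and the clipboard text, which is what allows their labels to be compared afterwards.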
Optionally, the document label acquiring unit includes:
a second word-vector matrix generation unit, configured to obtain the processing feature words of the document to be matched and to generate, according to the processing feature words, a word-vector matrix corresponding to the document to be matched;
a document label obtaining subunit, configured to generate, by means of the target neural network, the document label of the document to be matched from the word-vector matrix corresponding to the document to be matched.
The text label acquiring unit includes:
a third word-vector matrix generation unit, configured to obtain the text feature words of the target text and to generate, according to the text feature words, a word-vector matrix corresponding to the target text;
a text label obtaining subunit, configured to generate, by means of the target neural network, the text label of the target text from the word-vector matrix corresponding to the target text.
Optionally, the document to be matched includes a first document, and the target document determination unit is specifically configured to:
if the document label of the first document is the same as or similar to the text label, take the first document as the target document.
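One way to read "the same as or similar to" is sketched below. This passage does not define how similarity between labels is measured, so the sketch rests on assumptions: a label is taken to be a vector of topic scores, similarity is cosine similarity, and the 0.8 threshold is an arbitrary illustrative value. The function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length score vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def labels_match(document_label, text_label, threshold=0.8):
    """Identical labels always match; otherwise compare similarity with the threshold."""
    if document_label == text_label:
        return True
    return cosine(document_label, text_label) >= threshold


def select_targets(document_labels, text_label, threshold=0.8):
    """Return the names of all documents whose label matches the text label."""
    return [
        name
        for name, label in document_labels.items()
        if labels_match(label, text_label, threshold)
    ]
```

Because more than one document can pass the threshold, the result is a list, which corresponds to the case in which several target documents are offered to the user.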
Optionally, the device further includes:
a display unit, configured to display the target document to the user.
In the method and device for determining a target document provided by the embodiments of the present application, at least one training document carrying a training label is obtained; a neural network is trained according to the training document and the training label it carries, yielding a target neural network; and the target neural network is used to obtain the document label of each document to be matched and the text label of the target text, where a document to be matched is a document in a visible state and the target text is the text in the clipboard. The document to be matched whose document label matches the text label of the target text is determined to be the target document.
In the embodiments of the present application, the target text is the text in the clipboard, that is, the text to be pasted, and its text label reflects the subject of the target text. A document to be matched is a document in a visible state, that is, a document the user has open, and its document label reflects the subject of that document. Therefore, if the document label of a document to be matched matches the text label of the target text, that document has the same subject as the target text or a similar one, and the user is more likely to want to paste the target text into it. Taking that document as the target document therefore allows the target document to be determined fairly accurately without the user manually switching between the documents to be matched, which improves the user experience.
When introducing elements of the various embodiments of the present application, the articles "a", "an", "the" and "said" are intended to mean that there are one or more of the elements. The words "comprising", "including" and "having" are inclusive and mean that there may be additional elements other than those listed.
It should be noted that a person of ordinary skill in the art will understand that all or part of the processes in the above method embodiments can be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The embodiments in this specification are described in a progressive manner: for parts that are the same or similar, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. The device embodiment in particular is described relatively briefly because it is substantially similar to the method embodiment; for the relevant details, refer to the description of the method embodiment. The device embodiments described above are merely illustrative. The units and modules described as separate components may or may not be physically separate, and some or all of them may be selected according to actual needs to achieve the purpose of the solution of the embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
The above are only specific embodiments of the present application. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present application, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present application.

Claims (10)

1. A method for determining a target document, characterized in that the method comprises:
obtaining at least one training document carrying a training label;
training a neural network according to the training document and the training label carried by the training document, to obtain a target neural network;
obtaining, by means of the target neural network, the document label of a document to be matched, the document to be matched being a document in a visible state;
obtaining, by means of the target neural network, the text label of a target text, the target text being the text in a clipboard;
determining that the document to be matched corresponding to a document label that matches the text label of the target text is the target document.
2. The method according to claim 1, characterized in that training the neural network according to the training document and the training label carried by the training document comprises:
obtaining the training feature words of the training document, and generating, according to the training feature words, a word-vector matrix corresponding to the training document;
training the neural network according to the word-vector matrix of the training document and the training label carried by the training document.
3. The method according to claim 2, characterized in that obtaining, by means of the target neural network, the document label of the document to be matched comprises:
obtaining the processing feature words of the document to be matched, and generating, according to the processing feature words, a word-vector matrix corresponding to the document to be matched;
generating, by means of the target neural network, the document label of the document to be matched from the word-vector matrix corresponding to the document to be matched;
and obtaining, by means of the target neural network, the text label of the target text comprises:
obtaining the text feature words of the target text, and generating, according to the text feature words, a word-vector matrix corresponding to the target text;
generating, by means of the target neural network, the text label of the target text from the word-vector matrix corresponding to the target text.
4. The method according to any one of claims 1 to 3, characterized in that the document to be matched includes a first document, and determining that the document to be matched corresponding to a document label that matches the text label of the target text is the target document comprises:
if the document label of the first document is the same as or similar to the text label, taking the first document as the target document.
5. The method according to claim 4, characterized in that the method further comprises:
displaying the target document to the user.
6. A device for determining a target document, characterized in that the device comprises:
a training document acquiring unit, configured to obtain at least one training document carrying a training label;
a target neural network acquiring unit, configured to train a neural network according to the training document and the training label carried by the training document, to obtain a target neural network;
a document label acquiring unit, configured to obtain, by means of the target neural network, the document label of a document to be matched, the document to be matched being a document in a visible state;
a text label acquiring unit, configured to obtain, by means of the target neural network, the text label of a target text, the target text being the text in a clipboard;
a target document determination unit, configured to determine that the document to be matched corresponding to a document label that matches the text label of the target text is the target document.
7. The device according to claim 6, characterized in that the target neural network acquiring unit comprises:
a first word-vector matrix generation unit, configured to obtain the training feature words of the training document and to generate, according to the training feature words, a word-vector matrix corresponding to the training document;
a neural network training unit, configured to train the neural network according to the word-vector matrix of the training document and the training label carried by the training document, to form the target neural network.
8. The device according to claim 7, characterized in that the document label acquiring unit comprises:
a second word-vector matrix generation unit, configured to obtain the processing feature words of the document to be matched and to generate, according to the processing feature words, a word-vector matrix corresponding to the document to be matched;
a document label obtaining subunit, configured to generate, by means of the target neural network, the document label of the document to be matched from the word-vector matrix corresponding to the document to be matched;
and the text label acquiring unit comprises:
a third word-vector matrix generation unit, configured to obtain the text feature words of the target text and to generate, according to the text feature words, a word-vector matrix corresponding to the target text;
a text label obtaining subunit, configured to generate, by means of the target neural network, the text label of the target text from the word-vector matrix corresponding to the target text.
9. The device according to any one of claims 6 to 8, characterized in that the document to be matched includes a first document, and the target document determination unit is specifically configured to:
if the document label of the first document is the same as or similar to the text label, take the first document as the target document.
10. The device according to claim 9, characterized in that the device further comprises:
a display unit, configured to display the target document to the user.
CN201810438060.XA 2018-05-09 2018-05-09 Target document determination method and device Active CN110472117B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810438060.XA CN110472117B (en) 2018-05-09 2018-05-09 Target document determination method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810438060.XA CN110472117B (en) 2018-05-09 2018-05-09 Target document determination method and device

Publications (2)

Publication Number Publication Date
CN110472117A CN110472117A (en) 2019-11-19
CN110472117B CN110472117B (en) 2023-01-24

Family

ID=68503694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810438060.XA Active CN110472117B (en) 2018-05-09 2018-05-09 Target document determination method and device

Country Status (1)

Country Link
CN (1) CN110472117B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6460034B1 (en) * 1997-05-21 2002-10-01 Oracle Corporation Document knowledge base research and retrieval system
CN103123566A (en) * 2011-11-21 2013-05-29 联想(北京)有限公司 Electronic device and text input method thereof
CN103294693A (en) * 2012-02-27 2013-09-11 华为技术有限公司 Searching method, server and system
CN103309850A (en) * 2013-06-25 2013-09-18 北京小米科技有限责任公司 Content editing method, content editing device and terminal

Also Published As

Publication number Publication date
CN110472117B (en) 2023-01-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant