CN110096590A - Document classification method, apparatus, medium, and electronic device
- Publication number
- CN110096590A (application CN201910206339.XA)
- Authority
- CN
- China
- Prior art keywords
- document
- specified directory
- user
- under
- documents
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present disclosure provides a document classification method, apparatus, medium, and electronic device. The method comprises: obtaining a user ID and determining the number of documents under a specified directory that are associated with the user ID; when the number of documents exceeds a preset value, adjusting the structural relationship between the documents under the specified directory; and displaying the adjusted result in a preset interface. In the proposed method, the system automatically counts the documents and, according to its settings, prompts the user to decide whether to classify them. When classification is carried out, a classification module is called to sort the documents under the specified directory, so that they are organized and stored according to their content, which makes them easier for the user to find and improves office efficiency.
Description
Technical field
The present disclosure relates to the field of computer technology, and in particular to a document classification method, apparatus, medium, and electronic device.
Background
As documents are used more and more frequently, statistics on user document data show that the number of a user's documents is far greater than the number of folders, which suggests that most of a user's documents are probably stored in a disorderly, scattered way. In practice, when a user needs a document, the user often creates a new one or copies a related one, and after editing simply saves it in the current location. As time passes the number of documents keeps growing, so that a large number of untitled, unorganized documents accumulate under the same directory, and finding a document of a particular type later becomes cumbersome.
Therefore, how to classify and organize documents automatically, quickly, and effectively has become a technical problem in urgent need of a solution.
Summary
The purpose of the present disclosure is to provide a document classification method, apparatus, medium, and electronic device that can solve at least one of the technical problems mentioned above. The specific solution is as follows:
According to a specific embodiment of the present disclosure, in a first aspect, the disclosure provides a document classification method, comprising:
obtaining a user ID, and determining the number of documents under a specified directory that are associated with the user ID;
when the number of documents exceeds a preset value, adjusting the structural relationship between the documents under the specified directory;
displaying the adjusted result in a preset interface.
Optionally, after displaying the adjusted result in the preset interface, the method comprises:
according to a user instruction received in the preset interface, displaying the structural relationship of the documents under the specified directory according to the adjusted result.
Optionally, adjusting the structural relationship between the documents under the specified directory when the number of online documents exceeds the preset value comprises:
when the number of documents exceeds the preset value, calculating the relevance between the documents according to a predetermined rule;
classifying the documents according to the relevance.
Optionally, classifying the documents according to the relevance comprises:
obtaining the ID of each document and reading the content information of the document;
aggregating the documents with a high degree of association;
placing the aggregated documents under the same directory.
Optionally, adjusting the structural relationship between the documents under the specified directory when the number of online documents exceeds the preset value comprises:
when the number of online documents exceeds the preset value, giving a prompt asking whether to classify the documents;
after classification is confirmed, automatically classifying the documents under the specified directory.
According to a specific embodiment of the present disclosure, in a second aspect, the disclosure provides a document classification apparatus, comprising:
an acquiring unit, configured to obtain a user ID and determine the number of documents under a specified directory that are associated with the user ID;
a classification unit, configured to adjust the structural relationship between the documents under the specified directory when the number of documents exceeds a preset value;
a display unit, configured to display the adjusted result in a preset interface.
Optionally, the display unit is further configured to:
according to a user instruction received in the preset interface, display the structural relationship of the documents under the specified directory according to the adjusted result.
Optionally, the classification unit is further configured to:
when the number of documents exceeds the preset value, calculate the relevance between the documents according to a predetermined rule;
classify the documents according to the relevance.
According to a specific embodiment of the present disclosure, in a third aspect, the disclosure provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the method of any one of the above.
According to a specific embodiment of the present disclosure, in a fourth aspect, the disclosure provides an electronic device, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of the above.
Compared with the prior art, the above solution of the embodiments of the present disclosure has at least the following beneficial effects: the disclosure proposes a document classification method in which the system automatically counts the documents and, according to its settings, prompts the user to decide whether to classify them. When classification is carried out, a classification module is called to sort the documents under the specified directory, so that they are organized and stored according to their content, which makes them easier for the user to find and improves office efficiency.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain its principles. Obviously, the drawings described below are only some embodiments of the present disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 shows a flowchart of the document classification method according to an embodiment of the present disclosure;
Fig. 2 shows a flowchart of the specific execution of the document classification method according to an embodiment of the present disclosure;
Fig. 3 shows a structural diagram of the document classification apparatus according to an embodiment of the present disclosure;
Fig. 4 shows a connection structure diagram of the electronic device according to an embodiment of the present disclosure.
Detailed description
To make the purposes, technical solutions, and advantages of the present disclosure clearer, the disclosure is described in further detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this disclosure without creative effort fall within the scope of protection of the disclosure.
The terms used in the embodiments of the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the disclosure. The singular forms "a", "said", and "the" used in the embodiments of the present disclosure and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise; "a plurality of" generally means at least two.
It should be understood that the term "and/or" used herein merely describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present disclosure to describe ..., these ... should not be limited to these terms. These terms are only used to distinguish .... For example, without departing from the scope of the embodiments of the present disclosure, a first ... may also be called a second ..., and similarly a second ... may also be called a first ....
Depending on the context, the word "if" as used herein can be interpreted as "when" or "in response to determining" or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined" or "if (a stated condition or event) is detected" can be interpreted as "when it is determined" or "in response to determining" or "when (the stated condition or event) is detected" or "in response to detecting (the stated condition or event)".
It should also be noted that the terms "include", "comprise", and any variants of them are intended to cover a non-exclusive inclusion, so that a product or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a product or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of other identical elements in the product or device that includes the element.
Alternative embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
Embodiment 1
As shown in Fig. 1, according to a specific embodiment of the present disclosure, the disclosure provides a document classification method, and in particular a method for classifying online documents, although it is also applicable to ordinary documents. It can be applied to classifying the documents under a main directory (such as the user's home page) or under any specified folder, for example a folder that stores multiple documents. It is of course not limited to these: the automatic classification method can be executed for any location that contains multiple online documents. An online document here may be a Word or Excel document that the user can edit online, or any editor into which text can be entered. The method specifically comprises the following steps:
Step S102: obtain a user ID, and determine the number of documents under a specified directory that are associated with the user ID.
Each time an online document is opened for editing, the user editing it must be determined, which includes confirming ID information such as the user's account, user name, phone number, or email address. The number of documents under the specified directory that are associated with the user ID is then determined. The backend counts the online documents under the specified directory in real time: when a document is created in or copied into the directory, the count is increased by one; conversely, when a document is deleted, the count is decreased by one. The specified directory includes, but is not limited to, a specified root directory; for example, it may be the user's home page, a specified online path, or the current online editing page, and it may also be any specified folder, such as a folder that stores multiple documents. For ease of explanation, this embodiment is described using a folder as the specified directory.
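The real-time counting described in step S102 can be sketched as follows. This is a minimal illustration only: the class name `DocumentCounter` and its method names are hypothetical, and an in-memory dictionary stands in for whatever storage the backend actually uses.

```python
from collections import defaultdict


class DocumentCounter:
    """Tracks how many documents each user has under each directory.

    Hypothetical helper for illustration; not taken from the source.
    """

    def __init__(self):
        # (user_id, directory) -> number of documents
        self._counts = defaultdict(int)

    def on_created(self, user_id, directory):
        """Call when a document is created in, or copied into, the directory."""
        self._counts[(user_id, directory)] += 1

    def on_deleted(self, user_id, directory):
        """Call when a document is deleted; the count never drops below zero."""
        key = (user_id, directory)
        if self._counts[key] > 0:
            self._counts[key] -= 1

    def count(self, user_id, directory):
        """Return the current number of documents for this user and directory."""
        return self._counts[(user_id, directory)]
```

Keying the count on both the user ID and the directory mirrors the step's wording, in which the documents counted are those under the specified directory that are associated with the user ID.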
Step S104: when the number of documents exceeds a preset value, adjust the structural relationship between the documents under the specified directory.
The preset value can be set as desired, for example 20, 25, or 30. No strict limit is placed on the specific number, but a value greater than 10 is advisable, and 20 to 25 is preferred. Optionally, adjusting the structural relationship between the documents under the specified directory when the number of online documents exceeds the preset value comprises: when the number of online documents exceeds the preset value, giving a prompt asking whether to classify the documents; and, after classification is confirmed, automatically classifying the online documents under the specified directory.
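The threshold check and confirmation prompt in step S104 can be sketched as follows. The function names, the prompt text, and the `confirm` and `classify` callables are hypothetical placeholders supplied by the caller; the default of 20 is simply one of the example values given above.

```python
def should_offer_classification(document_count, preset_value=20):
    """Return True when the document count exceeds the preset value."""
    return document_count > preset_value


def maybe_classify(document_count, confirm, classify, preset_value=20):
    """Prompt the user and classify only after confirmation.

    `confirm` takes a prompt string and returns True or False.
    `classify` performs the actual classification.
    Both are hypothetical callables used for illustration.
    Returns True if classification ran, otherwise False.
    """
    if not should_offer_classification(document_count, preset_value):
        return False
    if not confirm("Classify the documents in this directory?"):
        return False
    classify()
    return True
```

Note that the comparison is strict: a directory holding exactly the preset number of documents does not trigger the prompt, matching the wording "exceeds a preset value".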
The specific execution is as follows, as shown in Fig. 2.
Step S1042: when the number of online documents exceeds the preset value, automatically call a document association analysis module.
The analysis module is trained in advance using a Bayesian algorithm; the trained module can classify documents automatically according to their degree of association.
The Bayesian algorithm is as follows:
In naive Bayes document classification, whether a document belongs to type C is calculated with the following formula:
P(F1 F2 ... Fn | C) P(C) = P(F1 | C) P(F2 | C) ... P(Fn | C) P(C),
where P(C) denotes the probability that a document of type C occurs, and P(F1 | C) denotes the probability that the word F1 occurs in documents of type C.
P(F1 | C) is calculated according to the following rules:
1. The total number of words in all documents under type C is N.
2. The number of times the word F1 occurs in all documents is M (note: not only the documents under type C, but all documents).
3. The number of distinct words in all documents is NN.
Then P(F1 | C) = M / (N + NN).
The above method gives the probability P(W | C) that any word W occurs in documents of type C; if the word does not occur, the probability is 0. The probability that document F belongs to type C is then P(C) * P(W1 | C) * P(W2 | C) * ... * P(Wn | C) = p1, where W1, W2, ... denote the words that occur in document F, and P(W1 | C) denotes the probability that the word W1 occurs under type C.
The probabilities of document F under the other types are then calculated in the same way, giving p2, p3, and so on. The values p1, p2, p3, etc. are compared; the largest value indicates the type that document F most resembles, and the document is assigned to that type.
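In standard notation, the comparison of p1, p2, p3, ... described above amounts to the following decision rule. The symbol \hat{C} and the logarithmic form on the second line are additions for illustration and do not appear in the source. The logarithmic form is a common way to avoid numerical underflow when many small factors are multiplied, and it is valid only when every factor is positive.

```latex
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(W_i \mid C)
        = \arg\max_{C} \left[ \log P(C) + \sum_{i=1}^{n} \log P(W_i \mid C) \right]
```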
A specific example is shown in the following table:
Document ID | Words contained in the document | Document type
1 | Bayes, classification, formula, science | Science and technology
2 | Bayes, signal-to-noise ratio, science | Science and technology
3 | formula, official documents | Patent class
4 (to be classified) | Bayes, signal-to-noise ratio, official documents |
The specific algorithm steps are as follows:
1. First, the documents A, B, C, etc. in the offline data are classified manually: A and B belong to science and technology, and C belongs to the patent class. The total number of distinct words in the documents is "Bayes" + "classification" + "formula" + "science" + "signal-to-noise ratio" + "official documents" = 6.
2. Take the documents A and B, which are of the same type; the corresponding total number of words is 6, and the probability of the word "Bayes" is (2)/(6+6) = 2/12, where the first 6 is the total number of words in the science and technology documents and the second 6 is the number of distinct words in all documents. The probability of each word is calculated in turn.
Probability of each word under science and technology documents:
"Bayes" = (2)/(6+6) = 2/12;
"classification" = (1+1)/(6+6) = 2/12;
"formula" = (1+1)/(6+6) = 2/12;
"science" = (2)/(6+6) = 2/12;
"signal-to-noise ratio" = (1+1)/(6+6) = 2/12;
Probability of each word under patent class documents:
"formula" = (1+1)/(2+6) = 2/8;
"official documents" = (1+1)/(2+6) = 2/8;
3. For document 4, which is to be classified:
probability under science and technology documents = (2/3) * (2/12) * (2/12) * (1/12) = 0.001543209;
probability under patent class documents = (1/3) * (1/8) * (1/8) * (2/8) = 0.001302083333.
In summary, the probability of document 4 under science and technology is greater than under the patent class, so it is classified as a science and technology document.
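A compact sketch of a naive Bayes text classifier on the same toy data is given below. It uses the textbook multinomial formulation with add-one (Laplace) smoothing, P(W | C) = (count of W in C + 1) / (words in C + vocabulary size), which is an assumption rather than the exact counting rule of the worked example, so the intermediate probabilities are not expected to match the figures above digit for digit. The function names `train` and `classify` are hypothetical.

```python
import math
from collections import Counter


def train(labeled_docs):
    """Build count tables from a list of (list_of_words, label) pairs."""
    doc_counts = Counter()   # label -> number of training documents
    word_counts = {}         # label -> Counter of word occurrences
    vocabulary = set()       # distinct words across all documents
    for words, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary.update(words)
    return doc_counts, word_counts, vocabulary


def classify(words, model):
    """Return the label with the highest log posterior score."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in doc_counts.items():
        counts = word_counts[label]
        total_words = sum(counts.values())
        # log P(C) + sum of log P(W_i | C), with add-one smoothing
        score = math.log(n_docs / total_docs)
        for w in words:
            score += math.log((counts[w] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


training = [
    (["Bayes", "classification", "formula", "science"], "science and technology"),
    (["Bayes", "signal-to-noise ratio", "science"], "science and technology"),
    (["formula", "official documents"], "patent class"),
]
model = train(training)
print(classify(["Bayes", "signal-to-noise ratio", "official documents"], model))
# prints "science and technology"
```

Summing logarithms instead of multiplying raw probabilities leaves the ranking of the types unchanged and avoids underflow on long documents; the smoothing keeps every factor positive, so the logarithm is always defined.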
Step S1044: the document association analysis module classifies the online documents according to their relevance.
Optionally, classifying the online documents according to their relevance comprises:
First, the ID of an online document, such as its file name or attributes, is obtained, and then the content information of the online document is read. The title or abstract of the document can be read first, or a summary can be drawn from the number of occurrences of high-frequency words, so as to determine roughly what the document is about. For example, if the term "liquid crystal display" appears many times, the document is considered to describe technical content related to liquid crystal displays. Next, the content information of all the other documents under the directory is analyzed in the same way, and the online documents with a high degree of association are aggregated. For example, if there are 100 documents under the directory and 30 of them mention "liquid crystal display", those 30 documents are aggregated into one class. Finally, the aggregated online documents are placed under the same directory, and that directory can be renamed; for example, after the 30 "liquid crystal display" documents are placed under the same directory, the directory is named "liquid crystal display".
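The keyword-based aggregation in step S1044 can be sketched as follows. This is a simplified illustration: the function name `group_by_keyword`, the caller-supplied list of candidate keywords, and the `min_hits` threshold are all assumptions, whereas the text describes deriving the topic from the document's own high-frequency words.

```python
from collections import defaultdict


def group_by_keyword(documents, keywords, min_hits=2):
    """Group documents under the candidate keyword each mentions most often.

    documents: dict mapping document ID -> text content.
    keywords:  candidate topic phrases (hypothetical input for illustration).
    min_hits:  minimum number of mentions needed to assign a document.
    Returns a dict mapping keyword -> list of document IDs; each keyword
    can serve as the name of the directory that receives its documents.
    """
    groups = defaultdict(list)
    for doc_id, text in documents.items():
        lowered = text.lower()
        hits = {k: lowered.count(k.lower()) for k in keywords}
        best = max(hits, key=hits.get, default=None)
        if best is not None and hits[best] >= min_hits:
            groups[best].append(doc_id)
    return dict(groups)
```

Documents that mention no keyword often enough are left out of every group, which corresponds to leaving them where they are in the original directory.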
Step S106: display the adjusted result in a preset interface.
Optionally, after the adjusted result is displayed in the preset interface, the method comprises:
The system automatically shows the adjusted classification structure in the interface as a preview, and the user can then judge the result of the automatic classification. If the user considers the result accurate, the user can accept it by clicking to confirm or by entering a confirmation command, and the online documents under the specified directory are then displayed according to the classification result. Otherwise, the user can choose not to accept the result, and the online documents under the specified directory remain unchanged.
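The preview-and-confirm behavior of step S106 can be sketched as follows. The layout representation (a dictionary from folder name to document names) and both function names are hypothetical choices made for illustration.

```python
def render_preview(layout):
    """Render a proposed layout as an indented text tree.

    layout: dict mapping folder name -> list of document names.
    """
    lines = []
    for folder in sorted(layout):
        lines.append(folder + "/")
        for name in sorted(layout[folder]):
            lines.append("  " + name)
    return "\n".join(lines)


def apply_if_accepted(current, proposed, accepted):
    """Return the proposed layout only when the user accepted the preview."""
    return proposed if accepted else current
```

Keeping the proposed layout separate from the current one until the user confirms means that rejecting the preview requires no rollback: the current layout was never modified.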
The present disclosure proposes an online document classification method in which the system automatically counts the online documents and, according to its settings, prompts the user to decide whether to classify them. When classification is carried out, a classification module is called to sort the online documents under the specified directory, so that they are organized and stored according to their content, which makes them easier for the user to find and improves office efficiency.
Embodiment 2
As shown in Fig. 3, according to a specific embodiment of the present disclosure, the disclosure provides a document classification apparatus, and in particular an apparatus for classifying online documents, although it is also applicable to ordinary documents. It can be applied to classifying the documents under a main directory (such as the user's home page) or under any specified folder, for example a folder that stores multiple documents. It is of course not limited to these: the automatic classification method can be executed for any location that contains multiple online documents. An online document here may be a Word or Excel document that the user can edit online, or any editor into which text can be entered. The apparatus specifically includes an acquiring unit 302, a classification unit 304, and a display unit 306.
Acquiring unit 302: configured to obtain a user ID and determine the number of documents under a specified directory that are associated with the user ID.
Each time an online document is opened for editing, the user editing it must be determined, which includes confirming ID information such as the user's account, user name, phone number, or email address. The number of documents under the specified directory that are associated with the user ID is then determined. The backend counts the online documents under the specified directory in real time: when a document is created in or copied into the directory, the count is increased by one; conversely, when a document is deleted, the count is decreased by one. The specified directory includes, but is not limited to, a specified root directory; for example, it may be the user's home page, a specified online path, or the current online editing page, and it may also be any specified folder, such as a folder that stores multiple documents. For ease of explanation, this embodiment is described using a folder as the specified directory.
Classification unit 304: configured to adjust the structural relationship between the documents under the specified directory when the number of documents exceeds a preset value.
The preset value can be set as desired, for example 20, 25, or 30. No strict limit is placed on the specific number, but a value greater than 10 is advisable, and 20 to 25 is preferred. Optionally, adjusting the structural relationship between the documents under the specified directory when the number of online documents exceeds the preset value comprises: when the number of online documents exceeds the preset value, giving a prompt asking whether to classify the documents; and, after classification is confirmed, automatically classifying the online documents under the specified directory.
The specific execution is as follows. As shown in Fig. 2, the classification unit is further configured as follows.
First, when the number of online documents exceeds the preset value, a document association analysis module is called automatically.
The analysis module is trained in advance using a Bayesian algorithm; the trained module can classify documents automatically according to their degree of association. For the Bayesian algorithm, see Embodiment 1 above; it is not repeated here.
Second, the document association analysis module classifies the online documents according to their relevance.
A specific example is as follows:
First, the ID of an online document, such as its file name or attributes, is obtained, and then the content information of the online document is read. The title or abstract of the document can be read first, or a summary can be drawn from the number of occurrences of high-frequency words, so as to determine roughly what the document is about. For example, if the term "liquid crystal display" appears many times, the document is considered to describe technical content related to liquid crystal displays. Next, the content information of all the other documents under the directory is analyzed in the same way, and the online documents with a high degree of association are aggregated. For example, if there are 100 documents under the directory and 30 of them mention "liquid crystal display", those 30 documents are aggregated into one class. Finally, the aggregated online documents are placed under the same directory, and that directory can be renamed; for example, after the 30 "liquid crystal display" documents are placed under the same directory, the directory is named "liquid crystal display".
Display unit 306: configured to display the adjusted result in a preset interface.
The display unit is further configured as follows: the system automatically shows the adjusted classification structure in the interface as a preview, and the user can then judge the result of the automatic classification. If the user considers the result accurate, the user can accept it by clicking to confirm or by entering a confirmation command, and the online documents under the specified directory are then displayed according to the classification result. Otherwise, the user can choose not to accept the result, and the online documents under the specified directory remain unchanged.
The present disclosure proposes an online document classification apparatus in which the system automatically counts the online documents and, according to its settings, prompts the user to decide whether to classify them. When classification is carried out, a classification module is called to sort the online documents under the specified directory, so that they are organized and stored according to their content, which makes them easier for the user to find and improves office efficiency.
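The cooperation of the three units in this embodiment can be sketched as follows. The class name and the three injected callables are hypothetical stand-ins for the acquiring unit 302, the classification unit 304, and the display unit 306; the sketch shows only the control flow between them.

```python
class DocumentClassificationDevice:
    """Wires together stand-ins for units 302, 304, and 306 (illustrative only)."""

    def __init__(self, count_documents, classify, show, preset_value=20):
        self.count_documents = count_documents  # role of acquiring unit 302
        self.classify = classify                # role of classification unit 304
        self.show = show                        # role of display unit 306
        self.preset_value = preset_value

    def run(self, user_id, directory):
        """Classify and display only when the count exceeds the preset value."""
        count = self.count_documents(user_id, directory)
        if count <= self.preset_value:
            return None
        proposed = self.classify(user_id, directory)
        self.show(proposed)
        return proposed
```

Passing the units in as callables keeps the flow testable without a real document store or user interface.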
Embodiment 3
As shown in Fig. 4, this embodiment provides an electronic device for classifying online documents. The electronic device comprises: at least one processor; and a memory communicatively connected to the at least one processor, wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following operations:
obtaining a user ID, and determining the number of documents under a specified directory that are associated with the user ID;
when the number of documents exceeds a preset value, adjusting the structural relationship between the documents under the specified directory;
displaying the adjusted result in a preset interface.
Optionally, after displaying the adjusted result in the preset interface, the operations comprise:
according to a user instruction received in the preset interface, displaying the structural relationship of the documents under the specified directory according to the adjusted result.
Optionally, adjusting the structural relationship between the documents under the specified directory when the number of online documents exceeds the preset value comprises:
when the number of documents exceeds the preset value, calculating the relevance between the documents according to a predetermined rule;
classifying the documents according to the relevance.
Optionally, classifying the documents according to the relevance comprises:
obtaining the ID of each document and reading the content information of the document;
aggregating the documents with a high degree of association;
placing the aggregated documents under the same directory.
Optionally, adjusting the structural relationship between the documents under the specified directory when the number of online documents exceeds the preset value comprises:
when the number of online documents exceeds the preset value, giving a prompt asking whether to classify the documents;
after classification is confirmed, automatically classifying the documents under the specified directory.
Embodiment 4
An embodiment of the present disclosure provides a non-volatile computer storage medium that stores computer-executable instructions, and the computer-executable instructions can perform any of the methods described above.
Embodiment 5
Below with reference to Fig. 4, it illustrates the structural representations for the electronic equipment 400 for being suitable for being used to realize the embodiment of the present disclosure
Figure.Terminal device in the embodiment of the present disclosure can include but is not limited to such as mobile phone, laptop, digital broadcasting and connect
Receive device, PDA (personal digital assistant), PAD (tablet computer), PMP (portable media player), car-mounted terminal (such as vehicle
Carry navigation terminal) etc. mobile terminal and such as number TV, desktop computer etc. fixed terminal.Electricity shown in Fig. 4
Sub- equipment is only an example, should not function to the embodiment of the present disclosure and use scope bring any restrictions.
As shown in figure 4, electronic equipment 400 may include processing unit (such as central processing unit, graphics processor etc.)
401, random access can be loaded into according to the program being stored in read-only memory (ROM) 402 or from storage device 408
Program in memory (RAM) 403 and execute various movements appropriate and processing.In RAM 403, it is also stored with electronic equipment
Various programs and data needed for 400 operations.Processing unit 401, ROM 402 and RAM 403 pass through the phase each other of bus 404
Even.Input/output (I/O) interface 405 is also connected to bus 404.
In general, the following devices can be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 407 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage devices 408 including, for example, magnetic tape and hard disk; and a communication device 409. The communication device 409 allows the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 4 shows an electronic device 400 having various devices, it should be understood that it is not required to implement or provide all of the devices shown. More or fewer devices may alternatively be implemented or provided.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 409, installed from the storage device 408, or installed from the ROM 402. When the computer program is executed by the processing unit 401, the above-described functions defined in the method of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: electric wire, optical cable, RF (radio frequency), or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device; it may also exist separately without being assembled into the electronic device.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof. These programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In certain cases, the name of a unit does not constitute a limitation on the unit itself.
Claims (10)
1. A document classification method, characterized by comprising:
obtaining a user ID and determining the number of documents under a specified directory that are associated with the user ID;
when the number of documents exceeds a preset value, adjusting the structural relationship between the documents under the specified directory; and
displaying the adjusted result on a preset interface.
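The three steps of claim 1 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the `Document` record, the `PRESET_VALUE` threshold of 3, the in-memory document list, and the grouping by file extension (a stand-in for whatever adjustment an implementation actually applies) are all hypothetical and do not come from the disclosure.

```python
from collections import defaultdict
from dataclasses import dataclass

PRESET_VALUE = 3  # hypothetical threshold; the disclosure does not fix a value


@dataclass(frozen=True)
class Document:
    doc_id: str
    owner_id: str   # user ID the document is associated with
    directory: str  # directory the document currently sits in
    name: str


def count_user_documents(docs, user_id, directory):
    """Step 1: count the user's documents under the specified directory."""
    return sum(1 for d in docs if d.owner_id == user_id and d.directory == directory)


def adjust_structure(docs, user_id, directory):
    """Step 2: regroup the documents only when the count exceeds the preset value.

    Grouping by file extension is a placeholder rule chosen for brevity.
    Returns a mapping of proposed subdirectory -> document names, or None.
    """
    if count_user_documents(docs, user_id, directory) <= PRESET_VALUE:
        return None
    proposal = defaultdict(list)
    for d in docs:
        if d.owner_id == user_id and d.directory == directory:
            ext = d.name.rsplit(".", 1)[-1] if "." in d.name else "other"
            proposal[f"{directory}/{ext}"].append(d.name)
    return dict(proposal)


def render(proposal):
    """Step 3: format the adjusted result for display on some interface."""
    lines = []
    for subdir in sorted(proposal):
        lines.append(subdir)
        lines.extend(f"  {name}" for name in sorted(proposal[subdir]))
    return "\n".join(lines)


docs = [
    Document("1", "u1", "/home", "plan.docx"),
    Document("2", "u1", "/home", "budget.xlsx"),
    Document("3", "u1", "/home", "notes.docx"),
    Document("4", "u1", "/home", "q1.xlsx"),
    Document("5", "u2", "/home", "other.docx"),
]
result = adjust_structure(docs, "u1", "/home")
print(render(result))
```

In this sketch the proposal is computed without moving any document, so the result can be shown first and applied only later.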
2. The method of claim 1, characterized in that, after displaying the adjusted result on the preset interface, the method comprises:
according to a user instruction received on the preset interface, presenting the structural relationship of the documents under the specified directory according to the adjusted result.
3. The method of claim 1, characterized in that adjusting the structural relationship between the documents under the specified directory when the number of online documents exceeds the preset value comprises:
when the number of documents exceeds the preset value, calculating the relevance between the documents according to a predetermined rule; and
classifying the documents according to the relevance.
4. The method of claim 3, characterized in that classifying the documents according to the relevance comprises:
obtaining the IDs of the documents and reading the content information of the documents;
aggregating documents with a high degree of relevance; and
placing the aggregated documents under the same directory.
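The relevance calculation and aggregation of claims 3 and 4 can be illustrated with a minimal sketch. The claims leave the "predetermined rule" open; here it is assumed to be Jaccard similarity over word sets with a hypothetical cut-off of 0.3, and aggregation is assumed to mean taking connected components of sufficiently relevant documents. The function names and the `/home/group_N` directory naming are likewise illustrative assumptions.

```python
import re
from itertools import combinations

SIMILARITY_THRESHOLD = 0.3  # hypothetical cut-off for "high" relevance


def tokens(text):
    """Lower-cased word set of a document's content."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def aggregate(contents, threshold=SIMILARITY_THRESHOLD):
    """Group document IDs whose pairwise relevance reaches the threshold.

    `contents` maps document ID -> content text. Groups are the connected
    components of the "relevant enough" graph, found with union-find.
    """
    parent = {doc_id: doc_id for doc_id in contents}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    token_sets = {doc_id: tokens(text) for doc_id, text in contents.items()}
    for a, b in combinations(contents, 2):
        if jaccard(token_sets[a], token_sets[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for doc_id in contents:
        groups.setdefault(find(doc_id), []).append(doc_id)
    return sorted(sorted(g) for g in groups.values())


def assign_directories(groups, base="/home"):
    """Place every aggregated group under one directory of its own."""
    return {
        doc_id: f"{base}/group_{i}"
        for i, group in enumerate(groups, start=1)
        for doc_id in group
    }


contents = {
    "d1": "quarterly budget report for marketing",
    "d2": "marketing budget report draft",
    "d3": "team offsite travel itinerary",
    "d4": "travel itinerary for the offsite",
}
groups = aggregate(contents)
print(groups)
print(assign_directories(groups))
```

Comparing every pair of documents is quadratic in the number of documents, which is acceptable for a single directory but would need an index or a vector-based approach at larger scale.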
5. The method of claim 1, characterized in that adjusting the structural relationship between the documents under the specified directory when the number of online documents exceeds the preset value comprises:
when the number of online documents exceeds the preset value, providing prompt information asking whether to classify the documents; and
after classification is confirmed, automatically classifying the documents under the specified directory.
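The prompt-then-confirm flow of claim 5 might look like the sketch below. The `maybe_classify` function and its `ask` and `classify` hooks are hypothetical names introduced for illustration; they are injected as callables so the flow can be exercised without a real user interface.

```python
def maybe_classify(doc_count, preset_value, ask, classify):
    """Prompt the user once the count exceeds the preset value.

    `ask` is any callable that shows the prompt text and returns True when the
    user confirms; `classify` performs the automatic classification. Both are
    hypothetical hooks, injected so the flow can run without a UI.
    Returns the classification result, or None when nothing was done.
    """
    if doc_count <= preset_value:
        return None  # at or below the threshold: no prompt at all
    prompt = (
        f"This directory contains {doc_count} documents. "
        "Classify them automatically?"
    )
    if not ask(prompt):
        return None  # user declined: leave the directory untouched
    return classify()


shown = []  # records every prompt that was displayed


def confirm(prompt):
    """Stand-in for a dialog in which the user always confirms."""
    shown.append(prompt)
    return True


outcome = maybe_classify(12, 10, confirm, lambda: "classified")
print(outcome)
```

Keeping the prompt separate from the classification step means a declined prompt has no side effects on the directory.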
6. A document classification apparatus, characterized by comprising:
an acquiring unit, configured to obtain a user ID and determine the number of documents under a specified directory that are associated with the user ID;
a classifying unit, configured to adjust the structural relationship between the documents under the specified directory when the number of documents exceeds a preset value; and
a display unit, configured to display the adjusted result on a preset interface.
7. The apparatus of claim 6, characterized in that the display unit is further configured to:
according to a user instruction received on the preset interface, present the structural relationship of the documents under the specified directory according to the adjusted result.
8. The apparatus of claim 7, characterized in that the classifying unit is further configured to:
when the number of documents exceeds the preset value, calculate the relevance between the documents according to a predetermined rule; and
classify the documents according to the relevance.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the method of any one of claims 1 to 5.
10. An electronic device, characterized by comprising:
one or more processors; and
a storage device, configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910206339.XA CN110096590A (en) | 2019-03-19 | 2019-03-19 | A kind of document classification method, apparatus, medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110096590A true CN110096590A (en) | 2019-08-06 |
Family
ID=67443203
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910206339.XA Pending CN110096590A (en) | 2019-03-19 | 2019-03-19 | A kind of document classification method, apparatus, medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110096590A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1773492A (en) * | 2004-11-09 | 2006-05-17 | 国际商业机器公司 | Method for organizing multi-file and equipment for displaying multi-file |
CN1855094A (en) * | 2005-04-28 | 2006-11-01 | 国际商业机器公司 | Method and device for processing electronic files of users |
CN104160395A (en) * | 2012-02-29 | 2014-11-19 | Ubic股份有限公司 | Document classification system, document classification method, and document classification program |
CN107943984A (en) * | 2017-11-30 | 2018-04-20 | 广东欧珀移动通信有限公司 | Image processing method, device, computer equipment and computer-readable recording medium |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110674082A (en) * | 2019-09-24 | 2020-01-10 | 北京字节跳动网络技术有限公司 | Method and device for removing online document, electronic equipment and computer readable medium |
CN111858518A (en) * | 2020-07-09 | 2020-10-30 | 北京字节跳动网络技术有限公司 | Method and device for updating reference document, electronic equipment and storage medium |
CN111858518B (en) * | 2020-07-09 | 2022-10-25 | 北京字节跳动网络技术有限公司 | Method and device for updating reference document, electronic equipment and storage medium |
CN111858476A (en) * | 2020-07-20 | 2020-10-30 | 上海闻泰电子科技有限公司 | File processing method and device, electronic equipment and computer readable storage medium |
CN112269870A (en) * | 2020-11-03 | 2021-01-26 | 北京字跳网络技术有限公司 | Document sorting method and device, electronic equipment and computer readable storage medium |
CN113254583A (en) * | 2021-05-28 | 2021-08-13 | 北京明略软件系统有限公司 | Document marking method, device and medium based on semantic vector |
CN113254583B (en) * | 2021-05-28 | 2021-11-02 | 北京明略软件系统有限公司 | Document marking method, device and medium based on semantic vector |
CN115757799A (en) * | 2022-12-02 | 2023-03-07 | 松原市邹佳网络科技有限公司 | Data storage method and system based on artificial intelligence and cloud platform |
CN115757799B (en) * | 2022-12-02 | 2023-10-24 | 北京国联视讯信息技术股份有限公司 | Data storage method and system based on artificial intelligence and cloud platform |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110096590A (en) | A kind of document classification method, apparatus, medium and electronic equipment | |
CN109634698B (en) | Menu display method and device, computer equipment and storage medium | |
US9331971B2 (en) | Message subscription based on message aggregate characteristics | |
CN110162796B (en) | News thematic creation method and device | |
US8458194B1 (en) | System and method for content-based document organization and filing | |
WO2020155750A1 (en) | Artificial intelligence-based corpus collecting method, apparatus, device, and storage medium | |
CN111680254B (en) | Content recommendation method and device | |
WO2013189296A1 (en) | Method and system for processing recommended target software | |
CN106911757A (en) | The method for pushing and device of a kind of business information | |
TW201833851A (en) | Risk control event automatic processing method and apparatus | |
CN108764319A (en) | A kind of sample classification method and apparatus | |
WO2023272850A1 (en) | Decision tree-based product matching method, apparatus and device, and storage medium | |
US9002832B1 (en) | Classifying sites as low quality sites | |
CN109284367B (en) | Method and device for processing text | |
CN110362815A (en) | Text vector generation method and device | |
KR20180011261A (en) | Search processing method and apparatus | |
CN103942328A (en) | Video retrieval method and video device | |
CN110321447A (en) | Determination method, apparatus, electronic equipment and the storage medium of multiimage | |
CN110489156A (en) | Edition control method, device, medium and the electronic equipment of binary format | |
US10474700B2 (en) | Robust stream filtering based on reference document | |
CN112084448B (en) | Similar information processing method and device | |
CN114443943A (en) | Information scheduling method, device and equipment and computer readable storage medium | |
CN110704139B (en) | Icon classification method and device | |
CN111428159A (en) | Online classification method and device | |
CN113051919A (en) | Method and device for identifying named entity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||