CN106407429A - File tracking method, device and system - Google Patents

Info

Publication number
CN106407429A
Authority
CN
China
Prior art keywords
log
file
event
terminal machine
database
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610857077.XA
Other languages
Chinese (zh)
Inventor
顾广宇
张淑娟
蔡翔
易庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
Application filed by State Grid Corp of China SGCC and Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
Priority to CN201610857077.XA
Publication of CN106407429A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/11 File system administration, e.g. details of archiving or snapshots

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a file tracking method, device and system. In one embodiment, the file tracking method comprises: acquiring related information of a terminal machine, and capturing and monitoring the operation behavior on files in a designated directory of the terminal machine, wherein the related information comprises the physical address, network address and operating system user of the terminal machine; generating a corresponding file operation event according to the operation behavior captured on a file in the designated directory; converting the file operation event into a log and storing the log in a database; and, after receiving a request sent by a terminal to locate and track the full life cycle of a target file, locating logs according to the data stored in the database and constructing a life-cycle tree of the target file. With the file tracking method, device and system, the life-cycle tree, a unified and intuitive structure, enables systematic tracking and control of the operation behavior on a data object throughout its full life cycle.

Description

File tracking method, device and system
Technical field
The present invention relates to the technical field of file management, and in particular to a file tracking method, device and system.
Background
In a big data environment, reliable and effective management and control of massive data has become a focus of data security research. Unstructured data such as text, images and video comes in many kinds and large quantities, is widely distributed and varies in size, which makes supervising data objects in a file system even more challenging.
The management and control of unstructured data mainly faces the following problems and challenges:
1) Tracking and control of operation behavior over the full life cycle of a data object is not systematic enough.
2) When a data object behaves abnormally, it is difficult to locate, track and collect evidence on its full life cycle quickly and accurately. For example, when a security incident such as a data object leak occurs, evidence collection usually requires manually querying a large number of audit logs to find the relevant information, which is time-consuming and laborious, and accuracy is hard to guarantee.
3) Effective data visualization analysis is lacking. Large volumes of audit logs conceal great data value. Visual analysis not only lets users intuitively understand the basic state of current data management and control, but also reveals trends in aspects such as data object usage frequency, data type, capacity and distribution, thereby improving the efficiency of real-time data management and control.
Summary of the invention
An object of the present invention is to provide a file tracking method and system based on a full-life-cycle tree that can effectively solve the problems in the prior art, in particular the problem that tracking and control of operation behavior over the full life cycle of a data object is not systematic enough.
To solve the above technical problem, the present invention adopts the following technical solution: a file tracking method based on a full-life-cycle tree, which uses a data full-life-cycle tree storage structure to track and control the operation behavior on a data object throughout its full life cycle.
In view of this, an object of the embodiments of the present invention is to provide a file tracking method, device and system.
An embodiment of the present invention provides a file tracking method, applied to a server communicatively connected to a terminal machine, the method comprising:
S1, collecting related information of the terminal machine, and capturing and monitoring the operation behavior on files under a designated directory of the terminal machine, the related information including the physical address, network address and operating system user of the terminal machine;
S2, generating a corresponding file operation event according to the operation behavior captured on a file under the designated directory;
S3, converting the file operation event into a log and storing it in a database;
S4, after receiving a request sent by a terminal to locate and track the full life cycle of a target file, locating logs according to the data stored in the database and building the life-cycle tree of the target file.
Preferably, step S3 includes: if the file operation event is a first-class event, generating a row key, generating a log from the first-class event, adding the row key value of this node to the first field of the log as the root node pointer, and storing the log in the database;
if the file operation event is a second-class event, generating a row key, generating a log from the second-class event, looking up the latest log under the data of the second-class event, adding the content of the first field of that latest log to the first field of the newly generated log, and storing the log in the database; after the log is stored successfully, adding the row key of the newly generated log to the second field of that latest log;
if the file operation event is a third-class event, generating a row key, generating a log from the third-class event, looking up the latest log under the data of the third-class event, adding the content of the first field of that latest log to the first field of the newly generated log, and storing the log in the database.
Preferably, step S4 includes: after the server receives the request sent by the terminal to locate and track the full life cycle of the target file, the server generates a row key from the related information of the target file contained in the received request, uses it as an index, and determines the log entry;
the server obtains the corresponding data identifier from the log entry, queries the latest log under the data identifier, and obtains the root node pointer from that latest log, thereby obtaining the root node log; it then iterates from the root node to build the tree.
Preferably, the file operation event is sent to the server in real time by the terminal machine when the file operation event is captured, or is stored by the terminal machine when the file operation is captured and then sent to the server at irregular intervals.
Preferably, the database is an HBase distributed database.
An embodiment of the present invention also provides a file tracking device, applied to a server communicatively connected to a terminal machine, the device comprising:
an acquisition module, configured to collect related information of the terminal machine, capture and monitor the operation behavior on files under a designated directory of the terminal machine, and generate a corresponding file operation event from the operation behavior, the related information including the physical address, network address and operating system user of the terminal machine;
a processing module, configured to convert the file operation event into a log;
a storage module, configured to store the log in a database;
a tracking module, configured to, after receiving a request sent by a terminal to locate and track the full life cycle of a target file, locate logs according to the data stored in the database and build the life-cycle tree of the target file.
Preferably, the processing module is further configured to: if the file operation event is a first-class event, generate a row key, generate a log from the first-class event, add the row key value of this node to the first field of the log as the root node pointer, and store the log in the database;
if the file operation event is a second-class event, generate a row key, generate a log from the second-class event, look up the latest log under the data of the second-class event, add the content of the first field of that latest log to the first field of the newly generated log, and store the log in the database; after the log is stored successfully, add the row key of the newly generated log to the second field of that latest log;
if the file operation event is a third-class event, generate a row key, generate a log from the third-class event, look up the latest log under the data of the third-class event, add the content of the first field of that latest log to the first field of the newly generated log, and store the log in the database.
Preferably, the tracking module is further configured to: generate a row key from the related information of the target file contained in the received request to locate and track the full life cycle of the target file, use it as an index, and determine the log entry;
query the latest log under the data identifier and obtain the root node pointer from that latest log, thereby obtaining the root node log; and iterate from the root node to build the tree.
An embodiment of the present invention also provides a file tracking system, the system comprising a terminal machine and a server communicatively connected to each other.
The terminal machine includes:
a terminal behavior monitoring module, configured to capture and monitor the operation behavior on files under a designated directory and generate a corresponding file operation event according to the operation behavior;
a sending module, configured to send the file operation event to the server;
a tracking request module, configured to send a request to the server to locate and track the full life cycle of a target file.
The server includes:
a processing module, configured to convert the file operation event into a log;
a storage module, configured to store the log in a database;
a tracking module, configured to, after receiving a request sent by a terminal to locate and track the full life cycle of a target file, locate logs according to the data stored in the database and build the life-cycle tree of the target file.
Preferably, the sending module of the terminal machine is configured to:
send the file operation event to the server in real time; or
store the file operation event in the terminal machine, and send the file operation events stored in the terminal machine to the server at irregular intervals.
According to the method, device and system in the above embodiments, the life-cycle tree, a unified and intuitive structure, enables systematic tracking and control of operation behavior over the full life cycle of a data object.
To make the above objects, features and advantages of the present invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of the interaction between the server and the terminal machine provided by a preferred embodiment of the present invention.
Fig. 2 is a block diagram of the server provided by a preferred embodiment of the present invention.
Fig. 3 is a flow chart of the file tracking method provided by the first embodiment of the present invention.
Fig. 4 is a schematic flow chart for determining the file operation event type in the file tracking method provided by a preferred embodiment of the present invention.
Fig. 5 is a functional block diagram of the file tracking device provided by the second embodiment of the present invention.
Fig. 6 is a functional block diagram of the file tracking system provided by the third embodiment of the present invention.
Reference numerals: 100 - server; 200 - terminal machine; 300 - network; 400 - database; 110 - file tracking device; 111 - memory; 112 - processor; 113 - communication unit; 1101 - acquisition module; 1102 - processing module; 1103 - storage module; 1104 - tracking module; 210 - terminal behavior monitoring module; 220 - sending module; 230 - tracking request module.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments of the present invention generally described and illustrated in the drawings herein can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. In the description of the present invention, the terms "first", "second" and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Fig. 1 is a schematic diagram of the interaction between the server 100 and the terminal machine 200 provided by a preferred embodiment of the present invention. The server 100 is communicatively connected to the terminal machine 200 through the network 300 for data communication or interaction. The server 100 may be a web server, a database server or the like. The terminal machine 200 may be a personal computer (PC), a tablet computer, a smartphone, a personal digital assistant (PDA) or the like.
Fig. 2 is a block diagram of the server 100 shown in Fig. 1. The server 100 includes a file tracking device 110, a memory 111, a processor 112 and a communication unit 113.
The memory 111, the processor 112 and the communication unit 113 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements may be electrically connected to one another through one or more communication buses or signal lines. The file tracking device 110 includes at least one software function module that can be stored in the memory 111 in the form of software or firmware, or built into the operating system (OS) of the server 100. The processor 112 is configured to execute the executable modules stored in the memory 111, such as the software function modules and computer programs included in the file tracking device 110.
The memory 111 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM) and the like. The memory 111 is used to store a program, and the processor 112 executes the program after receiving an execution instruction. The communication unit 113 is used to establish a communication connection between the server 100 and the terminal machine 200 through the network 300, and to send and receive data through the network 300.
First embodiment
Refer to Fig. 3, which is a flow chart of the file tracking method, provided by a preferred embodiment of the present invention, applied to the server 100 shown in Fig. 2. The flow shown in Fig. 3 is described in detail below.
Step S101, collecting related information of the terminal machine 200, and capturing and monitoring the operation behavior on files under a designated directory of the terminal machine 200.
The related information includes the physical address, network address and operating system user of the terminal machine 200. The related information may of course also include further information about the terminal machine 200.
Step S102, generating a corresponding file operation event according to the operation behavior captured on a file under the designated directory.
In detail, when an operation behavior on a file is captured, the corresponding file-related information is extracted. The file-related information includes the name, path and file type of the data object, the operation behavior, the operation time, operator-related information and so on. An operation behavior may change information such as the data object's name, path or file type; if it does, the values before and after the operation are recorded, for example recording that "test A.txt" was renamed to "test B.txt". A file operation event is then generated from the extracted file-related information.
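The extraction described above can be sketched as a small record type. This is only an illustration under stated assumptions: the class names, field names and sample values below are hypothetical and are not taken from the patent, which lists only the kinds of information an event carries.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import time


@dataclass
class FileInfo:
    """Name, path and file type of a data object at one point in time."""
    name: str
    path: str
    file_type: str


@dataclass
class FileOperationEvent:
    """One captured operation. All field names here are illustrative."""
    operation: str               # e.g. "create", "rename", "open"
    operator: str                # operating system user
    mac: str                     # physical address of the terminal machine
    ip: str                      # network address of the terminal machine
    before: Optional[FileInfo]   # state before the operation (None for a new file)
    after: FileInfo              # state after the operation
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))

    def changed_fields(self) -> List[str]:
        """Return which of name/path/file_type the operation changed."""
        if self.before is None:
            return []
        return [
            attr for attr in ("name", "path", "file_type")
            if getattr(self.before, attr) != getattr(self.after, attr)
        ]


# The rename example from the text: only the name changes.
rename = FileOperationEvent(
    operation="rename",
    operator="alice",
    mac="00-11-22-33-44-55",
    ip="192.0.2.10",
    before=FileInfo("test A.txt", "D:/docs", "txt"),
    after=FileInfo("test B.txt", "D:/docs", "txt"),
)
print(rename.changed_fields())
```

Keeping both the before and after state in the event is one way to preserve the "situation of change" that the text asks to be recorded.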
In other embodiments, the capturing and monitoring of the operation behavior on files under the designated directory of the terminal machine 200 may be performed by the terminal machine 200 itself, which then sends the file operation events to the server 100 in real time or in pseudo-real time. That is, the terminal machine 200 either sends a file operation event to the server 100 in real time when it captures the event, or stores the event when the file operation is captured and sends it to the server 100 at irregular intervals. In detail, in the pseudo-real-time mode the file operation event is first stored locally, and the terminal machine 200 transmits the stored file operation events at some later, indeterminate time.
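The two sending modes can be illustrated with a short sketch. It assumes a hypothetical `send` callback standing in for the network transport to the server; the class and method names are illustrative and not part of the patent.

```python
from typing import Callable, List


class EventSender:
    """Sends events immediately, or buffers them locally and flushes later."""

    def __init__(self, send: Callable[[List[dict]], None], real_time: bool = True):
        self._send = send            # transport to the server (hypothetical callback)
        self._real_time = real_time
        self._buffer: List[dict] = []

    def on_event(self, event: dict) -> None:
        if self._real_time:
            self._send([event])          # real-time mode: forward at once
        else:
            self._buffer.append(event)   # pseudo-real-time mode: keep locally

    def flush(self) -> int:
        """Send everything buffered so far; return how many events were sent."""
        if not self._buffer:
            return 0
        batch, self._buffer = self._buffer, []
        self._send(batch)
        return len(batch)


received: List[dict] = []
sender = EventSender(received.extend, real_time=False)
sender.on_event({"operation": "open", "file": "report.docx"})
sender.on_event({"operation": "edit", "file": "report.docx"})
pending_before_flush = len(received)   # nothing has reached the server yet
flushed = sender.flush()               # called at some later, irregular time
```

When `flush` is called is left open here, matching the "irregular intervals" of the text; a timer or a reconnect hook would be typical triggers.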
Step S103, converting the file operation event into a log and storing it in the database 400.
In one example, file operation events can be divided into three classes: first-class events, second-class events and third-class events. If the file operation event creates a data object, for example creating a new file, it is a first-class event. If the file operation event changes the dataId of the data object, for example renaming, moving, saving as, or changing the extension by renaming, so that at least one of the data object's path, name and data type differs before and after the operation, it is a second-class event. All other events are third-class events, for example opening or editing a document, where the data object's path, name and data type are all unchanged by the operation.
Fig. 4 is a schematic flow chart for determining the file operation event type in one example. In this embodiment, the flow includes:
Step S1031, determining whether the file operation event is a new-file event; if so, it is determined to be a first-class event; if not, step S1032 is executed.
Step S1032, determining whether the file operation event is a name-change event; if so, it is determined to be a second-class event; if not, step S1033 is executed.
Step S1033, determining whether the file operation event is a path-change event; if so, it is determined to be a second-class event; if not, step S1034 is executed.
Step S1034, determining whether the file operation event is a type-change event; if so, it is determined to be a second-class event; if not, it is determined to be a third-class event.
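The decision flow of steps S1031 to S1034 can be written as a short function. This is a minimal sketch: the function name, the tuple layout and the integer class constants are assumptions made for illustration.

```python
from typing import Optional, Tuple

# (path, name, file_type) of a data object
FileState = Tuple[str, str, str]

FIRST_CLASS, SECOND_CLASS, THIRD_CLASS = 1, 2, 3


def classify_event(is_new_file: bool,
                   before: Optional[FileState],
                   after: FileState) -> int:
    """Classify a file operation event following steps S1031 to S1034."""
    if is_new_file:                  # S1031: new-file event
        return FIRST_CLASS
    if before is None:
        raise ValueError("an existing file needs its state before the operation")
    path_before, name_before, type_before = before
    path_after, name_after, type_after = after
    if name_before != name_after:    # S1032: name changed
        return SECOND_CLASS
    if path_before != path_after:    # S1033: path changed
        return SECOND_CLASS
    if type_before != type_after:    # S1034: type changed
        return SECOND_CLASS
    return THIRD_CLASS               # nothing that affects dataId changed


state = ("D:/docs", "report", "docx")
print(classify_event(True, None, state))                            # new file
print(classify_event(False, state, ("D:/docs", "final", "docx")))   # rename
print(classify_event(False, state, state))                          # open or edit
```

The three checks for second-class events could be collapsed into a single tuple comparison; they are kept separate here only to mirror the step numbering in Fig. 4.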
If the file operation event is a first-class event, a row key is generated, a log is generated from the first-class event, the row key value of this node is added to the first field of the log as the root node pointer, and the log is stored in the database 400.
If the file operation event is a second-class event, a row key is generated, a log is generated from the second-class event, and the latest log under the data of the second-class event is looked up. The content of the first field of that latest log is added to the first field of the newly generated log, and the log is stored in the database 400. After the log is stored successfully, the row key of the newly generated log is added to the second field of that latest log.
If the file operation event is a third-class event, a row key is generated, a log is generated from the third-class event, and the latest log under the data of the third-class event is looked up. The content of the first field of that latest log is added to the first field of the newly generated log, and the log is stored in the database 400.
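The three storage rules can be sketched with an in-memory dictionary standing in for the database. Several things here are assumptions for illustration only: the names `root_ptr` and `child_ptrs` for the first and second fields, the counter-based row key format, and the choice to look up the "latest log" of a second-class event under the identifier the data object had before the operation.

```python
import itertools
from typing import Dict, Optional

FIRST_CLASS, SECOND_CLASS, THIRD_CLASS = 1, 2, 3


class LogStore:
    """In-memory stand-in for the database; maps row key -> log record."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}
        self._latest: Dict[str, str] = {}   # data id -> row key of its latest log
        self._seq = itertools.count(1)

    def _new_row_key(self, data_id: str) -> str:
        # Hypothetical key format, used only to keep this sketch self-contained.
        return f"{data_id}#{next(self._seq):06d}"

    def store(self, event_class: int, data_id: str, operation: str,
              prev_data_id: Optional[str] = None) -> str:
        row_key = self._new_row_key(data_id)
        log = {"operation": operation, "data_id": data_id,
               "root_ptr": None, "child_ptrs": []}
        if event_class == FIRST_CLASS:
            log["root_ptr"] = row_key    # a root node points at itself
            latest = None
        else:
            # Assumption: a second-class event is linked to the latest log of
            # the identifier the data object had before the operation.
            lookup_id = prev_data_id if event_class == SECOND_CLASS else data_id
            if lookup_id is None:
                raise ValueError("a second-class event needs prev_data_id")
            latest = self.rows[self._latest[lookup_id]]
            log["root_ptr"] = latest["root_ptr"]   # copy the root node pointer
        self.rows[row_key] = log
        if event_class == SECOND_CLASS:
            # After a successful store, record the new log as a child of the
            # previous latest log.
            latest["child_ptrs"].append(row_key)
        self._latest[data_id] = row_key
        return row_key


store = LogStore()
root = store.store(FIRST_CLASS, "id-A", "create")
opened = store.store(THIRD_CLASS, "id-A", "open")
renamed = store.store(SECOND_CLASS, "id-B", "rename", prev_data_id="id-A")
```

After these three calls every log carries the same root pointer, and the "open" log, which was the latest log under `id-A` when the rename arrived, holds the pointer to the rename log.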
Further, in this embodiment, the database 400 may be an HBase distributed database. The first field is the root node row key field, and the second field is the child node row key field. Specifically, if the file operation event is a first-class event, a row key is generated, a new log is generated from the file operation event, the row key value of this node is added to the root node field of the log (the data:root_rk field) as the root node pointer, and the log is stored in the HBase distributed database. The log generated from a first-class event is itself the root node; it stores its own row key value in its data:root_rk field so that the child nodes of this root node can look up and copy the root node pointer.
If the file operation event is a second-class event, a row key is generated, a new log is generated from the file operation event, and the latest log under the data identifier (dataId) corresponding to the second-class event is looked up. The content of the root node field (data:root_rk) of that latest log is added to the root node field (data:root_rk) of the new log, which is then stored in the HBase distributed database. After the log is stored successfully, the row key of the new log is added to the child node field (data:sub_rk) of the previous latest log; that is, a pointer to the newest log is added to the former log. data:root_rk and data:sub_rk act as node pointers in a tree data structure, pointing to the root node and to the corresponding child nodes respectively.
If the file operation event is a third-class event, a row key is generated, a new log is generated from the event, and the latest log under the data identifier (dataId) corresponding to the third-class event is looked up. The content of the root node field (data:root_rk) of that latest log, i.e. the root node pointer, is added to the data:root_rk field of the new log, which is then stored in the HBase distributed database.
In one example, the row key of the HBase distributed database is: RowKey = dataId + (MAX_VALUE - timestamp), where dataId = Hash(Mac + path + filename + extensions).
Step S104, receiving a request sent by a terminal to locate and track the full life cycle of a target file.
The terminal may be the terminal machine 200 or another electronic terminal. The user can view the logs related to the full life cycle of the target file, and the life-cycle tree, through a browser installed on the terminal machine 200 or on another electronic terminal.
In addition, in other embodiments, the user can also issue a visual analysis request through a browser installed on the terminal machine 200 or on another electronic terminal. The server 100 then looks up the statistical model in a relational database, calls the relevant statistical algorithm, performs statistical analysis on the distributed database, and returns the corresponding result. Specifically, a visual analysis module can be used to perform statistical analysis on the stored audit logs and obtain results such as the access frequency ranking of unstructured data objects, the hotspot distribution ranking, the data type ranking, the ranking of life-cycle trees by breadth and depth, and data volume trend predictions. The visual analysis module can be divided into a statistical model layer and an algorithm layer: the statistical model layer defines the statistical requirements, and the algorithm layer defines the relevant algorithm logic according to those requirements. Besides the preset statistical models, a custom interface is reserved so that new statistical models and related statistical algorithms can be added later. The statistical analysis function of the visual analysis module uses the distributed computing tool MapReduce, calling an algorithm for each statistical requirement to perform the corresponding data analysis. Distributed computing enables parallel computation, which considerably reduces the time needed for statistical analysis.
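As an illustration of one such statistic, the access frequency ranking can be expressed in the map and reduce style that the text attributes to MapReduce. This sketch runs in a single Python process and does not use Hadoop; the log layout and the set of operations counted as accesses are assumptions.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Assumption: which operations count as an "access" for this statistic.
ACCESS_OPERATIONS = {"open", "preview", "edit"}


def map_phase(logs: Iterable[dict]) -> Iterator[Tuple[str, int]]:
    """Emit (data_id, 1) for every log that records an access."""
    for log in logs:
        if log["operation"] in ACCESS_OPERATIONS:
            yield log["data_id"], 1


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Sum the emitted counts per data object."""
    counts: Counter = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def access_frequency_ranking(logs: Iterable[dict], top_n: int = 10) -> List[Tuple[str, int]]:
    """Return the top_n most frequently accessed data objects."""
    return reduce_phase(map_phase(logs)).most_common(top_n)


audit_logs = [
    {"data_id": "A", "operation": "open"},
    {"data_id": "A", "operation": "edit"},
    {"data_id": "B", "operation": "open"},
    {"data_id": "C", "operation": "create"},
    {"data_id": "A", "operation": "open"},
]
ranking = access_frequency_ranking(audit_logs, top_n=2)
print(ranking)
```

In a distributed setting the map phase would run in parallel over partitions of the audit log table and the reduce phase would merge the partial counts, which is where the time saving mentioned in the text would come from.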
Step S105, locating logs according to the data stored in the database 400 and building the life-cycle tree of the target file.
After the server 100 receives the request sent by the terminal to locate and track the full life cycle of the target file, it can generate a row key from the related information of the target file contained in the request and use it as an index to determine the log entry.
The server 100 can then obtain the corresponding data identifier (dataId) from the log entry, query the latest log under that data identifier (dataId), and obtain the root node pointer from that latest log, thereby obtaining the root node log. Iteration then starts from the root node to build the tree. For nodes that are not stored sequentially, the set of child nodes is found in the parent node, and each child node is traversed from there. For parent and child nodes whose dataId is unchanged and which are stored sequentially, the child node is indexed directly. When the iteration finishes, the data tree is complete.
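The iteration from the root node can be sketched over a dictionary of logs. This is one possible reading, offered under assumptions: explicit child pointers represent the non-sequential links, the next third-class log under the same dataId represents the sequentially stored child, and the field names `ts`, `event_class`, `root_ptr` and `child_ptrs` are hypothetical.

```python
from typing import Dict, List

SECOND_CLASS = 2


def build_tree(rows: Dict[str, dict], located_row_key: str) -> dict:
    """Build the life-cycle tree that contains the located log entry."""
    # Group row keys by data id in time order, standing in for the
    # sequential storage of logs that share a dataId.
    by_data_id: Dict[str, List[str]] = {}
    for key in sorted(rows, key=lambda k: rows[k]["ts"]):
        by_data_id.setdefault(rows[key]["data_id"], []).append(key)

    def next_sequential(key: str) -> List[str]:
        """Directly indexed child: the next log under the same dataId."""
        siblings = by_data_id[rows[key]["data_id"]]
        i = siblings.index(key) + 1
        # A second-class log is reached through an explicit child pointer,
        # so it is never treated as a sequential child.
        if i < len(siblings) and rows[siblings[i]]["event_class"] != SECOND_CLASS:
            return [siblings[i]]
        return []

    def expand(key: str) -> dict:
        log = rows[key]
        children = next_sequential(key) + log["child_ptrs"]
        return {"row_key": key, "operation": log["operation"],
                "children": [expand(child) for child in children]}

    root_key = rows[located_row_key]["root_ptr"]   # follow the root node pointer
    return expand(root_key)


rows = {
    "r1": {"data_id": "A", "ts": 1, "event_class": 1, "operation": "create",
           "root_ptr": "r1", "child_ptrs": []},
    "r2": {"data_id": "A", "ts": 2, "event_class": 3, "operation": "open",
           "root_ptr": "r1", "child_ptrs": ["r3"]},
    "r3": {"data_id": "B", "ts": 3, "event_class": 2, "operation": "save as",
           "root_ptr": "r1", "child_ptrs": []},
    "r4": {"data_id": "A", "ts": 4, "event_class": 3, "operation": "edit",
           "root_ptr": "r1", "child_ptrs": []},
    "r5": {"data_id": "B", "ts": 5, "event_class": 3, "operation": "open",
           "root_ptr": "r1", "child_ptrs": []},
}
tree = build_tree(rows, "r5")
```

Starting from any log of the copy `B`, the root pointer leads back to the creation of `A`, and the "open" node of `A` branches into the later edit of `A` and the "save as" that produced `B`. Only logs reachable from the root are touched, which matches the claim that the whole table need not be scanned.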
The life cycle that life cycle tree described in the embodiment of the present invention includes may include:Create, store, access, pass Defeated, destroy, recover etc., each process correspond to one or more operation behavior type in file managing and control system.Specifically, " establishment " refers to generation process in file system for the data object, contains a series of initialization behaviors of data object." deposit Storage " refers to persistence process in storage device for the data object having produced." access " refers to data by persistence shape State is converted to instantaneous state, and data is read out, changes etc. with the process of operation." transmission " refers to data object is migrated Process " destruction " and " recovery " refer to the deletion to data object and after deletion the reduction process to data.
Taking the common Windows operating system with an NTFS-based file management and control system as an example, each life cycle stage corresponds to one or more operation types on data objects. The "New" operation creates a new data object and belongs to the "creation" stage. The "Save" and "Save As" operations persist a data object from memory to disk and belong to the "storage" stage. Operations such as "Open", "Preview" and "Rename" read a data object from disk into memory for reading or modification and belong to the "access" stage. Operations such as "Move" and "Cut" migrate a data object and belong to the "transfer" stage. The "Delete" operation removes a data object and belongs to the "destruction" stage. "Restore" and "Undo Delete" recover a data object and belong to the "recovery" stage.
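The correspondence between operation types and life cycle stages described above can be written as a simple lookup table. The operation and stage identifiers below are illustrative English labels chosen for this sketch, not names defined by the invention.

```python
# Operation type -> life cycle stage, following the Windows/NTFS example.
# All identifiers are illustrative labels, not names from a real API.
LIFECYCLE_STAGE = {
    "new": "create",
    "save": "store",
    "save_as": "store",
    "open": "access",
    "preview": "access",
    "rename": "access",
    "move": "transfer",
    "cut": "transfer",
    "delete": "destroy",
    "restore": "recover",
    "undo_delete": "recover",
}


def stage_of(operation):
    """Return the life cycle stage for an operation name, or None if unknown."""
    return LIFECYCLE_STAGE.get(operation)
```

A lookup such as `stage_of("save_as")` would give the stage under which the corresponding audit log is classified.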
According to the method of the above embodiment, the life cycle tree, a unified and intuitive structure, enables systematic tracking and control of operation behavior over the full life cycle of a data object. Combining tree-structured data storage with the storage characteristics of the HBase distributed database allows tree-structured data to be extracted quickly and efficiently in a big data environment. Using the HBase table structure, the row key is designed in the form "RowKey = dataId + (MAX_VALUE - timestamp)", where "dataId = Hash(Mac + path + filename + extensions)". This further meets the requirement for high read performance, provides a fast index into the life cycle of a data object, and allows that life cycle to be tracked quickly and efficiently. The present invention classifies file operation events, which makes fast and accurate location tracking and evidence collection possible over the full life cycle of a data object. Afterwards, the dataId of a data object can be obtained from its name, type and path together with the physical address of the terminal 200; combined with the time factor, this yields the row key, which leads directly to the corresponding audit log. In addition, each node of the life cycle tree corresponds to one log record in the stored audit log table. For the data object recorded in any log, all nodes of the tree it belongs to can be found through row keys and tree node pointers, so building the whole tree during tracking does not require traversing all stored records and can be completed quickly and efficiently. That is, the present invention can query and build the full life cycle tree of data efficiently, and it also adapts well to incremental growth of the operation logs of data objects. Furthermore, the method of the above embodiment proposes a full life cycle tree storage structure suited to HBase storage. In a massive data environment, its life cycle tracking and query efficiency and accuracy are far better than those of common data management and control systems, and it supports efficient visual analysis of unstructured data objects, yielding analysis results such as access frequency ranking, hotspot distribution ranking, data type ranking, life cycle tree breadth and depth ranking, and data volume trend prediction, thereby improving the efficiency of real-time data management and control.
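A minimal sketch of the row key design follows. The hash function, the value of MAX_VALUE and the zero padding are assumptions made for illustration: MD5 stands in for the unspecified Hash, MAX_VALUE is taken to be the largest signed 64-bit integer, the timestamp is taken to be in milliseconds, and the reversed timestamp is padded to a fixed width so that lexicographic order matches numeric order.

```python
import hashlib

MAX_VALUE = 2**63 - 1  # assumed: largest signed 64-bit integer


def data_id(mac, path, filename, extension):
    """Hash of Mac + path + filename + extension (MD5 is an assumed choice)."""
    raw = (mac + path + filename + extension).encode("utf-8")
    return hashlib.md5(raw).hexdigest()


def row_key(did, timestamp_ms):
    """dataId followed by the zero-padded reversed timestamp."""
    return f"{did}{MAX_VALUE - timestamp_ms:019d}"


# Example values are made up for illustration.
did = data_id("00:11:22:33:44:55", "C:/docs/", "report", ".docx")
older = row_key(did, 1_474_934_400_000)
newer = row_key(did, 1_474_934_460_000)
```

Subtracting the timestamp from MAX_VALUE makes the key of a newer log sort before that of an older one, so in a store that orders rows by key the most recent log under a dataId is the first row with that prefix.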
Second embodiment
Refer to Fig. 5, a functional module diagram of the file tracking device 110 shown in Fig. 2, provided by a preferred embodiment of the present invention. The file tracking device 110 is used to execute steps S101 to S105 in the flow chart shown in Fig. 3. The file tracking device 110 includes an acquisition module 1101, a processing module 1102, a storage module 1103 and a tracking module 1104.
The acquisition module 1101 is used to collect the relevant information of the terminal 200, to capture and monitor operation behavior on files under a specified directory of the terminal 200, and to generate corresponding file operation events from that operation behavior. The relevant information includes the physical address, network address and operating system user of the terminal 200.
The processing module 1102 is used to convert the file operation events into logs.
The storage module 1103 is used to store the logs in the database 400.
The processing module 1102 is further used as follows. If the file operation event is a first-class event, it generates a row key, generates a log from the first-class event, adds the row key value of this node to the first field of the log as the root node pointer, and stores the log in the database 400. If the file operation event is a second-class event, it generates a row key, generates a log from the second-class event, looks up the most recent log under the data of the second-class event, copies the content of the first field of that most recent log into the first field of the newly generated log, and stores the new log in the database 400; after the log is stored successfully, the row key of the newly generated log is added to the second field of the most recent log. If the file operation event is a third-class event, it generates a row key, generates a log from the third-class event, looks up the most recent log under the data of the third-class event, copies the content of the first field of that most recent log into the first field of the newly generated log, and stores the new log in the database 400.
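The three branches of the processing module can be sketched as one function. This is an illustration under stated assumptions rather than the actual implementation: a dict stands in for the database 400, the "first field" is modelled as `root` and the "second field" as `children` (both names hypothetical), and the operation labels in the usage lines are placeholders, since which operations belong to which event class is not assumed here.

```python
MAX_VALUE = 2**63 - 1  # assumed upper bound used for the reversed timestamp


def latest_key(table, data_id):
    """Row key of the most recent log under data_id, or None if there is none."""
    keys = [k for k in table if k[0] == data_id]
    return min(keys) if keys else None


def store_event(table, event_class, data_id, timestamp, op):
    """Convert one file operation event into a log and store it in table."""
    key = (data_id, MAX_VALUE - timestamp)
    previous = None
    if event_class == 1:
        # First class: the new log is a root and points to itself.
        root = key
    else:
        # Second and third class: inherit the root pointer of the latest log.
        previous = latest_key(table, data_id)
        if previous is None:
            raise ValueError("no earlier log found for this data object")
        root = table[previous]["root"]
    table[key] = {"op": op, "root": root, "children": []}
    if event_class == 2:
        # Second class only: after storing, record the child in the parent.
        table[previous]["children"].append(key)
    return key


table = {}
k1 = store_event(table, 1, "f1", 100, "create")  # placeholder operation labels
k2 = store_event(table, 2, "f1", 200, "save")
k3 = store_event(table, 3, "f1", 300, "open")
```

Looking up the latest log before the new one is written matters here, because once the new log is stored it becomes the latest entry under the same dataId.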
The tracking module 1104 is used, after receiving a request sent by the terminal to locate and track the full life cycle of the target file, to locate the logs according to the data stored in the database 400 and build the life cycle tree of the target file.
In detail, the tracking module 1104 is further used to generate a row key from the information about the target file contained in the received request, to index with that row key, and to determine the log entry. It can also query the most recent log under the data identifier (dataId) and obtain the root node pointer from that log, thereby obtaining the root node log. Finally, iterating from the root node builds the tree.
According to the device of the present embodiment, an intuitive structure enables systematic tracking and control of operation behavior over the full life cycle of a data object.
Third embodiment
Refer to Fig. 6, a functional module diagram of the file tracking system provided by a preferred embodiment of the present invention. The system of this embodiment operates in the arrangement shown in Fig. 1, in which the server 100 interacts with the terminal 200. The system described in this embodiment includes a terminal 200 and a server 100 that are communicatively connected to each other.
The terminal 200 includes: a terminal behavior monitoring module 210, used to capture and monitor operation behavior on files under a specified directory and to generate corresponding file operation events from that operation behavior; a sending module 220, used to send the file operation events to the server 100; and a tracking request module 230, used to send to the server 100 a request to locate and track the full life cycle of a target file.
The server 100 includes: a processing module 1102, used to convert the file operation events into logs; a storage module 1103, used to store the logs in the database 400; and a tracking module 1104, used, after receiving a request sent by the terminal to locate and track the full life cycle of the target file, to locate the logs according to the data stored in the database 400 and build the life cycle tree of the target file.
The sending module 220 of the terminal 200 is used to send the file operation events to the server 100 in real time, or to store the file operation events in the terminal 200 and send the stored events to the server 100 at irregular intervals.
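The two sending modes can be sketched as follows. The class name, the `send` callback and the `flush` trigger are all hypothetical; the text does not say what causes buffered events to be sent, so the sketch leaves that decision to the caller.

```python
class EventSender:
    """Sends file operation events in real time or buffers them locally."""

    def __init__(self, send, realtime=True):
        self._send = send          # callable that delivers one event to the server
        self._realtime = realtime
        self._pending = []         # events held on the terminal in buffered mode

    def submit(self, event):
        """Send the event at once, or keep it until flush() is called."""
        if self._realtime:
            self._send(event)
        else:
            self._pending.append(event)

    def flush(self):
        """Deliver all buffered events in the order they were captured."""
        while self._pending:
            self._send(self._pending.pop(0))


received = []  # stands in for the server side in this sketch
sender = EventSender(received.append, realtime=False)
sender.submit({"op": "open"})
sender.submit({"op": "save"})
buffered_before_flush = list(received)
sender.flush()
```

Keeping capture order during `flush` preserves the sequence of operations, which the server relies on when it links each new log to the latest earlier one.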
In detail, the acquisition module 1101 or the terminal behavior monitoring module 210 can be developed in C, and the program of the server 100 can be developed in Java. The back end can use the Spring MVC framework, and the terminal 200 and the server 100 can be connected in B/S (browser/server) mode. The web layer can respond to HTTP requests from the front end through Servlets, call the corresponding back-end services to carry out the business logic, and return the results to the front end. The front end receives the business data sent by the back end, can process and render it with JavaScript, and implements an RIA (Rich Internet Application) through JSP with AJAX.
The service interface layer of the system can use a unified service framework, connect through interface protocols such as WebService and HTTP, and transmit data in a unified JSON format.
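As an illustration of transmitting data in a unified JSON format, a file operation event might be serialized as below. The field names are assumptions made for this sketch, chosen to mirror the terminal information collected by the system (physical address, network address, operating system user); no actual message schema is given in the text.

```python
import json


def encode_event(mac, ip, user, path, operation, timestamp_ms):
    """Serialize one file operation event to JSON (hypothetical schema)."""
    event = {
        "mac": mac,                 # physical address of the terminal
        "ip": ip,                   # network address of the terminal
        "user": user,               # operating system user
        "path": path,               # full path of the file operated on
        "operation": operation,     # operation type, e.g. "open"
        "timestamp": timestamp_ms,  # time of the operation in milliseconds
    }
    return json.dumps(event, ensure_ascii=False, sort_keys=True)


def decode_event(payload):
    """Parse a JSON payload back into a dict."""
    return json.loads(payload)


payload = encode_event("00:11:22:33:44:55", "192.0.2.10", "alice",
                       "C:/docs/report.docx", "open", 1_474_934_400_000)
event = decode_event(payload)
```

Using one format for every interface lets terminal programs written in C and a server written in Java exchange events without sharing code.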
The relational database of the data storage layer can use MySQL to store system data such as users, permissions, configuration and models, and is integrated with the back end through Hibernate. The distributed database of the data storage layer uses the HDFS-based HBase distributed database to store business data such as audit logs, and is integrated with the back end through simplehbase.
This embodiment differs from the second embodiment in that it is described in terms of a file tracking system consisting of a server 100 and a terminal 200 that are communicatively connected to each other. In addition, the terminal 200 in this embodiment includes the terminal behavior monitoring module 210, and the server 100 does not include the acquisition module 1101. For other details of this embodiment, refer to the first embodiment or the second embodiment; they are not repeated here.
It should be understood that, in the several embodiments provided herein, the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flow charts and block diagrams in the accompanying drawings show the possible architecture, functions and operations of devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flow chart or block diagram may represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flow charts, and any combination of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, the server 100, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a portable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc. It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention. It should be noted that similar reference numerals and letters denote similar items in the drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims (10)

1. A file tracking method, applied to a server communicatively connected to a terminal, characterized in that the method comprises:
S1, collecting the relevant information of the terminal, and capturing and monitoring operation behavior on files under a specified directory of the terminal, the relevant information including the physical address, network address and operating system user of the terminal;
S2, generating corresponding file operation events from the captured operation behavior on the files under the specified directory;
S3, converting the file operation events into logs and storing the logs in a database;
S4, after receiving a request sent by the terminal to locate and track the full life cycle of a target file, locating the logs according to the data stored in the database and building the life cycle tree of the target file.
2. The file tracking method according to claim 1, characterized in that step S3 comprises:
if the file operation event is a first-class event, generating a row key, generating a log from the first-class event, adding the row key value of this node to the first field of the log as the root node pointer, and storing the log in the database;
if the file operation event is a second-class event, generating a row key, generating a log from the second-class event, looking up the most recent log under the data of the second-class event, copying the content of the first field of the most recent log into the first field of the newly generated log, storing the log in the database, and, after the log is stored successfully, adding the row key of the newly generated log to the second field of the most recent log;
if the file operation event is a third-class event, generating a row key, generating a log from the third-class event, looking up the most recent log under the data of the third-class event, copying the content of the first field of the most recent log into the first field of the newly generated log, and storing the log in the database.
3. The file tracking method according to claim 1, characterized in that step S4 comprises:
after the server receives the request sent by the terminal to locate and track the full life cycle of the target file, generating, by the server, a row key from the information about the target file contained in the received request, indexing with the row key, and determining the log entry;
obtaining, by the server, the corresponding data identifier from the log entry, querying the most recent log under the data identifier, and obtaining the root node pointer from the most recent log, thereby obtaining the root node log; and iterating from the root node to build the tree.
4. The file tracking method according to claim 1, characterized in that the file operation event is sent to the server by the terminal in real time when the file operation event is captured, or is stored by the terminal when the file operation is captured and then sent to the server at irregular intervals.
5. The file tracking method according to any one of claims 1 to 4, characterized in that the database is an HBase distributed database.
6. A file tracking device, applied to a server communicatively connected to a terminal, characterized in that the device comprises:
an acquisition module, used to collect the relevant information of the terminal, to capture and monitor operation behavior on files under a specified directory of the terminal, and to generate corresponding file operation events from the operation behavior, the relevant information including the physical address, network address and operating system user of the terminal;
a processing module, used to convert the file operation events into logs;
a storage module, used to store the logs in a database;
a tracking module, used, after receiving a request sent by the terminal to locate and track the full life cycle of a target file, to locate the logs according to the data stored in the database and build the life cycle tree of the target file.
7. The file tracking device according to claim 6, characterized in that the processing module:
is further used, if the file operation event is a first-class event, to generate a row key, generate a log from the first-class event, add the row key value of this node to the first field of the log as the root node pointer, and store the log in the database;
is further used, if the file operation event is a second-class event, to generate a row key, generate a log from the second-class event, look up the most recent log under the data of the second-class event, copy the content of the first field of the most recent log into the first field of the newly generated log, store the log in the database, and, after the log is stored successfully, add the row key of the newly generated log to the second field of the most recent log;
is further used, if the file operation event is a third-class event, to generate a row key, generate a log from the third-class event, look up the most recent log under the data of the third-class event, copy the content of the first field of the most recent log into the first field of the newly generated log, and store the log in the database.
8. The file tracking device according to claim 6, characterized in that the tracking module:
is further used to generate a row key from the information about the target file contained in the received request to locate and track the full life cycle of the target file, to index with the row key, and to determine the log entry;
is further used to query the most recent log under the data identifier and obtain the root node pointer from the most recent log, thereby obtaining the root node log, and to iterate from the root node to build the tree.
9. A file tracking system, characterized in that the system comprises a terminal and a server that are communicatively connected to each other;
the terminal comprises:
a terminal behavior monitoring module, used to capture and monitor operation behavior on files under a specified directory and to generate corresponding file operation events from the operation behavior;
a sending module, used to send the file operation events to the server;
a tracking request module, used to send to the server a request to locate and track the full life cycle of a target file;
the server comprises:
a processing module, used to convert the file operation events into logs;
a storage module, used to store the logs in a database;
a tracking module, used, after receiving a request sent by the terminal to locate and track the full life cycle of the target file, to locate the logs according to the data stored in the database and build the life cycle tree of the target file.
10. The file tracking system according to claim 9, characterized in that the sending module of the terminal:
is used to send the file operation events to the server in real time; or
is used to store the file operation events in the terminal and to send the events stored in the terminal to the server at irregular intervals.
CN201610857077.XA 2016-09-27 2016-09-27 File tracking method, device and system Pending CN106407429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610857077.XA CN106407429A (en) 2016-09-27 2016-09-27 File tracking method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610857077.XA CN106407429A (en) 2016-09-27 2016-09-27 File tracking method, device and system

Publications (1)

Publication Number Publication Date
CN106407429A true CN106407429A (en) 2017-02-15

Family

ID=57998082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610857077.XA Pending CN106407429A (en) 2016-09-27 2016-09-27 File tracking method, device and system

Country Status (1)

Country Link
CN (1) CN106407429A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5864480A (en) * 1995-08-17 1999-01-26 Ncr Corporation Computer-implemented electronic product development
CN101178730A (en) * 2007-12-14 2008-05-14 清华大学 Document management method facing to integration business model
CN101286888A (en) * 2008-05-21 2008-10-15 天柏宽带网络科技(北京)有限公司 Operating method of log system
CN101488171A (en) * 2008-12-16 2009-07-22 安徽和安信息科技有限公司 File authentication method based on separating electronic label
CN102004883A (en) * 2010-12-03 2011-04-06 中国软件与技术服务股份有限公司 Trace tracking method for electronic files
CN104102692A (en) * 2014-06-19 2014-10-15 肖龙旭 Electronic document tracking method based on logs
CN104199900A (en) * 2014-08-26 2014-12-10 中国航天科工集团第二研究院七〇六所 Audit and analysis method based on file trajectory tracking trees
CN104239312A (en) * 2013-06-11 2014-12-24 富泰华工业(深圳)有限公司 File management system and method
CN104796290A (en) * 2015-04-24 2015-07-22 广东电网有限责任公司信息中心 Data security control method and data security control platform
CN105183911A (en) * 2015-10-12 2015-12-23 国家电网公司 Data source binary tree based source tracing method for abnormal data of power system

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304724A (en) * 2018-01-25 2018-07-20 中国地质大学(武汉) Document is traced to the source device, system and method
CN109241014A (en) * 2018-07-04 2019-01-18 阿里巴巴集团控股有限公司 Data processing method, device and server
CN109241014B (en) * 2018-07-04 2022-04-15 创新先进技术有限公司 Data processing method and device and server
CN109614300A (en) * 2018-11-09 2019-04-12 南京富士通南大软件技术有限公司 A kind of file operation in the WPD based on ETW monitors method
CN109926170A (en) * 2019-03-07 2019-06-25 天津和或节能科技有限公司 A kind of file destroying method and its file destroying system
CN111596933A (en) * 2020-07-09 2020-08-28 腾讯科技(深圳)有限公司 File processing method and device, electronic equipment and computer readable storage medium
CN112433987A (en) * 2020-11-30 2021-03-02 中国人寿保险股份有限公司 Track recording method and device for file maintenance and electronic equipment
CN112925757A (en) * 2021-03-26 2021-06-08 广东好太太智能家居有限公司 Method, equipment and storage medium for tracking operation log of intelligent equipment
CN116108091A (en) * 2022-12-26 2023-05-12 小米汽车科技有限公司 Data processing method, event tracking analysis method, device, equipment and medium
CN116108091B (en) * 2022-12-26 2024-01-23 小米汽车科技有限公司 Data processing method, event tracking analysis method, device, equipment and medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170215