CN116541376A - Sample data acquisition method and device, mobile terminal and storage medium - Google Patents

Info

Publication number
CN116541376A
Authority
CN
China
Prior art keywords
data
segmented
acquisition
sample data
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310462036.0A
Other languages
Chinese (zh)
Inventor
滕永达
袁朝
王志海
喻波
韩振国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wondersoft Technology Co Ltd
Original Assignee
Beijing Wondersoft Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wondersoft Technology Co Ltd
Priority to CN202310462036.0A
Publication of CN116541376A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G06F16/212 Schema design and management with details for data modelling support
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/556 Detecting local intrusion or implementing counter-measures involving covert channels, i.e. data leakage between processes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a sample data acquisition method, a sample data acquisition device, a mobile terminal and a storage medium. The sample data acquisition method comprises: segmenting total data, which comprises unique identification data in a database, to obtain a plurality of segmented data; acquiring the plurality of segmented data segment by segment according to a preset acquisition order; when an acquisition interrupt instruction is received, completing the acquisition of the current segmented data to obtain first segmented sample data corresponding to the current segmented data, and stopping the acquisition of the segmented data that follows the current segmented data in the acquisition order; and obtaining first sample data from the first segmented sample data and second segmented sample data, wherein the second segmented sample data comprises the segmented sample data corresponding to the segmented data that precedes the current segmented data in the acquisition order. Because the method completes the current segmented data on request and stops the subsequent acquisition, it avoids the problem that sample data acquisition cannot be suspended.

Description

Sample data acquisition method and device, mobile terminal and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a sample data collection method, a device, a mobile terminal, and a storage medium.
Background
The prior art often collects sample data from a service database by querying binary logs, such as binlog, that are dedicated to recording write operations on the database. However, such binary logs exist only in MySQL and other general relational databases. When the service database has no binary log similar to binlog, and its settings cannot readily be changed at will, sample data is difficult to collect.
To solve this problem, the prior art uses database migration tools such as DataX, a data synchronization tool for heterogeneous databases. DataX acts as an intermediary connected to each of a plurality of databases, and completes the reading and writing of data in those databases through a data acquisition module and a data writing module respectively. However, when data is acquired through DataX, the acquisition can end normally only after all the data to be acquired has been acquired; interrupting the acquisition causes DataX to report an error and exit abnormally. When the volume of data in the database to be acquired is huge, acquiring data with DataX occupies the database for a long time and cannot be stopped, which affects the normal use of the business systems related to the database.
Disclosure of Invention
The embodiments of the application provide a sample data acquisition method, a sample data acquisition device, a mobile terminal and a storage medium. The sample data acquisition method obtains the unique identification data in the database and segments it, so that sample acquisition can be performed on each piece of segmented data separately. When the database is temporarily needed, the method finishes acquiring the current segmented data and does not start acquiring any subsequent segmented data, which prevents the database from remaining occupied. The application thus provides a sample data acquisition method that can be suspended, avoiding the impact of long-term occupation of the database on the related service systems.
In order to solve the above technical problems, the present application provides a sample data acquisition method, including:
segmenting the total data to obtain a plurality of segmented data, wherein the total data comprises unique identification data in a database;
performing segmented acquisition on the plurality of segmented data according to a preset acquisition order;
when an acquisition interrupt instruction is received, completing the acquisition action of current segment data and acquiring first segment sample data corresponding to the current segment data, and stopping the acquisition action of segment data positioned behind the current segment data according to the acquisition sequence;
and acquiring first sample data according to the first segmented sample data and second segmented sample data, wherein the second segmented sample data comprises segmented sample data corresponding to segmented data positioned before the current segmented data according to the acquisition sequence.
Optionally, the sample data collection method provided in the present application further includes:
receiving a database acquisition instruction and acquiring a database according to preset screening conditions;
and inquiring the database according to preset inquiry conditions to acquire unique identification data and generate total data.
Optionally, the sample data collection method provided in the present application further includes:
setting a breakpoint in the total data according to a preset breakpoint setting strategy;
and segmenting the total data according to the break points to obtain a plurality of segmented data.
Optionally, the sample data collection method provided in the present application further includes:
acquiring state information of the segmented acquisition, wherein the state information comprises breakpoint position information and the amount of data acquired.
Optionally, the sample data collection method provided in the present application further includes:
when an acquisition continuation instruction is received, starting an acquisition action of the segmented data positioned behind the current segmented data according to the acquisition sequence and acquiring corresponding third segmented sample data;
and acquiring second sample data according to the first segment sample data, the second segment sample data and the third segment sample data.
Optionally, the sample data collection method provided in the present application further includes:
judging whether the acquisition of the plurality of segmented data has been completed, and ending the segmented acquisition according to the judgment result.
Optionally, the sample data collection method provided in the present application further includes:
after judging that the corresponding sample data has been acquired for the plurality of segmented data, performing database fingerprint training according to the second sample data to generate a database fingerprint model.
The application also provides a sample data acquisition device, comprising:
the segmentation module is used for carrying out segmentation processing on the total data to obtain a plurality of segmented data, wherein the total data comprises unique identification data in a database;
the acquisition module is used for carrying out segmented acquisition on the plurality of segmented data according to a preset acquisition sequence;
the interruption module is used for completing the acquisition action of the current segmented data and acquiring the first segmented sample data corresponding to the current segmented data when receiving an acquisition interruption instruction, and stopping the acquisition action of the segmented data positioned behind the current segmented data according to the acquisition sequence;
the first sample acquisition module is used for acquiring first sample data according to the first segmented sample data and second segmented sample data, wherein the second segmented sample data comprises segmented sample data corresponding to segmented data positioned before the current segmented data according to the acquisition sequence.
The application also provides a mobile terminal, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to implement the sample data collection method described above.
The application also provides a computer readable storage medium storing a computer program which when executed by a processor is capable of implementing the sample data collection method described above.
The sample data acquisition method obtains the unique identification data in the database and segments it, so that sample acquisition can be performed on each piece of segmented data separately. When the database is temporarily needed, the method finishes acquiring the current segmented data and does not start acquiring any subsequent segmented data, which prevents the database from remaining occupied. The application thus provides a sample data acquisition method that can be suspended, avoiding the impact of long-term occupation of the database on the related service systems.
The foregoing description is merely an overview of the technical solutions provided in the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the present application are more readily apparent, a detailed description of the present application follows.
Drawings
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which the figures of the drawings are not to be taken in a limiting sense, unless otherwise indicated.
FIG. 1 is a schematic diagram of a sample data collection method according to an embodiment of the present disclosure;
FIG. 2 is a second schematic diagram of a sample data collection method according to an embodiment of the present disclosure;
FIG. 3 is a third schematic diagram of a sample data collection method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a sample data collection method according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a sample data collection method according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a sample data collection method according to an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a sample data collection method according to an embodiment of the present disclosure;
FIG. 8 is a schematic illustration of a sample data collection procedure provided herein;
FIG. 9 is a schematic illustration of server deployment in a sample data acquisition system provided herein;
FIG. 10 is a schematic diagram of a sample data collection device according to an embodiment of the present application;
FIG. 11 is a second schematic diagram of a sample data collection device according to an embodiment of the present disclosure;
FIG. 12 is a third schematic diagram of a sample data collection device according to an embodiment of the present disclosure;
FIG. 13 is a fourth schematic diagram of a sample data collection device according to an embodiment of the present disclosure;
FIG. 14 is a fifth schematic diagram of a sample data collection device according to an embodiment of the present disclosure;
FIG. 15 is a schematic diagram of a sample data collection device according to an embodiment of the present disclosure;
FIG. 16 is a schematic diagram of a sample data collection device according to an embodiment of the present disclosure;
FIG. 17 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
A first embodiment of the present application provides a sample data collection method, as shown in fig. 1, including:
step 101, carrying out segmentation processing on total data to obtain a plurality of segment data, wherein the total data comprises unique identification data in a database;
step 102, carrying out segmented acquisition on a plurality of segmented data according to a preset acquisition sequence;
step 103, when an acquisition interrupt instruction is received, completing the acquisition action of the current segmented data and acquiring first segmented sample data corresponding to the current segmented data, and stopping the acquisition action of the segmented data positioned behind the current segmented data according to the acquisition sequence;
step 104, obtaining first sample data according to the first segment sample data and second segment sample data, wherein the second segment sample data comprises segment sample data corresponding to segment data positioned before the current segment data according to the acquisition sequence.
Specifically, unlike existing offline, full-volume data synchronization tools, the sample data acquisition method provided by the application is an online, updatable database acquisition tool. The method segments the total data to obtain a plurality of segmented data, then orders the segmented data and acquires them online one after another to obtain the corresponding segmented sample data. If an acquisition interrupt instruction is received during this process, the method completes the acquisition of the current segmented data to obtain the first segmented sample data and stops the acquisition of the subsequent segmented data. It then obtains the first sample data online from the first segmented sample data and the second segmented sample data, that is, the segmented sample data corresponding to the segmented data ordered before the current segmented data.
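The behaviour of steps 101 to 104 can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function name `collect_segments`, the `collect_one` callback and the use of `threading.Event` as the acquisition interrupt instruction are all assumptions made for the example.

```python
import threading
from typing import Callable, Iterable, List, Sequence


def collect_segments(
    segments: Iterable[Sequence[int]],
    collect_one: Callable[[Sequence[int]], list],
    interrupt: threading.Event,
) -> List[int]:
    """Collect segments in order and return the sample data gathered so far.

    The interrupt flag (a stand-in for the acquisition interrupt instruction)
    is checked only at segment boundaries, so the segment being collected
    always finishes and no later segment is started.
    """
    sample: List[int] = []
    for segment in segments:
        # The current segment is always collected in full.
        sample.extend(collect_one(segment))
        # If an interrupt arrived in the meantime, do not start the next segment.
        if interrupt.is_set():
            break
    return sample
```

If the interrupt is raised while the second of four segments is being collected, the returned list holds the samples of the first segment (the second segmented sample data) followed by those of the second segment (the first segmented sample data), and the remaining two segments are never touched.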
In addition, the application also provides another sample data acquisition example, which is specifically as follows:
The total data comprising the unique identification data in a database is segmented, one or more breakpoints are set within each piece of segmented data, the segmented data are ordered, and the segmented sample data corresponding to each piece of segmented data is acquired one by one in that order. If an acquisition interrupt instruction is received while a single piece of segmented data is being collected, the collection stops at a breakpoint of that segmented data, and the segmented sample data corresponding to the data before the breakpoint is obtained. The first sample data is then obtained from the first segmented sample data, which corresponds to the portion of the current segmented data before the breakpoint, and the second segmented sample data, which corresponds to the segmented data already completed.
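The variant with breakpoints inside a segment can be sketched in the same spirit. The function name `collect_until_breakpoint`, the representation of breakpoints as offsets into the segment, and the `interrupted` callback are assumptions made for this illustration only.

```python
from typing import Callable, List, Sequence, Tuple


def collect_until_breakpoint(
    segment: Sequence[int],
    breakpoints: Sequence[int],
    interrupted: Callable[[], bool],
) -> Tuple[List[int], int]:
    """Collect one segment chunk by chunk, where chunks end at breakpoint offsets.

    Returns the items collected and the offset at which collection stopped.
    When `interrupted()` reports an interrupt, collection stops at the
    breakpoint just reached instead of running to the end of the segment.
    """
    collected: List[int] = []
    start = 0
    for stop in sorted(breakpoints) + [len(segment)]:
        collected.extend(segment[start:stop])
        start = stop
        # Stop early only at an interior breakpoint; at the end there is nothing left.
        if stop < len(segment) and interrupted():
            break
    return collected, start
```

The returned offset is the kind of breakpoint position that would later be needed to resume the same segment from where it stopped.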
In addition, the application also provides another sample data acquisition example, which is specifically as follows:
When the business system and business processes related to the service database depend on more than one database, the method can collect samples of the unique identification data in each of those databases. In this case the total data is derived from a plurality of heterogeneous or homogeneous databases, and the total data can be segmented by data source to obtain segmented data corresponding to each database. The segmented data is then collected to obtain the corresponding segmented sample data. Either model training is performed on each piece of segmented sample data to obtain a database fingerprint model for the corresponding database, or the segmented sample data is aggregated and model training is performed to obtain a database fingerprint model that identifies all of the databases, for use in the field of database data leakage protection.
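Segmentation by data source amounts to grouping the identifiers by the database they came from. A minimal sketch, assuming the total data is available as (source database, identifier) pairs; the function name and the pair representation are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def segment_by_source(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (source_database, identifier) pairs into one segment per database.

    The order of identifiers within each segment follows the input order.
    """
    segments: Dict[str, List[str]] = defaultdict(list)
    for source, identifier in records:
        segments[source].append(identifier)
    return dict(segments)
```

Each resulting segment can then be collected on its own, and the per-database results either trained separately or merged before training, as described above.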
In addition, the sample data acquisition method provided by the application can perform model training on the obtained first sample data to generate a database fingerprint model capable of uniquely identifying a single database, and store the database fingerprint model for later use. When a file needs to be checked against the database fingerprint, the database fingerprint model is retrieved and compared with the file to judge whether the data in the file comes from the database, thereby realizing leakage protection for the database.
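The application does not specify how the database fingerprint model is trained. Purely as a stand-in for illustration, the sketch below treats the fingerprint as a set of SHA-256 digests of the sampled identifiers and compares a file against it token by token; both function names and the whitespace tokenization are assumptions.

```python
import hashlib
from typing import Iterable, Set


def _digest(value: str) -> str:
    """Hash one identifier so the fingerprint does not store raw values."""
    return hashlib.sha256(value.strip().encode("utf-8")).hexdigest()


def build_fingerprint(sample_data: Iterable[str]) -> Set[str]:
    """Stand-in for model training: the set of digests of all sampled identifiers."""
    return {_digest(value) for value in sample_data}


def count_matches(fingerprint: Set[str], file_text: str) -> int:
    """Count whitespace-separated tokens in the file whose digest is in the fingerprint."""
    return sum(1 for token in file_text.split() if _digest(token) in fingerprint)
```

A file with a non-zero match count contains at least one value that also appears in the sampled identification data; what count should be treated as a leak is a policy decision outside this sketch.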
It should be emphasized that the sample data collection method provided in the present application does not limit the types of collected database objects, and various common databases, such as a relational database, an analytical database, a multi-modal database, and a non-relational database, can all employ the sample data collection method provided in the present application.
The sample data acquisition method obtains the unique identification data in the database and segments it, so that sample acquisition can be performed on each piece of segmented data separately. When the database is temporarily needed, the method finishes acquiring the current segmented data and does not start acquiring any subsequent segmented data, which prevents the database from remaining occupied. The application thus provides a sample data acquisition method that can be suspended, avoiding the impact of long-term occupation of the database on the related service systems. In addition, the sample data collected by the method is the unique identification data of the database, which can be used to identify the database, so the collected unique identification data can be applied in the field of database leakage protection.
Specifically, for the MySQL relational database, the binary log records and stores information about row write operations and does not record queries; when data acquisition is needed, data acquisition and synchronization can be achieved by enabling binlog. However, binary logs similar to binlog exist only in MySQL databases, and for other databases data acquisition cannot be achieved by enabling binlog. Moreover, the source of the sample data is often a client who has commissioned the user of the sample data acquisition method; the user cannot change the settings of the client's service database without permission, and when the client's service database is not a relational database, no quick data acquisition method is available. If a database migration tool is used to collect from the service database and the data volume is too large, the client's service database is occupied for a long time, business that depends on it is affected, and the only remedy is to stop the sample data acquisition. However, these database migration tools often have no pause or stop function, and directly interrupting the acquisition may cause the tool to report an error and exit abnormally.
In the sample data acquisition method provided by the application, the unique identification data in the database is segmented to obtain a plurality of segmented data, and the segmented data are acquired one after another in order. When the data volume of the service database is too large, an acquisition interrupt instruction can be received and the data acquisition can be controlled flexibly, which avoids excessive interference with the service database and its associated service systems and meets the need to stop or suspend data acquisition from the service database.
On the basis of the above embodiment, as shown in fig. 2, the sample data collection method provided in the present application further includes:
step 105, receiving a database acquisition instruction and acquiring a database according to preset screening conditions;
step 106, querying the database according to preset query conditions to acquire unique identification data and generate total data.
Specifically, in the sample data collection method provided by the application, after the database collection instruction is received, a sample acquisition task is established in the sample data acquisition system corresponding to the method. The database type of the service database that meets the requirements is selected according to the entered screening conditions, and the table and column in which the unique identification data capable of uniquely identifying each heterogeneous database is located are screened out. The data that meets the requirements for identifying the database is then queried using the pre-entered query conditions, thereby obtaining the total data. The query conditions are used to obtain the unique identification data in a plurality of heterogeneous databases; for example, the total data can be obtained by reading the numeric information of a specific column from a specific preset table in each database and taking that numeric information as the unique identification data.
On the basis of this embodiment, the databases are screened to obtain the databases from which sample data needs to be collected, and data entirely unrelated to identifying the database is removed by screening and querying. The first sample data collected from the resulting total data is therefore more accurate, and the database fingerprint generated by training on the first sample data can reflect the unique characteristics of the database.
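Reading one column of one table as the total data might look like the following. The sketch uses Python's built-in `sqlite3` module as a stand-in for the service database; the function name `load_total_data` and the choice to deduplicate and sort the values are assumptions, not details taken from the application.

```python
import sqlite3
from typing import List


def load_total_data(conn: sqlite3.Connection, table: str, column: str) -> List[str]:
    """Read the distinct non-null values of one column, in sorted order.

    The table and column play the role of the preset query conditions.
    SQL identifiers cannot be bound as parameters, so they are validated
    before being placed into the statement.
    """
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    rows = conn.execute(
        f'SELECT DISTINCT "{column}" FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL ORDER BY "{column}"'
    ).fetchall()
    return [str(row[0]) for row in rows]
```

A stable ordering matters here because later segmentation and resumption both rely on positions in the total data meaning the same thing from one run to the next.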
On the basis of the above embodiment, as shown in fig. 3, the sample data collection method provided in the present application further includes:
step 111, setting up a breakpoint in the total data according to a preset breakpoint setting strategy;
step 112, segmenting the total data according to the breakpoints to obtain a plurality of segmented data.
Specifically, the sample data acquisition method establishes breakpoints in the total data according to a preset breakpoint setting strategy, and segments the total data at the breakpoints to obtain a plurality of segmented data. For example, the breakpoint setting strategy may be to set breakpoints in the total data and divide the total data at those breakpoints into a plurality of pieces of segmented data, thereby reducing the workload of a single data acquisition.
On the basis of this embodiment, breakpoints are set in the total data, the total data containing a large amount of data is divided at the breakpoints into a plurality of segmented data each containing less data, and the segmented data are then collected in order to obtain a plurality of segmented sample data.
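One possible breakpoint setting strategy is a fixed segment size. The application leaves the strategy open, so the fixed-size rule and the names `fixed_size_breakpoints` and `split_at` below are assumptions chosen to keep the sketch short.

```python
from typing import List, Sequence


def fixed_size_breakpoints(total_len: int, segment_size: int) -> List[int]:
    """Return the interior offsets at which the total data is cut."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return list(range(segment_size, total_len, segment_size))


def split_at(total: Sequence, breakpoints: Sequence[int]) -> List[Sequence]:
    """Cut `total` at each breakpoint offset, preserving the original order."""
    bounds = [0, *breakpoints, len(total)]
    return [total[a:b] for a, b in zip(bounds, bounds[1:])]
```

With ten items and a segment size of four, the breakpoints fall at offsets 4 and 8, giving segments of four, four and two items. Smaller segments mean an interrupt is honoured sooner, at the cost of more boundaries to track.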
On the basis of the above embodiment, as shown in fig. 4, the sample data collection method provided in the present application further includes:
step 107, acquiring state information acquired in a segmented mode, wherein the state information comprises breakpoint position information and acquired data quantity.
Specifically, after the acquisition interrupt instruction is received and the acquisition of the segmented data that follows the current segmented data in the acquisition order is stopped, state information of the segmented acquisition can be obtained, such as the breakpoint position at which the segmented acquisition was interrupted and the amount of data acquired when the current segmented data finished being acquired. In addition, the obtained state information can be stored in a database, such as a MySQL database, so that the state of the segmented acquisition is recorded and preserved.
When breakpoints are set within the segmented data, the state information of the segmented acquisition includes the position of the current segmented data in the total data, the position of the breakpoint in the current segmented data, the amount of data acquired and other state information, and this information is stored accordingly.
When the sample data acquisition method is used to acquire from a plurality of databases, information about the database in which the currently acquired segmented data is located also needs to be recorded and stored.
On the basis of this embodiment, the acquisition state information is stored so that the user can retrieve the running state of the database fingerprint acquisition as needed and understand the state of the database acquisition, and can generate an acquisition continuation instruction according to the state information of the segmented acquisition, thereby restoring the state after the acquisition was interrupted.
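Persisting the state information can be sketched with a single table. The text mentions a MySQL database for this purpose; the sketch substitutes Python's built-in `sqlite3` so that it runs without a server, and the table name `collection_state`, its columns and the three function names are hypothetical.

```python
import sqlite3
from typing import Optional, Tuple


def init_state_store(conn: sqlite3.Connection) -> None:
    """Create the table that holds one state row per acquisition task."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collection_state ("
        " task_id TEXT PRIMARY KEY,"
        " source_db TEXT NOT NULL,"
        " breakpoint_pos INTEGER NOT NULL,"
        " collected_count INTEGER NOT NULL)"
    )


def save_state(conn: sqlite3.Connection, task_id: str, source_db: str,
               breakpoint_pos: int, collected_count: int) -> None:
    """Record where the task stopped, overwriting any earlier state for it."""
    conn.execute(
        "INSERT OR REPLACE INTO collection_state VALUES (?, ?, ?, ?)",
        (task_id, source_db, breakpoint_pos, collected_count),
    )
    conn.commit()


def load_state(conn: sqlite3.Connection, task_id: str) -> Optional[Tuple[str, int, int]]:
    """Return (source_db, breakpoint_pos, collected_count), or None if unknown."""
    return conn.execute(
        "SELECT source_db, breakpoint_pos, collected_count "
        "FROM collection_state WHERE task_id = ?",
        (task_id,),
    ).fetchone()
```

Keeping the source database alongside the breakpoint position covers the multi-database case, where a position alone would not say which database the interrupted segment belongs to.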
On the basis of the above embodiment, as shown in fig. 5, the sample data collection method provided in the present application further includes:
step 108, when receiving an acquisition continuation instruction, starting an acquisition action of the segment data positioned behind the current segment data according to the acquisition sequence and acquiring corresponding third segment sample data;
step 109, obtaining second sample data according to the first segment sample data, the second segment sample data and the third segment sample data.
Specifically, when an acquisition continuation instruction is received, for example when the sample data acquisition system applying the method detects that the breakpoint switch is not in a paused or stopped state, the segmented acquisition continues, and sample data is acquired from the segmented data that follow the current segmented data in the acquisition order.
On the basis of this embodiment, resuming acquisition according to the acquisition continuation instruction allows the acquisition of the database fingerprint to continue after an interruption, so that the resulting database fingerprint contains more database information and identifies the database more accurately.
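Resuming after an interruption, as in steps 108 and 109, can be sketched as below. The `paused` callback stands in for reading the breakpoint switch, and the function names and the count of already completed segments are assumptions for this illustration.

```python
from typing import Callable, List, Sequence


def resume_collection(
    segments: Sequence[Sequence[int]],
    completed_segments: int,
    paused: Callable[[], bool],
) -> List[int]:
    """Collect the segments after the first `completed_segments` ones.

    Collection proceeds only while the switch read by `paused()` is off.
    The result corresponds to the third segmented sample data.
    """
    third: List[int] = []
    for segment in segments[completed_segments:]:
        if paused():
            break
        third.extend(segment)
    return third


def merge_samples(first: Sequence[int], second: Sequence[int],
                  third: Sequence[int]) -> List[int]:
    """Combine the parts in acquisition order: second precedes first, third follows."""
    return list(second) + list(first) + list(third)
```

The merged list plays the role of the second sample data; it equals the full total data only if the resumed run is not paused again before the last segment.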
On the basis of the above embodiment, as shown in fig. 6, the sample data collection method provided in the present application further includes:
step 110, judging whether the acquisition action of the plurality of segment data is completed or not, and ending the segment acquisition action according to the judging result.
Specifically, after sample data has been acquired from the segmented data located after the current segmented data in the acquisition sequence, it is judged whether all segmented data have been acquired. Once sample data acquisition has been completed for all segmented data obtained by splitting the total data, the sample data acquisition action is stopped.
In this embodiment, a check on whether all segmented data obtained by splitting the total data have been acquired is added. This improves the identification accuracy of the acquired second sample data, so that whether the database has leaked data can be judged more accurately from the second sample data.
On the basis of the above embodiment, as shown in fig. 7, the sample data collection method provided in the present application further includes:
step 113, after judging that corresponding sample data has been acquired for the plurality of segmented data, performing database fingerprint training according to the second sample data to generate a database fingerprint model.
Specifically, after the total data generated from the unique identification data of the database has been acquired, model training is performed on the acquired second sample data to generate a database fingerprint model that uniquely identifies a single database, and the model is stored for later use. When a file needs to be checked against the database fingerprint, the database fingerprint model is invoked and compared with the file to judge whether the data in the file comes from the database, thereby realizing the leakage protection function for the database.
In this embodiment, the second sample data is obtained from all unique identification data in the database, so the second sample data obtained after acquiring the whole of the total data carries all the features used to identify and distinguish the database. A database fingerprint trained in this way can distinguish the database from other databases. To judge whether a file contains data leaked from the database, the data in the file is compared with the database fingerprint to determine whether it comes from the corresponding database, thereby realizing the leakage protection function for the database. In addition, a database fingerprint generated from the second sample data is more accurate than one generated from the first sample data obtained after the acquisition action was suspended, so the judgment of whether the database has leaked is also more accurate.
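The application does not specify how the database fingerprint model is trained. Purely as an illustrative stand-in, the sketch below treats the fingerprint as a set of SHA-256 digests of the unique identification values and scores a file by the fraction of its whitespace-separated tokens that match. The functions `build_fingerprint` and `leak_ratio`, the normalisation step, and the sample values are all assumptions introduced here.

```python
import hashlib


def _digest(value):
    """Hash a normalised value so the fingerprint stores no plaintext."""
    normalised = str(value).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def build_fingerprint(sample_data):
    """Stand-in for model training: a set of digests of the sample values."""
    return {_digest(value) for value in sample_data}


def leak_ratio(file_text, fingerprint):
    """Fraction of tokens in the file whose digest appears in the fingerprint."""
    tokens = file_text.split()
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if _digest(token) in fingerprint)
    return hits / len(tokens)


# Hypothetical unique identification data acquired from the database.
second_sample_data = ["ID-1001", "ID-1002", "ID-1003"]
fingerprint = build_fingerprint(second_sample_data)

# Two of the five tokens match values from the database.
ratio = leak_ratio("report id-1002 and ID-1003 attached", fingerprint)
```

A fingerprint built from incomplete sample data would miss some values, which is one way to read the statement that the second sample data gives a more accurate result than the first.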
On the basis of the above embodiment, as shown in fig. 8, the present application further provides an example of a sample data collection procedure.
First, in a sample data acquisition system applying the sample data acquisition method provided by the present application, a database fingerprint acquisition task is created. Screening conditions are entered into the system: the type of service database from which sample data is to be acquired, the specific table location information, the specific data column number location information, and similar details are selected, and the query conditions for data acquisition are entered. The service database is then queried according to the entered query conditions to carry out the acquisition. Breakpoints are set on the total amount of acquired data (the total data) in the service database, and the data is split at those breakpoints into a plurality of segmented data. An acquisition breakpoint control switch is also provided in the sample data acquisition system; a user can change the state of the switch through the system's interface, and the system checks the state of the switch in real time. The segmented data are then acquired in sequence, for example pulled to generate a plurality of segmented sample files. During acquisition, the state of the breakpoint control switch is checked. When the state is stop or pause, the acquisition of the segmented data currently being acquired is completed and the acquisition of subsequent segmented data is stopped; at this point, acquisition state information such as the position of the breakpoint in the total data and the amount of data acquired is stored in a MySQL database. When the breakpoint control switch is judged not to be in a pause or stop state, acquisition of the subsequent segmented data continues.
When the total amount of data to be acquired is judged to have been reached, that is, when all of the total data has been acquired, second sample data is generated from all acquired segmented sample data, and the database fingerprint training function in the system is called to perform model training, yielding a database fingerprint model that represents the identity of the service database. The breakpoint switch in the sample data acquisition system can be held in memory, which helps preserve the acquisition efficiency of sample data.
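The flow of this example can be sketched roughly as below. The sketch makes several assumptions that are not in the application: the in-memory switch is modelled with `threading.Event`, the standard-library `sqlite3` module stands in for the MySQL database, and the names `split_at_breakpoints`, `run_task`, and `acq_state` are hypothetical.

```python
import sqlite3
import threading


def split_at_breakpoints(rows, breakpoints):
    """Cut rows into segments at the given ascending row offsets."""
    bounds = [0, *breakpoints, len(rows)]
    return [rows[a:b] for a, b in zip(bounds, bounds[1:])]


def run_task(rows, breakpoints, stop_switch, db):
    """Acquire segments in order; poll the switch after each finished segment.

    Returns (segment_samples, finished). When stopped early, the breakpoint
    position and the amount of data acquired are saved to the state table.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS acq_state "
        "(breakpoint_pos INTEGER, collected INTEGER)"
    )
    samples, collected = [], 0
    for segment in split_at_breakpoints(rows, breakpoints):
        samples.append(list(segment))      # the current segment always completes
        collected += len(segment)
        if stop_switch.is_set() and collected < len(rows):
            # In this sketch the breakpoint offset equals the rows collected.
            db.execute("INSERT INTO acq_state VALUES (?, ?)", (collected, collected))
            db.commit()
            break
    return samples, collected == len(rows)


switch = threading.Event()   # in-memory breakpoint control switch
switch.set()                 # user has requested pause/stop
db = sqlite3.connect(":memory:")

samples, finished = run_task(list(range(10)), [4, 8], switch, db)
saved = db.execute("SELECT breakpoint_pos, collected FROM acq_state").fetchone()

switch.clear()               # switch no longer in pause/stop state
all_samples, all_finished = run_task(list(range(10)), [4, 8], switch, db)
```

Polling an in-memory flag costs far less than a database round trip per segment, which is consistent with the remark that a memory-based switch helps acquisition efficiency.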
On the basis of the above embodiment, the present application also provides another sample data acquisition example:
First, in a sample data acquisition system applying the sample data acquisition method provided by the present application, a database fingerprint acquisition task is created. Screening conditions are entered into the system: the type of service database from which sample data is to be acquired, the specific table location information, the specific data column number location information, and similar details are selected, and the query conditions for data acquisition are entered. The service database is then queried according to the entered query conditions to carry out the acquisition. The total amount of acquired data (the total data) in the service database is split into a plurality of segmented data, and one or more breakpoints are then set within each segmented data. An acquisition breakpoint control switch is also provided in the sample data acquisition system; a user can change the state of the switch through the system's interface, and the system checks the state of the switch in real time. The segmented data are then pulled in sequence to generate segmented sample data. While a single piece of segmented data is being acquired, the state of the breakpoint control switch is checked. After an acquisition interruption instruction is received, the acquisition stops at a breakpoint within that segmented data, yielding segmented sample data corresponding to the data before the breakpoint, and acquisition state information such as the position of the breakpoint within the segmented data, the position of that segmented data in the total data, and the amount of data acquired is stored in a MySQL database. When the breakpoint control switch is judged not to be in a pause or stop state, acquisition continues with the remaining part of the current segmented data.
When the total amount of data to be acquired is judged not to have been reached, the segmented sample data corresponding to the next segmented data is pulled and the state of the breakpoint control switch is checked again. When the total amount of data to be acquired is judged to have been reached, that is, when all of the total data has been acquired, second sample data is generated from all acquired segmented sample data, and the database fingerprint training function in the system is called to perform model training, yielding a database fingerprint model that represents the identity of the service database. The breakpoint switch in the sample data acquisition system can be held in memory, which helps preserve the acquisition efficiency of sample data.
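The variant with breakpoints inside each segment can be sketched as follows. This is a hedged illustration only: `BreakState`, `collect_with_inner_breakpoints`, and the `interrupted` callback are hypothetical names, and the saved state is kept in a plain object rather than a MySQL table.

```python
from dataclasses import dataclass


@dataclass
class BreakState:
    """Hypothetical saved state for an interruption inside a segment."""
    segment_index: int   # position of the interrupted segment in the total data
    offset: int          # breakpoint position inside that segment
    collected: int       # amount of data acquired so far


def collect_with_inner_breakpoints(segments, inner_breakpoints, interrupted, start=None):
    """Acquire rows, stopping at the next inner breakpoint when interrupted.

    inner_breakpoints[i] lists ascending offsets inside segments[i].
    interrupted() is polled at each inner breakpoint.
    Returns (rows, state); state is None once all data has been acquired.
    """
    seg_i = start.segment_index if start else 0
    offset = start.offset if start else 0
    collected = start.collected if start else 0
    rows = []
    while seg_i < len(segments):
        segment = segments[seg_i]
        stops = [b for b in inner_breakpoints[seg_i] if b > offset] + [len(segment)]
        for stop in stops:
            rows.extend(segment[offset:stop])
            collected += stop - offset
            offset = stop
            if stop < len(segment) and interrupted():
                return rows, BreakState(seg_i, offset, collected)
        seg_i, offset = seg_i + 1, 0
    return rows, None


segments = [[1, 2, 3, 4], [5, 6, 7, 8]]
inner = [[2], [2]]   # one breakpoint in the middle of each segment

# Interrupt: acquisition stops at the breakpoint inside the first segment.
first_part, state = collect_with_inner_breakpoints(segments, inner, lambda: True)

# Continue: the rest of the current segment, then the following segments.
rest, end_state = collect_with_inner_breakpoints(segments, inner, lambda: False, start=state)
```

Compared with the first example, an interruption here takes effect sooner because acquisition does not have to run to the end of the current segment.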
On the basis of the above embodiment, as shown in fig. 9, the present application further provides a layout schematic of a database fingerprint acquisition system applying the sample data acquisition method of the present application, and specific system construction examples are as follows:
The system includes a data leakage prevention system, a database fingerprint prediction server, a database fingerprint training server, a client A, and network attached storage (NAS). A user logs in to the data leakage prevention system through client A and creates a database fingerprint sample acquisition task. After the data leakage prevention system completes sample data acquisition, it sends the sample data to the database fingerprint training server, which performs model training on the sample data to obtain a database fingerprint model and stores the model on the NAS. The second sample data acquired in the data leakage prevention system can also be stored on the NAS, with the relevant directories shared between the data leakage prevention system and the database fingerprint training server. After database fingerprint model training, the data leakage prevention system retrieves the generated database fingerprint model from the NAS and issues the model to the database fingerprint prediction server through a software framework such as Thrift. Client A sends the actual file to be checked for data leakage to the database fingerprint prediction server; the database fingerprint prediction server loads the issued database fingerprint model, performs data leakage prediction, and returns the judgment or prediction result to client A.
The second embodiment of the present application further provides a sample data collecting device, as shown in fig. 10, including:
a segmentation module 121, configured to perform segmentation processing on total data to obtain a plurality of segmented data, where the total data includes unique identification data in a plurality of heterogeneous databases;
the acquisition module 122 is configured to perform segment acquisition on the plurality of segment data according to a preset acquisition sequence;
the interruption module 123 is configured to complete an acquisition action of the current segment data and obtain first segment sample data corresponding to the current segment data when an acquisition interruption instruction is received, and stop an acquisition action of segment data located after the current segment data according to an acquisition sequence;
the first sample acquiring module 124 is configured to acquire first sample data according to first segment sample data and second segment sample data, where the second segment sample data includes segment sample data corresponding to segment data that is located before the current segment data in an acquisition order.
In addition to the above embodiment, as shown in fig. 11, the sample data acquisition device further includes:
the screening module 125 is configured to receive a database acquisition instruction and acquire a plurality of heterogeneous databases according to preset screening conditions;
and the query module 126 is configured to query the plurality of heterogeneous databases according to preset query conditions to obtain unique identification data in the plurality of heterogeneous databases and generate total data.
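What the screening and query modules do can be sketched as below. The sketch is an assumption-laden illustration: in-memory `sqlite3` databases stand in for the heterogeneous service databases, and the `customers` table, the `id_number` column, the `type` labels, and the function names are all invented for the example.

```python
import sqlite3


def screen_databases(databases, wanted_type):
    """Keep only databases whose declared type matches the screening condition."""
    return [db for db in databases if db["type"] == wanted_type]


def build_total_data(databases, query):
    """Run the query on each selected database and concatenate the results."""
    total_data = []
    for db in databases:
        for (value,) in db["conn"].execute(query):
            total_data.append((db["name"], value))
    return total_data


def make_db(name, db_type, ids):
    """Create a small in-memory database holding unique identification values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id_number TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?)", [(i,) for i in ids])
    return {"name": name, "type": db_type, "conn": conn}


databases = [
    make_db("crm", "business", ["A2", "A1"]),
    make_db("logs", "audit", ["Z9"]),
    make_db("billing", "business", ["B1"]),
]

selected = screen_databases(databases, wanted_type="business")
total_data = build_total_data(
    selected, "SELECT id_number FROM customers ORDER BY id_number"
)
```

Sorting inside the query gives the total data a stable order, so breakpoint positions saved during one run still point at the same rows when acquisition resumes.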
On the basis of the above embodiment, as shown in fig. 12, the segmentation module 121 includes:
a breakpoint setting unit 127 for setting a breakpoint in the total data according to a preset breakpoint setting policy;
the data slicing unit 128 is configured to slice the total data according to the breakpoint to obtain a plurality of segment data.
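The two units of the segmentation module can be illustrated with a short sketch. The fixed-size strategy shown here is only one possible "preset breakpoint setting policy" and is an assumption, as are the function names `set_breakpoints` and `slice_total_data`.

```python
def set_breakpoints(total_rows, segment_size):
    """Hypothetical fixed-size policy: one breakpoint every segment_size rows."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return list(range(segment_size, total_rows, segment_size))


def slice_total_data(total_data, breakpoints):
    """Split the total data at the breakpoints into consecutive segments."""
    bounds = [0, *breakpoints, len(total_data)]
    return [total_data[a:b] for a, b in zip(bounds, bounds[1:])]


total_data = list(range(10))
breakpoints = set_breakpoints(len(total_data), segment_size=4)
segmented_data = slice_total_data(total_data, breakpoints)
```

Keeping the breakpoints as row offsets into the total data means the same values can later serve as the breakpoint position information in the saved acquisition state.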
In addition to the above embodiment, as shown in fig. 13, the sample data acquisition device further includes:
the status information obtaining module 129 is configured to obtain status information of segment collection, where the status information includes breakpoint position information and collected data volume.
On the basis of the above embodiment, as shown in fig. 14, the sample data collecting device further includes:
the acquisition recovery module 130 is configured to, when receiving an acquisition continuation instruction, start an acquisition action of the segmented data located after the current segmented data according to the acquisition sequence and obtain corresponding third segmented sample data;
a second sample acquiring module 131, configured to acquire second sample data according to the first segmented sample data, the second segmented sample data, and the third segmented sample data.
In addition to the above embodiment, as shown in fig. 15, the sample data acquisition device further includes:
the ending judging module 132 is configured to judge whether the plurality of segment data completes the collection action, and end the segment collection action according to the judging result.
On the basis of the above embodiment, as shown in fig. 16, the sample data collection device further includes:
the database fingerprint generation module 133 is configured to, after determining that corresponding sample data has been acquired for the plurality of segmented data, perform database fingerprint training according to that sample data and generate a database fingerprint model.
A third embodiment of the present application relates to a mobile terminal, as shown in fig. 17, including:
at least one processor 161; and,
a memory 162 communicatively coupled to the at least one processor 161; wherein,
the memory 162 stores instructions executable by the at least one processor 161 to enable the at least one processor 161 to implement the sample data collection method described in the first embodiment of the present application.
Where the memory and the processor are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting the various circuits of the one or more processors and the memory together. The bus may also connect various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or may be a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over the wireless medium via the antenna, which further receives the data and transmits the data to the processor.
The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And memory may be used to store data used by the processor in performing operations.
A fourth embodiment of the present application relates to a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements the sample data collection method described in the first embodiment of the present application.
That is, those skilled in the art will understand that all or part of the steps of the methods in the embodiments described above may be implemented by a program stored in a storage medium, where the program includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods in the embodiments described herein. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A method of sample data collection, the method comprising:
segmenting the total data to obtain a plurality of segmented data, wherein the total data comprises unique identification data in a database;
performing segmented acquisition on the plurality of segmented data according to a preset acquisition sequence;
when an acquisition interrupt instruction is received, completing the acquisition action of the current segmented data and acquiring first segmented sample data corresponding to the current segmented data, and stopping the acquisition action of the segmented data located after the current segmented data in the acquisition sequence;
and acquiring first sample data according to the first segmented sample data and second segmented sample data, wherein the second segmented sample data comprises segmented sample data corresponding to segmented data positioned before the current segmented data according to the acquisition sequence.
2. The method of claim 1, wherein, before the segmenting of the total data to obtain a plurality of segmented data, the total data comprising unique identification data in a database, the method further comprises:
receiving a database acquisition instruction and acquiring the database according to preset screening conditions;
and inquiring the database according to preset inquiry conditions to acquire the unique identification data and generate the total data.
3. The method of claim 1, wherein the segmenting of the total data to obtain a plurality of segmented data, the total data comprising unique identification data in a database, comprises:
setting a breakpoint in the total data according to a preset breakpoint setting strategy;
and segmenting the total data according to the break points to obtain the plurality of segmented data.
4. The method according to claim 3, wherein, after the step of, when the acquisition interrupt instruction is received, completing the acquisition action of the current segmented data, acquiring the first segmented sample data corresponding to the current segmented data, and stopping the acquisition action of the segmented data located after the current segmented data in the acquisition sequence, the method further comprises:
acquiring state information of the segmented acquisition, wherein the state information comprises breakpoint position information and the amount of data acquired.
5. The method according to claim 1, wherein, after the step of, when the acquisition interrupt instruction is received, completing the acquisition action of the current segmented data, acquiring the first segmented sample data corresponding to the current segmented data, and stopping the acquisition action of the segmented data located after the current segmented data in the acquisition sequence, the method further comprises:
when an acquisition continuation instruction is received, starting an acquisition action of the segment data positioned behind the current segment data according to the acquisition sequence and acquiring corresponding third segment sample data;
and acquiring second sample data according to the first segmented sample data, the second segmented sample data and the third segmented sample data.
6. The method of claim 5, wherein, after the acquiring of second sample data according to the first segmented sample data, the second segmented sample data and the third segmented sample data, the method further comprises:
judging whether the acquisition action has been completed for the plurality of segmented data, and ending the segmented acquisition action according to the judging result.
7. The method of claim 6, wherein, after the judging of whether the acquisition action has been completed for the plurality of segmented data and the ending of the segmented acquisition action according to the judging result, the method further comprises:
after judging that corresponding sample data has been acquired for the plurality of segmented data, performing database fingerprint training according to the second sample data to generate a database fingerprint model.
8. A sample data acquisition device, comprising:
the segmentation module is used for carrying out segmentation processing on total data to obtain a plurality of segmented data, wherein the total data comprises unique identification data in a database;
the acquisition module is used for carrying out segmented acquisition on the plurality of segmented data according to a preset acquisition sequence;
the interruption module is used for completing the current acquisition action of the segmented data and acquiring first segmented sample data corresponding to the current segmented data when an acquisition interruption instruction is received, and stopping the acquisition action of the segmented data positioned behind the current segmented data according to the acquisition sequence;
and the first sample acquisition module is used for acquiring first sample data according to the first segmented sample data and second segmented sample data, wherein the second segmented sample data comprises segmented sample data corresponding to segmented data positioned before the current segmented data according to the acquisition sequence.
9. A mobile terminal, comprising:
at least one processor; and,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to implement the sample data collection method of any one of claims 1-7.
10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the sample data collection method of any one of claims 1-7.
CN202310462036.0A 2023-04-26 2023-04-26 Sample data acquisition method and device, mobile terminal and storage medium Pending CN116541376A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310462036.0A CN116541376A (en) 2023-04-26 2023-04-26 Sample data acquisition method and device, mobile terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310462036.0A CN116541376A (en) 2023-04-26 2023-04-26 Sample data acquisition method and device, mobile terminal and storage medium

Publications (1)

Publication Number Publication Date
CN116541376A true CN116541376A (en) 2023-08-04

Family

ID=87453503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310462036.0A Pending CN116541376A (en) 2023-04-26 2023-04-26 Sample data acquisition method and device, mobile terminal and storage medium

Country Status (1)

Country Link
CN (1) CN116541376A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination