CN110515974A - Data extraction method, apparatus, computer device and storage medium - Google Patents


Info

Publication number
CN110515974A
Authority
CN
China
Prior art keywords
data
extracted
condition
memory
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910634368.6A
Other languages
Chinese (zh)
Other versions
CN110515974B (en)
Inventor
张国锐
戴勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kingdee Software China Co Ltd
Original Assignee
Kingdee Software China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kingdee Software China Co Ltd
Priority to CN201910634368.6A
Publication of CN110515974A
Application granted
Publication of CN110515974B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Telephonic Communication Services (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

This application relates to a data extraction method, apparatus, computer device and storage medium. The method includes: obtaining a data extraction task that carries an identifier of the data to be extracted; obtaining, according to the data extraction task, the data to be extracted that corresponds to the identifier; extracting intermediate data from the data to be extracted page by page, the intermediate data including error-flagged data; collecting statistics on the multiple kinds of feature information corresponding to the error-flagged data; calculating the data error rate corresponding to each kind of feature information in the intermediate data, and marking feature information whose data error rate is greater than a first threshold as a target feature; generating an extraction condition according to the target features; and extracting from the data to be extracted according to the extraction condition to obtain target data that satisfies the extraction condition. This method can improve the accuracy of data extraction.

Description

Data extraction method, apparatus, computer device and storage medium
Technical field
This application relates to the field of computer technology, and in particular to a data extraction method, apparatus, computer device and storage medium.
Background
With the development of computer technology, computers can process large amounts of data. To ensure that the data is correct, users usually need to inspect the data obtained. As data volumes grow massively, inspecting all of the data item by item consumes a great deal of time and working resources. Spot checks on data have therefore emerged: a portion of the data is extracted and inspected, and the overall quality of all the data is assessed on that basis.
In the conventional approach, data is usually extracted by randomly selecting a portion of the whole data set in a single pass for inspection. However, the data extracted in this way is only a part of the overall data, and a random selection is subject to chance, so it cannot accurately reflect the quality of the overall data. How to extract data accurately, so that the extracted data reflects the quality of the overall data, is therefore a technical problem that currently needs to be solved.
Summary of the invention
In view of this, it is necessary to address the above technical problem of inaccurate data extraction by providing a data extraction method, apparatus, computer device and storage medium that can improve the accuracy of data extraction.
A data extraction method, the method comprising:
Obtaining a data extraction task that carries an identifier of the data to be extracted;
Obtaining, according to the data extraction task, the data to be extracted that corresponds to the identifier;
Extracting intermediate data from the data to be extracted page by page, the intermediate data including error-flagged data;
Collecting statistics on the multiple kinds of feature information corresponding to the error-flagged data;
Calculating the data error rate corresponding to each kind of feature information in the intermediate data, and marking feature information whose data error rate is greater than a first threshold as a target feature;
Generating an extraction condition according to the target features;
Extracting from the data to be extracted according to the extraction condition to obtain target data that satisfies the extraction condition.
In one embodiment, the data extraction task also carries configuration information corresponding to the intermediate data, and the configuration information includes a quantity ratio; extracting intermediate data from the data to be extracted page by page includes:
Obtaining memory resource information, and determining a data capacity condition according to the memory resource information;
Extracting raw data that satisfies the data capacity condition from the data to be extracted;
Filtering the raw data based on a filter condition to obtain filtered data;
Randomly extracting from the filtered data according to the quantity ratio to obtain intermediate data;
Repeating the step of extracting raw data that satisfies the data capacity condition from the data to be extracted, until all of the data to be extracted has been traversed.
In one embodiment, the configuration information further includes attribute information; determining the data capacity condition according to the memory resource information includes:
Determining, according to the attribute information, the memory footprint of the data corresponding to each attribute;
Summing the memory footprints of the data corresponding to all attributes that the intermediate data includes according to the configuration information, to obtain the memory footprint of the intermediate data;
Calculating the ratio of the memory resource space in the memory resource information to the memory footprint of the intermediate data, and generating the data capacity condition.
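The capacity calculation above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `page_capacity`, the use of bytes as the unit, and the attribute names and sizes are all hypothetical, and the source does not say how memory footprints are measured.

```python
def page_capacity(free_memory_bytes, attribute_sizes, selected_attributes):
    """Return how many records fit in one page (the data capacity condition).

    attribute_sizes maps an attribute name to the assumed footprint, in bytes,
    of one value of that attribute; selected_attributes lists the attributes
    that the intermediate data includes according to the configuration.
    """
    record_bytes = sum(attribute_sizes[name] for name in selected_attributes)
    if record_bytes <= 0:
        raise ValueError("record footprint must be positive")
    # Ratio of the available memory space to the footprint of one record.
    return free_memory_bytes // record_bytes


# Hypothetical per-attribute footprints in bytes.
sizes = {"unit": 16, "task_type": 8, "handler": 32, "created_at": 8}
capacity = page_capacity(64 * 1024, sizes, ["unit", "task_type", "handler", "created_at"])
```

Each page of raw data would then hold at most `capacity` records.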
In one embodiment, generating the extraction condition according to the target features includes:
Combining target features of different feature types to obtain multiple combination conditions;
Extracting corresponding test data from the intermediate data according to each combination condition;
Calculating, from the test data, the data error rate corresponding to the combination condition in the intermediate data;
Marking as the extraction condition the combination condition whose data error rate is greater than a second threshold and which contains the most target features.
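One possible reading of these steps is sketched below in Python. The record layout (dictionaries with a boolean `is_error` field), the function name and the tie-break rule (prefer the highest error rate among combinations of equal size) are assumptions made for the example and are not taken from the source.

```python
from itertools import combinations, product


def generate_extraction_condition(intermediate, target_features, second_threshold):
    """Pick the combination condition that contains the most target features
    and whose error rate in the intermediate data exceeds second_threshold.

    target_features maps an attribute (feature type) to its target values.
    """
    attributes = sorted(target_features)
    # Try the largest combinations first so the result has the most features.
    for size in range(len(attributes), 0, -1):
        candidates = []
        for subset in combinations(attributes, size):
            for values in product(*(target_features[a] for a in subset)):
                condition = dict(zip(subset, values))
                # Test data: intermediate records matching the combination.
                test_data = [
                    r for r in intermediate
                    if all(r.get(k) == v for k, v in condition.items())
                ]
                if not test_data:
                    continue
                rate = sum(1 for r in test_data if r["is_error"]) / len(test_data)
                if rate > second_threshold:
                    candidates.append((rate, condition))
        if candidates:
            # Assumed tie-break: highest error rate wins.
            return max(candidates, key=lambda c: c[0])[1]
    return None


sample = [
    {"unit": "A", "task_type": "audit", "is_error": True},
    {"unit": "A", "task_type": "audit", "is_error": True},
    {"unit": "A", "task_type": "check", "is_error": False},
    {"unit": "B", "task_type": "audit", "is_error": False},
]
condition = generate_extraction_condition(
    sample, {"unit": ["A"], "task_type": ["audit"]}, 0.8
)
```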
In one embodiment, after the step of extracting from the data to be extracted according to the extraction condition to obtain target data that satisfies the extraction condition, the method further includes:
Comparing the data volume of the target data with the data volume of the intermediate data;
When the data volume of the target data is greater than the data volume of the intermediate data, randomly extracting from the target data an amount of target data equal to the data volume of the intermediate data.
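The size comparison above amounts to capping the target data at the size of the intermediate data. A minimal Python sketch, in which the function name and the optional `rng` parameter are assumptions added for reproducibility:

```python
import random


def cap_target_data(target_data, intermediate_count, rng=None):
    """Randomly reduce target_data to intermediate_count items if it is larger."""
    if rng is None:
        rng = random.Random()
    if len(target_data) > intermediate_count:
        return rng.sample(target_data, intermediate_count)
    return list(target_data)
```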
A data extraction apparatus, the apparatus comprising:
A task acquisition module, configured to obtain a data extraction task that carries an identifier of the data to be extracted;
A data acquisition module, configured to obtain, according to the data extraction task, the data to be extracted that corresponds to the identifier;
A data extraction module, configured to extract intermediate data from the data to be extracted page by page, the intermediate data including error-flagged data;
A feature information statistics module, configured to collect statistics on the multiple kinds of feature information corresponding to the error-flagged data;
A target feature marking module, configured to calculate the data error rate corresponding to each kind of feature information in the intermediate data, and to mark feature information whose data error rate is greater than a first threshold as a target feature;
An extraction condition generation module, configured to generate an extraction condition according to the target features;
The data extraction module is further configured to extract from the data to be extracted according to the extraction condition to obtain target data that satisfies the extraction condition.
In one embodiment, the data extraction task also carries configuration information corresponding to the intermediate data, and the configuration information includes a quantity ratio; the data extraction module is further configured to obtain memory resource information and determine a data capacity condition according to the memory resource information; extract raw data that satisfies the data capacity condition from the data to be extracted; filter the raw data based on a filter condition to obtain filtered data; randomly extract from the filtered data according to the quantity ratio to obtain intermediate data; and repeat the step of extracting raw data that satisfies the data capacity condition from the data to be extracted, until all of the data to be extracted has been traversed.
In one embodiment, the configuration information further includes attribute information; the data extraction module is further configured to determine, according to the attribute information, the memory footprint of the data corresponding to each attribute; sum the memory footprints of the data corresponding to all attributes that the intermediate data includes according to the configuration information, to obtain the memory footprint of the intermediate data; and calculate the ratio of the memory resource space in the memory resource information to the memory footprint of the intermediate data, and generate the data capacity condition.
A computer device, including a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the above data extraction method when executing the computer program.
A computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above data extraction method when executed by a processor.
The above data extraction method, apparatus, computer device and storage medium obtain the data to be extracted according to the obtained data extraction task, extract intermediate data from the data to be extracted page by page, and collect statistics on the multiple kinds of feature information corresponding to the error-flagged data in the intermediate data. From these statistics, target features with a high error rate are determined, an extraction condition is generated according to the target features, and target data that satisfies the extraction condition is extracted from the data to be extracted. Compared with the conventional approach, because the extraction condition is generated from the feature information of error-flagged data found in the data to be extracted, the target data extracted according to that condition can accurately reflect the quality of all of the data to be extracted, which effectively improves the accuracy of data extraction.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the data extraction method in one embodiment;
Fig. 2 is a flow diagram of the data extraction method in one embodiment;
Fig. 3 is a flow diagram of the step of generating an extraction condition according to the target features in one embodiment;
Fig. 4 is a structural block diagram of the data extraction apparatus in one embodiment;
Fig. 5 is a diagram of the internal structure of the computer device in one embodiment.
Detailed description
To make the objects, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and are not intended to limit it.
The data extraction method provided by this application can be applied in a terminal, and can also be applied in the application environment shown in Fig. 1. The application environment shown in Fig. 1 is used as the example here. The terminal 102 communicates with the server 104 over a network. The terminal 102 can upload to the server 104 a data extraction request that carries an identifier of the data to be extracted, and the server 104 generates, according to the received request, a data extraction task that carries the identifier. The server 104 obtains the data to be extracted that corresponds to the identifier according to the data extraction task, and extracts intermediate data from the data to be extracted page by page, the intermediate data including error-flagged data. The server 104 collects statistics on the multiple kinds of feature information corresponding to the error-flagged data, calculates the data error rate corresponding to each kind of feature information in the intermediate data, and marks feature information whose data error rate is greater than a first threshold as a target feature. The server 104 generates an extraction condition according to the target features and extracts from the data to be extracted according to the extraction condition to obtain target data that satisfies the extraction condition. The server 104 can return the target data to the terminal 102. The terminal 102 can be, but is not limited to, a personal computer, laptop, smartphone, tablet computer or portable wearable device; the server 104 can be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, a data extraction method is provided. The method can be applied to a terminal, and can also be applied to the server in Fig. 1. Taking its application to the server in Fig. 1 as an example, the method includes the following steps:
Step 202: obtain a data extraction task that carries an identifier of the data to be extracted.
The server can obtain a data extraction task in several ways. For example, it can obtain the task from a task list. It can also receive a data extraction request uploaded by a terminal and create a data extraction task according to that request. The data extraction task carries an identifier of the data to be extracted, that is, an identifier used to mark that data. Various identifiers can be used. For example, the identifier can be a number corresponding to the data to be extracted. When the data to be extracted is stored as a data table, the identifier can also be the name of that table.
Step 204: obtain, according to the data extraction task, the data to be extracted that corresponds to the identifier.
The server can execute the obtained data extraction task. A correspondence exists between the identifier and the data to be extracted, so the server can obtain the corresponding data from the database according to the data extraction task. The data to be extracted can be stored in the database as a data table and can consist of multiple fields in that table; the set of fields in one row of the table forms one piece of data to be extracted.
The data to be extracted can come from many domains. For example, in a Shared Services Center (SSC), the data to be extracted can be the task data corresponding to documents. Each document uploaded to the system can correspond to one task, and each document can correspond to one of several task types, such as a document check task or a document audit task. The extracted task data can reflect the overall quality of the task data corresponding to the documents. The data to be extracted can also be the operation data of power equipment in the electric power domain, in which case the extracted operation data can reflect the overall quality of the power equipment operation data.
Step 206: extract intermediate data from the data to be extracted page by page, the intermediate data including error-flagged data.
When faced with a large amount of data to be extracted, the server randomly extracts intermediate data from it by paged extraction. Specifically, the server determines, according to memory resource information, the data volume of each page that can be extracted. The server can page through the data to be extracted in several ways: it can call a single thread, or it can use multiple threads in parallel. For example, with a single thread the server extracts one page at a time, each time extracting and caching the amount of data that corresponds to one page. After one page has been extracted, the next extraction begins, and this repeats until all of the data to be extracted has been traversed. The server can also divide all of the data to be extracted into pages according to its memory resources and extract the pages in parallel using multiple threads.
The server randomly extracts part of each page of data to be extracted as intermediate data, and aggregates the intermediate data extracted from every page to obtain the total intermediate data. The intermediate data includes error-flagged data and correct-flagged data. In the conventional approach, all of the data to be extracted is extracted at once, and recording the serial numbers of the random extraction positions for all of that data occupies unnecessary memory. When the volume of data to be extracted is large, this overhead, added to the memory occupied by the data itself, easily causes memory overflow. In this embodiment, by contrast, intermediate data is extracted page by page, so not all of the data to be extracted needs to be cached; the data volume of each extracted page matches the server's memory resources, and memory does not overflow.
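The single-threaded paging described above can be sketched as a Python generator that holds only one page in memory at a time. The name `iter_pages` and the use of a generic iterable as the data source are assumptions; a real implementation would more likely page through a database cursor.

```python
from itertools import islice


def iter_pages(records, page_size):
    """Yield successive pages (lists) of at most page_size records."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    iterator = iter(records)
    while True:
        page = list(islice(iterator, page_size))
        if not page:
            return  # every record has been traversed
        yield page
```

For example, `list(iter_pages(range(5), 2))` produces three pages, the last of which holds the single remaining record.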
Step 208: collect statistics on the multiple kinds of feature information corresponding to the error-flagged data.
The data to be extracted has multiple attributes, and each attribute can correspond to one or more kinds of feature information. For the same attribute, different pieces of data to be extracted can have the same or different feature information. The server can collect statistics on the kinds of feature information corresponding to the error-flagged data in the intermediate data; the resulting feature information describes the features that the error-flagged data has. When a piece of data to be extracted is a set of fields in a data table, each field can be one attribute of that data, and the information under each attribute is the feature information corresponding to the error-flagged data.
For example, when the data to be extracted is the task data corresponding to documents, the error-flagged data is the task data containing errors among the extracted intermediate task data. The server collects statistics on the kinds of feature information corresponding to the error-flagged data. Task data can include several attributes, such as executing unit, task type, handler and creation time. Taking the executing unit attribute as an example, the corresponding feature information can be that the executing unit is unit A, unit B or unit C.
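A small Python sketch of this counting step follows. It assumes, purely for illustration, that each record is a dictionary with a boolean `is_error` field and that a piece of feature information is represented as an `(attribute, value)` pair; neither convention comes from the source.

```python
from collections import Counter


def count_error_features(intermediate, attributes):
    """Count (attribute, value) pairs that occur in error-flagged records."""
    counts = Counter()
    for record in intermediate:
        if record["is_error"]:
            for attribute in attributes:
                counts[(attribute, record[attribute])] += 1
    return counts


tasks = [
    {"unit": "A", "is_error": True},
    {"unit": "A", "is_error": True},
    {"unit": "B", "is_error": True},
    {"unit": "A", "is_error": False},
]
feature_counts = count_error_features(tasks, ["unit"])
```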
Step 210: calculate the data error rate corresponding to each kind of feature information in the intermediate data, and mark feature information whose data error rate is greater than a first threshold as a target feature.
For each kind of feature information, the server reads from the intermediate data the subset that matches that feature information. The server can read these subsets in several ways. For example, it can use a single thread to read the subset for each kind of feature information in turn, or it can use multiple threads to read the subsets in parallel.
The subset read by the server contains correct-flagged data and error-flagged data, and the server uses both to calculate the data error rate of the corresponding feature information. Specifically, the server calculates the ratio of the error-flagged data matching the feature information to all of the intermediate data matching the feature information, and takes this ratio as the data error rate of that feature information.
For example, suppose the data to be extracted contains M records, and paged extraction from these M records yields N records of intermediate data. The server collects statistics on the kinds of feature information corresponding to the error-flagged data in the N intermediate records. Take one kind of feature information under the executing unit attribute: the executing unit is unit A. The server reads from the N intermediate records the subset whose executing unit is unit A. It then calculates the ratio of the error-flagged data in that subset to the whole subset, which gives the data error rate for the feature information "executing unit is unit A". Here M and N are positive integers with M greater than N, and A identifies one executing unit among several.
The server compares the data error rate of each kind of feature information with the first threshold and marks feature information whose data error rate is greater than the first threshold as a target feature. The first threshold can be configured according to actual needs; it can be a percentage between 0 and 1, for example 85%. By comparing the data error rates with the first threshold, the server screens the feature information obtained from the statistics and marks the feature information with a high data error rate as target features.
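The error-rate calculation and thresholding can be illustrated with the following Python sketch. The record layout (`is_error` flag, `(attribute, value)` feature keys) and the function name are assumptions for the example; the error rate itself follows the ratio described above, error-flagged records with the feature divided by all intermediate records with the feature.

```python
from collections import defaultdict


def mark_target_features(intermediate, attributes, first_threshold):
    """Return {feature: error_rate} for features whose rate exceeds the threshold."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in intermediate:
        for attribute in attributes:
            feature = (attribute, record[attribute])
            totals[feature] += 1
            if record["is_error"]:
                errors[feature] += 1
    # Only features seen in error-flagged data are candidates.
    rates = {feature: errors[feature] / totals[feature] for feature in errors}
    return {f: rate for f, rate in rates.items() if rate > first_threshold}


records = (
    [{"unit": "A", "is_error": True}] * 3
    + [{"unit": "A", "is_error": False}]
    + [{"unit": "B", "is_error": True}]
    + [{"unit": "B", "is_error": False}] * 3
)
targets = mark_target_features(records, ["unit"], first_threshold=0.5)
```

In this made-up sample, unit A has three errors in four records and unit B has one error in four, so only unit A exceeds the 0.5 threshold.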
Step 212: generate an extraction condition according to the target features.
Step 214: extract from the data to be extracted according to the extraction condition to obtain target data that satisfies the extraction condition.
The server combines the target features corresponding to different attributes to obtain multiple combination conditions. It can combine the target features across attributes by permutation and combination to generate the extraction condition, and the generated extraction condition can accurately select from the data to be extracted the data that matches the target features with a high error rate. The server uses this extraction condition, which the statistics associate with a high error rate, to extract target data from the data to be extracted. If the error rate of the target data is high, the overall quality of the data to be extracted is low; if the error rate of the target data is low, the overall quality of the data to be extracted is high. The target data extracted according to the extraction condition can therefore accurately reflect the overall quality of the data to be extracted. Once the server has extracted from the data to be extracted according to the extraction condition and obtained the target data that satisfies it, the data extraction task is complete, and the server can send the target data to the corresponding terminal.
In this embodiment, the data to be extracted is obtained according to the data extraction task, intermediate data is extracted from it page by page, and statistics are collected on the multiple kinds of feature information corresponding to the error-flagged data in the intermediate data. From these statistics, target features with a high error rate are determined, and an extraction condition is generated according to the target features. Target data that satisfies the extraction condition is then extracted from the data to be extracted. Compared with the conventional approach, because the extraction condition is generated from the feature information of error-flagged data found in the data to be extracted, the target data extracted according to that condition can accurately reflect the quality of all of the data to be extracted, which effectively improves the accuracy of data extraction.
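As a rough sketch of the final extraction, the Python below applies an extraction condition to the data to be extracted and measures the error rate of the result. Representing the condition as an attribute-to-value dictionary and flagging errors with an `is_error` field are assumptions made for the example.

```python
def extract_target_data(records, extraction_condition):
    """Keep the records that match every attribute value in the condition."""
    return [
        r for r in records
        if all(r.get(a) == v for a, v in extraction_condition.items())
    ]


def error_rate(records):
    """Share of error-flagged records; 0.0 for an empty list."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["is_error"]) / len(records)


to_extract = [
    {"unit": "A", "task_type": "audit", "is_error": True},
    {"unit": "A", "task_type": "audit", "is_error": False},
    {"unit": "B", "task_type": "audit", "is_error": False},
]
target_data = extract_target_data(to_extract, {"unit": "A", "task_type": "audit"})
```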
In one embodiment, the data extraction task also carries configuration information corresponding to the intermediate data, and the configuration information includes a quantity ratio. Extracting intermediate data from the data to be extracted page by page includes: obtaining memory resource information, and determining a data capacity condition according to the memory resource information; extracting raw data that satisfies the data capacity condition from the data to be extracted; filtering the raw data based on a filter condition to obtain filtered data; randomly extracting from the filtered data according to the quantity ratio to obtain intermediate data; and repeating the step of extracting raw data that satisfies the data capacity condition from the data to be extracted, until all of the data to be extracted has been traversed.
The data extraction task can also carry configuration information corresponding to the intermediate data, and the server can extract intermediate data from the data to be extracted according to that configuration information. The configuration information can include several kinds of information about the intermediate data. For example, it can include, but is not limited to, the quantity ratio of intermediate data to be drawn, the filter condition, the attributes that the intermediate data includes and the attribute type of each attribute. The attributes of the intermediate data can be the same as those of the data to be extracted, or can be a subset of them. The attribute information of the raw data is the same as that of the intermediate data.
The server obtains the memory resource information, which can include the size of the memory space. According to the size of the memory space, the server determines the data capacity, that is, how much data to be extracted can be extracted and cached each time. The server generates the data capacity condition from this data capacity: the volume of data extracted each time must be less than or equal to the data capacity of the memory. This prevents a single extraction from exceeding the memory space and so avoids memory overflow on the server.
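The task and its configuration information could be modeled with a pair of small data classes, as in the Python sketch below. All class and field names here are hypothetical; the source lists the kinds of information carried but does not define a concrete structure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class IntermediateConfig:
    """Configuration information for the intermediate data (assumed layout)."""
    quantity_ratio: float  # share of filtered data to draw per page
    filter_conditions: List[Callable[[Record], bool]] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)  # name -> attribute type


@dataclass
class ExtractionTask:
    """A data extraction task carrying the identifier and the configuration."""
    data_id: str  # e.g. a table name or a number
    config: IntermediateConfig


task = ExtractionTask(
    data_id="task_table",
    config=IntermediateConfig(quantity_ratio=0.1, attributes={"unit": "str"}),
)
```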
The server can extract raw data that satisfies the data capacity condition from the data to be extracted in several ways, and treats each batch of raw data as one page. For example, the server can extract raw data in the order of the serial numbers of the data to be extracted, up to the data volume allowed by the data capacity condition, or it can extract that volume of raw data at random.
The server obtains the filter condition from the configuration information and filters each page of raw data based on it to obtain filtered data. The configuration information can include one or more filter conditions, which can be configured according to the attributes of the data to be extracted. When the data to be extracted is the task data corresponding to documents, the filter conditions can include task filter conditions that apply to the task data, and can also include document filter conditions that apply to the documents the task data corresponds to. In one embodiment, the server can also check each piece of data to be extracted against the filter condition in turn. When a piece of data meets the filter condition, it is filtered out. When it does not meet the filter condition, it is extracted and cached in memory as raw data, until the volume of raw data reaches the data capacity of the memory.
The server reads the quantity ratio from the configuration information corresponding to the intermediate data. The quantity ratio is the percentage of the filtered data that is to be extracted as intermediate data. From the quantity of filtered data and the quantity ratio in the configuration information, the server determines how many items of intermediate data to extract from the filtered data of the current page. Specifically, the server may multiply the quantity of filtered data by the quantity ratio to obtain the quantity of intermediate data.
The server randomly selects the target quantity of items from the filtered data as intermediate data. Specifically, the server may call a random algorithm to obtain the target quantity of serial numbers of filtered data and extract the intermediate data according to the randomly obtained serial numbers.
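The quantity calculation and the serial-number draw for one page can be sketched as follows. The function name sample_intermediate and the choice to express the quantity ratio as an integer percentage (to avoid floating-point rounding) are assumptions made for this illustration, and list positions stand in for serial numbers. Any fractional part of the product is simply dropped here.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_intermediate(filtered: Sequence[T], percent: int,
                        rng: Optional[random.Random] = None) -> List[T]:
    """Randomly pick percent% of one page of filtered data by serial number."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = rng or random.Random()
    # Quantity of intermediate data = quantity of filtered data x quantity ratio.
    count = len(filtered) * percent // 100
    # Draw distinct serial numbers (here: list positions), then fetch the records.
    serial_numbers = rng.sample(range(len(filtered)), count)
    return [filtered[i] for i in sorted(serial_numbers)]
```

Passing a seeded random.Random instance makes the draw reproducible, which is convenient when testing.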
After the server has extracted intermediate data from the initial data of a page, it may, in a single thread, repeat the step of extracting initial data that satisfies the data capacity condition from the data to be extracted and continue to randomly extract intermediate data from the newly extracted initial data. Once all the data to be extracted has been traversed, paged extraction ends. The server may then aggregate the intermediate data extracted from each page, so that the various feature information corresponding to the error identification data in the intermediate data can be counted.
In this embodiment, the server extracts initial data that satisfies the data capacity condition from the data to be extracted page by page, which prevents the memory overflow that caching a large amount of data to be extracted at once would cause. Filtering the initial data based on the filter conditions improves the accuracy of the filtered data. Randomly extracting intermediate data from the filtered data according to the quantity ratio ensures that, under paged extraction, every item of data to be extracted has the same probability of being selected. The server stops paged extraction only after all the data to be extracted has been traversed, which avoids the situation in which data newly added to the data to be extracted after the initial data was extracted cannot be extracted, and effectively improves the timeliness of intermediate data extraction.
In one embodiment, after calculating the product of the quantity of filtered data and the quantity ratio to obtain the quantity of intermediate data, the server may check whether the result is an integer. When the calculated quantity is an integer, the server randomly extracts that many items of intermediate data from the filtered data directly. When the calculated quantity is not an integer, the server rounds it and randomly extracts intermediate data from the filtered data according to the rounded quantity. Specifically, the server randomly extracts intermediate data from the filtered data according to the integer part of the calculated value and records the fractional part. The server then adds the recorded fractional part to the fractional parts of the intermediate data quantities obtained in subsequent paged extractions. When the accumulated value is greater than 1, the server extracts one additional item of intermediate data in the next extraction, removes the integer part from the accumulated value, and continues accumulating with the remaining fractional part.
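The fractional carry-over can be sketched as below. The class name PageQuota and its attributes are hypothetical, and exact rational arithmetic (fractions.Fraction) is used so that the accumulated value is not disturbed by floating-point error. One deliberate deviation is an assumption of this sketch: the extra item is triggered when the accumulated value reaches 1, not only when it exceeds 1.

```python
import math
from fractions import Fraction


class PageQuota:
    """Per-page intermediate data quantity with fractional carry-over."""

    def __init__(self, ratio: Fraction) -> None:
        if not 0 <= ratio <= 1:
            raise ValueError("ratio must be between 0 and 1")
        self.ratio = ratio
        self.carry = Fraction(0)   # accumulated fractional parts
        self.pending_extra = 0     # extra items owed to the next extraction

    def next_count(self, filtered_count: int) -> int:
        """Return how many items to extract from a page of filtered_count items."""
        exact = self.ratio * filtered_count
        whole = math.floor(exact)  # integer part is extracted now
        # Spend an owed extra item only if the page is large enough to supply it.
        extra = self.pending_extra if whole + self.pending_extra <= filtered_count else 0
        self.pending_extra -= extra
        # Record the fractional part and accumulate it across pages.
        self.carry += exact - whole
        if self.carry >= 1:
            self.pending_extra += 1  # one more item in the next extraction
            self.carry -= 1          # keep only the remaining fractional part
        return whole + extra
```

With a ratio of 1/4 and four pages of 10 filtered items each, the exact quantity per page is 2.5; the sketch returns 2, 2, 3 and 2, with one further item still owed to a fifth page if there is one.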
In this embodiment, after calculating the product of the quantity of filtered data and the percentage to be extracted, the server checks the resulting quantity of intermediate data. When the quantity is not an integer, the server extracts intermediate data according to the rounded quantity and accumulates the fractional part with the fractional parts calculated subsequently. When the accumulated value is greater than 1, the server extracts one additional item of intermediate data in the next extraction, which effectively ensures that every item of filtered data has the same probability of being selected under paged extraction.
In one embodiment, the configuration information further includes attribute information, and determining the data capacity condition according to the memory resource information includes: determining, according to the attribute information, the memory footprint of the data corresponding to each attribute; summing the memory footprints of the data corresponding to all attributes that the intermediate data includes in the configuration information, to obtain the memory footprint corresponding to the intermediate data; and calculating the ratio of the memory resource space in the memory resource information to the memory footprint corresponding to the intermediate data, to generate the data capacity condition.
The configuration information of the intermediate data includes the attribute information of the intermediate data and the attribute type corresponding to each attribute. The server can extract the corresponding intermediate data from the data to be extracted according to the attribute information in the configuration information, so every item of intermediate data has the same attributes. The attribute information corresponding to the intermediate data may include all of the attribute information of the data to be extracted, or only part of it.
The server determines the size of the memory space occupied by the data corresponding to each attribute by reading the attribute information in the configuration information. Specifically, the attribute information includes the attribute type of each attribute, and the server determines the required memory space from that attribute type. For example, suppose the attribute information includes two attributes, of types varchar(20) and char(1). From the attribute information, the server can determine that the two attributes occupy 20 bytes and 1 byte of memory respectively. The server counts all the attributes that the intermediate data includes in the configuration information; the memory footprint corresponding to the intermediate data is the sum of the memory space occupied by the data corresponding to all of those attributes. For example, when the intermediate data includes only the two attributes above, its memory footprint is 21 bytes.
By calculating the ratio of the memory resource space to the memory footprint corresponding to the intermediate data, the server obtains the number of items of intermediate data that can be cached in memory. The server may generate the data capacity condition from that number. The data capacity condition may be that the data volume of the intermediate data is less than or equal to the number of items of intermediate data that the memory can cache.
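The footprint and capacity calculation can be worked through in a short sketch. The helper names attribute_size, record_footprint and records_per_page are hypothetical, only char(n) and varchar(n) declarations are handled, and one byte per declared character is assumed, as in the example above.

```python
import re
from typing import Iterable

# Matches char(n) and varchar(n), ignoring case and surrounding whitespace.
_CHAR_TYPE = re.compile(r"\s*(?:var)?char\s*\(\s*(\d+)\s*\)\s*", re.IGNORECASE)


def attribute_size(type_decl: str) -> int:
    """Return the assumed byte size of one attribute from its type declaration."""
    match = _CHAR_TYPE.fullmatch(type_decl)
    if match is None:
        raise ValueError(f"unsupported attribute type: {type_decl!r}")
    return int(match.group(1))


def record_footprint(attribute_types: Iterable[str]) -> int:
    """Memory footprint of one item: the sum over all of its attributes."""
    return sum(attribute_size(t) for t in attribute_types)


def records_per_page(memory_bytes: int, attribute_types: Iterable[str]) -> int:
    """Number of items that fit in memory_bytes (the data capacity)."""
    footprint = record_footprint(attribute_types)
    if footprint == 0:
        raise ValueError("at least one attribute with non-zero size is required")
    return memory_bytes // footprint
```

For the two attributes varchar(20) and char(1), record_footprint gives 21 bytes, so a hypothetical memory space of 2100 bytes would hold 100 items.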
In this embodiment, the server calculates the memory footprint of each item of intermediate data, calculates the ratio of the server's memory resource space to that footprint, and generates the data capacity condition from the result. Each time the server extracts a page from the data to be extracted, it extracts only data that satisfies the data capacity condition, which effectively avoids a memory overflow.
In one embodiment, as shown in Fig. 3, the step of generating the extraction condition according to the target features includes:
Step 302: combine the target features of different feature types to obtain multiple combination conditions.
Step 304: extract the corresponding test data from the intermediate data according to each combination condition.
Step 306: use the test data to calculate the data error rate corresponding to each combination condition in the intermediate data.
Step 308: mark the combination condition whose data error rate is greater than the second threshold and which contains the most target features as the extraction condition.
The server permutes and combines the target features of the various feature types to obtain multiple combination conditions. Features corresponding to the same attribute belong to the same feature type, and target features of the same feature type are not combined with one another. For example, the error identification data may include three attributes: a, b and c. The feature information corresponding to the three attributes includes a1, a2, b1, b2, c1 and c2. The server may calculate the data error rate of each of the six items of feature information in the intermediate data, compare each data error rate with the first threshold and, according to the comparison results, mark a1, a2, b1 and c1 as target features. The server then combines the target features to obtain multiple combination conditions, including a1b1, a1c1, a2b1, a2c1, b1c1, a1b1c1 and a2b1c1.
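The combination step in this example can be reproduced with a short sketch. The function name combination_conditions and the dictionary layout (attribute name mapped to its target features) are assumptions made for illustration; the sketch also assumes that a combination condition always contains at least two target features, since the example lists no single-feature combinations.

```python
from itertools import combinations, product
from typing import Dict, List, Tuple


def combination_conditions(
    target_features: Dict[str, List[str]]
) -> List[Tuple[str, ...]]:
    """Combine target features, taking at most one feature per attribute."""
    attributes = sorted(target_features)
    conditions: List[Tuple[str, ...]] = []
    for size in range(2, len(attributes) + 1):
        # Choose which attributes (feature types) take part in the combination...
        for chosen in combinations(attributes, size):
            # ...then pick one target feature from each chosen attribute.
            conditions.extend(product(*(target_features[a] for a in chosen)))
    return conditions
```

Calling it with {"a": ["a1", "a2"], "b": ["b1"], "c": ["c1"]} yields the seven combination conditions listed in the example.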
The server extracts, from the extracted intermediate data, the test data that satisfies each combination condition. The test data includes error identification data and correctly identified data. For each combination condition, the server uses the error identification data in the test data and the test data as a whole to calculate the data error rate. Specifically, the server calculates the ratio of the error identification data in the test data to the test data and takes that ratio as the data error rate of the combination condition in the intermediate data. A data error rate in the intermediate data can be calculated in this way for every combination condition.
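A minimal sketch of the error rate calculation follows. The record layout (a dictionary per item, with a boolean is_error field marking error identification data) and the function name condition_error_rate are hypothetical. Returning 0.0 when no test data matches is also an assumption, since this case is not described.

```python
from typing import Any, List, Mapping


def condition_error_rate(intermediate: List[Mapping[str, Any]],
                         condition: Mapping[str, Any],
                         error_flag: str = "is_error") -> float:
    """Share of error identification data among items matching the condition."""
    # Test data: intermediate items whose attributes match every feature
    # in the combination condition.
    test_data = [
        record for record in intermediate
        if all(record.get(attr) == value for attr, value in condition.items())
    ]
    if not test_data:
        return 0.0  # assumption: no matching test data means no observed errors
    errors = sum(1 for record in test_data if record[error_flag])
    return errors / len(test_data)
```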
The server compares the calculated data error rates of the combination conditions with the second threshold and marks the combination condition whose data error rate is greater than the second threshold and which contains the most target features as the extraction condition. The second threshold can be configured according to actual needs. It may be a ratio between 0 and 1, for example 80%. The second threshold may be the same as the first threshold or different from it. For example, if among the combination conditions above only a1c1 and a1b1c1 have data error rates greater than the second threshold, the server reads the number of target features included in each of the two combination conditions and marks the combination condition a1b1c1 as the extraction condition. The server then extracts from the data to be extracted according to the extraction condition to obtain the target data that satisfies it.
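The selection rule can be sketched as follows. The function name select_extraction_condition and the mapping from combination condition to error rate are hypothetical, and the error rates in the usage note are invented for illustration. How ties between equally long combination conditions are resolved is not stated above; this sketch keeps the first one encountered.

```python
from typing import Dict, Optional, Tuple


def select_extraction_condition(
    error_rates: Dict[Tuple[str, ...], float],
    second_threshold: float,
) -> Optional[Tuple[str, ...]]:
    """Pick the qualifying combination condition with the most target features."""
    qualifying = [
        condition for condition, rate in error_rates.items()
        if rate > second_threshold
    ]
    if not qualifying:
        return None  # no combination condition exceeds the second threshold
    # max() returns the first longest condition, so ties keep insertion order.
    return max(qualifying, key=len)
```

For instance, with a second threshold of 0.8 and illustrative rates of 0.85 for a1c1 and 0.9 for a1b1c1 (all others below the threshold), the function returns ("a1", "b1", "c1").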
In this embodiment, the server combines the target features of different feature types to obtain multiple combination conditions. By checking whether the data error rate of each combination condition in the intermediate data is greater than the second threshold, and by counting the target features that each combination condition includes, the server marks the combination condition whose data error rate is greater than the second threshold and which contains the most target features as the extraction condition. The server extracts from the data to be extracted according to an extraction condition built from the target features of the error identification data, which effectively improves the accuracy of data extraction.
In one embodiment, the server may combine the target features of different feature types using multiple threads in parallel. Specifically, the server combines target features in parallel threads, adding one target feature at a time. Each time a target feature is added and a combination condition is obtained, the server calculates the data error rate of that combination condition in the intermediate data. When the data error rate of the combination condition is greater than the second threshold, the server adds another target feature. It stops when the data error rate of the combination condition is less than or equal to the second threshold, or when all target features have been traversed, and so obtains the combination condition.
In this embodiment, the server combines the target features of different feature types using multiple threads in parallel and continues combining target features only while the data error rate of the generated combination condition is greater than the second threshold. When the data error rate of a combination condition is less than the second threshold, adding further target features would only make the corresponding data error rate smaller, so stopping at that point removes unnecessary calculation and effectively saves the server's computing resources.
In one embodiment, after the step of extracting from the data to be extracted according to the extraction condition to obtain the target data that satisfies the extraction condition, the data extraction method further includes: comparing the data volume of the target data with the data volume of the intermediate data; and, when the data volume of the target data is greater than the data volume of the intermediate data, randomly extracting from the target data an amount of target data equal to the data volume of the intermediate data.
The server compares the data volume of the extracted target data with the data volume of the intermediate data. When the data volume of the target data is less than that of the intermediate data, the server takes all of the target data directly. When the data volume of the target data is greater than that of the intermediate data, the server randomly extracts from the target data an amount of target data equal to the data volume of the intermediate data. The random extraction can be performed in a way similar to the random extraction of intermediate data from filtered data in the embodiments above, so it is not described again here.
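This capping step can be sketched in a few lines. The function name cap_target_data is hypothetical. The case in which the two data volumes are equal is not addressed above; this sketch assumes that all target data is kept in that case.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def cap_target_data(target: Sequence[T], intermediate_count: int,
                    rng: Optional[random.Random] = None) -> List[T]:
    """Limit the target data to the data volume of the intermediate data."""
    rng = rng or random.Random()
    if len(target) <= intermediate_count:
        return list(target)  # small enough: take all target data
    # Otherwise draw intermediate_count distinct items at random.
    return rng.sample(list(target), intermediate_count)
```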
In this embodiment, because the data volume of the intermediate data is configured according to the operating condition of the server, the server compares the data volume of the target data with the data volume of the intermediate data. When the data volume of the target data is greater than that of the intermediate data, the server randomly extracts from the target data an amount equal to the data volume of the intermediate data. Because the target data is extracted from the data to be extracted according to the extraction condition, this adapts the extraction to the server's service capacity without affecting the accuracy of the extracted data, and reduces the server's data processing load.
In one embodiment, after obtaining the target data that satisfies the extraction condition, the server may also use the target data to assess the data quality of the data to be extracted and obtain an assessment result. The server may assess data quality in several ways. For example, the server may calculate the proportion of error identification data in the target data. When that proportion exceeds a preset value, the data quality is determined to be poor. When it does not exceed the preset value, the data quality is determined to be good. The server may also calculate the error present in the error identification data in the target data. When that error is greater than a preset value, the data quality is determined to be poor. When it is less than or equal to the preset value, the data quality is determined to be good. The server may record the assessment result and may also return it to the corresponding terminal.
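The proportion-based assessment can be sketched as follows. The function name assess_quality, the is_error field and the string results "poor" and "good" are hypothetical. Treating empty target data as good quality is an assumption of this sketch, since that case is not described.

```python
from typing import Any, Mapping, Sequence


def assess_quality(target_data: Sequence[Mapping[str, Any]],
                   preset_ratio: float,
                   error_flag: str = "is_error") -> str:
    """Rate data quality by the share of error identification data."""
    if not target_data:
        return "good"  # assumption: nothing extracted means nothing to flag
    errors = sum(1 for record in target_data if record[error_flag])
    share = errors / len(target_data)
    # Quality is poor only when the share exceeds the preset value.
    return "poor" if share > preset_ratio else "good"
```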
In this embodiment, the server uses the accurately extracted target data to assess the data quality of the data to be extracted, which effectively improves the accuracy of the data quality assessment result.
It should be understood that although the steps in the flowcharts of Figs. 2-3 are shown in sequence as indicated by the arrows, they are not necessarily executed in that sequence. Unless expressly stated otherwise herein, there is no strict ordering restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in Figs. 2-3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times. Nor are they necessarily executed one after another; they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 4, a data extraction apparatus is provided, comprising: a task acquisition module 402, a data acquisition module 404, a data extraction module 406, a feature information statistics module 408, a target feature marking module 410 and an extraction condition generation module 412, wherein:
The task acquisition module 402 is configured to obtain a data extraction task carrying an identifier of the data to be extracted.
The data acquisition module 404 is configured to obtain, according to the data extraction task, the data to be extracted that corresponds to the identifier.
The data extraction module 406 is configured to extract intermediate data page by page from the data to be extracted, the intermediate data including error identification data.
The feature information statistics module 408 is configured to count the various feature information corresponding to the error identification data.
The target feature marking module 410 is configured to calculate the data error rate corresponding to the feature information in the intermediate data and to mark the feature information whose data error rate is greater than the first threshold as target features.
The extraction condition generation module 412 is configured to generate an extraction condition according to the target features.
The data extraction module 406 is further configured to extract from the data to be extracted according to the extraction condition to obtain the target data that satisfies the extraction condition.
In one embodiment, the data extraction task further carries configuration information corresponding to the intermediate data, and the configuration information includes a quantity ratio. The data extraction module 406 is further configured to: obtain memory resource information and determine a data capacity condition according to the memory resource information; extract initial data that satisfies the data capacity condition from the data to be extracted; filter the initial data based on filter conditions to obtain filtered data; randomly extract from the filtered data according to the quantity ratio to obtain intermediate data; and repeat the step of extracting initial data that satisfies the data capacity condition from the data to be extracted until all the data to be extracted has been traversed.
In one embodiment, the configuration information further includes attribute information. The data extraction module 406 is further configured to: determine, according to the attribute information, the memory footprint of the data corresponding to each attribute; sum the memory footprints of the data corresponding to all attributes that the intermediate data includes in the configuration information, to obtain the memory footprint corresponding to the intermediate data; and calculate the ratio of the memory resource space in the memory resource information to the memory footprint corresponding to the intermediate data, to generate the data capacity condition.
In one embodiment, the extraction condition generation module 412 is further configured to: combine the target features of different feature types to obtain multiple combination conditions; extract the corresponding test data from the intermediate data according to each combination condition; use the test data to calculate the data error rate corresponding to each combination condition in the intermediate data; and mark the combination condition whose data error rate is greater than the second threshold and which contains the most target features as the extraction condition.
In one embodiment, the data extraction module 406 is further configured to: compare the data volume of the target data with the data volume of the intermediate data; and, when the data volume of the target data is greater than the data volume of the intermediate data, randomly extract from the target data an amount of target data equal to the data volume of the intermediate data.
For the specific limitations of the data extraction apparatus, refer to the limitations of the data extraction method above, which are not repeated here. Each module of the data extraction apparatus may be implemented wholly or partly in software, in hardware, or in a combination of the two. The modules may be embedded in or independent of the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call them to execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a terminal or a server. A server is taken as the example here, and its internal structure may be as shown in Fig. 5. The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store the data involved in data extraction. The network interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer program implements a data extraction method.
Those skilled in the art will understand that the structure shown in Fig. 5 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program, and the processor implements the steps of the data extraction method embodiments above when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program implements the steps of the data extraction method embodiments above when executed by a processor.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the embodiments above can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments above. Any reference to memory, storage, a database or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the embodiments above can be combined arbitrarily. For brevity, not every possible combination of the technical features of the embodiments above has been described. However, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.
The embodiments above express only several implementations of this application. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of this application, all of which fall within the scope of protection of this application. The scope of protection of this patent application shall therefore be subject to the appended claims.

Claims (10)

1. A data extraction method, the method comprising:
obtaining a data extraction task carrying an identifier of data to be extracted;
obtaining, according to the data extraction task, the data to be extracted that corresponds to the identifier;
extracting intermediate data page by page from the data to be extracted, the intermediate data including error identification data;
counting the various feature information corresponding to the error identification data;
calculating the data error rate corresponding to the feature information in the intermediate data, and marking the feature information whose data error rate is greater than a first threshold as target features;
generating an extraction condition according to the target features; and
extracting from the data to be extracted according to the extraction condition to obtain target data that satisfies the extraction condition.
2. The method according to claim 1, characterized in that the data extraction task further carries configuration information corresponding to the intermediate data, the configuration information including a quantity ratio, and extracting intermediate data page by page from the data to be extracted comprises:
obtaining memory resource information, and determining a data capacity condition according to the memory resource information;
extracting initial data that satisfies the data capacity condition from the data to be extracted;
filtering the initial data based on filter conditions to obtain filtered data;
randomly extracting from the filtered data according to the quantity ratio to obtain intermediate data; and
repeating the step of extracting initial data that satisfies the data capacity condition from the data to be extracted until all the data to be extracted has been traversed.
3. The method according to claim 2, characterized in that the configuration information further includes attribute information, and determining the data capacity condition according to the memory resource information comprises:
determining, according to the attribute information, the memory footprint of the data corresponding to each attribute;
summing the memory footprints of the data corresponding to all attributes that the intermediate data includes in the configuration information, to obtain the memory footprint corresponding to the intermediate data; and
calculating the ratio of the memory resource space in the memory resource information to the memory footprint corresponding to the intermediate data, to generate the data capacity condition.
4. The method according to claim 1, characterized in that generating the extraction condition according to the target features comprises:
combining the target features of different feature types to obtain multiple combination conditions;
extracting the corresponding test data from the intermediate data according to each combination condition;
using the test data to calculate the data error rate corresponding to each combination condition in the intermediate data; and
marking the combination condition whose data error rate is greater than a second threshold and which contains the most target features as the extraction condition.
5. The method according to claim 1, characterized in that, after the step of extracting from the data to be extracted according to the extraction condition to obtain the target data that satisfies the extraction condition, the method further comprises:
comparing the data volume of the target data with the data volume of the intermediate data; and
when the data volume of the target data is greater than the data volume of the intermediate data, randomly extracting from the target data an amount of target data equal to the data volume of the intermediate data.
6. A data extraction apparatus, characterized in that the apparatus comprises:
a task acquisition module, configured to obtain a data extraction task carrying an identifier of data to be extracted;
a data acquisition module, configured to obtain, according to the data extraction task, the data to be extracted that corresponds to the identifier;
a data extraction module, configured to extract intermediate data page by page from the data to be extracted, the intermediate data including error identification data;
a feature information statistics module, configured to count the various feature information corresponding to the error identification data;
a target feature marking module, configured to calculate the data error rate corresponding to the feature information in the intermediate data, and to mark the feature information whose data error rate is greater than a first threshold as target features; and
an extraction condition generation module, configured to generate an extraction condition according to the target features;
wherein the data extraction module is further configured to extract from the data to be extracted according to the extraction condition to obtain target data that satisfies the extraction condition.
7. The apparatus according to claim 6, characterized in that the data extraction task further carries configuration information corresponding to the intermediate data, the configuration information including a quantity ratio; and the data extraction module is further configured to: obtain memory resource information and determine a data capacity condition according to the memory resource information; extract initial data that satisfies the data capacity condition from the data to be extracted; filter the initial data based on filter conditions to obtain filtered data; randomly extract from the filtered data according to the quantity ratio to obtain intermediate data; and repeat the step of extracting initial data that satisfies the data capacity condition from the data to be extracted until all the data to be extracted has been traversed.
8. The apparatus according to claim 7, characterized in that the configuration information further includes attribute information; and the data extraction module is further configured to: determine, according to the attribute information, the memory footprint of the data corresponding to each attribute; sum the memory footprints of the data corresponding to all attributes that the intermediate data includes in the configuration information, to obtain the memory footprint corresponding to the intermediate data; and calculate the ratio of the memory resource space in the memory resource information to the memory footprint corresponding to the intermediate data, to generate the data capacity condition.
9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 5.
10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
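
The data flow recited in the device claims (capacity condition, batched sampling, error-rate marking, final extraction) can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (rows_per_batch, sample_intermediate, target_features, extract_targets, the record keys) is hypothetical, and it assumes that each record carries a boolean error flag, that the data error rate of a feature value is the share of flagged records among the sampled records with that value, and that the capacity condition is expressed as a maximum row count per batch.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Set

Record = Dict[str, object]


def rows_per_batch(free_memory_bytes: int, attr_bytes: Dict[str, int],
                   attrs: List[str]) -> int:
    """Capacity condition: memory space divided by the summed attribute footprint."""
    row_bytes = sum(attr_bytes[a] for a in attrs)
    if row_bytes <= 0:
        raise ValueError("attributes must have a positive total footprint")
    return max(1, free_memory_bytes // row_bytes)


def sample_intermediate(source: List[Record], batch_rows: int,
                        keep: Callable[[Record], bool], proportion: float,
                        rng: random.Random) -> List[Record]:
    """Read one batch at a time, filter it, sample it, until all data is traversed."""
    intermediate: List[Record] = []
    for start in range(0, len(source), batch_rows):
        batch = source[start:start + batch_rows]       # initial data for this pass
        filtered = [r for r in batch if keep(r)]       # apply the filter condition
        k = int(len(filtered) * proportion)            # quantity proportion (assumed <= 1)
        intermediate.extend(rng.sample(filtered, k))   # random draw without replacement
    return intermediate


def target_features(intermediate: List[Record], feature_key: str,
                    error_key: str, first_threshold: float) -> Set[object]:
    """Mark feature values whose error rate in the sample exceeds the threshold."""
    totals: Dict[object, int] = defaultdict(int)
    errors: Dict[object, int] = defaultdict(int)
    for r in intermediate:
        totals[r[feature_key]] += 1
        if r[error_key]:
            errors[r[feature_key]] += 1
    return {f for f in totals if errors[f] / totals[f] > first_threshold}


def extract_targets(source: List[Record], feature_key: str,
                    targets: Set[object]) -> List[Record]:
    """Extraction condition (assumed form): the feature value is a target feature."""
    return [r for r in source if r[feature_key] in targets]
```

In this sketch the sampled intermediate data is used only to decide which feature values are worth extracting; the final extraction then runs over the full data set with the generated condition.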
CN201910634368.6A 2019-07-15 2019-07-15 Data extraction method and device, computer equipment and storage medium Active CN110515974B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910634368.6A CN110515974B (en) 2019-07-15 2019-07-15 Data extraction method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910634368.6A CN110515974B (en) 2019-07-15 2019-07-15 Data extraction method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110515974A (en) 2019-11-29
CN110515974B (en) 2022-03-11

Family

ID=68623357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910634368.6A Active CN110515974B (en) 2019-07-15 2019-07-15 Data extraction method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110515974B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU3688799A (en) * 1998-06-30 2000-01-13 Lenovo Innovations Limited (Hong Kong) Data sampling method and device
US20110249905A1 (en) * 2010-01-15 2011-10-13 Copanion, Inc. Systems and methods for automatically extracting data from electronic documents including tables
CN103488700A (en) * 2013-09-04 2014-01-01 用友软件股份有限公司 Data extracting system and data extracting method
CN104866484A (en) * 2014-02-21 2015-08-26 阿里巴巴集团控股有限公司 Data processing method and device
CN106570060A (en) * 2016-09-30 2017-04-19 微梦创科网络科技(中国)有限公司 Data random extraction method and apparatus in information flow
CN106959955A (en) * 2016-01-11 2017-07-18 中国移动通信集团陕西有限公司 Data processing method and device for a database
CN107357790A (en) * 2016-05-09 2017-11-17 阿里巴巴集团控股有限公司 Unexpected message detection method, apparatus and system
CN109582833A (en) * 2018-11-06 2019-04-05 阿里巴巴集团控股有限公司 Abnormal text detection method and device
CN109710679A (en) * 2018-12-28 2019-05-03 北京旷视科技有限公司 Data pick-up method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
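MITALI SRIVASTAVA et al.: "Data preprocessing is considered as an important phase of Web usage mining due to unstructured, heterogeneous and noisy nature of log data. Complete and effective data preprocessing insures the efficiency and scalability of algorithms used in pattern disco", ACM *
张方 (Zhang Fang): "Research on Web data extraction technology for open-source communities", China Master's Theses Full-text Database, Information Science and Technology *
朱小刚 (Zhu Xiaogang): "Analysis of effective feature identification under imbalanced data classification", Computer Simulation (计算机仿真) *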

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111897864A (en) * 2020-08-13 2020-11-06 创智和宇信息技术股份有限公司 Expert database data extraction method and system based on Internet AI outbound
CN112988817A (en) * 2021-04-12 2021-06-18 携程旅游网络技术(上海)有限公司 Data comparison method, system, electronic equipment and storage medium
CN112988817B (en) * 2021-04-12 2024-03-12 携程旅游网络技术(上海)有限公司 Data comparison method, system, electronic device and storage medium

Also Published As

Publication number Publication date
CN110515974B (en) 2022-03-11

Similar Documents

Publication Publication Date Title
CN111506498B (en) Automatic generation method and device of test case, computer equipment and storage medium
CN110399293B (en) System test method, device, computer equipment and storage medium
CN108573371A (en) Data approval method, device, computer equipment and storage medium
CN108377240A (en) Exceptional interface detection method, device, computer equipment and storage medium
CN108446362A (en) Data cleansing processing method, device, computer equipment and storage medium
CN109474578A (en) Message verification method, device, computer equipment and storage medium
CN109542428A (en) Method for processing business, device, computer equipment and storage medium
CN111563368A (en) Report generation method and device, computer equipment and storage medium
CN110363645A (en) Asset data processing method, device, computer equipment and storage medium
CN109284920A (en) The method and system of user information risk assessment based on big data
CN111563075B (en) Service verification system, method and equipment and storage medium
CN108334625A (en) Processing method, device, computer equipment and the storage medium of user information
CN111178830A (en) Cost accounting method and device, computer equipment and storage medium
CN109800278A (en) Data assets map application method, device, computer equipment and storage medium
CN110515974A (en) Data pick-up method, apparatus, computer equipment and storage medium
CN109377383A (en) Product data synchronous method, device, computer equipment and storage medium
CN110413507A (en) System detection method, device, computer equipment and storage medium
CN110290486A (en) Short message sends test method, device, computer equipment and storage medium
CN109543073A (en) Enterprise's supply and marketing relation map generation method, device and computer equipment
CN109325868A (en) Questionnaire data processing method, device, computer equipment and storage medium
CN110084476B (en) Case adjustment method, device, computer equipment and storage medium
CN110837511B (en) Data processing method, system and related equipment
CN109656474A (en) Data storage method, device, computer equipment and storage medium
CN109542947B (en) Data statistical method, device, computer equipment and storage medium
CN116579580A (en) Method, device, computer equipment and storage medium for processing bill

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant