WO2018184418A1 - Data cleaning method, terminal, and computer readable storage medium - Google Patents

Data cleaning method, terminal, and computer readable storage medium

Info

Publication number
WO2018184418A1
WO2018184418A1 PCT/CN2018/074858 CN2018074858W WO2018184418A1 WO 2018184418 A1 WO2018184418 A1 WO 2018184418A1 CN 2018074858 W CN2018074858 W CN 2018074858W WO 2018184418 A1 WO2018184418 A1 WO 2018184418A1
Authority
WO
WIPO (PCT)
Prior art keywords
policy information
concurrent
cleaning
configuration table
data
Prior art date
Application number
PCT/CN2018/074858
Other languages
English (en)
French (fr)
Inventor
李治
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2018184418A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Definitions

  • the present application belongs to the field of computer technologies, and in particular, to a data cleaning method, a terminal, and a computer readable storage medium.
  • The prior art requires the user to download the data to be cleaned from the database in advance, generate a txt file of that data, and then perform the calculation using the prophet software.
  • The files generated by the prophet software must be converted into txt files before they can be uploaded to the database. The prophet software also runs inefficiently and takes more than 12 hours to clean the data. The traditional dividend data cleaning process is therefore complicated, its steps are redundant, and its efficiency is very low.
  • The embodiments of the present application provide a data cleaning method, a terminal, and a computer readable storage medium, which solve the prior-art problems of a complicated process, redundant steps, and low efficiency when cleaning traditional dividend data.
  • In a first aspect, a data cleaning method is provided, comprising:
  • when a cleaning task for the traditional dividend data is received, inserting the cleaning task into a preset task execution table and setting an execution time corresponding to the cleaning task;
  • when the execution time arrives, obtaining a scheduling package through the Oracle database, and performing data cleaning on the policy information in a multi-concurrent manner according to the scheduling package;
  • submitting the cleaned policy information in batches and storing it in the Oracle database.
  • the data cleaning of the policy information in the multi-concurrent manner includes:
  • submitting and storing the cleaned policy information into the Oracle database in batches includes:
  • the policy information after being cleaned by the concurrent process is read out to the second preset array, and the policy information in the second preset array is submitted to the Oracle database in batches by using the commit command;
  • the policy information in the batch is stored in the corresponding result table in the Oracle database according to the execution time of the policy information in each batch and the process number of the corresponding concurrent process.
  • The method further includes:
  • if the status information of the cleaning task is execution success, the cleaning task is not executed;
  • if the status information of the cleaning task is execution failure, the processed data in the plurality of concurrent processes is deleted, and the cleaning task is re-executed.
  • the method further includes:
  • the switch configuration table is read, and the invalid data in the policy information to be cleaned is removed according to the switch configuration table.
  • In a second aspect, a terminal is provided, including:
  • a task receiving module configured to insert the cleaning task into a preset task execution table when the cleaning task for the traditional dividend data is received, and set an execution time corresponding to the cleaning task;
  • a concurrent cleaning module configured to acquire a scheduling package through an Oracle database when the execution time arrives, and perform data cleaning on the policy information in a multi-concurrent manner according to the scheduling package;
  • a storage module configured to submit the cleaned policy information in batches and store it in the Oracle database.
  • A computer readable storage medium is provided, the computer readable storage medium storing computer readable instructions that, when executed by a processor, implement the following steps:
  • when a cleaning task for the traditional dividend data is received, inserting the cleaning task into a preset task execution table and setting an execution time corresponding to the cleaning task;
  • when the execution time arrives, obtaining a scheduling package through the Oracle database, and performing data cleaning on the policy information in a multi-concurrent manner according to the scheduling package;
  • submitting the cleaned policy information in batches and storing it in the Oracle database.
  • A terminal is provided, comprising a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer readable instructions:
  • when a cleaning task for the traditional dividend data is received, inserting the cleaning task into a preset task execution table and setting an execution time corresponding to the cleaning task;
  • when the execution time arrives, obtaining a scheduling package through the Oracle database, and performing data cleaning on the policy information in a multi-concurrent manner according to the scheduling package;
  • submitting the cleaned policy information in batches and storing it in the Oracle database.
  • The embodiment of the present application inserts the cleaning task into a preset task execution table when a cleaning task for the traditional dividend data is received, and sets an execution time corresponding to the cleaning task;
  • when the execution time arrives, the scheduling package is obtained through the Oracle database, and the policy information is cleaned in a multi-concurrent manner according to the scheduling package; finally, the cleaned policy information is submitted in batches and stored in the Oracle database.
  • This eliminates the steps of converting between file formats, effectively improves the efficiency of data cleaning, and reduces the overall time consumed by data cleaning.
  • FIG. 1 is a flowchart of an implementation of a method for data cleaning provided by a first embodiment of the present application
  • FIG. 2 is a specific implementation flowchart of step S102 in the data cleaning method provided by the first embodiment of the present application;
  • FIG. 3 is a specific implementation flowchart of step S103 in the data cleaning method provided by the first embodiment of the present application;
  • FIG. 4 is a schematic block diagram of a terminal provided by a second embodiment of the present application.
  • FIG. 5 is a schematic block diagram of a terminal provided by a third embodiment of the present application.
  • In the embodiment of the present application, when the cleaning task for the traditional dividend data is received, the cleaning task is inserted into the preset task execution table, and the execution time corresponding to the cleaning task is set; when the execution time arrives, the scheduling package is obtained through the Oracle database, and data cleaning is performed on the policy information in a multi-concurrent manner according to the scheduling package; finally, the cleaned policy information is submitted in batches and stored in the Oracle database.
  • This eliminates the steps of converting between file formats, effectively improves the efficiency of data cleaning, and reduces the overall time consumed by data cleaning.
  • the embodiments of the present application also provide corresponding terminals, which are respectively described in detail below.
  • FIG. 1 is a flowchart of an implementation of a method for data cleaning provided by a first embodiment of the present application.
  • the data cleaning method is applied to a terminal, and the terminal includes but is not limited to a computer, a server, and the like.
  • the data cleaning method includes:
  • In step S101, when the cleaning task for the traditional dividend data is received, the cleaning task is inserted into the preset task execution table, and the execution time corresponding to the cleaning task is set.
  • The terminal acquires the cleaning task for the traditional dividend data according to the user's trigger operation on the page, inserts the cleaning task into the preset task execution table, and sets the execution time of the task according to the user operation.
  • Exemplarily, the task execution table may be a pala_batch_plan table, and the pala_batch_plan table includes a scheduled start date field for controlling the execution time.
  • After the cleaning task is inserted into the pala_batch_plan table, the value in the scheduled start date field is modified according to the user operation to set the execution time of the cleaning task.
  • After the execution time is set, the cleaning task is allowed to start only when the current time is greater than or equal to the execution time. This makes it easier for the user to schedule the cleaning task and helps make rational use of CPU resources.
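The start condition described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the dictionary below is a hypothetical in-memory stand-in for one row of the task execution table, and the key name scheduled_start_date and the helper names are assumptions made for illustration only.

```python
from datetime import datetime

# Hypothetical in-memory stand-in for one row of the task execution table.
task = {
    "task_name": "traditional_dividend_cleaning",  # assumed name
    "scheduled_start_date": None,                  # assumed key name
}

def set_execution_time(task_row: dict, when: datetime) -> None:
    """Record the execution time chosen by the user."""
    task_row["scheduled_start_date"] = when

def may_start(task_row: dict, now: datetime) -> bool:
    """The task may start only when the current time has reached the execution time."""
    scheduled = task_row["scheduled_start_date"]
    return scheduled is not None and now >= scheduled

set_execution_time(task, datetime(2018, 1, 31, 2, 0))
too_early = may_start(task, datetime(2018, 1, 31, 1, 59))
on_time = may_start(task, datetime(2018, 1, 31, 2, 0))
```

The comparison uses "greater than or equal to", matching the condition stated in the text.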
  • In step S102, when the execution time arrives, the scheduling package is obtained through the Oracle database, and the policy information is cleaned in a multi-concurrent manner according to the scheduling package.
  • The embodiment of the present application obtains a scheduling package through an Oracle database, where the scheduling package includes information related to the execution of the cleaning task, such as parameter information and process information. A plurality of concurrent processes are then started according to the related information in the scheduling package, and data cleaning is performed on the policy information in the traditional dividend data by those concurrent processes.
  • The Oracle database performs scheduling according to the execution time in the scheduled start date field.
  • When the execution time arrives, the scheduling package is automatically acquired, and multiple concurrent processes are started to perform the cleaning task automatically.
  • The cleaning process is optimized in the embodiment of the present application: data is cleaned in a multi-concurrent manner, and each concurrent process automatically filters out the policy information it is to process, which ensures that no policy information is processed repeatedly.
  • FIG. 2 shows a specific implementation process for data cleaning of policy information in a multi-concurrent manner provided by the first embodiment of the present application.
  • the data cleaning of the policy information in a multi-concurrent manner includes:
  • In step S201, the concurrent configuration table is read, and a plurality of concurrent processes are started according to the concurrent configuration table.
  • The concurrent configuration table is used by the business personnel to configure the number of concurrent processes.
  • The terminal may start a corresponding number of concurrent processes according to the concurrent configuration table to prepare for data cleaning.
  • The preparatory actions before the concurrent configuration table is read may further include screening the policy information.
  • The method may further include:
  • a binary configuration table and a policy information basic configuration table are obtained, and the policy information to be cleaned is filtered from the Oracle database according to the binary configuration table and the policy information basic configuration table; the switch configuration table is read, and the invalid data in the policy information to be cleaned is removed according to the switch configuration table.
  • different policy information may be generated by different sales organizations.
  • the sales organization in the embodiment of the present application is divided into a primary organization and a secondary organization according to an administrative division.
  • the binary configuration table is used to distinguish which sales organization the policy information belongs to.
  • The policy information basic configuration table is used to record which insurance items in the policy information need to be cleaned.
  • the embodiment of the present application initially filters out the policy information to be cleaned in combination with the binary configuration table and the policy information basic configuration table, so as to reduce the data cleaning operation on the invalid policy information.
  • the binary configuration table and the policy information basic configuration table also serve as the basis for subsequent data cleaning.
  • The switch configuration table is dynamically configured by the business personnel and records, for each piece of policy information, which data is invalid.
  • the terminal selects invalid data that the salesperson does not care about from the policy information according to the switch configuration table, and inserts the invalid data into the data statistics table, so that the business personnel perform the next operation.
  • For example, suppose a certain type of policy information includes an insurer name field, an age field, a gender field, a telephone field, and an education information field, and the switch configuration table sets a switch option for each of these fields. If the business personnel consider the education information field to be irrelevant data, its option can be turned off in the switch configuration table, and the terminal then reads only the insurer name field, age field, gender field, and telephone field from the policy information according to the switch configuration table.
  • The embodiment of the present application filters out the policy information to be processed based on the binary configuration table, the policy information basic configuration table, and the switch configuration table, and excludes in advance, during the cleaning preparation phase, policy information that does not need to be cleaned as well as irrelevant data. This reduces the cleaning workload and further improves the efficiency of data cleaning.
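The field-level filtering driven by the switch configuration table can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the switch table is modelled as a plain dictionary of field name to on/off flag, the field names are hypothetical, and treating a field that is absent from the switch table as "on" is an assumption the source does not address.

```python
def apply_switches(record: dict, switches: dict) -> dict:
    """Keep only the fields whose switch is on.

    Assumption: a field missing from the switch table is kept.
    """
    return {name: value for name, value in record.items()
            if switches.get(name, True)}

# Hypothetical switch configuration: education information is turned off.
switch_config = {
    "insurer_name": True,
    "age": True,
    "gender": True,
    "phone": True,
    "education": False,
}

policy = {
    "insurer_name": "Zhang San",
    "age": 35,
    "gender": "M",
    "phone": "000-0000",
    "education": "bachelor",
}

filtered = apply_switches(policy, switch_config)
```

Here filtered keeps the four enabled fields and drops the education field, mirroring the example in the text.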
  • The preparatory actions may further include a status determination, in which case the method further includes:
  • after the concurrent processes are started, the status information of the cleaning task is obtained from the log table; if the status information of the cleaning task is execution success, the cleaning task is not executed.
  • If the status information of the cleaning task is execution failure, the processed data in the plurality of concurrent processes is deleted, and the cleaning task is re-executed.
  • The embodiment of the present application determines the status information of the cleaning task corresponding to the current execution time. When the cleaning task has executed successfully, it is not executed again; when the cleaning task has failed, the data already processed by the concurrent processes is deleted and the process goes to step S202 to re-execute the cleaning task; when the cleaning task has not been executed, the process goes to step S202 to execute it. This avoids repeated execution of the cleaning task and helps reduce time cost and CPU resource consumption.
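The three status cases can be expressed as a small decision function. The sketch below is illustrative only: the TaskStatus enum, the decide function, and the list standing in for data already written by the concurrent processes are hypothetical names, and the log table lookup is assumed to have already produced the status value.

```python
from enum import Enum

class TaskStatus(Enum):
    NOT_EXECUTED = "not_executed"
    SUCCESS = "success"
    FAILURE = "failure"

def decide(status: TaskStatus, processed_rows: list) -> bool:
    """Return True when the cleaning task should (re-)run.

    On failure, rows already processed by the concurrent processes are
    discarded first so that the re-run starts from a clean state.
    """
    if status is TaskStatus.SUCCESS:
        return False              # already done, do not execute again
    if status is TaskStatus.FAILURE:
        processed_rows.clear()    # delete partial results before re-executing
    return True                   # failure or not yet executed: run the task

partial = ["row-1", "row-2"]
rerun_after_failure = decide(TaskStatus.FAILURE, partial)
skip_after_success = decide(TaskStatus.SUCCESS, ["row-3"])
```

Clearing the partial results before a re-run is what keeps a failed attempt from leaving duplicate rows behind.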
  • In step S202, the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes is obtained, and the policy information is allocated to the concurrent process corresponding to the remainder.
  • The embodiment of the present application sets a corresponding process number for each concurrent process. After the policy information to be processed has been filtered out, a processing process is allocated to each piece of policy information according to its policy number. First, the policy information to be processed and its corresponding policy number are obtained; then the remainder of the last digit of the policy number divided by the total number of concurrent processes is obtained; finally, the policy information is assigned to the concurrent process whose process number equals the remainder, and that concurrent process is the processing process of the policy information.
  • For example, suppose the policy number currently to be processed is 201702008 and three concurrent processes have been started, numbered 0, 1, and 2. The last digit of the policy number is 8, and the remainder of 8 divided by 3 is 2, so the policy information with policy number 201702008 is assigned to the concurrent process with process number 2.
  • In this way, a processing process is assigned to all the policy information to be processed, which ensures that each piece of policy information has exactly one corresponding processing process, avoids policy information being cleaned repeatedly, and helps improve the efficiency of data cleaning.
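The allocation rule of step S202 is a modulo operation on the last digit of the policy number. The following Python sketch reproduces the worked example above; the function names are hypothetical and the policy number is handled as a string so that the last digit can be read directly.

```python
from collections import defaultdict

def assign_process(policy_number: str, total_processes: int) -> int:
    """Process number = last digit of the policy number mod the number of processes."""
    last_digit = int(policy_number[-1])
    return last_digit % total_processes

def allocate(policy_numbers, total_processes: int) -> dict:
    """Group policy numbers by the process number they are assigned to."""
    groups = defaultdict(list)
    for number in policy_numbers:
        groups[assign_process(number, total_processes)].append(number)
    return dict(groups)

example = assign_process("201702008", 3)   # last digit 8, 8 % 3 == 2
groups = allocate(["201702008", "201702009", "201702010"], 3)
```

Because each policy number maps to exactly one remainder, no record can land in two processes. One property of this reading worth noting: the last digit only takes ten values, so with more than ten processes the processes numbered 10 and above would never receive work.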
  • In step S203, the policy information to be processed is read by using a cursor, the read policy information is cached into the first preset array and submitted in batches to the allocated concurrent process, and the concurrent process performs data cleaning on the policy information according to a preset data cleaning algorithm.
  • The embodiment of the present application uses a cursor to read the policy information to be processed from the Oracle database.
  • Each piece of data information that is read is first cached in the first preset array.
  • When the number of reads reaches a specified threshold, the read data information is submitted as one batch to the corresponding concurrent process, which cleans the data according to a preset data cleaning algorithm.
  • Exemplarily, the specified threshold may be 5000 records per batch.
  • After 5000 pieces of data information have been read in a loop, they are submitted from the first preset array to the concurrent process for cleaning, which helps reduce I/O consumption.
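The buffer-then-submit pattern of step S203 can be sketched with a Python generator. This is only an illustration: any iterable stands in for the database cursor, the list buffer plays the role of the first preset array, and flushing a final partial batch is an assumption, since the source describes only the full-threshold case.

```python
from typing import Iterable, Iterator, List

def batches(rows: Iterable, batch_size: int = 5000) -> Iterator[List]:
    """Buffer rows and hand them over one batch at a time."""
    buffer: List = []                # stands in for the first preset array
    for row in rows:
        buffer.append(row)
        if len(buffer) == batch_size:
            yield buffer             # threshold reached: submit this batch
            buffer = []
    if buffer:                       # assumption: flush the remaining rows
        yield buffer

# A range stands in for the cursor over policy information.
sizes = [len(batch) for batch in batches(range(12001))]
```

With 12001 rows and the 5000-record threshold, sizes holds two full batches followed by one batch of 2001 rows.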
  • In step S103, the cleaned policy information is submitted in batches and stored in the Oracle database.
  • FIG. 3 shows a specific implementation process of step S103 in the data cleaning method provided by the first embodiment of the present application.
  • the step S103 includes:
  • In step S301, the policy information cleaned by the concurrent process is read out into the second preset array, and the policy information in the second preset array is submitted to the Oracle database in batches by using the commit command.
  • The embodiment of the present application reads the cleaned data information from the concurrent process and caches it in a second preset array. Similarly, when the number of reads reaches a specified threshold, the data information in the second preset array is submitted as one batch to the Oracle database for storage.
  • Exemplarily, the specified threshold may be 5000 records per batch. After 5000 pieces of data have been read in a loop, they are written from the second preset array to the Oracle database for storage, which further reduces I/O consumption.
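The batched commit of step S301 can be illustrated with Python's built-in sqlite3 module standing in for the Oracle database. This substitution is an assumption made so that the sketch runs without an Oracle instance; the table name result, its columns, and the function name are hypothetical, and a small batch size is used to keep the example short.

```python
import sqlite3

def store_in_batches(conn: sqlite3.Connection, rows, batch_size: int = 5000) -> int:
    """Insert rows batch by batch, committing once per batch. Returns the commit count."""
    sql = "INSERT INTO result (policy_no, payload) VALUES (?, ?)"
    commits = 0
    buffer = []                          # stands in for the second preset array
    for row in rows:
        buffer.append(row)
        if len(buffer) == batch_size:
            conn.executemany(sql, buffer)
            conn.commit()                # one commit per full batch
            commits += 1
            buffer = []
    if buffer:                           # assumption: flush the final partial batch
        conn.executemany(sql, buffer)
        conn.commit()
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (policy_no TEXT, payload TEXT)")
cleaned = [(str(201702000 + i), "cleaned") for i in range(7)]
commit_count = store_in_batches(conn, cleaned, batch_size=3)
stored = conn.execute("SELECT COUNT(*) FROM result").fetchone()[0]
```

Committing once per batch rather than once per row is the point of the technique: seven rows with a batch size of three result in three commits instead of seven.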
  • In step S302, the policy information in each batch is stored in the corresponding result table in the Oracle database according to the execution time of the policy information in that batch and the process number of the corresponding concurrent process.
  • The Oracle database includes a primary partition divided according to execution time and a secondary partition divided according to process number.
  • The result table includes two fields for determining the partition to which the data belongs: the proc date field and the Order num field. The proc date field indicates the execution time, and the Order num field indicates the process number. Every piece of data read from a concurrent process carries both the proc date and the Order num attribute.
  • According to the proc date field and the Order num field corresponding to each piece of data information, that piece of data information can be stored accurately in the corresponding result table.
  • The embodiment of the present application stores the cleaned policy information by partition, which makes it convenient to delete historical policy information and improves the efficiency of querying the cleaned policy information.
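The two-level partitioning can be modelled in memory to show why it simplifies deleting history. The Python sketch below is not Oracle partitioning itself: a dictionary keyed by (execution date, process number) stands in for the primary and secondary partitions, and the key names proc_date and order_num are assumed spellings of the two fields named in the text.

```python
from collections import defaultdict
from datetime import date

def partition_key(record: dict) -> tuple:
    """Primary key part: execution date. Secondary key part: process number."""
    return (record["proc_date"], record["order_num"])

def route(records) -> dict:
    """Place each record in the partition identified by its two fields."""
    partitions = defaultdict(list)
    for record in records:
        partitions[partition_key(record)].append(record)
    return dict(partitions)

def drop_history(partitions: dict, keep_from: date) -> dict:
    """Remove whole partitions whose execution date is before keep_from."""
    return {key: rows for key, rows in partitions.items() if key[0] >= keep_from}

records = [
    {"policy_no": "201702008", "proc_date": date(2018, 1, 30), "order_num": 2},
    {"policy_no": "201702009", "proc_date": date(2018, 1, 31), "order_num": 0},
    {"policy_no": "201702010", "proc_date": date(2018, 1, 31), "order_num": 0},
]
partitions = route(records)
recent = drop_history(partitions, date(2018, 1, 31))
```

Deleting history touches whole partitions selected by date instead of scanning individual rows, which is the convenience the text attributes to partitioned storage.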
  • The serial numbers of the steps do not indicate their order of execution. The order in which the steps are executed should be determined by their functions and internal logic, and the numbering should not limit the implementation process of the embodiments of the present application in any way.
  • FIG. 4 is a schematic block diagram of a terminal provided by a second embodiment of the present application. For the convenience of description, only parts related to the embodiment of the present application are shown.
  • The terminal is used to implement the data cleaning method described in any of the foregoing embodiments of FIG. 1 to FIG. 3, and its units may be software units, hardware units, or a combination of software and hardware.
  • the terminal includes, but is not limited to, a computer, a server, and the like.
  • the terminal includes:
  • the task receiving module 41 is configured to insert the cleaning task into a preset task execution table when the cleaning task for the traditional dividend data is received, and set an execution time corresponding to the cleaning task;
  • the concurrent cleaning module 42 is configured to: when the execution time arrives, obtain a scheduling package by using an Oracle database, and perform data cleaning on the policy information in a multi-concurrent manner according to the scheduling package;
  • the storage module 43 is configured to submit and store the cleaned policy information in batches into the Oracle database.
  • The task receiving module 41 acquires a cleaning task for the traditional dividend data according to the user's trigger operation on the page, inserts the cleaning task into a preset task execution table, and sets the execution time of the task.
  • Exemplarily, the task execution table may be a pala_batch_plan table, and the pala_batch_plan table includes a scheduled start date field for controlling the execution time.
  • After the cleaning task is inserted into the pala_batch_plan table, the value in the scheduled start date field is modified according to a user operation to set the execution time of the cleaning task. After the execution time is set, the cleaning task is allowed to start only when the current time is greater than or equal to the execution time. This makes it easier for the user to schedule the cleaning task and helps make rational use of CPU resources.
  • The concurrent cleaning module 42 acquires the scheduling package through the Oracle database.
  • The scheduling package includes information related to the execution of the cleaning task, such as parameter information and process information. A plurality of concurrent processes are then started according to the related information in the scheduling package, and data cleaning is performed on the policy information by those concurrent processes.
  • The concurrent cleaning module 42 can perform scheduling according to the execution time in the scheduled start date field.
  • When the execution time arrives, the scheduling package is automatically acquired, and several concurrent processes are started to perform the cleaning task automatically.
  • the concurrent cleaning module 42 further includes:
  • the startup unit 421 is configured to read a concurrent configuration table, and start a plurality of concurrent processes according to the concurrent configuration table;
  • the allocating unit 422 is configured to obtain a remainder between the last digit of the policy number corresponding to the policy information and the total number of concurrent processes, and allocate the policy information to the concurrent process corresponding to the remainder;
  • the cleaning unit 423 is configured to read the policy information to be processed by using a cursor, cache the read policy information into a first preset array, and submit it in batches to the allocated concurrent process, which performs data cleaning on the policy information according to a preset data cleaning algorithm.
  • the concurrent configuration table is used by the salesperson to configure the number of concurrent processes.
  • the terminal may start a corresponding number of concurrent processes according to the concurrent configuration table to prepare for data cleaning.
  • In the embodiment of the present application, the preparatory actions before the concurrent configuration table is read may further include screening the policy information, in which case the terminal further includes:
  • the screening module 44, configured to obtain a binary configuration table and a policy information basic configuration table before the concurrent configuration table is read, and filter the policy information to be cleaned from the Oracle database according to the binary configuration table and the policy information basic configuration table; and to read the switch configuration table and remove the invalid data in the policy information to be cleaned according to the switch configuration table.
  • different policy information may be generated by different sales organizations.
  • the sales organization in the embodiment of the present application is divided into a primary organization and a secondary organization according to an administrative division.
  • the binary configuration table is used to distinguish which sales organization the policy information belongs to.
  • The policy information basic configuration table is used to record which insurance items in the policy information need to be cleaned.
  • The embodiment of the present application initially screens out the policy information to be cleaned based on the binary configuration table and the policy information basic configuration table, which helps reduce data cleaning operations on invalid policy information.
  • The binary configuration table and the policy information basic configuration table also serve as the basis for subsequent data cleaning.
  • The switch configuration table is dynamically configured by the business personnel and is used to record invalid data.
  • the terminal selects invalid data that the salesperson does not care about from the policy information according to the switch configuration table, and inserts the invalid data into the data statistics table, so that the business personnel perform the next operation.
  • the embodiment of the present application filters out policy information to be processed based on the binary configuration table, the policy information basic configuration table, and the switch configuration table, and pre-excludes policy information that does not need to be cleaned in the cleaning preparation stage, thereby reducing the workload of cleaning. It is beneficial to further improve the efficiency of data cleaning.
  • The preparatory actions after the concurrent processes are started according to the concurrent configuration table may further include a status determination, in which case the terminal further includes:
  • the status identification module 45, configured to: after the concurrent processes are started according to the concurrent configuration table, obtain the status information of the cleaning task from the log table; if the status information of the cleaning task is execution success, not execute the cleaning task; and if the status information of the cleaning task is execution failure, delete the processed data in the plurality of concurrent processes and re-execute the cleaning task.
  • The embodiment of the present application determines the status information of the cleaning task corresponding to the current execution time. When the cleaning task has executed successfully, it is not executed again; when the cleaning task has failed, the data already processed by the concurrent processes is deleted and control jumps to the allocating unit 422 to re-execute the cleaning task; when the cleaning task has not been executed, control jumps to the allocating unit 422 to execute it. This avoids repeated execution of the cleaning task and helps reduce time cost and CPU resource consumption.
  • The allocating unit 422 allocates a processing process to each piece of policy information according to its policy number. First, the allocating unit 422 obtains the policy information to be processed and its corresponding policy number; then it obtains the remainder of the last digit of the policy number divided by the total number of concurrent processes; finally, it assigns the policy information to the concurrent process whose process number equals the remainder, and that concurrent process is the processing process of the policy information.
  • For example, suppose the policy number currently to be processed is 201702008 and three concurrent processes have been started, numbered 0, 1, and 2. The last digit of the policy number is 8, and the remainder of 8 divided by 3 is 2, so the policy information with policy number 201702008 is assigned to the concurrent process with process number 2.
  • the corresponding processing process is assigned to all the policy information to be processed, thereby ensuring that each policy information has a corresponding processing process, thereby avoiding the situation that the policy information is repeatedly cleaned, which is beneficial to improving the efficiency of data cleaning.
  • The cleaning unit 423 uses a cursor to read the policy information to be processed from the Oracle database.
  • Each piece of data information that is read is first cached in the first preset array.
  • When the number of reads reaches a specified threshold, the read data information is submitted as one batch to the corresponding concurrent process, which cleans the data according to a preset data cleaning algorithm.
  • Exemplarily, the specified threshold may be 5000 records per batch.
  • After 5000 pieces of data information have been read in a loop, they are submitted from the first preset array to the concurrent process for cleaning, which reduces I/O consumption.
  • the storage module 43 further includes:
  • the submitting unit 431, configured to read the policy information cleaned by the concurrent processes out into the second preset array, and to submit the policy information in the second preset array to the Oracle database in batches by using the commit command;
  • the storage unit 432, configured to store the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
  • in the embodiment of the present application, the submitting unit 431 reads the cleaned data out of the concurrent processes and caches it in the second preset array. Likewise, when the number of records read reaches a specified threshold, the records read are submitted as one batch to the Oracle database for storage.
  • the specified threshold may be 5000 records per batch. Each time 5000 records have been read out in the loop, they are written together from the second preset array to the Oracle database for storage, which reduces I/O port consumption.
  • the Oracle database includes first-level partitions divided by execution time and second-level partitions divided by process number.
  • the result table includes two fields that determine the partition to which a record belongs, the proc date field and the Order num field: the proc date field indicates the execution time, and the Order num field indicates the process number. Every record read out of a concurrent process carries both the proc date field and the Order num field.
  • when the commit command submits a batch of records to the Oracle database, the storage unit 432 uses the proc date field and the Order num field of each record to store that record in the corresponding result table.
  • the embodiment of the present application stores the cleaned policy information in partitions, which makes it easy to delete policy information from past execution times and improves the efficiency of querying the policy information cleaned in the current run.
  • it should be noted that the terminal in the embodiment of the present application may be used to implement all the technical solutions in the foregoing method embodiments. The functions of its functional modules may be implemented according to the methods in the foregoing method embodiments; for the specific implementation process, refer to the related descriptions above, which are not repeated here.
  • in summary, when a cleaning task for traditional dividend data is received, the embodiment of the present application inserts the cleaning task into the preset task execution table and sets the execution time corresponding to the cleaning task;
  • when the execution time arrives, the scheduling package is obtained through the Oracle database, and the policy information is cleaned in a multi-concurrent manner according to the scheduling package;
  • finally, the cleaned policy information is submitted in batches and stored in the Oracle database. The steps of converting between file formats are thereby eliminated, the efficiency of data cleaning is effectively improved, and the overall time consumption of data cleaning is reduced.
  • FIG. 5 is a schematic block diagram of a terminal provided by a third embodiment of the present application.
  • the terminal as shown may include one or more processors 501 (only one is shown), one or more input devices 502 (only one is shown), one or more output devices 503 (only one is shown), and a memory 504.
  • the processor 501, the input device 502, the output device 503, and the memory 504 are connected by a bus 506.
  • the input device 502 is configured to receive a cleaning task for traditional dividend data;
  • the memory 504 is configured to store computer readable instructions;
  • the processor 501 is configured to execute the computer readable instructions stored in the memory to perform the following operations:
  • when a cleaning task for traditional dividend data is received, the cleaning task is inserted into the preset task execution table, and the execution time corresponding to the cleaning task is set; when the execution time arrives, the scheduling package is obtained through the Oracle database, and data cleaning is performed on the policy information in a multi-concurrent manner according to the scheduling package; the cleaned policy information is submitted in batches and stored in the Oracle database.
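  • performing data cleaning on the policy information in a multi-concurrent manner includes:
  • reading the concurrency configuration table, and starting a number of concurrent processes according to the concurrency configuration table;
  • computing the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes, and assigning the policy information to the concurrent process corresponding to the remainder;
  • reading the policy information to be processed with a cursor, caching the read policy information in the first preset array, and submitting it in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
  • submitting the cleaned policy information in batches and storing it in the Oracle database includes:
  • reading the policy information cleaned by the concurrent processes out into the second preset array, and submitting the policy information in the second preset array to the Oracle database in batches by using the commit command;
  • storing the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
  • the processor 501 is further configured to:
  • after the concurrent processes are started according to the concurrency configuration table, obtain the status information of the cleaning task from the log table;
  • if the status information of the cleaning task indicates successful execution, not execute the cleaning task again;
  • if the status information of the cleaning task indicates failed execution, delete the data already processed by the concurrent processes and re-execute the cleaning task.
  • the processor 501 is further configured to:
  • before the concurrency configuration table is read, obtain the first/second-level configuration table and the basic policy information configuration table, and screen the policy information to be cleaned out of the Oracle database according to these two tables;
  • read the switch configuration table, and remove the invalid data from the policy information to be cleaned according to the switch configuration table.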
  • the processor 501 may be a central processing unit (CPU) and/or a graphics processing unit (GPU), and may additionally be combined with other general-purpose processors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
  • the input device 502 can include a touchpad, a fingerprint sensor (for collecting the user's fingerprint information and fingerprint direction information), a microphone, a communication module (such as a Wi-Fi module or a 2G/3G/4G network module), physical buttons, and the like.
  • the output device 503 can include a display (LCD or the like), a speaker, and the like.
  • the display can be used to display information input by the user or information provided to the user, and the like.
  • the display may include a display panel; optionally, the display panel may be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
  • further, the touchpad may cover the display. When the touchpad detects a touch operation on or near it, it passes the operation to the processor 501 to determine the type of touch event, and the processor 501 then provides the corresponding visual output on the display according to the type of touch event.
  • in a specific implementation, the processor 501, the input device 502, the output device 503, and the memory 504 described in the embodiment of the present application can carry out the implementations described in the embodiments of the data cleaning method provided by the present application, which are not repeated here.
  • the disclosed method, terminal, and computer readable storage medium may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • for example, the division into modules and units is only a division by logical function; in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit and module in each embodiment of the present application may be integrated into one processing unit, or each unit or module may exist physically separately, or two or more units or modules may be integrated into one unit.
  • if the functions are implemented in the form of software functional units and sold or used as standalone products, they may be stored in a computer readable storage medium.
  • based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium includes any medium that can store computer readable instructions, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

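A data cleaning method, a terminal, and a computer-readable storage medium. The method includes: when a cleaning task for traditional dividend data is received, inserting the cleaning task into a preset task execution table and setting the execution time corresponding to the cleaning task (S101); when the execution time arrives, obtaining a scheduling package through an Oracle database and, according to the scheduling package, performing data cleaning on policy information in a multi-concurrent manner (S102); and submitting the cleaned policy information in batches and storing it in the Oracle database (S103). The steps of converting between file formats are eliminated, which solves the problems of a complex process, redundant steps, and low efficiency when traditional dividend data is cleaned in the prior art, effectively improves the efficiency of data cleaning, and reduces its overall time consumption.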

Description

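Data cleaning method, terminal, and computer-readable storage medium
This application claims priority to Chinese patent application No. CN 201710221427.8, filed on April 6, 2017 and entitled "Data cleaning method and terminal", the entire contents of which are incorporated herein by reference.
Technical Field
This application belongs to the field of computer technology, and in particular relates to a data cleaning method, a terminal, and a computer-readable storage medium.
Background
When traditional dividend data is cleaned, the prior art requires the user to first download the data to be cleaned from the database, generate a txt file from it, and then run the calculation in the prophet software. The files produced by the prophet software must be converted back to txt before they can be uploaded to the database, and the prophet software is computationally inefficient: the cleaning step alone often takes more than 12 hours. The cleaning of traditional dividend data is therefore complex, contains redundant steps, and is highly inefficient.
Technical Problem
In view of this, the embodiments of the present application provide a data cleaning method, a terminal, and a computer-readable storage medium, to solve the problems of a complex process, redundant steps, and low efficiency when traditional dividend data is cleaned in the prior art.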
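Technical Solution
In a first aspect, a data cleaning method is provided, the method including:
when a cleaning task for traditional dividend data is received, inserting the cleaning task into a preset task execution table, and setting the execution time corresponding to the cleaning task;
when the execution time arrives, obtaining a scheduling package through an Oracle database, and performing data cleaning on policy information in a multi-concurrent manner according to the scheduling package;
submitting the cleaned policy information in batches and storing it in the Oracle database.
Further, performing data cleaning on the policy information in a multi-concurrent manner includes:
reading a concurrency configuration table, and starting a number of concurrent processes according to the concurrency configuration table;
computing the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes, and assigning the policy information to the concurrent process corresponding to the remainder;
reading the policy information to be processed with a cursor, caching the read policy information in a first preset array, and submitting it in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
Further, submitting the cleaned policy information in batches and storing it in the Oracle database includes:
reading the policy information cleaned by the concurrent processes out into a second preset array, and submitting the policy information in the second preset array to the Oracle database in batches by using the commit command;
storing the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
Further, after the concurrent processes are started according to the concurrency configuration table, the method further includes:
obtaining status information of the cleaning task from a log table;
if the status information of the cleaning task indicates successful execution, not executing the cleaning task again;
if the status information of the cleaning task indicates failed execution, deleting the data already processed by the concurrent processes, and re-executing the cleaning task.
Further, before the concurrency configuration table is read, the method further includes:
obtaining a first/second-level configuration table (一二元配置表) and a basic policy information configuration table, and screening the policy information to be cleaned out of the Oracle database according to the first/second-level configuration table and the basic policy information configuration table;
reading a switch configuration table, and removing invalid data from the policy information to be cleaned according to the switch configuration table.
In a second aspect, a terminal is provided, the terminal including:
a task receiving module, configured to, when a cleaning task for traditional dividend data is received, insert the cleaning task into a preset task execution table and set the execution time corresponding to the cleaning task;
a concurrent cleaning module, configured to, when the execution time arrives, obtain a scheduling package through an Oracle database and perform data cleaning on policy information in a multi-concurrent manner according to the scheduling package;
a storage module, configured to submit the cleaned policy information in batches and store it in the Oracle database.
In a third aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the following steps:
when a cleaning task for traditional dividend data is received, inserting the cleaning task into a preset task execution table, and setting the execution time corresponding to the cleaning task;
when the execution time arrives, obtaining a scheduling package through an Oracle database, and performing data cleaning on policy information in a multi-concurrent manner according to the scheduling package;
submitting the cleaned policy information in batches and storing it in the Oracle database.
In a fourth aspect, a terminal is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer-readable instructions:
when a cleaning task for traditional dividend data is received, inserting the cleaning task into a preset task execution table, and setting the execution time corresponding to the cleaning task;
when the execution time arrives, obtaining a scheduling package through an Oracle database, and performing data cleaning on policy information in a multi-concurrent manner according to the scheduling package;
submitting the cleaned policy information in batches and storing it in the Oracle database.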
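Beneficial Effects
Compared with the prior art, the embodiments of the present application insert the cleaning task into a preset task execution table when a cleaning task for traditional dividend data is received, and set the execution time corresponding to the cleaning task; when the execution time arrives, obtain a scheduling package through the Oracle database and perform data cleaning on the policy information in a multi-concurrent manner according to the scheduling package; and finally submit the cleaned policy information in batches and store it in the Oracle database. This eliminates the steps of converting between file formats, effectively improves the efficiency of data cleaning, and reduces its overall time consumption.
Brief Description of the Drawings
FIG. 1 is a flowchart of the data cleaning method provided by the first embodiment of the present application;
FIG. 2 is a detailed flowchart of step S102 of the data cleaning method provided by the first embodiment of the present application;
FIG. 3 is a detailed flowchart of step S103 of the data cleaning method provided by the first embodiment of the present application;
FIG. 4 is a schematic block diagram of the terminal provided by the second embodiment of the present application;
FIG. 5 is a schematic block diagram of the terminal provided by the third embodiment of the present application.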
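Embodiments of the Invention
In the embodiments of the present application, when a cleaning task for traditional dividend data is received, the cleaning task is inserted into a preset task execution table and the execution time corresponding to the cleaning task is set; when the execution time arrives, a scheduling package is obtained through the Oracle database and the policy information is cleaned in a multi-concurrent manner according to the scheduling package; finally, the cleaned policy information is submitted in batches and stored in the Oracle database. This eliminates the steps of converting between file formats, effectively improves the efficiency of data cleaning, and reduces its overall time consumption. The embodiments of the present application also provide a corresponding terminal; each is described in detail below.
FIG. 1 shows the implementation flow of the data cleaning method provided by the first embodiment of the present application.
In this embodiment, the data cleaning method is applied to a terminal, which includes but is not limited to a computer, a server, and the like. Referring to FIG. 1, the data cleaning method includes:
In step S101, when a cleaning task for traditional dividend data is received, the cleaning task is inserted into a preset task execution table, and the execution time corresponding to the cleaning task is set.
In this embodiment, the terminal obtains the cleaning task for traditional dividend data from a trigger operation performed by the user on a page, inserts the cleaning task into the preset task execution table, and sets the execution time of the task according to the user's operation. For example, the task execution table may be the pala_batch_plan table, which includes a scheduled start date field used to control the execution time. While inserting the cleaning task into the pala_batch_plan table, this embodiment modifies the value of the scheduled start date field to set the execution time of the cleaning task. Once the execution time is set, the cleaning task is allowed to start only when the current time is greater than or equal to the execution time. This makes it easy for the user to schedule cleaning tasks and helps make reasonable use of CPU resources.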
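The gating rule of step S101 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the embodiment: the in-memory list standing in for the pala_batch_plan table, the key name `scheduled_start_date`, the task name, and the helpers `insert_task` and `due_tasks` are all hypothetical.

```python
from datetime import datetime

# Hypothetical in-memory stand-in for the pala_batch_plan table.
batch_plan = []

def insert_task(task_name, scheduled_start_date):
    """Insert a cleaning task together with its execution time."""
    batch_plan.append({"task": task_name,
                       "scheduled_start_date": scheduled_start_date})

def due_tasks(now):
    """Return the tasks allowed to start: current time >= execution time."""
    return [row["task"] for row in batch_plan
            if now >= row["scheduled_start_date"]]

insert_task("clean_dividend_data", datetime(2017, 4, 6, 2, 0))
print(due_tasks(datetime(2017, 4, 6, 1, 59)))  # one minute before the execution time
print(due_tasks(datetime(2017, 4, 6, 2, 0)))   # exactly at the execution time
```

The comparison uses "greater than or equal to", matching the condition stated in the text, so a task becomes eligible at the scheduled instant itself.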
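In step S102, when the execution time arrives, a scheduling package is obtained through the Oracle database, and data cleaning is performed on the policy information in a multi-concurrent manner according to the scheduling package.
When the execution time arrives, this embodiment obtains the scheduling package through the Oracle database. The scheduling package contains information related to the execution of the cleaning task, such as parameter information and process-flow information. A number of concurrent processes are then started according to the information in the scheduling package, and the policy information in the traditional dividend data is cleaned by these concurrent processes.
For example, as described above, the Oracle database uses the execution time in the scheduled start date field: when the execution time arrives, it automatically obtains the scheduling package and starts multiple concurrent processes to execute the cleaning task automatically.
Compared with the prior art, this embodiment optimizes the cleaning flow: data cleaning is performed in a multi-concurrent manner, and each concurrent process automatically selects the policy information it is to process, which ensures that no policy information is processed more than once.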
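Optionally, FIG. 2 shows the specific implementation flow of cleaning the policy information in a multi-concurrent manner provided by the first embodiment of the present application. Referring to FIG. 2, performing data cleaning on the policy information in a multi-concurrent manner includes:
In step S201, a concurrency configuration table is read, and a number of concurrent processes are started according to the concurrency configuration table.
In this embodiment, the concurrency configuration table is used by business staff to configure the number of concurrent processes. The terminal can start the corresponding number of concurrent processes according to the concurrency configuration table in preparation for data cleaning.
Optionally, the preparation before the concurrency configuration table is read further includes screening the policy information, and the method may further include:
obtaining a first/second-level configuration table (一二元配置表) and a basic policy information configuration table, and screening the policy information to be cleaned out of the Oracle database according to the first/second-level configuration table and the basic policy information configuration table;
reading a switch configuration table, and removing invalid data from the policy information to be cleaned according to the switch configuration table.
In this embodiment, different policy information may be generated by different sales institutions, which are divided by administrative region into first-level institutions and second-level institutions. The first/second-level configuration table is used to identify which sales institution a piece of policy information belongs to. The basic policy information configuration table records which insurance types in the policy information need data cleaning. This embodiment combines the two tables to make an initial selection of the policy information to be cleaned, which reduces cleaning operations on invalid policy information. The two tables also serve as the basis for the subsequent data cleaning.
The switch configuration table is configured dynamically by business staff, who use it to record the invalid data in each piece of policy information. According to the switch configuration table, the terminal picks out of the policy information the invalid data that business staff are not interested in and inserts it into a data statistics table, where it awaits further handling by business staff. For example, suppose a certain type of policy information contains an insurer name field, an age field, a gender field, a phone field, and an education information field, and the switch configuration table provides a switch option for each of these fields. If business staff consider the education information field irrelevant, they can turn off the option for that field in the switch configuration table, and the terminal then reads only the insurer name, age, gender, and phone fields of that type of policy information according to the switch configuration table.
Here, this embodiment selects the policy information to be processed on the basis of the first/second-level configuration table, the basic policy information configuration table, and the switch configuration table. Policy information that does not need cleaning and irrelevant data are excluded in the preparation stage, which reduces the cleaning workload and further improves the efficiency of data cleaning.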
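The field-level switch described above can be illustrated with a short Python sketch. The dictionary layout of the switch configuration, the field names, the sample record, and the helper `apply_switches` are assumptions made for the example; the source does not specify how the switch configuration table is structured.

```python
# Hypothetical switch configuration: True keeps a field, False drops it.
switch_config = {
    "insurer_name": True,
    "age": True,
    "gender": True,
    "phone": True,
    "education": False,  # switched off by business staff as irrelevant
}

def apply_switches(policy, switches):
    """Keep only the fields whose switch is turned on."""
    return {field: value for field, value in policy.items()
            if switches.get(field, False)}

# Made-up sample record, for illustration only.
policy = {"insurer_name": "Zhang San", "age": 35, "gender": "M",
          "phone": "000-0000", "education": "bachelor"}
kept = apply_switches(policy, switch_config)
print(sorted(kept))  # the education field is no longer present
```

In this sketch a field without a configured switch is treated as switched off; that default is a design choice of the example, not something stated in the source.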
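Optionally, the preparation after the concurrent processes are started according to the concurrency configuration table may further include a status check, and the method further includes:
obtaining status information of the cleaning task from a log table;
if the status information of the cleaning task indicates successful execution, not executing the cleaning task again;
if the status information of the cleaning task indicates failed execution, deleting the data already processed by the concurrent processes, and re-executing the cleaning task.
Here, this embodiment examines the status information of the cleaning task corresponding to the current execution time. If the cleaning task was executed successfully, it is not executed again; if it failed, the data already processed by the concurrent processes is deleted and the flow jumps to step S202 to re-execute the cleaning task; if it has not yet been executed, the flow jumps to step S202 to execute it. Repeated execution of the cleaning task is thereby avoided, which reduces time cost and CPU resource consumption.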
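The three-way status decision above can be sketched as a small Python function. The status strings and the action labels are placeholders chosen for the example; the source does not give the actual values stored in the log table.

```python
def plan_action(status):
    """Map the status read from the log table to the next action.

    "SUCCESS" and "FAILED" are hypothetical status values; any other
    value (including None) is treated as "not executed yet".
    """
    if status == "SUCCESS":
        return "skip"                         # do not run this task again
    if status == "FAILED":
        return "delete_processed_and_rerun"   # clear partial output, then go to S202
    return "run"                              # not executed yet: go to S202

for status in ("SUCCESS", "FAILED", None):
    print(status, "->", plan_action(status))
```

Deleting the partially processed data before a rerun is what keeps a retried task from storing the same policy information twice.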
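In step S202, the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes is computed, and the policy information is assigned to the concurrent process corresponding to the remainder.
This embodiment sets a process number for each concurrent process. After the policy information to be processed has been selected, a processing process is assigned to each piece of policy information according to its policy number. First, the policy information to be processed and its policy number are obtained; then the remainder of the last digit of the policy number divided by the total number of concurrent processes is computed; finally, the policy information is assigned to the concurrent process whose process number equals the remainder, and that concurrent process is the processing process for the policy information. For example, suppose the policy number currently to be processed is 201702008 and the total number of concurrent processes started is 3, numbered 0, 1, and 2. The remainder of the last digit of the policy number, 8, divided by the total number of concurrent processes, 3, is 2, so the policy information with policy number 201702008 is assigned to the concurrent process with process number 2. A processing process is assigned in the same way to all policy information to be processed. Each piece of policy information thus has a corresponding processing process, which prevents policy information from being cleaned repeatedly and improves the efficiency of data cleaning.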
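The allocation rule of step S202 reduces to one modulo operation. The following Python sketch reproduces the worked example from the text; the function name `assign_process` is an assumption made for illustration.

```python
def assign_process(policy_number, total_processes):
    """Return the process number for a policy: last digit mod process count."""
    last_digit = int(str(policy_number)[-1])
    return last_digit % total_processes

# Worked example from the text: policy number 201702008, 3 processes.
print(assign_process("201702008", 3))  # last digit 8, and 8 % 3 is 2
```

Because the result depends only on the policy number and the process count, every process can apply the same rule independently and still agree on which policies belong to it.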
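In step S203, the policy information to be processed is read with a cursor, cached in a first preset array, and submitted in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
Here, this embodiment uses a cursor to read the policy information to be processed from the Oracle database. Each record read is first cached in the first preset array; when the number of records read reaches a specified threshold, the records read are submitted together as one batch to the corresponding concurrent process, which cleans the data according to the preset data cleaning algorithm. Optionally, the specified threshold may be 5000 records per batch. The specific code is as follows: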
FETCH  c_pol_ind  BULK COLLECT INTO v_pol_ind  LIMIT 5000;
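Each time 5000 records have been read, they are submitted together from the first preset array to the concurrent process for cleaning, which helps reduce I/O port consumption.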
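As a rough analogue of the `FETCH ... BULK COLLECT INTO ... LIMIT 5000` statement above, the Python sketch below reads a cursor in batches of at most 5000 rows. It uses the standard-library `sqlite3` module as a stand-in for Oracle; the table name `pol_ind`, the generated policy numbers, and the helper `read_in_batches` are assumptions for the example.

```python
import sqlite3

BATCH_SIZE = 5000  # the threshold named in the text

def read_in_batches(cursor, batch_size=BATCH_SIZE):
    """Yield lists of at most batch_size rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows

# Build a throwaway in-memory table with 12000 made-up policy numbers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pol_ind (policy_no TEXT)")
conn.executemany("INSERT INTO pol_ind VALUES (?)",
                 [(str(201700000 + i),) for i in range(12000)])

cur = conn.execute("SELECT policy_no FROM pol_ind")
batch_sizes = [len(batch) for batch in read_in_batches(cur)]
print(batch_sizes)  # 12000 rows arrive as batches of 5000, 5000, and 2000
```

Handing rows over in batches rather than one at a time is what reduces the number of round trips between the reader and the cleaning process.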
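In step S103, the cleaned policy information is submitted in batches and stored in the Oracle database.
Optionally, FIG. 3 shows the specific implementation flow of step S103 of the data cleaning method provided by the first embodiment of the present application. Referring to FIG. 3, step S103 includes:
In step S301, the policy information cleaned by the concurrent processes is read out into a second preset array, and the policy information in the second preset array is submitted to the Oracle database in batches by using the commit command.
After the concurrent processes have finished cleaning the data, this embodiment reads the cleaned data out of the concurrent processes and caches it in the second preset array. Likewise, when the number of records read reaches a specified threshold, the data in the second preset array is submitted as one batch to the Oracle database for storage. Optionally, the specified threshold may be 5000 records per batch. Each time 5000 records have been read out in the loop, they are written together from the second preset array to the Oracle database for storage, which further reduces I/O port consumption.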
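The batched commit of step S301 can be sketched in Python as follows, again with the standard-library `sqlite3` module standing in for Oracle. The `result` table, its column names, the sample rows, and the helper `store_in_batches` are assumptions made for illustration.

```python
import sqlite3

def store_in_batches(conn, rows, batch_size=5000):
    """Insert cleaned rows and commit once per batch; return the commit count."""
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]  # plays the role of the second preset array
        conn.executemany(
            "INSERT INTO result (policy_no, proc_date, order_num) VALUES (?, ?, ?)",
            batch)
        conn.commit()                           # one commit per batch, not per row
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result (policy_no TEXT, proc_date TEXT, order_num INTEGER)")

# Made-up cleaned rows: order_num is the last digit of the policy number mod 3.
cleaned = []
for i in range(12000):
    policy_no = str(201700000 + i)
    cleaned.append((policy_no, "2017-04-06", int(policy_no[-1]) % 3))

commit_count = store_in_batches(conn, cleaned)
row_count = conn.execute("SELECT COUNT(*) FROM result").fetchone()[0]
print(commit_count, row_count)
```

With 12000 rows and a batch size of 5000, the sketch commits three times instead of 12000 times.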
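In step S302, the policy information of each batch is stored in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
In this embodiment, the Oracle database includes first-level partitions divided by execution time and second-level partitions divided by process number. The result table includes two fields that determine the partition to which a record belongs, the proc date field and the Order num field: the proc date field indicates the execution time, and the Order num field indicates the process number. Every record read out of a concurrent process carries both the proc date field and the Order num field. When the commit command submits a batch of records to the Oracle database, each record can be stored accurately in the corresponding result table according to its proc date field and Order num field. This embodiment stores the cleaned policy information in partitions, which makes it easy to delete policy information from past execution times and also improves the efficiency of querying the policy information cleaned in the current run.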
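The two-level partitioning can be pictured with a small Python sketch in which each record is routed by execution date and then by process number. The dictionary standing in for the partitioned result table, the key names `proc_date` and `order_num`, the sample rows, and the helpers `store` and `drop_date` are all assumptions; in the embodiment the routing is performed by the Oracle partitions themselves.

```python
from collections import defaultdict

# Hypothetical stand-in: (proc_date, order_num) -> rows in that sub-partition.
partitions = defaultdict(list)

def store(row):
    """Route a row by execution time (first level) and process number (second level)."""
    partitions[(row["proc_date"], row["order_num"])].append(row)

def drop_date(proc_date):
    """Remove every sub-partition that belongs to one execution date."""
    for key in [k for k in partitions if k[0] == proc_date]:
        del partitions[key]

store({"policy_no": "201702008", "proc_date": "2017-04-05", "order_num": 2})
store({"policy_no": "201702010", "proc_date": "2017-04-06", "order_num": 0})
store({"policy_no": "201702011", "proc_date": "2017-04-06", "order_num": 1})

drop_date("2017-04-05")     # discard historical data in one step
print(sorted(partitions))   # only the 2017-04-06 sub-partitions remain
```

Grouping by date first is what makes removing an old run cheap, and grouping by process number second keeps each process's output together for queries on the current run.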
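It should be understood that, in the above embodiment, the sequence numbers of the steps do not imply an order of execution. The order of execution of the steps should be determined by their functions and internal logic, and the numbering should not limit the implementation of the embodiments of the present application in any way.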
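FIG. 4 shows a schematic block diagram of the terminal provided by the second embodiment of the present application. For ease of description, only the parts related to this embodiment are shown.
In this embodiment, the terminal is used to implement the data cleaning method described in any of the embodiments of FIG. 1 to FIG. 3 above, and may be a software unit, a hardware unit, or a unit combining software and hardware. The terminal includes but is not limited to a computer, a server, and the like.
Referring to FIG. 4, the terminal includes:
a task receiving module 41, configured to, when a cleaning task for traditional dividend data is received, insert the cleaning task into a preset task execution table and set the execution time corresponding to the cleaning task;
a concurrent cleaning module 42, configured to, when the execution time arrives, obtain a scheduling package through the Oracle database and perform data cleaning on the policy information in a multi-concurrent manner according to the scheduling package;
a storage module 43, configured to submit the cleaned policy information in batches and store it in the Oracle database.
In this embodiment, the task receiving module 41 obtains the cleaning task for traditional dividend data from a trigger operation performed by the user on a page, inserts the cleaning task into the preset task execution table, and sets the execution time of the task. For example, the task execution table may be the pala_batch_plan table, which includes a scheduled start date field used to control the execution time. While inserting the cleaning task into the pala_batch_plan table, this embodiment modifies the value of the scheduled start date field according to the user's operation to set the execution time of the cleaning task. Once the execution time is set, the cleaning task is allowed to start only when the current time is greater than or equal to the execution time. This makes it easy for the user to schedule cleaning tasks and helps make reasonable use of CPU resources.
When the execution time arrives, the concurrent cleaning module 42 obtains the scheduling package through the Oracle database. The scheduling package contains information related to the execution of the cleaning task, such as parameter information and process-flow information. A number of concurrent processes are then started according to the information in the scheduling package, and the policy information is cleaned by these concurrent processes.
For example, as described above, the concurrent cleaning module 42 may use the execution time in the scheduled start date field: when the execution time arrives, it automatically obtains the scheduling package and starts a number of concurrent processes to execute the cleaning task automatically.
Compared with the prior art, this embodiment optimizes the cleaning flow: data cleaning is performed in a multi-concurrent manner, and each concurrent process automatically selects the policy information it is to process, which ensures that no policy information is processed more than once. The concurrent cleaning module 42 further includes: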
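a starting unit 421, configured to read the concurrency configuration table and start a number of concurrent processes according to the concurrency configuration table;
an allocation unit 422, configured to compute the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes, and to assign the policy information to the concurrent process corresponding to the remainder;
a cleaning unit 423, configured to read the policy information to be processed with a cursor, cache the read policy information in a first preset array, and submit it in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
In this embodiment, the concurrency configuration table is used by business staff to configure the number of concurrent processes. The terminal can start the corresponding number of concurrent processes according to the concurrency configuration table in preparation for data cleaning.
Optionally, in this embodiment the preparation before the concurrency configuration table is read may further include screening the policy information, and the terminal further includes:
a screening module 44, configured to, before the concurrency configuration table is read, obtain a first/second-level configuration table (一二元配置表) and a basic policy information configuration table, and screen the policy information to be cleaned out of the Oracle database according to these two tables; and to read a switch configuration table and remove invalid data from the policy information to be cleaned according to the switch configuration table.
In this embodiment, different policy information may be generated by different sales institutions, which are divided by administrative region into first-level institutions and second-level institutions. The first/second-level configuration table is used to identify which sales institution a piece of policy information belongs to. The basic policy information configuration table records which insurance types in the policy information need data cleaning. This embodiment makes an initial selection of the policy information to be cleaned on the basis of these two tables, which helps reduce cleaning operations on invalid policy information. The two tables also serve as the basis for the subsequent data cleaning.
The switch configuration table is configured dynamically by business staff, who use it to record invalid data. According to the switch configuration table, the terminal picks out of the policy information the invalid data that business staff are not interested in and inserts it into a data statistics table, where it awaits further handling by business staff. This embodiment selects the policy information to be processed on the basis of the first/second-level configuration table, the basic policy information configuration table, and the switch configuration table. Policy information that does not need cleaning is excluded in the preparation stage, which reduces the cleaning workload and further improves the efficiency of data cleaning.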
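Optionally, the preparation after the concurrent processes are started according to the concurrency configuration table may further include a status check, and the terminal further includes:
a status identification module 45, configured to, after the concurrent processes are started according to the concurrency configuration table, obtain the status information of the cleaning task from the log table; if the status information of the cleaning task indicates successful execution, not execute the cleaning task again; and if the status information of the cleaning task indicates failed execution, delete the data already processed by the concurrent processes and re-execute the cleaning task.
Here, this embodiment examines the status information of the cleaning task corresponding to the current execution time. If the cleaning task was executed successfully, it is not executed again; if it failed, the data already processed by the concurrent processes is deleted and control jumps to the allocation unit 422 to re-execute the cleaning task; if it has not yet been executed, control jumps to the allocation unit 422 to execute it. Repeated execution of the cleaning task is thereby avoided, which reduces time cost and CPU resource consumption.
For the policy information to be processed, in this embodiment the allocation unit 422 assigns a processing process to each piece of policy information according to its policy number. First, the allocation unit 422 obtains the policy information to be processed and its policy number; then it computes the remainder of the last digit of the policy number divided by the total number of concurrent processes; finally, it assigns the policy information to the concurrent process whose process number equals the remainder, and that concurrent process is the processing process for the policy information. For example, suppose the policy number currently to be processed is 201702008 and the total number of concurrent processes started is 3, numbered 0, 1, and 2. The remainder of the last digit of the policy number, 8, divided by the total number of concurrent processes, 3, is 2, so the policy information with policy number 201702008 is assigned to the concurrent process with process number 2. A processing process is assigned in the same way to all policy information to be processed. Each piece of policy information thus has a corresponding processing process, which prevents policy information from being cleaned repeatedly and improves the efficiency of data cleaning.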
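As a supplementary illustration of how the last-digit rule spreads work, the Python sketch below counts how many of the ten possible last digits map to each process number. It is only an arithmetic observation about the rule as stated; the function name `digit_load` is hypothetical, and how evenly real policy numbers are spread depends on how their last digits are distributed, which the source does not discuss.

```python
from collections import Counter

def digit_load(total_processes):
    """Count how many of the ten possible last digits map to each process number."""
    return Counter(d % total_processes for d in range(10))

load = dict(digit_load(3))
print(load)  # process 0 receives digits 0, 3, 6, 9; processes 1 and 2 receive three digits each
```

With three processes, process 0 therefore covers four last digits while processes 1 and 2 cover three each.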
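After process allocation is complete, in this embodiment the cleaning unit 423 uses a cursor to read the policy information to be processed from the Oracle database. Each record read is first cached in the first preset array; when the number of records read reaches a specified threshold, the records read are submitted together as one batch to the corresponding concurrent process, which cleans the data according to the preset data cleaning algorithm. Optionally, the specified threshold may be 5000 records per batch. The specific code is as follows: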
FETCH  c_pol_ind  BULK COLLECT INTO v_pol_ind  LIMIT 5000;
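Each time 5000 records have been read, they are submitted together from the first preset array to the concurrent process for cleaning, which reduces I/O port consumption.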
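Further, the storage module 43 further includes:
a submitting unit 431, configured to read the policy information cleaned by the concurrent processes out into a second preset array, and to submit the policy information in the second preset array to the Oracle database in batches by using the commit command;
a storage unit 432, configured to store the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
After the concurrent processes have finished cleaning the data, in this embodiment the submitting unit 431 reads the cleaned data out of the concurrent processes and caches it in the second preset array. Likewise, when the number of records read reaches a specified threshold, the records read are submitted as one batch to the Oracle database for storage. Optionally, the specified threshold may be 5000 records per batch. Each time 5000 records have been read out in the loop, they are written together from the second preset array to the Oracle database for storage, which reduces I/O port consumption.
In this embodiment, the Oracle database includes first-level partitions divided by execution time and second-level partitions divided by process number. The result table includes two fields that determine the partition to which a record belongs, the proc date field and the Order num field: the proc date field indicates the execution time, and the Order num field indicates the process number. Every record read out of a concurrent process carries both the proc date field and the Order num field. When the commit command submits a batch of records to the Oracle database, the storage unit 432 can store each record accurately in the corresponding result table according to its proc date field and Order num field. This embodiment stores the cleaned policy information in partitions, which makes it easy to delete policy information from past execution times and also improves the efficiency of querying the policy information cleaned in the current run.
It should be noted that the terminal in this embodiment may be used to implement all the technical solutions in the foregoing method embodiments. The functions of its functional modules may be implemented according to the methods in the foregoing method embodiments; for the specific implementation process, refer to the related descriptions above, which are not repeated here.
In summary, in the embodiments of the present application, when a cleaning task for traditional dividend data is received, the cleaning task is inserted into a preset task execution table and the execution time corresponding to the cleaning task is set; when the execution time arrives, a scheduling package is obtained through the Oracle database and the policy information is cleaned in a multi-concurrent manner according to the scheduling package; finally, the cleaned policy information is submitted in batches and stored in the Oracle database. This eliminates the steps of converting between file formats, effectively improves the efficiency of data cleaning, and reduces its overall time consumption.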
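To facilitate better implementation of the foregoing method embodiments, the present application further provides a related terminal for carrying them out. FIG. 5 is a schematic block diagram of the terminal provided by the third embodiment of the present application. The terminal as shown may include one or more processors 501 (only one is shown), one or more input devices 502 (only one is shown), one or more output devices 503 (only one is shown), and a memory 504. The processor 501, the input device 502, the output device 503, and the memory 504 are connected by a bus 506. The input device 502 is configured to receive a cleaning task for traditional dividend data; the memory 504 is configured to store computer-readable instructions; the processor 501 is configured to execute the computer-readable instructions stored in the memory to perform the following operations:
when a cleaning task for traditional dividend data is received, inserting the cleaning task into a preset task execution table and setting the execution time corresponding to the cleaning task; when the execution time arrives, obtaining a scheduling package through the Oracle database and performing data cleaning on the policy information in a multi-concurrent manner according to the scheduling package; and submitting the cleaned policy information in batches and storing it in the Oracle database.
Further, performing data cleaning on the policy information in a multi-concurrent manner includes:
reading a concurrency configuration table, and starting a number of concurrent processes according to the concurrency configuration table;
computing the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes, and assigning the policy information to the concurrent process corresponding to the remainder;
reading the policy information to be processed with a cursor, caching the read policy information in a first preset array, and submitting it in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
Further, submitting the cleaned policy information in batches and storing it in the Oracle database includes:
reading the policy information cleaned by the concurrent processes out into a second preset array, and submitting the policy information in the second preset array to the Oracle database in batches by using the commit command;
storing the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
Further, the processor 501 is further configured to:
after the concurrent processes are started according to the concurrency configuration table, obtain the status information of the cleaning task from the log table;
if the status information of the cleaning task indicates successful execution, not execute the cleaning task again;
if the status information of the cleaning task indicates failed execution, delete the data already processed by the concurrent processes and re-execute the cleaning task.
Further, the processor 501 is further configured to:
before the concurrency configuration table is read, obtain a first/second-level configuration table (一二元配置表) and a basic policy information configuration table, and screen the policy information to be cleaned out of the Oracle database according to these two tables;
read a switch configuration table, and remove invalid data from the policy information to be cleaned according to the switch configuration table.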
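It should be understood that, in this embodiment, the processor 501 may be a central processing unit (CPU) and/or a graphics processing unit (GPU), and may additionally be combined with other general-purpose processors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
The input device 502 may include a touchpad, a fingerprint sensor (for collecting the user's fingerprint information and fingerprint direction information), a microphone, a communication module (such as a Wi-Fi module or a 2G/3G/4G network module), physical buttons, and the like.
The output device 503 may include a display (an LCD or the like), a speaker, and the like. The display may be used to display information entered by the user or information provided to the user. The display may include a display panel; optionally, the display panel may be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touchpad may cover the display. When the touchpad detects a touch operation on or near it, it passes the operation to the processor 501 to determine the type of touch event, and the processor 501 then provides the corresponding visual output on the display according to the type of touch event.
In a specific implementation, the processor 501, the input device 502, the output device 503, and the memory 504 described in this embodiment can carry out the implementations described in the embodiments of the data cleaning method provided by the present application, which are not repeated here.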
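A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.
In the several embodiments provided by the present application, it should be understood that the disclosed method, terminal, and computer-readable storage medium may be implemented in other manners. For example, the device embodiments described above are merely illustrative: the division into modules and units is only a division by logical function, and in actual implementation there may be other divisions; multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units and modules in the embodiments of the present application may be integrated into one processing unit, or each unit or module may exist physically on its own, or two or more units or modules may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as standalone products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes any medium that can store computer-readable instructions, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely the specific implementation of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. The protection scope of the present application shall therefore be subject to the protection scope of the claims.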

Claims (20)

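  1. A data cleaning method, characterized in that the method includes:
    when a cleaning task for traditional dividend data is received, inserting the cleaning task into a preset task execution table, and setting the execution time corresponding to the cleaning task;
    when the execution time arrives, obtaining a scheduling package through an Oracle database, and performing data cleaning on policy information in a multi-concurrent manner according to the scheduling package;
    submitting the cleaned policy information in batches and storing it in the Oracle database.
  2. The data cleaning method according to claim 1, characterized in that performing data cleaning on the policy information in a multi-concurrent manner includes:
    reading a concurrency configuration table, and starting a number of concurrent processes according to the concurrency configuration table;
    computing the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes, and assigning the policy information to the concurrent process corresponding to the remainder;
    reading the policy information to be processed with a cursor, caching the read policy information in a first preset array, and submitting it in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
  3. The data cleaning method according to claim 1, characterized in that submitting the cleaned policy information in batches and storing it in the Oracle database includes:
    reading the policy information cleaned by the concurrent processes out into a second preset array, and submitting the policy information in the second preset array to the Oracle database in batches by using the commit command;
    storing the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
  4. The data cleaning method according to claim 2, characterized in that, after the concurrent processes are started according to the concurrency configuration table, the method further includes:
    obtaining status information of the cleaning task from a log table;
    if the status information of the cleaning task indicates successful execution, not executing the cleaning task again;
    if the status information of the cleaning task indicates failed execution, deleting the data already processed by the concurrent processes, and re-executing the cleaning task.
  5. The data cleaning method according to claim 2, characterized in that, before the concurrency configuration table is read, the method further includes:
    obtaining a first/second-level configuration table (一二元配置表) and a basic policy information configuration table, and screening the policy information to be cleaned out of the Oracle database according to the first/second-level configuration table and the basic policy information configuration table;
    reading a switch configuration table, and removing invalid data from the policy information to be cleaned according to the switch configuration table.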
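  6. A terminal, characterized in that the terminal includes:
    a task receiving module, configured to, when a cleaning task for traditional dividend data is received, insert the cleaning task into a preset task execution table and set the execution time corresponding to the cleaning task;
    a concurrent cleaning module, configured to, when the execution time arrives, obtain a scheduling package through an Oracle database and perform data cleaning on policy information in a multi-concurrent manner according to the scheduling package;
    a storage module, configured to submit the cleaned policy information in batches and store it in the Oracle database.
  7. The terminal according to claim 6, characterized in that the concurrent cleaning module includes:
    a starting unit, configured to read a concurrency configuration table and start a number of concurrent processes according to the concurrency configuration table;
    an allocation unit, configured to compute the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes, and to assign the policy information to the concurrent process corresponding to the remainder;
    a cleaning unit, configured to read the policy information to be processed with a cursor, cache the read policy information in a first preset array, and submit it in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
  8. The terminal according to claim 6, characterized in that the storage module includes:
    a submitting unit, configured to read the policy information cleaned by the concurrent processes out into a second preset array, and to submit the policy information in the second preset array to the Oracle database in batches by using the commit command;
    a storage unit, configured to store the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
  9. The terminal according to claim 7, characterized in that the terminal further includes:
    a status identification module, configured to, after the concurrent processes are started according to the concurrency configuration table, obtain status information of the cleaning task from a log table; if the status information of the cleaning task indicates successful execution, not execute the cleaning task again; and if the status information of the cleaning task indicates failed execution, delete the data already processed by the concurrent processes and re-execute the cleaning task.
  10. The terminal according to claim 7, characterized in that the terminal further includes:
    a screening module, configured to, before the concurrency configuration table is read, obtain a first/second-level configuration table (一二元配置表) and a basic policy information configuration table, and screen the policy information to be cleaned out of the Oracle database according to these two tables; and to read a switch configuration table and remove invalid data from the policy information to be cleaned according to the switch configuration table.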
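  11. A computer-readable storage medium storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by a processor, implement the following steps:
    when a cleaning task for traditional dividend data is received, inserting the cleaning task into a preset task execution table, and setting the execution time corresponding to the cleaning task;
    when the execution time arrives, obtaining a scheduling package through an Oracle database, and performing data cleaning on policy information in a multi-concurrent manner according to the scheduling package;
    submitting the cleaned policy information in batches and storing it in the Oracle database.
  12. The computer-readable storage medium according to claim 11, characterized in that performing data cleaning on the policy information in a multi-concurrent manner includes:
    reading a concurrency configuration table, and starting a number of concurrent processes according to the concurrency configuration table;
    computing the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes, and assigning the policy information to the concurrent process corresponding to the remainder;
    reading the policy information to be processed with a cursor, caching the read policy information in a first preset array, and submitting it in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
  13. The computer-readable storage medium according to claim 11, characterized in that submitting the cleaned policy information in batches and storing it in the Oracle database includes:
    reading the policy information cleaned by the concurrent processes out into a second preset array, and submitting the policy information in the second preset array to the Oracle database in batches by using the commit command;
    storing the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
  14. The computer-readable storage medium according to claim 12, characterized in that, after the concurrent processes are started according to the concurrency configuration table, the steps further include:
    obtaining status information of the cleaning task from a log table;
    if the status information of the cleaning task indicates successful execution, not executing the cleaning task again;
    if the status information of the cleaning task indicates failed execution, deleting the data already processed by the concurrent processes, and re-executing the cleaning task.
  15. The computer-readable storage medium according to claim 12, characterized in that, before the concurrency configuration table is read, the steps further include:
    obtaining a first/second-level configuration table (一二元配置表) and a basic policy information configuration table, and screening the policy information to be cleaned out of the Oracle database according to the first/second-level configuration table and the basic policy information configuration table;
    reading a switch configuration table, and removing invalid data from the policy information to be cleaned according to the switch configuration table.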
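  16. A terminal, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor implements the following steps when executing the computer-readable instructions:
    when a cleaning task for traditional dividend data is received, inserting the cleaning task into a preset task execution table, and setting the execution time corresponding to the cleaning task;
    when the execution time arrives, obtaining a scheduling package through an Oracle database, and performing data cleaning on policy information in a multi-concurrent manner according to the scheduling package;
    submitting the cleaned policy information in batches and storing it in the Oracle database.
  17. The terminal according to claim 16, characterized in that performing data cleaning on the policy information in a multi-concurrent manner includes:
    reading a concurrency configuration table, and starting a number of concurrent processes according to the concurrency configuration table;
    computing the remainder of the last digit of the policy number corresponding to the policy information divided by the total number of concurrent processes, and assigning the policy information to the concurrent process corresponding to the remainder;
    reading the policy information to be processed with a cursor, caching the read policy information in a first preset array, and submitting it in batches to the assigned concurrent process, which cleans the policy information according to a preset data cleaning algorithm.
  18. The terminal according to claim 16, characterized in that submitting the cleaned policy information in batches and storing it in the Oracle database includes:
    reading the policy information cleaned by the concurrent processes out into a second preset array, and submitting the policy information in the second preset array to the Oracle database in batches by using the commit command;
    storing the policy information of each batch in the corresponding result table in the Oracle database according to the execution time of the policy information in the batch and the process number of the corresponding concurrent process.
  19. The terminal according to claim 17, characterized in that, after the concurrent processes are started according to the concurrency configuration table, the steps further include:
    obtaining status information of the cleaning task from a log table;
    if the status information of the cleaning task indicates successful execution, not executing the cleaning task again;
    if the status information of the cleaning task indicates failed execution, deleting the data already processed by the concurrent processes, and re-executing the cleaning task.
  20. The terminal according to claim 17, characterized in that, before the concurrency configuration table is read, the steps further include:
    obtaining a first/second-level configuration table (一二元配置表) and a basic policy information configuration table, and screening the policy information to be cleaned out of the Oracle database according to the first/second-level configuration table and the basic policy information configuration table;
    reading a switch configuration table, and removing invalid data from the policy information to be cleaned according to the switch configuration table.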
PCT/CN2018/074858 2017-04-06 2018-01-31 Data cleaning method, terminal, and computer-readable storage medium WO2018184418A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710221427.8A CN107688592B (zh) 2017-04-06 2017-04-06 Data cleaning method and terminal
CN201710221427.8 2017-04-06

Publications (1)

Publication Number Publication Date
WO2018184418A1 true WO2018184418A1 (zh) 2018-10-11

Family

ID=61152355

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/074858 WO2018184418A1 (zh) 2017-04-06 2018-01-31 Data cleaning method, terminal, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN107688592B (zh)
WO (1) WO2018184418A1 (zh)

Also Published As

Publication number Publication date
CN107688592B (zh) 2020-03-17
CN107688592A (zh) 2018-02-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18780838

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC , EPO FORM 1205A DATED 16.01.2020.

122 Ep: pct application non-entry in european phase

Ref document number: 18780838

Country of ref document: EP

Kind code of ref document: A1