CN117992451A

CN117992451A - Single-sub-site scanning data integration method

Info

Publication number: CN117992451A
Application number: CN202410055547.5A
Authority: CN
Inventors: 杲先柱; 苏战营; 李晓军; 连梦真; 曾雨俊
Original assignee: Shanghai Qianzhen Information Technology Co ltd
Current assignee: Shanghai Qianzhen Information Technology Co ltd
Priority date: 2024-01-15
Filing date: 2024-01-15
Publication date: 2024-05-07

Abstract

The invention relates to the technical field of data integration, and in particular to a sub-single-site scanning data integration method. The method comprises: obtaining the various source data; extracting the corresponding data from a limited-period historical data table according to set deduplication cleaning rules and integrating the resulting tables into a sub-single-site frame table; capturing the twice-unloaded data of the same site from the original scanning data table and removing it from the frame table, which after processing yields a repeatable loading and unloading site scanning table and a non-repeatable site scanning table; summarizing these two tables into a site scanning summary table and associating it with a handover list service table; and using the resulting site scanning supplementary handover list table to associate a limited-period historical target table, an allocation or net point clearance time table and a dispatch scanning table, thereby obtaining a target large-width table. Existing reports can then be comprehensively optimized and rebuilt to read already-calculated index data directly from the target large-width table, which releases server resources and improves report calculation speed.

Description

Single-sub-site scanning data integration method

Technical Field

The invention relates to a sub-single-site scanning data integration method, and belongs to the field of data integration.

Background

At present, reports in the data warehouse are developed by deduplicating the data captured directly from the scanning tables and then associating or aggregating it to obtain the report indexes. Because the deduplication rules applied to scanning data have never been unified and common indexes cannot be reused, a large amount of report code in the data warehouse repeats the same calculations. Server resources have become increasingly strained, competition for resources among jobs makes report calculation slower and slower, and the servers cannot run additional reports, so new servers have to be requested. In view of this situation, a cleaned sub-single-site scanning large-width table is urgently needed to eliminate the repeated deduplication and the repeated calculation of identical indexes.

Summary of the invention

Aiming at the problems in the prior art, the invention provides a sub-single-site scanning data integration method.

The technical scheme adopted for solving the technical problems is as follows:

The invention provides a sub-single-site scanning data integration method, which comprises the following steps:

Obtaining various data to obtain corresponding tables, including an original scanning data table, a limited period history target table, a limited period list information table and a routing information table;

preprocessing an original scanning data table to obtain a limited-period historical data table with a back calculation period of 21 days;

Extracting corresponding data from the limited-period historical data table according to set deduplication cleaning rules to obtain corresponding tables, including a loading and sending scanning table, an unloading and arrival scanning table, a collection and sorting scanning table, an abnormal scanning table and a dispatch scanning table, and integrating the obtained tables to obtain a sub-single-site frame table;

Capturing the twice-unloaded data of the same site from the original scanning data table to obtain the corresponding tables, and associating the unloading data with the process data tables to obtain a repeatable loading and unloading site scanning table;

Removing the twice-unloaded data from the sub-single-site frame table to obtain a non-repeatable site frame table, and left-joining the non-repeatable site frame table to the process data tables to obtain a non-repeatable site scanning table;

Summarizing the repeatable loading and unloading site scanning table and the non-repeatable site scanning table to obtain a site scanning summary table, and associating the site scanning summary table with a handover list service table to supplement handover list information, obtaining a site scanning supplementary handover list table;

Using the site scanning supplementary handover list table to associate the limited-period historical target table, an allocation or net point clearance time table and the dispatch scanning table to obtain a target large-width table.

Further, obtaining various data to obtain the corresponding tables includes:

acquiring original scanning data to obtain an original scanning data table;

acquiring data from the last 50 days to the last 21 days in the historical target large-width table to obtain a limited-period historical target table;

acquiring, from the list information table, the data of the last half year that has scanning records within the last 21 days, to obtain a limited-period list information table;

Acquiring main route or standby route information to obtain a route information table;

obtaining, by self-association (self-join) of the original scanning data table, the data of sub-lists unloaded twice at the same site, to obtain a same-site twice-unloading data table;

Acquiring handover list service data to obtain a handover list service table;

and acquiring allocation or site clearance time data to obtain an allocation or site clearance time table.

Further, preprocessing the original scanning data table to obtain a limited-period historical data table with a back calculation period of 21 days includes:

Excluding, from the full scanning data of the last 21 days in the original scanning data table, the site data whose repeatable mark is "no" or whose same-site count is 2 in the limited-period historical target table, to obtain the limited-period historical data table.

Further, capturing the twice unloading data of the same station from the original scanning data table, and obtaining the corresponding table includes:

Capturing, from the original scanning data table, the data of the sub-lists and sites that exist in the same-site twice-unloading data table, to obtain a same-site twice-scanning detail table;

The association of the unloading data with the process data table to obtain the repeatable loading and unloading site scanning table comprises the following steps:

Associating the same-site twice-unloading data table with the same-site twice-scanning detail table, the limited-period list information table and the routing information table according to the set information retention rules, to obtain the repeatable loading and unloading site scanning table.

Further, the process data tables that are left-joined to the non-repeatable site frame table comprise: the loading and sending scanning table, the unloading and arrival scanning table, the collection and sorting scanning table, the abnormal scanning table, the limited-period list information table and the routing information table.

Further, extracting corresponding data from the limited-period historical data table according to the set deduplication cleaning rules to obtain the corresponding tables includes:

deduplicating the loading and sending scanning data in the limited-period historical data table by sub-list and site, keeping the latest record, to obtain the loading and sending scanning table;

deduplicating the unloading and arrival scanning data in the limited-period historical data table by sub-list and site, keeping the earliest record, to obtain the unloading and arrival scanning table;

taking the collection scanning data out of the limited-period historical data table and deduplicating it by sub-list and site, keeping the earliest record, to obtain the collection and sorting scanning table;

taking the dispatch scanning data out of the limited-period historical data table, keeping the earliest record per main list, and preferring dispatch records at the signing site, to obtain the dispatch scanning table;

and taking the abnormal warehouse-in and warehouse-out scanning data out of the limited-period historical data table and deduplicating it by sub-list and site, keeping the earliest record, to obtain the abnormal scanning table.

Further, the data dimensions of the limited-period historical target table comprise main list, sub-list, site, repeatable mark, site sequence, loading handover list, same-site count and reverse site sequence;

the data dimension of the limited-period list information table comprises a main list, and the specified next station index of each station of the scanning data is calculated through the initial allocation and the target allocation in the main list;

The data dimensions of the limited-period historical data table comprise main list, sub-list, site and scanning type;

the data dimension of the handover list service table comprises various actual arrival and departure times of the vehicle and arrival and departure types;

the data dimension of the two-time unloading data table at the same site comprises a main sheet, a sub sheet, a site, a scanning type, a current site type, a warehousing time, a scanner, a signing type and a signing type; the scanning types of the two-time scanning list at the same site comprise loading, collecting and dragging, sorting, abnormal warehousing and abnormal ex-warehouse;

Each sub-list has two records in the same-site twice-unloading data table, with fields including the first unloading scanning time uld_tm1, the second unloading scanning time uld_tm2 and the unloading count uld_cnt.

Further, in the repeatable loading and unloading site scanning table:

The same-site twice-unloading data table is associated with the data of each scanning type in the same-site twice-scanning detail table. Detail records whose scanning time is earlier than the second unloading time uld_tm2 are associated with the uld_cnt=1 record and form the detailed scanning information of the sub-list's first visit to the site; detail records whose scanning time is greater than or equal to uld_tm2 are associated with the uld_cnt=2 record and form the detailed scanning information of its second visit;

The same-site twice-unloading data table is further associated with the limited-period list information table and the routing information table to obtain the specified next-site index.
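The split around uld_tm2 can be illustrated with a minimal Python sketch. The field names uld_tm2 and uld_cnt come from the text; the function name, the dictionary layout and the scan_tm and scan_type keys are assumptions made only for illustration, not the actual implementation of the method.

```python
from datetime import datetime


def split_visits(scans, uld_tm2):
    """Tag each scan of one sub-list at one site with its visit number.

    Scans strictly before the second unloading time uld_tm2 belong to the
    first visit (uld_cnt = 1); scans at or after it belong to the second
    visit (uld_cnt = 2).
    """
    return [{**s, "uld_cnt": 1 if s["scan_tm"] < uld_tm2 else 2} for s in scans]


# Hypothetical scans of one sub-list at one site (illustrative data only).
scans = [
    {"scan_type": "unload", "scan_tm": datetime(2024, 1, 1, 8)},
    {"scan_type": "load", "scan_tm": datetime(2024, 1, 1, 12)},
    {"scan_type": "unload", "scan_tm": datetime(2024, 1, 3, 9)},
    {"scan_type": "load", "scan_tm": datetime(2024, 1, 3, 15)},
]
tagged = split_visits(scans, uld_tm2=datetime(2024, 1, 3, 9))
```

In this sketch the second unloading scan itself falls into the second visit, because the comparison with uld_tm2 is "greater than or equal to", as stated in the text.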

Further, in the site scanning supplementary handover list information table:

The loading handover list and the unloading handover list in the site scanning summary table are each associated with the handover list service table to supplement, for each sub-list at each site, the actual entry and exit times, the entry and exit weighbridge records and the entry and exit card-swipe records; the sites are ordered by the earliest entry time field to obtain the field site_seq; and the actual previous site, the actual next site, the previous site's loading handover list and the next site's unloading handover list are calculated for each sub-list at each site.
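As a rough illustration of how site_seq and the actual previous and next sites could be derived from the earliest entry time, consider the following Python sketch. Only the field name site_seq is taken from the text; the function name and the keys site, in_tm, prev_site and next_site are hypothetical.

```python
from datetime import datetime


def add_site_seq(visits):
    """Order one sub-list's site visits by earliest entry time.

    Returns new records carrying a 1-based site_seq plus the actual
    previous and next site (None at either end of the route).
    """
    ordered = sorted(visits, key=lambda v: v["in_tm"])
    result = []
    for i, v in enumerate(ordered):
        result.append({
            **v,
            "site_seq": i + 1,
            "prev_site": ordered[i - 1]["site"] if i > 0 else None,
            "next_site": ordered[i + 1]["site"] if i + 1 < len(ordered) else None,
        })
    return result


# Hypothetical visits of one sub-list, deliberately out of order.
visits = [
    {"site": "B", "in_tm": datetime(2024, 1, 2, 6)},
    {"site": "A", "in_tm": datetime(2024, 1, 1, 20)},
    {"site": "C", "in_tm": datetime(2024, 1, 3, 7)},
]
seq = add_site_seq(visits)
```

The previous-site loading handover list and next-site unloading handover list could be carried along in the same way, by reading the corresponding field from the neighbouring record instead of the site.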

Further, in the target large width table:

The site scanning supplementary handover list table is associated with the data of the last site of each sub-list in the limited-period historical target table; the site sequence of each sub-list in the limited-period historical target table is added to site_seq of the current batch; the last site of each sub-list in the limited-period historical target table and its loading handover list are used to supplement the actual previous site and previous-site loading handover list fields in the site scanning supplementary handover list table; the original-bill return information is associated to supplement the original-bill return time; the allocation or net point clearance time table is associated and the specified allocation transfer time is calculated according to the business algorithm; and the dispatch scanning table is associated to supplement the actual next-site index of the sub-list at the site.

The invention has the beneficial effects that:

1. Existing reports are comprehensively optimized. Reports that run for a long time and were developed directly on the original scanning table are identified from the job run history; their deduplication code and repeated index calculations can be removed, and they can be rebuilt to read the already-calculated index data directly from the target large-width table, which releases server resources and improves report calculation speed.

2. The site-integrated large-width table not only optimizes historical reports but also facilitates subsequent work such as developing new reports and extracting data. When the same index is spread across different reports, logic changes become simple: instead of modifying each report one by one as at present, only the site scanning integrated large-width table needs to be modified in one place, which greatly improves data accuracy and speed.

Drawings

Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:

FIG. 1 is a logic diagram of a prior art data cleansing;

FIG. 2 is a schematic diagram of a sub-single site scan data integration method according to the present invention;

FIG. 3 is a logic diagram of obtaining the large-width table dwd_site_scan_dtl from the original scanning data table in the sub-single-site scanning data integration method according to the present invention;

Detailed Description

To make the technical means, inventive features, objectives and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.

Please refer to FIG. 1, a logic diagram of data cleaning in the prior art. With this approach, the original scanning data has to be deduplicated, cleaned and used for index calculation separately for each target report in order to produce that report. This way of cleaning data has the following problems:

1. The original deduplication method for site scanning data is not accurate enough: repeated loading and unloading scans of a sub-list at the same site cannot be deduplicated accurately.

2. The original site scanning cleaning and the signing cleaning produce two separate tables; the code is redundant, and the tables must be combined again whenever they are used, which is inefficient.

3. The original site scanning data lacks information such as collection, sorting, abnormal scanning and original-bill return, which still has to be associated and supplemented at the time of use, wasting time, effort and resources.

4. The original scanning data contains duplicates and the previous-site and next-site information is inaccurate, so every report development has to deduplicate and correct the data, wasting development time and server resources.

5. Many business indexes are calculated repeatedly, so code and data are heavily duplicated; once the logic of an index changes, many pieces of code have to be changed, which is laborious and prone to omissions.

As a result, the data obtained contains many duplicates and occupies a large amount of memory, and both data accuracy and ease of operation are unsatisfactory.

Referring to FIG. 2, to address the above technical problems, the original scanning data is cleaned according to set, well-defined deduplication rules to obtain a target large-width table dwd_site_scan_dtl, from which each target report can extract the data it needs. The unified target large-width table dwd_site_scan_dtl is built with multi-step deduplication, so the resulting data is accurate; it also integrates supplementary information such as collection, sorting, abnormal scanning and original-bill return. It contains no repeated redundant data and occupies limited memory, so server resources can be released and report calculation speed improved, and all data is presented clearly and accurately in the target large-width table dwd_site_scan_dtl;

Besides optimizing historical reports, the integrated target large-width table also facilitates subsequent work such as developing new reports and extracting data. When the same index is spread across different reports, logic changes are simpler: the reports no longer need to be modified one by one as before, and only the site scanning integrated target large-width table dwd_site_scan_dtl needs to be modified in one place, which greatly improves data accuracy and speed.

Referring to fig. 3, in an embodiment of the present disclosure, a method for integrating sub-single-site scan data includes the following steps:

S1: acquiring various data to obtain the corresponding tables;

The method specifically comprises the following steps: acquiring original scanning data to obtain an original scanning data table table_original;

acquiring the data from the last 50 days to the last 21 days in the historical target large-width table dwd_site_scan_dtl, including main list, sub-list, site, repeatable mark, site sequence, loading handover list, same-site count and reverse site sequence, with the site sequence arranged in reverse order, to obtain a limited-period historical target table table_his;

The data between the last 50 days and the last 21 days is selected to obtain data in a relatively short period of time, so that the recent situation can be better reflected instead of the long-term trend, and the 'site order' is inverted to meet the specific viewing or analysis requirement.

Acquiring, from the list information table, the data of the last half year that has scanning records within the last 21 days. The purpose is to obtain timely data: data from the last half year reflects the market or business changes of that period, while the requirement of scanning records in the last 21 days emphasizes that the data is active and that the business activity is current;

The data dimension is the main list, and a limited-period list information table table_bill is obtained. This table is mainly used to calculate, from its origin allocation and destination allocation fields, the specified next-site index for each site of the scanning data. The origin allocation and the destination allocation are terms describing the starting point and the end point of a route. From these two fields, the flow and specified path between sites can be calculated, giving the specified next-site index for each site. These indexes may include, but are not limited to, the expected arrival time at the next site, the cargo volume of the next site and the abnormal conditions of the next site;

For example, the limited-period list information table table_bill contains the origin allocation and destination allocation of each shipment. By analyzing this data, it is possible to calculate to which center each distribution center should send a shipment, and the expected arrival time. For instance, distribution center A should send the shipment to center B next because, according to the data, center B is the specified next site for center A.

Such analysis helps to better manage the transport network, optimize transport paths, improve transport efficiency, and ensure that the transport can be delivered on time.

Obtaining main route or standby route information to obtain a routing information table (table_rtng); the main route refers to the primary transmission path, while the standby route is an alternative transmission path.

Data of two unloadings at the same site is obtained from the original scanning data table table_original by self-association (self-join), so as to find the records that meet specific matching conditions. Unloading scan data is obtained for sub-lists that are loaded and unloaded, or unloaded, more than once at the same site. For a sub-list with multiple loading and unloading operations at a site, the two unloading scan times of the same sub-list at the same site are taken as the boundaries of the self-association, which helps locate the related records more accurately. Between the two unloading scan times there must be unloading scan data from another site, or scan data from unloading through to dispatch; this further screens for unloading events that have other activity records within the specific time range. The result is the same-site twice-unloading data table table_repeat_frame, in which each sub-list at the site has a first unloading record and a second unloading record, with the fields uld_tm1 (first unloading scan time), uld_tm2 (second unloading scan time) and uld_cnt (unloading count). This facilitates subsequent analysis and data integration.
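The self-association can be sketched in Python as a nested comparison of unloading scans. This is a simplified illustration under stated assumptions: the keys sub_no, site, scan_type and scan_tm and the function name are hypothetical, and only the "unloaded at another site in between" condition is checked, not the alternative condition involving dispatch scans.

```python
from datetime import datetime


def find_repeat_unloads(scans):
    """Self-join unload scans to find sub-lists unloaded twice at one site.

    A pair of unload scans (earlier, later) of the same sub-list at the same
    site qualifies only if the sub-list was also unloaded at a different
    site strictly between the two scan times.
    """
    unloads = [s for s in scans if s["scan_type"] == "unload"]
    rows = []
    for a in unloads:
        for b in unloads:
            same_key = a["sub_no"] == b["sub_no"] and a["site"] == b["site"]
            if not (same_key and a["scan_tm"] < b["scan_tm"]):
                continue
            went_elsewhere = any(
                o["sub_no"] == a["sub_no"]
                and o["site"] != a["site"]
                and a["scan_tm"] < o["scan_tm"] < b["scan_tm"]
                for o in unloads
            )
            if went_elsewhere:
                rows.append({
                    "sub_no": a["sub_no"],
                    "site": a["site"],
                    "uld_tm1": a["scan_tm"],
                    "uld_tm2": b["scan_tm"],
                })
    return rows


# Hypothetical data: S1 returns to site A after visiting B; S2 is merely
# scanned twice at A without leaving, so it does not qualify.
scans = [
    {"sub_no": "S1", "site": "A", "scan_type": "unload", "scan_tm": datetime(2024, 1, 1, 8)},
    {"sub_no": "S1", "site": "B", "scan_type": "unload", "scan_tm": datetime(2024, 1, 2, 8)},
    {"sub_no": "S1", "site": "A", "scan_type": "unload", "scan_tm": datetime(2024, 1, 3, 8)},
    {"sub_no": "S2", "site": "A", "scan_type": "unload", "scan_tm": datetime(2024, 1, 1, 8)},
    {"sub_no": "S2", "site": "A", "scan_type": "unload", "scan_tm": datetime(2024, 1, 1, 9)},
]
table_repeat_frame = find_repeat_unloads(scans)
```

The sketch returns one row per qualifying pair; a sub-list with more than two unloadings at the same site would yield several pairs and would need an additional rule to pick the two records the text describes.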

Capturing handover list service data to obtain a handover list service table table_hdvr, which mainly contains the various actual entry and exit times and entry and exit types of vehicles; this information helps in understanding vehicle transport status and transport efficiency and in optimizing the transport process;

Allocation or net point clearance time data is obtained from the relevant data sources; it usually records the clearance time of each distribution center or net point, and yields an allocation or net point clearance time table table_site_frqc. Transport plans, staffing, resources and so on can be arranged according to these times to ensure efficient operation of transport and storage. Regularly updating and maintaining table_site_frqc keeps the data accurate and timely, so that business needs are better met.

S2: preprocessing an original scanning data table to obtain a limited-period historical data table with a back calculation period of 21 days;

The method specifically comprises the following steps: from the full scanning data of the last 21 days in table_original, the data of sites whose repeatable mark in table_his is "no" or whose same-site count is 2 is excluded, so that only data that is not marked non-repeatable in the history data and whose same-site count is not 2 is retained. This yields the limited-period historical data table table_scan, holding the scanning data within the 21-day back calculation period, with the data dimensions main list, sub-list, site and scanning type, and realizes the first data deduplication cleaning.

Suppose table_original contains a record of a sub-list at site A, flagged as repeatable with a site count of 3, while in the history table table_his the same sub-list at site A is flagged as repeatable with a same-site count of 2. According to the above rule, this record is excluded from table_scan because the same-site count in table_his is 2. Other eligible records are kept in table_scan, and each record in that table is unique. This realizes the first data deduplication cleaning.

The table table_original contains the full scanning data of the last 21 days. The data can come from many sites, and each site can have multiple scan records. The table table_his records the earlier history data; one of its fields is the repeatable mark, which indicates whether the data can be repeated, and another is the same-site count. The aim is to exclude from table_original the site data whose repeatable mark in table_his is "no" or whose same-site count is 2, which yields table_scan, containing only the scanning data within the 21-day back calculation period. Setting the back calculation period of site scanning data to 21 days allows loading and unloading data to be combined over the widest range, and ensures accurate deduplication of the full data when a site has repeated loading and unloading scans.
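The exclusion rule amounts to an anti-join of table_original against the flagged rows of table_his. A minimal Python sketch follows; it assumes table_original has already been restricted to the last 21 days, and the keys sub_no, site, repeatable and same_site_cnt as well as the function name are hypothetical names chosen for illustration.

```python
def build_table_scan(table_original, table_his):
    """Drop scans whose (sub-list, site) pair is flagged in table_his.

    A pair is flagged when its repeatable mark is "no" or its same-site
    count is 2; all remaining scans form table_scan.
    """
    excluded = {
        (h["sub_no"], h["site"])
        for h in table_his
        if h["repeatable"] == "no" or h["same_site_cnt"] == 2
    }
    return [s for s in table_original if (s["sub_no"], s["site"]) not in excluded]


# Hypothetical history rows and scans (illustrative data only).
table_his = [
    {"sub_no": "S1", "site": "A", "repeatable": "yes", "same_site_cnt": 2},
    {"sub_no": "S2", "site": "A", "repeatable": "no", "same_site_cnt": 1},
    {"sub_no": "S3", "site": "A", "repeatable": "yes", "same_site_cnt": 1},
]
table_original = [
    {"sub_no": "S1", "site": "A", "scan_type": "unload"},
    {"sub_no": "S2", "site": "A", "scan_type": "unload"},
    {"sub_no": "S3", "site": "A", "scan_type": "unload"},
    {"sub_no": "S4", "site": "B", "scan_type": "load"},
]
table_scan = build_table_scan(table_original, table_his)
```

Scans with no matching row in table_his, such as S4 at site B in this example, are kept, since nothing in the history flags them.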

S3: extracting corresponding data from the limited-period historical data table table_scan according to the set deduplication cleaning rules to obtain the corresponding tables, namely a loading and sending scanning table table_lod, an unloading and arrival scanning table table_uld, a collection and sorting scanning table table_cotn_sotn, an abnormal scanning table table_exception and a dispatch scanning table table_delv_rcv, and integrating these tables to obtain a sub-single-site frame table;

The method comprises the following steps:

The loading and sending scanning data in the scanning data table table_scan within the 21-day back calculation period is deduplicated by sub-list and site, keeping the latest record, to obtain the loading and sending scanning table table_lod;

The latest record is kept in order to obtain the most recent scanning information. In logistics or transportation, goods are often scanned several times, and each scan records the time and state at that moment; keeping the latest record gives the latest state of the goods, for example whether they have reached the destination or been signed for by the customer. This helps the logistics company keep track of the goods in a timely manner and provide more accurate service to customers.

The unloading and arrival scanning data in table_scan within the 21-day back calculation period is deduplicated by sub-list and site, keeping the earliest record, to obtain the unloading and arrival scanning table table_uld;

For the unloading and arrival scanning data of each sub-list at each site, only the earliest record is needed to know whether the goods of the sub-list have arrived at the site and been unloaded or scanned on arrival. The earliest record represents the earliest arrival time of the goods, which helps the logistics company learn of arrivals in time and better track the movement of the goods.
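The two deduplication rules above (latest record for loading, earliest record for unloading, per sub-list and site) can be sketched with one small Python helper. The function name and the keys sub_no, site, scan_type and scan_tm are assumptions for illustration; only the table names table_lod and table_uld come from the text.

```python
from datetime import datetime


def dedup(scans, scan_type, keep):
    """Keep one record per (sub-list, site) for the given scan type.

    keep="latest" keeps the record with the greatest scan_tm;
    keep="earliest" keeps the record with the smallest scan_tm.
    """
    if keep not in ("latest", "earliest"):
        raise ValueError("keep must be 'latest' or 'earliest'")
    best = {}
    for s in scans:
        if s["scan_type"] != scan_type:
            continue
        key = (s["sub_no"], s["site"])
        cur = best.get(key)
        if cur is None:
            best[key] = s
        elif keep == "latest" and s["scan_tm"] > cur["scan_tm"]:
            best[key] = s
        elif keep == "earliest" and s["scan_tm"] < cur["scan_tm"]:
            best[key] = s
    return list(best.values())


# Hypothetical scans of one sub-list at one site.
table_scan = [
    {"sub_no": "S1", "site": "A", "scan_type": "unload", "scan_tm": datetime(2024, 1, 1, 8)},
    {"sub_no": "S1", "site": "A", "scan_type": "unload", "scan_tm": datetime(2024, 1, 1, 9)},
    {"sub_no": "S1", "site": "A", "scan_type": "load", "scan_tm": datetime(2024, 1, 1, 10)},
    {"sub_no": "S1", "site": "A", "scan_type": "load", "scan_tm": datetime(2024, 1, 1, 12)},
]
table_lod = dedup(table_scan, "load", keep="latest")
table_uld = dedup(table_scan, "unload", keep="earliest")
```

The same helper would cover the collection (earliest), sorting (latest) and abnormal scanning (earliest) rules by changing the scan type and the keep argument.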

The collection scanning data is taken out of table_scan within the 21-day back calculation period and deduplicated by sub-list and site, keeping the earliest record; the sorting data is taken out of table_scan and deduplicated by sub-list and site, keeping the latest record; together these give the table table_cotn_sotn;

For the collection scanning and sorting data of each sub-list at each site, one earliest and one latest record are needed to know the state and time of the sub-list's goods during sorting and transport. The earliest collection scan record may indicate when the shipment begins, while the latest sorting record may indicate when the shipment is complete.

The dispatch scanning data is taken out of table_scan within the 21-day back calculation period, the earliest record is kept per main list, and dispatch records at the signing site are preferred, to obtain the table table_delv_rcv;

In logistics or transportation, the dispatch and signing scanning data of each main list has to be reduced to the earliest time point, with dispatch data at the signing site taken preferentially. This shows the dispatch and signing order and time of the goods of each main list, as well as the dispatch data at the signing site, so that the dispatch and signing states and times are known and the delivery progress of the goods can be tracked.
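One possible reading of "earliest per main list, preferring the signing site" is a two-level ranking: records at the signing site rank ahead of all others, and ties are broken by scan time. The following Python sketch illustrates that reading only; the function name, the keys main_no, site and scan_tm, and the sign_site_by_main lookup are hypothetical.

```python
from datetime import datetime


def pick_dispatch(dispatch_scans, sign_site_by_main):
    """Keep one dispatch record per main list.

    Records scanned at the main list's signing site are preferred; among
    equally preferred records the earliest scan_tm wins.
    """
    best = {}
    for s in dispatch_scans:
        main = s["main_no"]
        # False sorts before True, so signing-site records rank first.
        rank = (s["site"] != sign_site_by_main.get(main), s["scan_tm"])
        if main not in best or rank < best[main][0]:
            best[main] = (rank, s)
    return {main: rec for main, (rank, rec) in best.items()}


# Hypothetical data: M1 was signed at site Y; M2 has no known signing site.
dispatch_scans = [
    {"main_no": "M1", "site": "X", "scan_tm": datetime(2024, 1, 1, 9)},
    {"main_no": "M1", "site": "Y", "scan_tm": datetime(2024, 1, 1, 10)},
    {"main_no": "M2", "site": "Z", "scan_tm": datetime(2024, 1, 1, 8)},
    {"main_no": "M2", "site": "Z", "scan_tm": datetime(2024, 1, 1, 7)},
]
picked = pick_dispatch(dispatch_scans, {"M1": "Y"})
```

For M1 the later record at site Y is chosen over the earlier one at site X because it was scanned at the signing site; for M2, where no signing site is known, the rule falls back to the earliest record.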

The abnormal warehouse-in and warehouse-out scanning data is taken out of table_scan within the 21-day back calculation period and deduplicated by sub-list and site, keeping the earliest record, to obtain the table table_exception;

In logistics or transportation, abnormal warehouse-in and warehouse-out scanning data generally records the time and state at which goods enter and leave the warehouse at a given site, so that abnormal conditions of the goods can be better understood. Keeping the earliest record avoids recording the same anomaly repeatedly and provides more accurate information.

In the above steps the data is classified and each kind of valid data is taken out after accurate deduplication, so that valid information is obtained from a large amount of related information and a data basis is provided for the subsequent data integration.

The data of tables table_lod, table_uld, table_cotn_sotn and table_delv_rcv is integrated to form the sub-single-site frame table table_site_frame: the data is grouped by fields such as main list, sub-list, site, scanning type and current site type; for fields such as warehouse-in time and scanner the earliest value is taken (dispatch and signing scans only), and for the signing type the latest value is taken (signing scans only).

In the logistics process, the warehouse-in time is the time at which goods enter the warehouse, and the scanner is the staff member who performs the warehouse-in operation. Taking the earliest value means finding the record with the earliest warehouse-in time and its scanner, which gives the earliest warehouse-in information of the goods: when they entered the warehouse and who performed the operation;

The signing type refers to the way in which the recipient or a designated agent signs for the goods after they arrive at the destination. Taking the latest value means finding the most recently recorded signing type. In logistics or transportation, knowing the latest signing type helps track the transport state of the goods and confirm whether they were signed for successfully;

S4: Capture the data of sub-orders unloaded twice at the same site from the original scan data table to obtain the corresponding table, and associate that data with the process data tables to obtain the repeatable loading and unloading site scan table;

The method specifically comprises the following steps:

From table_original, the data of the sub-orders and sites present in the same-site twice-unloading table table_repeat_frame is captured to obtain the same-site twice-scan detail table table_repeat_scan, whose scan types include loading, consolidation, sorting, abnormal warehouse entry, abnormal warehouse exit and the like;

Specifically, the same-site twice-scan detail table table_repeat_scan is obtained as follows:

S41, acquire the original scan data for the 21-day data-cleaning look-back period: the original scan data is first obtained from the logistics company's database or data tables. Such data may include information on the goods, scan time, scanner, site, etc.

S42, process the data to obtain table_original: processing the raw scan data yields a table named table_original. This table contains all the raw scan data, with each row representing one scan event.

S43, obtain the same-site twice-unloading data by self-join: within table_original, a self-join on site and unloading scan time finds the sub-orders that were unloaded twice at the same site, that is, goods that were unloaded twice at one site. This step yields a table named table_repeat_frame.

S44, determine the loading and unloading rounds of the sub-order at the site: where a sub-order is loaded and unloaded several times at a site, the scan times of the two unloading scans of the same sub-order at the same site are taken as the boundary. This means that between the two unloading scan times there must be an unloading scan at another site, or unloading-to-dispatch scan data.

S45, capture the data of the sub-orders and sites present in table_repeat_frame: finally, the data of the sub-orders and sites that exist in table_repeat_frame is captured from table_original. This results in a table named table_repeat_scan. The data in this table includes sub-order information, site information, and the associated scan types such as loading, consolidation, sorting, abnormal warehouse entry, abnormal warehouse exit, etc.

The main purpose of this process is to identify, within a large volume of scan data, the sub-orders and sites with repeated unloading or loading operations, so that the transport and storage of goods can be better understood and analyzed.
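
Steps S43 to S45 can be sketched as follows. This is a minimal illustration, not the patented implementation: the column names sub_no, scan_type and scan_tm, the scan-type codes "uld", "lod" and "sort", and the sample rows are hypothetical, and the boundary condition of S44 is simplified to "an unloading scan at another site lies between the two unloading scans".

```python
def find_repeat_frame(table_original):
    """S43/S44: self-join the unloading scans on (sub_no, site).

    A pair of unloading scans qualifies only if the sub-order was unloaded
    at a different site in between (simplified reading of S44).
    """
    unloads = sorted((r for r in table_original if r["scan_type"] == "uld"),
                     key=lambda r: r["scan_tm"])
    frame = {}
    for first in unloads:
        for second in unloads:
            key = (first["sub_no"], first["site"])
            if key != (second["sub_no"], second["site"]):
                continue
            if first["scan_tm"] >= second["scan_tm"] or key in frame:
                continue
            went_elsewhere = any(
                r["scan_type"] == "uld"
                and r["sub_no"] == key[0]
                and r["site"] != key[1]
                and first["scan_tm"] < r["scan_tm"] < second["scan_tm"]
                for r in table_original
            )
            if went_elsewhere:
                frame[key] = {"sub_no": key[0], "site": key[1],
                              "uld_tm1": first["scan_tm"],
                              "uld_tm2": second["scan_tm"]}
    return list(frame.values())

def grab_repeat_scan(table_original, table_repeat_frame):
    """S45: keep every scan whose (sub_no, site) appears in the frame."""
    keys = {(f["sub_no"], f["site"]) for f in table_repeat_frame}
    return [r for r in table_original if (r["sub_no"], r["site"]) in keys]

def row(sub_no, site, scan_type, scan_tm):
    return {"sub_no": sub_no, "site": site,
            "scan_type": scan_type, "scan_tm": scan_tm}

table_original = [
    row("A", "B", "uld", "2024-01-02 08:00"),
    row("A", "B", "lod", "2024-01-02 09:00"),
    row("A", "C", "uld", "2024-01-02 12:00"),
    row("A", "C", "lod", "2024-01-02 13:00"),
    row("A", "B", "uld", "2024-01-02 15:00"),   # back at site B
    row("A", "B", "sort", "2024-01-02 16:00"),
    row("X", "B", "uld", "2024-01-02 08:00"),
    row("X", "B", "uld", "2024-01-02 08:05"),   # duplicate scan, not a return
]
table_repeat_frame = find_repeat_frame(table_original)
table_repeat_scan = grab_repeat_scan(table_original, table_repeat_frame)
```

Sub-order A qualifies at site B because it was unloaded at site C between its two unloading scans at B, whereas sub-order X only has a duplicate unloading scan and is left for ordinary deduplication.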

With table_repeat_frame as the base frame, the different scan types in table_repeat_scan (loading, consolidation, sorting, abnormal warehouse entry, abnormal warehouse exit, and so on) are associated with it separately, giving the detailed information of each sub-order at the site under each scan type. Rows of table_repeat_scan whose scan time is earlier than the second unloading time uld_tm2 and whose uld_cnt (unloading count) equals 1 are selected; these represent the sub-order's first round of detailed scan information at the site. Rows of table_repeat_scan whose scan time is greater than or equal to uld_tm2 and whose uld_cnt equals 2 are likewise selected; these represent the second round of detailed scan information at the site. Finally, the resulting data is associated with the order information and the route information to obtain the prescribed-next-site indicators of each sub-order at the site, including the expected arrival time, the expected departure time, the next-site status, etc. The result is the repeatable loading and unloading site scan table table_repeat_dtl.

Assume a logistics company's data that contains information on several sub-orders at a site under different scan types. First, table_repeat_frame is used as the base frame and is then associated with table_repeat_scan by scan type (e.g., loading, consolidation, sorting, etc.).

Next, the first-round and second-round detailed scan records are selected from table_repeat_scan. For example:

The first scan of sub-order A at site B is at 9 a.m.; the scan type is loading, and this is the first scan of the sub-order at that site.

The second scan of sub-order A at site B is at 10 a.m.; the scan type is sorting, and this is the second scan of the sub-order at that site.

The selected data is then associated with the order information and the route information to obtain the prescribed-next-site indicators of each sub-order at the site. For example:

The next site of sub-order A after site B is site C; the expected arrival time is 11 a.m., the expected departure time is 12 noon, and the next-site status is normal.

Finally, a new table named table_repeat_dtl is obtained, which contains the detailed prescribed-next-site indicators of each sub-order at the site. This allows a better understanding of the transport and handling of each sub-order between sites.
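
The split into a first and a second round around uld_tm2 can be sketched as below. The sketch rests on assumptions: uld_cnt is derived here from the uld_tm2 boundary rather than read from an existing column, the per-type columns (uld_tm, lod_tm, sort_tm) and sample rows are hypothetical, and the association with order and route information is omitted.

```python
def split_visits(table_repeat_frame, table_repeat_scan):
    """Tag each scan with uld_cnt (1 before uld_tm2, 2 from uld_tm2 on)
    and pivot the scans into one detail row per (sub_no, site, uld_cnt)."""
    boundary = {(f["sub_no"], f["site"]): f["uld_tm2"]
                for f in table_repeat_frame}
    detail = {}
    for r in sorted(table_repeat_scan, key=lambda r: r["scan_tm"]):
        key = (r["sub_no"], r["site"])
        if key not in boundary:
            continue  # not a twice-unloaded sub-order/site pair
        uld_cnt = 1 if r["scan_tm"] < boundary[key] else 2
        out = detail.setdefault(
            key + (uld_cnt,),
            {"sub_no": r["sub_no"], "site": r["site"], "uld_cnt": uld_cnt},
        )
        # Keep the earliest scan time per scan type within the round.
        out.setdefault(r["scan_type"] + "_tm", r["scan_tm"])
    return list(detail.values())

repeat_frame = [{"sub_no": "A", "site": "B",
                 "uld_tm1": "2024-01-02 08:00",
                 "uld_tm2": "2024-01-02 15:00"}]
repeat_scan = [
    {"sub_no": "A", "site": "B", "scan_type": "uld", "scan_tm": "2024-01-02 08:00"},
    {"sub_no": "A", "site": "B", "scan_type": "lod", "scan_tm": "2024-01-02 09:00"},
    {"sub_no": "A", "site": "B", "scan_type": "uld", "scan_tm": "2024-01-02 15:00"},
    {"sub_no": "A", "site": "B", "scan_type": "sort", "scan_tm": "2024-01-02 16:00"},
]
visit_detail = split_visits(repeat_frame, repeat_scan)
```

Each resulting row corresponds to one round of the sub-order at the site, so the later association with order and route information would run once per round rather than once per raw scan.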

S5: Remove the twice-unloading data from the sub-single-site frame table to obtain the non-repeatable site frame table, and left-join the process data tables to the non-repeatable site frame table to obtain the non-repeatable site scan table;

Specifically: the scan data of the repeated sites in table_repeat_frame is removed from table_site_frame to form the conventional sub-single-site frame table table_site_normal. Conventional scanning means that a sub-order is allowed only one scan of the same type at a site; where a scan occurs more than once, it must be deduplicated so that a single record is kept.

With table_site_normal as the frame, table_lod, table_uld, table_cotn_sotn, table_exception, table_bill and table_rtng are left-joined to obtain detailed information on loading, unloading, consolidation and sorting, abnormal warehouse entry and exit, the prescribed next site, and so on, giving the non-repeatable site scan table table_normal_dtl;
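
Step S5 amounts to an anti-join followed by a chain of left joins. The sketch below shows the pattern for two of the process tables only; the column names sub_no, scan_tm, uld_tm and lod_tm, the helper left_join, and the sample rows are hypothetical.

```python
def build_site_normal(table_site_frame, table_repeat_frame):
    """Anti-join: drop (sub_no, site) pairs found in table_repeat_frame and
    keep a single frame row for each remaining pair."""
    repeat_keys = {(f["sub_no"], f["site"]) for f in table_repeat_frame}
    seen, normal = set(), []
    for r in table_site_frame:
        key = (r["sub_no"], r["site"])
        if key in repeat_keys or key in seen:
            continue
        seen.add(key)
        normal.append({"sub_no": key[0], "site": key[1]})
    return normal

def left_join(frame, table, column, value_field):
    """Left join on (sub_no, site): every frame row is kept, and rows
    without a match get None in the new column."""
    first_match = {}
    for r in table:
        first_match.setdefault((r["sub_no"], r["site"]), r[value_field])
    return [{**r, column: first_match.get((r["sub_no"], r["site"]))}
            for r in frame]

site_frame = [
    {"sub_no": "A", "site": "B", "scan_type": "uld"},
    {"sub_no": "A", "site": "B", "scan_type": "lod"},
    {"sub_no": "A", "site": "C", "scan_type": "uld"},
    {"sub_no": "A", "site": "D", "scan_type": "uld"},
]
repeat_frame_keys = [{"sub_no": "A", "site": "B"}]
uld_rows = [
    {"sub_no": "A", "site": "C", "scan_tm": "2024-01-02 12:00"},
    {"sub_no": "A", "site": "D", "scan_tm": "2024-01-02 18:00"},
]
lod_rows = [{"sub_no": "A", "site": "C", "scan_tm": "2024-01-02 13:00"}]

site_normal = build_site_normal(site_frame, repeat_frame_keys)
normal_dtl = left_join(site_normal, uld_rows, "uld_tm", "scan_tm")
normal_dtl = left_join(normal_dtl, lod_rows, "lod_tm", "scan_tm")
```

Because the join is a left join, site D stays in the result with lod_tm set to None even though no loading scan exists for it yet.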

S6: Summarize the data of the repeatable loading and unloading site scan table table_repeat_dtl and the non-repeatable site scan table table_normal_dtl to obtain the site scan summary table table_site_all, and associate table_site_all with the handover list service table table_hdvr to supplement the handover list information, obtaining the site scan supplementary handover list table table_site_hdvr.

Specifically, the data of table_repeat_dtl and table_normal_dtl is summarized together. Because several routes may exist when the prescribed next site is computed, the joined result may contain duplicate rows. The result must therefore be deduplicated so that each sub-order has only one row at a site; otherwise the subsequent calculations of the actual previous and next sites and of the site ordering will be wrong.

In table_site_all, the loading handover list and the unloading handover list are each associated to obtain indicators such as the sub-order's actual arrival and departure times at the site, inbound and outbound weighbridge weighing, and inbound and outbound card swiping. The sites of each sub-order are ordered by the earliest-arrival-time field to obtain the field site_seq (the order within the current batch only), and at the same time indicators such as the sub-order's actual previous site, actual next site, previous-site loading handover list and next-site unloading handover list at each site are calculated, giving table_site_hdvr;
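
The summarization, deduplication and site ordering of S6 can be sketched as follows. The assumptions are spelled out: rows are deduplicated on (sub_no, site, uld_cnt) keeping the first occurrence, in_tm stands for the earliest arrival time at the site, and the columns prev_site and next_site are hypothetical names for the actual previous and next sites. The handover list association is omitted.

```python
from collections import defaultdict

def build_site_all(table_repeat_dtl, table_normal_dtl):
    """Union both detail tables and keep one row per (sub_no, site, uld_cnt).
    Rows from the non-repeatable table have no uld_cnt and default to 1."""
    seen, merged = set(), []
    for r in table_repeat_dtl + table_normal_dtl:
        key = (r["sub_no"], r["site"], r.get("uld_cnt", 1))
        if key in seen:
            continue  # duplicate produced by multiple candidate routes
        seen.add(key)
        merged.append(dict(r))
    return merged

def add_site_seq(table_site_all):
    """Order each sub-order's rows by earliest arrival time and derive
    site_seq plus the actual previous and next sites."""
    by_sub = defaultdict(list)
    for r in table_site_all:
        by_sub[r["sub_no"]].append(r)
    out = []
    for rows in by_sub.values():
        rows.sort(key=lambda r: r["in_tm"])
        for i, r in enumerate(rows):
            out.append({
                **r,
                "site_seq": i + 1,
                "prev_site": rows[i - 1]["site"] if i > 0 else None,
                "next_site": rows[i + 1]["site"] if i + 1 < len(rows) else None,
            })
    return out

repeat_dtl = [
    {"sub_no": "A", "site": "B", "uld_cnt": 1, "in_tm": "2024-01-02 08:00"},
    {"sub_no": "A", "site": "B", "uld_cnt": 2, "in_tm": "2024-01-02 15:00"},
    {"sub_no": "A", "site": "B", "uld_cnt": 2, "in_tm": "2024-01-02 15:00"},
]
normal_dtl_rows = [{"sub_no": "A", "site": "C", "in_tm": "2024-01-02 12:00"}]

site_all = build_site_all(repeat_dtl, normal_dtl_rows)
site_ordered = add_site_seq(site_all)
```

Without the deduplication, the duplicated second round at site B would appear twice in the ordering and shift every later site_seq, which is the error the text warns about.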

S7: table_site_hdvr is associated with the last-site data of each sub-order in the limited-period historical target table (site reverse order = 1). The site order of each sub-order in the historical table is added to site_seq of the current batch, which ensures that the ordering of each sub-order's historical data and current-batch data connects. The last site and loading handover list of each sub-order in the historical table are used to supplement the actual-previous-site and previous-site loading handover list fields in table_site_hdvr, because the site rows of a sub-order that fall earliest in the current cleaning window carry no scan information about the previous site. The original order return time is then supplemented by association; table_site_frqc (allocation or outlet clearance times) is associated and calculated according to the business algorithm; and table_delv_rcv is associated to supplement the actual next site from the site to the delivery staff member, forming the target wide table dwd_site_scan_dtl;
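
The stitching of historical and current-batch ordering in S7 can be sketched as below, under one reading of the text: the historical site order of a sub-order's last site is added to site_seq of every current-batch row, and the first current-batch row inherits its previous site and previous-site loading handover list from that historical row. The column names site_desc_seq, lod_hdvr, prev_site and prev_lod_hdvr and the sample values are hypothetical; the associations with table_site_frqc and table_delv_rcv are not shown.

```python
def stitch_with_history(history, current):
    """Continue site_seq from the historical target table and backfill the
    previous-site fields of the earliest current-batch row."""
    # Last site per sub-order in the history: site reverse order = 1.
    last = {h["sub_no"]: h for h in history if h["site_desc_seq"] == 1}
    out = []
    for r in current:
        new = dict(r)
        h = last.get(r["sub_no"])
        if h is not None:
            new["site_seq"] = h["site_seq"] + r["site_seq"]
            if r["site_seq"] == 1 and r.get("prev_site") is None:
                new["prev_site"] = h["site"]
                new["prev_lod_hdvr"] = h["lod_hdvr"]
        out.append(new)
    return out

history_rows = [
    {"sub_no": "A", "site": "O", "site_seq": 1, "site_desc_seq": 2, "lod_hdvr": "H-100"},
    {"sub_no": "A", "site": "P", "site_seq": 2, "site_desc_seq": 1, "lod_hdvr": "H-200"},
]
current_rows = [
    {"sub_no": "A", "site": "B", "site_seq": 1, "prev_site": None, "prev_lod_hdvr": None},
    {"sub_no": "A", "site": "C", "site_seq": 2, "prev_site": "B", "prev_lod_hdvr": "H-300"},
    {"sub_no": "Z", "site": "B", "site_seq": 1, "prev_site": None, "prev_lod_hdvr": None},
]
stitched = stitch_with_history(history_rows, current_rows)
```

Sub-order Z has no history, so its ordering starts at 1 and its previous-site fields stay empty, while sub-order A continues from its historical last site P.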

Cleaning and deduplicating the data by this method comprehensively optimizes the existing reports and yields the target wide table dwd_site_scan_dtl. Each kind of data undergoes repeated, precise deduplication, so the data is more accurate and duplicate data is greatly reduced; server resources are fully released and report calculation is faster. Because all the data is recorded centrally in the target wide table dwd_site_scan_dtl, a change in indicator logic requires only one unified modification to the site-scan integration wide table, which greatly reduces the workload and avoids missed modifications and similar problems.

The target wide table dwd_site_scan_dtl also merges and records the sign-in status and the active status of each sub-order, which speeds up data extraction in subsequent processing, and it associates and additionally records information that the original site scan data lacks, such as consolidation, sorting and abnormal scans. As a result, when new reports are later developed or data is pulled, complete data can be extracted from dwd_site_scan_dtl alone, without associating other data tables again.

Although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution; the specification is written in this way for clarity only. Those skilled in the art should take the specification as a whole, and the technical solutions of the embodiments may be combined appropriately to form other embodiments that those skilled in the art can understand.

Claims

1. A sub-single-site scanning data integration method, characterized by comprising the following steps:

2. The sub-single site scan data integration method of claim 1, wherein: the obtaining various data, obtaining the corresponding table includes:

acquiring original scanning data to obtain an original scanning data table;

Acquiring handover list service data to obtain a handover list service table;

3. The sub-single site scan data integration method of claim 1, wherein: the preprocessing the original scanning data table to obtain a limit history data table with a back calculation period of 21 days comprises the following steps:

4. The sub-single site scan data integration method of claim 2, wherein: the capturing the twice unloading data of the same station from the original scanning data table, and obtaining the corresponding table comprises the following steps:

5. The sub-single site scan data integration method of claim 1, wherein: the process data table in the non-repeatable site framework table left-associated process data table comprises: loading and sending scanning table, unloading and sending scanning table, collecting and sorting scanning table, abnormal scanning table, limited list information table and route information table.

6. The sub-single site scan data integration method of claim 3, wherein: the extracting corresponding data from the limited-period historical data table according to the set deduplication cleaning rule to obtain the corresponding tables comprises:

7. The sub-single site scan data integration method of claim 2, wherein: the data dimension of the deadline history target table comprises a main list, a sub list, stations, repeatable marks, station sequences, loading handover lists, the same station number and station reverse sequences;

The data dimension of the two-time unloading data table at the same site comprises a main sheet, a sub sheet, a site, a scanning type, a current site type, a warehousing time, a scanner, a signing type and a signing type;

the scanning types of the two-time scanning list at the same site comprise loading, collecting and dragging, sorting, abnormal warehousing and abnormal ex-warehouse;

8. The sub-single site scan data integration method of claim 7, wherein: the method comprises the steps of:

9. The sub-single site scan data integration method of claim 8, wherein: the site scanning supplementary handover list information table comprises:

10. The sub-single site scan data integration method of claim 9, wherein: the following is described in the target large width table: