CN112699174A

CN112699174A - Big data building product drive chain diagram generation method

Info

Publication number: CN112699174A
Application number: CN202110306025.4A
Authority: CN
Inventors: 陶锋; 张勇
Original assignee: China State Construction eCommerce Co Ltd
Current assignee: China State Construction eCommerce Co Ltd
Priority date: 2021-03-23
Filing date: 2021-03-23
Publication date: 2021-04-23
Anticipated expiration: 2041-03-23
Also published as: CN112699174B

Abstract

The invention provides a method for generating a drive chain diagram for building products based on big data, comprising the following steps: mining the driving relationships among the building products in each order to generate driving relationship basic data and establish a structural model; identifying the internal structure of each driving relationship to determine its core object and other objects, and entering them into the corresponding class structures of the structural model to establish a matching model; calculating the driving probability values between the core object and the other objects and among the other objects; and selecting a matching model, comparing the other objects of the remaining matching models with the core object of the selected matching model, connecting the corresponding class structures when an other object and the core object are the same, and generating the drive chain diagram in combination with the driving probability values. The resulting drive chain diagram shows how one main driving commodity influences n other groups of commodity sets under different dynamic combinations, and helps sales and procurement staff carry out inventory satisfaction and gap analysis in a more targeted way.

Description

Big data building product drive chain diagram generation method

Technical Field

The invention relates to the technical field of big data analysis, in particular to a big data building product drive chain diagram generation method.

Background

As the purchasing and selling of building products becomes centralized, sales systems record large numbers of batch purchase orders, each covering many types of building products. Behind these purchase orders are the needs of various construction projects: in a construction scenario, building products of several categories generally have to be purchased for a given job.

The spread of online sales has shortened the cycle of order and inventory changes to minutes or even seconds. When a new order arrives, purchasing staff need to analyze the inventory gaps of other building products in real time from the system's back-end data in order to support a new centralized purchasing strategy. This requires the purchasing analyst to have a clear and concrete understanding of the sales relationships among the building products in the system. The tools currently available to support this analysis are cumbersome and still rely heavily on manual experience, so the analysis is time-consuming and error-prone.

The order structure of building products differs greatly from that of ordinary consumer goods, and the consumption of the various types of building products is closely interrelated. Traditional big data association and clustering methods do not easily uncover these deeper relationships, so they provide insufficient support for inventory demand analysis in this business context.

Disclosure of Invention

The invention aims to provide a method for generating a big data building product drive chain diagram. The method is implemented as a computer program that runs in a system and updates in real time, and it supports purchasers in inventory gap analysis for centralized online sales of building products. By analyzing the large volume of sales order data in the back-end system, the method generates a building product drive chain diagram that clearly reflects the recent support relationships among the various types of building products, thereby enriching the available means of higher-order data analysis.

The embodiment of the invention is realized by the following technical scheme: a big data building product drive chain diagram generation method comprises the following steps:

S1, mining driving relationships according to the ordering patterns among the building products in the obtained orders, generating driving relationship basic data for the building products, and establishing a structural model of the driving relationship;

S2, identifying the internal structure of the driving relationship to determine its core object and other objects, and entering the core object and the other objects into the corresponding class structures of the structural model to establish a driving relationship matching model, the D-model;

S3, calculating the driving probability values between the core object and the other objects, and the driving probability values among the other objects;

S4, selecting one D-model, comparing the other objects in the remaining D-models with the core object of the selected D-model, connecting the class structures corresponding to an other object and the core object when the two are the same, and generating the drive chain diagram in combination with the driving probability values calculated in step S3.

According to a preferred embodiment, step S1 further includes the steps of:

S11, creating a structural model of the driving relationship according to the ordering patterns of the building products, wherein the data structure of the structural model comprises: a main driving commodity class structure DG; a driven commodity set class structure DBG; a secondary driven commodity set class structure S-DBG; the main driving relationship dRel-m between DG and DBG; the secondary driving relationship dRel-s between DG and S-DBG; and the auxiliary driving relationship dRel-a between DBG and S-DBG;

S12, analyzing each obtained order to calculate the sum of the composite errors, and generating the driving relationship basic data corresponding to each order, wherein the sum of the composite errors is calculated as follows:

（1）

In formula (1), B is the sum of the composite errors, and when B is smaller than a preset threshold the order meets the requirements for driving relationship basic data; m is the total number of building product categories in the order, that is, the number of SKUs; k is the sequence number of the current SKU in the order, with k in the range [1, m]; C is the purchase quantity of the current SKU; P is the unit price; the remaining factor is the order time difference ratio; and A is a value determined by the material category to which the SKU belongs, with different material categories corresponding to different values.

According to a preferred embodiment, the order time difference ratio in step S12 is calculated as follows:

when k = 1, the ratio is 1.0; when k = m, the ratio is 2.0; and for any other value of k, the ratio is calculated by the following formula:

（2）

In formula (2), T_1 is the order time of the first SKU, T_m is the order time of the last SKU, and T_k is the order time of the SKU currently being calculated.
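The three cases above can be illustrated with a short Python sketch. Formula (2) itself is not reproduced in this text, so the interior case below uses linear interpolation between T_1 and T_m purely as an assumption that agrees with the stated endpoints (1.0 at k = 1, 2.0 at k = m); the function name `time_difference_ratio` is hypothetical.

```python
from datetime import datetime


def time_difference_ratio(k, order_times):
    """Return the order time difference ratio for the k-th SKU (1-based).

    order_times holds the order time of every SKU in the order, oldest first.
    ASSUMPTION: interior values are interpolated linearly; the source only
    fixes the endpoints and states that formula (2) uses T_1, T_k and T_m.
    """
    m = len(order_times)
    if k == 1:
        return 1.0
    if k == m:
        return 2.0
    t_1, t_k, t_m = order_times[0], order_times[k - 1], order_times[-1]
    if t_m == t_1:
        # All SKUs ordered at the same instant: no spread to interpolate over.
        return 1.0
    # Dividing two timedeltas yields a plain float in [0, 1].
    return 1.0 + (t_k - t_1) / (t_m - t_1)
```

With order times at 00:00, 06:00 and 24:00, the middle SKU sits a quarter of the way through the interval, so under this assumption its ratio would be 1.25.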

According to a preferred embodiment, step S2 further includes the steps of:

S21, calculating an influence value I for every SKU in the order, taking the SKU with the largest influence value as the core object of the driving relationship, and assigning it to the main driving commodity DG, wherein the influence value I is calculated by the following formula:

（3）

S22, excluding the core object of the driving relationship, and taking the remaining m-1 SKUs as the other objects;

S23, repeating the above steps until all orders have been processed into D-models.

According to a preferred embodiment, step S22 further includes:

S221, calculating a secondary influence value DS for each of the remaining m-1 SKUs to determine the driven objects among the other objects, wherein the formula is:

（4）

S222, sorting the calculated DS values in descending order and accumulating them one by one; when the accumulated value first exceeds the threshold, which is defined in terms of D_total, the SKUs whose secondary influence values took part in the accumulation are assigned to the driven commodity set DBG, wherein D_total is the sum of the secondary influence values of the remaining m-1 SKUs;

S223, if any SKUs remain, assigning all of them to the secondary driven commodity set S-DBG.
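Steps S221 to S223 amount to a cumulative-sum split, which the following Python sketch illustrates. The DS values are taken as given inputs because formula (4) is not reproduced in this text, and the exact threshold expression is also missing; here it is modelled as a caller-supplied fraction of D_total, which is an assumption. The names `partition_driven` and `threshold_fraction` are hypothetical.

```python
def partition_driven(secondary_values, threshold_fraction):
    """Split the non-core SKUs of one order into (DBG, S-DBG).

    secondary_values maps SKU -> secondary influence value DS.
    ASSUMPTION: the threshold is threshold_fraction * D_total; the source
    only says the threshold is expressed in terms of D_total.
    """
    d_total = sum(secondary_values.values())
    threshold = threshold_fraction * d_total
    # Sort SKUs by DS, largest first.
    ranked = sorted(secondary_values.items(), key=lambda kv: kv[1], reverse=True)
    dbg, s_dbg, running = [], [], 0.0
    for sku, ds in ranked:
        if running <= threshold:
            # Still accumulating: the SKU that pushes the sum over the
            # threshold also counts as having taken part in the accumulation.
            running += ds
            dbg.append(sku)
        else:
            s_dbg.append(sku)
    return dbg, s_dbg
```

For DS values of 5, 3 and 2 and a fraction of 0.6, the threshold is 6: the first two SKUs are accumulated (5, then 8) and form DBG, and the last SKU falls into S-DBG.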

According to a preferred embodiment, step S3 further includes the steps of:

S31, calculating the probability value P_m that DG drives DBG, wherein the formula is:

（5）

In formula (5), k_1 is the total number of D-models whose DG and DBG SKU sequence are the same as those of the current D-model; t_e is the latest order time among the orders corresponding to those k_1 D-models; t_s is the earliest order time among the orders corresponding to those k_1 D-models; t is the order time of the order corresponding to the current D-model; C_DG is the number of orders placed that correspond to all D-models having the same DG as the current D-model; and C_DBG is the number of orders placed that correspond to all D-models having the same DG SKU and the same DBG SKU sequence as the current D-model;

S32, calculating the probability value P_s that DG drives S-DBG, wherein the formula is:

（6）

In formula (6), C_S-DBG is the number of orders placed that correspond to all D-models having the same DG SKU and the same S-DBG SKU sequence as the current D-model;

S33, determining the probability value P_a that DBG drives S-DBG from P_m and P_s, wherein the formula is:

（7）.

According to a preferred embodiment, step S4 further includes the steps of:

S41, selecting one D-model, traversing the SKU sequences in the DBG and S-DBG of another D-model, and comparing them with the DG of the selected D-model; if the same SKU exists, connecting the DBG or S-DBG containing that SKU to the DG of the selected D-model and establishing a dRel-m, dRel-s or dRel-a association;

wherein the calculated P_m is used as the weight of the association dRel-m, P_s as the weight of the association dRel-s, and P_a as the weight of the association dRel-a;

S42, repeating the above steps until all D-models have been compared.

According to a preferred embodiment, in step S41, the association between the D-model and another D-model is established as follows:

modifying the pointer relationships of the class structures in the D-model according to the dRel-m, dRel-s and dRel-a associations;

and the use in step S41 of the calculated P_m as the weight of the association dRel-m, P_s as the weight of the association dRel-s, and P_a as the weight of the association dRel-a is specifically:

modifying the driving probability member variables in the class structure data class instances in the D-model according to P_m, P_s and P_a.

According to a preferred embodiment, step S1 is preceded by the steps of:

combining orders in batches, namely combining scattered orders of the same customer within a preset time range to generate a composite order comprising multiple SKUs;

removing invalid orders, namely removing the combined composite orders with the SKU number of 1, and taking the rest orders as valid orders;

and sequencing the effective orders according to the order placing time of the orders.

According to a preferred embodiment, the material categories in step S12 include a main material category, an auxiliary material category and a tool category, wherein the A value of the main material category is greater than the A value of the auxiliary material category, which in turn is greater than the A value of the tool category.

The technical scheme of the embodiment of the invention has at least the following advantages and beneficial effects: (1) the drive chain diagram generated by the method shows how one main driving commodity influences n other groups of commodity sets under different dynamic combinations, and helps sales and procurement staff carry out inventory satisfaction and gap analysis in a more targeted way;

(2) the driving relationship matching model D-model is generated from the ordering patterns of the different types of products used in construction, so it is more specific to the industry, reflects the influence relationships among all types of building products more intuitively, and provides a data basis for visualizing the inventory requirements of building products;

(3) the drive chain diagram uses the SKU as the unique identifier in building product analysis, and provides a new analysis dimension for the user profile of the customer who placed the order.

Drawings

Fig. 1 is a flowchart of steps of a method for generating a drive chain diagram according to embodiment 1 of the present invention;

FIG. 2 is a schematic diagram of a D-model logic structure provided in embodiment 1 of the present invention;

FIG. 3 is a logic diagram of a D-model connection result provided in embodiment 1 of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

Example 1

Referring to FIG. 1, this embodiment provides a method for generating a big data building product drive chain diagram. The method is implemented as a computer program that runs in a system and updates in real time, and it supports purchasers in inventory gap analysis for centralized online sales of building products. The method generates the building product drive chain diagram by analyzing the large volume of sales order data in the back-end system. The generated drive chain diagram clearly reflects the recent support relationships among the various types of building products and enriches the available means of higher-order data analysis. The method comprises the following steps:

Order data preprocessing: the sales order data obtained from the back-end system is preprocessed to determine the source data that qualifies for generating the drive chain diagram. The preprocessing specifically comprises the following steps:

A preprocessing calculation cache pool and a preprocessing result cache pool are created in the Hive database of Hadoop. The sales order data acquired from the back-end business system is loaded into the calculation cache pool and sorted by order time from oldest to newest. Scattered orders placed by the same customer within a short time range are merged into composite virtual orders containing multiple building product SKUs.

A Java program reads the first order in the calculation cache pool into a class object. For the k-th scattered order ORD_k, the program searches for scattered orders placed within 24 hours after it; if n later scattered orders are found, they are merged with ORD_k into a composite order containing n + 1 SKUs, the original scattered orders are deleted, and the composite order is stored in the preprocessing result cache pool in Hive. An order is judged to be scattered by checking whether the number of SKUs it contains is 1: if so, that is, if the order contains only one type of building product, it is a scattered order and is deleted. These steps are repeated until all orders in the preprocessing calculation cache pool have been processed.

The preprocessing also includes invalid order elimination: the orders remaining in the preprocessing calculation cache pool are scanned, and the data of any orders that are still scattered is deleted from the database of the preprocessing calculation cache pool.

The preprocessing further includes a sorting step: all valid orders are sorted by the order time of the building products, preferably from oldest to newest.
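The preprocessing flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the embodiment's Java and Hive implementation: plain in-memory lists stand in for the cache pools, the `Order` type and the `preprocess` function are hypothetical names, and the 24-hour window is assumed to be measured from the first order of each composite.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Order:
    customer: str
    placed_at: datetime
    skus: list  # SKU identifiers contained in the order


def preprocess(orders, window=timedelta(hours=24)):
    """Merge each customer's orders within the window, drop single-SKU results.

    ASSUMPTION: the window starts at the first order of each composite.
    The returned list is sorted from oldest to newest.
    """
    merged = []
    open_by_customer = {}  # customer -> latest composite order for that customer
    for order in sorted(orders, key=lambda o: o.placed_at):
        current = open_by_customer.get(order.customer)
        if current is not None and order.placed_at - current.placed_at <= window:
            # Inside the window: absorb the SKUs into the composite order.
            for sku in order.skus:
                if sku not in current.skus:
                    current.skus.append(sku)
        else:
            # Outside the window (or first order seen): start a new composite.
            current = Order(order.customer, order.placed_at, list(order.skus))
            open_by_customer[order.customer] = current
            merged.append(current)
    # Invalid order elimination: an order with one SKU is still scattered.
    return [o for o in merged if len(o.skus) > 1]
```

In this sketch, two single-SKU orders placed by the same customer two hours apart become one valid two-SKU composite, while a lone order from another customer is eliminated.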

After the data acquired from the back-end business system has been preprocessed, the driving relationship mining step is carried out, specifically as follows:

A driving relationship matching model, the D-model, is created: a structural model of the driving relationship is built according to the ordering patterns of the building products. The driving relationship has three main objects: the main driving commodity DG, the driven commodity set DBG and the secondary driven commodity set S-DBG. It also includes three relationships: the main driving relationship dRel-m, the secondary driving relationship dRel-s and the auxiliary driving relationship dRel-a. DBG and S-DBG may each contain multiple building product SKUs, whereas DG contains only one SKU, the driving core object, as the main driving commodity. In this embodiment, the D-model is implemented with object-oriented programming: SKU, DG, DBG and S-DBG classes are created, and each of the DG, DBG and S-DBG classes contains a numbered SKU sequence whose initial state is an empty sequence.

Class structure data classes implementing dRel-m, dRel-s and dRel-a are created, and P_m, P_s and P_a are included in them as member variables, respectively.

An overall D-model class is created that contains all of the above class objects and forms the final data structure; the resulting D-model logical structure is shown in FIG. 2.
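As a rough illustration of this class layout, the following Python sketch uses dataclasses. The class and field names (`SkuSet`, `DriveRelation`, `DModel`, `order_id`) are hypothetical, and a single `SkuSet` type is assumed to stand in for the separate DG, DBG and S-DBG classes since all three hold a SKU sequence.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SkuSet:
    """Shared shape of DG, DBG and S-DBG: a SKU sequence, initially empty."""
    skus: List[str] = field(default_factory=list)


@dataclass
class DriveRelation:
    """Shape of dRel-m, dRel-s and dRel-a: a relation holding a drive probability."""
    name: str
    probability: float = 0.0  # P_m, P_s or P_a, filled in later


@dataclass
class DModel:
    """One D-model per order, containing all objects and relations."""
    order_id: str
    dg: SkuSet = field(default_factory=SkuSet)     # main driving commodity (one SKU)
    dbg: SkuSet = field(default_factory=SkuSet)    # driven commodity set
    s_dbg: SkuSet = field(default_factory=SkuSet)  # secondary driven commodity set
    rel_m: DriveRelation = field(default_factory=lambda: DriveRelation("dRel-m"))
    rel_s: DriveRelation = field(default_factory=lambda: DriveRelation("dRel-s"))
    rel_a: DriveRelation = field(default_factory=lambda: DriveRelation("dRel-a"))
```

Using `default_factory` gives every D-model its own empty SKU sequences and relation objects, so updating one instance never leaks into another.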

Further, generating the driving relationship basic data specifically includes the following steps:

Source data is loaded from the preprocessing result cache pool, and a driving relationship basic data pool is created at the same time.

The first order in the preprocessing result cache pool is loaded for analysis and calculation, the corresponding driving relationship basic data is generated, and the matching degree of the driving relationship is calculated.

The first step of the calculation is to compute the order time difference ratio. Specifically, when k = 1 the ratio is 1.0; when k = m the ratio is 2.0; and for any other value of k the ratio lies in [1.0, 2.0] and is determined from the order time T_k of the k-th building product, the order time T_1 of the first building product and the order time T_m of the last building product, according to formula (2).

The sum of the composite errors B is then calculated according to formula (1).

In that formula, B is the sum of the composite errors used in calculating the matching degree of the driving relationship, and when B is smaller than a preset threshold the order meets the requirements for driving relationship basic data; m is the total number of building product categories in the order, that is, the number of SKUs; k is the sequence number of the current SKU in the order, with k in the range [1, m]; C is the purchase quantity of the current SKU; P is the unit price; the remaining factor is the order time difference ratio; and A is a value determined by the material category to which the SKU belongs, with different material categories corresponding to different values.

In this embodiment, the material categories include a main material category, an auxiliary material category and a tool category. In general, the A value of the main material category is greater than that of the auxiliary material category, which in turn is greater than that of the tool category; preferably, the A value is 1.0 for main materials, 0.75 for auxiliary materials and 0.5 for tools. In this embodiment the threshold for B is set to 0.2: when the value of B is less than 0.2, the order is considered to conform to the driving relationship basic data, and the order data is stored in the driving relationship basic data pool. The above steps are repeated until all orders in the preprocessing result cache pool have been processed. The preprocessing result cache pool is then emptied or deleted.

Further, after the driving relationships of the orders have been mined, the core object of each driving relationship is identified, specifically as follows:

The first order in the driving relationship basic data pool is loaded, and the main driving building product is identified first. The main driving building product is the SKU with the greatest influence or effect in the order, that is, the SKU with the largest influence value, where the influence value is calculated according to formula (3).

After the influence value I of each SKU has been calculated, the SKU with the largest influence value is placed in the SKU sequence of the DG class structure of the D-model.

The driven objects among the other objects are determined by calculating the secondary influence value DS of each of the remaining m-1 SKUs according to formula (4).

Further, the calculated DS values are sorted in descending order and accumulated one by one; when the accumulated value first exceeds the threshold, which is defined in terms of D_total, the SKUs whose secondary influence values took part in the accumulation are assigned to the driven commodity set DBG, where D_total is the sum of the secondary influence values of the remaining m-1 SKUs. If any SKUs remain, all of them are assigned to the secondary driven commodity set S-DBG.

The above steps are repeated until all orders have been processed into D-models.

Further, after D-models have been generated from all the data, the driving probability values between the core object and the other objects and among the other objects are calculated, specifically as follows:

The driving relationship basic data pool is copied in memory to create a pending pool for driving probability calculation, which avoids confusion between object references.

The first D-model in the pending pool is loaded, and the SKU sequences in its DG, DBG and S-DBG are obtained. Using these three SKU sequences, the whole driving relationship basic data pool is searched to calculate the values of C_DG, C_DBG and C_S-DBG. C_DG is the number of orders placed that correspond to all D-models having the same DG as the current D-model; C_DBG is the number of orders placed that correspond to all D-models having the same DG SKU and the same DBG SKU sequence as the current D-model; and C_S-DBG is the number of orders placed that correspond to all D-models having the same DG SKU and the same S-DBG SKU sequence as the current D-model.
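The three counts can be obtained with a simple scan of the pool, as in the Python sketch below. It assumes that each D-model corresponds to exactly one order, so counting matching D-models is the same as counting orders placed; the dictionary layout and the function name `order_counts` are hypothetical. The probability formulas that consume these counts are not reproduced in this text, so the sketch stops at the counts.

```python
def order_counts(current, pool):
    """Return (C_DG, C_DBG, C_S-DBG) for one D-model against the whole pool.

    Each D-model is a dict with keys 'dg' (a SKU), 'dbg' and 's_dbg'
    (SKU sequences). ASSUMPTION: one D-model corresponds to one order.
    """
    dg = current["dg"]
    # D-models sharing the same main driving commodity.
    c_dg = sum(1 for m in pool if m["dg"] == dg)
    # Same DG and an identical DBG SKU sequence.
    c_dbg = sum(1 for m in pool if m["dg"] == dg and m["dbg"] == current["dbg"])
    # Same DG and an identical S-DBG SKU sequence.
    c_s_dbg = sum(1 for m in pool if m["dg"] == dg and m["s_dbg"] == current["s_dbg"])
    return c_dg, c_dbg, c_s_dbg
```

Because the pool includes the current D-model itself, each of the three counts is at least 1.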

Next, the probability value P_m that DG drives DBG is calculated according to formula (5).

In that formula, k_1 is the total number of D-models whose DG and DBG SKU sequence are the same as those of the current D-model; t_e is the latest order time among the orders corresponding to those k_1 D-models; t_s is the earliest order time among the orders corresponding to those k_1 D-models; and t is the order time of the order corresponding to the current D-model, with time measured in days.

Further, the probability value P_s that DG drives S-DBG is calculated according to formula (6).

Specifically, the probability value P_a that DBG drives S-DBG is determined from P_m and P_s according to formula (7).

After P_s, P_m and P_a have been calculated, the D-model instance object is operated on directly to update the values.

The above steps are repeated until P_s, P_m and P_a have been updated for every D-model in the pending pool. Objects are then re-referenced between the pool of D-models whose driving probability calculation is complete and the pool of D-models awaiting calculation, which avoids confusion.

Then the drive chain diagram generation step is performed, specifically as follows:

A new graph data structure is created, with an initial state of 0 nodes and 0 relationships. A D-model is selected at random and added to the graph data structure as a node.

The SKU sequences in the DBG and S-DBG of another D-model are traversed and compared with the DG of the selected D-model. If the same SKU exists, the DBG or S-DBG containing it is connected to the DG of the selected D-model and a dRel-m, dRel-s or dRel-a association is established, that is, the pointer relationships in the linked-list data class instances of the D-model are modified.

At the same time, the calculated P_m is used as the weight of the association dRel-m, P_s as the weight of the association dRel-s, and P_a as the weight of the association dRel-a, that is, the driving probability member variable P in the linked-list data class instances of the D-model is modified. Preferably, apart from the first selected D-model, the DG of each remaining D-model can only be the target of a connection.

The next D-model is selected and the operation is repeated until the DBG and S-DBG of all D-models have been compared.
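The linking pass can be sketched in Python as an edge list built from SKU matches. This is an illustration under assumptions rather than the embodiment's linked-list implementation: D-models are plain dictionaries, a link found through DBG is labelled dRel-m with weight P_m and a link found through S-DBG is labelled dRel-s with weight P_s, and dRel-a is left out because the source does not state when it is chosen. The name `build_drive_chain` is hypothetical.

```python
def build_drive_chain(models):
    """Connect D-models whose DBG or S-DBG SKUs equal another D-model's DG.

    Each model is a dict with keys 'dg', 'dbg', 's_dbg', 'p_m', 'p_s'.
    Returns edges as (source index, target index, SKU, relation, weight).
    ASSUMPTION: the relation label follows the set in which the SKU was found.
    """
    # Index D-models by their main driving commodity for direct lookup.
    by_dg = {}
    for idx, model in enumerate(models):
        by_dg.setdefault(model["dg"], []).append(idx)

    edges = []
    for src, model in enumerate(models):
        sources = (
            ("dbg", "dRel-m", model["p_m"]),
            ("s_dbg", "dRel-s", model["p_s"]),
        )
        for key, relation, weight in sources:
            for sku in model[key]:
                for dst in by_dg.get(sku, []):
                    if dst != src:  # never link a D-model to itself
                        edges.append((src, dst, sku, relation, weight))
    return edges
```

Three D-models in which the first one's DBG and S-DBG each contain the DG of one of the other two yield two edges, which is the shape of a minimum-scale chain of three connected D-models.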

FIG. 3 shows the connection result logic of a minimum-scale chain diagram for the case in which a SKU in DBG or S-DBG is the same as the DG SKU of another D-model. The details are as follows:

A D-model is randomly selected as the current D-model. While the DG of the other D-models is scanned, when a SKU in the S-DBG SKU sequence is found to be the same as the DG of a D-model (for example, both are 20200911), that SKU (for example, SKU_b) is connected to the DG of the new D-model;

similarly, when a SKU in the DBG SKU sequence is found to be the same as the DG of a D-model (for example, both are 20200323), that SKU (for example, SKU_3) is connected to the DG of the new D-model.

Through the above steps, a minimum-scale drive chain diagram consisting of three connected D-models is obtained.

In summary, the drive chain diagram generated by the method shows how one main driving commodity influences n other groups of commodity sets under different dynamic combinations, and helps sales and procurement staff carry out inventory satisfaction and gap analysis in a more targeted way. The driving relationship matching model D-model is generated from the ordering patterns of the different types of products used in construction, so it is more specific to the industry, reflects the influence relationships among all types of building products more intuitively, and provides a data basis for visualizing the inventory requirements of building products. The drive chain diagram uses the SKU as the unique identifier in building product analysis, and provides a new analysis dimension for the user profile of the customer who placed the order.

The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A big data building product drive chain diagram generation method is characterized by comprising the following steps:

2. The big data construction product drive chain map generation method according to claim 1, wherein the step S1 further comprises the steps of:

S11, creating a structural model of the driving relationship according to the ordering patterns of the building products, wherein the data structure of the structural model comprises: a main driving commodity class structure DG; a driven commodity set class structure DBG; a secondary driven commodity set class structure S-DBG; the main driving relationship dRel-m between DG and DBG; the secondary driving relationship dRel-s between DG and S-DBG; and the auxiliary driving relationship dRel-a between DBG and S-DBG;

（1）

In formula (1), B is the sum of the composite errors, and when B is smaller than a preset threshold the order meets the requirements for driving relationship basic data; m is the total number of building product categories in the order, that is, the number of SKUs; k is the sequence number of the current SKU in the order, with k in the range [1, m]; C is the purchase quantity of the current SKU; P is the unit price;

3. The big data construction product drive chain map generation method according to claim 2, wherein the order time difference ratio in step S12 is calculated as follows:

when k = 1, the ratio is 1.0; when k = m, the ratio is 2.0; and for any other value of k, the ratio is calculated by the following formula:

（2）

In formula (2), T_1 is the order time of the first SKU, T_m is the order time of the last SKU, and T_k is the order time of the SKU currently being calculated.

4. The big data construction product drive chain map generation method according to claim 2 or 3, wherein the step S2 further comprises the steps of:

（3）

S22, excluding the core object of the driving relationship, and taking the remaining m-1 SKUs as the other objects;

s23, repeating the steps until all orders are processed and form the D-model.

5. The big data construction product drive chain map generation method according to claim 4, wherein the step S22 further comprises:

（4）

S223, if any SKUs remain, assigning all of them to the secondary driven commodity set S-DBG.

6. The big data construction product drive chain map generation method according to claim 2, wherein the step S3 further comprises the steps of:

（5）

（6）

S33, determining the probability value P_a that DBG drives S-DBG from the magnitudes of P_m and P_s, wherein the formula is:

（7）。

7. The big data construction product drive chain map generation method according to claim 6, wherein the step S4 further comprises the steps of:

and S42, repeating the steps until all the D-models are compared.

8. The method for generating the drive chain map of the big data construction product according to claim 7, wherein in step S41, an association relationship between the D-model and another D-model is established as follows:

9. The big data construction product drive chain map generation method according to claim 1, wherein step S1 is preceded by the steps of:

10. The big data construction product drive chain map generation method of claim 2, wherein the material categories in step S12 include a main material category, an auxiliary material category and a tool category, wherein the A value of the main material category is greater than the A value of the auxiliary material category.