CN116167624B - Determination method of target category identification, storage medium and electronic equipment - Google Patents

Determination method of target category identification, storage medium and electronic equipment

Info

Publication number
CN116167624B
Authority
CN
China
Prior art keywords
attribute information
target
event
node
executed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310449913.0A
Other languages
Chinese (zh)
Other versions
CN116167624A (en)
Inventor
袁雷锋
王旭东
张俊
孙茂鹏
司义品
周麟钗
李宏图
单威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianxinda Information Technology Co ltd
Original Assignee
Tianxinda Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianxinda Information Technology Co ltd filed Critical Tianxinda Information Technology Co ltd
Priority to CN202310449913.0A priority Critical patent/CN116167624B/en
Publication of CN116167624A publication Critical patent/CN116167624A/en
Application granted granted Critical
Publication of CN116167624B publication Critical patent/CN116167624B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the field of data processing, and in particular to a method for determining a target category identifier, a storage medium, and an electronic device. The method includes: obtaining a target object identifier group $A$, where $a_i$ is the $i$-th object identifier corresponding to the event to be executed; obtaining a file content data group $B$, where $b_j$ is the file content data of the $j$-th file; obtaining a target feature identifier group $T$, where $t_j$ is the $j$-th target feature identifier in $T$, $t_j = 1$ indicates that at least one $a_i$ belongs to $b_j$, and $t_j = 0$ indicates that no $a_i$ belongs to $b_j$; obtaining a feature vector $F = (H_1, H_2, t_1, t_2, \ldots, t_j, \ldots, t_m, REL, P)$, where $H_1$ is a first feature identifier, $H_2$ is a second feature identifier, $m$ is the number of preset files, $REL$ is the execution identifier corresponding to the associated event of the event to be executed, and $P$ is an influence coefficient; and obtaining the target category identifier according to $F$. The accuracy of determining the target category identifier corresponding to the event to be executed can thereby be improved.

Description

Determination method of target category identification, storage medium and electronic equipment
Technical Field
The present invention relates to the field of data processing, and in particular, to a method for determining a target class identifier, a storage medium, and an electronic device.
Background
In the air freight industry, before an event to be executed is executed, a category identifier corresponding to the event to be executed is often determined, so that a cargo detection method corresponding to the category identifier is determined as a cargo detection method corresponding to the event to be executed. The event to be executed is an air freight flight corresponding to a waybill.
At present, the category identifier corresponding to an event to be executed is determined as follows. For each target object corresponding to the event, the matching preset score is selected from a plurality of preset scores; the target objects are goods, and a preset score represents the risk level of the corresponding target object during air cargo transportation. The preset scores of all target objects corresponding to the event are then summed to obtain a total score, and the category identifier corresponding to the event is determined according to the total score.
However, the risk level of at least some target objects during air cargo transportation changes continuously with actual conditions, and the corresponding preset scores can only be adjusted after each change, so the preset scores cannot be kept fully up to date in real time. As a result, the total score corresponding to the event to be executed is inaccurate, and so is the category identifier determined from it.
Disclosure of Invention
Aiming at the technical problems, the invention adopts the following technical scheme:
According to an aspect of the present invention, there is provided a method for determining a target category identifier, including the following steps:
S100, obtaining a target object identifier group $A = (a_1, a_2, \ldots, a_i, \ldots, a_n)$, $i = 1, 2, \ldots, n$, corresponding to the event to be executed; wherein $a_i$ is the $i$-th object identifier corresponding to the event to be executed, and $n$ is the number of object identifiers corresponding to the event to be executed.
S200, obtaining a file content data group $B = (b_1, b_2, \ldots, b_j, \ldots, b_m)$, $j = 1, 2, \ldots, m$; wherein $b_j$ is the file content data of the $j$-th file, $m$ is the number of preset files, and $b_j$ includes at least one candidate identifier.
S300, obtaining, according to $A$ and $B$, a target feature identifier group $T = (t_1, t_2, \ldots, t_j, \ldots, t_m)$; wherein $t_j$ is the $j$-th target feature identifier in $T$, $t_j = 1$ or $0$, $t_j = 1$ indicates that at least one of $a_1, a_2, \ldots, a_i, \ldots, a_n$ belongs to $b_j$, and $t_j = 0$ indicates that no $a_i$ belongs to $b_j$.
S400, obtaining a feature vector $F = (H_1, H_2, t_1, t_2, \ldots, t_j, \ldots, t_m, REL, P)$ corresponding to the event to be executed; wherein $H_1$ is the first feature identifier, $H_1 = 1$ or $0$, $H_1 = 1$ indicates that the event type of the event to be executed is the first target type, and $H_1 = 0$ indicates that it is not the first target type; $H_2$ is the second feature identifier, $H_2 = 1$ or $0$, $H_2 = 1$ indicates that the event type of the event to be executed is the second target type, and $H_2 = 0$ indicates that it is not the second target type; $REL$ is the execution identifier corresponding to the associated event of the event to be executed, $REL = 1$ or $0$, $REL = 1$ indicates that the associated event has been executed at the current time $time_{now}$, and $REL = 0$ indicates that the associated event has not been executed at $time_{now}$; and $P$ is an influence coefficient representing the degree to which the historical data corresponding to the initiator of the event to be executed influences the determination of the target category identifier.
S500, based on a classification model, determining, among a plurality of candidate category identifiers, the candidate category identifier corresponding to $F$ as the target category identifier corresponding to the event to be executed.
According to another aspect of the present invention, there is also provided a non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, the at least one instruction or the at least one program being loaded and executed by a processor to implement the method for determining the target class identification described above.
According to another aspect of the present invention, there is also provided an electronic device comprising a processor and the above-described non-transitory computer-readable storage medium.
The invention has at least the following beneficial effects:
In the invention, $t_j$ is first determined from the target object identifier group $A$ and the file content data group $B$ corresponding to the event to be executed, $H_1$ and $H_2$ are determined according to the event type of the event to be executed, $REL$ is determined according to the execution status of the associated event of the event to be executed, and $P$ is determined according to the historical data corresponding to the initiator of the event to be executed; the target category identifier is then determined from a plurality of candidate category identifiers according to $F$, which is obtained from $t_j$, $H_1$, $H_2$, $REL$ and $P$.
In the related art, the total of the preset scores corresponding to the $n$ object identifiers is first determined according to the target object identifier group $A$, and the target category identifier is then determined according to that total score. Because the preset scores cannot be updated fully in real time, the preset score corresponding to each object identifier has low accuracy, and so does the target category identifier determined for the event to be executed. In the invention, by contrast, $t_j$ is determined according to whether the file content data of each file includes any object identifier corresponding to the event to be executed; $t_j$ is therefore determined from the latest file content data, and no preset score needs to be adjusted according to the file content data. As a result, $t_j$ is more accurate, the target category identifier determined from the $F$ obtained from $t_j$ is more accurate, and the accuracy of determining the target category identifier corresponding to the event to be executed is improved.
In addition, whereas the related art considers only the preset scores corresponding to the object identifiers when determining the target category identifier, $F$ in the invention also includes $H_1$, $H_2$, $REL$ and $P$. The event type of the event to be executed, its associated event, and the historical data corresponding to its initiator are thus also taken into account when determining the target category identifier, so that the characteristics of the event to be executed are more distinct and the accuracy of determining the target category identifier is further improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a method for determining a target category identifier according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
The embodiment of the invention provides a method for determining target category identification, wherein the method can be completed by any one or any combination of the following: terminals, servers, and other devices with processing capabilities, which are not limited in this embodiment of the present invention.
The method for determining the target class identification will be described with reference to a flowchart of the method for determining the target class identification shown in fig. 1.
The method comprises the following steps:
S100, obtaining a target object identifier group $A = (a_1, a_2, \ldots, a_i, \ldots, a_n)$, $i = 1, 2, \ldots, n$, corresponding to the event to be executed.
Wherein $a_i$ is the $i$-th object identifier corresponding to the event to be executed, and $n$ is the number of object identifiers corresponding to the event to be executed.
Specifically, the event to be executed is an air freight flight corresponding to an air waybill, and the take-off time of that flight is after the current time $time_{now}$. The target objects are goods, each object identifier is the name of the corresponding target object, and the object identifiers corresponding to the event to be executed are those listed on the waybill corresponding to the event.
S200, obtaining a file content data group $B = (b_1, b_2, \ldots, b_j, \ldots, b_m)$, $j = 1, 2, \ldots, m$.
Wherein $b_j$ is the file content data of the $j$-th file, $m$ is the number of preset files, and $b_j$ includes at least one candidate identifier.
For example, $m = 3$: the file corresponding to $b_1$ is the "list of goods difficult for X-ray machines to identify", the file corresponding to $b_2$ is the "suspected dangerous goods list", and the file corresponding to $b_3$ is the "hidden dangerous goods list". $b_1$ includes the names of several goods that are difficult for X-ray machines to identify, and each such item is a candidate; $b_2$ includes the names of several suspected dangerous goods, and each such item is a candidate; $b_3$ includes the names of several hidden dangerous goods, and each such item is a candidate. A candidate identifier is the name of the corresponding candidate.
S300, obtaining, according to $A$ and $B$, a target feature identifier group $T = (t_1, t_2, \ldots, t_j, \ldots, t_m)$.
Wherein $t_j$ is the $j$-th target feature identifier in $T$, $t_j = 1$ or $0$, $t_j = 1$ indicates that at least one of $a_1, a_2, \ldots, a_i, \ldots, a_n$ belongs to $b_j$, and $t_j = 0$ indicates that no $a_i$ belongs to $b_j$.
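Steps S100 to S300 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `target_feature_group` and the cargo names are hypothetical, and exact string matching between waybill names and list entries is an assumption, since the text does not say how names are compared.

```python
def target_feature_group(A, B):
    """Return T = (t_1, ..., t_m).

    t_j is 1 if at least one object identifier in A appears in the
    file content data b_j, and 0 otherwise.
    """
    identifiers = set(A)
    return [1 if identifiers & set(b_j) else 0 for b_j in B]


# Hypothetical waybill contents (A) and three preset lists (B, m = 3).
A = ["lithium battery", "cotton shirts"]
B = [
    ["dense metal parts", "lead-lined case"],  # b_1: hard for X-ray machines
    ["lithium battery", "aerosol can"],        # b_2: suspected dangerous goods
    ["magnet toy", "power bank"],              # b_3: hidden dangerous goods
]

T = target_feature_group(A, B)
print(T)  # [0, 1, 0]
```

Because each $t_j$ is recomputed from the current file contents, updating a list file changes $T$ without any score table having to be maintained.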
S400, obtaining a feature vector $F = (H_1, H_2, t_1, t_2, \ldots, t_j, \ldots, t_m, REL, P)$ corresponding to the event to be executed.
Wherein $H_1$ is the first feature identifier, $H_1 = 1$ or $0$, $H_1 = 1$ indicates that the event type of the event to be executed is the first target type, and $H_1 = 0$ indicates that it is not the first target type. $H_2$ is the second feature identifier, $H_2 = 1$ or $0$, $H_2 = 1$ indicates that the event type of the event to be executed is the second target type, and $H_2 = 0$ indicates that it is not the second target type. $REL$ is the execution identifier corresponding to the associated event of the event to be executed, $REL = 1$ or $0$, $REL = 1$ indicates that the associated event has been executed at the current time $time_{now}$, and $REL = 0$ indicates that the associated event has not been executed at $time_{now}$. $P$ is an influence coefficient representing the degree to which the historical data corresponding to the initiator of the event to be executed influences the determination of the target category identifier.
Specifically, the first target type is a type corresponding to the guard flight, and correspondingly, the event type of the event to be executed is the first target type and is used for indicating that the event to be executed is the guard flight, and the event type of the event to be executed is not the first target type and is used for indicating that the event to be executed is not the guard flight.
In a specific embodiment, the second target type is the type corresponding to an important flight; accordingly, the event type being the second target type indicates that the event to be executed is an important flight, and the event type not being the second target type indicates that it is not an important flight. In another specific embodiment, the second target type is the type corresponding to a focused-on route; accordingly, the event type being the second target type indicates that the route corresponding to the event to be executed is a focused-on route, and the event type not being the second target type indicates that the route is not a focused-on route.
The associated event is a differential record. In a specific embodiment, $REL$ is determined as follows: determine whether, at $time_{now}$, a differential record product serial number exists on the waybill corresponding to the event to be executed; if so, $REL = 1$; otherwise, $REL = 0$.
The initiator of the event to be executed is the agent corresponding to the event to be executed; in a specific embodiment, P is the credit of the agent corresponding to the event to be executed, that is, the credit of the shipper corresponding to the air freight bill.
Optionally, $P$ is determined according to the agent's historical unpacking rate $M_1$, historical return rate $M_2$, historical handover rate $M_3$, number of waybills for air freight over the past year $M_4$, quantity of cargo for air freight over the past year $M_5$, and/or weight of cargo for air freight over the past year $M_6$.
For example, $P = \left(M_1 + M_2 + M_3 + M_4 / M_4^{\max} + M_5 / M_5^{\max} + M_6 / M_6^{\max}\right) \times 100 / 6$; wherein the maximum number of waybills is $M_4^{\max} = \max(M_4^{1}, M_4^{2}, \ldots, M_4^{poi}, \ldots, M_4^{sum})$, $poi = 1, 2, \ldots, sum$, $\max()$ is a preset maximum value determination function, $M_4^{poi}$ is the number of waybills for air freight over the past year of the $poi$-th candidate agent, $sum$ is the number of candidate agents, and the agent corresponding to the event to be executed is one of the $sum$ candidate agents; the maximum cargo quantity is $M_5^{\max} = \max(M_5^{1}, M_5^{2}, \ldots, M_5^{poi}, \ldots, M_5^{sum})$, where $M_5^{poi}$ is the quantity of cargo for air freight over the past year of the $poi$-th candidate agent; and the maximum cargo weight is $M_6^{\max} = \max(M_6^{1}, M_6^{2}, \ldots, M_6^{poi}, \ldots, M_6^{sum})$, where $M_6^{poi}$ is the weight of cargo for air freight over the past year of the $poi$-th candidate agent.
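The example formula for $P$ and the assembly of $F$ in step S400 can be illustrated as follows. This is a sketch under stated assumptions: the agent names and figures are invented, the three rates $M_1$ to $M_3$ are assumed to be fractions between 0 and 1 (the text does not give their scale), and the helper names `influence_coefficient` and `feature_vector` are hypothetical.

```python
def influence_coefficient(m1, m2, m3, m4, m5, m6, m4_max, m5_max, m6_max):
    """P = (M1 + M2 + M3 + M4/M4max + M5/M5max + M6/M6max) * 100 / 6."""
    return (m1 + m2 + m3 + m4 / m4_max + m5 / m5_max + m6 / m6_max) * 100 / 6


def feature_vector(h1, h2, T, rel, p):
    """Assemble F = (H1, H2, t_1, ..., t_m, REL, P) as a flat list."""
    return [h1, h2, *T, rel, p]


# Hypothetical candidate agents: (waybills, cargo quantity, cargo weight)
# for air freight over the past year.
candidate_agents = {
    "agent_a": (50, 1000, 500.0),
    "agent_b": (200, 4000, 2000.0),
}

# Maxima over all candidate agents, as in the definitions of the M^max terms.
m4_max = max(a[0] for a in candidate_agents.values())
m5_max = max(a[1] for a in candidate_agents.values())
m6_max = max(a[2] for a in candidate_agents.values())

# The agent of the event to be executed, with assumed historical rates.
m4, m5, m6 = candidate_agents["agent_a"]
P = influence_coefficient(0.10, 0.05, 0.02, m4, m5, m6, m4_max, m5_max, m6_max)

# H1 = 0, H2 = 1, T from step S300 with m = 3, REL = 1.
F = feature_vector(0, 1, [0, 1, 0], 1, P)
print(len(F))  # 7, i.e. m + 4
```

Dividing $M_4$, $M_5$ and $M_6$ by their maxima keeps each of the six terms on a comparable scale before averaging.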
S500, based on a classification model, determining, among a plurality of candidate category identifiers, the candidate category identifier corresponding to $F$ as the target category identifier corresponding to the event to be executed.
Specifically, the classification model may be a random forest model or a GBDT (Gradient Boosting Decision Tree) model, which is not limited in the embodiment of the present invention.
Optionally, the number of candidate category identifiers is 5, and the 5 candidate category identifiers can be identifiers corresponding to a low risk category, a lower risk category, a common category, a strict control category and a high risk category respectively.
In addition, each candidate category identifier is provided with a corresponding cargo inspection method, after the target category identifier corresponding to the event to be executed is determined, the cargo inspection method corresponding to the event to be executed can be determined, so that a simpler cargo inspection method can be matched with the event to be executed with higher safety, and a more complex cargo inspection method can be matched with the event to be executed with lower safety, and the cargo inspection efficiency can be improved while the safety of the event to be executed is improved.
It can be seen that, in the invention, $t_j$ is first determined from the target object identifier group $A$ and the file content data group $B$ corresponding to the event to be executed, $H_1$ and $H_2$ are determined according to the event type of the event to be executed, $REL$ is determined according to the execution status of the associated event of the event to be executed, and $P$ is determined according to the historical data corresponding to the initiator of the event to be executed; the target category identifier is then determined from a plurality of candidate category identifiers according to $F$, which is obtained from $t_j$, $H_1$, $H_2$, $REL$ and $P$.
In the related art, the total of the preset scores corresponding to the $n$ object identifiers is first determined according to the target object identifier group $A$, and the target category identifier is then determined according to that total score. Because the preset scores cannot be updated fully in real time, the preset score corresponding to each object identifier has low accuracy, and so does the target category identifier determined for the event to be executed. In the invention, by contrast, $t_j$ is determined according to whether the file content data of each file includes any object identifier corresponding to the event to be executed; $t_j$ is therefore determined from the latest file content data, and no preset score needs to be adjusted according to the file content data. As a result, $t_j$ is more accurate, the target category identifier determined from the $F$ obtained from $t_j$ is more accurate, and the accuracy of determining the target category identifier corresponding to the event to be executed is improved.
In addition, whereas the related art considers only the preset scores corresponding to the object identifiers when determining the target category identifier, $F$ in the invention also includes $H_1$, $H_2$, $REL$ and $P$. The event type of the event to be executed, its associated event, and the historical data corresponding to its initiator are thus also taken into account when determining the target category identifier, so that the characteristics of the event to be executed are more distinct and the accuracy of determining the target category identifier is further improved.
Optionally, the classification model is obtained by the following method:
S501, obtaining a training sample set $D = (d_1, d_2, \ldots, d_x, \ldots, d_y)$, $x = 1, 2, \ldots, y$.
Wherein $d_x$ is the $x$-th training sample in $D$; $y$ is the number of training samples in $D$; $d_x = (d_x^{1}, d_x^{2}, \ldots, d_x^{r}, \ldots, d_x^{s}, N_x)$, $r = 1, 2, \ldots, s$; $d_x^{r}$ is the parameter of the $r$-th target attribute information corresponding to the $x$-th first executed event; $s$ is the number of target attribute information, $s = m + 4$; $N_x$ is the candidate category identifier corresponding to the $x$-th first executed event. For the event to be executed, $H_1$ is the parameter of the 1st target attribute information, $H_2$ is the parameter of the 2nd target attribute information, $t_j$ is the parameter of the $(j+2)$-th target attribute information, $REL$ is the parameter of the $(m+3)$-th target attribute information, and $P$ is the parameter of the $(m+4)$-th target attribute information.
Specifically, the first executed event is an air freight flight corresponding to an air waybill, and the take-off time of the flight corresponding to the first executed event is before $time_{now}$ or at $time_{now}$. The 1st target attribute information is the event type corresponding to the first target type, the 2nd target attribute information is the event type corresponding to the second target type, the $(j+2)$-th target attribute information is whether the $j$-th file contains a target object identifier, the $(m+3)$-th target attribute information is the execution status of the associated event, and the $(m+4)$-th target attribute information is the credit of the agent. $D$ can be obtained by performing sample balancing based on data resampling and a cost-sensitive matrix.
S502, randomly selecting $L$ training samples from $D$, $q$ times, and taking each training sample selected from $D$ in the $k$-th selection as a target training sample in the $k$-th target training sample group, to obtain a target training sample list $SAM = (sam_1, sam_2, \ldots, sam_k, \ldots, sam_q)$, $sam_k = (sam_k^{1}, sam_k^{2}, \ldots, sam_k^{p}, \ldots, sam_k^{L})$, $k = 1, 2, \ldots, q$, $p = 1, 2, \ldots, L$.
Wherein $sam_k$ is the $k$-th target training sample group in $SAM$, and $q$ is the number of target training sample groups in $SAM$; $sam_k^{p}$ is the $p$-th target training sample in $sam_k$, and $L$ is the number of target training samples in $sam_k$, $L < y$.
If $L \times q > y$, at least some of the target training sample groups share identical training samples. If $L \times q \le y$, either the intersection of any two target training sample groups is empty, or at least some of the target training sample groups share identical training samples.
For example, $y = 9000$ and $L = 1000$, and 1000 training samples are randomly selected from the 9000 training samples 10 times, giving $SAM = (sam_1, sam_2, \ldots, sam_k, \ldots, sam_{10})$, $sam_k = (sam_k^{1}, sam_k^{2}, \ldots, sam_k^{p}, \ldots, sam_k^{1000})$. In this case $L \times q = 10000 > y$, so at least some of the target training sample groups share identical training samples.
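The sampling in step S502 can be sketched with Python's standard `random` module. The function name `draw_sample_groups` and the `seed` parameter are hypothetical, and drawing without replacement inside each group is an assumption; the text only says that $L$ samples are selected at random and that $L < y$.

```python
import random


def draw_sample_groups(D, L, q, seed=None):
    """Draw q groups of L training samples each from D.

    Samples are unique within a group (assumption) but may repeat
    across groups, because every group is drawn from the full set D.
    """
    rng = random.Random(seed)
    return [rng.sample(D, L) for _ in range(q)]


# Stand-in training set: integers play the role of the samples d_1..d_y.
D = list(range(9000))            # y = 9000
SAM = draw_sample_groups(D, L=1000, q=10, seed=0)

total_drawn = sum(len(group) for group in SAM)   # L * q = 10000
distinct = len(set().union(*SAM))                # at most y = 9000
print(total_drawn > distinct)  # True: some sample is shared between groups
```

With $L \times q = 10000$ draws from only 9000 samples, the overlap follows from the pigeonhole principle regardless of the seed.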
S503, performing decision tree generation processing on $sam_k$ to obtain the decision tree corresponding to $sam_k$.
S504, constructing a classification model based on a plurality of decision trees.
Specifically, the classification model is a random forest model.
Optionally, the decision tree generation process includes the following steps:
S510, taking the target training sample group subjected to decision tree generation processing as a current group.
S520, generating a root node of a decision tree corresponding to the current group, and taking all target training samples in the current group as samples corresponding to the root node.
S530, performing child node generation processing on the root node.
The child node generation process includes:
S531, taking the node for generating the child node as the current node.
S532, determining whether the current node meets the recursion stop condition of the decision tree corresponding to the current group; if yes, go to step S533; otherwise, step S534 is entered.
S533, determining the current node as a leaf node, marking the leaf node by the candidate category identification with the largest number in all the candidate category identifications in all the samples corresponding to the current node, and proceeding to step S539.
For example, after the current node is determined to be a leaf node, the number of samples corresponding to the current node is 5: the candidate category identifiers in the 1st, 3rd and 5th samples are all the identifier corresponding to the low-risk category, the candidate category identifier in the 2nd sample is the identifier corresponding to the strict control category, and the candidate category identifier in the 4th sample is the identifier corresponding to the common category. The leaf node is therefore marked with the identifier corresponding to the low-risk category.
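The leaf-marking rule of step S533 is a majority vote over the candidate category identifiers of the samples at the node. A minimal sketch, assuming each sample is stored as a `(features, label)` pair (a hypothetical layout) and using placeholder feature tuples:

```python
from collections import Counter


def leaf_label(samples):
    """Return the candidate category identifier that occurs most often.

    Tie-breaking is not specified in the text; Counter.most_common keeps
    the first-encountered label among equally frequent ones.
    """
    counts = Counter(label for _, label in samples)
    return counts.most_common(1)[0][0]


# The five samples of the example above (feature tuples are placeholders).
node_samples = [
    ((), "low risk"),
    ((), "strict control"),
    ((), "low risk"),
    ((), "common"),
    ((), "low risk"),
]
print(leaf_label(node_samples))  # low risk
```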
S534, calculating, according to all samples corresponding to the current node, the Gini coefficient corresponding to each target attribute information, to obtain a Gini coefficient group $GINI = (gini_1, gini_2, \ldots, gini_r, \ldots, gini_s)$.
Wherein $gini_r$ is the Gini coefficient corresponding to the $r$-th target attribute information, calculated based on all samples corresponding to the current node.
$gini_r = 1 - \sum_{var=1}^{f(var)} \left( |c_{var}^{r}| / |C| \right)^2$; wherein $|C|$ is the number of all samples corresponding to the current node, $f(var)$ is the number of remaining parameters obtained after de-duplicating the parameters corresponding to the $r$-th target attribute information in all samples corresponding to the current node, and $|c_{var}^{r}|$ is the number of parameters, among the parameters corresponding to the $r$-th target attribute information in all samples corresponding to the current node, that equal the $var$-th remaining parameter.
For example, $|C| = 5$ and the parameters corresponding to the $r$-th target attribute information in all samples corresponding to the current node are 5, 10, 11, 10 and 5. De-duplicating them leaves 3 remaining parameters, namely 5, 10 and 11, so $f(var) = 3$. Since the original parameters contain two 5s, two 10s and one 11, $|c_1^{r}| = 2$, $|c_2^{r}| = 2$ and $|c_3^{r}| = 1$.
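The formula for $gini_r$ can be checked numerically. The sketch below uses exact fractions to avoid rounding; the function name `gini_for_attribute` is hypothetical, and the input is a set of five parameter values with counts 2, 2 and 1, matching the counts in the example above.

```python
from collections import Counter
from fractions import Fraction


def gini_for_attribute(values):
    """gini_r = 1 - sum over distinct values of (count / |C|)^2.

    `values` holds the parameter of one target attribute for every
    sample at the current node, so len(values) is |C|.
    """
    total = len(values)
    counts = Counter(values)  # de-duplicated values with their counts
    return 1 - sum(Fraction(c, total) ** 2 for c in counts.values())


# Two 5s, two 10s and one 11: 1 - (4/25 + 4/25 + 1/25) = 16/25.
print(gini_for_attribute([5, 10, 11, 10, 5]))  # 16/25
```

Note that, as defined in the text, this coefficient is computed over the values of the attribute itself, so an attribute whose value is identical in every sample has a coefficient of 0.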
S535, obtaining, according to $GINI$, the minimum Gini coefficient corresponding to the current node, $gini_{min} = \min(gini_1, gini_2, \ldots, gini_r, \ldots, gini_s)$; wherein $\min()$ is a preset minimum value determination function.
S536, according to the target attribute information corresponding to $gini_{min}$, determining one part of all samples corresponding to the current node as first samples and the other part as second samples.
And S537, in the decision tree corresponding to the current group, generating a first child node and a second child node of the current node, taking each first sample as a sample corresponding to the first child node, and taking each second sample as a sample corresponding to the second child node.
S538, performing the child node generation processing on the first child node, and performing the child node generation processing on the second child node.
S539, determining whether each node in the decision tree corresponding to the current group is a leaf node or a node which generates a corresponding first child node and a second child node; if yes, outputting a decision tree corresponding to the current group.
Therefore, compared with generating the decision tree with a multi-way tree generation algorithm such as C4.5 or ID3, the present invention generates the decision tree by determining the minimum Gini coefficient, and the generated decision tree is a binary tree, which simplifies the tree structure and improves the efficiency of decision tree generation. In addition, because the samples are divided using the target attribute information corresponding to the minimum Gini coefficient, decision tree generation in the present invention requires no logarithmic calculation, unlike generation based on C4.5 or ID3, which saves computing resources and further improves the efficiency of decision tree generation.
Optionally, step S536 includes the steps of:
S5361, take the target attribute information corresponding to $gini_{min}$ as the segmentation attribute information corresponding to the current node.
S5362, obtain the threshold set $E = (e_1, e_2, ..., e_u, ..., e_v)$, $u = 1, 2, ..., v$, corresponding to the segmentation attribute information corresponding to the current node.
Where $e_u$ is the u-th threshold corresponding to the segmentation attribute information corresponding to the current node, and $e_1 > e_2 > ... > e_u > ... > e_v$; v is the number of preset thresholds corresponding to the segmentation attribute information corresponding to the current node. The number of thresholds corresponding to at least part of the target attribute information is greater than 1.
S5363, determine whether the current node has a corresponding target node; if yes, determine the number num of target nodes, among all the target nodes corresponding to the current node, whose segmentation attribute information is the same as the segmentation attribute information corresponding to the current node, and proceed to step S5364; otherwise, num=0, and proceed to step S5365.
When the current node is not the root node, each node between the current node and the root node in the decision tree corresponding to the current group, together with the root node itself, is a target node corresponding to the current node; when the current node is the root node, the decision tree corresponding to the current group contains no target node corresponding to the current node.
For example, if PAPO1 is the root node, one child node of PAPO1 is PAPO2, one child node of PAPO2 is PAPO3, one child node of PAPO3 is PAPO4, and PAPO4 is the current node, then the target nodes corresponding to PAPO4 are PAPO1, PAPO2 and PAPO3.
S5364, determining whether (num+1) is less than or equal to v; if yes, go to step S5365; otherwise, step S533 is entered.
S5365, for each sample corresponding to the current node, determine whether the target parameter in the sample is less than or equal to $e_{num+1}$; if yes, take the sample as a first sample; otherwise, take the sample as a second sample; the target parameter is the parameter in the sample that corresponds to the segmentation attribute information corresponding to the current node.
S5366, determine each first sample as a sample corresponding to the first child node, and determine each second sample as a sample corresponding to the second child node.
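Steps S5363 to S5366 can be sketched as below. This is a hedged illustration only: the function name `split_samples`, the tuple layout of a sample, and the representation of the target nodes as a list of their segmentation attribute indices are all assumptions made for the example.

```python
def split_samples(samples, attr_index, thresholds, ancestor_attrs):
    """Split the current node's samples on one attribute (sketch of S5363-S5366).

    samples        -- rows (d_1, ..., d_s, N) of the current node
    attr_index     -- 0-based index of the segmentation attribute
    thresholds     -- E = (e_1 > e_2 > ... > e_v) for that attribute
    ancestor_attrs -- segmentation attribute index of every target node
                      (all nodes from the parent up to the root); empty for the root
    Returns (first_samples, second_samples), or None when (num + 1) > v,
    in which case the node is to be treated as a leaf (step S533).
    """
    # num: how many target nodes already split on the same attribute
    num = sum(1 for a in ancestor_attrs if a == attr_index)
    if num + 1 > len(thresholds):
        return None
    e = thresholds[num]  # e_{num+1} in the 1-based notation of the text
    first = [row for row in samples if row[attr_index] <= e]
    second = [row for row in samples if row[attr_index] > e]
    return first, second


rows = [(5, "a"), (10, "b"), (11, "a")]
at_root = split_samples(rows, 0, [10, 6], [])
one_level_down = split_samples(rows, 0, [10, 6], [0])
exhausted = split_samples(rows, 0, [10, 6], [0, 0])
```

Each time the same attribute is reused further down a branch, the next (smaller) threshold is taken, which is what allows an attribute with several thresholds to appear at several depths.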
Therefore, in the present invention at least part of the target attribute information can have a plurality of thresholds: a node in the decision tree can generate child nodes based on a piece of target attribute information and one of its thresholds, and those child nodes can in turn generate child nodes based on another threshold of the same target attribute information. Compared with a method in which each piece of target attribute information has only one threshold, a decision tree of greater depth can be generated from a limited number of pieces of target attribute information, so that the classification model constructed from the decision trees classifies better, which improves the accuracy of determining the target category identifier corresponding to the event to be executed.
Optionally, the recursive stopping condition is that candidate category identifiers in each sample corresponding to the current node are the same, or the depth of the current node in the decision tree corresponding to the current group reaches the sum of the numbers of thresholds corresponding to all the target attribute information, or the depth of the current node in the decision tree corresponding to the current group reaches the preset depth, or the number of samples corresponding to the current node is 0.
Specifically, the sum of the numbers of thresholds corresponding to all the target attribute information is obtained by summing the number of thresholds corresponding to each piece of target attribute information. The preset depth is greater than or equal to 10 and less than or equal to 100.
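The recursion stop condition can be sketched as a single predicate. The name `should_stop`, the argument layout and the convention that `depth` is supplied by the caller are assumptions; the text does not state whether the root has depth 0 or 1.

```python
def should_stop(samples, depth, threshold_counts, preset_depth=10):
    """Return True if the current node should become a leaf.

    samples          -- rows (d_1, ..., d_s, N) of the current node
    depth            -- depth of the current node (convention chosen by the caller)
    threshold_counts -- number of thresholds of each target attribute
    preset_depth     -- preset depth; the text gives the range 10 to 100
    """
    if not samples:                       # no samples at this node
        return True
    labels = {row[-1] for row in samples}
    if len(labels) == 1:                  # all candidate category identifiers equal
        return True
    if depth >= sum(threshold_counts):    # depth reached the total threshold count
        return True
    return depth >= preset_depth          # depth reached the preset depth


mixed = [(1, "a"), (2, "b")]
stop_empty = should_stop([], 0, [2, 1])
stop_pure = should_stop([(1, "a"), (2, "a")], 0, [2, 1])
stop_depth = should_stop(mixed, 3, [2, 1])
keep_going = should_stop(mixed, 1, [2, 1])
```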
In a specific embodiment, the target attribute information corresponding to $H_1$, $H_2$, REL and P is screened attribute information.
On this basis, the screened attribute information is determined by the following method:
S610, obtain a parameter list $W = (w_1, w_2, ..., w_{h1}, ..., w_{Q1})$, $w_{h1} = (w_{h1}^{1}, w_{h1}^{2}, ..., w_{h1}^{h2}, ..., w_{h1}^{Q2})$, $h1 = 1, 2, ..., Q1$, $h2 = 1, 2, ..., Q2$.
Where $w_{h1}$ is the parameter group corresponding to the h1-th candidate attribute information, and Q1 is the number of pieces of candidate attribute information; $w_{h1}^{h2}$ is the h2-th parameter in $w_{h1}$, Q2 is the number of second executed events, and $Q1 \geq 4$; each piece of screened attribute information is one of the Q1 pieces of candidate attribute information, and the target attribute information corresponding to $t_j$ is different from each piece of candidate attribute information.
Specifically, the second executed event is an air freight flight corresponding to an air waybill, and the take-off time of the flight corresponding to the second executed event is before or at $time_{now}$. In addition to the screened attribute information, the candidate attribute information may be a waybill number, a waybill source, an agent name, an agent code, a flight number, a flight date, or the like.
S620, if all the parameters in $w_{h1}$ are the same, or any two parameters in $w_{h1}$ are different, delete $w_{h1}$ from W to obtain the post-deletion parameter list $W' = (w'_1, w'_2, w'_3, w'_4)$.
Where $w'_z$ is the z-th post-deletion parameter group in $W'$, $z = 1, 2, 3, 4$; $s \leq Q1$.
S630, take the candidate attribute information corresponding to $w'_z$ as screened attribute information to obtain the screened attribute information set $ATT = (att_1, att_2, att_3, att_4)$.
Where $att_z$ is the z-th screened attribute information in ATT.
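The screening in S610 to S630 can be sketched as follows, assuming the parameter list W is held as a dictionary from a candidate attribute name to its list of parameters (one per second executed event). The function name and the example attribute names are hypothetical, and the parameters are assumed to be hashable values.

```python
def screen_constant_or_unique(W):
    """Drop candidate attributes whose parameters are all equal or all distinct.

    W maps a candidate attribute name to its parameter group w_h1.
    Returns the remaining groups, i.e. the post-deletion list.
    """
    kept = {}
    for name, params in W.items():
        distinct = len(set(params))
        all_same = distinct == 1
        all_different = distinct == len(params)  # any two parameters differ
        if all_same or all_different:
            continue  # carries no usable signal for classification
        kept[name] = params
    return kept


W = {
    "waybill_no": [1001, 1002, 1003, 1004],  # all different: removed
    "source": ["x", "x", "x", "x"],          # all the same: removed
    "flag": [0, 1, 1, 0],                    # mixed: kept
}
ATT = list(screen_constant_or_unique(W))
```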
Therefore, the present invention reduces the possibility that candidate attribute information whose parameters are all the same, or whose parameters are all different from one another, is determined as screened attribute information. This makes the parameters corresponding to the screened attribute information in F more strongly associated with the target category identifier, which improves the accuracy of determining the target category identifier corresponding to the event to be executed.
In another specific embodiment, the target attribute information corresponding to $H_1$, $H_2$, REL and P is screened attribute information.
On this basis, the screened attribute information is determined by the following method:
S640, obtain a parameter list $W = (w_1, w_2, ..., w_{h1}, ..., w_{Q1})$, $w_{h1} = (w_{h1}^{1}, w_{h1}^{2}, ..., w_{h1}^{h2}, ..., w_{h1}^{Q2})$, $h1 = 1, 2, ..., Q1$, $h2 = 1, 2, ..., Q2$.
Where $w_{h1}$ is the parameter group corresponding to the h1-th candidate attribute information, and Q1 is the number of pieces of candidate attribute information; $w_{h1}^{h2}$ is the h2-th parameter in $w_{h1}$, Q2 is the number of second executed events, and $Q1 \geq 4$; each piece of screened attribute information is one of the Q1 pieces of candidate attribute information, and the target attribute information corresponding to $t_j$ is different from each piece of candidate attribute information.
Specifically, at least some of the first executed events are second executed events, which is not limited in the embodiment of the present invention.
S650, perform parameter deduplication on $w_{h1}$ to obtain the deduplicated parameter group $w''_{h1}$ corresponding to $w_{h1}$, so as to obtain the deduplicated parameter list $W'' = (w''_1, w''_2, ..., w''_{h1}, ..., w''_{Q1})$, $w''_{h1} = ({w''}_{h1}^{1}, {w''}_{h1}^{2}, ..., {w''}_{h1}^{h3}, ..., {w''}_{h1}^{Q3(h1)})$, $h3 = 1, 2, ..., Q3(h1)$.
Where ${w''}_{h1}^{h3}$ is the h3-th deduplicated parameter in $w''_{h1}$, $Q3(h1)$ is the number of deduplicated parameters in $w''_{h1}$, and $Q3(h1) \leq Q2$; any two deduplicated parameters in $w''_{h1}$ are different.
S670, if $Q3(h1)/Q2$ is smaller than a first preset value $pre_1$ or greater than a second preset value $pre_2$, delete $w_{h1}$ from W to obtain the post-deletion parameter list $W' = (w'_1, w'_2, w'_3, w'_4)$.
Where $pre_2 > pre_1$, and $w'_z$ is the z-th post-deletion parameter group in $W'$, $z = 1, 2, 3, 4$.
S680, take the candidate attribute information corresponding to $w'_z$ as screened attribute information to obtain the screened attribute information set $ATT = (att_1, att_2, att_3, att_4)$; where $att_z$ is the z-th screened attribute information.
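A sketch of the ratio-based screening in S640 to S680, under the same assumed dictionary layout for W. The function name and the concrete values chosen for $pre_1$ and $pre_2$ are hypothetical; the text only requires $pre_2 > pre_1$.

```python
def screen_by_distinct_ratio(W, pre_1, pre_2):
    """Keep attributes whose distinct-value ratio Q3(h1)/Q2 lies in [pre_1, pre_2].

    W maps a candidate attribute name to its non-empty parameter group w_h1.
    """
    if not pre_2 > pre_1:
        raise ValueError("pre_2 must be greater than pre_1")
    kept = {}
    for name, params in W.items():
        q3 = len(set(params))      # number of parameters after deduplication
        ratio = q3 / len(params)   # Q3(h1) / Q2
        if ratio < pre_1 or ratio > pre_2:
            continue               # nearly constant or nearly unique: removed
        kept[name] = params
    return kept


W = {
    "nearly_constant": [0] * 10,                    # ratio 0.1
    "nearly_unique": list(range(10)),               # ratio 1.0
    "balanced": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0],     # ratio 0.3
}
ATT = list(screen_by_distinct_ratio(W, pre_1=0.2, pre_2=0.8))
```

Unlike the exact all-same or all-different test of the previous embodiment, the ratio test also removes attributes that are only approximately constant or approximately unique.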
Therefore, the present invention reduces the possibility that candidate attribute information whose parameters are substantially all the same, or substantially all different from one another, is determined as screened attribute information. This makes the parameters corresponding to the screened attribute information in F more strongly associated with the target category identifier, which improves the accuracy of determining the target category identifier corresponding to the event to be executed.
In another specific embodiment, the target attribute information corresponding to $H_1$, $H_2$, REL and P is screened attribute information.
On this basis, the target attribute information is determined by the following method:
S710, obtain a parameter list $W = (w_1, w_2, ..., w_{h1}, ..., w_{Q1})$, $w_{h1} = (w_{h1}^{1}, w_{h1}^{2}, ..., w_{h1}^{h2}, ..., w_{h1}^{Q2})$, $h1 = 1, 2, ..., Q1$, $h2 = 1, 2, ..., Q2$.
Where $w_{h1}$ is the parameter group corresponding to the h1-th candidate attribute information, and Q1 is the number of pieces of candidate attribute information; $w_{h1}^{h2}$ is the h2-th parameter in $w_{h1}$, Q2 is the number of second executed events, and $Q1 \geq s$; each piece of screened attribute information is one of the Q1 pieces of candidate attribute information, and the target attribute information corresponding to $t_j$ is different from each piece of candidate attribute information.
S720, obtain the hash value $ash_{h1}^{h2}$ of $w_{h1}^{h2}$ to obtain the hash value list $ASH = (ash_1, ash_2, ..., ash_{h1}, ..., ash_{Q1})$, $ash_{h1} = (ash_{h1}^{1}, ash_{h1}^{2}, ..., ash_{h1}^{h2}, ..., ash_{h1}^{Q2})$.
Where $ash_{h1}$ is the h1-th hash value group in ASH.
S730, obtain the priority group $PRI = (pri_1, pri_2, ..., pri_{h1}, ..., pri_{Q1})$.
Where $pri_{h1}$ is the h1-th priority in PRI, $pri_{h1} = \left[ \sum_{h2=1}^{Q2} \left( ash_{h1}^{h2} - ash_{ave}^{h1} \right)^2 \right] / Q2$; $ash_{ave}^{h1}$ is a priority reference factor, $ash_{ave}^{h1} = \left[ \sum_{h2=1}^{Q2} ash_{h1}^{h2} \right] / Q2$.
S740, if $pri_{h1}$ is greater than a preset target priority $pri_0$, delete the corresponding $w_{h1}$ from W to obtain the post-deletion parameter list $W' = (w'_1, w'_2, w'_3, w'_4)$.
Where $w'_z$ is the z-th post-deletion parameter group in $W'$, $z = 1, 2, 3, 4$.
S750, take the candidate attribute information corresponding to $w'_z$ as screened attribute information to obtain the screened attribute information set $ATT = (att_1, att_2, att_3, att_4)$.
Where $att_z$ is the z-th screened attribute information.
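The priority in S730 is the population variance of the hash values of one parameter group. The sketch below assumes CRC32 over the string form of each parameter as the hash function, because the text does not name one; `stable_hash`, `priority` and `screen_by_priority` are hypothetical names.

```python
import zlib


def stable_hash(value):
    """Deterministic integer hash of a parameter (assumed choice: CRC32)."""
    return zlib.crc32(str(value).encode("utf-8"))


def priority(params):
    """pri_h1: mean squared deviation of the hash values from their mean."""
    hashes = [stable_hash(p) for p in params]
    ash_ave = sum(hashes) / len(hashes)  # priority reference factor
    return sum((h - ash_ave) ** 2 for h in hashes) / len(hashes)


def screen_by_priority(W, pri_0):
    """Delete every parameter group whose priority exceeds pri_0 (step S740)."""
    return {name: params for name, params in W.items() if priority(params) <= pri_0}


W = {"const": ["x", "x", "x"], "vary": ["a", "b", "c"]}
kept = list(screen_by_priority(W, pri_0=0.0))
```

A deterministic hash is used here instead of Python's built-in `hash`, whose values for strings change between interpreter runs.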
Therefore, compared with the preceding two embodiments, this embodiment reduces the possibility that the parameters of the screened attribute information fluctuate widely across different second executed events, and hence across different first executed events, so that the parameters corresponding to the screened attribute information in the training samples fluctuate less. This reduces the possibility that the parameters of the screened attribute information differ greatly between training samples corresponding to the same candidate category identifier, which in turn reduces the possibility that the target category identifier determined by the classification model constructed from the training samples is inaccurate, and improves the accuracy of determining the target category identifier corresponding to the event to be executed.
Optionally, W is a parameter list that has undergone data preprocessing, where the data preprocessing includes removing unique features, removing irrelevant features, converting feature formats, missing value analysis, outlier analysis and/or data normalization, which is not limited by the embodiment of the present invention.
Optionally, the voting mechanism corresponding to the classification model constructed based on the decision trees may be a simple voting mechanism, majority rule (the minority defers to the majority), threshold voting or a Bayesian voting mechanism; preferably, the voting mechanism is majority rule, and a soft voting mode is adopted.
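Soft voting is commonly implemented by averaging the per-category probabilities output by the individual trees and choosing the category with the highest average. The sketch below follows that common reading, which the text does not spell out; the function name `soft_vote` and the probability dictionaries are assumptions.

```python
def soft_vote(tree_probabilities):
    """Average per-category probabilities over all trees and pick the maximum.

    tree_probabilities -- non-empty list with one dict per decision tree,
                          mapping a candidate category identifier to a probability
    Returns (winning identifier, averaged probabilities).
    """
    totals = {}
    for probs in tree_probabilities:
        for label, p in probs.items():
            totals[label] = totals.get(label, 0.0) + p
    n = len(tree_probabilities)
    averaged = {label: total / n for label, total in totals.items()}
    winner = max(averaged, key=averaged.get)
    return winner, averaged


votes = [
    {"A": 0.9, "B": 0.1},
    {"A": 0.4, "B": 0.6},
    {"A": 0.4, "B": 0.6},
]
target_category, averaged = soft_vote(votes)
```

In this example two of the three trees individually favour "B", yet the averaged probabilities favour "A", which shows how soft voting can differ from counting one vote per tree.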
Optionally, after step S504, the effect of the classification model may be evaluated, specifically by means of model accuracy, a confusion matrix, a heat map, an F1 score, an error-rate iteration curve and/or index weight calculation.
Embodiments of the present invention also provide a non-transitory computer-readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing a method of the method embodiments, the at least one instruction or the at least one program being loaded and executed by a processor to implement the methods provided by the embodiments described above.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention described in the present specification when the program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the present disclosure is defined by the appended claims.

Claims (10)

1. A method for determining a target class identifier, the method comprising the steps of:
S100, obtaining a target object identifier group $A = (a_1, a_2, ..., a_i, ..., a_n)$, $i = 1, 2, ..., n$, corresponding to the event to be executed; wherein $a_i$ is the i-th object identifier corresponding to the event to be executed, and n is the number of object identifiers corresponding to the event to be executed;
S200, acquiring a file content data set $B = (b_1, b_2, ..., b_j, ..., b_m)$, $j = 1, 2, ..., m$; wherein $b_j$ is the file content data of the j-th file, and m is the number of preset files; $b_j$ includes at least one candidate identifier;
S300, according to A or B, obtaining a target feature identifier group $T = (t_1, t_2, ..., t_j, ..., t_m)$; wherein $t_j$ is the j-th target feature in T, $t_j = 1$ or 0, $t_j = 1$ indicates that any one of $a_1, a_2, ..., a_i, ..., a_n$ belongs to $b_j$, and $t_j = 0$ indicates that $a_i$ does not belong to $b_j$;
S400, obtaining a feature vector $F = (H_1, H_2, t_1, t_2, ..., t_j, ..., t_m, REL, P)$ corresponding to the event to be executed; wherein $H_1$ is a first feature identifier, $H_1 = 1$ or 0, $H_1 = 1$ indicates that the event type of the event to be executed is a first target type, and $H_1 = 0$ indicates that the event type of the event to be executed is not the first target type; $H_2$ is a second feature identifier, $H_2 = 1$ or 0, $H_2 = 1$ indicates that the event type of the event to be executed is a second target type, and $H_2 = 0$ indicates that the event type of the event to be executed is not the second target type; REL is an execution identifier corresponding to an associated event of the event to be executed, REL = 1 or 0, REL = 1 indicates that the associated event has been executed at the current time $time_{now}$, and REL = 0 indicates that the associated event has not been executed at $time_{now}$; P is an influence coefficient used to represent the degree of influence of historical data corresponding to the initiator of the event to be executed on the determination of the target category identifier;
S500, based on a classification model, determining, among a plurality of candidate category identifiers, the candidate category identifier corresponding to F as the target category identifier corresponding to the event to be executed.
2. The determination method according to claim 1, wherein the classification model is obtained by:
S501, obtaining a training sample set $D = (d_1, d_2, ..., d_x, ..., d_y)$, $x = 1, 2, ..., y$; wherein $d_x$ is the x-th training sample in D; y is the number of training samples in D; $d_x = (d_x^1, d_x^2, ..., d_x^r, ..., d_x^s, N_x)$, $r = 1, 2, ..., s$; $d_x^r$ is the parameter of the r-th target attribute information corresponding to the x-th first executed event; s is the number of pieces of target attribute information, $s = m + 4$; $N_x$ is the candidate category identifier corresponding to the x-th first executed event; $H_1$ is the parameter of the 1st target attribute information corresponding to the event to be executed, $H_2$ is the parameter of the 2nd target attribute information corresponding to the event to be executed, $t_j$ is the parameter of target attribute information corresponding to the event to be executed, REL is the parameter of the (m+1)-th target attribute information corresponding to the event to be executed, and P is the parameter of the (m+2)-th target attribute information corresponding to the event to be executed;
S502, performing q rounds of randomly selecting L training samples from D, and taking each training sample randomly selected from D in the k-th round as a target training sample in the k-th target training sample group, to obtain a target training sample list $SAM = (sam_1, sam_2, ..., sam_k, ..., sam_q)$, $sam_k = (sam_k^1, sam_k^2, ..., sam_k^p, ..., sam_k^L)$, $k = 1, 2, ..., q$, $p = 1, 2, ..., L$; wherein $sam_k$ is the k-th target training sample group in SAM, and q is the number of target training sample groups in SAM; $sam_k^p$ is the p-th target training sample in $sam_k$, L is the number of target training samples in $sam_k$, and $L < y$;
S503, performing decision tree generation processing on $sam_k$ to obtain a decision tree corresponding to $sam_k$;
S504, constructing the classification model based on a plurality of decision trees.
3. The determination method according to claim 2, wherein the decision tree generation process includes the steps of:
S510, taking the target training sample group on which the decision tree generation processing is performed as a current group;
S520, generating a root node of the decision tree corresponding to the current group, and taking all target training samples in the current group as samples corresponding to the root node;
S530, performing child node generation processing on the root node;
the child node generation processing includes:
S531, taking the node on which the child node generation processing is performed as a current node;
S532, determining whether the current node meets the recursion stop condition of the decision tree corresponding to the current group; if yes, proceeding to step S533; otherwise, proceeding to step S534;
S533, determining the current node as a leaf node, labeling the leaf node with the candidate category identifier that occurs most often among the candidate category identifiers in all samples corresponding to the current node, and proceeding to step S539;
S534, according to all samples corresponding to the current node, calculating the Gini coefficient corresponding to each piece of target attribute information to obtain a Gini coefficient group $GINI = (gini_1, gini_2, ..., gini_r, ..., gini_s)$; wherein $gini_r$ is the Gini coefficient corresponding to the r-th target attribute information calculated based on all samples corresponding to the current node;
S535, according to GINI, obtaining the minimum Gini coefficient corresponding to the current node, $gini_{min} = \min(gini_1, gini_2, ..., gini_r, ..., gini_s)$; wherein min() is a preset minimum-value determination function;
S536, according to the target attribute information corresponding to $gini_{min}$, determining one part of all samples corresponding to the current node as first samples and the other part as second samples;
S537, in the decision tree corresponding to the current group, generating a first child node and a second child node of the current node, taking each first sample as a sample corresponding to the first child node, and taking each second sample as a sample corresponding to the second child node;
S538, performing the child node generation processing on the first child node, and performing the child node generation processing on the second child node;
S539, determining whether each node in the decision tree corresponding to the current group is a leaf node or a node for which a corresponding first child node and second child node have been generated; if yes, outputting the decision tree corresponding to the current group.
4. A determination method according to claim 3, wherein said step S536 comprises the steps of:
S5361, taking the target attribute information corresponding to $gini_{min}$ as the segmentation attribute information corresponding to the current node;
S5362, obtaining a threshold set $E = (e_1, e_2, ..., e_u, ..., e_v)$, $u = 1, 2, ..., v$, corresponding to the segmentation attribute information corresponding to the current node; wherein $e_u$ is the u-th threshold corresponding to the segmentation attribute information corresponding to the current node, and $e_1 > e_2 > ... > e_u > ... > e_v$; v is the number of preset thresholds corresponding to the segmentation attribute information corresponding to the current node; the number of thresholds corresponding to at least part of the target attribute information is greater than 1;
S5363, determining whether the current node has a corresponding target node; if yes, determining the number num of target nodes, among all the target nodes corresponding to the current node, whose segmentation attribute information is the same as the segmentation attribute information corresponding to the current node, and proceeding to step S5364; otherwise, num=0, and proceeding to step S5365; when the current node is not the root node, each node between the current node and the root node in the decision tree corresponding to the current group, and the root node, are target nodes corresponding to the current node, and when the current node is the root node, no target node corresponding to the current node exists in the decision tree corresponding to the current group;
S5364, determining whether (num+1) is less than or equal to v; if yes, proceeding to step S5365; otherwise, proceeding to step S533;
S5365, for each sample corresponding to the current node, determining whether the target parameter in the sample is less than or equal to $e_{num+1}$; if yes, taking the sample as a first sample; otherwise, taking the sample as a second sample; the target parameter is the parameter in the corresponding sample that corresponds to the segmentation attribute information corresponding to the current node;
S5366, determining each first sample as a sample corresponding to the first child node, and determining each second sample as a sample corresponding to the second child node.
5. The method according to claim 4, wherein the recursive stopping condition is that candidate class identifiers in each sample corresponding to the current node are the same, or the depth of the current node in the decision tree corresponding to the current group reaches the sum of the numbers of thresholds corresponding to all the target attribute information, or the depth of the current node in the decision tree corresponding to the current group reaches a preset depth, or the number of samples corresponding to the current node is 0.
6. The determination method according to claim 2, wherein the target attribute information corresponding to $H_1$, $H_2$, REL and P is screened attribute information;
the screened attribute information is determined by the following method:
S610, obtaining a parameter list $W = (w_1, w_2, ..., w_{h1}, ..., w_{Q1})$, $w_{h1} = (w_{h1}^{1}, w_{h1}^{2}, ..., w_{h1}^{h2}, ..., w_{h1}^{Q2})$, $h1 = 1, 2, ..., Q1$, $h2 = 1, 2, ..., Q2$; wherein $w_{h1}$ is the parameter group corresponding to the h1-th candidate attribute information, and Q1 is the number of pieces of candidate attribute information; $w_{h1}^{h2}$ is the h2-th parameter in $w_{h1}$, Q2 is the number of second executed events, and $Q1 \geq 4$; each piece of screened attribute information is one of the Q1 pieces of candidate attribute information, and the target attribute information corresponding to $t_j$ is different from each piece of candidate attribute information;
S620, if all the parameters in $w_{h1}$ are the same, or any two parameters in $w_{h1}$ are different, deleting $w_{h1}$ from W to obtain a post-deletion parameter list $W' = (w'_1, w'_2, w'_3, w'_4)$; wherein $w'_z$ is the z-th post-deletion parameter group in $W'$, $z = 1, 2, 3, 4$; $s \leq Q1$;
S630, taking the candidate attribute information corresponding to $w'_z$ as screened attribute information to obtain a screened attribute information set $ATT = (att_1, att_2, att_3, att_4)$; wherein $att_z$ is the z-th screened attribute information in ATT.
7. The determination method according to claim 2, wherein the target attribute information corresponding to $H_1$, $H_2$, REL and P is screened attribute information;
the screened attribute information is determined by the following method:
S640, obtaining a parameter list $W = (w_1, w_2, ..., w_{h1}, ..., w_{Q1})$, $w_{h1} = (w_{h1}^{1}, w_{h1}^{2}, ..., w_{h1}^{h2}, ..., w_{h1}^{Q2})$, $h1 = 1, 2, ..., Q1$, $h2 = 1, 2, ..., Q2$; wherein $w_{h1}$ is the parameter group corresponding to the h1-th candidate attribute information, and Q1 is the number of pieces of candidate attribute information; $w_{h1}^{h2}$ is the h2-th parameter in $w_{h1}$, Q2 is the number of second executed events, and $Q1 \geq 4$; each piece of screened attribute information is one of the Q1 pieces of candidate attribute information, and the target attribute information corresponding to $t_j$ is different from each piece of candidate attribute information;
S650, performing parameter deduplication on $w_{h1}$ to obtain a deduplicated parameter group $w''_{h1}$ corresponding to $w_{h1}$, so as to obtain a deduplicated parameter list $W'' = (w''_1, w''_2, ..., w''_{h1}, ..., w''_{Q1})$, $w''_{h1} = ({w''}_{h1}^{1}, {w''}_{h1}^{2}, ..., {w''}_{h1}^{h3}, ..., {w''}_{h1}^{Q3(h1)})$, $h3 = 1, 2, ..., Q3(h1)$; wherein ${w''}_{h1}^{h3}$ is the h3-th deduplicated parameter in $w''_{h1}$, $Q3(h1)$ is the number of deduplicated parameters in $w''_{h1}$, and $Q3(h1) \leq Q2$; any two deduplicated parameters in $w''_{h1}$ are different;
S670, if $Q3(h1)/Q2$ is smaller than a first preset value $pre_1$ or greater than a second preset value $pre_2$, deleting $w_{h1}$ from W to obtain a post-deletion parameter list $W' = (w'_1, w'_2, w'_3, w'_4)$; wherein $pre_2 > pre_1$, and $w'_z$ is the z-th post-deletion parameter group in $W'$, $z = 1, 2, 3, 4$;
S680, taking the candidate attribute information corresponding to $w'_z$ as screened attribute information to obtain a screened attribute information set $ATT = (att_1, att_2, att_3, att_4)$; wherein $att_z$ is the z-th screened attribute information.
8. The determination method according to claim 2, wherein the target attribute information corresponding to $H_1$, $H_2$, REL and P is screened attribute information;
the target attribute information is determined by the following method:
S710, obtaining a parameter list $W = (w_1, w_2, ..., w_{h1}, ..., w_{Q1})$, $w_{h1} = (w_{h1}^{1}, w_{h1}^{2}, ..., w_{h1}^{h2}, ..., w_{h1}^{Q2})$, $h1 = 1, 2, ..., Q1$, $h2 = 1, 2, ..., Q2$; wherein $w_{h1}$ is the parameter group corresponding to the h1-th candidate attribute information, and Q1 is the number of pieces of candidate attribute information; $w_{h1}^{h2}$ is the h2-th parameter in $w_{h1}$, Q2 is the number of second executed events, and $Q1 \geq s$; each piece of screened attribute information is one of the Q1 pieces of candidate attribute information, and the target attribute information corresponding to $t_j$ is different from each piece of candidate attribute information;
S720, obtaining the hash value $ash_{h1}^{h2}$ of $w_{h1}^{h2}$ to obtain a hash value list $ASH = (ash_1, ash_2, ..., ash_{h1}, ..., ash_{Q1})$, $ash_{h1} = (ash_{h1}^{1}, ash_{h1}^{2}, ..., ash_{h1}^{h2}, ..., ash_{h1}^{Q2})$; wherein $ash_{h1}$ is the h1-th hash value group in ASH;
S730, obtaining a priority group $PRI = (pri_1, pri_2, ..., pri_{h1}, ..., pri_{Q1})$; wherein $pri_{h1}$ is the h1-th priority in PRI, $pri_{h1} = \left[ \sum_{h2=1}^{Q2} \left( ash_{h1}^{h2} - ash_{ave}^{h1} \right)^2 \right] / Q2$, $ash_{ave}^{h1}$ is a priority reference factor, and $ash_{ave}^{h1} = \left[ \sum_{h2=1}^{Q2} ash_{h1}^{h2} \right] / Q2$;
S740, if $pri_{h1}$ is greater than a preset target priority $pri_0$, deleting the corresponding $w_{h1}$ from W to obtain a post-deletion parameter list $W' = (w'_1, w'_2, w'_3, w'_4)$; wherein $w'_z$ is the z-th post-deletion parameter group in $W'$, $z = 1, 2, 3, 4$;
S750, taking the candidate attribute information corresponding to $w'_z$ as screened attribute information to obtain a screened attribute information set $ATT = (att_1, att_2, att_3, att_4)$; wherein $att_z$ is the z-th screened attribute information.
9. A non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the determination method of any one of claims 1-8.
10. An electronic device comprising a processor and the non-transitory computer-readable storage medium of claim 9.
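Steps S710 to S750 of claim 8 amount to a variance filter over hashed parameter groups: each group is hashed element by element, the population variance of its hash values serves as its priority, and groups whose priority exceeds the threshold are deleted. A minimal Python sketch follows, under stated assumptions. The claim does not name a hash function, so `crc32_hash` (CRC-32 over the value's `repr`) is a hypothetical stand-in. The function names, the pluggable `hash_fn` parameter, and the choice to return the indices of the kept groups are illustrative and do not come from the patent. The sketch assumes every group is non-empty and does not enforce the claim's fixed count of four surviving groups.

```python
import zlib
from typing import Callable, List, Sequence


def crc32_hash(value: object) -> int:
    """Hypothetical stand-in hash: the claim does not specify a hash function."""
    return zlib.crc32(repr(value).encode("utf-8"))


def priority(hashes: Sequence[float]) -> float:
    """S730: population variance of one hash value group (pri_h1)."""
    mean = sum(hashes) / len(hashes)  # priority reference factor ash_ave^{h1}
    return sum((h - mean) ** 2 for h in hashes) / len(hashes)


def filter_parameter_groups(
    W: Sequence[Sequence[object]],
    pri_0: float,
    hash_fn: Callable[[object], float] = crc32_hash,
) -> List[int]:
    """Return the indices h1 of the parameter groups that survive S740."""
    kept = []
    for h1, group in enumerate(W):
        ash_h1 = [hash_fn(w) for w in group]  # S720: hash each parameter
        if priority(ash_h1) <= pri_0:         # S740: delete when pri_h1 > pri_0
            kept.append(h1)
    return kept


# Usage with an identity "hash" so the arithmetic is easy to follow:
# variances are 0, 200/3 and 8/3, so only the middle group exceeds pri_0 = 5.
example = filter_parameter_groups(
    [[1, 1, 1], [0, 10, 20], [2, 4, 6]], pri_0=5.0, hash_fn=lambda x: x
)
print(example)  # [0, 2]
```

The candidate attribute information at the returned indices would then play the role of the filtered attribute information set ATT in S750.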
CN202310449913.0A 2023-04-25 2023-04-25 Determination method of target category identification, storage medium and electronic equipment Active CN116167624B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310449913.0A CN116167624B (en) 2023-04-25 2023-04-25 Determination method of target category identification, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN116167624A (en) 2023-05-26
CN116167624B (en) 2023-07-07

Family

ID=86418585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310449913.0A Active CN116167624B (en) 2023-04-25 2023-04-25 Determination method of target category identification, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN116167624B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859384A (en) * 2020-07-23 2020-10-30 平安证券股份有限公司 Abnormal event monitoring method and device, computer equipment and storage medium
CN114860793A (en) * 2022-07-05 2022-08-05 中航信移动科技有限公司 Data processing method, data processing device, storage medium and electronic equipment
CN114880581A (en) * 2022-06-30 2022-08-09 中航信移动科技有限公司 User data processing method, storage medium and electronic device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084377B (en) * 2019-04-30 2023-09-29 京东城市(南京)科技有限公司 Method and device for constructing decision tree
US11631014B2 (en) * 2019-08-02 2023-04-18 Capital One Services, Llc Computer-based systems configured for detecting, classifying, and visualizing events in large-scale, multivariate and multidimensional datasets and methods of use thereof

Also Published As

Publication number Publication date
CN116167624A (en) 2023-05-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant