CN116720935A

CN116720935A - Account category identification method, apparatus, computer device and storage medium

Info

Publication number: CN116720935A
Application number: CN202310446275.7A
Authority: CN
Inventors: 杜心达
Original assignee: Industrial Consumer Finance Co Ltd
Current assignee: Industrial Consumer Finance Co Ltd
Priority date: 2023-04-24
Filing date: 2023-04-24
Publication date: 2023-09-08

Abstract

The application relates to an account category identification method, an account category identification apparatus, a computer device, a storage medium and a computer program product. The method comprises the following steps: acquiring a plurality of account data, the account data including a plurality of influence factors influencing the category to which the account belongs; for each influence factor, acquiring a plurality of initial hypothesis values corresponding to the influence factor; dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factor; obtaining a plurality of key-value pairs based on the plurality of initial hypothesis values in a hypothesis value interval; assigning a target hypothesis value to the plurality of account data having the influence factor based on the key-value pairs, calculating a category value corresponding to each account data under each key-value pair, and determining the category of each account data based on the category value; and obtaining, based on the categories of the plurality of account data under each key-value pair, the relationship between different values of the influence factors and the categories of the account data. By adopting the method, the accuracy of the account category identification result can be improved.

Description

Account category identification method, apparatus, computer device and storage medium

Technical Field

The present application relates to the field of big data technology, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for identifying account types.

Background

In the field of big data risk control, the interpretability of an identification model is closely tied to its stability and generalization capability when identifying account types. Because overfitting often shows up as trends that run counter to common sense, eliminating abnormal influence factors is particularly important in the process of identifying account types.

In the conventional technology, the gradient boosting algorithm provides an interface for calculating the gain importance of the influence factors, and the degree to which each influence factor affects the identification result is controlled by using the weight ratio of the influence factors.

However, the gain importance calculation interface yields only a single piece of information, namely the weight of each influence factor in the recognition model. Abnormal influence factors cannot be found from it, so the accuracy of the account type recognition result is low.

Disclosure of Invention

In view of the foregoing, it is desirable to provide an account type recognition method, apparatus, computer device, computer readable storage medium, and computer program product that can improve accuracy of account type recognition results.

In a first aspect, the application provides an account category identification method. The method comprises the following steps:

Acquiring a plurality of account data; the account data includes: a plurality of influence factors influencing the category to which the account belongs;

for each influence factor, acquiring a plurality of initial hypothesis values corresponding to the influence factors; dividing a plurality of initial hypothesized values into a plurality of hypothesized value intervals based on a plurality of thresholds corresponding to the influence factors;

obtaining a plurality of key value pairs based on a plurality of initial hypothesis values in a hypothesis value interval; wherein, the key value pair is composed of a target hypothesis value and an influence factor in a hypothesis value interval; the number of key-value pairs is the same as the number of initial hypothesized values in the hypothesized value interval;

assigning a target hypothesis value to a plurality of account data with influence factors based on key value pairs, calculating a category value corresponding to each account data under each key value pair, and determining the category of each account data based on the category value;

obtaining, based on the categories of the plurality of account data under each key-value pair, the relationship between different values of the influence factors and the categories of the account data.

In one embodiment, obtaining a plurality of key-value pairs based on a plurality of initial hypothesis values within a hypothesis value interval includes:

selecting an initial hypothesis value which is not selected in the hypothesis value interval as a target hypothesis value, and obtaining a key value pair consisting of the target hypothesis value and an influence factor; repeating the steps until the number of key value pairs is the same as the number of initial hypothesized values in the hypothesized value interval.

In one embodiment, before dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factors, the method further includes:

acquiring a plurality of initial thresholds corresponding to the influence factors from a storage object;

for each influence factor, de-duplicating and sorting the plurality of initial thresholds to obtain a threshold list of the corresponding influence factor; the threshold list comprises the plurality of thresholds corresponding to the influence factor.

In one embodiment, assigning the target hypothesis value to the plurality of account data having the impact factor based on the key value pairs includes:

for each hypothesis value interval, selecting an initial hypothesis value which has not yet been selected in the hypothesis value interval as a target hypothesis value;

determining an ordering of the plurality of hypothesis value intervals based on the threshold list;

determining a ranking of the plurality of target hypothesis values based on the ranking of the plurality of hypothesis value intervals;

the target hypothesis values are assigned to the plurality of account data with the impact factors in turn based on the ordering of the plurality of target hypothesis values.

In one embodiment, the method further comprises:

the relationship between the impact factor and the category of the account data is stored in a storage object.

In one embodiment, storing the relationship between the impact factor and the category of the account data in the storage object includes:

acquiring category values of a plurality of account data with target hypothesis values for each influence factor;

and acquiring average values of the class values of the plurality of account data, and storing the influence factors, the plurality of hypothesis values and the plurality of average values into a storage object.

In a second aspect, the application further provides an account category identification device. The device comprises:

the account data acquisition module is used for acquiring a plurality of account data; the account data includes: a plurality of influence factors influencing the category to which the account belongs;

the hypothesis value interval dividing module is used for acquiring a plurality of initial hypothesis values corresponding to each influence factor; dividing a plurality of initial hypothesized values into a plurality of hypothesized value intervals based on a plurality of thresholds corresponding to the influence factors;

the key value pair acquisition module is used for acquiring a plurality of key value pairs based on a plurality of initial hypothesis values in the hypothesis value interval; wherein, the key value pair is composed of a target hypothesis value and an influence factor in a hypothesis value interval; the number of key-value pairs is the same as the number of initial hypothesized values in the hypothesized value interval;

The category determining module is used for assigning the target hypothesis value to a plurality of account data with influence factors based on the key value pairs, calculating a category value corresponding to each account data under each key value pair, and determining the category of each account data based on the category value;

and the category relation determining module is used for obtaining the relationship between different values of the influence factors and the categories of the account data based on the categories of the plurality of account data under each key-value pair.

In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the method of any of the embodiments described above when the processor executes the computer program.

In a fourth aspect, the present application also provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method of any of the embodiments described above.

In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of the method according to any of the embodiments described above.

In the above account category identification method, apparatus, computer device, storage medium and computer program product, a plurality of account data are first acquired, the account data including a plurality of influence factors influencing the category to which the account belongs, and for each influence factor a plurality of initial hypothesis values corresponding to the influence factor are acquired. The plurality of initial hypothesis values are then divided into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factor. Next, a plurality of key-value pairs are obtained based on the plurality of initial hypothesis values in a hypothesis value interval, where each key-value pair consists of the influence factor and a target hypothesis value in the hypothesis value interval, and the number of key-value pairs is the same as the number of initial hypothesis values in the hypothesis value interval. Further, a target hypothesis value is assigned to the plurality of account data having the influence factor based on the key-value pairs, a category value corresponding to each account data under each key-value pair is calculated, and the category of each account data is determined based on the category value; the relationship between different values of the influence factors and the categories of the account data is then obtained based on the categories of the plurality of account data under each key-value pair. Because the target hypothesis value is assigned to the plurality of account data having the influence factor, that is, the same influence factor is uniformly set to the same value across all account data, the effect of a change in each influence factor's value on the category to which the account data belongs is reflected intuitively. Abnormal influence factors can therefore be removed in the account type identification process, and the accuracy of the account type identification result can be improved.

Drawings

FIG. 1 is an application environment diagram of an account category identification method in one embodiment;

FIG. 2 is a flow chart of a method for account category identification in one embodiment;

FIG. 3 is a flow diagram of training and using an account category recognition model in one embodiment;

FIG. 4 is a block diagram of an account category identification device in one embodiment;

FIG. 5 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

The account type identification method provided by the embodiment of the application can be applied to an application environment shown in figure 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on a server or may be placed on a cloud or other network server. The server 104 may provide the terminal 102 with an account type recognition environment, and the server 104 may interact with the terminal 102 in communication to enter the account type recognition environment. First, the server 104 may acquire a plurality of account data from the plurality of terminals 102, respectively; the account data includes: a plurality of influence factors influencing the category to which the account belongs; and acquiring a plurality of initial hypothesis values corresponding to the influence factors according to each influence factor. Server 104 may then divide the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on the plurality of thresholds corresponding to the impact factors. Server 104 may then derive a plurality of key-value pairs based on a plurality of initial hypothesis values within the hypothesis value interval; wherein, the key value pair is composed of a target hypothesis value and an influence factor in a hypothesis value interval; the number of key-value pairs is the same as the number of initial hypothesized values within the hypothesized value interval. 
Further, based on the key-value pairs, the server 104 may assign the target hypothesis values to the plurality of account data having the influence factors, calculate a category value corresponding to each account data under each key-value pair, determine the category of each account data based on the category value, and obtain the relationship between different values of the influence factors and the categories of the account data based on the categories of the plurality of account data under each key-value pair. Further, the server 104 may feed back the category of each account data to the terminal 102 and store the relationship between different values of the influence factors and the categories of the account data in the data storage system.

The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, where the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.

The account type identification method provided by the embodiment of the application can be applied to a server or a terminal on one side, and can also be applied to a system comprising the server and the terminal, and the account type identification method is realized through interaction of the server and the terminal.

In one embodiment, as shown in FIG. 2, an account category identification method is provided. The method is described by taking its implementation on the server side alone as an example, and includes the following steps 202 to 210.

Step 202, acquiring a plurality of account data; the account data includes: a plurality of influence factors influencing the category to which the account belongs.

In this embodiment, the account data may include, but is not limited to: account information, a plurality of influence factors influencing the category to which the account belongs.

In this embodiment, the values of the plurality of influence factors corresponding to the same account data are independent of each other, i.e., the values of the plurality of influence factors corresponding to the same account data may be all the same, partially the same, or all different.

In this embodiment, the values of the same influencing factor in different account data are independent of each other, that is, the values of the same influencing factor in different account data may be the same, partially the same, or all different.

Step 204, for each influence factor, obtaining a plurality of initial hypothesis values corresponding to the influence factors; the plurality of initial hypothesis values are divided into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factors.

In this embodiment, the server may obtain a plurality of thresholds corresponding to the influence factors from the storage object, and generate a plurality of threshold intervals correspondingly.

In this embodiment, the threshold intervals may be left-closed and right-open. For example, when the plurality of thresholds are 3, 6 and 9, the server may generate the corresponding threshold intervals [3, 6) and [6, 9) based on the plurality of thresholds.

In this embodiment, the server may use the threshold intervals as the hypothesis value intervals and divide the plurality of initial hypothesis values into different hypothesis value intervals based on the numerical relationship between the initial hypothesis values and the thresholds. For example, when the initial hypothesis values corresponding to the influence factor are 3, 4, 5 and 8, and the threshold intervals are [3, 6) and [6, 9), the server may place the initial hypothesis values 3, 4 and 5 in the hypothesis value interval [3, 6) and the initial hypothesis value 8 in the hypothesis value interval [6, 9).
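The division just described can be sketched in a few lines of Python. This is a minimal illustration, not the application's implementation: the function name `partition_by_thresholds` is hypothetical, the thresholds are assumed to be sorted in ascending order, and intervals are assumed to include their lower bound and exclude their upper bound, as in the [3, 6), [6, 9) example.

```python
from bisect import bisect_right


def partition_by_thresholds(values, thresholds):
    """Group values into intervals [t_i, t_{i+1}) built from sorted thresholds.

    Assumption: each interval includes its lower bound and excludes its upper
    bound. Values outside [thresholds[0], thresholds[-1]) are dropped.
    """
    intervals = {(lo, hi): [] for lo, hi in zip(thresholds, thresholds[1:])}
    for v in values:
        # Index of the last threshold that is <= v.
        i = bisect_right(thresholds, v) - 1
        if 0 <= i < len(thresholds) - 1:
            intervals[(thresholds[i], thresholds[i + 1])].append(v)
    return intervals


buckets = partition_by_thresholds([3, 4, 5, 8], [3, 6, 9])
print(buckets)  # {(3, 6): [3, 4, 5], (6, 9): [8]}
```

How out-of-range values should be handled is not stated in the text; dropping them is only one possible choice.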

In another embodiment, the server is not limited to forming the threshold intervals from the plurality of thresholds corresponding to the influence factor in the storage object; it may also obtain the plurality of hypothesis value intervals by equal-frequency binning of the plurality of initial hypothesis values.

In another embodiment, the server is not limited to forming the threshold intervals from the plurality of thresholds corresponding to the influence factor in the storage object; it may also obtain the thresholds by random sampling, and divide the plurality of initial hypothesis values into different hypothesis value intervals based on the numerical relationship between the initial hypothesis values and the thresholds.
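Equal-frequency binning can be illustrated with the following sketch, in which each bin receives approximately the same number of initial hypothesis values. The function name `equal_frequency_bins` and the rule for distributing leftover values are assumptions made for illustration only.

```python
def equal_frequency_bins(values, n_bins):
    """Split values into n_bins groups holding (nearly) equal numbers of values.

    Assumption: when the count is not divisible by n_bins, the first bins
    each receive one extra value. Empty bins are omitted.
    """
    ordered = sorted(values)
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            bins.append(ordered[start:end])
        start = end
    return bins


print(equal_frequency_bins([8, 3, 5, 4, 1, 9], 3))  # [[1, 3], [4, 5], [8, 9]]
```

Note that this simple version can place equal values in adjacent bins; a production implementation would need a rule for ties.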

Step 206, obtaining a plurality of key value pairs based on a plurality of initial hypothesis values in the hypothesis value interval; wherein, the key value pair is composed of a target hypothesis value and an influence factor in a hypothesis value interval; the number of key-value pairs is the same as the number of initial hypothesized values within the hypothesized value interval.

In this embodiment, for each influence factor, the server may pair the influence factor with each of its corresponding target hypothesis values to form key-value pairs. For example, when the influence factor is A and the target hypothesis values corresponding to influence factor A are 1, 5 and 7, the server may combine influence factor A with each of these target hypothesis values to form the key-value pairs A-1 (influence factor A and target hypothesis value 1), A-5 (influence factor A and target hypothesis value 5), and A-7 (influence factor A and target hypothesis value 7).

Step 208, assigning the target hypothesis value to the plurality of account data having the influence factor based on the key-value pairs, calculating a category value corresponding to each account data under each key-value pair, and determining the category of each account data based on the category value.

In this embodiment, for each key value pair, the server may use, as target account data, a plurality of account data having influence factors in the key value pair, and use, as target influence factors, influence factors in the key value pair.

In this embodiment, the server may assign, based on the key value pair, a target hypothesis value in the key value pair to a target influence factor in the plurality of target account data, calculate a category value corresponding to each account data under each key value pair, and determine a category of each account data based on the category value.
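The assignment step can be sketched as follows. The application does not specify how the category value is computed, so the `score` function below is a purely hypothetical stand-in for the recognition model, and the 0.5 cutoff used to turn a category value into a category is likewise an assumption. Account data are modelled as dictionaries mapping influence factor names to values.

```python
import math


def score(account):
    """Hypothetical stand-in for the recognition model (not from the source).

    Returns a category value in (0, 1) from two illustrative factors A and B.
    """
    z = 0.8 * account["A"] - 0.5 * account["B"]
    return 1.0 / (1.0 + math.exp(-z))


def category_values_under_pair(accounts, pair, model=score):
    """Assign the pair's target value to every account that has the factor,
    then compute each account's category value with the model."""
    factor, target = pair
    values = []
    for account in accounts:
        if factor not in account:
            continue  # only account data having the influence factor are used
        modified = dict(account)   # copy so the original data stay unchanged
        modified[factor] = target  # uniform assignment of the target value
        values.append(model(modified))
    return values


def classify(value, cutoff=0.5):
    """Assumed rule: category 1 if the category value reaches the cutoff."""
    return 1 if value >= cutoff else 0


accounts = [{"A": 1, "B": 2}, {"A": 0, "B": 0}, {"B": 3}]
values = category_values_under_pair(accounts, ("A", 5))
print([classify(v) for v in values])
```

Copying each account before overwriting the factor keeps the original data intact, so the same accounts can be reused for the next key-value pair.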

Step 210, obtaining, based on the categories of the plurality of account data under each key-value pair, the relationship between different values of the influence factors and the categories of the account data.

In this embodiment, each key-value pair corresponds to one influence factor and one target hypothesis value. By assigning the target hypothesis value to the plurality of account data having the influence factor, so that the same influence factor takes the same value across all account data, the effect of a change in each influence factor's value on the category to which the account data belongs can be reflected intuitively. The relationship between different values of the influence factors and the categories of the account data is thereby obtained, so that abnormal influence factors can be removed in the account type identification process.

In the above account type identification method, a plurality of account data are first acquired, the account data including a plurality of influence factors influencing the category to which the account belongs, and for each influence factor a plurality of initial hypothesis values corresponding to the influence factor are acquired. The plurality of initial hypothesis values are then divided into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factor. Next, a plurality of key-value pairs are obtained based on the plurality of initial hypothesis values in a hypothesis value interval, where each key-value pair consists of the influence factor and a target hypothesis value in the hypothesis value interval, and the number of key-value pairs is the same as the number of initial hypothesis values in the hypothesis value interval. Further, a target hypothesis value is assigned to the plurality of account data having the influence factor based on the key-value pairs, a category value corresponding to each account data under each key-value pair is calculated, and the category of each account data is determined based on the category value; the relationship between different values of the influence factors and the categories of the account data is then obtained based on the categories of the plurality of account data under each key-value pair. Because the same influence factor is uniformly set to the same value across all account data, the effect of a change in each influence factor's value on the category to which the account data belongs is reflected intuitively. Abnormal influence factors can therefore be removed in the account type identification process, and the accuracy of the account type identification result can be improved.

In some embodiments, as shown in fig. 3, deriving a plurality of key-value pairs based on a plurality of initial hypothesis values within a hypothesis value interval may include: selecting an initial hypothesis value which is not selected in the hypothesis value interval as a target hypothesis value, and obtaining a key value pair consisting of the target hypothesis value and an influence factor; repeating the steps until the number of key value pairs is the same as the number of initial hypothesized values in the hypothesized value interval.

In this embodiment, the server may generate, based on a plurality of initial hypothesis values within the hypothesis value interval, the same number of key value pairs as the number of initial hypothesis values within the hypothesis value interval, wherein an influence factor of each key value pair is consistent, and a target hypothesis value of each key value pair is different.

In another embodiment, the server may also take the target hypothesis values from the plurality of initial hypothesis values in the hypothesis value interval one at a time in a preset order (descending, ascending, at intervals, etc.), until the number of key-value pairs is the same as the number of initial hypothesis values in the hypothesis value interval.
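The selection loop in the two preceding embodiments can be sketched as below. The function name `key_value_pairs_for_interval` and the `order` parameter are hypothetical; key-value pairs are represented as `(factor, target_value)` tuples, which is one possible encoding and not one prescribed by the application.

```python
def key_value_pairs_for_interval(factor, interval_values, order="ascending"):
    """Build one (factor, target value) pair per initial hypothesis value.

    Each pass picks a value that has not been selected yet, and the loop
    stops when the number of pairs equals the number of initial values.
    """
    if order == "ascending":
        remaining = sorted(interval_values)
    elif order == "descending":
        remaining = sorted(interval_values, reverse=True)
    else:
        remaining = list(interval_values)  # keep the given order
    pairs = []
    while len(pairs) < len(interval_values):
        target = remaining.pop(0)  # next value not yet selected
        pairs.append((factor, target))
    return pairs


print(key_value_pairs_for_interval("A", [5, 1, 7]))
# [('A', 1), ('A', 5), ('A', 7)]
```

Every pair shares the same influence factor and differs only in its target hypothesis value, matching the A-1, A-5, A-7 example given earlier.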

In some embodiments, before dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on the plurality of thresholds corresponding to the influence factors, the method may further include: acquiring a plurality of initial thresholds corresponding to the influence factors from a storage object; for each influence factor, performing de-duplication and sequencing on a plurality of initial thresholds to obtain a threshold list of the corresponding influence factors; the threshold list comprises a plurality of thresholds corresponding to the influence factors.

In this embodiment, the server may order the plurality of initial thresholds in order from large to small or from small to large.

In this embodiment, the server may first de-duplicate the plurality of initial thresholds and then sort the de-duplicated thresholds to obtain the threshold list corresponding to the influence factor. De-duplicating first reduces the amount of data to be sorted, which improves the efficiency of obtaining the threshold list and reduces the computing resources consumed in doing so.

In another embodiment, the server may also sort the plurality of initial thresholds first and then de-duplicate the sorted thresholds to obtain the threshold list of the corresponding influence factor.
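A minimal sketch of building the threshold list, assuming the storage object can be read as a mapping from influence factor name to a list of initial thresholds (the names `build_threshold_list` and `stored_thresholds` are hypothetical):

```python
def build_threshold_list(initial_thresholds, descending=False):
    """De-duplicate first (via a set), then sort what remains."""
    return sorted(set(initial_thresholds), reverse=descending)


# Hypothetical contents of the storage object: factor -> initial thresholds.
stored_thresholds = {"A": [6, 3, 9, 6, 3], "B": [0.5, 0.1, 0.5]}

threshold_lists = {
    factor: build_threshold_list(values)
    for factor, values in stored_thresholds.items()
}
print(threshold_lists)  # {'A': [3, 6, 9], 'B': [0.1, 0.5]}
```

Either order of operations gives the same list; de-duplicating through a set before sorting simply means fewer elements reach the sort.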

In some embodiments, as shown in FIG. 3, assigning the target hypothesis value to the plurality of account data having the impact factor based on the key value pairs may include: selecting an initial hypothesis value which is not selected in the hypothesis value interval as a target hypothesis value according to each hypothesis value interval; determining an ordering of the plurality of hypothesis value intervals based on the threshold list; determining a ranking of the plurality of target hypothesis values based on the ranking of the plurality of hypothesis value intervals; the target hypothesis values are assigned to the plurality of account data with the impact factors in turn based on the ordering of the plurality of target hypothesis values.

In this embodiment, the server may use the threshold intervals as the hypothesis value intervals, and determine the ordering of the plurality of hypothesis value intervals based on the threshold list (the sorted, de-duplicated thresholds).

In this embodiment, as shown in fig. 3, the server may traverse multiple hypothesis value intervals, select a target hypothesis value from each hypothesis value interval, determine a ranking of multiple target hypothesis values based on the ranking of the multiple hypothesis value intervals, and assign the target hypothesis values to multiple account data with influence factors in sequence based on the ranking of the multiple target hypothesis values.
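The traversal can be sketched as follows, assuming the hypothesis value intervals are held in a dictionary keyed by `(lower, upper)` threshold pairs. Taking the first value in each interval as the target hypothesis value is an assumption; the text only requires a value that has not yet been selected. Both function names are hypothetical.

```python
def ordered_target_values(intervals):
    """Pick one target value per interval, ordered by the interval bounds."""
    targets = []
    for bounds in sorted(intervals):       # ascending by lower threshold
        candidates = intervals[bounds]
        if candidates:                     # skip intervals with no values
            targets.append(candidates[0])  # assumed choice: first value
    return targets


def assign_in_turn(accounts, factor, targets):
    """Yield, for each target in order, copies of the accounts that have the
    factor, with that factor overwritten by the target value."""
    for target in targets:
        yield target, [{**acc, factor: target} for acc in accounts if factor in acc]


intervals = {(6, 9): [8], (3, 6): [3, 4, 5]}
targets = ordered_target_values(intervals)
print(targets)  # [3, 8]
for target, modified in assign_in_turn([{"A": 1, "B": 2}], "A", targets):
    print(target, modified)
```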

In some embodiments, the above method may further comprise: the relationship between the impact factor and the category of the account data is stored in a storage object.

In this embodiment, the storage object may be integrated in the account category identification model or in a database corresponding to the account category identification model.

In some embodiments, as shown in fig. 3, storing the relationship between the impact factor and the category of the account data into the storage object may include: acquiring category values of a plurality of account data with target hypothesis values for each influence factor; and acquiring average values of the class values of the plurality of account data, and storing the influence factors, the plurality of hypothesis values and the plurality of average values into a storage object.

In this embodiment, the server may determine, based on the influence factors, the plurality of hypothesis values and the plurality of average values, how each influence factor affects the category value of the account data at different hypothesis values. This makes it easier to predict how the category value of the account data changes as the value of an influence factor changes, reflects intuitively the effect of a change in each influence factor's value on the category to which the account data belongs, facilitates the removal of abnormal influence factors in the account type identification process, and improves the accuracy of the account type identification result.
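The averaging and storage step can be sketched as below. The storage object is modelled as a plain dictionary, and the model is passed in as a callable; both are assumptions. The monotonicity check at the end is only one illustrative way to look for a trend that contradicts expectations, and it is not a criterion stated in the application.

```python
from statistics import mean


def factor_category_relationship(accounts, factor, target_values, model):
    """Map each hypothesis value to the average category value obtained when
    that value is assigned to every account having the factor."""
    relationship = {}
    for target in target_values:
        scores = [model({**acc, factor: target}) for acc in accounts if factor in acc]
        if scores:  # no average is recorded when no account has the factor
            relationship[target] = mean(scores)
    return relationship


def is_non_decreasing(relationship):
    """Illustrative check (an assumption): do the averages never fall as the
    hypothesis value increases?"""
    averages = [relationship[k] for k in sorted(relationship)]
    return all(a <= b for a, b in zip(averages, averages[1:]))


# Toy linear model used only to make the example runnable.
def toy_model(acc):
    return 2 * acc["A"] + acc["B"]


accounts = [{"A": 0, "B": 1}, {"A": 9, "B": 3}]
storage = {}  # hypothetical storage object: factor -> {hypothesis value: average}
storage["A"] = factor_category_relationship(accounts, "A", [1, 5, 7], toy_model)
print(storage)                          # {'A': {1: 4, 5: 12, 7: 16}}
print(is_non_decreasing(storage["A"]))  # True
```

A factor whose averages move against the direction expected from domain knowledge would be a candidate for review as an abnormal influence factor.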

It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the steps are not strictly limited to this order of execution and may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Based on the same inventive concept, the embodiment of the application also provides an account type recognition device for realizing the account type recognition method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation in the embodiments of the account type recognition device or devices provided below may refer to the limitation of the account type recognition method hereinabove, and will not be repeated herein.

In one embodiment, as shown in fig. 4, there is provided an account category identification apparatus, including: an account data acquisition module 402, a hypothesis value interval partitioning module 404, a key value pair acquisition module 406, a category determination module 408, and a category relationship determination module 410, wherein:

an account data acquisition module 402, configured to acquire a plurality of account data; the account data includes: a plurality of influence factors influencing the category to which the account belongs.

The hypothesis value interval dividing module 404 is configured to obtain, for each influence factor, a plurality of initial hypothesis values corresponding to the influence factor; the plurality of initial hypothesis values are divided into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factors.

The key value pair acquisition module 406 is configured to obtain a plurality of key value pairs based on a plurality of initial hypothesis values in the hypothesis value interval; wherein each key value pair is composed of a target hypothesis value in the hypothesis value interval and the influence factor; the number of key value pairs is the same as the number of initial hypothesis values within the hypothesis value interval.

The category determination module 408 is configured to assign a target hypothesis value to the plurality of account data with the influence factor based on the key value pairs, calculate a category value corresponding to each account data under each key value pair, and determine a category of each account data based on the category value.

The category relation determining module 410 is configured to obtain a relation between the influence factors of different values and the categories of the account data based on the categories of the plurality of account data under each key value pair.
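As an illustration of how the category determination module and the category relation determining module could cooperate, the following minimal Python sketch overrides one influence factor on every account for each key value pair, computes a category value, and records the resulting categories. The names `classify_under_pairs` and `toy_score`, the linear scoring formula, the cutoff of 35.0 and the labels "high" and "low" are all assumptions made for this example; the application does not specify how the category value is calculated.

```python
from typing import Callable, Dict, List, Tuple

Account = Dict[str, float]
KeyValuePair = Tuple[str, float]  # (influence factor, target hypothesis value)


def classify_under_pairs(
    accounts: List[Account],
    pairs: List[KeyValuePair],
    score: Callable[[Account], float],
    cutoff: float,
) -> Dict[KeyValuePair, List[str]]:
    """For each key value pair, assign the target value and classify each account."""
    relation: Dict[KeyValuePair, List[str]] = {}
    for factor, value in pairs:
        categories = []
        for account in accounts:
            if factor not in account:
                continue  # only accounts that carry this influence factor
            trial = {**account, factor: value}  # copy, original data untouched
            category_value = score(trial)
            # Hypothetical rule: the category follows from a single cutoff.
            categories.append("high" if category_value >= cutoff else "low")
        relation[(factor, value)] = categories
    return relation


def toy_score(account: Account) -> float:
    # Hypothetical category value formula, for illustration only.
    return 0.5 * account["age"] + 0.01 * account["income"]


accounts = [{"age": 30.0, "income": 2000.0}, {"age": 50.0, "income": 1000.0}]
pairs = [("age", 20.0), ("age", 60.0)]
relation = classify_under_pairs(accounts, pairs, toy_score, cutoff=35.0)
```

Here `relation` maps each key value pair to the categories obtained under it, which is one possible representation of the relation between the value of an influence factor and the category of the account data.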

In one embodiment, the key-value-pair acquisition module 406 may include:

The key value pair constructing sub-module is used for selecting an initial hypothesis value that has not yet been selected in the hypothesis value interval as a target hypothesis value, to obtain a key value pair consisting of the target hypothesis value and the influence factor, and for repeating this step until the number of key value pairs is the same as the number of initial hypothesis values in the hypothesis value interval.
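The selection loop described for this sub-module can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `build_key_value_pairs`, the tuple representation of a key value pair and the factor name `monthly_income` are hypothetical, and values are picked in list order although the application does not prescribe a selection order.

```python
from typing import List, Set, Tuple


def build_key_value_pairs(
    factor: str, interval_values: List[float]
) -> List[Tuple[str, float]]:
    """Pair the influence factor with every initial hypothesis value once."""
    selected: Set[int] = set()  # indices of values already used as targets
    pairs: List[Tuple[str, float]] = []
    # Stop once there are as many pairs as initial hypothesis values.
    while len(pairs) < len(interval_values):
        index = next(i for i in range(len(interval_values)) if i not in selected)
        selected.add(index)
        pairs.append((factor, interval_values[index]))
    return pairs


pairs = build_key_value_pairs("monthly_income", [3000.0, 3500.0, 4000.0])
```

Tracking selected indices rather than values keeps the pair count equal to the number of initial hypothesis values even if the interval happens to contain duplicates.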

In one embodiment, before dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on the plurality of thresholds corresponding to the influence factors, the apparatus may further include:

The initial threshold acquisition module is used for acquiring a plurality of initial thresholds corresponding to the influence factors from the storage object.

The threshold list acquisition module is used for carrying out de-duplication and sequencing on a plurality of initial thresholds according to each influence factor to obtain a threshold list of the corresponding influence factors; the threshold list comprises a plurality of thresholds corresponding to the influence factors.
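A minimal sketch of the de-duplication, sorting and interval division might look like this in Python. The function names are hypothetical, and the boundary convention (a value equal to a threshold falls into the interval above it) is an assumption, since the application does not state how boundary values are handled.

```python
from bisect import bisect_right
from typing import List


def build_threshold_list(initial_thresholds: List[float]) -> List[float]:
    """De-duplicate and sort the initial thresholds of one influence factor."""
    return sorted(set(initial_thresholds))


def split_into_intervals(
    hypothesis_values: List[float], threshold_list: List[float]
) -> List[List[float]]:
    """Place each initial hypothesis value in the interval its thresholds delimit."""
    # n thresholds delimit n + 1 intervals, including the two open-ended ones.
    intervals: List[List[float]] = [[] for _ in range(len(threshold_list) + 1)]
    for value in hypothesis_values:
        # Assumed convention: a value equal to a threshold goes to the upper interval.
        intervals[bisect_right(threshold_list, value)].append(value)
    return intervals


thresholds = build_threshold_list([30, 18, 60, 30])
intervals = split_into_intervals([10, 18, 25, 45, 70], thresholds)
```

With these inputs the threshold list is `[18, 30, 60]` and the hypothesis values are grouped as `[[10], [18, 25], [45], [70]]`.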

In one embodiment, the category determination module 408 may include:

The target hypothesis value acquisition sub-module is used for selecting, for each hypothesis value interval, an initial hypothesis value that has not yet been selected in the hypothesis value interval as a target hypothesis value.

The hypothesis value interval ordering sub-module is used for determining the ordering of the plurality of hypothesis value intervals based on the threshold list.

The target hypothesis value ordering sub-module is used for determining the ordering of the plurality of target hypothesis values based on the ordering of the plurality of hypothesis value intervals.

The assignment sub-module is used for sequentially assigning the target hypothesis values to the plurality of account data having the influence factor based on the ordering of the plurality of target hypothesis values.
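The ordering and sequential assignment could be sketched as below. The sketch assumes the intervals are already supplied in the order implied by the sorted threshold list, takes the first value of each non-empty interval as its target hypothesis value, and skips empty intervals; these choices and the name `assign_in_interval_order` are illustrative assumptions rather than details from the application.

```python
from typing import Dict, Iterator, List, Tuple

Account = Dict[str, float]


def assign_in_interval_order(
    accounts: List[Account],
    factor: str,
    ordered_intervals: List[List[float]],
) -> Iterator[Tuple[float, List[Account]]]:
    """Yield one assignment round per interval, following the interval order."""
    for interval in ordered_intervals:
        if not interval:
            continue  # assumption: an empty interval contributes no target value
        target = interval[0]  # assumption: take the first unselected value
        # Assign the target only to accounts that carry the influence factor.
        trial_accounts = [
            {**account, factor: target} for account in accounts if factor in account
        ]
        yield target, trial_accounts


accounts = [{"age": 33.0, "limit": 5000.0}, {"limit": 8000.0}]
ordered_intervals = [[10.0], [18.0, 25.0], [], [70.0]]
rounds = list(assign_in_interval_order(accounts, "age", ordered_intervals))
```

Each round pairs a target hypothesis value with copies of the affected accounts, so the target values are applied in ascending interval order and the original account data stays unchanged.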

In one embodiment, the apparatus may further include:

The relation storage module is used for storing the relation between the influence factors and the categories of the account data into the storage object.

In one embodiment, the relationship storage module may include:

The category value acquisition sub-module is used for acquiring, for each influence factor, the category values of the plurality of account data having the target hypothesis value.

The storage sub-module is used for acquiring the average value of the category values of the plurality of account data, and storing the influence factor, the plurality of hypothesis values and the plurality of average values into the storage object.
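The averaging and storage step can be illustrated with a short Python sketch. The record layout `(factor, hypothesis value, category value)`, the function name `average_category_values` and the plain dictionary standing in for the storage object are assumptions made for the example; the numbers are invented.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, float, float]  # (factor, hypothesis value, category value)


def average_category_values(
    records: Iterable[Record],
) -> Dict[Tuple[str, float], float]:
    """Average the category values observed under each (factor, value) pair."""
    grouped: Dict[Tuple[str, float], List[float]] = defaultdict(list)
    for factor, hypothesis_value, category_value in records:
        grouped[(factor, hypothesis_value)].append(category_value)
    return {key: mean(values) for key, values in grouped.items()}


# A plain dict stands in for the storage object (assumption).
storage_object: Dict[Tuple[str, float], float] = {}
records = [
    ("age", 20.0, 30.0),
    ("age", 20.0, 20.0),
    ("age", 60.0, 50.0),
    ("age", 60.0, 40.0),
]
storage_object.update(average_category_values(records))
```

Storing one average per hypothesis value gives a compact view of how the category value moves as the influence factor changes.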

The modules in the account category identification apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded, in hardware form, in a processor of the computer device or be independent of that processor, or may be stored, in software form, in a memory of the computer device, so that the processor can call them and execute the operations corresponding to each module.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 5. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data such as an influence factor, an initial assumption value, a target assumption value and the like. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of account category identification.

It will be appreciated by those skilled in the art that the structure shown in FIG. 5 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of: acquiring a plurality of account data, the account data including a plurality of influence factors influencing the category to which the account belongs; for each influence factor, acquiring a plurality of initial hypothesis values corresponding to the influence factor; dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factor; obtaining a plurality of key value pairs based on a plurality of initial hypothesis values in a hypothesis value interval, wherein each key value pair is composed of a target hypothesis value in the hypothesis value interval and the influence factor, and the number of key value pairs is the same as the number of initial hypothesis values in the hypothesis value interval; assigning the target hypothesis value to the plurality of account data having the influence factor based on the key value pairs, calculating a category value corresponding to each account data under each key value pair, and determining the category of each account data based on the category value; and obtaining, based on the categories of the plurality of account data under each key value pair, the relation between the influence factors of different values and the categories of the account data.

In one embodiment, the processor when executing the computer program further implements deriving a plurality of key-value pairs based on a plurality of initial hypothesis values within a hypothesis value interval, which may include: selecting an initial hypothesis value which is not selected in the hypothesis value interval as a target hypothesis value, and obtaining a key value pair consisting of the target hypothesis value and an influence factor; repeating the steps until the number of key value pairs is the same as the number of initial hypothesized values in the hypothesized value interval.

In one embodiment, before the processor executes the computer program to implement dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on the plurality of thresholds corresponding to the influence factors, the following steps may be implemented: acquiring a plurality of initial thresholds corresponding to the influence factors from a storage object; for each influence factor, performing de-duplication and sequencing on a plurality of initial thresholds to obtain a threshold list of the corresponding influence factors; the threshold list comprises a plurality of thresholds corresponding to the influence factors.

In one embodiment, when the processor executes the computer program, assigning the target hypothesis value to the plurality of account data having the influence factor based on the key value pairs may include: selecting, for each hypothesis value interval, an initial hypothesis value that has not yet been selected in the hypothesis value interval as a target hypothesis value; determining an ordering of the plurality of hypothesis value intervals based on the threshold list; determining an ordering of the plurality of target hypothesis values based on the ordering of the plurality of hypothesis value intervals; and assigning the target hypothesis values in turn to the plurality of account data having the influence factor based on the ordering of the plurality of target hypothesis values.

In one embodiment, the processor, when executing the computer program, may further implement the steps of: the relationship between the impact factor and the category of the account data is stored in a storage object.

In one embodiment, the processor, when executing the computer program, further implements storing the relationship between the impact factor and the category of account data in a storage object, which may include: acquiring category values of a plurality of account data with target hypothesis values for each influence factor; and acquiring average values of the class values of the plurality of account data, and storing the influence factors, the plurality of hypothesis values and the plurality of average values into a storage object.

In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring a plurality of account data, the account data including a plurality of influence factors influencing the category to which the account belongs; for each influence factor, acquiring a plurality of initial hypothesis values corresponding to the influence factor; dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factor; obtaining a plurality of key value pairs based on a plurality of initial hypothesis values in a hypothesis value interval, wherein each key value pair is composed of a target hypothesis value in the hypothesis value interval and the influence factor, and the number of key value pairs is the same as the number of initial hypothesis values in the hypothesis value interval; assigning the target hypothesis value to the plurality of account data having the influence factor based on the key value pairs, calculating a category value corresponding to each account data under each key value pair, and determining the category of each account data based on the category value; and obtaining, based on the categories of the plurality of account data under each key value pair, the relation between the influence factors of different values and the categories of the account data.

In one embodiment, when the computer program is executed by the processor, obtaining a plurality of key value pairs based on a plurality of initial hypothesis values within the hypothesis value interval may include: selecting an initial hypothesis value that has not yet been selected in the hypothesis value interval as a target hypothesis value, and obtaining a key value pair consisting of the target hypothesis value and the influence factor; and repeating this step until the number of key value pairs is the same as the number of initial hypothesis values in the hypothesis value interval.

In one embodiment, the computer program when executed by the processor implements the following steps before dividing the plurality of initial hypothesized values into a plurality of hypothesized value intervals based on the plurality of thresholds corresponding to the impact factors: acquiring a plurality of initial thresholds corresponding to the influence factors from a storage object; for each influence factor, performing de-duplication and sequencing on a plurality of initial thresholds to obtain a threshold list of the corresponding influence factors; the threshold list comprises a plurality of thresholds corresponding to the influence factors.

In one embodiment, when the computer program is executed by the processor, assigning the target hypothesis value to the plurality of account data having the influence factor based on the key value pairs may include: selecting, for each hypothesis value interval, an initial hypothesis value that has not yet been selected in the hypothesis value interval as a target hypothesis value; determining an ordering of the plurality of hypothesis value intervals based on the threshold list; determining an ordering of the plurality of target hypothesis values based on the ordering of the plurality of hypothesis value intervals; and assigning the target hypothesis values in turn to the plurality of account data having the influence factor based on the ordering of the plurality of target hypothesis values.

In one embodiment, the computer program may further implement the following steps when executed by a processor: the relationship between the impact factor and the category of the account data is stored in a storage object.

In one embodiment, when the computer program is executed by the processor, storing the relation between the influence factor and the category of the account data in a storage object may include: acquiring, for each influence factor, the category values of the plurality of account data having the target hypothesis value; and acquiring the average value of the category values of the plurality of account data, and storing the influence factor, the plurality of hypothesis values and the plurality of average values into the storage object.

In one embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, performs the steps of: acquiring a plurality of account data, the account data including a plurality of influence factors influencing the category to which the account belongs; for each influence factor, acquiring a plurality of initial hypothesis values corresponding to the influence factor; dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the influence factor; obtaining a plurality of key value pairs based on a plurality of initial hypothesis values in a hypothesis value interval, wherein each key value pair is composed of a target hypothesis value in the hypothesis value interval and the influence factor, and the number of key value pairs is the same as the number of initial hypothesis values in the hypothesis value interval; assigning the target hypothesis value to the plurality of account data having the influence factor based on the key value pairs, calculating a category value corresponding to each account data under each key value pair, and determining the category of each account data based on the category value; and obtaining, based on the categories of the plurality of account data under each key value pair, the relation between the influence factors of different values and the categories of the account data.

It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region.

Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware, the computer program being stored on a non-volatile computer readable storage medium and, when executed, comprising the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. The volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take a variety of forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The foregoing examples illustrate only a few embodiments of the application. Although they are described in specific detail, they should not be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of the application shall be determined by the appended claims.

Claims

1. A method for identifying account categories, the method comprising:

for each influence factor, acquiring a plurality of initial hypothesis values corresponding to the influence factors; dividing the initial hypothesis values into hypothesis value intervals based on a plurality of thresholds corresponding to the influence factors;

obtaining a plurality of key value pairs based on a plurality of initial hypothesis values in the hypothesis value interval; wherein the key value pair is formed by a target hypothesis value in the hypothesis value interval and the influence factor; the number of key value pairs is the same as the number of initial hypothesis values in the hypothesis value interval;

assigning the target hypothesis value to a plurality of account data with the influence factors based on the key value pairs, calculating a category value corresponding to each account data under each key value pair, and determining the category of each account data based on the category value;

and obtaining the relation between the influence factors with different values and the categories of the account data based on the categories of the plurality of account data under each key value pair.

2. The method of claim 1, wherein the deriving a plurality of key-value pairs based on a plurality of the initial hypothesis values within the hypothesis value interval comprises:

selecting the initial hypothesis value which is not selected in the hypothesis value interval as a target hypothesis value, and obtaining a key value pair consisting of the target hypothesis value and the influence factor; repeating the steps until the number of the key value pairs is the same as the number of the initial hypothesized values in the hypothesized value interval.

3. The method of claim 1, wherein prior to dividing the plurality of initial hypothesis values into a plurality of hypothesis value intervals based on a plurality of thresholds corresponding to the impact factors, the method further comprises:

for each influence factor, performing de-duplication and sequencing on a plurality of initial thresholds to obtain a threshold list corresponding to the influence factor; the threshold list comprises a plurality of thresholds corresponding to the influence factors.

4. The method of claim 3, wherein assigning the target hypothesis value to the plurality of account data having the impact factor based on the key-value pairs comprises:

determining an ordering of a plurality of the hypothesis value intervals based on the threshold list;

determining a ranking of a plurality of target hypothesis values based on the ranking of a plurality of hypothesis value intervals;

assigning the target hypothesis values to the plurality of account data having the impact factors in turn based on the ordering of the plurality of target hypothesis values.

5. The method according to claim 1, wherein the method further comprises:

and storing the relation between the influence factors and the categories of the account data into a storage object.

6. The method of claim 5, wherein storing the relationship between the impact factor and the category of account data into a storage object comprises:

obtaining a category value of a plurality of account data with the target hypothesis value for each influence factor;

and acquiring an average value of the class values of the account data, and storing the influence factors, the assumption values and the average values into the storage object.

7. An account category identification device, the device comprising:

the hypothesis value interval dividing module is used for acquiring a plurality of initial hypothesis values corresponding to each influence factor; dividing the initial hypothesis values into hypothesis value intervals based on a plurality of thresholds corresponding to the influence factors;

a key value pair obtaining module, configured to obtain a plurality of key value pairs based on a plurality of initial hypothesis values in the hypothesis value interval; wherein the key value pair is formed by a target hypothesis value in the hypothesis value interval and the influence factor; the number of key value pairs is the same as the number of initial hypothesis values in the hypothesis value interval;

the category determining module is used for assigning the target hypothesis value to a plurality of account data with the influence factors based on the key value pairs, calculating a category value corresponding to each account data under each key value pair, and determining the category of each account data based on the category value;

and the category relation determining module is used for obtaining the relation between the influence factors with different values and the categories of the account data based on the categories of the plurality of account data under each key value pair.

8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.

9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.

10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.