CN114399319A

CN114399319A - False enterprise identification method, device, equipment and medium based on prediction model

Info

Publication number: CN114399319A
Application number: CN202210075344.3A
Authority: CN
Inventors: 陈芷昕
Original assignee: Ping An International Smart City Technology Co Ltd
Current assignee: Ping An International Smart City Technology Co Ltd
Priority date: 2022-01-22
Filing date: 2022-01-22
Publication date: 2022-04-26

Abstract

The application provides a false enterprise identification method, a false enterprise identification device, false enterprise identification equipment and a false enterprise identification medium based on a prediction model, wherein the method is realized by the following steps: receiving first request information for requesting to acquire a legality result of a target enterprise; acquiring a plurality of preset indexes of a target enterprise, wherein the preset indexes are related to registration information of the target enterprise; performing characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, wherein each characteristic index in the plurality of characteristic indexes adopts a characteristic numerical value to represent the risk degree of a target enterprise as an illegal registered enterprise; inputting a plurality of characteristic indexes into a prediction model to obtain a false index corresponding to a target enterprise; and determining the legality result of the target enterprise according to the false index size. By adopting the method of the embodiment of the application, a plurality of characteristic indexes of the target enterprise are input into the prediction model, and the legality result of the target enterprise is determined according to the obtained false index, so that the accuracy of predicting the legality result of the enterprise is improved.

Description

False enterprise identification method, device, equipment and medium based on prediction model

Technical Field

The present application relates to the field of data analysis technologies, and in particular, to a method, an apparatus, a device, and a medium for identifying false enterprises based on a prediction model.

Background

With the rapid development of economy, economic activities are increasingly active, and accordingly, in order to expand the economic scale, the threshold of enterprise registration is reduced, and the number of registered enterprises is increased.

However, in this process, lawbreakers often use leaked identity card information of other people to register enterprises, so that some citizens unknowingly have enterprises registered in their names by others, which creates risks for, and adversely affects, related services such as their own personal credit applications. Therefore, predicting in advance whether an enterprise is a falsely registered enterprise has become an urgent problem to be solved.

Disclosure of Invention

The embodiment of the application provides a false enterprise identification method, a false enterprise identification device, false enterprise identification equipment and a false enterprise identification medium based on a prediction model.

In a first aspect, an embodiment of the present application provides a false enterprise identification method based on a prediction model, where the method includes:

receiving first request information, wherein the first request information is used for requesting to acquire a legality result of a target enterprise, and the legality result comprises that the target enterprise is a legal registered enterprise or an illegal registered enterprise;

acquiring a plurality of preset indexes of a target enterprise, wherein the preset indexes are related to registration information of the target enterprise;

performing characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, wherein each characteristic index in the plurality of characteristic indexes adopts a characteristic numerical value to represent the risk degree of a target enterprise as an illegal registered enterprise;

inputting a plurality of characteristic indexes into a prediction model to obtain a false index corresponding to a target enterprise;

and determining the legality result of the target enterprise according to the false index size.

In a second aspect, an embodiment of the present application provides a false enterprise identification apparatus based on a prediction model, where the apparatus includes:

the receiving unit is used for receiving first request information, wherein the first request information is used for requesting to acquire a legality result of a target enterprise, and the legality result comprises that the target enterprise is a legal registered enterprise or an illegal registered enterprise;

the acquisition unit is used for acquiring a plurality of preset indexes of the target enterprise, wherein the preset indexes are related to registration information of the target enterprise;

the processing unit is used for performing characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, and each characteristic index in the plurality of characteristic indexes adopts a characteristic numerical value to represent the risk degree of the target enterprise as an illegal registered enterprise;

the model unit is used for inputting the characteristic indexes into the prediction model to obtain a false index corresponding to the target enterprise;

and the determining unit is used for determining the legality result of the target enterprise according to the size of the false index.

In a third aspect, embodiments of the present application provide an electronic device, which includes a processor, a memory, and computer executable instructions stored on the memory and executable on the processor, and when the computer executable instructions are executed, the electronic device is caused to perform some or all of the steps described in any one of the methods of the first aspect of the embodiments of the present application.

In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon computer instructions, which, when executed on a communication apparatus, cause the communication apparatus to perform some or all of the steps as described in any one of the methods of the first aspect of the embodiments of the present application.

In a fifth aspect, the present application provides a computer program product, where the computer program product includes a computer program operable to cause a computer to perform some or all of the steps as described in any one of the methods of the first aspect of the embodiments of the present application. The computer program product may be a software installation package.

It can be seen that, in the embodiment of the present application, first request information is received, where the first request information is used to request to obtain a validity result of a target enterprise, and the validity result includes that the target enterprise is a legal registered enterprise or an illegal registered enterprise; acquiring a plurality of preset indexes of a target enterprise, wherein the preset indexes are related to registration information of the target enterprise; performing characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, wherein each characteristic index in the plurality of characteristic indexes adopts a characteristic numerical value to represent the risk degree of the target enterprise as an illegal registered enterprise; inputting a plurality of characteristic indexes into a prediction model to obtain a false index corresponding to a target enterprise; and determining the legality result of the target enterprise according to the false index size. By adopting the method of the embodiment of the application, a plurality of characteristic indexes of the target enterprise are input into the prediction model, and the legality result of the target enterprise is determined according to the obtained false index corresponding to the target enterprise, so that the accuracy of predicting the legality result of the enterprise is improved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a diagram of the structural deployment of a market regulatory system;

FIG. 2 is a flowchart of a false enterprise identification method based on a prediction model according to an embodiment of the present application;

FIG. 3 is a schematic structural diagram of a prediction model provided in an embodiment of the present application;

FIG. 4 is a diagram of the structural deployment of a predictive model-based market regulatory system as applied by an embodiment of the present application;

FIG. 5 is a schematic diagram illustrating an example of a false enterprise identification method based on a prediction model according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a false enterprise identification device based on a prediction model according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a server in a hardware operating environment of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps is not limited to only those steps recited, but may alternatively include other steps not recited, or may alternatively include other steps inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

The following describes an application scenario related to an embodiment of the present application with reference to the drawings.

Fig. 1 is a structural deployment diagram of a market regulatory system. As shown in fig. 1, the system includes a user terminal, a first server, and an enterprise information repository.

The user terminal is used by a party that wants to obtain the legality result of a target enterprise; it sends first request information, requesting that result, to the first server in the market supervision system;

the first server is used for receiving first request information sent by the user terminal, acquiring a plurality of preset indexes of a target enterprise from an enterprise information base according to the first request information, predicting a legality result of the target enterprise according to the acquired preset indexes, and sending the predicted legality result to the user terminal;

the enterprise information base is used for storing a plurality of preset indexes of each enterprise, and the preset indexes are related to the registration information of that enterprise. In a specific implementation, the enterprise information base may be the National Enterprise Credit Information Publicity System;

in the process by which this system predicts the legality result of the target enterprise, the first server simply obtains a plurality of preset indexes of the target enterprise from the enterprise information base, and the legality result is then predicted directly from those indexes. The prediction therefore depends on manual experience: throughout the process, a worker using the first server makes the prediction by hand, with no preset prediction basis or auxiliary prediction tool, so the accuracy of the predicted legality result is low.

Based on this, an embodiment of the present application provides a false enterprise identification method based on a prediction model, applied to a market monitoring system that includes a first server and a user terminal. Referring to fig. 2, fig. 2 is a flowchart of a false enterprise identification method based on a prediction model provided in an embodiment of the present application. As shown in fig. 2, the method includes the following steps:

101: the first server receives first request information sent by the user terminal, wherein the first request information is used for requesting to acquire a legality result of the target enterprise, and the legality result comprises that the target enterprise is a legal registered enterprise or an illegal registered enterprise.

The market monitoring system can be applied to electronic equipment such as smart phones, desktop computers and tablet computers.

The first request message may include a business name of the target business.

An illegal registered enterprise is one whose registration is likely intended for illegal purposes, such as tax evasion, fraud, or collusive bidding in tendering activities.

102: the first server obtains a plurality of preset indexes of the target enterprise, and the preset indexes are related to the registration information of the target enterprise.

The plurality of preset indexes may include at least one of: the number of abnormal-contact enterprises at the registered address of the target enterprise; the number of false enterprises whose registration was handled by the target enterprise's handler; whether the handler of the target enterprise is at risk of having used another person's identity; the number of times the target enterprise's industrial and commercial information has changed in historical time; the number of enterprises in which the target enterprise's shareholder holding the most positions holds positions; whether the target enterprise has only natural-person shareholders; and the enterprise operation state of the target enterprise, where the enterprise operation state includes subsisting, in operation, revoked, deregistered, moved in, moved out, suspended, liquidation and the like.

In a specific implementation, the first server may obtain the plurality of preset indexes of the target enterprise in either of two ways: the market monitoring system may be connected to the National Enterprise Credit Information Publicity System or another enterprise information system, from which the first server obtains the preset indexes; or the market supervision system may retrieve the preset indexes of the target enterprise from a local database.

Illustratively, the number of changes to the target enterprise's industrial and commercial information in historical time, and the number of enterprises in which the target enterprise's shareholder holding the most positions holds positions, may be obtained by the market supervision system from the National Enterprise Credit Information Publicity System connected to it; the number of false enterprises whose registration was handled by the target enterprise's handler, and whether the handler is at risk of having used another person's identity, may be retrieved by the market supervision system from a local database.

103: the first server carries out characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, and each characteristic index in the plurality of characteristic indexes adopts a characteristic numerical value to represent the risk degree of the target enterprise as an illegal registered enterprise.

The first server performs characterization processing on the plurality of preset indexes according to the content of the preset indexes.

For example, if the preset index is whether the handler of the target enterprise is at risk of having used another person's identity, characterization marks the preset index as "1" when the handler is at such risk and as "0" when the handler is not; if the preset index has numerical content, it is marked with that numerical value. For example, if the preset index is the number of abnormal-contact enterprises at the registered address of the target enterprise and that number is 10, characterization marks the preset index as "10";

further, when the preset index is the number of abnormal-contact enterprises at the registered address of the target enterprise, the registered address itself may be further characterized. For example, the registered address is aggregated to the level of "No. XX, XX Street, XX District, XX City, XX Province" or "Building XX, XX Area, XX District, XX City, XX Province". For instance, if the registered address of the target enterprise is "Room 203, Building 1, Guanghai Industrial Area", it is aggregated to "Building 1, Guanghai Industrial Area"; using the building as the unit of the aggregated address makes it possible to find whether the area where the enterprise is registered carries a risk of false registration. If 5 abnormal-contact enterprises are detected at the registered address "Building 1, Guanghai Industrial Area", the preset index is marked as "5". The more abnormal-contact enterprises exist at the registered address, the higher the risk that the enterprise is an illegal registered enterprise, so the larger the value of this preset index, the more likely the legality result of the enterprise is an illegal registered enterprise.
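The building-level aggregation described above can be sketched as follows. This is only an illustration: it assumes addresses written in the English order "Room N, Building M, Area", and the function names and the regular expression are hypothetical, not part of the application.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: addresses look like "Room 203, Building 1, Guanghai Industrial Area".
_ROOM_PREFIX = re.compile(r"^\s*Room\s+\S+\s*,\s*", re.IGNORECASE)


def aggregate_to_building(address: str) -> str:
    """Drop the leading room component so the building is the aggregation unit."""
    return _ROOM_PREFIX.sub("", address).strip()


def abnormal_contact_count(target_address: str,
                           abnormal_addresses: Iterable[str]) -> int:
    """Count abnormal-contact enterprises registered in the target's building."""
    counts = Counter(aggregate_to_building(a) for a in abnormal_addresses)
    return counts[aggregate_to_building(target_address)]
```

Aggregating before counting means that enterprises registered in different rooms of the same building contribute to the same count.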

For another example, if the preset index is the enterprise operation state of the target enterprise, and the enterprise operation state includes 8 states (subsisting, in operation, revoked, deregistered, moved in, moved out, suspended, and liquidation), characterization marks the preset index with a numerical value according to the operation state. In a specific implementation, values may be assigned to the 8 operation states, for example in operation = 0, liquidation = 0.5, revoked = 1, and so on, so that when the enterprise operation state of the target enterprise is in operation, characterization marks the preset index as "0".
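The three characterization rules described so far (yes/no indexes, count indexes, and operation states) can be sketched in one function. The function name, the index names and the English state labels are assumptions of this sketch; only the three state values that the text gives (0, 0.5 and 1) are mapped, and the remaining states are deliberately left unmapped because the text does not state their values.

```python
# Values for the three states whose values the text gives; the other five
# states are elided in the source, so this sketch leaves them unmapped.
STATE_VALUES = {"in operation": 0.0, "liquidation": 0.5, "revoked": 1.0}


def characterize(name: str, value) -> float:
    """Turn one preset index into a numeric characteristic index."""
    if isinstance(value, bool):          # yes/no index -> 1 or 0
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):  # count index -> the count itself
        return float(value)
    if value in STATE_VALUES:            # operation state -> assigned value
        return STATE_VALUES[value]
    raise ValueError(f"no characterization rule for {name!r}: {value!r}")
```

The `bool` check comes first because Python treats `True` and `False` as integers; testing for numbers first would still give 1 and 0, but the explicit branch keeps the yes/no rule visible.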

104: the first server inputs the characteristic indexes into the prediction model to obtain a false index corresponding to the target enterprise, and the false index is used for representing the possibility that the target enterprise is an illegal registered enterprise.

Wherein a larger false index indicates a greater likelihood that the target enterprise is an illegitimate registered enterprise. The value range of the false index can be 0-1.

The prediction model may include an input layer, an SVM model layer, and an XGBoost model layer.

For example, referring to fig. 3, fig. 3 is a schematic structural diagram of a prediction model provided in an embodiment of the present application, and as shown in fig. 3, the prediction model includes the following layers:

the input layer is used for receiving a plurality of preset indexes (preset index 1, preset index 2, …, preset index N) of a target enterprise, and performing characterization processing on them to obtain a plurality of characteristic indexes (characteristic index 1, characteristic index 2, …, characteristic index N) of the target enterprise;

the SVM model layer is composed of K SVM models (SVM model 1, SVM model 2, …, SVM model K), each of which uses a different training data set and verification data set during training; the SVM model layer inputs the plurality of characteristic indexes of the target enterprise into each SVM model to obtain K initial false indexes;

and the XGBoost model layer is used for weighting and summing the K initial false indexes and taking the result as the false index corresponding to the target enterprise.

It should be noted that the prediction model shown in fig. 3 is only an example of a prediction model, and in a specific application, the prediction model may also exist in other hierarchical forms.
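As a rough sketch of how the layers of fig. 3 fit together, the following uses plain Python callables as stand-ins for the K trained SVM models and a fixed weighted sum in place of the trained XGBoost layer. The names `Scorer` and `predict_false_index` are hypothetical, and no real SVM or XGBoost library is used here.

```python
from typing import Callable, Sequence

# Stand-in for one trained SVM model: characteristic indexes -> initial false index.
Scorer = Callable[[Sequence[float]], float]


def predict_false_index(features: Sequence[float],
                        scorers: Sequence[Scorer],
                        weights: Sequence[float]) -> float:
    """Score the features with each of the K scorers, then weight and sum."""
    if len(scorers) != len(weights):
        raise ValueError("need exactly one weight per scorer")
    initial = [score(features) for score in scorers]  # K initial false indexes
    return sum(k * x for k, x in zip(initial, weights))
```

Keeping the scorers and the combining step separate mirrors the layered structure: each model sees the same characteristic indexes, and only the final layer sees the K initial false indexes.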

105: and the first server determines the legality result of the target enterprise according to the false index size.

The first server determines the legality result of the target enterprise according to the size of the false index. Since the false index represents the possibility that the target enterprise is an illegal registered enterprise, in a specific implementation: if the false index is greater than a first preset threshold, the first server determines that the target enterprise is an illegal registered enterprise; if the false index is less than or equal to the first preset threshold, the first server determines that the target enterprise is a legal registered enterprise. If the false index ranges from 0 to 1, the first preset threshold may be 0.5, 0.6, 0.8 or another value.
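The threshold rule can be written as a short function. This is a minimal sketch assuming a false index in the range 0 to 1; the function name and the default threshold of 0.5 (one of the values the text mentions) are choices made for illustration.

```python
def legality_result(false_index: float, threshold: float = 0.5) -> str:
    """Map a false index to a legality result using the first preset threshold."""
    if not 0.0 <= false_index <= 1.0:
        raise ValueError("false index is expected to lie in [0, 1]")
    if false_index > threshold:  # strictly greater than the threshold -> illegal
        return "illegal registered enterprise"
    return "legal registered enterprise"  # less than or equal -> legal
```

Note the boundary case: a false index exactly equal to the threshold yields a legal registered enterprise, as stated in the text.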

106: the first server sends a validity result to the user terminal.

In a specific implementation, the first server may send the legality result to the user terminal in the form of a notification short message, whose content may be "the legality result of the target enterprise: the target enterprise is a legal registered enterprise/an illegal registered enterprise" or other content; the first server may also send the legality result to the user terminal in the form of an instant message.

In a specific implementation, the first server may obtain the legality result that it sends to the user terminal in either of two ways. A prediction model for predicting whether an enterprise is a legal or an illegal registered enterprise may be deployed in the market supervision system, so that when the user terminal initiates request information, the first server obtains a plurality of preset indexes of the enterprise named in the request and inputs them into the prediction model to predict the legality result. Alternatively, the first server may use the prediction model to predict the legality result of each enterprise in advance and store the results in a local database, so that when the user terminal initiates the first request information, the first server directly retrieves the legality result of the corresponding target enterprise from the local database and sends it to the user terminal.
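The two ways of obtaining the legality result can be sketched side by side. The text presents them as alternatives; combining them into one lookup with a fallback, as below, is an assumption of this sketch, as are the function name and the use of a dictionary in place of the local database.

```python
from typing import Callable, Dict


def answer_request(enterprise_name: str,
                   stored_results: Dict[str, str],
                   predict_on_demand: Callable[[str], str]) -> str:
    """Return a precomputed legality result if one is stored, else predict now."""
    stored = stored_results.get(enterprise_name)
    if stored is not None:
        return stored                          # result predicted in advance
    return predict_on_demand(enterprise_name)  # run the prediction model per request
```

Precomputing trades storage and staleness for response time; predicting on demand always uses the current preset indexes but runs the model for every request.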

The apparatus according to the embodiments of the present application will be described with reference to the accompanying drawings.

Referring to fig. 4, fig. 4 is a structural deployment diagram of a market monitoring system based on a prediction model, which is applied in the embodiment of the present application, and as shown in fig. 4, the system includes a user terminal, a first server, an enterprise information base and the prediction model, where the function of each module may be implemented by a separate server, or the functions of multiple modules may be implemented by one server. And a plurality of servers realizing the functions of different modules are mutually communicated and connected. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform.

The user terminal is used by a party that wants to obtain the legality result of the target enterprise; it therefore sends first request information, requesting that result, to the first server in the system.

The first server is used for receiving the first request information sent by the user terminal, acquiring a plurality of preset indexes of the target enterprise from the enterprise information base according to the first request information, performing characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, inputting the plurality of characteristic indexes into the prediction model to obtain a false index corresponding to the target enterprise, determining the legality result of the target enterprise according to the size of the false index, and sending the legality result to the user terminal to satisfy the user terminal's request.

The enterprise information base is used for storing a plurality of preset indexes of each enterprise, and the preset indexes are related to the registration information of that enterprise. In a specific implementation, the enterprise information base may be the National Enterprise Credit Information Publicity System.

The prediction model is used for outputting a false index corresponding to the target enterprise to the first server according to a plurality of characteristic indexes of the target enterprise input by the first server.

Illustratively, assume that the first preset threshold is 0.8, that is, the legality result of the target enterprise is determined to be an illegal registered enterprise when the false index is greater than 0.8. User A wants to know whether the target enterprise is an illegal registered enterprise, so user A uses the user terminal to initiate first request information in the market supervision system; the first request information includes the enterprise name of the target enterprise and requests the legality result of the target enterprise. Worker B receives the first request information on the first server and, on the first server, obtains a plurality of preset indexes of the target enterprise from the National Enterprise Credit Information Publicity System or another enterprise information base; the preset indexes may include the number of false enterprises whose registration was handled by the target enterprise's handler, whether the handler is at risk of having used another person's identity, and the like. Worker B performs characterization processing on the preset indexes on the first server to obtain a plurality of characteristic indexes, each of which uses a characteristic numerical value to represent the risk degree of the target enterprise being an illegal registered enterprise. Worker B then inputs the characteristic indexes into the prediction model on the first server and obtains a false index of 0.9 for the target enterprise. Since 0.9 > 0.8, worker B determines on the first server that the legality result of the target enterprise is an illegal registered enterprise, and sends from the first server to the user terminal a short message with the content "the legality result of the target enterprise: the target enterprise is an illegal registered enterprise", so as to inform user A of the legality result of the target enterprise.

It can be seen that, in the embodiment of the present application, the first server receives first request information sent by the user terminal, where the first request information is used to request the legality result of the target enterprise, and the legality result is that the target enterprise is either a legal registered enterprise or an illegal registered enterprise; the first server obtains a plurality of preset indexes of the target enterprise, the preset indexes being related to registration information of the target enterprise; the first server performs characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, each of which uses a characteristic numerical value to represent the risk degree of the target enterprise being an illegal registered enterprise; the first server inputs the characteristic indexes into a prediction model to obtain a false index corresponding to the target enterprise, the false index representing the possibility that the target enterprise is an illegal registered enterprise; the first server determines the legality result of the target enterprise according to the size of the false index; and the first server sends the legality result to the user terminal. By adopting the method of the embodiment of the application, a plurality of characteristic indexes of the target enterprise are input into the prediction model, and the legality result of the target enterprise is determined according to the obtained false index corresponding to the target enterprise, so that the accuracy of predicting the legality result of the enterprise is improved.

In one possible example, the step in which the first server inputs the plurality of characteristic indexes into the prediction model to obtain the false index corresponding to the target enterprise includes: the first server inputs the plurality of characteristic indexes of the target enterprise into K trained SVM models respectively to obtain K initial false indexes; the first server inputs the K initial false indexes into a trained tree model XGBoost, and the weighted sum of the K initial false indexes is taken as the false index corresponding to the target enterprise.

The SVM (Support Vector Machine) model is a common discriminative method. In the field of machine learning, an SVM is a supervised learning model commonly used for pattern recognition, classification, and regression analysis.

The tree model XGBoost (eXtreme Gradient Boosting) is a boosting algorithm model, typically used to combine many weak classifiers into a strong classifier.

Assuming that the K initial false indexes are K1, K2, …, Kk and that the weights of the K trained SVM models are X1, X2, …, Xk, the false index after weighting and summing the K initial false indexes is K1 × X1 + K2 × X2 + … + Kk × Xk.

In the specific implementation, the prediction model may be formed by connecting K trained SVM models and a trained XGboost model and then packaging.

Exemplarily, referring to fig. 5, fig. 5 is a schematic diagram illustrating an example of the false enterprise identification method based on a prediction model according to an embodiment of the present application. As shown in fig. 5, assume that the preset indexes obtained for the target enterprise are: the number of abnormally contacted enterprises at the registered address is 3; the number of false enterprises registered by the handler acting as an agent is 11; the handler carries a risk of identity misuse; the annual report has not been submitted normally; the current tax payment state is abnormal; the number of changes to the business information in the historical period is 0; the largest number of enterprises in which any one shareholder holds a position is 3; the shareholders are not exclusively natural persons; and the business operation state is "reimbursement". After the preset indexes are subjected to characterization processing, the characteristic indexes of the target enterprise are respectively: 3, 11, 1, 0, 3, 0, 1. Suppose the number K of trained SVM models is 9. The first server inputs the characteristic indexes of the target enterprise into the 9 trained SVM models (SVM1, SVM2, …, SVM9) and obtains 9 initial false indexes (initial false indexes 1-9): 0.781, 0.893, 0.981, 0.342, 0.920, 0.419, 0.827, 0.193 and 0.651. Assuming that trained SVM models 1-8 are each given a weight of 0.1 and trained SVM model 9 is given a weight of 0.2, the first server inputs the 9 initial false indexes into the trained XGBoost tree model and obtains, as the false index corresponding to the target enterprise, the weighted sum (0.781 + 0.893 + … + 0.193) × 0.1 + 0.651 × 0.2 ≈ 0.666. The first server then determines the legality result of the target enterprise according to the size of the false index.
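The arithmetic of this example can be checked with a short sketch. The function name `weighted_false_index` is illustrative and does not come from the application; the scores and weights are the ones given in the example.

```python
# Initial false indexes produced by the nine SVM models in the example.
initial_false_indexes = [0.781, 0.893, 0.981, 0.342, 0.920,
                         0.419, 0.827, 0.193, 0.651]
# Weights from the example: 0.1 for models 1-8 and 0.2 for model 9.
weights = [0.1] * 8 + [0.2]


def weighted_false_index(scores, model_weights):
    """Return the weighted sum K1*X1 + K2*X2 + ... + Kk*Xk."""
    if len(scores) != len(model_weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, model_weights))


false_index = weighted_false_index(initial_false_indexes, weights)
print(round(false_index, 3))  # 0.666
```

The unrounded sum is 0.6658, which matches the value of approximately 0.666 stated in the example.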

It can be seen that, in the embodiment of the application, the first server respectively inputs the feature indexes of the target enterprise into the K trained SVM models to obtain K initial false indexes, and then inputs the K initial false indexes into the trained tree model XGboost model to obtain the false indexes obtained by weighting and summing the K initial false indexes as the corresponding false indexes of the target enterprise. Namely, the prediction model is formed by connecting K trained SVM models and a trained XGboost model, and the prediction model with the composition structure is adopted, so that the calculation of the false indexes corresponding to the target enterprise by the prediction model is more accurate, and the obtained legality result of the target enterprise is more accurate.

In a possible example, the plurality of preset indicators include an enterprise registration address, and the characterizing the plurality of indicators by the first server includes: the first server acquires a registration address of a target enterprise; the first server matches the registration address of the target enterprise with the registration address stored in the local database of the first server, and determines an in-database registration address corresponding to the registration address of the target enterprise; the first server determines the registration address risk degree corresponding to the target enterprise according to the in-store registration address, the registration address risk degree and the in-store registration address have a corresponding relation, and the registration address risk degree is determined according to the number of the enterprises with abnormal contact of the in-store registration address.

An abnormally contacted enterprise is an enterprise registered at the address that cannot be reached through the contact information recorded for the in-database registration address. Such an enterprise is highly likely to have been registered specifically for illicit purposes.

For example, the first server may match the registration address of the target enterprise against the registration addresses stored in its local database by checking whether the target enterprise's registration address falls within an in-database registration address. If the registration address of the target enterprise is "Room 203, Building 1, Yuehai Industrial Zone" and an in-database registration address is "Building 1, Yuehai Industrial Zone", the target enterprise's address is a more specific address within the in-database address, so the registration address of the target enterprise is matched to "Building 1, Yuehai Industrial Zone".

In a specific implementation, the in-database registration address serves as a key for directly looking up the number of abnormally contacted enterprises recorded at the registration address of the target enterprise.

The risk index of the registration address is determined according to the number of the enterprises with abnormal contact of the registered address in the database, and in the specific implementation, the risk index of the registration address is larger when the number of the enterprises with abnormal contact of the registered address in the database is larger, namely, the risk index of the registration address and the number of the enterprises with abnormal contact of the registered address in the database are in positive correlation.

For example, the first server determines the risk level of the registered address corresponding to the target enterprise according to the registered address in the repository, which may be numerically marking the registered address of the enterprise in a plurality of preset indexes according to the number of abnormally contacted enterprises in the registered address in the repository, for example, if the number of abnormally contacted enterprises in the registered address in the repository is 10, that is, there are 10 enterprises that cannot be contacted in the registered address in the repository, marking the preset index of the registered address of the enterprise as "10".
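A minimal sketch of the address matching and lookup described above follows. The address strings, the table `ABNORMAL_CONTACT_COUNTS`, the longest-match rule and the fallback value of 0 for an unmatched address are all assumptions made for illustration; the application does not specify them.

```python
# Hypothetical in-database registration addresses mapped to the number of
# abnormally contacted enterprises recorded at each address.
ABNORMAL_CONTACT_COUNTS = {
    "Example Industrial Zone Building 1": 10,
    "Example Industrial Zone Building 2": 0,
}


def match_in_database_address(target_address, known_addresses):
    """Return the longest known address contained in the target address."""
    candidates = [a for a in known_addresses if a in target_address]
    return max(candidates, key=len) if candidates else None


def registration_address_risk(target_address, counts=ABNORMAL_CONTACT_COUNTS):
    """Use the abnormal-contact count of the matched address as the risk value.

    Returning 0 for an unmatched address is an assumption of this sketch.
    """
    matched = match_in_database_address(target_address, counts)
    return counts[matched] if matched is not None else 0


risk = registration_address_risk("Example Industrial Zone Building 1 Room 203")
print(risk)  # 10
```

Using the raw count as the characteristic value keeps the risk degree positively correlated with the number of abnormally contacted enterprises, as the text requires.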

In the embodiment of the present application, the enterprise registration address is characterized in that the first server obtains the registration address of the target enterprise and matches the registration address of the target enterprise with the registration address stored in the local database of the first server, so as to determine the in-repository registration address corresponding to the registration address of the target enterprise; the first server determines the registration address risk degree corresponding to the target enterprise according to the in-store registration address, the registration address risk degree and the in-store registration address have a corresponding relation, and the registration address risk degree is determined according to the number of the enterprises with abnormal contact of the in-store registration address. After the enterprise registration address of the target enterprise is characterized, the registration address risk degree corresponding to the target enterprise is finally obtained, so that the enterprise registration address of the target enterprise can represent the risk degree of the target enterprise as an illegal registration enterprise through the characteristic value, that is, the enterprise registration address of the target enterprise more intuitively and quantifiably presents the risk degree of the target enterprise as the illegal registration enterprise, and the prediction accuracy of the enterprise legality result is improved.

In one possible example, the plurality of preset indexes include an enterprise operation state, and the characterization processing of the plurality of preset indexes includes: the first server obtains the enterprise operation state of the target enterprise; the first server determines the business state type corresponding to the enterprise operation state; the first server determines the business state risk degree corresponding to the target enterprise according to that business state type, where the business state risk degree corresponds to the business state type and is negatively correlated with the activity degree of the business state type.

The enterprise operation state is one of existing, in business, revoked, cancelled, moved out, moved in, operation stopped and liquidation, where:

existing means that the enterprise exists according to law and continues normal operation;

in business means that a production-type enterprise is operating and producing normally;

revoked means that the business license of the enterprise has been revoked, which is an administrative penalty imposed by the market regulatory bureau on an enterprise that has broken the law;

cancelled means that the enterprise has lost its legal-person status;

moved out means that the registration authority of the enterprise has changed and the enterprise has moved out of the jurisdiction of a given authority;

moved in means that the registration authority of the enterprise has changed and the enterprise has moved into the jurisdiction of a given authority;

operation stopped means that, for some reason, the enterprise stops production and operation activities for a period and resumes production after conditions change;

liquidation means that, after the enterprise is dissolved in accordance with laws and regulations or declares that it will cease operation owing to bankruptcy, revocation or other reasons, its property, claims and debts are comprehensively checked, claims are collected, debts are settled and the remaining property is distributed.

The business state types comprise, in descending order of activity degree, enterprise continuation, enterprise change and enterprise stop.

The business state risk degree is negatively correlated with the activity degree of the business state type, meaning that the higher the activity degree of the business state type, the lower the business state risk degree, and vice versa. Thus, in a specific implementation, the relationship between the business state risk degree and the activity degree of the business state type may take the form of any decreasing function.

For example, in a specific implementation, if the preset index is the enterprise operation state, the 8 enterprise operation states may each be associated with a business state type: for example, existing and in business are associated with enterprise continuation; revoked, cancelled, liquidation and operation stopped are associated with enterprise stop; and moved in and moved out are associated with enterprise change. Different values are then assigned to the business state types, for example 0 for enterprise continuation, 1 for enterprise stop and 0.5 for enterprise change. Therefore, if the enterprise operation state of the target enterprise is existing, the corresponding business state type is enterprise continuation, and the preset index of the target enterprise is marked as "0".
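The two-stage mapping in this example (operation state to business state type, then type to risk value) can be sketched as two lookup tables. The English state labels and the names `STATE_TO_TYPE`, `TYPE_TO_RISK` and `business_state_risk` are illustrative assumptions; the values 0, 0.5 and 1 are the ones given in the example.

```python
# Operation state -> business state type (labels are illustrative).
STATE_TO_TYPE = {
    "existing": "continuation",
    "in business": "continuation",
    "revoked": "stop",
    "cancelled": "stop",
    "liquidation": "stop",
    "operation stopped": "stop",
    "moved in": "change",
    "moved out": "change",
}

# Business state type -> risk value; a less active type gets a higher risk.
TYPE_TO_RISK = {"continuation": 0.0, "change": 0.5, "stop": 1.0}


def business_state_risk(operation_state):
    """Map an enterprise operation state to its business state risk degree."""
    return TYPE_TO_RISK[STATE_TO_TYPE[operation_state]]


print(business_state_risk("existing"))  # 0.0
print(business_state_risk("moved in"))  # 0.5
```

Any other decreasing assignment of risk to activity degree would satisfy the negative correlation described above equally well.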

It can be seen that, in the embodiment of the present application, the manner of performing the characterization processing on the enterprise operation state is that the first server obtains the enterprise operation state of the target enterprise and determines the business state type corresponding to the enterprise operation state, and then determines the business state risk degree corresponding to the target enterprise according to the business state type corresponding to the enterprise operation state of the target enterprise, where the business state risk degree and the business state type have a corresponding relationship, and the business state risk degree and the activity degree of the business state type are negatively related. After the enterprise operation state of the target enterprise is characterized, the enterprise operation state of the target enterprise can represent the risk degree of the target enterprise as an illegal registered enterprise through the characteristic value, namely, the enterprise operation state of the target enterprise can more intuitively and quantifiably present the risk degree of the target enterprise as the illegal registered enterprise, and the prediction accuracy of the enterprise legality result is improved.

In a possible example, after the first server obtains a plurality of preset indexes of the target enterprise, the method further includes: the first server acquires other preset indexes of the target enterprise, where the other preset indexes are related to business information of the target enterprise; the first server performs characterization processing on the other preset indexes to obtain other characteristic indexes corresponding to the other preset indexes, where the other characteristic indexes use characteristic values to represent the industrial and commercial registration risk degree of the target enterprise; the step in which the first server inputs the characteristic indexes into the prediction model to obtain the false index corresponding to the target enterprise includes: the first server inputs the characteristic indexes and the other characteristic indexes into the prediction model to obtain the false index corresponding to the target enterprise.

The number of the other preset indexes may be at least one.

The other preset indexes may include whether the target enterprise submits annual reports normally, the current tax payment state of the target enterprise, and other business information. The less the business information presented by the other preset indexes complies with the regulations of the business bureau, the higher the false index that the prediction model outputs from the resulting other characteristic indexes.

For example, when the other preset index is whether the target enterprise submits the annual report normally, the preset index is characterized as follows: in a specific implementation, the preset index may be marked as "1" when the target enterprise does not submit the annual report normally, and as "0" when the target enterprise submits the annual report normally.

For example, when the other preset index is the current tax payment state of the target enterprise, the preset index is characterized, and in a specific implementation, the preset index may be marked as "1" when the current tax payment state of the target enterprise is abnormal, or conversely, the preset index may be marked as "0" when the current tax payment state of the target enterprise is normal.
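The binary marking described for the annual report and tax payment indexes can be sketched as follows. The function names and the boolean inputs are assumptions made for illustration.

```python
def annual_report_feature(submitted_normally: bool) -> int:
    """Return 1 when the annual report was not submitted normally, else 0."""
    return 0 if submitted_normally else 1


def tax_state_feature(tax_state_normal: bool) -> int:
    """Return 1 when the current tax payment state is abnormal, else 0."""
    return 0 if tax_state_normal else 1


# An enterprise that skipped its annual report and has an abnormal tax state.
other_features = [annual_report_feature(False), tax_state_feature(False)]
print(other_features)  # [1, 1]
```

In both cases the non-compliant state is marked 1, so a larger value consistently indicates higher risk.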

It can be seen that, in the embodiment of the application, after the first server obtains the multiple preset indexes of the target enterprise, the first server also obtains other preset indexes related to the business information of the target enterprise, the first server performs characterization processing on the other preset indexes to obtain other characteristic indexes corresponding to the other preset indexes, the other characteristic indexes represent the business registration risk degree of the target enterprise through characteristic values, and the first server inputs the multiple characteristic indexes into the prediction model and also inputs the other characteristic indexes obtained after the characterization processing is performed on the other preset indexes into the prediction model, so as to obtain the false index corresponding to the target enterprise. When the legality of an enterprise is predicted, the registration information and the business information of the enterprise are combined to serve as a prediction basis, so that the enterprise related information input into the prediction model is more comprehensive, the prediction model can perform more comprehensive prediction calculation on the false indexes of the enterprise, and the accuracy of predicting the legality result of the enterprise is improved.

In one possible example, the K trained SVM models are trained as follows: acquiring K data sets, where each of the K data sets comprises a plurality of enterprises, a plurality of preset indexes corresponding to each of those enterprises, and a legality result corresponding to each enterprise; using K-1 of the K data sets as the training data set of a first initial SVM model among K initial SVM models, and inputting them into the first initial SVM model for training to obtain a first SVM model in training; using the remaining data set, other than the K-1 data sets, as the verification data set of the first SVM model in training, and inputting it into the first SVM model in training to obtain the false index inferred for the verification data set; comparing that false index with the legality result corresponding to the verification data set to determine the inference accuracy of the first SVM model in training; if the inference accuracy of the first SVM model in training is not higher than a first preset accuracy, iteratively training the first SVM model in training, and if it is higher than the first preset accuracy, determining that the training of the first SVM model is complete and that the first SVM model is one of the K SVM models; and repeating the process for the next initial SVM model among the K initial SVM models, that is, using K-1 of the K data sets as its training data set to obtain the next SVM model in training, using the remaining data set as its verification data set, and determining from its inference accuracy whether its training is complete, until all K SVM models have been trained, where the training data set used by each of the K initial SVM models differs from the training data sets used by the other initial SVM models.

The false index inferred by the first SVM model in training for the verification data set is compared with the legality result corresponding to the verification data set to determine the inference accuracy of the first SVM model in training as follows: when the false index is greater than a second preset threshold and the legality result corresponding to the verification data set is an illegal registered enterprise, or when the false index is less than or equal to the second preset threshold and the legality result corresponding to the verification data set is a legal registered enterprise, the difference between the false index and the second preset threshold is calculated, the ratio of that difference to the second preset threshold is taken as the false index error, and the inference accuracy of the first SVM model in training is determined from the false index error.

Illustratively, the inference accuracy of the first SVM model in training is determined from the false index error as: inference accuracy = (1 - false index error) × 100% = [1 - |false index - second preset threshold| / second preset threshold] × 100%. If the second preset threshold is 0.80, the false index inferred by the first SVM model in training for the verification data set is 0.85, and the first preset accuracy is 85%, then the inference accuracy = [1 - (0.85 - 0.80)/0.80] × 100% = (1 - 0.0625) × 100% = 93.75% > 85%, so the inference accuracy of the first SVM model in training is higher than the first preset accuracy, and the training of the first SVM model in training is determined to be complete.
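The accuracy formula and its worked numbers can be reproduced with a small sketch. The function name `inference_accuracy` is illustrative; the threshold, false index and preset accuracy are those of the example.

```python
def inference_accuracy(false_index, threshold):
    """Return (1 - |false_index - threshold| / threshold) * 100, in percent."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    false_index_error = abs(false_index - threshold) / threshold
    return (1 - false_index_error) * 100


accuracy = inference_accuracy(false_index=0.85, threshold=0.80)
first_preset_accuracy = 85.0
training_complete = accuracy > first_preset_accuracy
print(round(accuracy, 2), training_complete)  # 93.75 True
```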

By repeatedly using K-1 of the K data sets as the training data set of the next initial SVM model among the K initial SVM models and training it to obtain the next SVM model in training, K SVM models are finally obtained.

Illustratively, given K data sets (data set 1, data set 2, …, data set K), SVM model 1 in training is trained using the K-1 data sets from data set 1 to data set K-1 as training data sets, and data set K is used as the verification data set to determine that the inference accuracy of SVM model 1 in training is higher than the first preset accuracy; similarly, SVM model 2 in training is trained using the K-1 data sets from data set 2 to data set K as training data sets, and data set 1 is used as the verification data set to determine that the inference accuracy of SVM model 2 in training is higher than the first preset accuracy; and so on. With the training data set used by each of the K initial SVM models differing from those of the other initial SVM models, the K SVM models are determined once each data set has been used as a training data set K-1 times and the inference accuracy of each of the K SVM models is higher than the first preset accuracy.
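The rotation of training and verification data sets across the K models can be sketched without any training code. The function name `rotate_splits` and the string placeholders for the data sets are assumptions; the rotation follows the example, in which model 1 is verified on data set K and model 2 on data set 1.

```python
def rotate_splits(datasets):
    """Return one (training_sets, verification_set) pair per SVM model.

    Model i (0-based) holds out the data set at index (i - 1) mod K, so
    model 1 holds out data set K and model 2 holds out data set 1.
    """
    k = len(datasets)
    splits = []
    for i in range(k):
        held_out = (i - 1) % k
        training = [d for j, d in enumerate(datasets) if j != held_out]
        splits.append((training, datasets[held_out]))
    return splits


splits = rotate_splits(["D1", "D2", "D3", "D4"])
for model_number, (training, verification) in enumerate(splits, start=1):
    print(model_number, training, verification)
```

Each data set is held out exactly once, so it serves as training data K-1 times, which is the stopping condition stated in the example.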

It can be seen that in the training process of the K support vector machine SVM models provided in the embodiment of the present application, only when the inference accuracy of the first SVM model in the training is higher than the first preset accuracy, it is determined that the training of the first SVM model in the training is completed until the training of the K SVM models which fully cover the data features of the training data set is completed. Furthermore, K support vector machine SVM models obtained through training in the training process provided by the embodiment of the application are placed in the prediction model to predict the legality result of the enterprise, so that the manpower is liberated, and the prediction accuracy of the legality result of the enterprise is improved.

In one possible example, the training process of the trained tree model XGBoost model is as follows: acquiring a second data set, wherein the second data set comprises at least two enterprises, a plurality of preset indexes corresponding to each enterprise in the at least two enterprises and a legality result corresponding to each enterprise; splitting the second data set into a training data set and a validation data set; inputting a training data set into an initial XGboost model for training to obtain an XGboost model in training, wherein the initial XGboost model is formed by weighting and summing K SVM models after being endowed with different weights; inputting the verification data set into a training XGboost model to obtain a false index obtained by the verification data set through XGboost reasoning in training; comparing a false index obtained by inference of the XGboost model in the training of the verification data set with a legality result corresponding to the verification data set, and determining the inference accuracy of the XGboost model in the training; and if the inference accuracy of the XGboost model in the training is not higher than the second preset accuracy, modifying weights of K SVM models in the XGboost model in the training and carrying out iterative training, and if the inference accuracy of the XGboost model in the training is higher than the second preset accuracy, determining that the XGboost model in the training is trained, so as to obtain the XGboost model.

In a specific implementation, weights may be assigned to the K SVM models according to the data characteristics of the data sets used by the K SVM models.

Illustratively, weights are assigned to the K SVM models according to the absolute value of the difference between the preset indexes of the enterprises in each of the K data sets and the average value of the preset indexes over the at least two enterprises. The K SVM models are ranked by this absolute difference: the larger the absolute difference of the data set that a model used as its verification data set, the higher that model ranks. When the weights of the K SVM models in the XGBoost model in training are modified and the model is iteratively trained, the modification is made on the basis of the relationship between weight size and the ranking of the K SVM models. For example, suppose that, among the K data sets used to train the K SVM models, data set 1 has the largest absolute difference between the preset indexes of its enterprises and the average value, and data set 1 is the verification data set of SVM model 1; that is, the data set with the largest absolute difference is not used as training data for SVM model 1. SVM model 1 is therefore given the largest weight when weights are assigned to the K SVM models. With the method of this embodiment, an SVM model trained on a more balanced training data set receives a higher weight during iterative training, so that the trained XGBoost model calculates the false index more accurately.
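One way to turn this ranking into weights is sketched below. The application only states that a higher-ranked model gets a larger weight; the linear rank-to-weight rule, the normalization to a sum of 1 and the function name `rank_based_weights` are assumptions of this sketch, as are the sample deviation values.

```python
def rank_based_weights(verification_deviations):
    """Assign larger weights to models whose verification set deviates more.

    verification_deviations[i] is the absolute difference from the average
    for the data set that model i used as its verification data set.
    The linear rank-to-weight rule is an assumption, not the source's rule.
    """
    k = len(verification_deviations)
    order = sorted(range(k), key=lambda i: verification_deviations[i],
                   reverse=True)
    raw = [0.0] * k
    for position, model_index in enumerate(order):
        raw[model_index] = float(k - position)  # top rank gets score k
    total = sum(raw)
    return [score / total for score in raw]


# Hypothetical deviations for the verification sets of three SVM models.
model_weights = rank_based_weights([0.9, 0.2, 0.5])
print([round(w, 3) for w in model_weights])  # [0.5, 0.167, 0.333]
```

During iterative training the weights could then be adjusted while keeping this ordering intact, which is the constraint the paragraph above describes.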

The false index inferred by the XGBoost model in training for the verification data set is compared with the legality result corresponding to the verification data set, and the inference accuracy of the XGBoost model in training is determined from this comparison.

It can be seen that the XGBoost model provided in the embodiment of the present application is formed by weighting and summing K SVM models after being given different weights, and in the training process, it is determined that the XGBoost model training is completed only when the inference accuracy of the XGBoost model in the training is higher than a second preset accuracy. Therefore, the prediction model can not only comprehensively cover the data characteristics of the training data set through the K SVM models, but also indirectly play a role in screening and selecting the output results of the K SVM models through assigning values to the K SVM models, and therefore the accuracy of the prediction model in predicting the enterprise legality results is further improved.

Referring to fig. 6, in accordance with the embodiment shown in fig. 2, fig. 6 is a schematic structural diagram of a prediction model-based false enterprise recognition apparatus according to an embodiment of the present application, as shown in fig. 6:

a predictive model-based false enterprise identification device, said device comprising:

301: a receiving unit, configured to receive first request information, where the first request information is used to request a legality result of the target enterprise, and the legality result indicates that the target enterprise is a legal registered enterprise or an illegal registered enterprise.

302: an acquisition unit, configured to acquire a plurality of preset indexes of the target enterprise, the preset indexes being related to the registration information of the target enterprise.

303: a processing unit, configured to perform characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, where each characteristic index uses a characteristic value to represent the degree of risk that the target enterprise is an illegal registered enterprise.

304: a model unit, configured to input the characteristic indexes into the prediction model to obtain the false index corresponding to the target enterprise.

305: a determining unit, configured to determine the legality result of the target enterprise according to the size of the false index.
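The division into units 301 to 305 can be sketched as a single function that chains three injected callables. The function name, the callable parameters and the rule of comparing the false index with a fixed threshold of 0.8 are assumptions of this sketch; the text only states that the result is determined according to the size of the false index.

```python
from typing import Callable, Dict, List


def identify_false_enterprise(
    enterprise_id: str,                               # from the request (301)
    acquire: Callable[[str], Dict[str, object]],      # acquisition unit (302)
    characterize: Callable[[Dict[str, object]], List[float]],  # unit 303
    predict: Callable[[List[float]], float],          # model unit (304)
    threshold: float = 0.8,                           # assumed decision rule
) -> str:
    """Return the legality result for the requested enterprise (unit 305)."""
    preset_indexes = acquire(enterprise_id)
    characteristic_indexes = characterize(preset_indexes)
    false_index = predict(characteristic_indexes)
    if false_index > threshold:
        return "illegal registered enterprise"
    return "legal registered enterprise"


# Usage with stand-in callables in place of real data access and models.
result = identify_false_enterprise(
    "E1",
    acquire=lambda _id: {"abnormal_contacts": 3},
    characterize=lambda idx: [float(idx["abnormal_contacts"])],
    predict=lambda features: 0.9,
)
print(result)  # illegal registered enterprise
```

Passing the units in as callables mirrors the note below that the functional division is only logical and may be arranged differently in an actual implementation.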

In a specific implementation, the above apparatus may be applied to a market regulation system including a first server and a user terminal.

It can be seen that, in the apparatus provided in this embodiment of the application, the receiving unit receives the first request information, where the first request information is used to request a legality result of the target enterprise, and the legality result indicates that the target enterprise is a legal registered enterprise or an illegal registered enterprise; the acquisition unit acquires a plurality of preset indexes of the target enterprise, the preset indexes being related to registration information of the target enterprise; the processing unit performs characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, where each characteristic index uses a characteristic value to represent the degree of risk that the target enterprise is an illegal registered enterprise; the model unit inputs the plurality of characteristic indexes into the prediction model to obtain the false index corresponding to the target enterprise; and the determining unit determines the legality result of the target enterprise according to the size of the false index. With the apparatus provided by the embodiment of the application, a plurality of characteristic indexes of the target enterprise are input into the prediction model, and the legality result of the target enterprise is determined according to the resulting false index, which improves the accuracy of predicting the legality result of the enterprise.

Specifically, the embodiment of the present application may perform functional unit division on the prediction model-based false enterprise identification apparatus according to the above method example, for example, each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.

Consistent with the embodiment shown in fig. 2, an embodiment of the present application provides an electronic device. Referring to fig. 7, which is a schematic diagram of the server structure of the hardware operating environment of the electronic device, the electronic device includes a processor, a memory, and computer-executable instructions stored in the memory and executable on the processor; when the computer-executable instructions are executed, the electronic device performs the steps of any of the prediction model-based false enterprise identification methods described above.

Wherein the processor is a CPU.

The memory may be a high-speed RAM, or may be a non-volatile memory, such as a disk memory.

Those skilled in the art will appreciate that the configuration of the server shown in fig. 7 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.

As shown in fig. 7, the memory may include an operating system, a network communication module, and computer-executable instructions for the prediction model-based false enterprise identification method. The operating system manages and controls the hardware and software resources of the server and supports the running of the computer-executable instructions. The network communication module implements communication among the components inside the memory and communication with other hardware and software in the server; the communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), etc.

In the server shown in fig. 7, the processor is configured to execute the computer-executable instructions for prediction model-based false enterprise identification stored in the memory, so as to implement the following steps: receiving first request information, wherein the first request information is used for requesting a legality result of a target enterprise, and the legality result indicates that the target enterprise is either a legally registered enterprise or an illegally registered enterprise; acquiring a plurality of preset indexes of the target enterprise, wherein the preset indexes are related to registration information of the target enterprise; performing characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, wherein each characteristic index uses a characteristic numerical value to represent the degree of risk that the target enterprise is an illegally registered enterprise; inputting the plurality of characteristic indexes into a prediction model to obtain a false index corresponding to the target enterprise; and determining the legality result of the target enterprise according to the size of the false index.

For specific implementation of the server related to the present application, reference may be made to the above embodiments of the false enterprise identification method based on the prediction model, which are not described herein again.
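The characterization step in the processor steps above turns each preset index into a numerical risk value. A minimal sketch follows, taking the registration address and the business state as the two preset indexes, as in claims 3 and 4. The lookup tables, the risk values, the whitespace-based address matching, the saturation at 10 abnormal enterprises and the neutral fallback of 0.5 are all illustrative assumptions, not values taken from the application.

```python
from typing import Dict, List

# Hypothetical local database: normalized in-database registration address ->
# number of enterprises with abnormal contact registered there (made-up values).
ABNORMAL_COUNT_BY_ADDRESS: Dict[str, int] = {
    "1 example road": 5,
    "99 sample street": 25,
}

# Hypothetical business state types; a less active state gets a higher risk value.
STATE_RISK: Dict[str, float] = {"active": 0.1, "suspended": 0.6, "revoked": 0.9}


def match_address(raw: str) -> str:
    """Simplified matching: lower-case and collapse whitespace before lookup."""
    return " ".join(raw.lower().split())


def address_risk(raw: str, saturation: int = 10) -> float:
    """Risk in [0, 1] grows with the abnormal-enterprise count, then saturates."""
    count = ABNORMAL_COUNT_BY_ADDRESS.get(match_address(raw), 0)
    return min(count, saturation) / saturation


def business_state_risk(state: str) -> float:
    """Unknown state types fall back to a neutral 0.5 (an assumption)."""
    return STATE_RISK.get(state.lower(), 0.5)


def characterize(indexes: Dict[str, str]) -> List[float]:
    """Turn the preset indexes into the list of characteristic indexes."""
    return [
        address_risk(indexes["registration_address"]),
        business_state_risk(indexes["business_state"]),
    ]


features = characterize(
    {"registration_address": " 99  Sample Street", "business_state": "Revoked"}
)
print(features)
```

A production system would need a far more robust address matcher than whitespace normalization; the point here is only that each preset index is mapped to one number that a prediction model can consume.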

An embodiment of the present application provides a computer-readable storage medium, in which computer instructions are stored, and when the computer instructions are executed on a communication apparatus, the communication apparatus is caused to perform the following steps: receiving first request information, wherein the first request information is used for requesting to acquire a legality result of a target enterprise, and the legality result comprises that the target enterprise is a legal registered enterprise or an illegal registered enterprise; acquiring a plurality of preset indexes of a target enterprise, wherein the preset indexes are related to registration information of the target enterprise; performing characterization processing on the plurality of preset indexes to obtain a plurality of characteristic indexes, wherein each characteristic index in the plurality of characteristic indexes adopts a characteristic numerical value to represent the risk degree of the target enterprise as an illegal registered enterprise; inputting a plurality of characteristic indexes into a prediction model to obtain a false index corresponding to a target enterprise; and determining the legality result of the target enterprise according to the false index size. The computer includes an electronic device.

The electronic terminal equipment comprises a mobile phone, a tablet computer, a personal digital assistant, wearable equipment and the like.

The computer-readable storage medium may be an internal storage unit of the electronic device described in the above embodiments, for example, a hard disk or a memory of the electronic device. The computer readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, provided on the electronic device. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the electronic device. Computer-readable storage media are used to store computer-executable instructions and data as well as other computer-executable instructions and data needed by electronic devices. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.

For specific implementation of the computer-readable storage medium related to the present application, reference may be made to the above embodiments of the prediction model-based false enterprise identification method, which are not described herein again.

Embodiments of the present application provide a computer program product, wherein the computer program product comprises a computer program operable to cause a computer to perform some or all of the steps of any of the prediction model-based false enterprise identification methods as described in the above method embodiments, and the computer program product may be a software installation package.

It should be noted that, for simplicity of description, the above embodiments of the prediction model-based false enterprise identification method are each described as a series of combined actions, but those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Further, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions involved are not necessarily required by the present application.

The embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the prediction model-based false enterprise identification method, apparatus, device and medium of the present application, and the description of the above embodiments is intended only to help in understanding the method and its core idea. Meanwhile, those skilled in the art may, following the idea of the present application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, hardware products and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While the present application has been described in connection with various embodiments, other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed application, from a review of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the word "a" or "an" does not exclude a plurality. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

It will be apparent to those skilled in the art that various modifications and variations can be made in the predictive model-based false enterprise identification method, apparatus, device, and medium provided herein without departing from the spirit and scope of the present application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims

1. A method for identifying false businesses based on a predictive model, the method comprising:

receiving first request information, wherein the first request information is used for requesting to acquire a legality result of a target enterprise, and the legality result comprises that the target enterprise is a legally registered enterprise or an illegally registered enterprise;

acquiring a plurality of preset indexes of the target enterprise, wherein the preset indexes are related to registration information of the target enterprise;

performing characterization processing on the preset indexes to obtain a plurality of characteristic indexes, wherein each characteristic index in the characteristic indexes adopts a characteristic numerical value to represent the risk degree of the target enterprise as an illegal registered enterprise;

inputting a plurality of characteristic indexes into a prediction model to obtain a false index corresponding to the target enterprise;

and determining the legality result of the target enterprise according to the size of the false index.

2. The method of claim 1, wherein inputting the plurality of characteristic indexes into a prediction model to obtain a false index corresponding to the target enterprise comprises:

respectively inputting the plurality of characteristic indexes into K trained SVM models to obtain K initial false indexes;

inputting the K initial false indexes into a trained XGBoost tree model, and taking the false index obtained by weighting and summing the K initial false indexes as the false index corresponding to the target enterprise.

3. The method according to claim 1 or 2, wherein the plurality of preset indexes include an enterprise registration address, and performing characterization processing on the plurality of preset indexes comprises:

acquiring the registration address of the target enterprise;

matching the registration address of the target enterprise against registration addresses stored in a local database, and determining an in-database registration address corresponding to the registration address of the target enterprise;

and determining a registration address risk degree corresponding to the target enterprise according to the in-database registration address, wherein the registration address risk degree has a correspondence with the in-database registration address, and the registration address risk degree is determined according to the number of enterprises with abnormal contact at the in-database registration address.

4. The method of claim 1 or 2, wherein the plurality of preset indexes include an enterprise business state, and performing characterization processing on the plurality of preset indexes comprises:

acquiring the enterprise business state of the target enterprise;

determining a business state type corresponding to the enterprise business state;

and determining the business state risk degree corresponding to the target enterprise according to the business state type corresponding to the enterprise business state of the target enterprise, wherein the business state risk degree has a corresponding relation with the business state type, and the business state risk degree is negatively correlated with the activity degree of the business state type.

5. The method of claim 1, wherein after acquiring the plurality of preset indexes of the target enterprise, the method further comprises:

acquiring other preset indexes of the target enterprise, wherein the other preset indexes are related to the business information of the target enterprise;

performing characterization processing on the other preset indexes to obtain other characteristic indexes corresponding to the other preset indexes, wherein the other characteristic indexes represent the industrial and commercial registration risk degree of the target enterprise through characteristic numerical values;

and the inputting the plurality of characteristic indexes into a prediction model to obtain a false index corresponding to the target enterprise comprises:

and inputting the characteristic indexes and the other characteristic indexes into a prediction model to obtain a false index corresponding to the target enterprise.

6. The method of claim 2, wherein the K trained SVM models are trained as follows:

acquiring K data sets, wherein each data set in the K data sets comprises a plurality of enterprises, a plurality of preset indexes corresponding to each enterprise in the plurality of enterprises and a legality result corresponding to each enterprise;

using K-1 of the K data sets as the training data set of a first initial SVM model among K initial SVM models, and inputting the training data set into the first initial SVM model for training, to obtain a first SVM model in training;

using the remaining data set of the K data sets, other than the K-1 data sets, as the verification data set of the first SVM model in training, and inputting the verification data set into the first SVM model in training, to obtain a false index inferred for the verification data set by the first SVM model in training;

comparing the false index inferred for the verification data set by the first SVM model in training with the legality result corresponding to the verification data set, and determining the inference accuracy of the first SVM model in training;

if the inference accuracy of the first SVM model in training is not higher than a first preset accuracy, performing iterative training on the first SVM model in training, and if the inference accuracy of the first SVM model in training is higher than the first preset accuracy, determining that training of the first SVM model in training is completed and that it is one of the K SVM models;

repeating the above process for each next initial SVM model among the K initial SVM models, namely using K-1 of the K data sets as the training data set of the next initial SVM model and inputting the training data set into the next initial SVM model for training to obtain a next SVM model in training, using the remaining data set other than the K-1 data sets as the verification data set of the next SVM model in training, and determining according to the inference accuracy of the next SVM model in training whether its training is completed, until training of the K SVM models is completed, wherein the training data set and the verification data set used by each of the K initial SVM models differ from those used by the other initial SVM models.

7. The method of claim 6, wherein the trained XGBoost tree model is trained as follows:

acquiring a second data set, wherein the second data set comprises at least two enterprises, a plurality of preset indexes corresponding to each enterprise of the at least two enterprises, and a legality result corresponding to each enterprise;

splitting the second data set into a training data set and a verification data set;

inputting the training data set into an initial XGBoost model for training to obtain an XGBoost model in training, wherein the initial XGBoost model is formed as a weighted sum of the K SVM models, each of which is assigned a different weight;

inputting the verification data set into the XGBoost model in training to obtain a false index inferred for the verification data set by the XGBoost model in training;

comparing the false index inferred for the verification data set by the XGBoost model in training with the legality result corresponding to the verification data set, and determining the inference accuracy of the XGBoost model in training;

and if the inference accuracy of the XGBoost model in training is not higher than a second preset accuracy, modifying the weights of the K SVM models in the XGBoost model in training and performing iterative training, and if the inference accuracy of the XGBoost model in training is higher than the second preset accuracy, determining that training of the XGBoost model in training is completed, so as to obtain the trained XGBoost model.

8. A false enterprise identification device based on a prediction model, the device comprising:

the receiving unit is used for receiving first request information, wherein the first request information is used for requesting to acquire a legality result of a target enterprise, and the legality result comprises that the target enterprise is a legally registered enterprise or an illegally registered enterprise;

the acquisition unit is used for acquiring a plurality of preset indexes of the target enterprise, and the preset indexes are related to the registration information of the target enterprise;

the processing unit is used for performing characterization processing on the preset indexes to obtain a plurality of characteristic indexes, and each characteristic index in the characteristic indexes adopts a characteristic numerical value to represent the risk degree of the target enterprise as an illegal registered enterprise;

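the model unit is used for inputting the plurality of characteristic indexes into a prediction model to obtain a false index corresponding to the target enterprise;

and the determining unit is used for determining the legality result of the target enterprise according to the size of the false index.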

9. An electronic device comprising a processor, a memory, and computer-executable instructions stored on the memory and executable on the processor, which when executed cause the electronic device to perform the method of any of claims 1-7.

10. A computer readable storage medium having stored thereon computer instructions which, when run on a communication device, cause the communication device to perform the method of any one of claims 1-7.
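The data partition of claim 6 and the weighted sum of claim 2 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `fit_stand_in` is a trivial placeholder scorer, not an SVM; the weights are fixed and equal rather than learned by an XGBoost model; the iterative retraining against a preset accuracy is omitted; and all function names, the sample data and the 0.5 cut-off are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (characteristic indexes, 1 = illegally registered)
Scorer = Callable[[List[float]], float]


def k_fold_splits(
    datasets: Sequence[List[Sample]],
) -> List[Tuple[List[Sample], List[Sample]]]:
    """Model i trains on the K-1 data sets other than i and is verified on data set i."""
    splits = []
    for i, held_out in enumerate(datasets):
        train = [s for j, d in enumerate(datasets) if j != i for s in d]
        splits.append((train, list(held_out)))
    return splits


def fit_stand_in(train: List[Sample]) -> Scorer:
    """Placeholder for SVM training (NOT an SVM): rescales the mean feature value
    between the legal-class average (score 0) and the illegal-class average (score 1)."""
    def mean(values: List[float]) -> float:
        return sum(values) / len(values)

    legal = mean([mean(x) for x, y in train if y == 0])
    illegal = mean([mean(x) for x, y in train if y == 1])
    span = (illegal - legal) or 1.0  # avoid division by zero
    return lambda x: min(1.0, max(0.0, (mean(x) - legal) / span))


def accuracy(model: Scorer, validation: List[Sample], cutoff: float = 0.5) -> float:
    """Share of verification samples whose thresholded score matches the label."""
    hits = sum((model(x) >= cutoff) == (y == 1) for x, y in validation)
    return hits / len(validation)


def weighted_false_index(initial: List[float], weights: List[float]) -> float:
    """Weighted sum of the K initial false indexes."""
    return sum(w * s for w, s in zip(weights, initial))


# K = 3 toy data sets (made-up values).
datasets = [
    [([0.9, 0.8], 1), ([0.1, 0.2], 0)],
    [([0.7, 0.9], 1), ([0.2, 0.1], 0)],
    [([0.8, 0.6], 1), ([0.3, 0.1], 0)],
]
splits = k_fold_splits(datasets)
models = [fit_stand_in(train) for train, _ in splits]
scores = [accuracy(m, val) for m, (_, val) in zip(models, splits)]
initial = [m([0.5, 0.5]) for m in models]
false_index = weighted_false_index(initial, [1 / 3, 1 / 3, 1 / 3])
print(scores, false_index)
```

In a real implementation the placeholder would be replaced by an actual SVM trainer and the fixed weights by the trained second-stage model; the sketch only shows how each of the K models sees a different training and verification combination and how their K outputs are merged into one false index.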