CN112348092A - Data processing method and device, server and storage medium - Google Patents

Data processing method and device, server and storage medium

Info

Publication number
CN112348092A
Authority
CN
China
Prior art keywords
data
evaluated
evaluation
processing method
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011246123.5A
Other languages
Chinese (zh)
Inventor
黎豪
陈海雯
张汉林
李立峰
柯学
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gf Securities Co ltd
Original Assignee
Gf Securities Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gf Securities Co ltd
Priority to CN202011246123.5A
Publication of CN112348092A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Databases & Information Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a data processing method and device, a server, and a storage medium, and relate to the technical field of data processing. The data processing method comprises: first, acquiring feature data of an object to be evaluated; and second, processing the feature data through a preset evaluation model to obtain evaluation data of the object to be evaluated. In this way, the evaluation is based on the model and the feature data, which addresses the low reliability of analysis in the prior art, where the object to be evaluated is assessed directly from its rate of return since entering the profession.

Description

Data processing method and device, server and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, a server, and a storage medium.
Background
In fund investment and research, it is often necessary to study and analyze an object to be evaluated. However, the inventors have found that the prior art generally evaluates the object directly on the basis of its rate of return since entering the profession, so the reliability of the analysis of the object to be evaluated is low.
Disclosure of Invention
In view of the above, an object of the present application is to provide a data processing method and apparatus, a server, and a storage medium, so as to solve the problems in the prior art.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
In a first aspect, the present invention provides a data processing method, including:
acquiring feature data of an object to be evaluated;
and processing the feature data through a preset evaluation model to obtain evaluation data of the object to be evaluated.
In an optional embodiment, the step of processing the feature data through a preset evaluation model to obtain evaluation data of the object to be evaluated includes:
carrying out category division processing on the feature data to obtain the category of the feature data;
and performing evaluation processing according to the category to obtain evaluation data of the object to be evaluated.
In an optional embodiment, the step of performing category division processing on the feature data to obtain the category of the feature data includes:
performing dimension reduction processing on the feature data to obtain dimension reduction features;
and carrying out clustering analysis processing on the dimensionality reduction features to obtain the category of the feature data.
In an optional embodiment, the evaluation data includes a characteristic attribute of the object to be evaluated, and the step of performing evaluation processing according to the category to obtain the evaluation data of the object to be evaluated includes:
for each category, judging whether the difference between the dimension reduction feature value of the objects to be evaluated in that category and the dimension reduction feature value of the objects to be evaluated in all categories exceeds a preset difference threshold;
and if so, labeling the dimension reduction feature of the objects to be evaluated in that category to obtain the feature attribute.
In an optional embodiment, the evaluation data includes a revenue attribute of the object to be evaluated, and the step of performing evaluation processing according to the category to obtain the evaluation data of the object to be evaluated includes:
calculating the mean and the standard deviation of the rate of return of the objects to be evaluated in each category;
and calculating the revenue attribute of the object to be evaluated according to the mean and the standard deviation.
In an optional embodiment, the data processing method further includes:
judging whether the change rate of the evaluation data of the object to be evaluated is greater than a preset change rate threshold value or not;
and if so, updating the preset evaluation model.
In an optional embodiment, the step of obtaining feature data of the object to be evaluated includes:
acquiring related data of an object to be evaluated;
and preprocessing the related data of the object to be evaluated to obtain the feature data of the object to be evaluated.
In a second aspect, the present invention provides a data processing apparatus comprising:
the data acquisition module is used for acquiring the feature data of the object to be evaluated;
and the data processing module is used for processing the feature data through a preset evaluation model to obtain the evaluation data of the object to be evaluated.
In a third aspect, the present invention provides a server, comprising a memory and a processor, wherein the processor is configured to execute an executable computer program stored in the memory to implement the data processing method of any one of the foregoing embodiments.
In a fourth aspect, the present invention provides a storage medium having stored thereon a computer program which, when executed, implements the steps of the data processing method of any one of the preceding embodiments.
According to the data processing method and device, the server, and the storage medium provided by the present application, the feature data of the object to be evaluated is processed through the preset evaluation model to obtain the evaluation data of the object to be evaluated. Evaluation is thus based on the model and the feature data, which solves the prior-art problem that the reliability of the analysis is low because the object to be evaluated is assessed directly from its rate of return since entering the profession.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a block diagram of a data processing system according to an embodiment of the present disclosure.
Fig. 2 is a schematic flow chart of a data processing method according to an embodiment of the present application.
Fig. 3 is another schematic flow chart of a data processing method according to an embodiment of the present application.
Fig. 4 is another schematic flow chart of the data processing method according to the embodiment of the present application.
Fig. 5 is another schematic flow chart of the data processing method according to the embodiment of the present application.
Fig. 6 is a schematic conversion diagram of a table structure provided in the embodiment of the present application.
Fig. 7 is another schematic flow chart of a data processing method according to an embodiment of the present application.
Fig. 8 is another schematic flow chart of a data processing method according to an embodiment of the present application.
Fig. 9 is another schematic flow chart of a data processing method according to an embodiment of the present application.
Fig. 10 is a block diagram of a data processing apparatus according to an embodiment of the present application.
Reference numerals: 10 - data processing system; 100 - server; 200 - terminal device; 1000 - data processing apparatus; 1010 - data acquisition module; 1020 - data processing module.
Detailed Description
A fund manager is a professional role in the financial field, mainly responsible for fund management, decision making, and related work. Each fund is typically managed by one or more fund managers, who are responsible for its investment portfolio, investment strategy, and decision management, so the performance of a fund is closely tied to the investment ability of its fund manager.
As the fund market booms, the number of fund products and fund managers keeps growing, and investors have far more funds to choose from. Fund managers are mostly evaluated qualitatively from limited data such as their investment plans and the funds they manage, yet a fund manager's actual investment style often deviates from the plan. The prior art has difficulty quantifying and tracking a fund manager's investment style, and there is little related research.
In fund investment and research, a fund manager usually needs to be studied and analyzed as the object to be evaluated, but the prior art offers few results on quantitative research into fund managers. The prior art and its shortcomings are as follows:
(1) At present, most research on fund managers takes the form of dedicated investigations of individual fund managers: text material such as the funds under management, investment philosophy, and investment strategy is collated for qualitative evaluation. This consumes a great deal of effort, is inefficient, and can hardly cover all fund managers in the market.
(2) Some financial data vendors in the industry compute quantitative data on fund managers' investment performance, such as a manager's rate of return since entering the profession, but most of this is objective basic data that cannot meet the needs of in-depth research.
To address at least one of the above technical problems, embodiments of the present application provide a data processing method and device, a server, and a storage medium. The technical solutions of the present application are described below through possible implementations.
The shortcomings of the above solutions were identified by the inventors through practice and careful study. Therefore, both the discovery of these problems and the solutions proposed below should be regarded as contributions made by the inventors in the course of the present application.
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described in detail below with reference to the drawings. It should be understood that the drawings in the present application are for illustration and description only and are not intended to limit the scope of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flowcharts need not be performed in the order shown, and steps with no logical dependency on one another may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, a flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
In order to enable a person skilled in the art to make use of the present disclosure, the following embodiments are given. It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Applications of the system or method of the present application may include web pages, plug-ins for browsers, client terminals, customization systems, internal analysis systems, or artificial intelligence robots, among others, or any combination thereof.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Fig. 1 is a block diagram of a data processing system 10 provided in an embodiment of the present application, which provides a possible implementation manner of the data processing system 10, and referring to fig. 1, the data processing system 10 may include one or more of a server 100 and a terminal device 200, and the server 100 may include a processor for executing instruction operations.
The server 100 is communicatively connected to the terminal device 200. It obtains the data sent by the terminal device 200, processes it, and sends the resulting evaluation data of the object to be evaluated back to the terminal device 200, which converts the evaluation data into a visual form and displays it to the user.
For the server 100, it should be noted that, in some embodiments, the server 100 may be a single server or a server group. The server group may be centralized or distributed (e.g., the server 100 may be a distributed system). In some embodiments, the server 100 may be local or remote to the terminal device 200. For example, the server 100 may access information and/or data stored in the terminal device 200 via a network. As another example, the server 100 may be directly connected to the terminal device 200 to access stored information and/or data. In some embodiments, the server 100 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, an elastic cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, and the like, or any combination thereof. In some embodiments, the server 100 may be implemented on the terminal device 200.
In some embodiments, the server 100 may include a processor. The processor may process information and/or data transmitted by the terminal device 200 to perform one or more of the functions described herein. In some embodiments, a processor may include one or more processing cores (e.g., a single-core processor or a multi-core processor). Merely by way of example, a processor may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physical Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
The network may be used for the exchange of information and/or data. In some embodiments, one or more components in the data processing system 10 (e.g., the server 100 and the terminal device 200) may send information and/or data to other components. For example, the server 100 may acquire data from the terminal device 200 via a network. In some embodiments, the network may be any type of wired or wireless network, or a combination thereof. Merely by way of example, the network may include a wired network, a wireless network, a fiber optic network, a telecommunications network, an intranet, the Internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a Bluetooth network, a ZigBee network, a Near Field Communication (NFC) network, or the like, or any combination thereof.
In some embodiments, the network may include one or more network access points. For example, a network may include wired or wireless network access points, such as base stations and/or network switching nodes, through which one or more components of data processing system 10 may connect to the network to exchange data and/or information.
A database may be included in the server 100 and may store data and/or instructions. In some embodiments, the database may store data obtained from the terminal device 200. In some embodiments, the database may store data and/or instructions for the exemplary methods described herein. In some embodiments, the database may include mass storage, removable storage, volatile read-write memory, or Read-Only Memory (ROM), among others, or any combination thereof. By way of example, mass storage may include magnetic disks, optical disks, solid state drives, and the like; removable storage may include flash drives, floppy disks, optical disks, memory cards, zip disks, tapes, and the like; volatile read-write memory may include Random Access Memory (RAM); the RAM may include Dynamic RAM (DRAM), Double Data Rate Synchronous Dynamic RAM (DDR SDRAM), Static RAM (SRAM), Thyristor-Based Random Access Memory (T-RAM), Zero-capacitor RAM (Z-RAM), and the like. By way of example, ROMs may include Mask Read-Only Memories (MROMs), Programmable ROMs (PROMs), Erasable Programmable ROMs (EPROMs), Electrically Erasable Programmable ROMs (EEPROMs), compact disc ROMs (CD-ROMs), digital versatile disc ROMs (DVD-ROMs), and the like. In some embodiments, the database may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, an elastic cloud, or the like, or any combination thereof.
In some embodiments, the database may be connected to a network to communicate with one or more components in the data processing system 10 (e.g., the server 100 and the terminal device 200). One or more components in data processing system 10 may access data or instructions stored in a database via a network. In some embodiments, the database may be directly connected to one or more components in data processing system 10 (e.g., server 100 and terminal device 200). Alternatively, in some embodiments, the database may also be part of the server 100. In some embodiments, one or more components in data processing system 10 (e.g., server 100 and terminal device 200) may have access to a database.
Fig. 2 shows one of flowcharts of a data processing method provided in an embodiment of the present application, where the method is applicable to the server 100 shown in fig. 1 and is executed by the server 100 in fig. 1. It should be understood that, in other embodiments, the order of some steps in the data processing method of this embodiment may be interchanged according to actual needs, or some steps may be omitted or deleted. The flow of the data processing method shown in fig. 2 is described in detail below.
Step S210, acquiring the feature data of the object to be evaluated.
Step S220, processing the feature data through a preset evaluation model to obtain evaluation data of the object to be evaluated.
With this method, the feature data of the object to be evaluated is processed through the preset evaluation model to obtain its evaluation data, so that evaluation is based on the model and the feature data. This addresses the prior-art problem of unreliable analysis caused by assessing the object to be evaluated directly from its rate of return since entering the profession.
For step S210, it should be noted that the specific manner of obtaining the feature data is not limited, and may be set according to the actual application requirement. For example, in an alternative example, step S210 may include a step of obtaining feature data from the data related to the object to be evaluated. Therefore, on the basis of fig. 2, fig. 3 is a schematic flowchart of another data processing method provided in the embodiment of the present application, and referring to fig. 3, step S210 may include:
Step S211, acquiring relevant data of the object to be evaluated.
In detail, when the object to be evaluated is a fund manager, a user may input information desired to be queried and analyzed through the terminal device 200, the input information including but not limited to the fund manager, the fund, the investment style, the risk preference, etc., and the terminal device 200 transmits a query request to the server 100.
The server 100 may also actively collect data on the fund managers in the market. The data comes from multiple sources, which may include financial data providers such as Wind and Bloomberg, and is generally stored in relational databases such as Oracle and PostgreSQL. The relevant data falls into two categories. The first is fund manager data, which mainly includes the fund manager's background, funds under management, assets under management, investment philosophy, years of experience, news about the fund manager, and the like. The second is financial data that does not describe the fund manager directly but is related to the fund manager, including fund market quotes, fund positions, industry indices, macroeconomic data, and the like.
Step S212, preprocessing the relevant data of the object to be evaluated to obtain the feature data of the object to be evaluated.
In detail, the preprocessing may include two steps: data cleaning and feature processing. First, to correct or remove data that does not meet quality requirements according to certain rules, data cleaning may be applied to the fund manager data and the non-fund-manager financial data acquired in the previous step. Data cleaning may include encoding and converting text data, handling abnormal values and missing values, and the like. The text data may include fund manager profiles, news, and the like, and the encoding and conversion methods may include One-Hot encoding, the TF-IDF statistical method, and the like. Missing values may be handled by deleting them or by filling them in (for example, mean filling or median filling). In addition, abnormal values may be corrected, and the correction formula may be as follows:
[Formula image BDA0002770087760000101: not reproduced in the extracted text]
where $x_{ij}$ denotes an abnormal value, $x_{ij}^{*}$ denotes the corrected value, $\bar{x}_j$ (the symbol appears as image BDA0002770087760000102 in the original) denotes the mean of the data to which the abnormal value $x_{ij}$ belongs, and $\sigma_j$ denotes the standard deviation of the data to which $x_{ij}$ belongs.
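The cleaning steps described above can be sketched roughly as follows. Because the correction formula itself is not reproduced in this text, the clipping rule used here (pulling any value further than `n_sigma` standard deviations from the column mean back to that boundary) is only an assumption, as are the function names and the sample data.

```python
import numpy as np

def fill_missing_with_mean(column):
    """Mean filling: replace NaN entries with the mean of the observed entries."""
    column = np.asarray(column, dtype=float)
    mean = np.nanmean(column)
    return np.where(np.isnan(column), mean, column)

def clip_outliers(column, n_sigma=3.0):
    """Assumed correction rule: clip values to [mean - n_sigma*std, mean + n_sigma*std]."""
    column = np.asarray(column, dtype=float)
    mean = column.mean()    # plays the role of the column mean in the text
    sigma = column.std()    # plays the role of sigma_j in the text
    return np.clip(column, mean - n_sigma * sigma, mean + n_sigma * sigma)

# Hypothetical column of returns with one missing value and one extreme value.
returns = [0.02, np.nan, 0.01, 0.03, 5.00, 0.02]
filled = fill_missing_with_mean(returns)
cleaned = clip_outliers(filled, n_sigma=2.0)
```

The multiplier `n_sigma` is a tuning choice; on very short columns a single extreme value inflates the standard deviation, so a smaller multiplier (or a median-based rule) may be needed for the clipping to take effect.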
Second, after data cleaning, feature processing may be applied to the cleaned fund manager data and non-fund-manager financial data. Specifically, feature processing may include: converting the non-fund-manager financial data into fund manager features through methods such as association and aggregation, and computing derived features from the fund manager features to enrich the feature library.
In an alternative example, where the non-fund-manager financial data includes market data of stocks (which may include, but is not limited to, the price-to-earnings ratio (PE), trading volume, and the like), the fund manager feature may be calculated by the following formula:
$$F_i = \sum_{k} \frac{h_k}{h}\, f_k$$
(formula reconstructed from the symbol definitions; the original is image BDA0002770087760000103)
where $F_i$ denotes the fund manager feature obtained by aggregation, $k$ indexes the $k$-th stock held, $h_k$ denotes the value of the $k$-th stock held, $h$ denotes the total value of the stocks held by the fund manager, and $f_k$ denotes the market data of the $k$-th stock.
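As a minimal sketch of this aggregation, the symbol definitions suggest a position-weighted sum: each stock's market data is weighted by that stock's share of the fund manager's total position value. That reading, the function name, and the sample holdings below are assumptions for illustration.

```python
def position_weighted_feature(position_values, stock_metrics):
    """Aggregate per-stock market data into one fund manager feature.

    position_values: value h_k of each held stock
    stock_metrics:   market data f_k of each held stock (e.g. PE)
    """
    if len(position_values) != len(stock_metrics):
        raise ValueError("each held stock needs exactly one metric value")
    total = sum(position_values)  # h, total value of held stocks
    if total == 0:
        raise ValueError("total position value must be non-zero")
    return sum(h_k / total * f_k for h_k, f_k in zip(position_values, stock_metrics))

# Hypothetical portfolio: three positions and their PE ratios.
holdings = [600.0, 300.0, 100.0]
pe_ratios = [10.0, 20.0, 30.0]
manager_pe = position_weighted_feature(holdings, pe_ratios)  # about 15.0
```

Weighting by position value means that a large holding dominates the manager-level feature, which matches the intent of describing the portfolio the manager actually runs rather than a simple average of the stocks held.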
Optionally, the specific way of computing derived features from the fund manager features is not limited and may be set according to actual application requirements. For example, in an alternative example, the derived features may include the mean and standard deviation of a fund manager feature over a statistical period, the ranking of the fund manager's feature among similar fund managers, and the like.
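The derived features mentioned here might be computed along the following lines. This is only a sketch: the helper names, the use of the population standard deviation, and the "1 = highest" ranking convention are all assumptions not stated in the text.

```python
import statistics

def period_stats(series):
    """Mean and (population) standard deviation of a feature over a statistical period."""
    return statistics.mean(series), statistics.pstdev(series)

def peer_rank(values_by_manager, manager_id):
    """1-based rank of one manager's feature value among peers (1 = highest value)."""
    ordered = sorted(values_by_manager, key=values_by_manager.get, reverse=True)
    return ordered.index(manager_id) + 1

# Hypothetical monthly values of one feature for a single manager.
monthly_pe = [14.0, 15.0, 16.0]
mean_pe, std_pe = period_stats(monthly_pe)

# Hypothetical peer group of similar fund managers.
peers = {"manager_a": 15.0, "manager_b": 22.0, "manager_c": 9.0}
rank_a = peer_rank(peers, "manager_a")
```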
Further, after the fund manager features and the derived features are obtained through calculation, all the features may be sorted and displayed in a table form, which may be specifically shown in table 1:
Table 1. Structure of the fund manager feature library
[Table image BDA0002770087760000111: not reproduced in the extracted text]
For step S220, it should be noted that the specific manner of processing the feature data is not limited, and may be set according to the actual application requirement. For example, in an alternative example, step S220 may include a step of performing evaluation processing according to the type of feature data. Therefore, on the basis of fig. 2, fig. 4 is a schematic flowchart of another data processing method provided in the embodiment of the present application, and referring to fig. 4, step S220 may include:
Step S221, performing category division processing on the feature data to obtain the category of the feature data.
In detail, after the fund manager feature library is obtained in step S212, the fund manager features may be analyzed through a data mining technique such as machine learning, and finally the feature categories of the fund manager are obtained, so as to implement category classification on the fund manager.
Step S222, evaluation processing is carried out according to the categories to obtain evaluation data of the object to be evaluated.
For step S221, it should be noted that the specific manner of performing the category division is not limited, and the category division may be set according to the actual application requirement. For example, in an alternative example, step S221 may include a step of obtaining the feature data category according to the dimension reduction feature. Therefore, on the basis of fig. 4, fig. 5 is a schematic flowchart of another data processing method provided in the embodiment of the present application, and referring to fig. 5, step S221 may include:
Step S2211, performing dimension reduction processing on the feature data to obtain dimension reduction features.
It should be noted that, because the fund manager feature library includes a large number of features, which are not beneficial to model training and result analysis, the feature needs to be reduced in dimension, and the main method is a principal component analysis method. In detail, the principal component analysis method can combine original features into a new group of comprehensive features with small correlation through linear transformation, the number of the new features (dimension reduction features) is determined according to model parameter setting, the preset evaluation model provided by the embodiment of the application is generally controlled to be 10-20 features, and the new feature group and the original feature group can be mutually converted through linear transformation.
Further, after the dimension reduction features are obtained, the embodiment of the present application may further include a step of analyzing the dimension reduction features to confirm their definitions. The dimension reduction features are defined mainly by combining transformation analysis with manual configuration: the initial definition of each dimension reduction feature is first derived through transformation analysis, and the final definition is then confirmed through manual review and modification.
In detail, taking dimension reduction by principal component analysis as an example, the initial definition of a dimension reduction feature is derived as follows. The dimension reduction feature $F_k$ can be expressed in terms of the original features as $F_k = a_1 f_1 + a_2 f_2 + \dots + a_n f_n$, where $f_1, \dots, f_n$ denote the original features and $a_1, \dots, a_n$ denote the coefficients. Sort $a_1, \dots, a_n$ from largest to smallest; if the largest value is $a_i$, the feature name of the corresponding original feature $f_i$ may serve as the initial definition of the new feature.
In the step of manual review and modification, the initial definition of the dimension reduction feature is reviewed manually against the relationship between the dimension reduction feature and the original features, and if the initial definition is unreasonable, the dimension reduction feature is given a new definition.
For example, when the original features include the rate of return, the Sharpe ratio, and the annual rate of return, dimension reduction feature 1 may be expressed as: dimension reduction feature 1 = 1.48 × rate of return + 0.53 × Sharpe ratio + 0.72 × annual rate of return. The largest coefficient belongs to the rate of return, so the initial definition of dimension reduction feature 1 is "rate of return"; after manual review it can be changed to "profitability", which describes the combined feature more reasonably.
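The naming rule can be sketched in a few lines of Python. The function names `initial_definition` and `final_definition` and the override table are hypothetical; the coefficients are the ones from the example above.

```python
def initial_definition(coefficients):
    """Return the name of the original feature with the largest coefficient."""
    ranked = sorted(coefficients.items(), key=lambda item: item[1], reverse=True)
    return ranked[0][0]

def final_definition(initial, manual_overrides):
    """Apply a reviewer's rename if one exists, otherwise keep the initial name."""
    return manual_overrides.get(initial, initial)

feature_1 = {
    "rate of return": 1.48,
    "Sharpe ratio": 0.53,
    "annual rate of return": 0.72,
}
# Hypothetical table filled in during manual review.
manual_overrides = {"rate of return": "profitability"}

initial = initial_definition(feature_1)
final = final_definition(initial, manual_overrides)
print(initial, "->", final)
```

The sketch sorts by the coefficient value, as the text states. Whether negative coefficients should be ranked by absolute value is left open by the source.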
Step S2212: performing cluster analysis on the dimension reduction features to obtain the categories of the feature data.
In detail, the fund managers can be classified according to their dimension reduction features by using algorithms such as multivariate statistics; the algorithms may include the K-means method, the DBSCAN method, and the like.
In conjunction with fig. 6, before the dimension reduction and cluster analysis steps, the original feature library may be a table with m rows and n columns, where n is the number of features. After dimension reduction, the table has m rows and k columns, where k is the number of dimension reduction features, k is less than or equal to n, and k is generally 10-20. After cluster analysis, a fund manager category column is added to the table; it takes one of p category values, and p is generally 5-10.
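As a rough illustration of the clustering step and the added category column, the following sketch implements a basic K-means (Lloyd's algorithm) with the standard library only. The deterministic initialization from the first k rows, the function name `kmeans`, and the toy data are assumptions for illustration; the source does not specify how the algorithm is configured.

```python
import math

def kmeans(points, k, max_iterations=100):
    """Basic K-means; the first k points serve as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy table after dimension reduction: m = 5 fund managers, k = 2 features.
reduced = [[0.0, 0.0], [10.0, 10.0], [0.5, 0.2], [9.5, 10.1], [0.1, 0.4]]
labels, centroids = kmeans(reduced, k=2)

# Append the fund manager category as an extra column.
table = [row + [label] for row, label in zip(reduced, labels)]
```

Each row of `table` now holds the dimension reduction features followed by the category index, mirroring the m-row table with an added category column.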
For step S222, it should be noted that the specific manner of performing the evaluation processing is not limited, and may be specifically set according to the type of the evaluation data. For example, in an alternative example, when the evaluation data includes the feature attribute of the object to be evaluated, step S222 may include a step of determining whether the difference value of the dimensionality reduction feature values exceeds a threshold value. Therefore, on the basis of fig. 4, fig. 7 is a schematic flowchart of another data processing method provided in the embodiment of the present application, and referring to fig. 7, step S222 may further include:
Step S2221: for each category, determining whether the difference between the dimension reduction feature value of the objects to be evaluated in that category and the dimension reduction feature value of all objects to be evaluated exceeds a preset difference threshold.
In the embodiment of the present application, when the difference between the dimension reduction feature value of the objects to be evaluated in the category and the dimension reduction feature value of all objects to be evaluated exceeds the preset difference threshold, it is determined that the dimension reduction feature of that category differs significantly, labeling can be performed, and step S2222 is executed. When the difference does not exceed the preset difference threshold, it is determined that the dimension reduction feature of that category does not differ significantly, and no labeling is performed.
Step S2222: labeling the dimension reduction features of the objects to be evaluated in the category to obtain feature attributes.
In detail, the feature attributes of a fund manager category may be defined by combining interpretive analysis with manual configuration. First, the feature attributes are derived automatically through interpretive analysis: for each category, a statistic of each dimension reduction feature (such as the mean, variance, or distribution shape) is checked for a significant difference from the other categories; if there is one, that feature is a feature attribute of the category. The result is then confirmed through manual review and modification. The automatic analysis proceeds as follows: compute the mean (or variance, kurtosis, etc.) $c_p$ of dimension reduction feature $F_k$ over the fund managers in category p, and the mean (or variance, kurtosis, etc.) $c$ of $F_k$ over all fund managers. If $|c_p - c|$ exceeds the preset difference threshold, the feature differs significantly from the other categories and category p is labeled with the attribute of that feature; otherwise no attribute is labeled. Finally, unreasonable feature attributes are manually modified or removed to complete the review.
For example, if the mean of the dimension reduction feature "profitability" for category p is 1.69, the mean of "profitability" for all fund managers is 0.32, and the preset difference threshold is 1.0, the difference of 1.37 exceeds the threshold, so the fund managers of category p are given the feature attribute "good profitability".
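The threshold test can be sketched as follows, using the mean as the statistic. The helper names `differs_significantly` and `label_categories` and the sample values are assumptions for illustration; only the comparison of the category statistic against the overall statistic comes from the text.

```python
from statistics import mean

def differs_significantly(c_p, c, threshold):
    """True when the category statistic differs from the overall one by more than the threshold."""
    return abs(c_p - c) > threshold

def label_categories(values_by_category, threshold):
    """Flag each category whose mean feature value stands out from the overall mean."""
    all_values = [v for values in values_by_category.values() for v in values]
    overall = mean(all_values)
    return {
        category: differs_significantly(mean(values), overall, threshold)
        for category, values in values_by_category.items()
    }

# Figures from the example: category mean 1.69, overall mean 0.32, threshold 1.0.
print(differs_significantly(1.69, 0.32, 1.0))

# Hypothetical "profitability" values per category.
profitability = {"p": [1.5, 1.9], "q": [0.0, 0.2, -0.2, 0.0]}
flags = label_categories(profitability, threshold=1.0)
```

A flagged category would then receive a label such as "good profitability", subject to the manual review described above.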
For another example, in another alternative example, when the evaluation data includes the profit attribute of the object to be evaluated, step S222 may include a step of calculating the mean and standard deviation of the rate of return. Therefore, on the basis of fig. 4, fig. 8 is a schematic flowchart of another data processing method provided in the embodiment of the present application, and referring to fig. 8, step S222 may include:
Step S2223: for each category, calculating the mean and the standard deviation of the rate of return of the objects to be evaluated in that category.
Step S2224: calculating the profit attribute of the objects to be evaluated in the category from the mean and the standard deviation.
In detail, after the fund manager categories are determined, attribution analysis needs to be performed on the risk-return characteristics of each category to determine its risk-return preference. Specifically, the mean and the standard deviation of the rate of return that the fund managers in each category have achieved since entering the profession are calculated; the mean represents the return performance of that category, and the standard deviation represents its risk preference. After the mean and the standard deviation are calculated, the profit attribute may be derived from them, for example by combining the mean and the standard deviation with different weights.
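One possible weighting is sketched below. The source only says that the mean and the standard deviation may be given different weights, so the scoring formula (weighted mean minus weighted standard deviation), the default weights, the use of the population standard deviation, and the name `return_profile` are all assumptions.

```python
from statistics import mean, pstdev

def return_profile(returns, mean_weight=1.0, risk_weight=0.5):
    """Summarize one category: return performance, risk, and a weighted score."""
    avg = mean(returns)     # return performance of the category
    risk = pstdev(returns)  # population standard deviation as a risk proxy
    return {
        "mean": avg,
        "std": risk,
        "score": mean_weight * avg - risk_weight * risk,
    }

# Hypothetical rates of return for the fund managers of one category.
profile = return_profile([0.10, 0.20, 0.30])
print(profile)
```

A higher `risk_weight` penalizes volatile categories more strongly; the choice of weights would depend on the investor's own risk preference.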
It should be noted that the preset evaluation model in the embodiment of the present application may be obtained by training on preset training data, or by running all the sub-steps of step S220 as a daily modeling operation. After the evaluation data is obtained through the preset evaluation model, in order to further improve the reliability of analyzing the object to be evaluated, the data processing method provided in the embodiment of the present application may further include, after step S220, a step of updating the model. Therefore, on the basis of fig. 2, fig. 9 is a schematic flowchart of another data processing method provided in the embodiment of the present application, and referring to fig. 9, the data processing method may further include:
Step S230: determining whether the change rate of the evaluation data of the object to be evaluated is greater than a preset change rate threshold.
In the embodiment of the present application, when the change rate of the evaluation data of the object to be evaluated is greater than the preset change rate threshold, it is determined that the reliability of the preset evaluation model is low, and step S240 is executed; when the change rate is not greater than the preset change rate threshold, it is determined that the reliability of the preset evaluation model is normal, and the model continues to be used.
Step S240: updating the preset evaluation model.
In detail, the feature attributes and risk-return preferences of each category of fund managers can be calculated every day, and if the change rate exceeds the preset change rate threshold, an alert is issued and the model is updated and optimized.
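The daily check might look like the following sketch. The source does not define how the change rate is computed; a relative change against the previous day's value is assumed here, and the names `change_rate` and `needs_update` are hypothetical.

```python
def change_rate(previous, current):
    """Relative change of an evaluation value between two daily runs (assumed definition)."""
    if previous == 0:
        return 0.0 if current == 0 else float("inf")
    return abs(current - previous) / abs(previous)

def needs_update(previous, current, threshold):
    """True when the change rate exceeds the preset threshold, i.e. step S240 should run."""
    return change_rate(previous, current) > threshold

# Hypothetical daily values of one category's mean "profitability".
if needs_update(previous=0.32, current=0.40, threshold=0.2):
    print("alert: change rate above threshold, update the evaluation model")
```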
Data mining techniques such as machine learning can extract valuable information from massive data, such as the dynamic development patterns and behavioral characteristics of objects, and provide researchers with practical reference values. By quantitatively analyzing fund managers with such techniques, an investor can dynamically and deeply mine a fund manager's investment characteristics, classify fund managers according to those characteristics, and be helped to screen out better fund managers and the feature relationships among fund managers.
In view of the defects of the prior art, and because no system or device on the market currently performs in-depth evaluation and analysis of fund managers by feature-based classification, the embodiment of the present application provides a system that mines features of fund managers such as investment behavior and investment preference through data mining techniques such as machine learning, and clusters the fund managers according to those features. This helps investors quantitatively analyze fund managers, screen out fund managers that match desired features, and find fund managers with similar features.
Compared with the prior art, the embodiment of the application has the following advantages:
(1) Compared with the prior art, in which fund managers are evaluated qualitatively by manual analysis, the embodiment of the present application analyzes the dimension reduction features and feature attributes of fund managers quantitatively through data mining techniques such as machine learning, thereby achieving quantitative evaluation of fund managers. In addition, the embodiment of the present application allows the model to learn by itself and dynamically updates the model and the evaluation results, which improves the dynamic applicability of the system and spares the user repeated research and tracking work.
(2) Compared with the prior art, in which only basic quantitative evaluation is performed on figures such as the rate of return a fund manager has achieved since entering the profession, the embodiment of the present application achieves in-depth evaluation of fund managers by quantitatively analyzing their dimension reduction features and feature attributes through data mining techniques such as machine learning. The dimension reduction features and feature attribute descriptions cover investment style, risk preference, industry preference, behavioral characteristics, and the like, which enriches the quantitative evaluation of fund managers. In addition, from the analysis results, functions such as querying similar fund managers and tracking the performance of fund manager categories can be derived, which improves the analysis capability of the system.
With reference to fig. 10, an embodiment of the present application further provides a data processing apparatus 1000, where the functions implemented by the data processing apparatus 1000 correspond to the steps executed by the foregoing method. The data processing device 1000 may be understood as a processor of the server 100, or may be understood as a component that is independent of the server 100 or a processor and that implements the functions of the present application under the control of the server 100. The data processing apparatus 1000 may include a data acquisition module 1010 and a data processing module 1020.
The data acquisition module 1010 is configured to acquire feature data of an object to be evaluated. In the embodiment of the present application, the data acquisition module 1010 may be configured to perform step S210 shown in fig. 2, and for details of the data acquisition module 1010, reference may be made to the foregoing description of step S210.
The data processing module 1020 is configured to process the feature data through a preset evaluation model to obtain evaluation data of the object to be evaluated. In the embodiment of the present application, the data processing module 1020 may be configured to perform step S220 shown in fig. 2, and for details of the data processing module 1020, reference may be made to the foregoing description of step S220.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the data processing method.
The computer program product of the data processing method provided in the embodiment of the present application includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute steps of the data processing method in the above method embodiment, which may be referred to specifically in the above method embodiment, and are not described herein again.
In summary, the data processing method and apparatus, the server, and the storage medium provided in the embodiments of the present application process the feature data of the object to be evaluated through the preset evaluation model to obtain the evaluation data of the object to be evaluated, so that evaluation is based on the model and the feature data. This addresses the low reliability of analysis in the prior art, where the object to be evaluated is assessed directly on the rate of return achieved since entering the profession.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A data processing method, comprising:
acquiring characteristic data of an object to be evaluated;
and processing the characteristic data through a preset evaluation model to obtain evaluation data of the object to be evaluated.
2. The data processing method according to claim 1, wherein the step of processing the feature data through a preset evaluation model to obtain evaluation data of an object to be evaluated comprises:
carrying out category division processing on the feature data to obtain the category of the feature data;
and performing evaluation processing according to the category to obtain evaluation data of the object to be evaluated.
3. The data processing method according to claim 2, wherein the step of performing the class classification processing on the feature data to obtain the class of the feature data comprises:
performing dimension reduction processing on the feature data to obtain dimension reduction features;
and carrying out clustering analysis processing on the dimensionality reduction features to obtain the category of the feature data.
4. The data processing method according to claim 3, wherein the evaluation data includes a characteristic attribute of the object to be evaluated, and the step of performing evaluation processing according to the category to obtain the evaluation data of the object to be evaluated includes:
judging whether the difference value between the dimension reduction characteristic value of the class of objects to be evaluated and the dimension reduction characteristic values of all the classes of objects to be evaluated exceeds a preset difference threshold value or not according to each class;
and if so, labeling the dimension reduction features of the category of objects to be evaluated to obtain feature attributes.
5. The data processing method according to claim 2, wherein the evaluation data includes a profit attribute of the object to be evaluated, and the step of performing evaluation processing according to the category to obtain the evaluation data of the object to be evaluated includes:
calculating, for each category, the mean value and the standard deviation of the rate of return of the objects to be evaluated in that category;
and calculating the profit attribute of the objects to be evaluated in the category according to the mean value and the standard deviation.
6. The data processing method of any one of claims 1 to 5, wherein the data processing method further comprises:
judging whether the change rate of the evaluation data of the object to be evaluated is greater than a preset change rate threshold value or not;
and if so, updating the preset evaluation model.
7. The data processing method of claim 1, wherein the step of obtaining feature data of the object to be evaluated comprises:
acquiring related data of an object to be evaluated;
and preprocessing the related data of the object to be evaluated to obtain the characteristic data of the object to be evaluated.
8. A data processing apparatus, comprising:
the data acquisition module is used for acquiring the characteristic data of the object to be evaluated;
and the data processing module is used for processing the characteristic data through a preset evaluation model to obtain the evaluation data of the object to be evaluated.
9. A server, comprising a memory and a processor for executing an executable computer program stored in the memory to implement the data processing method of any one of claims 1 to 7.
10. A storage medium, characterized in that a computer program is stored thereon, which when executed performs the steps of the data processing method of any one of claims 1-7.
CN202011246123.5A 2020-11-10 2020-11-10 Data processing method and device, server and storage medium Pending CN112348092A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011246123.5A CN112348092A (en) 2020-11-10 2020-11-10 Data processing method and device, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011246123.5A CN112348092A (en) 2020-11-10 2020-11-10 Data processing method and device, server and storage medium

Publications (1)

Publication Number Publication Date
CN112348092A true CN112348092A (en) 2021-02-09

Family

ID=74363153

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011246123.5A Pending CN112348092A (en) 2020-11-10 2020-11-10 Data processing method and device, server and storage medium

Country Status (1)

Country Link
CN (1) CN112348092A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112927081A (en) * 2021-03-16 2021-06-08 北京同邦卓益科技有限公司 Data processing method, device, system and storage medium

Similar Documents

Publication Publication Date Title
CN111160992A (en) Marketing system based on user portrait system
CN114647741B (en) Automatic process decision and reasoning method and device, computer equipment and storage medium
Doumpos et al. Developing sorting models using preference disaggregation analysis: An experimental investigation
CN112634056A (en) Method, equipment and storage medium for rapidly calculating and updating enterprise share right structure
CN112330404A (en) Data processing method and device, server and storage medium
US20100076799A1 (en) System and method for using classification trees to predict rare events
CN117271767A (en) Operation and maintenance knowledge base establishing method based on multiple intelligent agents
US8046278B2 (en) Process of selecting portfolio managers based on automated artificial intelligence techniques
CN110489556A (en) Quality evaluating method, device, server and storage medium about follow-up record
Van Buuren et al. Imputation of missing categorical data by maximizing internal consistency
CN112950347A (en) Resource data processing optimization method and device, storage medium and terminal
Espinoza et al. Extending PESTEL technique to neutrosophic environment for decisions making in business management
CN112348092A (en) Data processing method and device, server and storage medium
CN114282875A (en) Flow approval certainty rule and semantic self-learning combined judgment method and device
DE112021001743T5 (en) VECTOR EMBEDDING MODELS FOR RELATIONAL TABLES WITH NULL OR EQUIVALENT VALUES
CN117522132A (en) Vendor risk assessment system and application method
CN112348093A (en) Data processing method and device, server and storage medium
Sun et al. An application of decision tree and genetic algorithms for financial ratios' dynamic selection and financial distress prediction
CN112837151A (en) Five-factor evaluation and multi-strategy combined optimization method for stock selection and trading strategy
CN113570455A (en) Stock recommendation method and device, computer equipment and storage medium
CN118036756B (en) Method, device, computer equipment and storage medium for large model multi-round dialogue
US11830081B2 (en) Automated return evaluation with anomoly detection
CN117422314B (en) Enterprise data evaluation method and equipment based on big data analysis
CN117539520B (en) Firmware self-adaptive upgrading method, system and equipment
CN117764536B (en) Innovative entrepreneur project auxiliary management system based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination