CN112148760A - Big data screening method and device - Google Patents

Publication number
CN112148760A
Authority
CN
China
Prior art keywords
enterprise
data
evaluation
search keyword
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011080351.XA
Other languages
Chinese (zh)
Inventor
周传刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Fireeye Data Technology Co ltd
Original Assignee
Beijing Fireeye Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Fireeye Data Technology Co ltd
Priority to CN202011080351.XA
Publication of CN112148760A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/248Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/018Certifying business or products

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a big data screening method and device, relating to the technical field of data processing. It mainly aims to solve the problem that searching for required content in large piles of paper documents takes too long, and that inaccurate extraction of user tag data prevents accurate matching. The main technical scheme comprises: training an enterprise evaluation model on multi-dimensional combined data, wherein the multi-dimensional combined data comprises enterprise operation efficiency, enterprise development capacity, enterprise contribution, compliance with laws and regulations, and risk control, and each single dimension itself comprises one or more data items; receiving an input first search keyword and, according to the first search keyword, scoring each dimension of the multi-dimensional combination in the enterprise evaluation model in turn to obtain an evaluation value; and displaying the data screening results in order of evaluation value, and determining the enterprises whose evaluation values exceed a preset evaluation threshold as target enterprise objects.

Description

Big data screening method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a big data screening method and device.
Background
Big data refers to data sets so large that acquiring, storing, managing, and analyzing them far exceeds the capability of traditional database software tools. It has four characteristics: large data scale, rapid data flow, diverse data types, and low value density.
In particular, in the big data application scenario in which an enterprise selects suppliers, the requirement on processing accuracy is strict. At present, the common approach is to search for the required content in large piles of paper documents, which takes too long. Moreover, user tag data is extracted inaccurately, so accurate matching cannot be achieved, and users still have difficulty selecting upstream and downstream suppliers based on the matched data.
Disclosure of Invention
In view of this, embodiments of the present invention provide a big data screening method and device, mainly aiming to solve the problem that searching for required content in large piles of paper documents takes too long and that inaccurate extraction of user tag data prevents accurate matching.
In order to solve the above problems, embodiments of the present invention mainly provide the following technical solutions:
In a first aspect, an embodiment of the present invention provides a big data screening method, the method comprising:
training an enterprise evaluation model on multi-dimensional combined data, wherein the multi-dimensional combined data comprises: enterprise operation efficiency, enterprise development capacity, enterprise contribution, compliance with laws and regulations, and risk control, and each single dimension itself comprises one or more data items;
receiving an input first search keyword and, according to the first search keyword, scoring each dimension of the multi-dimensional combination in the enterprise evaluation model in turn to obtain an evaluation value;
and displaying the data screening results in order of evaluation value, and determining the enterprises whose evaluation values exceed a preset evaluation threshold as target enterprise objects.
Optionally, the method further includes:
searching the enterprise evaluation model for a second search keyword similar to the first search keyword;
and pushing the search results triggered by the second search keyword as the search results for the first search keyword.
Optionally, training the enterprise evaluation model on the multi-dimensional combined data includes:
classifying the basic information of the enterprise and formatting the data;
and training on the multi-dimensional combined data based on the formatted data and a preset algorithm.
Optionally, displaying the data screening results in order of evaluation value includes:
acquiring the multi-dimensional data of the enterprise object corresponding to each evaluation value;
and displaying the multi-dimensional data of the enterprise objects side by side in a list for comparison.
Optionally, the method further includes:
monitoring each individual data item under the multi-dimensional combined data in real time;
and if it is determined that an update exists, updating the enterprise evaluation model based on the updated data.
In a second aspect, an embodiment of the present invention further provides a big data screening apparatus, including:
a training unit, configured to train an enterprise evaluation model on multi-dimensional combined data, wherein the multi-dimensional combined data comprises: enterprise operation efficiency, enterprise development capacity, enterprise contribution, compliance with laws and regulations, and risk control, and each single dimension itself comprises one or more data items;
a receiving unit, configured to receive an input first search keyword;
an evaluation unit, configured to score each dimension of the multi-dimensional combination in the enterprise evaluation model in turn according to the first search keyword to obtain an evaluation value;
a display unit, configured to display the data screening results in order of the evaluation values determined by the evaluation unit;
and a determining unit, configured to determine the enterprises whose evaluation values exceed a preset evaluation threshold as target enterprise objects.
Optionally, the apparatus further comprises:
a searching unit, configured to search the enterprise evaluation model for a second search keyword similar to the first search keyword;
and a pushing unit, configured to push the search results triggered by the second search keyword found by the searching unit as the search results for the first search keyword.
Optionally, the training unit includes:
a processing module, configured to classify the basic information of the enterprise and format the data;
and a training module, configured to train on the multi-dimensional combined data based on the data formatted by the processing module and a preset algorithm.
Optionally, the display unit includes:
an acquisition module, configured to acquire the multi-dimensional data of the enterprise object corresponding to each evaluation value;
and a display module, configured to display the multi-dimensional data of the enterprise objects side by side in a list for comparison.
Optionally, the apparatus further comprises:
a monitoring unit, configured to monitor each individual data item under the multi-dimensional combined data in real time;
and an updating unit, configured to update the enterprise evaluation model based on the updated data when the monitoring unit determines that an update exists.
The technical solutions provided by the embodiments of the present invention have at least the following advantages:
According to the big data screening method and device provided by the embodiments of the invention, an enterprise evaluation model is trained on multi-dimensional combined data, wherein the multi-dimensional combined data comprises: enterprise operation efficiency, enterprise development capacity, enterprise contribution, compliance with laws and regulations, and risk control, and each single dimension itself comprises one or more data items; an input first search keyword is received and, according to the first search keyword, each dimension of the multi-dimensional combination in the enterprise evaluation model is scored in turn to obtain an evaluation value; the data screening results are displayed in order of evaluation value, and the enterprises whose evaluation values exceed a preset evaluation threshold are determined as target enterprise objects. Compared with the prior art, the embodiments of the invention use the enterprise evaluation model to find target enterprise objects accurately and conveniently without manual participation, thereby greatly improving the efficiency of data screening.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the description, and so that the above and other objects, features, and advantages of the embodiments become more apparent, specific embodiments of the present invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the embodiments of the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart illustrating a method for screening big data according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an enterprise evaluation model system provided by an embodiment of the invention;
FIG. 3 is a block diagram illustrating a big data screening apparatus according to an embodiment of the present invention;
FIG. 4 is a block diagram illustrating another big data screening apparatus according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the invention provides a big data screening method. As shown in FIG. 1, the method comprises the following steps:
101. Training an enterprise evaluation model on multi-dimensional combined data, wherein the multi-dimensional combined data comprises: enterprise operation efficiency, enterprise development capacity, enterprise contribution, compliance with laws and regulations, and risk control, and each single dimension itself comprises one or more data items;
The embodiment of the invention relates to an enterprise evaluation model system. As shown in FIG. 2, the system comprises a data source layer, a data processing layer, a data platform layer, a data response layer, and a data display layer. The data source layer stores the various data sources; the data processing layer aggregates the data from the data sources and formats the data from different sources; the data platform layer processes the data formatted by the data processing layer to form preliminary user tag data; the data response layer stores the tag data computed by the data platform layer; and the data display layer accurately displays the target enterprise objects.
To facilitate understanding of the multi-dimensional combined data, Table 1 illustrates it in tabular form, taking the dimension of enterprise operation efficiency as an example. This dimension further comprises: total asset turnover rate, inventory (cost) turnover rate, and the ratio of labor input to sales revenue.
TABLE 1
[Table 1 is provided as images in the original publication (Figure BDA0002718369330000051, Figure BDA0002718369330000061); not reproduced here.]
In practical applications, data of other dimensions may be included in addition to those in Table 1, and the specific dimensions may be set according to the application scenario. For example, the dimension data further includes, but is not limited to, the following:
A. External information of the enterprise
(1) Basic attributes of the enterprise
① Industry
② Whether it is a high-tech enterprise
③ Whether it is publicly listed
(2) Existing scale of the enterprise
① Total assets of the enterprise
The total assets of an enterprise are all the assets that the enterprise currently owns or controls, including current assets, long-term investments, fixed assets, intangible and deferred assets, and other long-term assets.
② Tax amount of the enterprise
Tax amount for the current statistical year (b1)
③ Total number of employees of the enterprise
Total number of employees in the current statistical year (b2)
④ Five-year change rate of total enterprise assets
The five-year change rate of total assets is computed by counting back 5 natural years from the latest statistical year, taking the total assets of the earliest of those years as the base (n), and calculating the change rate (T) of total assets for each subsequent year relative to the preceding year, as follows:
Statistical year    Base value    Annual change rate
2015                n             0
2016                m             T2016 = (m - n) / n × 100%
2017                s             T2017 = (s - m) / m × 100%
2018                p             T2018 = (p - s) / s × 100%
2019                q             T2019 = (q - p) / p × 100%
⑤ Five-year change rate of enterprise tax amount
The five-year change rate of the enterprise tax amount is computed by counting back 5 natural years from the latest statistical year, taking the total tax amount of the earliest of those years as the base (n1), and calculating the change rate (W) of the total tax amount for each subsequent year relative to the preceding year, as follows:
Statistical year    Base value    Annual change rate
2015                n1            0
2016                m1            W2016 = (m1 - n1) / n1 × 100%
2017                s1            W2017 = (s1 - m1) / m1 × 100%
2018                p1            W2018 = (p1 - s1) / s1 × 100%
2019                q1            W2019 = (q1 - p1) / p1 × 100%
⑥ Five-year change rate of total enterprise employees
The five-year change rate of the total number of employees is computed by counting back 5 natural years from the latest statistical year, taking the total number of employees of the earliest of those years as the base (n2), and calculating the change rate (X) of the total number of employees for each subsequent year relative to the preceding year, as follows:
Statistical year    Base value    Annual change rate
2015                n2            0
2016                m2            X2016 = (m2 - n2) / n2 × 100%
2017                s2            X2017 = (s2 - m2) / m2 × 100%
2018                p2            X2018 = (p2 - s2) / s2 × 100%
2019                q2            X2019 = (q2 - p2) / p2 × 100%
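The year-over-year calculation shared by the three tables above (T, W, and X) can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function name yearly_change_rates and the sample figures are assumptions introduced here.

```python
from typing import Dict


def yearly_change_rates(values_by_year: Dict[int, float]) -> Dict[int, float]:
    """Return each year's percentage change relative to the preceding year.

    The earliest year is treated as the base year and assigned a rate of 0,
    matching the first row of the tables above.
    """
    years = sorted(values_by_year)
    if not years:
        return {}
    rates = {years[0]: 0.0}
    for prev, cur in zip(years, years[1:]):
        base = values_by_year[prev]
        if base == 0:
            raise ValueError(f"value for {prev} is zero; change rate is undefined")
        rates[cur] = (values_by_year[cur] - base) / base * 100.0
    return rates


# Hypothetical total-asset figures (illustrative only, not from the source).
total_assets = {2015: 100.0, 2016: 120.0, 2017: 150.0, 2018: 135.0, 2019: 162.0}
rates = yearly_change_rates(total_assets)
```

The same function could be applied unchanged to tax amounts or employee counts, since only the input series differs.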
(3) Intellectual property
Number of patents (a1)
Number of registrations (a2)
(4) Total number of products
Number of products in production (a3)
Number of products under development (a4)
(5) Compliance data
Court dishonesty records (a5)
Blacklist data (a6)
B. Internal information of the enterprise (secondary screening). This step subdivides the evaluation dimensions on the basis of the basic data of the code center, producing more specific evaluation results that serve the park enrollment process more accurately. The secondary screening conditions are summarized as follows:
(1) Personnel information
① Senior staff information
Count the current number of senior staff (staff holding senior professional titles).
The senior staff ratio is calculated as: senior staff ratio = number of senior staff in the statistical year / total number of enterprise employees in that year
Five-year change of the senior staff ratio
The five-year change of the senior staff ratio is computed by counting back 5 natural years from the latest statistical year and taking the earliest of those years as the first statistical year. With n the total number of enterprise employees in a statistical year, m the number of senior staff, X the senior staff ratio in that year, and D the annual change, the calculation is as follows:
Statistical year    Total employees    Senior staff    Senior staff ratio      Annual change
2015                n                  m               X  = m / n × 100%       0
2016                n1                 m1              X1 = m1 / n1 × 100%     D2016 = X1 - X
2017                n2                 m2              X2 = m2 / n2 × 100%     D2017 = X2 - X1
2018                n3                 m3              X3 = m3 / n3 × 100%     D2018 = X3 - X2
2019                n4                 m4              X4 = m4 / n4 × 100%     D2019 = X4 - X3
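Unlike the change rates in section A, the annual change D here is a difference between two ratios, that is, a change in percentage points. A minimal Python sketch of that calculation follows; the function name ratio_with_changes and the headcounts are hypothetical.

```python
from typing import Dict, List, Tuple


def ratio_with_changes(
    records: List[Tuple[int, float, float]],
) -> Dict[int, Tuple[float, float]]:
    """Map each year to (ratio in %, change vs. previous year in percentage points).

    Each record is (year, total, part), for example
    (year, total employees, senior staff). The earliest year is the base
    year and gets a change of 0.
    """
    result: Dict[int, Tuple[float, float]] = {}
    previous_ratio = None
    for year, total, part in sorted(records):
        if total <= 0:
            raise ValueError(f"total for {year} must be positive")
        ratio = part / total * 100.0
        change = 0.0 if previous_ratio is None else ratio - previous_ratio
        result[year] = (ratio, change)
        previous_ratio = ratio
    return result


# Hypothetical headcounts (illustrative only, not from the source).
staff = [(2015, 200, 20), (2016, 250, 30), (2017, 300, 33)]
senior = ratio_with_changes(staff)
```

The same pattern would apply to any ratio that is tracked year by year.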
② Educational background structure of personnel
Educational background is divided into 5 levels, and the statistics are computed as follows:
[Table provided as an image in the original publication (Figure BDA0002718369330000091); not reproduced here.]
③ Age structure of personnel
Personnel age is divided into 5 levels: under 25, 25-30, 30-35, 35-40, and over 40. With n the number of people in a given age group and m the total number of people, the proportion Z of that group is computed as follows:
[Table provided as an image in the original publication (Figure BDA0002718369330000092); not reproduced here.]
(2) Enterprise self-investment
① Product research and development investment
[Content provided as an image in the original publication (Figure BDA0002718369330000093); not reproduced here.]
The R&D fund ratio in the statistical year is calculated as: R&D fund ratio = R&D funds / total investment of the enterprise
[Content provided as an image in the original publication (Figure BDA0002718369330000094); not reproduced here.]
Annual change of the R&D fund ratio (five-year statistics)
The five-year change of the R&D fund ratio is computed by counting back 5 natural years from the latest statistical year and taking the earliest of those years as the first statistical year. With n the total investment of the enterprise in a statistical year, m the R&D investment, P the R&D fund ratio in that year, and S the annual change, the calculation is as follows:
Statistical year    Total investment    R&D investment    R&D fund ratio          Annual change
2015                n                   m                 P  = m / n × 100%       0
2016                n1                  m1                P1 = m1 / n1 × 100%     S2016 = P1 - P
2017                n2                  m2                P2 = m2 / n2 × 100%     S2017 = P2 - P1
2018                n3                  m3                P3 = m3 / n3 × 100%     S2018 = P3 - P2
2019                n4                  m4                P4 = m4 / n4 × 100%     S2019 = P4 - P3
② Technical equipment update period
The technical equipment update period (T) is divided into 4 ranges: less than 3 years, 3-7 years, 7-10 years, and more than 10 years.
(3) Financial data
① Total asset contribution rate
Total asset contribution rate: this index reflects the profitability of all the assets of the enterprise. It is a concentrated reflection of the enterprise's operating performance and management level, and a core index for evaluating and assessing profitability.
The calculation formula is: total asset contribution rate = (total profit + total taxes + interest expense) / average total assets
where total taxes are the sum of product sales taxes and surcharges and the value-added tax payable, and average total assets are the arithmetic mean of total assets at the beginning and end of the period.
② Total asset value preservation and appreciation rate
Total asset value preservation and appreciation rate: this index reflects the change in the net assets of the enterprise and is a concentrated reflection of its development capability.
The calculation formula is: value preservation and appreciation rate = owner's equity at the end of the current period / owner's equity at the end of the previous period
where owner's equity equals total assets minus total liabilities.
③ Asset-liability ratio
Asset-liability ratio: this index reflects the level of the enterprise's operating risk, and also its ability to use funds provided by creditors for operating activities.
The calculation formula is: asset-liability ratio = total liabilities / total assets
④ Current asset turnover rate
Current asset turnover rate: the number of times current assets turn over in a given period, reflecting how quickly the working capital invested in an industrial enterprise circulates.
The calculation formula is: current asset turnover rate = sales revenue / average balance of current assets
⑤ Cost-expense profit margin
Cost-expense profit margin: this index reflects the economic return on the costs and expenses incurred in industrial production, and also the economic benefit an enterprise obtains by reducing costs.
The calculation formula is: cost-expense profit margin = total profit / total costs and expenses
where total costs and expenses are the sum of product sales costs, administrative expenses, and financial expenses.
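Four of the financial ratios above can be sketched as follows. This is an illustrative sketch only: the class FinancialYear, its field names, and the sample figures are assumptions introduced here, and the asset-liability ratio uses end-of-period total assets, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class FinancialYear:
    """Hypothetical container for one period's figures (all in the same currency unit)."""
    total_profit: float
    total_taxes: float
    interest_expense: float
    assets_begin: float        # total assets at the beginning of the period
    assets_end: float          # total assets at the end of the period
    total_liabilities: float
    sales_revenue: float
    avg_current_assets: float  # average balance of current assets
    total_costs: float         # sales costs + administrative + financial expenses

    def total_asset_contribution_rate(self) -> float:
        # Average total assets: arithmetic mean of opening and closing assets.
        average_assets = (self.assets_begin + self.assets_end) / 2
        return (self.total_profit + self.total_taxes + self.interest_expense) / average_assets

    def asset_liability_ratio(self) -> float:
        # Assumption: "total assets" is taken at the end of the period.
        return self.total_liabilities / self.assets_end

    def current_asset_turnover(self) -> float:
        return self.sales_revenue / self.avg_current_assets

    def cost_expense_profit_margin(self) -> float:
        return self.total_profit / self.total_costs


# Hypothetical figures (illustrative only, not from the source).
fy = FinancialYear(
    total_profit=80, total_taxes=15, interest_expense=5,
    assets_begin=900, assets_end=1100, total_liabilities=550,
    sales_revenue=2000, avg_current_assets=500, total_costs=400,
)
```

Multiplying any of these values by 100 gives the ratio as a percentage.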
(4) Enterprise product sales
① Annual change rate of product sales revenue
The five-year change rate of product sales revenue is computed by counting back 5 natural years from the latest statistical year, taking the sales of the earliest of those years as the base (n), and calculating the change rate (H) of product sales for each subsequent year, as follows:
[Table provided as images in the original publication (Figure BDA0002718369330000101, Figure BDA0002718369330000111); not reproduced here.]
② Annual change rate of product market share
Market share is the ratio of an enterprise's sales of a product to the total sales of similar products in the market.
Calculation formula: market share (V) = enterprise product sales (m) / total sales of similar products in the market (n) × 100%.
The annual change rate (K) of product market share is calculated as follows:
[Table provided as an image in the original publication (Figure BDA0002718369330000112); not reproduced here.]
③ Product sales rate
Product sales rate: this index reflects the extent to which the enterprise's products have been sold. It is used to analyze how well production and sales are linked and to examine whether the enterprise's products meet market demand. The calculation formula is:
product sales rate (Y) = enterprise product sales value (m) / enterprise total output value (n) × 100%
Statistical year    Total output value    Product sales value    Product sales rate
2015                n                     m                      Y2015 = m / n × 100%
2016                n1                    m1                     Y2016 = m1 / n1 × 100%
2017                n2                    m2                     Y2017 = m2 / n2 × 100%
2018                n3                    m3                     Y2018 = m3 / n3 × 100%
2019                n4                    m4                     Y2019 = m4 / n4 × 100%
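The market share (V) and product sales rate (Y) formulas stated in the text above both divide a part by a whole, so they can share one helper. The sketch below is illustrative; the function names and sample values are assumptions introduced here.

```python
def percentage(numerator: float, denominator: float) -> float:
    """Return numerator / denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator * 100.0


def market_share(enterprise_sales: float, market_total_sales: float) -> float:
    # V = enterprise product sales (m) / total sales of similar products (n) x 100%
    return percentage(enterprise_sales, market_total_sales)


def product_sales_rate(sales_value: float, total_output_value: float) -> float:
    # Y = product sales value (m) / total output value (n) x 100%
    return percentage(sales_value, total_output_value)


# Hypothetical figures (illustrative only, not from the source).
share = market_share(enterprise_sales=120.0, market_total_sales=2400.0)
sales_rate = product_sales_rate(sales_value=950.0, total_output_value=1000.0)
```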
The above examples illustrate the evaluation of single indexes; the following describes the evaluation of composite (multi-dimensional) indexes in detail:
(1) Assessment of enterprise sustainable development capability
① The indexes related to the assessment of enterprise sustainable development capability include:
Annual growth rate of total assets
Annual growth rate of enterprise tax amount
Annual growth rate of employees
Annual growth rate of R&D funds
② Calculation formula for enterprise sustainable development capability (L), computed from 5 years of statistical data:
Enterprise sustainable development capability:
[Formula provided as an image in the original publication (Figure BDA0002718369330000121); not reproduced here.]
(2) Assessment of enterprise innovation capability
① The indexes related to the assessment of enterprise innovation capability include:
Annual growth rate of R&D funds
Number of products under development
Annual change of the senior staff ratio
② Calculation formula for enterprise innovation capability (C), computed from 5 years of statistical data:
Enterprise innovation capability:
[Formula provided as an image in the original publication (Figure BDA0002718369330000122); not reproduced here.]
(3) Assessment of enterprise compliance capability
① The indexes related to the assessment of enterprise compliance capability include:
Court dishonesty records
Blacklist information
② Calculation formula for enterprise compliance capability (H):
[Formula provided as an image in the original publication (Figure BDA0002718369330000123); not reproduced here.]
(4) Assessment of enterprise product market development
① The indexes related to the assessment of enterprise product market development include:
Annual growth rate of product sales
Annual growth rate of product market share
Product sales rate
Technical equipment update period
② Calculation formula for product market development capability (P), computed from 5 years of statistical data:
Product market development capability:
[Formula provided as an image in the original publication (Figure BDA0002718369330000131); not reproduced here.]
102. Receiving an input first search keyword and, according to the first search keyword, scoring each dimension of the multi-dimensional combination in the enterprise evaluation model in turn to obtain an evaluation value;
After the enterprise evaluation model has been trained in step 101, the user may enter a first search keyword in the search box of the display interface, or browse the content displayed in the current interface.
According to the received first search keyword, scoring is performed in the enterprise evaluation model according to the scoring criteria in Table 1.
103. Displaying the data screening results in order of evaluation value, and determining the enterprises whose evaluation values exceed a preset evaluation threshold as target enterprise objects.
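Step 103 amounts to sorting enterprises by evaluation value and keeping those above the threshold. A minimal Python sketch follows, assuming the evaluation values have already been computed by the model; the function name screen_enterprises, the enterprise names, and the threshold are hypothetical.

```python
from typing import Dict, List, Tuple


def screen_enterprises(
    evaluation_values: Dict[str, float], threshold: float
) -> Tuple[List[Tuple[str, float]], List[str]]:
    """Rank enterprises by evaluation value and select the target enterprises.

    Returns (ranked, targets): `ranked` lists every enterprise from highest
    to lowest evaluation value; `targets` keeps only those whose value
    strictly exceeds the threshold, in the same order.
    """
    ranked = sorted(evaluation_values.items(), key=lambda item: item[1], reverse=True)
    targets = [name for name, value in ranked if value > threshold]
    return ranked, targets


# Hypothetical evaluation values (illustrative only, not from the source).
values = {"Enterprise A": 82.5, "Enterprise B": 67.0, "Enterprise C": 91.0}
ranked, targets = screen_enterprises(values, threshold=80.0)
```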
In practical applications, both registered and unregistered users can use basic functions of the system such as searching and viewing, and can view the comprehensive evaluation values of enterprise objects. However, only registered users can view the specific evaluation values of a target object: to understand a target object in detail and in depth, a user must have a registered identity, so that an objective and in-depth evaluation can be formed.
The embodiment of the invention also provides an association analysis method based on data mining, which uses two key formulas:
Support, which measures the reliability of the association rule: S(X→Y) = number of people who follow both X and Y / total number of people;
Confidence, which measures the degree of association between the corresponding enterprises: C(X→Y) = number of people who follow X and Y at the same time / number of people who follow X.
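The two measures can be computed directly from a record of which users follow which enterprises. The sketch below is illustrative; the data layout (a mapping from user to the set of followed items) and all names are assumptions introduced here.

```python
from typing import Dict, Set


def support(follows: Dict[str, Set[str]], x: str, y: str) -> float:
    """Fraction of all users who follow both x and y."""
    if not follows:
        raise ValueError("no users recorded")
    both = sum(1 for items in follows.values() if x in items and y in items)
    return both / len(follows)


def confidence(follows: Dict[str, Set[str]], x: str, y: str) -> float:
    """Among users who follow x, the fraction who also follow y."""
    follow_x = sum(1 for items in follows.values() if x in items)
    if follow_x == 0:
        raise ValueError(f"nobody follows {x!r}")
    both = sum(1 for items in follows.values() if x in items and y in items)
    return both / follow_x


# Hypothetical follow records (illustrative only, not from the source).
follows = {
    "u1": {"X", "Y"},
    "u2": {"X"},
    "u3": {"X", "Y", "Z"},
    "u4": {"Z"},
}
s_xy = support(follows, "X", "Y")     # 2 of 4 users follow both
c_xy = confidence(follows, "X", "Y")  # 2 of the 3 followers of X also follow Y
```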
The embodiment of the invention also provides a filtering method based on user analysis, comprising: searching the enterprise evaluation model for a second search keyword similar to the first search keyword; and pushing the search results triggered by the second search keyword as the search results for the first search keyword.
After user A enters the system, the platform uses the first search keyword to find a user B whose interests or preferences are similar to those of user A, and then pushes the products that user B likes, or the content user B has browsed, to user A.
For example, suppose 6 users are selected in the system of the embodiment of the present invention, and these users browse, collect, and purchase several products. To measure how interested a user is in a given type of product, a simple model is first designed in which different behaviors are assigned different scores, as shown in Table 2.
TABLE 2
Figure BDA0002718369330000141
Then, the product scores are accumulated according to each user's behaviors. The maximum product score is 10 points, and a score is not accumulated further once it reaches 10 points. This yields the preference score table shown in Table 3:
TABLE 3
User    Product 1    Product 2
A       3            10
B       2            8
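The accumulation with a 10-point cap can be sketched as follows. The per-behavior scores below are hypothetical, since the actual values of Table 2 are only available as an image in the original; the event list is likewise invented so that the result reproduces the two rows of Table 3.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical behavior scores; the real values are those of Table 2.
BEHAVIOR_SCORES = {"browse": 1, "collect": 3, "purchase": 5}
MAX_SCORE = 10  # Product scores are capped at 10 points.


def preference_table(events: Iterable[Tuple[str, str, str]]) -> Dict[str, Dict[str, int]]:
    """Accumulate per-user, per-product scores from (user, product, behavior) events."""
    table: Dict[str, Dict[str, int]] = defaultdict(dict)
    for user, product, behavior in events:
        current = table[user].get(product, 0)
        # Add the behavior score, but never go above the cap.
        table[user][product] = min(MAX_SCORE, current + BEHAVIOR_SCORES[behavior])
    return {user: dict(products) for user, products in table.items()}


events = [
    ("A", "Product 1", "collect"),
    ("A", "Product 2", "purchase"),
    ("A", "Product 2", "purchase"),
    ("A", "Product 2", "collect"),  # Would reach 13, so it is capped at 10.
    ("B", "Product 1", "browse"),
    ("B", "Product 1", "browse"),
    ("B", "Product 2", "purchase"),
    ("B", "Product 2", "collect"),
]
table = preference_table(events)
print(table)
```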
Finally, the user closest to user A is found by computing the similarity between the users' multidimensional product vectors $a$ and $b$:
$$\cos\theta = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$$
$$\cos\theta = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}} \, \sqrt{\sum_{i=1}^{n} y_i^{2}}}$$
where $x_i$ and $y_i$ denote the components of $a$ and $b$, respectively.
The similarity value given in the embodiment of the present invention lies in the interval [-1, 1]: a result of -1 means that the two vectors point in exactly opposite directions, 1 means that they point in exactly the same direction, 0 usually means that they are independent, and values in between indicate intermediate similarity or dissimilarity. In simple terms, the most similar is 1 and the least similar is -1.
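The description of the [-1, 1] range matches cosine similarity between preference vectors, so the nearest-user search can be sketched on that assumption. In the example below, rows A and B follow Table 3, while user C and the function names are hypothetical additions for contrast.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between vectors a and b; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Similarity is undefined for a zero vector; treat as unrelated.
    return dot / (norm_a * norm_b)


def nearest_user(target: str, preferences: Dict[str, Sequence[float]]) -> str:
    """Return the other user whose preference vector is most similar to the target's."""
    others = (user for user in preferences if user != target)
    return max(
        others,
        key=lambda user: cosine_similarity(preferences[target], preferences[user]),
    )


# Each vector holds the scores for (Product 1, Product 2).
preferences = {"A": [3, 10], "B": [2, 8], "C": [9, 1]}
nearest = nearest_user("A", preferences)
print(nearest)
```

Once the nearest user is known, the products that user scored highly, and that user A has not yet interacted with, are natural candidates to push to user A.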
In summary, the two dimensions of data association analysis and user analysis are considered together to recommend the most suitable product to the platform user.
When the enterprise evaluation model is trained according to the multidimensional combined data in step 101, the following method may be adopted, without limitation: classifying the enterprise basic information and formatting the data; and training on the multidimensional combined data based on the formatted data and a preset algorithm. Classification may be performed by industry, region, year of establishment, and the like, and may be configured for different application scenarios, which is not limited in the embodiment of the invention.
Displaying the data screening result according to the evaluation value specifically comprises the following steps:
respectively acquiring multi-dimensional data of the enterprise objects corresponding to the evaluation values; and comparing and displaying the multi-dimensional data of the plurality of enterprise objects in list form. The system platform provides a function for comparing and selecting among multiple target objects, so that the strengths and weaknesses of the target enterprise objects are clear at a glance. This addresses the pain points of having to search for required content in mountains of paper documents and of being unable to grasp key information in time. By using the system, the basic profile information of a target enterprise object and the value-map calculation conclusions provided by the platform can be obtained accurately and conveniently.
To keep enterprise information timely, the embodiment of the invention further provides a method in which each single data item under the multi-dimensional combined data is monitored in real time; and if an update is determined to exist, the enterprise evaluation model is updated based on the updated data. The recorded enterprise objects are described in detail (user tags), which provides an informatization means for establishing an enterprise supply chain, selecting preferred supplier enterprises, and monitoring enterprise trends to safeguard the supply chain.
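One simple way to picture the monitor-then-update flow is to compare two snapshots of the per-dimension data and retrain only when something changed. The sketch below assumes snapshot comparison rather than a true real-time feed; the function names, the dimension keys, and all numeric values are hypothetical, and `retrain` stands in for whatever update routine the model actually uses.

```python
from typing import Callable, Dict, List

Snapshot = Dict[str, float]


def changed_dimensions(previous: Snapshot, latest: Snapshot) -> List[str]:
    """Return the names of dimensions whose data differs between two snapshots."""
    keys = sorted(set(previous) | set(latest))
    return [key for key in keys if previous.get(key) != latest.get(key)]


def refresh_model(
    previous: Snapshot, latest: Snapshot, retrain: Callable[[Snapshot], None]
) -> bool:
    """Call retrain with the latest data if any dimension changed; report whether it ran."""
    if changed_dimensions(previous, latest):
        retrain(latest)
        return True
    return False


# Hypothetical per-dimension data for one enterprise.
previous_snapshot = {
    "operation_efficiency": 0.82,
    "development_capability": 0.64,
    "contribution": 0.71,
    "compliance": 1.0,
    "risk_control": 0.55,
}
latest_snapshot = dict(previous_snapshot, risk_control=0.60)

retrain_calls: List[Snapshot] = []  # Records each retraining request.
updated = refresh_model(previous_snapshot, latest_snapshot, retrain_calls.append)
print(updated, changed_dimensions(previous_snapshot, latest_snapshot))
```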
Since the big data screening apparatus described in this embodiment is an apparatus capable of executing the big data screening method in the embodiment of the present invention, based on the big data screening method described in the embodiment of the present invention, a person skilled in the art can understand a specific implementation manner of the big data screening apparatus of this embodiment and various variations thereof, and therefore, how the big data screening apparatus implements the big data screening method in the embodiment of the present invention is not described in detail herein. The device used by those skilled in the art to implement the method for screening big data in the embodiments of the present invention is within the scope of the present application.
An embodiment of the present invention further provides a big data screening apparatus, as shown in fig. 3, the apparatus includes:
a training unit 21, configured to train an enterprise evaluation model according to multidimensional combined data, where the multidimensional combined data includes: enterprise operation efficiency, enterprise development capability, enterprise contribution, compliance with laws and regulations, and risk control, and the data in a single dimension further includes one or more data items;
a receiving unit 22 for receiving an input first search keyword;
a scoring unit 23, configured to score sequentially, by multidimensional combination, in the enterprise evaluation model according to the first search keyword to obtain an evaluation value;
a display unit 24, configured to display the data screening result according to the magnitude of the evaluation value determined by the scoring unit;
and the determining unit 25 is configured to determine, as the target enterprise object, the enterprise corresponding to the evaluation value exceeding the preset evaluation threshold value.
Further, as shown in fig. 4, the apparatus further includes:
a search unit 26, configured to search the enterprise evaluation model for a second search keyword that is similar to the first search keyword;
a pushing unit 27, configured to push the search result triggered for the second search keyword found by the search unit as the search result of the first search keyword.
Further, as shown in fig. 4, the training unit 21 includes:
the processing module 211 is configured to classify the enterprise basic information and perform data formatting processing;
and a training module 212, configured to train on the multidimensional combined data based on the data formatted by the processing module and a preset algorithm.
Further, as shown in fig. 4, the display unit 24 includes:
an obtaining module 241, configured to obtain multidimensional data of the enterprise objects corresponding to the evaluation values respectively;
and a display module 242, configured to compare and display the multidimensional data of the plurality of enterprise objects in a list form.
Further, as shown in fig. 4, the apparatus further includes:
the monitoring unit 28 is configured to perform real-time monitoring on the single data under the multi-dimensional combined data;
an updating unit 29, configured to update the enterprise evaluation model based on the updated data when the monitoring unit determines that there is an update.
The present embodiments provide a non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the methods provided by the method embodiments described above.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A big data screening method is characterized by comprising the following steps:
training an enterprise evaluation model according to multi-dimensional combined data, wherein the multi-dimensional combined data comprises: enterprise operation efficiency, enterprise development capability, enterprise contribution, compliance with laws and regulations, and risk control, and the data in a single dimension further comprises one or more data items;
receiving an input first search keyword, and sequentially grading according to multi-dimensional combination in the enterprise evaluation model according to the first search keyword to obtain an evaluation value;
and displaying the data screening result according to the size of the evaluation value, and determining the enterprise corresponding to the evaluation value exceeding a preset evaluation threshold value as a target enterprise object.
2. The method of claim 1, further comprising:
searching a second search keyword similar to the first search keyword in the enterprise evaluation model;
and pushing the triggered search result aiming at the second search keyword as the search result of the first search keyword.
3. The method of claim 1, wherein training the enterprise evaluation model according to the multi-dimensional combined data comprises:
classifying the basic information of the enterprise and carrying out data formatting treatment;
and training the multi-dimensional combined data based on the data after the formatting treatment and a preset algorithm.
4. The method of any one of claims 1 to 3, wherein displaying the data screening result according to the magnitude of the evaluation value comprises:
respectively acquiring multi-dimensional data of enterprise objects corresponding to the evaluation values;
and comparing and displaying the multi-dimensional data of the enterprise objects in a list form.
5. The method of claim 4, further comprising:
respectively monitoring single data under the multi-dimensional combined data in real time;
and if an update is determined to exist, updating the enterprise evaluation model based on the updated data.
6. A big data screening device, comprising:
the training unit is used for training an enterprise evaluation model according to multi-dimensional combined data, wherein the multi-dimensional combined data comprises: enterprise operation efficiency, enterprise development capability, enterprise contribution, compliance with laws and regulations, and risk control, and the data in a single dimension further comprises one or more data items;
a receiving unit for receiving an input first search keyword;
the evaluation unit is used for sequentially carrying out evaluation according to multi-dimensional combination in the enterprise evaluation model according to the first search keyword to obtain an evaluation value;
the display unit is used for displaying the data screening result according to the evaluation value determined by the evaluation unit;
and the determining unit is used for determining the enterprise corresponding to the evaluation value exceeding the preset evaluation threshold value as the target enterprise object.
7. The apparatus of claim 6, further comprising:
the searching unit is used for searching a second search keyword similar to the first search keyword in the enterprise evaluation model;
and the pushing unit is used for pushing the triggered search result aiming at the second search keyword searched by the searching unit as the search result of the first search keyword.
8. The apparatus of claim 6, wherein the training unit comprises:
the processing module is used for classifying the basic information of the enterprise and carrying out data formatting processing;
and the training module is used for training on the multidimensional combined data based on the data formatted by the processing module and a preset algorithm.
9. The device according to any one of claims 6 to 7, wherein the display unit comprises:
the acquisition module is used for respectively acquiring the multidimensional data of the enterprise objects corresponding to the evaluation values;
and the display module is used for comparing and displaying the multi-dimensional data of the enterprise objects in a list form.
10. The apparatus of claim 9, further comprising:
the monitoring unit is used for respectively monitoring the single data under the multi-dimensional combined data in real time;
and the updating unit is used for updating the enterprise evaluation model based on the updated data when the monitoring unit determines that the update exists.
CN202011080351.XA 2020-10-10 2020-10-10 Big data screening method and device Pending CN112148760A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011080351.XA CN112148760A (en) 2020-10-10 2020-10-10 Big data screening method and device

Publications (1)

Publication Number Publication Date
CN112148760A true CN112148760A (en) 2020-12-29

Family

ID=73952919

Country Status (1)

Country Link
CN (1) CN112148760A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563630A (en) * 2017-08-25 2018-01-09 前海梧桐(深圳)数据有限公司 Enterprise's methods of marking and its system based on various dimensions
CN109726905A (en) * 2018-12-20 2019-05-07 北交金科金融信息服务有限公司 A kind of method and system of enterprise value portrait evaluation
CN109857938A (en) * 2019-01-30 2019-06-07 杭州太火鸟科技有限公司 Searching method, searcher and computer storage medium based on company information
CN110969332A (en) * 2018-09-30 2020-04-07 北京国双科技有限公司 Enterprise screening method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115345585A (en) * 2022-08-16 2022-11-15 清华大学苏州汽车研究院(吴江) Supply chain intelligent management system for enterprise operation
CN117078054A (en) * 2023-06-07 2023-11-17 科学技术部火炬高技术产业开发中心 Scientific and technological enterprise innovation ability quantitative assessment method and system
CN117078054B (en) * 2023-06-07 2024-04-05 科学技术部火炬高技术产业开发中心 Scientific and technological enterprise innovation ability quantitative assessment method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination