CN111221973B - Occupational attribute identification method and system based on machine learning and edge calculation - Google Patents


Info

Publication number
CN111221973B
Authority
CN
China
Prior art keywords
attribute
module
professional
occupational
crawler
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010096332.XA
Other languages
Chinese (zh)
Other versions
CN111221973A (en)
Inventor
吴晓军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Jilian Human Resources Service Group Co ltd
Original Assignee
Hebei Jilian Human Resources Service Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei Jilian Human Resources Service Group Co ltd
Priority to CN202010096332.XA
Publication of CN111221973A
Application granted
Publication of CN111221973B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

The invention provides an occupational attribute identification method, a system, and a computer-readable storage medium based on machine learning, cloud computing, and edge computing. The occupational attributes to be classified are obtained from an edge crawler network and from user input; semantic, grammatical, and contextual analysis and a preliminary identification are then carried out; and the classification of the occupational attributes is learned and identified by training on the preliminary identification result.

Description

Occupational attribute identification method and system based on machine learning and edge calculation
Technical Field
The invention relates to the technical field of cloud computing, in particular to a professional attribute identification method and system based on machine learning and edge computing.
Background
In recent years, a new generation of information technology, represented by cloud computing, machine learning, and edge computing, has greatly changed the way people work, learn, and live. Cloud computing is a computing model in which programs are deployed in a distributed manner across a network cloud, which can improve data processing efficiency and system reliability. Machine learning is a technique that allows a computer program to accumulate experience and to improve its performance based on that experience, strengthening its ability to learn and summarize knowledge. Edge computing places an open platform that integrates network, computing, storage, and application capabilities close to the object or data source and provides services at the nearest end. Some algorithms run on local devices instead of being handed to the cloud, so processing completes at the local edge computing layer, which greatly improves processing efficiency and reduces the load on the cloud.
Occupation classification is the systematic division of occupations, according to scientific methods and standards and according to the nature and characteristics of each occupation, into a reasonable and ordered classification system. As society progresses and the social division of labor develops, the number of occupation categories keeps growing. Existing occupation classification methods rely largely on manual processing and cannot classify automatically in real time, so occupational attribute classification suffers from low efficiency and low accuracy.
Disclosure of Invention
To address these problems, the invention provides an occupational attribute identification method and system based on machine learning, cloud computing, and edge computing. It constructs a platform for information identification, analysis, and learning-based training, and uses distributed edge computing to filter out irrelevant information, which reduces the computing load and improves the accuracy and efficiency of occupational attribute identification.
The invention provides the following technical scheme:
a method of professional attribute identification based on machine learning and edge calculation, the method comprising:
Step 101: formulating a webpage search range strategy and a webpage search method strategy; executing a crawler algorithm according to these strategies to identify occupation types; comparing the result against a database to identify a first occupational attribute; executing the algorithm update and database update commands issued by a crawler monitoring and scheduling module; and transmitting the first occupational attribute and submitting an analysis request;
Step 102: receiving the analysis request; analyzing the first occupational attribute by analyzing the paragraph text through semantics, grammar, and context; extracting keywords of the first occupational attribute; converting the keywords into hexadecimal ASCII coded values; and concatenating the ASCII coded values to form the mark characteristic value of the first occupational attribute;
Step 103: receiving the mark characteristic value of the first occupational attribute; comparing it by traversal with the occupational attribute characteristic values in the database module; identifying whether the first occupational attribute is an existing occupational attribute; and, if not, classifying the first occupational attribute into a preliminary identification result;
Step 104: performing machine learning on the preliminary identification result using a minimum risk regression random prediction method to obtain the classification of the first occupational attribute.
The occupational attribute identification method based on machine learning, cloud computing, and edge computing further comprises monitoring the execution state of the edge crawler network computing body module and restarting any such module that is abnormal or has stopped responding.
In step 103, identifying whether the first occupational attribute is an existing occupational attribute specifically includes comparing a confidence coefficient with a preset threshold: if the confidence coefficient is greater than or equal to the preset threshold, the first occupational attribute is an existing occupational attribute; otherwise, it is a new occupational attribute.
After the classification of the first occupational attribute is obtained, it is sent to a client module and displayed to a user.
A system for occupational attribute identification based on machine learning and edge computing, the system comprising at least one edge crawler network calculator module and a cloud computing platform. The cloud computing platform comprises a crawler monitoring and scheduling module, a character analysis module, a preliminary identification module, a database module, an intelligent learning module, and an information interaction module.
The edge crawler network calculator module is used for formulating a webpage search range strategy and a webpage search method strategy, executing a crawler algorithm according to these strategies to identify occupation types, comparing the result against a database to identify a first occupational attribute, executing the algorithm update and database update commands issued by the crawler monitoring and scheduling module, and transmitting the first occupational attribute and submitting an analysis request.
The character analysis module is used for receiving the analysis request, analyzing the first occupational attribute by analyzing the paragraph text through semantics, grammar, and context, extracting keywords of the first occupational attribute, converting the keywords into hexadecimal ASCII coded values, and concatenating the ASCII coded values to form the mark characteristic value of the first occupational attribute.
The preliminary identification module is used for receiving the mark characteristic value of the first occupational attribute, comparing it by traversal with the occupational attribute characteristic values in the database module, identifying whether the first occupational attribute is an existing occupational attribute, and, if not, classifying the first occupational attribute into a preliminary identification result.
The database module is used for storing occupational attribute classification information, occupational attribute ASCII coded values, and machine learning process information.
The intelligent learning module is used for performing machine learning on the preliminary identification result using a minimum risk regression random prediction method to obtain the classification of the first occupational attribute.
The crawler monitoring and scheduling module is used for monitoring the execution state of the edge crawler network computing body module, restarting any such module that is abnormal or has stopped responding, receiving the analysis data submitted by the edge crawler network computing body module, and broadcasting the crawler algorithm and the occupational attribute database to the edge crawler network computing body module.
Identifying whether the first occupational attribute is an existing occupational attribute specifically includes comparing a confidence coefficient with a preset threshold: if the confidence coefficient is greater than or equal to the preset threshold, the first occupational attribute is an existing occupational attribute; otherwise, it is a new occupational attribute.
After the classification of the first occupational attribute is obtained, it is sent to a client module and displayed to a user.
In addition, the invention also provides a computer readable storage medium, on which a computer program is stored, wherein the computer program executes the occupational attribute identification method based on machine learning, cloud computing and edge computing.
The invention thus provides an occupational attribute identification method, a system, and a computer-readable storage medium based on machine learning, cloud computing, and edge computing. The occupational attributes to be classified are obtained from an edge crawler network and from user input; semantic, grammatical, and contextual analysis and a preliminary identification are then carried out; and the classification of the occupational attributes is learned and identified by training on the preliminary identification result.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of the system architecture of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
The invention provides an occupational attribute identification method and system based on machine learning, cloud computing, and edge computing, and a computer-readable storage medium. It constructs a platform for information identification, analysis, and learning-based training, and uses distributed edge computing to filter out irrelevant information, which reduces the computing load and improves the accuracy and efficiency of occupational attribute identification.
In a first embodiment, the invention provides an occupational attribute identification method based on machine learning and edge computing that uses cloud computing. As shown in FIG. 1, the method is carried out by at least one edge crawler network computing body module and a cloud computing platform, and includes:
Step 101: formulating a webpage search range strategy and a webpage search method strategy; executing a crawler algorithm according to these strategies to identify occupation types; comparing the result against a database to identify a first occupational attribute; executing the algorithm update and database update commands issued by the crawler monitoring and scheduling module; and transmitting the first occupational attribute and submitting an analysis request.
The analysis request is transmitted to the character analysis module for analysis. The edge crawler network calculator module is responsible for formulating the webpage search range strategy and the webpage search method strategy. The first occupational attribute is a newly identified, potentially new occupational attribute.
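As a rough illustration of the edge-side filtering in step 101, the following Python sketch compares crawled occupation titles against a local copy of the occupational attribute database and keeps only the unknown ones as candidate first occupational attributes. The function names, the normalization rule, and the sample titles are assumptions made for illustration; the text does not specify how the crawler extracts or compares titles.

```python
def normalize(title: str) -> str:
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(title.lower().split())


def filter_candidates(candidates, known_attributes):
    """Return titles absent from the local database copy, in first-seen order.

    Assumption: an exact match on the normalized title counts as "known".
    """
    known = {normalize(k) for k in known_attributes}
    seen = set()
    first_attributes = []
    for title in candidates:
        key = normalize(title)
        if key and key not in known and key not in seen:
            seen.add(key)
            first_attributes.append(key)
    return first_attributes


# Hypothetical local database copy and crawl output.
local_db = ["Software Engineer", "Truck Driver"]
crawled = ["software  engineer", "Drone Pilot", "drone pilot",
           "Truck driver", "Live-stream Host"]
new_candidates = filter_candidates(crawled, local_db)
print(new_candidates)
```

Filtering at the edge in this way means only titles that are not already known need to be sent to the cloud platform for analysis, which is the load reduction the description attributes to edge computing.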
The crawler monitoring and scheduling module monitors the execution state of the edge crawler network computing body module, restarts any such module that is abnormal or has stopped responding, receives the analysis data submitted by the edge crawler network computing body module, and broadcasts the crawler algorithm and the occupational attribute database to the edge crawler network computing body module.
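The monitoring and restart behaviour can be sketched as a heartbeat check. This is a minimal Python sketch under stated assumptions: the class names, the 30-second timeout, and the use of heartbeats to detect an unresponsive module are all hypothetical, since the text does not say how the execution state is observed or how a restart is triggered.

```python
from dataclasses import dataclass


@dataclass
class CrawlerNode:
    node_id: str
    last_heartbeat: float
    restarts: int = 0


class CrawlerSupervisor:
    """Tracks edge crawler nodes and flags those that stop reporting."""

    def __init__(self, timeout_s: float = 30.0):
        self.timeout_s = timeout_s  # assumed threshold, not from the text
        self.nodes = {}

    def heartbeat(self, node_id: str, now: float) -> None:
        """Record that a node reported in at time `now` (seconds)."""
        node = self.nodes.get(node_id)
        if node is None:
            self.nodes[node_id] = CrawlerNode(node_id, now)
        else:
            node.last_heartbeat = now

    def restart_stale(self, now: float):
        """Mark nodes silent for longer than the timeout as restarted."""
        restarted = []
        for node in self.nodes.values():
            if now - node.last_heartbeat > self.timeout_s:
                node.restarts += 1
                node.last_heartbeat = now  # the restart counts as a fresh start
                restarted.append(node.node_id)
        return restarted


sup = CrawlerSupervisor(timeout_s=30.0)
sup.heartbeat("edge-1", now=0.0)
sup.heartbeat("edge-2", now=0.0)
sup.heartbeat("edge-1", now=25.0)
restarted = sup.restart_stale(now=40.0)
print(restarted)
```

Passing the clock value in as `now` keeps the sketch deterministic; a real deployment would read the system clock and issue an actual restart command where the counter is incremented.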
Step 102: receiving the analysis request; analyzing the first occupational attribute by analyzing the paragraph text through semantics, syntax, and context; extracting the keywords of the first occupational attribute; converting each keyword into a hexadecimal ASCII coded value; and concatenating the ASCII coded values to form the mark characteristic value of the first occupational attribute. Let x be an integer loop index, y the number of keywords, Nvalue the hexadecimal ASCII coded value of the current keyword, and ASC the mark characteristic value of the first occupational attribute. The specific implementation is as follows:
ASC = "";                    /* start from an empty mark characteristic value */
for (x = 1; x <= y; x++)
{
    ASC = ASC + Nvalue;      /* append the hexadecimal ASCII code of the x-th keyword */
}
In this step, the character analysis module analyzes the new occupational attribute (that is, the first occupational attribute) identified by the edge crawler network calculator module.
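The encoding loop above can be made concrete with a short Python sketch. It assumes that keywords are plain ASCII strings and that the codes are concatenated in keyword order with upper-case hexadecimal digits; the function names and the sample keywords are illustrative and do not come from the text.

```python
def keyword_to_hex(keyword: str) -> str:
    """Hexadecimal ASCII codes of one keyword, e.g. "ab" -> "6162".

    Raises UnicodeEncodeError for non-ASCII input; the text does not say
    how such keywords would be encoded.
    """
    return keyword.encode("ascii").hex().upper()


def mark_feature_value(keywords) -> str:
    """Concatenate the per-keyword hex codes (ASC = ASC + Nvalue)."""
    asc = ""
    for keyword in keywords:
        asc += keyword_to_hex(keyword)
    return asc


feature = mark_feature_value(["drone", "pilot"])
print(feature)  # 64726F6E6570696C6F74
```

Each ASCII character contributes exactly two hexadecimal digits, so the mark characteristic value is always twice as long as the combined keywords.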
Step 103: receiving the mark characteristic value of the first occupational attribute; comparing it by traversal with the occupational attribute characteristic values in the database module; identifying whether the first occupational attribute is an existing occupational attribute; and, if not, classifying the first occupational attribute into a preliminary identification result.
The preliminary identification module receives the mark characteristic value of the new occupational attribute from the character analysis module and compares it by traversal with the existing characteristic values in the occupational attribute database to determine preliminarily whether the attribute is an existing occupation; if not, it is placed in a subset of the preliminary identification result. Let W be the mark characteristic value of the new occupational attribute, M an existing occupational attribute characteristic value, and x the step code length. len(W) is the length of W and len(M) is the length of M, so W has i = len(W)/x steps and M has j = len(M)/x steps. With u and v natural numbers, mot(Wu, Mv) is the distance between the u-th segment of W and the v-th segment of M, Dmin is the smallest distance over all segment pairs, Dmax is the largest distance over all segment pairs, and Si is the similarity confidence coefficient, given by the following formula:
[Formula for Si: image not reproduced in the extracted text]
If Si is greater than or equal to 51%, the new occupational attribute is an existing occupational attribute; its ASCII coded value is updated in the database, and the crawler monitoring and scheduling module is notified to broadcast the updated occupational attribute database. If Si is less than 51%, the attribute is a new occupational attribute; its ASCII coded value is stored, and the intelligent learning module is notified to perform learning and training.
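The segment comparison and the 51% decision can be sketched in Python as follows. Two points are assumptions: the distance mot is taken here to be a character-wise Hamming distance, which the text does not specify, and the formula that turns Dmin and Dmax into Si is not available in this text, so the sketch takes Si as an input instead of computing it. All function names are illustrative.

```python
from itertools import product


def split_segments(value: str, step: int):
    """Cut a characteristic value into consecutive segments of length `step`."""
    return [value[i:i + step] for i in range(0, len(value), step)]


def mot(a: str, b: str) -> int:
    """Assumed distance: differing positions plus any length difference."""
    diff = sum(1 for ca, cb in zip(a, b) if ca != cb)
    return diff + abs(len(a) - len(b))


def segment_distance_range(w: str, m: str, step: int):
    """Return (Dmin, Dmax) over all pairs of segments of W and M.

    Both values must be non-empty, otherwise there are no pairs to compare.
    """
    distances = [mot(wu, mv)
                 for wu, mv in product(split_segments(w, step),
                                       split_segments(m, step))]
    return min(distances), max(distances)


def is_existing(si: float, threshold: float = 0.51) -> bool:
    """Step 103 decision: Si at or above the threshold means an existing attribute."""
    return si >= threshold


d_min, d_max = segment_distance_range("64726F6E65", "64726F7665", step=2)
print(d_min, d_max)
print(is_existing(0.51), is_existing(0.50))
```

Keeping the threshold as a parameter matches the description of a "preset threshold", with 51% as the value given in this embodiment.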
Wherein the database module C4 stores professional attribute classification information and professional attribute ASCII encoded values, as well as machine learning process information.
Step 104: performing machine learning on the preliminary identification result using a minimum risk regression random prediction method to obtain the classification of the first occupational attribute.
The intelligent learning module performs learning and training on the new occupational attributes identified by the preliminary identification module, classifies and names them, and also classifies the existing occupational attributes. The training method is the minimum risk regression random prediction method: random numbers p and k are generated, and the random samples (Mp, Qp) and (Mk, Qk) numbered p and k are extracted; a random number s between 0 and 1 is generated, and U = s × Mp + (1-s) × Mk and V = s × Qp + (1-s) × Qk are computed. F(M, Q) is the prediction classification value function, R(M, Q) is a normally distributed random risk, and JER = MIN(Z) is the minimum risk regression random prediction value; the classification with the minimum JER is taken as the standard classification name and attribute name. The minimum risk regression random prediction value is calculated as follows:
[Formula for JER: image not reproduced in the extracted text]
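The sample-mixing part of this training method, U = s × Mp + (1-s) × Mk and V = s × Qp + (1-s) × Qk, can be sketched in Python. The sketch assumes that M and Q are plain numbers and that p and k are drawn uniformly from the sample indices; it does not attempt F, R, or the JER formula, which are not recoverable from this text. The function name and sample data are illustrative.

```python
import random


def interpolate_sample(samples, rng):
    """Draw two samples at random and blend them with a random weight s.

    `samples` is a list of (M, Q) pairs; returns the blended pair (U, V).
    """
    p = rng.randrange(len(samples))
    k = rng.randrange(len(samples))
    s = rng.random()  # random number in [0, 1)
    m_p, q_p = samples[p]
    m_k, q_k = samples[k]
    u = s * m_p + (1 - s) * m_k
    v = s * q_p + (1 - s) * q_k
    return u, v


# Hypothetical numeric samples; a seeded generator keeps the run repeatable.
samples = [(1.0, 0.0), (3.0, 1.0)]
rng = random.Random(0)
u, v = interpolate_sample(samples, rng)
print(u, v)
```

Because the same weight s is applied to both components, each blended pair lies on the line segment between the two drawn samples.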
In addition, the information interaction module receives the occupational attribute introduction entered through the client module and passes it to the character analysis module, which analyzes it and extracts the occupational attributes to be classified. The cloud computing platform then sends the analysis result of the intelligent learning module to the client module. If only preliminary scoring is needed, the cloud computing platform can instead send the result of the preliminary identification module to the client module for display. The client module lets the user enter an occupational attribute description, accepts the occupational attribute judgment request, and displays the resulting occupational attribute classification to the user.
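The request flow through the information interaction module can be sketched as a small routing function in Python. The function signature, the `preliminary_only` flag, and the three stand-in module functions are hypothetical; they only illustrate the order of the calls described above, not how the actual modules work.

```python
def handle_request(description, analyze, preliminary, learn,
                   preliminary_only=False):
    """Route one client request through the cloud-side modules."""
    feature = analyze(description)      # character analysis module
    prelim = preliminary(feature)       # preliminary identification module
    if preliminary_only:
        return {"stage": "preliminary", "result": prelim}
    # intelligent learning module produces the final classification
    return {"stage": "learned", "result": learn(feature, prelim)}


# Hypothetical stand-ins for the real modules, for illustration only.
def analyze(text):
    return text.strip().lower()


def preliminary(feature):
    return "existing" if feature == "truck driver" else "new"


def learn(feature, prelim):
    return f"class-for:{feature}"


quick = handle_request(" Drone Pilot ", analyze, preliminary, learn,
                       preliminary_only=True)
full = handle_request(" Drone Pilot ", analyze, preliminary, learn)
print(quick)
print(full)
```

Passing the modules in as functions keeps the routing logic independent of how each module is implemented.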
In a second embodiment, the invention provides an occupational attribute identification system based on machine learning, cloud computing, and edge computing. As shown in FIG. 2, the system includes at least one edge crawler network computing body module P and a cloud computing platform. The cloud computing platform comprises a crawler monitoring and scheduling module C1, a character analysis module C2, a preliminary identification module C3, a database module C4, an intelligent learning module C5, and an information interaction module C6.
The edge crawler network calculator module P formulates a webpage search range strategy and a webpage search method strategy, executes a crawler algorithm according to these strategies to identify occupation types, compares the result against a database to identify a first occupational attribute, executes the algorithm update and database update commands issued by the crawler monitoring and scheduling module, and transmits the first occupational attribute and submits an analysis request.
The analysis request is transmitted to the character analysis module for analysis. The edge crawler network calculator module is responsible for formulating the webpage search range strategy and the webpage search method strategy. The first occupational attribute is a newly identified, potentially new occupational attribute.
The crawler monitoring and scheduling module monitors the execution state of the edge crawler network computing body module, restarts any such module that is abnormal or has stopped responding, receives the analysis data submitted by the edge crawler network computing body module, and broadcasts the crawler algorithm and the occupational attribute database to the edge crawler network computing body module.
The character analysis module C2 is configured to receive the analysis request, analyze the first occupational attribute by analyzing the paragraph text through semantics, syntax, and context, extract the keywords of the first occupational attribute, convert each keyword into a hexadecimal ASCII coded value, and concatenate the ASCII coded values to form the mark characteristic value of the first occupational attribute. Let x be an integer loop index, y the number of keywords, Nvalue the hexadecimal ASCII coded value of the current keyword, and ASC the mark characteristic value of the first occupational attribute. The specific implementation is as follows:
ASC = "";                    /* start from an empty mark characteristic value */
for (x = 1; x <= y; x++)
{
    ASC = ASC + Nvalue;      /* append the hexadecimal ASCII code of the x-th keyword */
}
The preliminary identification module C3 is used for receiving the mark characteristic value of the first occupational attribute, comparing it by traversal with the occupational attribute characteristic values in the database module, identifying whether the first occupational attribute is an existing occupational attribute, and, if not, classifying the first occupational attribute into a preliminary identification result.
The preliminary identification module C3 receives the mark characteristic value of the new occupational attribute from the character analysis module C2 and compares it by traversal with the existing characteristic values in the occupational attribute database to determine preliminarily whether the attribute is an existing occupation; if not, it is placed in a subset of the preliminary identification result. Let W be the mark characteristic value of the new occupational attribute, M an existing occupational attribute characteristic value, and x the step code length. len(W) is the length of W and len(M) is the length of M, so W has i = len(W)/x steps and M has j = len(M)/x steps. With u and v natural numbers, mot(Wu, Mv) is the distance between the u-th segment of W and the v-th segment of M, Dmin is the smallest distance over all segment pairs, Dmax is the largest distance over all segment pairs, and Si is the similarity confidence coefficient, given by the following formula:
[Formula for Si: image not reproduced in the extracted text]
If Si is greater than or equal to 51%, the new occupational attribute is an existing occupational attribute; its ASCII coded value is updated in the database, and the crawler monitoring and scheduling module is notified to broadcast the updated occupational attribute database. If Si is less than 51%, the attribute is a new occupational attribute; its ASCII coded value is stored, and the intelligent learning module is notified to perform learning and training.
The database module C4 is used to store the occupational attribute classification information, the occupational attribute ASCII coded values, and the machine learning process information.
The intelligent learning module C5 is used for performing machine learning on the preliminary identification result using a minimum risk regression random prediction method to obtain the classification of the first occupational attribute.
The intelligent learning module C5 performs learning and training on the new occupational attributes identified by the preliminary identification module C3, classifies and names them, and also classifies the existing occupational attributes. The training method is the minimum risk regression random prediction method: random numbers p and k are generated, and the random samples (Mp, Qp) and (Mk, Qk) numbered p and k are extracted; a random number s between 0 and 1 is generated, and U = s × Mp + (1-s) × Mk and V = s × Qp + (1-s) × Qk are computed. F(M, Q) is the prediction classification value function, R(M, Q) is a normally distributed random risk, and JER = MIN(Z) is the minimum risk regression random prediction value; the classification with the minimum JER is taken as the standard classification name and attribute name. The minimum risk regression random prediction value is calculated as follows:
[Formula for JER: image not reproduced in the extracted text]
In addition, the information interaction module C6 receives the occupational attribute introduction entered through the client module and passes it to the character analysis module, which analyzes it and extracts the occupational attributes to be classified. The cloud computing platform then sends the analysis result of the intelligent learning module to the client module. The client module lets the user enter an occupational attribute description, accepts the occupational attribute judgment request, and displays the resulting occupational attribute classification to the user.
In addition, the invention also provides a computer readable storage medium, on which a computer program is stored, wherein the computer program executes the occupational attribute identification method based on machine learning, cloud computing and edge computing.
The invention thus provides an occupational attribute identification method, a system, and a computer-readable storage medium based on machine learning, cloud computing, and edge computing. The occupational attributes to be classified are obtained from an edge crawler network and from user input; semantic, grammatical, and contextual analysis and a preliminary identification are then carried out; and the classification of the occupational attributes is learned and identified by training on the preliminary identification result.
The embodiments of the present invention described above are combinations of elements and features of the present invention. Unless otherwise mentioned, the elements or features may be considered optional. Each element or feature may be practiced without being combined with other elements or features. In addition, the embodiments of the present invention may be configured by combining some elements and/or features. The order of operations described in the embodiments of the present invention may be rearranged. Some configurations of any embodiment may be included in another embodiment, and may be replaced with corresponding configurations of the other embodiment. It will be apparent to those skilled in the art that claims that are not explicitly cited in each other in the appended claims may be combined into an embodiment of the present invention or may be included as new claims in a modification after the present invention is filed.
In a firmware or software configuration, embodiments of the present invention may be implemented in the form of modules, procedures, functions, and the like. The software codes may be stored in memory units and executed by processors. The memory unit is located inside or outside the processor, and may transmit and receive data to and from the processor via various known means.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. A professional attribute identification method based on machine learning and edge calculation is characterized in that:
step 101, making a webpage searching range strategy and a webpage searching method strategy, executing a crawler algorithm to identify occupation types according to the webpage searching range strategy and the webpage searching method strategy, comparing a database, and identifying a first occupation attribute; executing commands of algorithm updating and database updating of a crawler monitoring and scheduling module, transmitting the first occupational attribute and submitting an analysis request;
102, receiving the analysis request, analyzing the first professional attribute, analyzing the paragraph characters through semantics, grammar and context, extracting keywords of the first professional attribute, converting the keywords into hexadecimal ASCII coded values, and splicing the ASCII coded values by adopting an algorithm to be used as a mark characteristic value of the first professional attribute;
103, receiving the marking characteristic value of the first occupational attribute, traversing and comparing the marking characteristic value of the first occupational attribute with the occupational attribute characteristic value of the database module, identifying whether the first occupational attribute is the existing occupational attribute, and if not, classifying the first occupational attribute into a preliminary identification result;
the specific calculation method is as follows: w is a new professional attribute mark characteristic value, M is an existing professional attribute characteristic value, the step code length is x, len (W) is the length of W, len (M) is the length of M, the step number i = len (W)/x of W, the step number j = len (M)/x of M, u and v are natural numbers, mot (Wu, Mv) is the distance value between the u-th segment of W and the v-th segment of M, Dmin is the nearest distance value in all segments, Dmax is the maximum distance value of all segments, Si is a similar confidence coefficient, and the formula is as follows:
[formula image for Si not reproduced in the source text]
if Si is greater than or equal to 51%, the new occupational attribute is an existing occupational attribute; its ASCII code value is updated in the database, and the crawler monitoring and scheduling module is notified to broadcast the updated occupational attribute database; if Si is less than 51%, the occupational attribute is a new occupational attribute; its ASCII code value is stored, and an intelligent learning module is notified to perform learning and training;
and step 104, performing machine learning on the preliminary identification result using a minimum risk regression random prediction method to obtain the classification of the first occupational attribute.
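As an illustration of the encoding in step 102, the following Python sketch converts keywords to hexadecimal code values and splices them into a single mark characteristic value. The claim does not name the splicing algorithm, so plain concatenation in keyword order is an assumption here, as are the function names `keyword_to_hex` and `mark_feature_value` and the use of UTF-8 for keywords that fall outside ASCII.

```python
def keyword_to_hex(keyword: str) -> str:
    """Encode one keyword as an uppercase hexadecimal string.

    For pure ASCII input this yields two hex digits per character.
    UTF-8 is an assumption so that non-ASCII keywords still encode.
    """
    return keyword.encode("utf-8").hex().upper()


def mark_feature_value(keywords: list[str]) -> str:
    """Splice the per-keyword hex strings into one feature value.

    Plain concatenation in keyword order is an assumed splicing rule;
    the claim only says the values are spliced "by an algorithm".
    """
    return "".join(keyword_to_hex(k) for k in keywords)


if __name__ == "__main__":
    print(keyword_to_hex("nurse"))           # 6E75727365
    print(mark_feature_value(["AI", "ML"]))  # 41494D4C
```

A fixed keyword order matters under this assumption, because the same keywords spliced in a different order give a different characteristic value.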
2. The method of claim 1, wherein: the occupational attribute identification method based on machine learning, cloud computing and edge computing further comprises monitoring the execution state of the edge crawler network computing body module and restarting any edge crawler network computing body module that is abnormal or dead.
3. The method of claim 1, wherein: identifying, in step 103, whether the first occupational attribute is an existing occupational attribute specifically comprises: comparing the confidence with a preset threshold; if the confidence is greater than or equal to the preset threshold, the first occupational attribute is an existing occupational attribute; otherwise, the first occupational attribute is a new occupational attribute.
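The threshold comparison in claim 3, together with the follow-up actions stated in claim 1, can be sketched as below. This is a minimal illustration, not the patented implementation: the confidence is assumed to be a fraction in [0, 1], the default of 0.51 mirrors the 51% figure in claim 1, and the function names and action labels are hypothetical.

```python
DEFAULT_THRESHOLD = 0.51  # mirrors the 51% figure stated in claim 1


def is_existing_attribute(si: float, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True when the similarity confidence reaches the threshold."""
    if not 0.0 <= si <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return si >= threshold


def next_actions(si: float, threshold: float = DEFAULT_THRESHOLD) -> list[str]:
    """Map the decision to follow-up actions (labels are hypothetical)."""
    if is_existing_attribute(si, threshold):
        # Existing attribute: refresh the stored code value and broadcast.
        return ["update_database", "notify_crawler_scheduler_broadcast"]
    # New attribute: store its code value and trigger learning.
    return ["store_new_code_value", "notify_intelligent_learning_module"]


if __name__ == "__main__":
    print(is_existing_attribute(0.51))  # True: the boundary counts as existing
    print(next_actions(0.30))
```

The comparison is inclusive, so a confidence exactly equal to the threshold is treated as an existing occupational attribute, as the claim wording requires.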
4. The method of claim 1, wherein: after the first occupational attribute classification is obtained, it is sent to a client module for display to a user.
5. An occupational attribute identification system based on machine learning and edge computing, characterized in that: the system comprises at least one edge crawler network computing body module and a cloud computing platform; the cloud computing platform comprises a crawler monitoring and scheduling module, a character analysis module, a preliminary identification module, a database module, an intelligent learning module and an information interaction module;
the edge crawler network computing body module is used for formulating a web page search range strategy and a web page search method strategy; executing a crawler algorithm according to these two strategies to identify occupation types, comparing them against a database, and identifying a first occupational attribute; executing the algorithm update and database update commands issued by the crawler monitoring and scheduling module; and transmitting the first occupational attribute and submitting an analysis request;
the character analysis module is used for receiving the analysis request and analyzing the first occupational attribute: parsing the paragraph text by semantics, grammar and context, extracting keywords of the first occupational attribute, converting the keywords into hexadecimal ASCII code values, and splicing the ASCII code values by means of an algorithm to form the mark characteristic value of the first occupational attribute;
the preliminary identification module is used for receiving the mark characteristic value of the first occupational attribute, traversing the occupational attribute characteristic values in the database module and comparing each with the mark characteristic value, identifying whether the first occupational attribute is an existing occupational attribute, and, if not, classifying the first occupational attribute into a preliminary identification result;
the preliminary identification module specifically executes the following: W is the mark characteristic value of the new occupational attribute, M is the characteristic value of an existing occupational attribute, x is the step code length, len(W) is the length of W, len(M) is the length of M, i = len(W)/x is the number of steps of W, j = len(M)/x is the number of steps of M, u and v are natural numbers, mot(Wu, Mv) is the distance value between the u-th segment of W and the v-th segment of M, Dmin is the smallest distance value over all segments, Dmax is the largest distance value over all segments, and Si is the similarity confidence; the formula is as follows:
[formula image for Si not reproduced in the source text]
if Si is greater than or equal to 51%, the new occupational attribute is an existing occupational attribute; its ASCII code value is updated in the database, and the crawler monitoring and scheduling module is notified to broadcast the updated occupational attribute database; if Si is less than 51%, the occupational attribute is a new occupational attribute; its ASCII code value is stored, and the intelligent learning module is notified to perform learning and training;
the database module is used for storing occupational attribute classification information, occupational attribute ASCII code values and machine learning process information;
and the intelligent learning module is used for performing machine learning on the preliminary identification result using a minimum risk regression random prediction method to obtain the classification of the first occupational attribute.
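The segment-wise comparison performed by the preliminary identification module can be sketched as follows. The text does not define the distance function mot, and the formula for Si is not reproduced in it, so this sketch only covers the parts the definitions make explicit: splitting W and M into segments of step length x and finding Dmin and Dmax over all segment pairs. The Hamming distance used for `mot` is a hypothetical stand-in, and dropping a trailing partial segment (treating len(W)/x as integer division) is likewise an assumption.

```python
def split_segments(value: str, x: int) -> list[str]:
    """Split a characteristic value into len(value) // x segments of length x.

    Dropping any trailing partial segment is an assumption.
    """
    if x <= 0:
        raise ValueError("step code length x must be positive")
    count = len(value) // x
    return [value[k * x:(k + 1) * x] for k in range(count)]


def mot(segment_w: str, segment_m: str) -> int:
    """Hypothetical stand-in distance: number of differing characters."""
    return sum(a != b for a, b in zip(segment_w, segment_m))


def distance_extremes(w: str, m: str, x: int) -> tuple[int, int]:
    """Return (Dmin, Dmax) over every pair of segments (Wu, Mv)."""
    w_segments = split_segments(w, x)
    m_segments = split_segments(m, x)
    if not w_segments or not m_segments:
        raise ValueError("both values need at least one full segment")
    distances = [mot(wu, mv) for wu in w_segments for mv in m_segments]
    return min(distances), max(distances)


if __name__ == "__main__":
    print(split_segments("6E75727365", 2))
    print(distance_extremes("4149", "4152", 2))
```

Because every segment of W is compared with every segment of M, the cost grows with i × j per database entry, which is worth keeping in mind when traversing a large occupational attribute database.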
6. The system of claim 5, wherein: the crawler monitoring and scheduling module is used for monitoring the execution state of the edge crawler network computing body module, restarting any edge crawler network computing body module that is abnormal or dead, receiving the analysis data submitted by the edge crawler network computing body module, and broadcasting the crawler algorithm and the occupational attribute database to the edge crawler network computing body module.
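The monitor-and-restart behaviour described for the crawler monitoring and scheduling module can be illustrated with a small in-memory simulation. All names here (`State`, `EdgeCrawler`, `supervise`) are hypothetical, and a real deployment would detect abnormal or dead modules through heartbeats or process checks rather than a stored state field.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    RUNNING = "running"
    ABNORMAL = "abnormal"
    DEAD = "dead"


@dataclass
class EdgeCrawler:
    """Stand-in for one edge crawler network computing body module."""
    name: str
    state: State = State.RUNNING
    restarts: int = 0

    def restart(self) -> None:
        # Simulated restart: count it and mark the module as running again.
        self.restarts += 1
        self.state = State.RUNNING


def supervise(crawlers: list[EdgeCrawler]) -> list[str]:
    """Restart every abnormal or dead crawler; return the restarted names."""
    restarted = []
    for crawler in crawlers:
        if crawler.state in (State.ABNORMAL, State.DEAD):
            crawler.restart()
            restarted.append(crawler.name)
    return restarted


if __name__ == "__main__":
    fleet = [
        EdgeCrawler("edge-a"),
        EdgeCrawler("edge-b", State.DEAD),
        EdgeCrawler("edge-c", State.ABNORMAL),
    ]
    print(supervise(fleet))  # ['edge-b', 'edge-c']
```

Running `supervise` periodically would give the polling form of the supervision loop; healthy modules are left untouched.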
7. The system of claim 5, wherein: identifying whether the first occupational attribute is an existing occupational attribute specifically comprises: comparing the confidence with a preset threshold; if the confidence is greater than or equal to the preset threshold, the first occupational attribute is an existing occupational attribute; otherwise, the first occupational attribute is a new occupational attribute.
8. The system of claim 5, wherein: after the first occupational attribute classification is obtained, it is sent to a client module for display to a user.
9. A computer-readable storage medium having stored thereon a computer program for executing the occupational attribute identification method based on machine learning and edge computing according to any one of claims 1 to 4.
CN202010096332.XA 2020-02-17 2020-02-17 Occupational attribute identification method and system based on machine learning and edge calculation Active CN111221973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010096332.XA CN111221973B (en) 2020-02-17 2020-02-17 Occupational attribute identification method and system based on machine learning and edge calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010096332.XA CN111221973B (en) 2020-02-17 2020-02-17 Occupational attribute identification method and system based on machine learning and edge calculation

Publications (2)

Publication Number Publication Date
CN111221973A CN111221973A (en) 2020-06-02
CN111221973B CN111221973B (en) 2021-07-20

Family

ID=70828287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010096332.XA Active CN111221973B (en) 2020-02-17 2020-02-17 Occupational attribute identification method and system based on machine learning and edge calculation

Country Status (1)

Country Link
CN (1) CN111221973B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107707545A (en) * 2017-09-29 2018-02-16 深信服科技股份有限公司 A kind of abnormal web page access fragment detection method, device, equipment and storage medium
CN109361399A (en) * 2018-10-19 2019-02-19 上海达梦数据库有限公司 A kind of method, apparatus, equipment and storage medium obtaining byte sequence

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3576626A4 (en) * 2017-02-01 2020-12-09 Cerebian Inc. System and method for measuring perceptual experiences
US11347970B2 (en) * 2018-04-30 2022-05-31 International Business Machines Corporation Optimizing machine learning-based, edge computing networks
CN108846547A (en) * 2018-05-06 2018-11-20 成都信息工程大学 A kind of Enterprise Credit Risk Evaluation method of dynamic adjustment
CN110458200A (en) * 2019-07-17 2019-11-15 浙江工业大学 A kind of flower category identification method based on machine learning
CN110300191A (en) * 2019-07-29 2019-10-01 崔翛龙 Service system and data processing method
CN110507315A (en) * 2019-09-26 2019-11-29 杭州电子科技大学 A kind of efficient electrocardiographic diagnosis system

Also Published As

Publication number Publication date
CN111221973A (en) 2020-06-02

Similar Documents

Publication Publication Date Title
CN109635273B (en) Text keyword extraction method, device, equipment and storage medium
CN106777296A (en) Method and system are recommended in a kind of talent's search based on semantic matches
CN110175334B (en) Text knowledge extraction system and method based on custom knowledge slot structure
CN112231447A (en) Method and system for extracting Chinese document events
WO2021243903A1 (en) Method and system for transforming natural language into structured query language
CN107832287A (en) A kind of label identification method and device, storage medium, terminal
CN111274267A (en) Database query method and device and computer readable storage medium
CN115796181A (en) Text relation extraction method for chemical field
CN116089873A (en) Model training method, data classification and classification method, device, equipment and medium
CN115357904A (en) Multi-class vulnerability detection method based on program slice and graph neural network
CN112446209A (en) Method, equipment and device for setting intention label and storage medium
CN114997169B (en) Entity word recognition method and device, electronic equipment and readable storage medium
CN110347827B (en) Event Extraction Method for Heterogeneous Text Operation and Maintenance Data
CN111694961A (en) Keyword semantic classification method and system for sensitive data leakage detection
CN112948573B (en) Text label extraction method, device, equipment and computer storage medium
CN112417996A (en) Information processing method and device for industrial drawing, electronic equipment and storage medium
CN111221973B (en) Occupational attribute identification method and system based on machine learning and edge calculation
CN116610810A (en) Intelligent searching method and system based on regulation and control of cloud knowledge graph blood relationship
CN113177164B (en) Multi-platform collaborative new media content monitoring and management system based on big data
CN116303951A (en) Dialogue processing method, device, electronic equipment and storage medium
CN115858814A (en) Text structured information extraction method based on global pointer decoding method
CN115098687A (en) Alarm checking method and device for scheduling operation of electric power SDH optical transmission system
CN114429140A (en) Case cause identification method and system for causal inference based on related graph information
CN114265931A (en) Big data text mining-based consumer policy perception analysis method and system
CN112883703B (en) Method, device, electronic equipment and storage medium for identifying associated text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant