CN110457696A - An archive-data-oriented intelligent matching system and method for talent and policy
- Publication number
- CN110457696A (application CN201910701445.5A)
- Authority
- CN
- China
- Prior art keywords
- policy
- talent
- talents
- information
- matching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Tourism & Hospitality (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Development Economics (AREA)
- Primary Health Care (AREA)
- Human Resources & Organizations (AREA)
- Educational Administration (AREA)
- Economics (AREA)
- Multimedia (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention discloses an archive-data-oriented intelligent matching system and method for talent and policy. Paper personnel files are scanned and recognized into editable electronic documents to form a personnel archive information database; a conditional random field model based on event extraction extracts the valuable information from the personnel archive information database and stores it in structured form as a talent information database; current talent policies are collected to build a talent policy database; a method for extracting policy semantic elements based on manually defined rules extracts the key information of the talent policies in the talent policy database to obtain quantifiable indicator information; the quantifiable indicator information in the talent policy database is matched against the fields of the talent information database, and the talent-policy matching results for all talent in the talent information database are output; policy projects with a high comprehensive matching score are selected from the matching results, talent who have already applied are excluded, and the policy and project information are pushed to talent who have not yet applied. The invention makes it possible to push policies precisely to talent who have not applied, attracts talent to apply for projects, and supports government talent recruitment.
Description
Technical field
The present invention relates to data processing technology, and in particular to an archive-data-oriented intelligent matching system and method for talent and policy.
Background art
Talent is the most essential element of regional development; outstanding talent is the foundation for achieving regional development goals and the driving force of regional development. To develop, a region must first focus on gathering innovative talent, actively explore new measures, new approaches, and new methods for building its talent pool, continually strengthen that pool, and thereby provide a strong human-resource guarantee for regional development.
Because policies differ in type, release time, and responsible administrative department, considerable time and effort is needed to locate policy sources, verify that a policy is still in force, and assess the feasibility of applying. This makes it hard for talent to learn about policies fully and promptly, and hard for them to pick out, from a massive number of policies, those they are eligible to apply for. How to precisely match massive policy data to talent has therefore become an important research direction.
Talent markets, human resources and social security bureaus, and archives hold massive amounts of talent archive information and thus form a natural talent pool, but two problems remain: how to digitize the paper archives, and how to match talent policies automatically. Against this background, the present invention proposes an archive-data-oriented intelligent matching system and method for talent and policy.
Existing talent-policy matching systems mainly match policies against information that users fill in and upload themselves. For example, patent application No. 201811287900.3 discloses "a policy intelligent matching system and method", which obtains the basic information of a talent user from the user's category selection and filled-in user information and then performs policy matching. The main problem with this scheme is that matching relies on information users upload on their own initiative, so large-scale automatic talent matching cannot be achieved.
Patent No. 201710934706.9 discloses "a matching recommendation method and system based on specific urban populations and associated policies". That patent collects basic information on three groups (urban left-behind families, the urban floating population, and the elderly, weak, sick, and disabled), obtains population demand labels through data mining, and then decomposes and recommends policies based on those labels. It pushes policies according to population demand and does not target the special documents held in archives; it also requires local offices to collect the relevant population information manually, so omissions are inevitable and the push coverage is incomplete.
Summary of the invention
The purpose of the present invention is to provide an archive-data-oriented intelligent matching system and method for talent and policy. Working from local archive data, it uses a conditional random field (CRF) model based on event extraction to extract information from personnel files automatically, collects talent policies automatically, parses the policies with a method for extracting policy semantic elements based on manually defined rules, intelligently matches the policies against the talent information, and pushes policies to talent, yielding a large-scale, automated intelligent matching system for talent and policy.
The technical solution adopted by the present invention is as follows:
An archive-data-oriented intelligent matching method for talent and policy, comprising the following steps:
Step 1: obtain the paper personnel files of local talent, scan them into images, and convert the scanned images into editable electronic documents using image recognition technology, forming a personnel archive information database;
Step 2: from the personnel archive information database, extract the valuable information using a conditional random field model based on event extraction and store it in structured form as a talent information database;
Step 3: collect current talent policies and build a talent policy database;
Step 4: extract the key information of the talent policies in the talent policy database using a method for extracting policy semantic elements based on manually defined rules, obtaining quantifiable indicator information;
Step 5: match the quantifiable indicator information in the talent policy database against the fields of the talent information database to obtain the talent-policy matching results for all talent in the talent information database;
Step 6: from the matching results, select the policy projects with a high comprehensive matching score, exclude talent who have already applied, and push the policy and project information to talent who have not yet applied. For policy projects a talent has not yet applied for, the policy is pushed to the talent by email or SMS, showing the names of the projects the talent can apply for, the economic benefits of applying, and a link to the latest policy notice.
Further, the specific steps of step 1 are:
Step 1.1: scan the paper personnel files into a computer and save them as images;
Step 1.2: divide each image into sub-images using OCR technology, each sub-image containing a single character;
Step 1.3: convert each sub-image from image format to binary format and pass the binary data to a BP neural network;
Step 1.4: through training, the BP neural network learns the association between character image data and numeric values; the scanned images are converted into editable electronic documents, and the recognition results are stored in the system database.
Further, the specific steps by which step 2 performs event extraction on the personnel archive information with the conditional random field model are:
Step 2.1: segment the archive text of the personnel archive information into words and represent each document as a vector of tf-idf weights;
Step 2.2: perform feature selection using a document-frequency-based method, filtering out words that contribute little to correct classification;
Step 2.3: set up an archive text training set, convert it into a feature set, and learn the feature classification from manually annotated labels;
Step 2.4: convert the archive text test set into a feature set of the same type, use the classifier model to determine the corresponding class labels, and store the results in structured form to build the talent information database.
The archive text of step 2.1 generally includes basic personal information, educational background, work experience, and published papers and patents.
Further, the talent policy acquisition modes of step 3 include an import mode and a crawling mode. In the import mode, government agencies or third-party organizations actively edit policy information and store it in the talent policy database. In the crawling mode, web-tracking programs such as robots, web crawlers, and web spiders automatically collect all talent-related policy information on selected target government websites and download it to a local server, where it is organized and then stored in the talent policy database.
Further, step 4 comprises three steps: policy dictionary construction, policy text preprocessing, and policy information extraction. Specifically:
Policy dictionary construction: the vocabulary of the talent policy database is collected under the categories of policy type, policy title, applicable conditions, and keywords to form a talent policy dictionary; the text features and descriptive conventions of the local talent policy corpus are analyzed, and words that identify or mark a semantic element are extracted as the trigger words that trigger the extraction task. For example, descriptions of application conditions commonly contain words such as "satisfy", "meet", and "must".
Policy text preprocessing: the text is segmented and tagged with parts of speech using the Chinese Academy of Sciences ICTCLAS word segmentation tool;
Policy information extraction: extraction rules are formulated from the trigger vocabulary, described with regular expressions, and collected into a rule base; finally, the semantic elements of the local talent policies are extracted and stored in the talent policy database.
Further, step 5 uses a matching method based on semantic matching rules to match the quantifiable indicator information in the talent policy database against the fields of the talent information database, specifically comprising the following steps:
Step 5.1: sort the matching rules from easy to hard, start matching from the easiest indicators, and return the result of each quantifiable indicator matching rule;
Step 5.2: assign each matching rule a different weight, with existing indicators weighted more heavily than indicators that have not been established, and then compute the matching degree between the talent and each policy project. The matching degree has five levels: very well matched (1 point), matched (0.8 points), fairly matched (0.6 points), moderately matched (0.4 points), and not matched (0.2 points). If the talent information satisfies essentially all matching rules of a project, so that the comprehensive matching similarity is close to 1 point, the talent can apply for that project.
Step 5.3: search all policy projects, return all results, and sort them by comprehensive matching degree to form a policy matching list.
Further, the policy matching result in step 5 includes the project name, project type, application conditions, economic benefits of applying, and a link to the latest policy notice.
Further, the invention also discloses an archive-data-oriented intelligent matching system for talent and policy, comprising the following modules:
Personnel file recognition module: scans and recognizes paper personnel files into editable electronic documents and forms the personnel archive information database;
Personnel file feature extraction module: uses the conditional random field model based on event extraction to extract the valuable information from the personnel archive information database and store it in structured form as the talent information database;
Automatic policy collection module: collects current talent policies and builds the talent policy database;
Policy segmentation and parsing module: uses the method for extracting policy semantic elements based on manually defined rules to extract the key information of the talent policies in the talent policy database, obtaining quantifiable indicator information;
Talent and policy matching module: matches the quantifiable indicator information in the talent policy database against the fields of the talent information database and outputs the talent-policy matching results for all talent in the talent information database;
Policy pushing module: selects the policy projects with a high comprehensive matching score from the matching results, excludes talent who have already applied, and pushes the policy and project information to talent who have not yet applied.
With the above technical solution, the present invention builds an automatic policy matching system on the local talent resource pool. Paper talent archive information is recognized with OCR technology based on a BP neural network, relevant policies are automatically matched for local talent, and the policies are pushed to them, which solves the problem that talent policy matching is time-consuming and laborious. Using conditional random field (CRF) technology based on event extraction, the invention automatically and efficiently extracts valuable information from a variety of unstructured personnel files and stores it in structured form, building a talent information database that is easy to search and reuse. The invention uses big-data integration to consolidate the policy data of separate "information islands" and proposes a method for extracting policy semantic elements based on manually defined rules to extract the key information of talent policies and build a talent policy database, which is convenient for matching against talent archive information. Based on the matching results between the talent policy database and the talent information database, the invention filters out talent who have already applied and pushes policies precisely to talent who have not.
Brief description of the drawings
The present invention is described in further detail below with reference to the drawings and specific embodiments;
Fig. 1 is an architecture diagram of the archive-data-oriented intelligent matching system for talent and policy of the present invention;
Fig. 2 is a flow chart of policy semantic element extraction based on manually defined rules.
Detailed description of the embodiments
To make effective use of the paper talent data held in local archives, the present invention uses a conditional random field (CRF) model to extract large-scale personnel archive information automatically, uses a method for extracting policy semantic elements based on manually defined rules to extract the key information of talent policies, and builds an automated intelligent matching system for talent and policy that provides an automatic policy push service for talent. As shown in Fig. 1, the invention discloses an archive-data-oriented intelligent matching system for talent and policy, comprising the following modules:
Personnel file recognition module: scans and recognizes paper personnel files into editable electronic documents and forms the personnel archive information database;
Personnel file feature extraction module: uses the conditional random field model based on event extraction to extract the valuable information from the personnel archive information database and store it in structured form as the talent information database;
Automatic policy collection module: collects current talent policies and builds the talent policy database;
Policy segmentation and parsing module: uses the method for extracting policy semantic elements based on manually defined rules to extract the key information of the talent policies in the talent policy database, obtaining quantifiable indicator information;
Talent and policy matching module: matches the quantifiable indicator information in the talent policy database against the fields of the talent information database and outputs the talent-policy matching results for all talent in the talent information database;
Policy pushing module: selects the policy projects with a high comprehensive matching score from the matching results, excludes talent who have already applied, and pushes the policy and project information to talent who have not yet applied.
As shown in Figs. 1 and 2, the invention discloses an archive-data-oriented intelligent matching method for talent and policy, comprising the following steps:
Step 1: obtain the paper personnel files of local talent, scan them into images, and convert the scanned images into editable electronic documents using image recognition technology, forming a personnel archive information database;
Further, the specific steps of step 1 are:
Step 1.1: obtain the paper personnel files of local talent from talent markets, human resources and social security bureaus, archives, and similar sources, scan them into a computer, and save them as images;
Step 1.2: divide each image into sub-images using OCR technology, each sub-image containing a single character;
Step 1.3: convert each sub-image from image format to binary format and pass the binary data to a BP neural network;
Step 1.4: through training, the BP neural network learns the association between character image data and numeric values; the scanned images are converted into editable electronic documents, and the recognition results are stored in the system database, which facilitates subsequent full-text retrieval.
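Step 1.3 can be illustrated with a minimal Python sketch. It assumes each sub-image arrives as a grid of grayscale values from 0 to 255 and that a fixed threshold of 128 separates ink from paper; neither detail is given in the source, and the function name `binarize` is hypothetical.

```python
from typing import List


def binarize(sub_image: List[List[int]], threshold: int = 128) -> List[int]:
    """Flatten a grayscale character sub-image into a 0/1 input vector.

    Pixels darker than `threshold` are treated as ink (1), all others as
    paper (0). The threshold is an illustrative assumption.
    """
    return [1 if pixel < threshold else 0 for row in sub_image for pixel in row]


# A toy 3x3 sub-image with a dark vertical stroke in the middle column.
glyph = [
    [255, 10, 250],
    [240, 0, 245],
    [255, 20, 255],
]
vector = binarize(glyph)  # this vector would be the input to the BP neural network
```

The flattened vector has one entry per pixel, so the input layer of the network would need as many units as a sub-image has pixels.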
Step 2: for the large volume of unstructured personnel archive information, use the conditional random field model based on event extraction to extract the valuable information and store it in structured form as the talent information database, which facilitates the subsequent matching of talent and policy.
Further, the conditional random field (CRF) model is a classifier model that can be used for named entity recognition. The model treats each kind of information to be extracted as a label on features, thereby converting the information extraction task into a classification problem over a given text and its features. The features used by the CRF model are shown in Table 1.
Table 1. Personnel archive features for the conditional random field (CRF) model
The specific steps by which the present invention performs event extraction on the personnel archive information with the conditional random field model are:
Step 2.1: segment the archive text of the personnel archive information into words and represent each document as a vector of tf-idf weights;
Step 2.2: perform feature selection using a document-frequency-based method, filtering out words that contribute little to correct classification;
Step 2.3: set up an archive text training set, convert it into a feature set, and learn the feature classification from manually annotated labels;
Step 2.4: convert the archive text test set into a feature set of the same type, use the classifier model to determine the corresponding class labels, and store the results in structured form to build the talent information database.
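Steps 2.1 and 2.2 can be sketched in plain Python as follows. The sketch assumes the documents are already segmented into words, uses the unsmoothed weight tf × log(N/df), and drops words that occur in fewer than `min_df` documents; the exact weighting and cutoff are not specified in the source, and the function name is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]], min_df: int = 2) -> List[Dict[str, float]]:
    """Return one {word: tf-idf weight} mapping per segmented document.

    Words appearing in fewer than `min_df` documents are removed first,
    which is a simple document-frequency filter (the cutoff is assumed).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    vocab = {word for word, count in df.items() if count >= min_df}
    vectors = []
    for doc in docs:
        kept = [word for word in doc if word in vocab]
        tf = Counter(kept)
        total = len(kept)
        vectors.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in tf.items()
        })
    return vectors


# Toy segmented archive snippets (illustrative data only).
docs = [
    ["graduated", "university", "2010"],
    ["worked", "university", "engineer"],
    ["published", "paper", "engineer"],
]
vectors = tfidf_vectors(docs)
```

With this toy input only "university" and "engineer" survive the filter, since every other word occurs in a single document.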
Specifically, archive text generally includes basic personal information, educational background, work experience, and published papers and patents. Different content blocks list their information items in different ways: an educational background block may contain the time, school, and major, while a paper publication block may contain the authors, paper title, the journal or conference proceedings, and the publication date. The content blocks of different archive texts may also appear in different orders; most archive texts begin with basic personal information, followed by educational background and work experience, and end with published papers and patents. The information items within each content block are arranged in sequence, so they can be regarded as a series of events, and each event can be expressed as a five-tuple:
E = <who, when, where, what, how>
Text features and annotation labels are constructed from this event representation; the annotation labels of the invention are shown in Table 2. The basic features include the entity word, its part of speech, the context words, and their parts of speech. The annotation labels include name, date of birth, address, telephone, position, event subject, time, place, and results such as education experience, work experience, or published papers. The personnel file event information processed by the conditional random field (CRF) model is stored in structured form to build the talent information database.
Table 2. Personnel file labels for the conditional random field (CRF) model
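The event five-tuple and the basic features described above can be sketched as follows. The field names follow the tuple in the text; the one-token context window, the part-of-speech tag names, and the sample record are illustrative assumptions, and `token_features` is a hypothetical helper rather than part of any CRF library.

```python
from typing import Dict, List, NamedTuple, Tuple


class Event(NamedTuple):
    """The five-tuple E = <who, when, where, what, how> from the text."""
    who: str
    when: str
    where: str
    what: str
    how: str


def token_features(tokens: List[Tuple[str, str]], i: int) -> Dict[str, str]:
    """Basic features for token i: the word, its part of speech, and the
    neighbouring words with their parts of speech (window of 1 assumed)."""
    word, pos = tokens[i]
    features = {"word": word, "pos": pos}
    if i > 0:
        features["prev_word"], features["prev_pos"] = tokens[i - 1]
    if i < len(tokens) - 1:
        features["next_word"], features["next_pos"] = tokens[i + 1]
    return features


# Illustrative (word, part-of-speech) pairs from an education-background block.
tokens = [("2010", "time"), ("Example University", "org"), ("bachelor", "noun")]
features = token_features(tokens, 1)
event = Event(who="Zhang San", when="2010", where="Example University",
              what="education experience", how="bachelor")
```

A feature dictionary of this shape per token, paired with one annotation label per token, is the usual input format for sequence labelling toolkits.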
Step 3: the automatic policy collection module collects current talent policies and builds the talent policy database. The talent policy acquisition modes include an import mode and a crawling mode. In the import mode, government agencies or third-party organizations actively edit policy information and store it in the talent policy database. In the crawling mode, web-tracking programs such as robots, web crawlers, and web spiders automatically collect all talent-related policy information on selected target government websites and download it to a local server, where it is organized and then stored in the talent policy database.
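One small part of the crawling mode, picking talent-related links out of a downloaded page, can be sketched with Python's standard `html.parser` module. The keyword list, the class name `PolicyLinkParser`, and the sample page are assumptions for illustration; fetching pages and storing results are left out.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class PolicyLinkParser(HTMLParser):
    """Collect (href, link text) pairs whose text mentions a talent keyword."""

    def __init__(self, keywords: Tuple[str, ...] = ("talent",)) -> None:
        super().__init__()
        self.keywords = keywords
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Remember the target of the link we are currently inside.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            if any(keyword in text.lower() for keyword in self.keywords):
                self.links.append((self._href, text))
            self._href = None


# Illustrative page fragment; a real crawler would download this from a target site.
page = (
    '<ul><li><a href="/p/1.html">Talent introduction subsidy notice</a></li>'
    '<li><a href="/p/2.html">Road maintenance notice</a></li></ul>'
)
parser = PolicyLinkParser()
parser.feed(page)
```

After `feed`, `parser.links` holds only the links whose anchor text contains a keyword, which could then be queued for download.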
Step 4: the policy segmentation and parsing module uses the method for extracting policy semantic elements based on manually defined rules to extract the key information of the talent policies in the talent policy database, obtaining quantifiable indicator information;
Further, as shown in Fig. 2, step 4 comprises three steps: policy dictionary construction, policy text preprocessing, and policy information extraction. Specifically:
Policy dictionary construction: to improve the quality of policy word segmentation, a talent policy dictionary and an information extraction trigger vocabulary need to be built. The talent policy dictionary is built from the local talent policy corpus, whose vocabulary is collected under the categories of policy type, policy title, applicable conditions, and keywords. A trigger word is a word that identifies or marks a particular semantic element and can trigger the extraction task. The trigger vocabulary is built by analyzing the text features and descriptive conventions of the local talent policy corpus. For example, descriptions of application conditions commonly contain words such as "satisfy", "meet", and "must".
Policy text preprocessing: the text preprocessing module mainly performs sentence splitting, word segmentation, and part-of-speech tagging. The titles and body text of the local talent policy corpus serve as the experimental corpus, which is segmented and tagged with parts of speech using the Chinese Academy of Sciences ICTCLAS word segmentation tool.
Policy information extraction: based on an analysis of the corpus to be extracted, extraction rules are formulated from the trigger vocabulary, described with regular expressions, and collected into a rule base; finally, the semantic elements of the local talent policies are extracted. All policy information is normalized by project name, project type, application conditions, economic benefits of applying, and policy link, and then imported into the talent policy database. Feature-word extraction methods from text mining are applied to the application conditions in the talent policy database for further semantic mining, decomposing them into quantifiable, measurable indicators that are stored in the talent policy database.
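A trigger-word rule base of this kind can be sketched with Python's `re` module. The two rules, the English trigger words, and the field names `paper_count` and `age` are hypothetical; a real rule base would be written against the Chinese policy corpus after segmentation.

```python
import re
from typing import List, Tuple

# Hypothetical rule base: (regular expression, indicator field, comparison operator).
RULES = [
    (re.compile(r"at least (\d+) papers?"), "paper_count", ">="),
    (re.compile(r"under (\d+) years of age"), "age", "<"),
]
# Trigger words that mark a sentence as an application condition (assumed list).
TRIGGER_WORDS = ("must", "meet", "satisfy")


def extract_indicators(sentence: str) -> List[Tuple[str, str, int]]:
    """Apply the rule base to a sentence that contains a trigger word."""
    text = sentence.lower()
    if not any(trigger in text for trigger in TRIGGER_WORDS):
        return []  # no trigger word, so no extraction task is triggered
    indicators = []
    for pattern, field, op in RULES:
        for match in pattern.finditer(text):
            indicators.append((field, op, int(match.group(1))))
    return indicators


condition = "Applicants must be under 40 years of age and have published at least 5 papers."
indicators = extract_indicators(condition)
```

Each extracted triple is a quantifiable indicator: a field name, a comparison operator, and a threshold that can later be compared against the corresponding field of a talent record.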
Step 5: the talent and policy matching module matches the quantifiable indicator information in the talent policy database against the fields of the talent information database to obtain the talent-policy matching results for all talent in the talent information database. The policy matching result includes the project name, project type, application conditions, economic benefits of applying, and a link to the latest policy notice.
Further, step 5 uses a matching method based on semantic matching rules to match the quantifiable indicator information in the talent policy database against the fields of the talent information database; for example, "has published at least 5 papers" is equivalent to "number of papers >= 5". The method specifically comprises the following steps:
Step 5.1: The matching rules are sorted from easy to difficult by matching complexity, and matching starts with the easiest indicators. Indicators that already exist as fields, such as education, age, and number of published papers, are matched first; indicators that have not yet been established are then matched by keyword. The matching result of each quantifiable indicator rule is returned.
Step 5.2: Each matching rule is assigned its own weight, and existing indicators are weighted more heavily than indicators that have not yet been established. A comprehensive matching degree formula is then constructed from the matching results of the quantifiable indicator rules to measure how well a talent matches a policy category. The matching degree has 5 levels: very good match (1 point), good match (0.8 points), fair match (0.6 points), marginal match (0.4 points), and mismatch (0.2 points). If the talent's information satisfies essentially all matching rules of a project, the comprehensive matching degree is close to 1 point, which indicates that the talent can apply for the project.
Step 5.3: After all policy categories have been retrieved, the policy match list is returned automatically, sorted by comprehensive matching degree. Each entry in the list includes the project name, subject type, project application conditions, application benefits, and a link to the latest policy notice.
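Steps 5.1 to 5.3 can be sketched as follows. The source does not give the comprehensive matching degree formula, so this sketch assumes the score is the weighted share of satisfied rules and assumes cut-offs (0.9, 0.7, 0.5, 0.3) for mapping a score onto the five levels. The rule fields (`difficulty`, `weight`, `keyword`), the function names, and the sample project names are hypothetical.

```python
import operator

COMPARATORS = {">=": operator.ge, "<=": operator.le}

# Assumed cut-offs mapping a composite score onto the five levels of Step 5.2.
LEVELS = [(0.9, "very good match"), (0.7, "good match"),
          (0.5, "fair match"), (0.3, "marginal match")]


def evaluate_rules(talent, rules):
    """Step 5.1: evaluate rules from easiest to hardest; return (rule, satisfied) pairs."""
    results = []
    for rule in sorted(rules, key=lambda r: r["difficulty"]):
        if "field" in rule:  # existing indicator: compare a structured field
            value = talent.get(rule["field"])
            ok = value is not None and COMPARATORS[rule["op"]](value, rule["threshold"])
        else:  # indicator not yet established: fall back to a keyword search
            ok = rule["keyword"].lower() in talent.get("profile_text", "").lower()
        results.append((rule, ok))
    return results


def composite_score(results):
    """Step 5.2 (assumed formula): weighted share of satisfied rules, from 0 to 1."""
    total = sum(rule["weight"] for rule, _ in results)
    if total == 0:
        return 0.0
    return sum(rule["weight"] for rule, ok in results if ok) / total


def level(score):
    """Map a composite score onto one of the five matching levels."""
    for cutoff, label in LEVELS:
        if score >= cutoff:
            return label
    return "mismatch"


def rank_policies(talent, policies):
    """Step 5.3: score every policy and sort by composite score, highest first."""
    rows = []
    for policy in policies:
        score = composite_score(evaluate_rules(talent, policy["rules"]))
        rows.append({"project_name": policy["project_name"],
                     "score": score, "level": level(score)})
    return sorted(rows, key=lambda row: row["score"], reverse=True)


talent = {"age": 38, "paper_count": 7,
          "profile_text": "Led a provincial key laboratory project."}
policies = [
    {"project_name": "Young Scholar Grant", "rules": [
        {"field": "age", "op": "<=", "threshold": 35, "weight": 3, "difficulty": 1},
        {"field": "paper_count", "op": ">=", "threshold": 5, "weight": 3, "difficulty": 1},
    ]},
    {"project_name": "Research Leader Award", "rules": [
        {"keyword": "key laboratory", "weight": 2, "difficulty": 2},
        {"field": "paper_count", "op": ">=", "threshold": 5, "weight": 3, "difficulty": 1},
        {"field": "age", "op": "<=", "threshold": 45, "weight": 3, "difficulty": 1},
    ]},
]
for row in rank_policies(talent, policies):
    print(row)
```

Field-based rules carry larger weights than the keyword rule here, mirroring the requirement that existing indicators outweigh indicators not yet established. A real list entry would also carry the subject type, application conditions, benefits, and notice link.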
Step 6: From the talent-policy matching results returned for all talent in the talent information base, the policy pushing module screens out the policy categories with high comprehensive matching degree scores and compares them with the information base of talent who have already applied for projects. For policy categories the talent has not yet registered for, the policy is pushed to the talent by e-mail or SMS, showing the names of the projects the talent can apply for, the project application benefits, and a link to the latest policy notice.
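The screening and filtering in Step 6 can be sketched as follows. The score threshold of 0.8, the record layout, the function names, and the example URL are assumptions made for illustration; the source only states that high-scoring policies are kept and already declared projects are excluded. Actual e-mail or SMS delivery is left out.

```python
def select_pushes(match_results, declared, threshold=0.8):
    """Keep high-scoring matches for projects the talent has not already applied for.

    `declared` is a set of (talent_id, project_name) pairs taken from the
    information base of talent who have already applied.
    """
    pushes = []
    for result in match_results:
        key = (result["talent_id"], result["project_name"])
        if result["score"] >= threshold and key not in declared:
            pushes.append(result)
    return pushes


def format_message(result):
    """Build the notification text shown to the talent."""
    return (
        f"You may be eligible for {result['project_name']}. "
        f"Benefits: {result['benefits']}. Latest notice: {result['notice_link']}"
    )


# Hypothetical sample data.
match_results = [
    {"talent_id": 1, "project_name": "Research Leader Award", "score": 1.0,
     "benefits": "research grant", "notice_link": "https://example.org/notice/1"},
    {"talent_id": 1, "project_name": "Young Scholar Grant", "score": 0.5,
     "benefits": "housing subsidy", "notice_link": "https://example.org/notice/2"},
    {"talent_id": 2, "project_name": "Research Leader Award", "score": 0.9,
     "benefits": "research grant", "notice_link": "https://example.org/notice/1"},
]
declared = {(2, "Research Leader Award")}

for push in select_pushes(match_results, declared):
    print(push["talent_id"], format_message(push))
```

Keeping the selection separate from message formatting lets the same filtered list feed either delivery channel.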
By adopting the above technical scheme, the invention establishes an automatic policy matching system based on a local talent resource base. Paper talent archives are recognized by OCR based on a BP neural network, relevant policies are matched automatically for local talent, and the matched policies are pushed to them, which solves the problem that talent-policy matching is time-consuming and laborious. Using conditional random field (CRF) technology based on event extraction, the invention automatically and efficiently extracts valuable information from a variety of unstructured personnel archives and stores it in structured form to build the talent information base, so that it is easy to look up and reuse. The invention uses big-data integration to bring together policy data from separate "information islands", and proposes a policy semantic unit extraction method based on manually written rules that extracts the key information of talent policies and builds the talent policy database, which makes matching against talent archive information convenient. Based on the matching results between the talent policy database and the talent information base, the invention filters out talent who have already applied and pushes policies precisely to talent who have not.
Claims (10)
1. An intelligent talent-policy matching method oriented to archive data, characterized in that it comprises the following steps:
Step 1: obtaining the paper personnel archives of local talent, scanning them into images, and converting the scanned images into editable electronic documents using image recognition technology to form a personnel archive information base;
Step 2: based on the personnel archive information base, extracting valuable information using a conditional random field model based on event extraction and storing it in structured form to form a talent information base;
Step 3: collecting current talent policies and constructing a talent policy database;
Step 4: extracting the key talent policy information in the talent policy database using a policy semantic unit extraction method based on manually written rules to obtain quantifiable indicator information;
Step 5: matching the quantifiable indicator information in the talent policy database against the fields in the talent information base to obtain the talent-policy matching results for all talent in the talent information base;
Step 6: screening the policy categories with high comprehensive matching degree scores from the matching results, removing talent who have already applied, pushing the policies to talent who have not registered, and showing the project information.
2. The intelligent talent-policy matching method oriented to archive data according to claim 1, characterized in that the specific steps of Step 1 are:
Step 1.1: entering the paper personnel archives into a computer by scanning and saving them as images;
Step 1.2: dividing each image into several sub-images using OCR technology, each sub-image containing a single character;
Step 1.3: converting the sub-images from image format into binary format and transmitting the binary data to a BP neural network;
Step 1.4: the BP neural network learning, through a training process, the association between character image data and numerical values, converting the scanned images into editable electronic documents, and entering the recognition results into the system database.
3. The intelligent talent-policy matching method oriented to archive data according to claim 1, characterized in that the specific steps by which Step 2 performs event extraction on the personnel archive information with the conditional random field model are:
Step 2.1: performing word segmentation on the archive text of the personnel archive information and representing each document as a tf/idf-weighted vector;
Step 2.2: performing feature extraction using a method based on document frequency, filtering out words that contribute little to correct classification;
Step 2.3: setting up an archive text training set, converting the archive text training set into a feature set, and learning the feature classification from manually annotated labels;
Step 2.4: converting the archive text test set into a feature set of the same type, obtaining the corresponding classification labels through the classifier model, and storing the results in structured form to establish the talent information base.
4. The intelligent talent-policy matching method oriented to archive data according to claim 3, characterized in that the archive text of Step 2.1 generally includes personal basic information, educational background, work experience, and published papers and patents.
5. The intelligent talent-policy matching method oriented to archive data according to claim 1, characterized in that the talent policy acquisition modes of Step 3 include an import mode and a crawling mode; in the import mode, a government agency or third-party organization actively edits policy information and stores it in the talent policy database; in the crawling mode, web-tracking programs such as robots, web crawlers, and web spiders automatically collect all talent-related policy information from selected target government websites and download it to a local server, where it is organized and then stored in the talent policy database.
6. The intelligent talent-policy matching method oriented to archive data according to claim 1, characterized in that Step 4 specifically comprises three steps, namely policy dictionary construction, policy text preprocessing, and policy information extraction, specifically:
Policy dictionary construction: the policy dictionary comprises a talent policy dictionary and trigger words; the talent policy dictionary contains the vocabulary of the talent policy database, classified by policy type, policy title, applicable conditions, and keywords; trigger words are used to trigger extraction tasks, a trigger word being a word that identifies or marks a semantic unit, and the trigger words are obtained by analyzing the text features and descriptive conventions of the local talent policy corpus;
Policy text preprocessing: performing word segmentation and part-of-speech tagging using the Chinese Academy of Sciences ICTCLAS word segmentation tool;
Policy information extraction: formulating extraction rules based on the trigger vocabulary, describing the extraction rules with regular expressions and establishing a rule base, and finally extracting the semantic units of local talent policies and storing them in the talent policy database.
7. The intelligent talent-policy matching method oriented to archive data according to claim 1, characterized in that Step 5 uses a matching method based on semantic matching rules to match the quantifiable indicator information in the talent policy database against the fields in the talent information base, specifically comprising the following steps:
Step 5.1: sorting the matching rules from easy to difficult, starting the matching with the easiest indicators, and returning the matching result of each quantifiable indicator rule;
Step 5.2: assigning each matching rule its own weight, with existing indicators weighted more heavily than indicators that have not yet been established, and then calculating the matching degree between the talent and the policy category;
Step 5.3: retrieving all policy categories, returning all results, and sorting them by comprehensive matching degree to form the policy match list.
8. The intelligent talent-policy matching method oriented to archive data according to claim 1, characterized in that the policy matching result of Step 5 includes the project name, subject type, project application conditions, application benefits, and a link to the latest policy notice.
9. The intelligent talent-policy matching method oriented to archive data according to claim 1, characterized in that in Step 6, for policy categories the talent has not registered for, the policy is pushed to the talent by e-mail or SMS, showing the names of the projects that can be applied for, the project application benefits, and a link to the latest policy notice.
10. An intelligent talent-policy matching system oriented to archive data, comprising the following modules:
Personnel archive recognition module: for scanning paper personnel archives, recognizing them as editable electronic documents, and forming a personnel archive information base;
Personnel archive feature extraction module: for extracting valuable information from the personnel archive information base using a conditional random field model based on event extraction and storing it in structured form to form a talent information base;
Automatic policy collection module: for collecting current talent policies and constructing a talent policy database;
Policy segmentation and parsing module: for extracting the key talent policy information in the talent policy database using a policy semantic unit extraction method based on manually written rules to obtain quantifiable indicator information;
Talent and policy matching module: for matching the quantifiable indicator information in the talent policy database against the fields in the talent information base and outputting the talent-policy matching results for all talent in the talent information base;
Policy pushing module: for screening the policy categories with high comprehensive matching degree scores from the matching results, removing talent who have already applied, pushing the policies to talent who have not registered, and showing the project information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910701445.5A CN110457696A (en) | 2019-07-31 | 2019-07-31 | A kind of talent towards file data and policy intelligent Matching system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910701445.5A CN110457696A (en) | 2019-07-31 | 2019-07-31 | A kind of talent towards file data and policy intelligent Matching system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110457696A true CN110457696A (en) | 2019-11-15 |
Family
ID=68484267
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910701445.5A Pending CN110457696A (en) | 2019-07-31 | 2019-07-31 | A kind of talent towards file data and policy intelligent Matching system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110457696A (en) |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110852722A (en) * | 2019-11-18 | 2020-02-28 | 江苏苏伦大数据科技研究院有限公司 | Information matching system for introducing high-level talents |
CN111428037A (en) * | 2020-03-24 | 2020-07-17 | 合肥科捷通科技信息服务有限公司 | Method for analyzing matching performance of behavior policy |
CN111652524A (en) * | 2020-06-11 | 2020-09-11 | 中力数创(重庆)科技有限公司 | Method and device for intelligently matching policy and guiding improvement path |
CN111931031A (en) * | 2020-08-19 | 2020-11-13 | 太仓中科信息技术研究院 | Method for calculating policy information matching degree |
CN112036841A (en) * | 2020-09-18 | 2020-12-04 | 重庆强大知识产权服务有限公司 | Policy analysis system and method based on intelligent semantic recognition |
CN112035653A (en) * | 2020-11-05 | 2020-12-04 | 北京智源人工智能研究院 | Policy key information extraction method and device, storage medium and electronic equipment |
CN112036842A (en) * | 2020-09-18 | 2020-12-04 | 重庆强大知识产权服务有限公司 | Intelligent matching platform for scientific and technological services |
CN112184525A (en) * | 2020-09-28 | 2021-01-05 | 上海市浦东新区行政服务中心(上海市浦东新区市民中心) | System and method for realizing intelligent matching recommendation through natural semantic analysis |
CN112258144A (en) * | 2020-09-27 | 2021-01-22 | 重庆生产力促进中心 | Policy file information matching and pushing method based on automatic construction of target entity set |
CN112380264A (en) * | 2020-11-23 | 2021-02-19 | 政和科技股份有限公司 | Policy analysis and matching method and device based on personal full life cycle |
CN112765338A (en) * | 2020-12-30 | 2021-05-07 | 江苏风云科技服务有限公司 | Policy data pushing method, policy calculator and computer equipment |
CN112765441A (en) * | 2021-04-07 | 2021-05-07 | 北京零号窗网络信息技术有限公司 | Enterprise policy information multiple dynamic intelligent matching recommendation method for digital government affairs |
CN112989195A (en) * | 2021-03-20 | 2021-06-18 | 重庆图强工程技术咨询有限公司 | Big data based whole process consultation method and device, electronic equipment and storage medium |
CN113191436A (en) * | 2021-05-07 | 2021-07-30 | 广州博士信息技术研究院有限公司 | Talent image tag identification method and system and cloud platform |
CN113268573A (en) * | 2021-05-19 | 2021-08-17 | 上海博亦信息科技有限公司 | Extraction method of academic talent information |
CN113537927A (en) * | 2021-06-28 | 2021-10-22 | 北京航空航天大学 | Scientific and technological resource service platform transaction coordination system and method |
CN113590584A (en) * | 2021-07-23 | 2021-11-02 | 无锡海创智慧谷科技有限公司 | Talent base construction method based on big data |
CN113609836A (en) * | 2021-09-29 | 2021-11-05 | 深圳市指南针医疗科技有限公司 | Medical policy full definition analysis system and method |
CN114495145A (en) * | 2022-02-16 | 2022-05-13 | 平安国际智慧城市科技股份有限公司 | Policy document number extraction method, device, equipment and storage medium |
CN115587786A (en) * | 2022-08-31 | 2023-01-10 | 广州市弋迦信息科技有限公司 | Talent information management system and method and talent information management platform |
CN115630080A (en) * | 2022-10-26 | 2023-01-20 | 深圳市纵横云数信息科技有限公司 | Guided talent policy welfare calculation method and device |
CN116483940A (en) * | 2023-04-26 | 2023-07-25 | 深圳市国房云数据技术服务有限公司 | Method for extracting and structuring data of whole-flow type document |
CN116681261A (en) * | 2023-07-27 | 2023-09-01 | 山东创亿智慧信息科技发展有限责任公司 | Intelligent archive management control system |
CN116956130A (en) * | 2023-07-25 | 2023-10-27 | 北京安联通科技有限公司 | Intelligent data processing method and system based on associated feature carding model |
CN116992035A (en) * | 2023-09-27 | 2023-11-03 | 湖南正宇软件技术开发有限公司 | Intelligent classification method, device, computer equipment and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080162513A1 (en) * | 2006-12-28 | 2008-07-03 | Pitney Bowes Incorporated | Universal address parsing system and method |
WO2013013283A1 (en) * | 2011-07-28 | 2013-01-31 | Wairever Inc. | Method and system for validation of claims against policy with contextualized semantic interoperability |
CN106021553A (en) * | 2016-05-30 | 2016-10-12 | 深圳市华傲数据技术有限公司 | Structuralized data matching method and system |
CN106447298A (en) * | 2016-09-30 | 2017-02-22 | 深圳市华傲数据技术有限公司 | Information processing system and method based on talent service system |
CN108764835A (en) * | 2018-05-24 | 2018-11-06 | 广州合摩计算机科技有限公司 | Reverse talent's pushed information method and apparatus |
CN109408683A (en) * | 2018-10-31 | 2019-03-01 | 广州高企云信息科技有限公司 | A kind of policy intelligent Matching system and method |
-
2019
- 2019-07-31 CN CN201910701445.5A patent/CN110457696A/en active Pending
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110852722A (en) * | 2019-11-18 | 2020-02-28 | 江苏苏伦大数据科技研究院有限公司 | Information matching system for introducing high-level talents |
CN111428037A (en) * | 2020-03-24 | 2020-07-17 | 合肥科捷通科技信息服务有限公司 | Method for analyzing matching performance of behavior policy |
CN111428037B (en) * | 2020-03-24 | 2022-09-20 | 合肥科捷通科技信息服务有限公司 | Method for analyzing matching performance of behavior policy |
CN111652524A (en) * | 2020-06-11 | 2020-09-11 | 中力数创(重庆)科技有限公司 | Method and device for intelligently matching policy and guiding improvement path |
CN111931031A (en) * | 2020-08-19 | 2020-11-13 | 太仓中科信息技术研究院 | Method for calculating policy information matching degree |
CN112036841A (en) * | 2020-09-18 | 2020-12-04 | 重庆强大知识产权服务有限公司 | Policy analysis system and method based on intelligent semantic recognition |
CN112036842A (en) * | 2020-09-18 | 2020-12-04 | 重庆强大知识产权服务有限公司 | Intelligent matching platform for scientific and technological services |
CN112036842B (en) * | 2020-09-18 | 2023-08-08 | 重庆强大知识产权服务有限公司 | Intelligent matching device for scientific and technological service |
CN112258144B (en) * | 2020-09-27 | 2022-04-26 | 重庆生产力促进中心 | Policy file information matching and pushing method based on automatic construction of target entity set |
CN112258144A (en) * | 2020-09-27 | 2021-01-22 | 重庆生产力促进中心 | Policy file information matching and pushing method based on automatic construction of target entity set |
CN112184525A (en) * | 2020-09-28 | 2021-01-05 | 上海市浦东新区行政服务中心(上海市浦东新区市民中心) | System and method for realizing intelligent matching recommendation through natural semantic analysis |
CN112035653A (en) * | 2020-11-05 | 2020-12-04 | 北京智源人工智能研究院 | Policy key information extraction method and device, storage medium and electronic equipment |
CN112380264A (en) * | 2020-11-23 | 2021-02-19 | 政和科技股份有限公司 | Policy analysis and matching method and device based on personal full life cycle |
CN112765338A (en) * | 2020-12-30 | 2021-05-07 | 江苏风云科技服务有限公司 | Policy data pushing method, policy calculator and computer equipment |
CN112989195A (en) * | 2021-03-20 | 2021-06-18 | 重庆图强工程技术咨询有限公司 | Big data based whole process consultation method and device, electronic equipment and storage medium |
CN112989195B (en) * | 2021-03-20 | 2023-09-05 | 重庆图强工程技术咨询有限公司 | Whole-process consultation method and device based on big data, electronic equipment and storage medium |
CN112765441A (en) * | 2021-04-07 | 2021-05-07 | 北京零号窗网络信息技术有限公司 | Enterprise policy information multiple dynamic intelligent matching recommendation method for digital government affairs |
CN112765441B (en) * | 2021-04-07 | 2021-11-02 | 北京零号窗网络信息技术有限公司 | Enterprise policy information multiple dynamic intelligent matching recommendation method for digital government affairs |
CN113191436A (en) * | 2021-05-07 | 2021-07-30 | 广州博士信息技术研究院有限公司 | Talent image tag identification method and system and cloud platform |
CN113268573A (en) * | 2021-05-19 | 2021-08-17 | 上海博亦信息科技有限公司 | Extraction method of academic talent information |
CN113537927A (en) * | 2021-06-28 | 2021-10-22 | 北京航空航天大学 | Scientific and technological resource service platform transaction coordination system and method |
CN113537927B (en) * | 2021-06-28 | 2024-06-07 | 北京航空航天大学 | Transaction collaboration system and method for scientific and technological resource service platform |
CN113590584A (en) * | 2021-07-23 | 2021-11-02 | 无锡海创智慧谷科技有限公司 | Talent base construction method based on big data |
CN113609836A (en) * | 2021-09-29 | 2021-11-05 | 深圳市指南针医疗科技有限公司 | Medical policy full definition analysis system and method |
CN113609836B (en) * | 2021-09-29 | 2022-01-28 | 深圳市指南针医疗科技有限公司 | Medical policy full definition analysis system and method |
CN114495145A (en) * | 2022-02-16 | 2022-05-13 | 平安国际智慧城市科技股份有限公司 | Policy document number extraction method, device, equipment and storage medium |
CN114495145B (en) * | 2022-02-16 | 2024-05-28 | 平安国际智慧城市科技股份有限公司 | Policy and document extraction method, device, equipment and storage medium |
CN115587786A (en) * | 2022-08-31 | 2023-01-10 | 广州市弋迦信息科技有限公司 | Talent information management system and method and talent information management platform |
CN115630080B (en) * | 2022-10-26 | 2023-08-04 | 深圳市纵横云数信息科技有限公司 | Guided talent policy welfare calculation method and device |
CN115630080A (en) * | 2022-10-26 | 2023-01-20 | 深圳市纵横云数信息科技有限公司 | Guided talent policy welfare calculation method and device |
CN116483940A (en) * | 2023-04-26 | 2023-07-25 | 深圳市国房云数据技术服务有限公司 | Method for extracting and structuring data of whole-flow type document |
CN116956130A (en) * | 2023-07-25 | 2023-10-27 | 北京安联通科技有限公司 | Intelligent data processing method and system based on associated feature carding model |
CN116681261A (en) * | 2023-07-27 | 2023-09-01 | 山东创亿智慧信息科技发展有限责任公司 | Intelligent archive management control system |
CN116681261B (en) * | 2023-07-27 | 2023-10-17 | 山东创亿智慧信息科技发展有限责任公司 | Intelligent archive management control system |
CN116992035A (en) * | 2023-09-27 | 2023-11-03 | 湖南正宇软件技术开发有限公司 | Intelligent classification method, device, computer equipment and medium |
CN116992035B (en) * | 2023-09-27 | 2023-12-08 | 湖南正宇软件技术开发有限公司 | Intelligent classification method, device, computer equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110457696A (en) | A kind of talent towards file data and policy intelligent Matching system and method | |
CN110427623B (en) | Semi-structured document knowledge extraction method and device, electronic equipment and storage medium | |
CN104820629B (en) | A kind of intelligent public sentiment accident emergent treatment system and method | |
CN110968684B (en) | Information processing method, device, equipment and storage medium | |
TWI424325B (en) | Systems and methods for organizing collective social intelligence information using an organic object data model | |
CN111309936A (en) | Method for constructing portrait of movie user | |
CN115526590B (en) | Efficient person post matching and re-pushing method combining expert knowledge and algorithm | |
Oppong et al. | Business decision support system based on sentiment analysis | |
Nawaz et al. | Mining public opinion: a sentiment based forecasting for democratic elections of Pakistan | |
CN116304035A (en) | Multi-notice multi-crime name relation extraction method and device in complex case | |
CN116186422A (en) | Disease-related public opinion analysis system based on social media and artificial intelligence | |
Sandhu et al. | Enhanced Text Mining Approach for Better Ranking System of Customer Reviews | |
Tallapragada et al. | Improved Resume Parsing based on Contextual Meaning Extraction using BERT | |
CN116562785B (en) | Auditing and welcome system | |
Gajanayake et al. | Candidate selection for the interview using github profile and user analysis for the position of software engineer | |
Barale et al. | Automated refugee case analysis: An nlp pipeline for supporting legal practitioners | |
Luo et al. | Towards combining web classification and web information extraction: a case study | |
Jiang et al. | ChouBERT: Pre-training french language model for crowdsensing with tweets in phytosanitary context | |
Phoan et al. | Sentiment Analysis Of Comments On Sexual Harassment In Colleges On Four Popular Social Media | |
CN114358015B (en) | Topic identification method and topic evolution path construction method based on semantic information | |
Tijare et al. | A smart resume screening tool for customized shortlisting | |
CN118093881B (en) | Audit object portrait modeling method and system based on knowledge graph | |
CN113220850B (en) | Case image mining method for court trial and reading | |
Al-augby et al. | USING RULE TEXT MINING BASED ALGORITHM TO SUPPORT THE STOCK MARKET INVESTMENT DECISION. | |
KR102671618B1 (en) | Method and system for providing user-customized interview feedback for educational purposes based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191115 |