CN106776575A - System and method for real-time semantic search of job opportunities - Google Patents

System and method for real-time semantic search of job opportunities

Info

Publication number
CN106776575A
Authority
CN
China
Prior art keywords
real
information
semantic
time
opportunity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611239045.XA
Other languages
Chinese (zh)
Inventor
周宝舟
赵泛舟
钟永生
卢奕
张有聪
周赖靖竞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ipin Information Technology Co Ltd
Original Assignee
Shenzhen Ipin Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ipin Information Technology Co Ltd
Priority to CN201611239045.XA
Publication of CN106776575A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a system and method for real-time semantic search of job opportunities. Recruitment information from the various recruitment portals is collected and aggregated dynamically in real time, analyzed and processed in real time, and indexed. For each search request, the system performs semantic analysis, computes a semantic matching degree against every job opportunity, sorts the results by matching degree, and returns the job opportunities that best match the user's needs. This greatly reduces screening cost and improves the efficiency of the user's job search.

Description

System and method for real-time semantic search of job opportunities
Technical Field
The present invention relates to a job opportunity search engine system, and more particularly to a system and method for real-time semantic search of job opportunities.
Background
The core technique of real-time semantic job search is to build a semantic index over job opportunities in real time. A user uploads a resume or fills in work experience; the system performs semantic analysis on that input, computes semantic relevance against the massive body of recruitment information, and returns the job opportunities that best match the user.
Existing recruitment websites such as 51job, Zhaopin, and 58.com all use traditional search engine technology and search for job opportunities by keyword matching. This is implemented by segmenting the text of each recruitment posting into words and building an inverted index. When a user submits a query, the query text is segmented into terms, the posting list of each term is fetched from the inverted index, and the lists are merged to obtain the job opportunities that satisfy the query. Existing job search engines based on keyword matching have the following drawbacks:
1) Low recall (incomplete): Natural language is expressed in many ways, so the same position can be described by many similar phrases. A traditional job search engine cannot understand the semantics of the text and can only match keywords strictly. Many postings for the same position are therefore not recalled because they are worded differently, and the user misses many job opportunities.
2) Low precision (inaccurate): A traditional job search engine receives very limited input. For example, the query "Java development engineer" can hit thousands of job opportunities, so the user must do a large amount of manual screening, which is time-consuming and laborious and makes it easy to miss highly relevant opportunities.
3) Insufficient computing power (not fast): Traditional search engines process data on CPUs. Because CPU computing power is limited, some long posting lists are truncated during query matching, so a query returns only partially matched results, which hurts both recall and precision.
4) Scattered data: Each of the major recruitment platforms implements and maintains its own job search engine. Users have to search for job opportunities on several platforms separately, which makes job hunting inefficient and costly in time.
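The keyword-matching approach and its recall problem can be illustrated with a minimal Python sketch. This is not the implementation of any of the websites named above. The whitespace tokenizer stands in for Chinese word segmentation, and the function names and sample postings are assumptions made for illustration only.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: whitespace splitting stands in for word segmentation.
    return text.lower().split()


def build_inverted_index(postings):
    """Map each term to the set of posting ids whose title contains it."""
    index = defaultdict(set)
    for posting_id, title in postings.items():
        for term in tokenize(title):
            index[term].add(posting_id)
    return index


def keyword_search(index, query):
    """Intersect the posting lists of all query terms (strict keyword match)."""
    term_lists = [index.get(term, set()) for term in tokenize(query)]
    if not term_lists:
        return set()
    return set.intersection(*term_lists)


# Hypothetical postings; ids 1 and 2 describe essentially the same role.
postings = {
    1: "front-end engineer",
    2: "web development engineer",
    3: "java development engineer",
}
index = build_inverted_index(postings)
result = keyword_search(index, "front-end engineer")
```

Here `result` contains only posting 1. Posting 2 describes a closely related position but shares no "front-end" term with the query, so strict keyword matching does not recall it, which is the first drawback listed above.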
Summary of the Invention
The present invention aims to solve at least one of the technical problems existing in the prior art.
To this end, an object of the present invention is to provide a semantics-based real-time job opportunity search engine that collects and aggregates the recruitment information of the various recruitment portals in real time, analyzes and processes it in real time, and obtains the job opportunities that best match the user's needs through semantic analysis and matching. By combining semantic matching with GPU parallel computing, the invention aims to be "faster, more accurate, and more complete", and substantially improves the efficiency of the user's job search.
To achieve the above object, the invention provides a system for real-time semantic search of job opportunities. The system includes a data analysis layer and a service layer.
The data analysis layer dynamically detects and obtains the job opportunities of each recruitment channel and performs real-time data analysis and processing on them. It includes a Spider cluster module, a real-time data stream processing cluster module, and a structured data DB cluster module. The Spider cluster module collects recruitment information in real time from recruitment pages across the whole web and stores it. The real-time data stream processing cluster module processes the information stored by the Spider cluster module in real time, including information extraction, quantization, standardization, and vectorization. The structured data DB cluster module stores the data processed by the real-time data stream processing cluster module in a DB cluster, where it serves as the raw data for all online modules.
The service layer analyzes and processes the user's query information, matches it in real time against the recruitment information stored in the DB cluster of the data analysis layer, and returns the job opportunities that best match the user. It includes a query gateway, a semantic retrieval service module, and a KVDB module. The query gateway provides the external query interface, preprocesses the query information entered by the user (information extraction, quantization, vectorization), calls the semantic retrieval service module to obtain a list of job opportunity ids, and finally queries the KVDB module to obtain the complete job opportunity information. The semantic retrieval service module screens and matches against the query submitted by the user, computes the matching degree, and returns the list of matching job opportunity ids. The KVDB module stores the detailed information corresponding to each job opportunity for display on the front-end page.
More specifically, the Spider cluster module analyzes the page update history of each channel and each recruitment list page of every recruitment website, estimates the update cycle of each recruitment list page, then crawls the pages in real time according to that update cycle and saves them to a web page library.
More specifically, information extraction means generating structured data from unstructured text; quantization means converting fields that can be expressed numerically into numeric values; standardization means normalizing entity information; and vectorization means converting text information into a corresponding semantic vector.
More specifically, the semantic retrieval service module includes a GPU acceleration layer for accelerating semantic computation. For a query submitted by the user, it screens candidates according to the query's screening conditions, computes semantic vector similarity, computes the matching degree, sorts the results, and returns the list of best-matching ids.
The present invention also provides a method for real-time semantic search of job opportunities. The method comprises the following steps:
Step 1: dynamically detect and obtain the job opportunities of each recruitment channel, crawl them in real time, perform data analysis and processing on them, and store them. The real-time processing flow includes information extraction, quantization, standardization, and vectorization. The processed data is stored in a DB cluster, where it serves as the raw data for all online modules.
Step 2: the user enters query information through the query gateway, which preprocesses it (information extraction, quantization, vectorization) and then calls the semantic retrieval service.
Step 3: for the preprocessed query information submitted by the user, the semantic retrieval service screens candidates according to the query conditions, computes the semantic matching degree, and returns the list of matching job opportunity ids.
Step 4: using the list of job opportunity ids returned by the semantic retrieval service, the query gateway looks up the corresponding detailed job opportunity information stored in the KVDB, obtains the complete job opportunity information, and displays it on the front-end page.
More specifically, in step 1, dynamically detecting and obtaining the job opportunities of each recruitment channel, performing real-time data analysis and processing on them, and storing them is implemented as follows:
The page update history of each channel and each recruitment list page of every recruitment website is analyzed, the update cycle of each recruitment list page is estimated, and the page information is then crawled in real time according to that update cycle and saved to a web page library.
More specifically, in step 1, information extraction, quantization, standardization, and vectorization are implemented as follows: information extraction generates structured data from unstructured text, fields that can be expressed numerically are quantized, entity information is standardized, and text information is vectorized into a corresponding semantic vector.
More specifically, in step 2, information extraction, quantization, and vectorization are implemented as follows: information extraction generates structured data from unstructured text, fields that can be expressed numerically are quantized, and text information is vectorized into a corresponding semantic vector.
More specifically, the semantic retrieval service uses a GPU acceleration layer to accelerate semantic computation. For a query submitted by the user, it screens candidates according to the query's screening conditions, computes semantic vector similarity, computes the matching degree, sorts the results, and returns the list of best-matching ids.
Compared with traditional job search engines, the real-time semantic job opportunity search system proposed by the present invention has the following beneficial technical effects:
1) Higher recall and precision. Taking "front-end engineer" as an example, semantic retrieval can also recall job opportunities for positions such as "web engineer", "web development engineer", "web development", "front-end development engineer", and "front-end development". Compared with traditional keyword retrieval, the same query can return several times or even tens of times as many job opportunities. In addition, when searching by semantic matching on the resume uploaded by the user or on a description of work experience, a matching degree is computed against all job opportunities (on the order of ten million), and the best-matching job opportunities are returned in order of matching degree, which improves screening efficiency by several times to tens of times.
2) High timeliness. A real-time Spider cluster dynamically detects updates to the job opportunities of each recruitment channel and crawls newly added job opportunities as soon as they appear. The job opportunities are then analyzed, processed, and stored within seconds, and distributed in real time to each service module, where they are indexed and made available for querying.
3) More comprehensive job opportunity analysis.
Semantic computation involves a massive number of floating-point operations, and traditional CPU processing capacity is very limited. As a result, traditional systems truncate the query process and return results after examining only part of the data. The present system introduces GPU computing to accelerate semantic computation and achieves a speed-up of tens of times. Each query can therefore analyze the full data set, which makes the analysis of job opportunities more comprehensive and accurate.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 shows the overall framework of a system for real-time semantic search of job opportunities according to the present invention;
Fig. 2 shows the system framework of real-time semantic search of job opportunities according to one embodiment of the present invention;
Fig. 3 shows a flowchart of a method for real-time semantic search of job opportunities according to the present invention.
Detailed Description
To make the above objects, features, and advantages of the present invention easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where they do not conflict, the embodiments of the present application and the features in those embodiments may be combined with one another.
Many specific details are set forth in the following description to provide a thorough understanding of the present invention. However, the present invention may also be implemented in ways other than those described here, so the scope of protection of the present invention is not limited by the specific embodiments disclosed below.
Fig. 1 shows the overall framework of a system for real-time semantic search of job opportunities according to the present invention.
As shown in Fig. 1, a specific embodiment of the present invention provides a system for real-time semantic search of job opportunities. The system consists of two major parts, a data analysis layer and a service layer.
The data analysis layer dynamically detects and obtains the job opportunities of each recruitment channel and performs real-time data analysis and processing on them. It includes a Spider cluster module, a real-time data stream processing cluster module, and a structured data DB cluster module. The Spider cluster module collects recruitment information in real time from recruitment pages across the whole web and stores it. The real-time data stream processing cluster module processes the information stored by the Spider cluster module in real time, including information extraction, quantization, standardization, and vectorization. The structured data DB cluster module stores the data processed by the real-time data stream processing cluster module in a DB cluster, where it serves as the raw data for all online modules.
Specifically, the Spider cluster module analyzes the page update history of each channel and each recruitment list page of every recruitment website, estimates the update cycle of each recruitment list page, then crawls the page information in real time according to that update cycle and saves it to a web page library.
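The patent does not state how the update cycle is estimated. One simple possibility, given here only as an assumed sketch in Python, is to take the mean interval between observed changes of a list page and schedule the next crawl one cycle after the last change. The function names, the `min_interval` parameter, and the sample timestamps are hypothetical.

```python
def estimate_update_cycle(change_times):
    """Estimate a list page's update cycle as the mean gap between changes.

    change_times: ascending timestamps (seconds) at which the page changed.
    Assumption: a plain mean; the patent does not specify the estimator.
    """
    if len(change_times) < 2:
        raise ValueError("need at least two observed changes")
    gaps = [later - earlier
            for earlier, later in zip(change_times, change_times[1:])]
    return sum(gaps) / len(gaps)


def next_crawl_time(change_times, min_interval=60.0):
    """Schedule the next crawl one estimated cycle after the last change."""
    cycle = max(estimate_update_cycle(change_times), min_interval)
    return change_times[-1] + cycle


# Hypothetical observations for one recruitment list page.
observed = [0.0, 600.0, 1150.0, 1800.0]
cycle = estimate_update_cycle(observed)   # mean of 600, 550, 650
due = next_crawl_time(observed)
```

With these sample values the estimated cycle is 600 seconds and the next crawl is due at 2400 seconds. A production scheduler would likely weight recent gaps more heavily and handle pages that never change, which this sketch does not attempt.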
The real-time data stream processing cluster module processes the original recruitment pages crawled by the Spider in real time on a stream processing cluster. The processing includes information extraction, quantization, standardization, and vectorization.
More specifically, recruitment postings and user resumes are generally unstructured or semi-structured text and must be converted into structured data before subsequent query processing can take place. Information extraction pulls basic elements such as the recruiting company name, vacant position, number of hires, education requirement, work experience requirement, salary and benefits, and job requirements out of the unstructured or semi-structured page, thereby generating structured data from the text. Fields that can be expressed numerically, such as annual salary and major, are then quantized. Entity information such as company and position is standardized. Text information such as the job description is vectorized into a corresponding semantic vector according to a semantic model. For example, a model for this business domain is trained by machine learning using a neural network, and a passage of text is then converted into a high-dimensional vector that represents its semantics.
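The four processing stages described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the "key: value" page layout, the alias table, the salary format, and the company name are invented, and the hashed bag-of-words vector is only a stand-in for the trained neural semantic model that the patent describes.

```python
import hashlib
import math
import re

# Hypothetical alias table used for entity standardization.
TITLE_ALIASES = {
    "web developer": "front-end engineer",
    "fe engineer": "front-end engineer",
}


def extract(raw):
    """Information extraction: pull 'key: value' fields from page text."""
    fields = {}
    for line in raw.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields


def quantify_salary(text):
    """Quantization: '15k-25k/month' -> annual midpoint as a number."""
    low, high = (int(n) for n in re.findall(r"(\d+)k", text.lower())[:2])
    return (low + high) / 2 * 1000 * 12


def standardize_title(title):
    """Standardization: map title variants onto one canonical name."""
    key = title.strip().lower()
    return TITLE_ALIASES.get(key, key)


def vectorize(text, dim=16):
    """Vectorization stand-in: hashed bag of words, L2-normalized.
    Assumption: replaces the trained neural model used in the patent."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


# Hypothetical semi-structured recruitment page.
raw_page = """
Company: Example Tech
Title: Web Developer
Salary: 15k-25k/month
Description: Build responsive web pages with JavaScript
"""
fields = extract(raw_page)
record = {
    "company": fields["company"],
    "title": standardize_title(fields["title"]),
    "annual_salary": quantify_salary(fields["salary"]),
    "semantic_vector": vectorize(fields["description"]),
}
```

The resulting `record` holds a canonical title, a numeric annual salary, and a fixed-length unit vector, which is the kind of structured output the later matching stage needs. Real recruitment pages are far less regular than this sample, so the extraction step in practice would need much more than a line-by-line split.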
The service layer analyzes and processes the user's query information, matches it in real time against the recruitment information stored in the DB cluster of the data analysis layer, and returns the job opportunities that best match the user. It includes a query gateway, a semantic retrieval service module, and a KVDB module. The query gateway provides the external query interface, preprocesses the query information entered by the user (information extraction, quantization, vectorization), calls the semantic retrieval service module to obtain a list of job opportunity ids, and finally queries the KVDB module to obtain the complete job opportunity information. The semantic retrieval service module screens and matches against the query submitted by the user, computes the matching degree, and returns the list of matching job opportunity ids. The KVDB module stores the detailed information corresponding to each job opportunity for display on the front-end page.
More specifically, the semantic retrieval service module must perform semantic computation, which involves a massive number of floating-point operations, and traditional CPU processing capacity is very limited. The system therefore introduces a GPU acceleration layer to accelerate semantic computation. For a query submitted by the user, the module screens candidates according to the query's screening conditions, computes semantic vector similarity, computes the matching degree, sorts the results, and returns the list of best-matching ids. With GPU computing, for each user query the system can analyze the user's resume in about 20 milliseconds, match it in real time against recruitment postings numbering in the millions, compute the matching degree, and return the job opportunities that best match the user, which greatly improves the efficiency of job opportunity retrieval.
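The screen, score, sort, and return sequence described above can be sketched in Python as follows. The patent does not name its similarity measure, so cosine similarity is an assumption here, as are the field names and sample vectors. The scoring loop runs on the CPU in this sketch; it is the part the patent offloads to its GPU acceleration layer.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(query_vec, filters, jobs, top_k=2):
    """Filter by query conditions, score every remaining job, return ids."""
    candidates = [
        job for job in jobs
        if all(job.get(field) == value for field, value in filters.items())
    ]
    # Every candidate is scored: no truncation of the candidate list.
    scored = [(cosine(query_vec, job["vector"]), job["id"])
              for job in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [job_id for _, job_id in scored[:top_k]]


# Hypothetical indexed job opportunities with 3-dimensional vectors.
jobs = [
    {"id": 101, "city": "Shenzhen", "vector": [0.9, 0.1, 0.0]},
    {"id": 102, "city": "Shenzhen", "vector": [0.1, 0.9, 0.0]},
    {"id": 103, "city": "Beijing", "vector": [1.0, 0.0, 0.0]},
    {"id": 104, "city": "Shenzhen", "vector": [0.7, 0.7, 0.0]},
]
top_ids = semantic_search([1.0, 0.0, 0.0], {"city": "Shenzhen"}, jobs)
```

Job 103 has the most similar vector but is removed by the city filter, so the returned list is `[101, 104]`. Because each similarity is an independent dot product, the scoring step parallelizes naturally, which is consistent with the patent's choice of GPU computing for it.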
Fig. 2 shows the system framework of real-time semantic search of job opportunities according to one embodiment of the present invention.
As shown in Fig. 2, the Spider cluster module analyzes the page update history of each channel and each recruitment list page of every recruitment website, estimates the update cycle of each recruitment list page, and crawls the page information in real time according to that update cycle. Tasks are scheduled by the Spider cluster scheduler, and the information is saved to a web page library.
The real-time data stream processing cluster module processes the raw data stored by the Spider cluster module in real time, including feature extraction, quantization, standardization, and vectorization to generate semantic vectors. The structured data is then stored in the DB cluster of the structured data DB cluster module, and the data in the structured data DB cluster module is loaded into the semantic retrieval service module. The semantic retrieval service module accelerates semantic computation through the GPU acceleration layer. For a query submitted by the user, it screens candidates according to the query's screening conditions, computes semantic vector similarity, computes the matching degree, sorts the results, and returns the list of best-matching job opportunity ids. The query gateway provides the external query interface, preprocesses the query information entered by the user (information extraction, quantization, vectorization), calls the semantic retrieval service module to obtain the list of job opportunity ids, and finally queries the KVDB module to obtain the complete job opportunity information. Users can also fill in job opportunity information through the query gateway. The KVDB module loads and stores the detailed information corresponding to each job opportunity for display on the front-end page.
According to another aspect of the present invention, a method for real-time semantic search of job opportunities is provided. Fig. 3 shows a flowchart of the method. The method comprises the following steps:
Step 1: dynamically detect and obtain the job opportunities of each recruitment channel, perform real-time data analysis and processing on them, and store them. The stored information is processed in real time, including information extraction, quantization, standardization, and vectorization. The processed data is stored in a DB cluster, where it serves as the raw data for all online modules.
Step 2: the user enters query information through the query gateway, which preprocesses it (information extraction, quantization, vectorization) and then calls the semantic retrieval service.
Step 3: for the preprocessed query information submitted by the user, the semantic retrieval service screens and matches candidates, computes the matching degree, and returns the list of matching job opportunity ids.
Step 4: using the list of job opportunity ids returned by the semantic retrieval service, the query gateway looks up the corresponding detailed job opportunity information stored in the KVDB, obtains the complete job opportunity information, and displays it on the front-end page.
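Steps 2 to 4 can be tied together in a short Python sketch of the query gateway's control flow. Everything here is a simplified assumption: the vocabulary-count preprocessing replaces the patent's extraction and vectorization, the dot-product ranking replaces the semantic retrieval service, and a plain dictionary plays the role of the KVDB.

```python
def preprocess(query_text, vocabulary):
    """Gateway preprocessing stand-in: count vocabulary words in the query."""
    words = query_text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def retrieve_ids(query_vec, index, top_k):
    """Semantic retrieval stand-in: rank indexed vectors by dot product."""
    scored = sorted(
        index.items(),
        key=lambda item: sum(q * v for q, v in zip(query_vec, item[1])),
        reverse=True,
    )
    return [job_id for job_id, _ in scored[:top_k]]


def handle_query(query_text, vocabulary, index, kvdb, top_k=1):
    """Steps 2 to 4: preprocess, get matching ids, fetch full details."""
    query_vec = preprocess(query_text, vocabulary)      # step 2
    job_ids = retrieve_ids(query_vec, index, top_k)     # step 3
    return [kvdb[job_id] for job_id in job_ids]         # step 4


# Hypothetical output of step 1: id -> vector, plus details keyed by id.
vocabulary = ["java", "web", "design"]
index = {1: [1.0, 0.0, 0.0], 2: [0.0, 1.0, 0.0]}
kvdb = {1: {"title": "Java engineer"}, 2: {"title": "Web engineer"}}
results = handle_query("experienced web developer", vocabulary, index, kvdb)
```

The retrieval service returns only ids, and the gateway resolves them against the key-value store afterwards. This keeps the vector index compact, which matches the split between the semantic retrieval service and the KVDB in the steps above.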
More specifically, in step 1, dynamically detecting and obtaining the job opportunities of each recruitment channel, performing real-time data analysis and processing on them, and storing them is implemented as follows: the page update history of each channel and each recruitment list page of every recruitment website is analyzed, the update cycle of each recruitment list page is estimated, and the page information is then crawled in real time according to that update cycle and saved to a web page library.
More specifically, recruitment postings and user resumes are generally unstructured or semi-structured text and must be converted into structured data before subsequent query processing can take place. Information extraction pulls basic elements such as the recruiting company name, vacant position, number of hires, education requirement, work experience requirement, salary and benefits, and job requirements out of the unstructured or semi-structured page, thereby generating structured data from the text. Fields that can be expressed numerically, such as annual salary and major, are then quantized. Entity information such as company and position is standardized. Text information such as the job description is vectorized into a corresponding semantic vector according to a semantic model. For example, a model for this business domain is trained by machine learning using a neural network, and a passage of text is then converted into a high-dimensional vector that represents its semantics.
More specifically, the semantic retrieval service must perform semantic computation, which involves a massive number of floating-point operations, and traditional CPU processing capacity is very limited. The system therefore introduces a GPU acceleration layer to accelerate semantic computation. For a query submitted by the user, the service screens candidates according to the query's screening conditions, computes semantic vector similarity, computes the matching degree, sorts the results, and returns the list of best-matching ids. With GPU computing, for each user query the system can analyze the user's resume in about 20 milliseconds, match it in real time against recruitment postings numbering in the millions, compute the matching degree, and return the job opportunities that best match the user, which greatly improves the precision, recall, and search efficiency of job opportunity retrieval.
The semantics-based real-time job opportunity search engine of the present invention collects and aggregates the recruitment information of the various recruitment portals dynamically in real time and analyzes and processes it in real time. Through semantic analysis and matching, it computes a matching degree against all job opportunities (on the order of ten million) and returns the job opportunities that best match the user's needs in order of matching degree. This reduces screening cost and substantially improves the efficiency of the user's job search.
The present invention can be applied to various search engine fields. It can be used in any application scenario that requires semantic search and data processing.
In this specification, terms such as "one embodiment" and "a specific embodiment" mean that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. Such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (9)

1. a kind of system of real-time semantic search working opportunity, it is characterised in that the system includes:Data analysis layer and service Layer,
Data analysis layer is used to dynamically detect and obtain the working opportunity of each recruitment channel, and working opportunity is carried out in real time Data Analysis Services, specifically include the real-time stream process cluster module of Spider cluster modules, data, structural data DB clusters Module;The Spider cluster modules are used for the Real-time Collection recruitment information from the recruitment page of the whole network, and are put in storage preservation;It is described The real-time stream process cluster module of data carries out real-time processing, including information for the information that the Spider cluster modules are preserved Extraction, quantization, standardization, vectorization;The structural data DB cluster modules are by the real-time stream process cluster module of the data Data after treatment, are then distributed to each service module on line in real time;
Structured message after service layer's meeting real-time loading data analysis layer treatment, and set up index;To the Query Information of user, Semantic analysis treatment is first carried out, Similarity Measure is then carried out with index, return to the working opportunity most matched with user's inquiry;Tool Body includes inquiry gateway, semantic retrieval service module, KVDB modules;The inquiry gateway provides external query interface, and to The Query Information of family input is pre-processed, including information extraction, quantization, vectorization, calls the semantic retrieval service module The working opportunity id lists of return, finally inquire about KVDB modules and obtain complete working opportunity information;The semantic retrieval service Module is traveled through for the inquiry that user submits to index, calculates the semantic matching degree of inquiry and working opportunity, is returned most The working opportunity id lists matched somebody with somebody;The KVDB modules, for storing detailed operation opportunity information corresponding with working opportunity, are used in combination In front end page displaying.
2. The system for real-time semantic search of job opportunities according to claim 1, characterized in that the Spider cluster module is implemented by analyzing the page update cycle of each channel and each recruitment list page of each recruitment website, estimating the update cycle of each recruitment list page, crawling in real time according to that update cycle, analyzing, processing, and storing the pages in real time, and then distributing them to downstream service modules.
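The claim does not say how the update cycle is estimated. As a minimal sketch only, one could assume the crawler records the times at which a list page was observed to change and takes the median gap as the cycle. The function names and the `min_cycle` floor below are hypothetical and are not taken from the patent.

```python
from statistics import median


def estimate_update_cycle(change_times):
    """Estimate a list page's update cycle (seconds) as the median gap
    between consecutive observed content changes (an assumed heuristic)."""
    if len(change_times) < 2:
        raise ValueError("need at least two observed changes")
    ordered = sorted(change_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps)


def next_crawl_time(change_times, min_cycle=60.0):
    """Schedule the next fetch one estimated cycle after the last change.
    min_cycle is a hypothetical floor that avoids over-crawling a site."""
    cycle = max(estimate_update_cycle(change_times), min_cycle)
    return max(change_times) + cycle


# Timestamps (seconds) at which one list page was seen to change.
observed = [0.0, 600.0, 1230.0, 1800.0, 2400.0]
cycle = estimate_update_cycle(observed)
next_fetch = next_crawl_time(observed)
```

The median is used here instead of the mean so that a single unusually long gap (for example, a weekend) does not stretch the estimated cycle.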
3. The system for real-time semantic search of job opportunities according to claim 1, characterized in that the information extraction refers to generating structured data from unstructured text, the quantization refers to quantizing fields that have countable values, the standardization refers to standardizing entity information, and the vectorization refers to converting text-type information into corresponding semantic vectors.
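The four processing steps named in this claim can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the posting format, the regular expressions, the `CITY_ALIASES` table, and especially the hashed bag-of-words vector, which is only a stand-in for whatever semantic vector model the system actually uses.

```python
import math
import re
import zlib

# Hypothetical alias table used for entity standardization.
CITY_ALIASES = {"sz": "Shenzhen", "shenzhen": "Shenzhen",
                "bj": "Beijing", "beijing": "Beijing"}


def extract(text):
    """Information extraction: pull structured fields out of free text."""
    salary = re.search(r"(\d+)\s*k\s*-\s*(\d+)\s*k", text, re.IGNORECASE)
    city = re.search(r"city:\s*(\w+)", text, re.IGNORECASE)
    return {
        "salary_raw": salary.groups() if salary else None,
        "city_raw": city.group(1) if city else None,
        "description": text,
    }


def quantize(record):
    """Quantization: turn a countable field (salary) into numbers."""
    if record["salary_raw"]:
        low, high = (int(v) * 1000 for v in record["salary_raw"])
        record["salary_min"], record["salary_max"] = low, high
    return record


def standardize(record):
    """Standardization: map entity aliases onto one canonical name."""
    city = record["city_raw"]
    record["city"] = CITY_ALIASES.get(city.lower(), city) if city else None
    return record


def vectorize(text, dim=16):
    """Vectorization stand-in: hashed bag of words, L2-normalized.
    A real system would use a learned semantic embedding instead."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


posting = "Python engineer, 15k-25k, city: SZ, build search services"
record = standardize(quantize(extract(posting)))
record["vector"] = vectorize(record["description"])
```

`zlib.crc32` is used rather than the built-in `hash` because string hashing in Python is randomized per process, which would make the vectors irreproducible between runs.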
4. The system for real-time semantic search of job opportunities according to claim 1, characterized in that the semantic retrieval service module includes a GPU acceleration layer for accelerating semantic computation; for a query submitted by a user, the module screens candidates according to the query's filter conditions, computes semantic vector similarity, then computes the matching degree, sorts the results, and returns the list of best-matching ids.
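The screen, score, sort, and return sequence in this claim might look like the sketch below. It runs on the CPU in plain Python; the GPU acceleration layer is not modeled, and the claim does not define the similarity measure or the matching-degree formula. Cosine similarity is assumed here and used directly as the matching degree. The index layout and field names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query_vec, filters, top_k=2):
    """Return the ids of the top_k best-matching entries."""
    # 1. Screen by the query's filter conditions (exact match on each field).
    candidates = [job for job in index
                  if all(job.get(key) == value for key, value in filters.items())]
    # 2. Compute semantic vector similarity; here it doubles as the
    #    matching degree, which is an assumption of this sketch.
    scored = [(cosine(query_vec, job["vector"]), job["id"]) for job in candidates]
    # 3. Sort by matching degree, best first, and keep the ids.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [job_id for _, job_id in scored[:top_k]]


index = [
    {"id": "job-1", "city": "Shenzhen", "vector": [1.0, 0.0, 0.0]},
    {"id": "job-2", "city": "Shenzhen", "vector": [0.6, 0.8, 0.0]},
    {"id": "job-3", "city": "Beijing", "vector": [1.0, 0.0, 0.0]},
    {"id": "job-4", "city": "Shenzhen", "vector": [0.0, 0.0, 1.0]},
]
result = search(index, [0.9, 0.1, 0.0], {"city": "Shenzhen"})
```

Because step 2 is one dot product per candidate, it is the part that would plausibly be batched into a single matrix-vector product on a GPU.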
5. A method for real-time semantic search of job opportunities, characterized in that the method comprises the following steps:
Step 1: dynamically detect and acquire job opportunities from each recruitment channel, perform real-time data analysis and processing on the job opportunities, and store them; process the stored information in real time, including information extraction, quantization, standardization, and vectorization; store the processed data in a DB cluster as the source data for the online services;
Step 2: the user enters query information through the query gateway, which preprocesses the query information entered by the user, including information extraction, quantization, and vectorization, and then calls the semantic retrieval service;
Step 3: the semantic retrieval service screens and matches against the preprocessed query information submitted by the user, computes the matching degree, and returns the list of matching job opportunity ids;
Step 4: according to the list of job opportunity ids returned by the semantic retrieval service, the query gateway queries the KVDB for the stored detailed information corresponding to each job opportunity, finally obtains the complete job opportunity information, and displays it on the front-end page.
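Steps 2 to 4 describe a gateway that preprocesses the query, asks the retrieval service for ids, and then resolves those ids in the KVDB. The sketch below shows only that control flow, under loud assumptions: in-memory dictionaries stand in for the index and the KVDB, and token-set Jaccard overlap stands in for the semantic matching degree. All function and variable names are hypothetical.

```python
def preprocess(query_text):
    """Step 2 stand-in: a lowercase token set instead of a semantic vector."""
    return set(query_text.lower().split())


def semantic_retrieval(query_tokens, index, top_k=2):
    """Step 3 stand-in: rank indexed jobs by Jaccard overlap with the query
    and return the ids of the best non-zero matches."""
    def score(tokens):
        union = query_tokens | tokens
        return len(query_tokens & tokens) / len(union) if union else 0.0

    ranked = sorted(index, key=lambda job_id: score(index[job_id]), reverse=True)
    return [job_id for job_id in ranked[:top_k] if score(index[job_id]) > 0]


def query_gateway(query_text, index, kvdb):
    """Steps 2 to 4: preprocess, retrieve ids, then fetch full records."""
    ids = semantic_retrieval(preprocess(query_text), index)
    return [kvdb[job_id] for job_id in ids if job_id in kvdb]


# The index holds only what matching needs; the KVDB holds full details.
index = {
    "job-1": {"python", "search", "engineer"},
    "job-2": {"java", "backend", "engineer"},
    "job-3": {"sales", "manager"},
}
kvdb = {
    "job-1": {"title": "Python Search Engineer", "city": "Shenzhen"},
    "job-2": {"title": "Java Backend Engineer", "city": "Shenzhen"},
    "job-3": {"title": "Sales Manager", "city": "Beijing"},
}
results = query_gateway("Python search engineer", index, kvdb)
titles = [job["title"] for job in results]
```

Keeping the index separate from the detail store mirrors the split in the method: the retrieval service returns only ids, and the full records are looked up afterwards by key.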
6. The method for real-time semantic search of job opportunities according to claim 5, characterized in that, in step 1, dynamically detecting and acquiring job opportunities from each recruitment channel, performing real-time data analysis and processing on the job opportunities, and storing them is implemented as follows:
by analyzing the page update cycle of each channel and each recruitment list page of each recruitment website, the update cycle of each recruitment list page is estimated; node information is then crawled in real time according to that update cycle and saved to a web page library.
7. The method for real-time semantic search of job opportunities according to claim 5, characterized in that the information extraction, quantization, standardization, and vectorization in step 1 are implemented as follows: the information extraction generates structured data from unstructured text, fields that have countable values are quantized, entity information is standardized, and text-type information is converted into corresponding semantic vectors.
8. The method for real-time semantic search of job opportunities according to claim 5, characterized in that the information extraction, quantization, and vectorization in step 2 are implemented as follows: the information extraction generates structured data from unstructured text, fields that have countable values are quantized, and text-type information is converted into corresponding semantic vectors.
9. The method for real-time semantic search of job opportunities according to claim 5, characterized in that the semantic retrieval service includes a GPU acceleration layer for accelerating semantic computation; for a query submitted by a user, the service screens candidates according to the query's filter conditions, computes semantic vector similarity, then computes the matching degree, sorts the results, and returns the list of best-matching ids.
CN201611239045.XA 2016-12-29 2016-12-29 A kind of system and method for real-time semantic search working opportunity Pending CN106776575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611239045.XA CN106776575A (en) 2016-12-29 2016-12-29 A kind of system and method for real-time semantic search working opportunity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611239045.XA CN106776575A (en) 2016-12-29 2016-12-29 A kind of system and method for real-time semantic search working opportunity

Publications (1)

Publication Number Publication Date
CN106776575A true CN106776575A (en) 2017-05-31

Family

ID=58923217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611239045.XA Pending CN106776575A (en) 2016-12-29 2016-12-29 A kind of system and method for real-time semantic search working opportunity

Country Status (1)

Country Link
CN (1) CN106776575A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220380A (en) * 2017-06-27 2017-09-29 北京百度网讯科技有限公司 Question and answer based on artificial intelligence recommend method, device and computer equipment
CN107590133A (en) * 2017-10-24 2018-01-16 武汉理工大学 The method and system that position vacant based on semanteme matches with job seeker resume
CN107818134A (en) * 2017-09-26 2018-03-20 北京纳人网络科技有限公司 A kind of position similarity calculating method, client and server
CN109101600A (en) * 2018-08-01 2018-12-28 沈文策 The crawling method and device of dynamic data in a kind of webpage
CN111309870A (en) * 2020-03-04 2020-06-19 平安养老保险股份有限公司 Data rapid searching method and device and computer equipment
CN112445813A (en) * 2020-12-01 2021-03-05 深圳市中博科创信息技术有限公司 Search semantic analysis method for enterprise service portal platform

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182389A (en) * 2014-07-21 2014-12-03 安徽华贞信息科技有限公司 Semantic-based big data analysis business intelligence service system

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182389A (en) * 2014-07-21 2014-12-03 安徽华贞信息科技有限公司 Semantic-based big data analysis business intelligence service system

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220380A (en) * 2017-06-27 2017-09-29 北京百度网讯科技有限公司 Question and answer based on artificial intelligence recommend method, device and computer equipment
CN107818134A (en) * 2017-09-26 2018-03-20 北京纳人网络科技有限公司 A kind of position similarity calculating method, client and server
CN107590133A (en) * 2017-10-24 2018-01-16 武汉理工大学 The method and system that position vacant based on semanteme matches with job seeker resume
CN109101600A (en) * 2018-08-01 2018-12-28 沈文策 The crawling method and device of dynamic data in a kind of webpage
CN111309870A (en) * 2020-03-04 2020-06-19 平安养老保险股份有限公司 Data rapid searching method and device and computer equipment
CN111309870B (en) * 2020-03-04 2022-11-18 平安养老保险股份有限公司 Data rapid searching method and device and computer equipment
CN112445813A (en) * 2020-12-01 2021-03-05 深圳市中博科创信息技术有限公司 Search semantic analysis method for enterprise service portal platform

Similar Documents

Publication Publication Date Title
CN106776575A (en) A kind of system and method for real-time semantic search working opportunity
US11256487B2 (en) Vectorized representation method of software source code
KR102288249B1 (en) Information processing method, terminal, and computer storage medium
CN108073568A (en) keyword extracting method and device
CN102646095B (en) Object classifying method and system based on webpage classification information
CN110232447B (en) Deep reasoning method for legal case
CN105528422A (en) Focused crawler processing method and apparatus
CN111680147A (en) Data processing method, device, equipment and readable storage medium
CN110347724A (en) Abnormal behaviour recognition methods, device, electronic equipment and medium
CN108830100B (en) User privacy leakage detection method, server and system based on multitask learning
CN110362663A (en) Adaptive more perception similarity detections and parsing
CN109285024B (en) Online feature determination method and device, electronic equipment and storage medium
CN112925904A (en) Lightweight text classification method based on Tucker decomposition
CN102289408A (en) regression test case sequencing method based on error propagation network
CN115794798B (en) Market supervision informatization standard management and dynamic maintenance system and method
CN109460506B (en) User demand driven resource matching pushing method
CN116561288A (en) Event query method, device, computer equipment, storage medium and program product
Jabeen et al. Divided we stand out! Forging Cohorts fOr Numeric Outlier Detection in large scale knowledge graphs (CONOD)
CN116366312A (en) Web attack detection method, device and storage medium
CN109284360A (en) A kind of automatic denoising method of patent retrieval and device
CN109189893A (en) A kind of method and apparatus of automatically retrieval
CN112749554B (en) Method, device, equipment and storage medium for determining text matching degree
CN113742495A (en) Rating characteristic weight determination method and device based on prediction model and electronic equipment
Ratnasari Performance of Random Oversampling, Random Undersampling, and SMOTE-NC Methods in Handling Imbalanced Class in Classification Models
KR102599008B1 (en) Method for processing multi-queries based on multi-query scheduler and data processing system providing the method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170531