CN111400444A - Document selection method and device - Google Patents

Publication number
CN111400444A
Authority
CN
China
Prior art keywords
standard
document
field
predefined
fields
Legal status
Pending
Application number
CN202010149328.5A
Other languages
Chinese (zh)
Inventor
李阳
魏聪惠
黄星
邱晓辉
杨志滔
陈建文
陈阳
王俐
曾佳妍
邱炜亨
苏鹏浩
王怡冰
刘洋
朱佳
Current Assignee
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Application filed by China Construction Bank Corp and CCB Finetech Co Ltd
Priority to CN202010149328.5A
Publication of CN111400444A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/105 Human resources
    • G06Q10/1053 Employment or hiring

Abstract

The invention discloses a method and a device for selecting a document, and relates to the technical field of computers. One embodiment of the method comprises: acquiring a document to be selected, wherein the document to be selected contains predefined fields; calculating a screening value of the document to be selected according to the predefined field and a preset standard field library; the standard field library comprises a plurality of standard fields and scores corresponding to the standard fields respectively; and selecting a target document meeting a preset screening condition from the documents to be selected according to the screening value. The embodiment improves the accuracy and efficiency of document selection.

Description

Document selection method and device
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for selecting a document.
Background
When recruiting, enterprises usually select suitable candidates by screening the resumes submitted by job seekers.
Currently, enterprises still screen resumes manually: a recruitment specialist browses a large number of resumes in order to pick out those that meet the enterprise's requirements.
In the process of implementing the invention, the inventors found that at least the following problems exist in the prior art:
manual resume screening depends on the personal experience of the recruiter, so the selection result is not sufficiently objective or accurate; moreover, browsing and selecting resumes one by one is time-consuming and labor-intensive, so the efficiency of resume screening is low.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for selecting a document, which can improve the efficiency of selecting a document and improve the accuracy of selecting a document.
To achieve the above object, according to an aspect of an embodiment of the present invention, a method of document selection is provided.
The method for selecting the document comprises the following steps: acquiring a document to be selected, wherein the document to be selected contains predefined fields;
calculating a screening value of the document to be selected according to the predefined field and a preset standard field library; the standard field library comprises a plurality of standard fields and scores corresponding to the standard fields respectively;
and selecting a target document meeting a preset screening condition from the documents to be selected according to the screening value.
Optionally,
calculating the screening value of the document to be selected according to the predefined field and the preset standard field library comprises:
determining one or more standard fields in the standard field library that match the predefined field, and a matching degree between the predefined field and each matched standard field;
and calculating the screening value according to the standard fields matched with the predefined field, the scores of those standard fields, and the matching degree.
Optionally,
the standard field library further comprises weight values corresponding to the plurality of standard fields respectively;
and the screening value is calculated according to the one or more standard fields in the standard field library that match the predefined field, the scores and weight values of those standard fields, and the matching degree.
Optionally,
the matching degree is represented by a binary value.
Optionally,
determining the standard fields in the standard field library that match the predefined field comprises:
determining one or more keywords from the predefined field, and taking the standard fields containing the keywords as the standard fields matched with the predefined field.
Optionally,
the predefined field is converted according to the format of the standard field matched with it, and the matching degree is calculated according to the converted predefined field.
Optionally,
when a standard field in the standard field library corresponds to a plurality of weight values, the plurality of weight values correspond to a plurality of acquisition modes respectively;
calculating the screening value of the document to be selected according to the predefined field and the preset standard field library comprises:
determining the weight value of the standard field matched with the predefined field according to the acquisition mode of the document to be selected, and calculating the screening value according to the determined weight value.
Optionally, the method further comprises:
when there are multiple documents to be selected, normalizing the screening values corresponding to the respective documents, and selecting the target document according to the normalized screening values.
Optionally,
the screening conditions include: the screening value is greater than a preset threshold, and/or the screening value ranks within a preset number of top positions when the screening values are sorted in descending order.
To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided an apparatus for document selection.
The apparatus for document selection comprises an acquisition module, a calculation module and a selection module; wherein,
the acquisition module is used for acquiring a document to be selected, and the document to be selected contains predefined fields;
the calculation module is used for calculating the screening value of the document to be selected according to the predefined field and a preset standard field library; the standard field library comprises a plurality of standard fields and scores corresponding to the standard fields respectively;
and the selection module is used for selecting a target document which meets a preset screening condition from the documents to be selected according to the screening value.
To achieve the above object, according to still another aspect of the embodiments of the present invention, there is provided an electronic device for document selection.
An electronic device for selecting a document according to an embodiment of the present invention includes: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement a method of document selection in accordance with an embodiment of the present invention.
To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided a computer-readable storage medium.
A computer-readable storage medium of an embodiment of the present invention has stored thereon a computer program that, when executed by a processor, implements a method of document selection of an embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: after the documents to be selected containing the predefined fields are obtained, the screening value of each document can be calculated according to the predefined fields and a standard field library comprising a plurality of standard fields and the score corresponding to each standard field, and the target documents meeting the preset screening condition are then selected from the documents to be selected according to the screening values. The screening value of a document to be selected is thus calculated automatically and accurately from its predefined fields, which improves the accuracy of document selection. In addition, the process does not require documents to be browsed manually one by one, which improves the efficiency of document selection.
Further effects of the above optional implementations will be described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of a method of document selection according to an embodiment of the invention;
FIG. 2 is a schematic diagram of the main steps of another method of document selection according to an embodiment of the invention;
FIG. 3 is a schematic diagram of the main steps of yet another method of document selection according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the major modules of an apparatus for document selection according to an embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention and the technical features of the embodiments may be combined with each other without conflict.
FIG. 1 is a schematic diagram of the main steps of a method of document selection according to an embodiment of the present invention.
As shown in fig. 1, the method for selecting a document according to the embodiment of the present invention mainly includes the following steps:
Step S101: acquire a document to be selected, wherein the document to be selected contains a predefined field.
The document to be selected may be a resume, and the predefined fields in the document are fixed fields included in the resume, such as name, gender, age, graduating institution, awards received, years of work experience, and the like.
Step S102: calculating a screening value of the document to be selected according to the predefined field and a preset standard field library; the standard field library comprises a plurality of standard fields and scores corresponding to the standard fields respectively.
The standard fields in the standard field library may be indicators for evaluating a resume, such as employment duration, actual length of time, educational experience (academic degree, graduating institution, and the like), awards received, and award level (national level, school level, and the like). In the standard field library, the standard fields may be stored by category, for example work experience, school experience, educational experience, working language, and the like, and then, as shown in Table 1 below, the standard fields are stored in correspondence with the post requirements and the matching conditions.
TABLE 1
[Table 1: image BDA0002399009790000061, not reproduced in this text]
To facilitate matching between the predefined fields in the document to be selected and the standard fields, each standard field in the standard field library is described in binary form, for example "passed College English Test Band 4 (CET-4)" (the matching result can only be yes or no) and "holds a master's degree" (the matching result can only be yes or no). Standard fields that are inconvenient to match in binary form can be converted in advance into a format that can be matched this way. For example, employment time is generally expressed as N years; to match predefined fields against standard fields in binary form, the standard fields corresponding to employment time can be described as different time ranges, such as "employment time exceeds 3 years", "employment time exceeds 5 years", and the like.
In addition, the standard field library records the score corresponding to each standard field. The scores can be predetermined according to the position requirements; for example, if a position places greater emphasis on work experience, the score corresponding to work experience is higher and the score corresponding to academic qualifications is lower. The scores corresponding to the standard fields may be stored in a table such as Table 2.
TABLE 2
[Table 2: images BDA0002399009790000071 and BDA0002399009790000081, not reproduced in this text]
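The layout of such a standard field library can be sketched in Python as follows. This is only an illustrative sketch: the class name `StandardField`, the helper `fields_in_category`, the category labels, and every score are hypothetical and are not taken from Tables 1 and 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardField:
    """One entry of the standard field library (all values are illustrative)."""
    category: str      # e.g. "work experience"
    description: str   # condition that can only be answered yes or no
    score: float       # score recorded for this standard field


# Hypothetical library contents; the real entries are defined per position.
STANDARD_FIELD_LIBRARY = [
    StandardField("work experience", "employment time exceeds 3 years", 10.0),
    StandardField("work experience", "employment time exceeds 5 years", 15.0),
    StandardField("educational experience", "holds a master's degree", 20.0),
    StandardField("working language", "passed CET-4", 5.0),
]


def fields_in_category(library, category):
    """Return all standard fields stored under the given category."""
    return [f for f in library if f.category == category]
```

Storing each condition as a yes/no description keeps later matching simple, since every standard field contributes either its full score or nothing.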
To further improve the accuracy of the resume screening, in one embodiment of the present invention, one or more standard fields in the standard field library that match the predefined fields and the matching degree between the predefined fields and the matched standard fields are determined; and calculating the screening value according to the standard field matched with the predefined field, the score of the standard field and the matching degree.
Here, each standard field in the standard field library is described in binary form, that is, the standard fields have a fixed format. The acquired documents to be selected may take various personalized forms, but each predefined field in a document can still be matched with one or more standard fields. For example, among the predefined fields characterizing work experience there is a field for employment time. Its format may vary, such as "from XXXX (year) XX (month) XX (day) to the present", "from XXXX (year) XX (month) XX (day) to YYYY (year) YY (month) YY (day)", or "employment duration of K years", yet it can be matched with the standard fields characterizing employment time in the standard field library. During matching, one or more keywords are determined from the predefined field, and the standard fields containing those keywords are taken as the standard fields matched with the predefined field.
In one embodiment of the invention, matching of the predefined fields and the standard fields may be achieved using natural language processing techniques. In this example, the standard fields matching the predefined field may be determined from keywords such as "employment" and/or "time" in the predefined field; for example, the predefined field characterizing employment time may match standard fields such as "employment time exceeds three years" and "employment time exceeds five years".
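The keyword-based matching described above can be sketched as plain substring matching. The function name `match_standard_fields`, the English keyword strings, and the choice to require every keyword (rather than any one of them) are assumptions made for illustration; the embodiment itself only states that natural language processing techniques may be used.

```python
def match_standard_fields(keywords, standard_fields):
    """Return the standard fields whose text contains every given keyword."""
    return [
        field for field in standard_fields
        if all(keyword in field for keyword in keywords)
    ]


# Hypothetical standard field descriptions.
standard_fields = [
    "employment time exceeds three years",
    "employment time exceeds five years",
    "holds a master's degree",
]

# Keywords assumed to have been extracted from the predefined field.
matched = match_standard_fields(["employment", "time"], standard_fields)
```

Here `matched` contains the two employment-time fields, so one predefined field maps to several standard fields, which is why a matching degree is needed afterwards.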
In order to improve the matching efficiency between the predefined field and the standard field, in an embodiment of the present invention, the predefined field may be converted according to a format of the standard field matched with the predefined field, and the matching degree may be calculated according to the converted predefined field.
For example, when the standard fields characterizing employment time include "employment time exceeds three years" and "employment time exceeds five years", that is, when these standard fields are expressed in years, predefined fields that express employment time in other formats can be converted into years. An employment time given as "from XXXX (year) XX (month) XX (day) to the present" or "from XXXX (year) XX (month) XX (day) to YYYY (year) YY (month) YY (day)" is converted so that the predefined field reads "employment duration of K years". It can be understood that a standard resume template can be built from the formats of the standard fields; after a document to be selected is received in a personalized format, it is standardized against this template so that its predefined fields are converted into the formats of the standard fields. This facilitates matching between the predefined fields and the standard fields, simplifies calculation of the matching degree, and thereby helps improve the efficiency of document selection.
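Converting a date-based employment period into a duration in years might look like the following sketch. The function name `employment_years`, the use of whole completed years, and the `today` parameter for open-ended periods are assumptions; the source does not specify how partial years are handled.

```python
from datetime import date


def employment_years(start, end=None, today=None):
    """Convert a start/end date pair into a number of completed years.

    If `end` is None the period is treated as running to the present,
    taken from `today` when given, otherwise from the system date.
    """
    end = end or today or date.today()
    years = end.year - start.year
    # Subtract one year if the anniversary has not yet been reached.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return max(years, 0)


# Closed period: 1 March 2015 to 30 June 2019.
closed_period = employment_years(date(2015, 3, 1), date(2019, 6, 30))
# Open-ended period evaluated on a fixed reference date.
open_period = employment_years(date(2016, 1, 1), today=date(2020, 3, 5))
```

Both example periods evaluate to 4 completed years, so either resume format ends up as the same normalized value before matching.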
In addition, since the standard fields in the standard field library are described in binary form, one predefined field may match multiple standard fields. As in the above example, the predefined field characterizing employment time may match the two standard fields "employment time exceeds three years" and "employment time exceeds five years". To improve the accuracy of resume selection, the screening value of the document to be selected can therefore be calculated further according to the matching degree between the predefined field and each standard field, where the matching degree characterizes the similarity between the predefined field and the one or more standard fields it matches.
To improve the efficiency of document selection, in one embodiment of the present invention the matching degree is represented by a binary value, that is, the matching degree between the predefined field and a standard field is expressed as 1 or 0. For example, when the predefined field characterizing employment time is "employment time of 4 years", its matching degree with the standard field "employment time exceeds three years" is 1, and its matching degree with the standard field "employment time exceeds five years" is 0.
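The binary matching degree from this example can be sketched as a simple comparison. The function name `matching_degree` is hypothetical, and reading "exceeds" as a strict inequality is an assumption.

```python
def matching_degree(years, threshold_years):
    """Return 1 if the duration strictly exceeds the threshold, else 0."""
    return 1 if years > threshold_years else 0


# An employment time of 4 years checked against the 3-year and 5-year fields.
degrees = {threshold: matching_degree(4, threshold) for threshold in (3, 5)}
```

For a 4-year employment time, `degrees` maps the 3-year threshold to 1 and the 5-year threshold to 0, as in the text.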
Therefore, the screening value of the document to be selected can be calculated using formula (1) below, according to the standard fields matched with each predefined field in the document, the matching degree between each predefined field and its matched standard fields, and the scores of those standard fields. Because a binary value is used to represent the matching degree between the predefined field and the standard field, the calculation is simplified and the screening value can be computed more efficiently, which helps improve the efficiency of document selection.
$$S = \sum_{i} \sum_{j} m_{ij} \, L_{ij} \qquad (1)$$
where $S$ is the screening value of the document to be selected, $m_{ij}$ is the matching degree between the $i$-th predefined field in the document and the $j$-th of the standard fields in the standard field library that match it, and $L_{ij}$ is the score of that $j$-th standard field.
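A minimal sketch of this calculation follows, assuming formula (1) is the double sum of matching degree times score that the symbol definitions describe. The function name `screening_value` and all numeric values are hypothetical.

```python
def screening_value(matches):
    """Sum m_ij * L_ij over all predefined fields i and matched fields j.

    `matches` holds one list per predefined field; each list contains
    (matching_degree, score) pairs for the standard fields it matched.
    """
    return sum(
        degree * score
        for field_matches in matches
        for degree, score in field_matches
    )


matches = [
    [(1, 10.0), (0, 15.0)],  # employment time: over 3 years met, over 5 not met
    [(1, 20.0)],             # education: master's degree met
]
s = screening_value(matches)
```

With these illustrative numbers `s` is 30.0: only the standard fields whose matching degree is 1 contribute their score.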
In order to further improve the accuracy of document selection, in an embodiment of the present invention, the standard field library further includes weight values corresponding to the plurality of standard fields, respectively; and calculating the screening value according to one or more standard fields matched with the predefined fields in the standard field library, the scores and the weight values of the standard fields and the matching degree.
The standard field library recorded with the weight values corresponding to the plurality of standard fields may be as shown in table 3 below:
TABLE 3
[Table 3: images BDA0002399009790000102 and BDA0002399009790000111, not reproduced in this text]
The weight value corresponding to each standard field can be determined according to the post requirements and/or the recruitment scenario. For example, campus recruitment mainly considers school experience, so the weight values of standard fields related to school experience may be set higher and those related to work experience may be set lower. Conversely, social recruitment mainly examines work experience, so the weight values of standard fields related to work experience may be set higher and those related to school experience may be set lower. As a further example, for overseas posts and township posts, the weight values for less commonly spoken languages or local dialects need to be set higher.
Therefore, the screening value of the document to be selected can be calculated using formula (2) below, according to the standard fields matched with each predefined field in the document, the matching degree between each predefined field and its matched standard fields, and the scores and weight values of those standard fields:
$$S = \sum_{i} \sum_{j} \alpha_{ij} \, m_{ij} \, L_{ij} \qquad (2)$$
where $S$ is the screening value of the document to be selected, $m_{ij}$ is the matching degree between the $i$-th predefined field in the document and the $j$-th of the standard fields in the standard field library that match it, $L_{ij}$ is the score of that $j$-th standard field, and $\alpha_{ij}$ is the weight value of that $j$-th standard field.
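The weighted variant can be sketched in the same way, assuming formula (2) multiplies each term of the sum by the weight value of the matched standard field. The function name `weighted_screening_value` and the weights and scores shown are hypothetical.

```python
def weighted_screening_value(matches):
    """Sum alpha_ij * m_ij * L_ij over all predefined and matched fields.

    Each inner list holds (weight, matching_degree, score) triples for the
    standard fields matched by one predefined field.
    """
    return sum(
        weight * degree * score
        for field_matches in matches
        for weight, degree, score in field_matches
    )


matches = [
    [(0.05, 1, 10.0), (0.05, 0, 15.0)],  # employment time fields
    [(0.10, 1, 20.0)],                   # education field
]
s_weighted = weighted_screening_value(matches)
```

With these numbers the result is approximately 2.5 (0.05 x 10 + 0.10 x 20). Changing only the weights re-ranks resumes for a different recruitment scenario without touching the scores.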
Because the weight value of a standard field may differ between recruitment scenarios, when a standard field in the standard field library corresponds to a plurality of weight values, and the plurality of weight values correspond to a plurality of acquisition modes respectively, the weight value of the standard field matched with the predefined field can be determined according to the acquisition mode of the document to be selected, and the screening value is calculated according to the determined weight value.
The multiple acquisition modes corresponding to the multiple weight values of the same standard field may correspond to different recruitment scenarios. For example, suppose the standard field "employment time exceeds 3 years" has a weight value of 2% for campus recruitment and 5% for social recruitment. When a document to be selected is received, the weight value of the standard field matched with a predefined field in the document can be determined according to how the document was acquired (for example, through campus recruitment or through social recruitment). If the document was acquired through campus recruitment, the weight value of the standard field "employment time exceeds 3 years" matched with the predefined field is 2%.
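Choosing a weight by acquisition mode can be sketched as a lookup. The 2% and 5% values come from the example above; the dictionary layout, the mode labels "campus" and "social", the single-weight entry, and the function name `resolve_weight` are assumptions.

```python
# A standard field maps either to one weight or to a weight per acquisition mode.
WEIGHTS = {
    "employment time exceeds 3 years": {"campus": 0.02, "social": 0.05},
    "holds a master's degree": 0.10,  # hypothetical single weight
}


def resolve_weight(standard_field, acquisition_mode, weights=WEIGHTS):
    """Return the weight of a standard field for the given acquisition mode."""
    weight = weights[standard_field]
    if isinstance(weight, dict):
        # Several weights exist: pick the one for how the document was obtained.
        return weight[acquisition_mode]
    # Only one weight exists: the acquisition mode is irrelevant.
    return weight


campus_weight = resolve_weight("employment time exceeds 3 years", "campus")
social_weight = resolve_weight("employment time exceeds 3 years", "social")
```

The branch on whether a field has several weights mirrors the decision between using a single recorded weight and first resolving it from the acquisition mode.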
According to the above embodiment, taking the document to be selected as the resume as an example, the method for selecting the document provided by the embodiment of the present invention may include the following steps S201 to S205:
Step S201: acquire the resume to be selected, which comprises a plurality of predefined fields.
Step S202: determine, from the standard field library, one or more standard fields matched with each predefined field, the matching degree between the one or more standard fields and their corresponding predefined fields, and the scores and weight values corresponding to the one or more standard fields; when any standard field corresponds to a plurality of weight values, execute step S203; when every standard field has a single weight value, execute step S204.
Step S203: determine the weight value of the standard field according to the acquisition mode of the resume to be selected, then execute step S204.
Step S204: calculate the screening value of the resume to be selected according to the matching degree between the one or more standard fields and their corresponding predefined fields, and the scores and weight values corresponding to the one or more standard fields.
Step S205: select the target document that meets the screening condition according to the screening value of the resume to be selected.
In this way, the screening value of the resume to be selected is calculated automatically and accurately from its predefined fields, which improves the accuracy of resume selection; and because resumes no longer need to be browsed manually one by one, the efficiency of resume screening is also improved.
Step S103: select a target document meeting a preset screening condition from the documents to be selected according to the screening value.
The screening conditions include: the screening value is greater than a preset threshold, and/or the screening value ranks within a preset number of top positions when the screening values are sorted in descending order.
For example, if the screening condition is that the screening value is greater than a preset threshold and the preset threshold is 90, the documents to be selected whose screening values are greater than 90 are taken as target documents. If the screening condition is that the screening value ranks within a preset number of top positions after sorting in descending order, and the preset number is 2, then when document A has a screening value of 100, document B has a screening value of 90, and document C has a screening value of 80, the top 2 of the three documents, namely document A and document B, are taken as the target documents.
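Both screening conditions can be sketched with the figures from this example. The function names `select_by_threshold` and `select_top_n` and the dictionary representation of the screening values are assumptions.

```python
def select_by_threshold(scores, threshold):
    """Return documents whose screening value is strictly above the threshold."""
    return [doc for doc, value in scores.items() if value > threshold]


def select_top_n(scores, n):
    """Return the n documents with the largest screening values."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


scores = {"A": 100, "B": 90, "C": 80}
above_90 = select_by_threshold(scores, 90)
top_two = select_top_n(scores, 2)
```

Note that with a strict "greater than 90" threshold only document A is selected, whereas the top-2 rule selects documents A and B; the two conditions can also be combined.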
In addition, when the number of the documents to be selected is multiple, in order to facilitate comparison of the documents to be selected including predefined fields of different units or magnitudes, normalization processing may be performed on the screening values corresponding to the multiple documents to be selected, so as to convert the screening values of the respective documents to be selected into dimensionless pure values, and then the target document is selected according to the screening values after the normalization processing, so as to improve accuracy and efficiency of document selection.
Specifically, the following formula (3) may be adopted to normalize the filtering value of the document to be selected:
$$S' = \frac{S - \mathrm{MIN}}{\mathrm{MAX} - \mathrm{MIN}} \qquad (3)$$
where $S'$ is the normalized screening value of the $k$-th document among the plurality of documents to be selected, $S$ is the screening value of the $k$-th document before normalization, $\mathrm{MIN}$ is the minimum screening value among the plurality of documents to be selected, and $\mathrm{MAX}$ is the maximum screening value among them.
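The normalization can be sketched as ordinary min-max scaling, which is what the definitions of MIN and MAX suggest; this reading of formula (3) is an assumption. The function name `normalize` and the handling of the case where all screening values are equal are also assumptions, since the source does not cover that case.

```python
def normalize(values):
    """Scale screening values to the range [0, 1] by min-max normalization."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumed fallback: identical values cannot be ranked against each other.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


normalized = normalize([100, 90, 80])
```

For the screening values 100, 90 and 80 this gives 1.0, 0.5 and 0.0, so the relative order is preserved while the scale becomes dimensionless.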
According to the above embodiment, the method for selecting a document provided by the present invention may include the steps of:
Step S301: acquire a plurality of documents to be selected, wherein the documents to be selected contain predefined fields.
Step S302: calculate the screening value of each document to be selected according to the predefined fields and a preset standard field library; the standard field library comprises a plurality of standard fields, and the scores and weight values corresponding to the standard fields respectively.
Step S303: normalize the screening value of each document to be selected according to the screening values of all the documents to be selected.
Step S304: select a target document meeting a preset screening condition from the plurality of documents to be selected according to the normalized screening values.
After normalization processing, the screening values of the documents to be selected are converted into dimensionless values, as shown in table 4, so that comparison of a plurality of documents to be selected in different recruitment scenes and aiming at different recruitment departments and the like can be facilitated, and accuracy and efficiency of document selection are improved.
TABLE 4
[Table 4 appears as an image in the original publication; its contents are not reproduced in this text.]
According to the document selection method provided by the embodiment of the invention, after a document to be selected containing a predefined field is obtained, the screening value of the document can be calculated according to the predefined field and a standard field library comprising a plurality of standard fields and the scores corresponding to the respective standard fields, and a target document meeting the preset screening condition is then selected from the documents to be selected according to the screening value. The screening value is thus calculated automatically and accurately from the predefined field in the document to be selected, which improves the accuracy of document selection. In addition, the documents do not need to be browsed manually one by one, which improves the efficiency of document selection.
When the document to be selected is a resume containing predefined fields, resumes can be scored automatically for different recruitment scenarios, which supports resume screening. The algorithm is simple and feasible, can adapt to different recruitment requirements, and helps reduce the cost of developing and extending the application program. When a new recruitment scenario arises, only the scores and weight values in the standard field library need to be updated accordingly, so the document selection method provided by the embodiment of the invention is flexible to configure and highly extensible. In addition, the document selection process does not depend on the experience of a recruitment specialist, who therefore does not need to weigh the score and weight of every standard field as a whole, and the accuracy of resume screening is improved.
FIG. 4 is a schematic diagram of the main modules of an apparatus for document selection according to an embodiment of the present invention.
As shown in fig. 4, the apparatus 400 for selecting a document according to an embodiment of the present invention includes: an obtaining module 401, a calculating module 402, and a selecting module 403; wherein,
the obtaining module 401 is configured to obtain a document to be selected, where the document to be selected includes a predefined field;
the calculating module 402 is configured to calculate a screening value of the document to be selected according to the predefined field and a preset standard field library; the standard field library comprises a plurality of standard fields and scores corresponding to the standard fields respectively;
the selecting module 403 is configured to select, according to the screening value, a target document that meets a preset screening condition from the documents to be selected.
In an embodiment of the present invention, the calculating module 402 is configured to determine one or more standard fields in the standard field library that match the predefined field, and a matching degree between the predefined field and the matched standard field; and calculating the screening value according to the standard field matched with the predefined field, the score of the standard field and the matching degree.
In an embodiment of the present invention, the standard field library further includes weight values corresponding to the plurality of standard fields, respectively; the calculating module 402 is configured to calculate the screening value according to one or more standard fields in the standard field library that match the predefined field, the score, the weight value, and the matching degree of each of the standard fields.
In one embodiment of the invention, the degree of match is characterized by a binary value.
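As an illustration of how the calculating module might combine score, weight value, and matching degree, the following minimal sketch sums score × weight × matching degree over the matched standard fields. The weighted-sum form, the `StandardField` class, the function `screening_value`, and the example field names and numbers are all assumptions made for illustration; the embodiment's own calculation is defined by the formulas given earlier in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardField:
    """One hypothetical entry of the standard field library."""
    name: str
    score: float
    weight: float = 1.0


def screening_value(matches: list[tuple[StandardField, int]]) -> float:
    """Sum score * weight * matching degree over matched standard fields.

    The matching degree is binary: 1 when the predefined field satisfies
    the standard field, 0 otherwise.
    """
    total = 0.0
    for field, degree in matches:
        if degree not in (0, 1):
            raise ValueError("matching degree must be 0 or 1")
        total += field.score * field.weight * degree
    return total


# Hypothetical library entries and binary matching degrees for one resume.
matches = [
    (StandardField("education: master", score=80.0, weight=0.5), 1),
    (StandardField("experience: 3+ years", score=60.0, weight=0.25), 1),
    (StandardField("certificate: CPA", score=100.0, weight=0.25), 0),
]
print(screening_value(matches))  # 55.0
```

With a binary matching degree, an unmatched standard field contributes nothing to the total, so only the scores and weights in the library need to be tuned for a new scenario.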
In an embodiment of the present invention, the calculating module 402 is configured to determine one or more keywords from the predefined fields, and use a standard field containing the keywords as a standard field matching the predefined fields.
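The keyword-based matching described above can be sketched as follows. The source does not state how keywords are determined from a predefined field, so this sketch assumes a hypothetical vocabulary of known terms and a case-insensitive substring test; the function names `extract_keywords` and `match_standard_fields` are likewise illustrative only.

```python
def extract_keywords(predefined_field: str, vocabulary: set[str]) -> list[str]:
    """Return the vocabulary terms that occur in the predefined field.

    Assumption: keywords come from a fixed vocabulary; the source leaves
    the extraction technique open.
    """
    text = predefined_field.lower()
    return sorted(term for term in vocabulary if term.lower() in text)


def match_standard_fields(keywords: list[str], standard_fields: list[str]) -> list[str]:
    """Return the standard fields that contain at least one keyword."""
    return [
        field
        for field in standard_fields
        if any(keyword.lower() in field.lower() for keyword in keywords)
    ]


# Hypothetical predefined field taken from a resume.
keywords = extract_keywords(
    "Education: Master's degree, Computer Science",
    {"master", "bachelor", "computer"},
)
library = [
    "education: master",
    "education: bachelor",
    "major: computer science",
    "language: english",
]
print(keywords)                                  # ['computer', 'master']
print(match_standard_fields(keywords, library))  # ['education: master', 'major: computer science']
```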
In an embodiment of the present invention, the calculating module 402 is configured to convert the predefined field according to a format of a standard field matched with the predefined field, and calculate the matching degree according to the converted predefined field.
In an embodiment of the present invention, when a standard field in the standard field library corresponds to a plurality of weight values, the plurality of weight values respectively correspond to a plurality of obtaining manners; the calculating module 402 is configured to determine a weight value of a standard field matched with the predefined field according to the obtaining manner of the document to be selected, and calculate the screening value according to the determined weight value.
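One possible way to store several weight values for a single standard field is a mapping keyed by the obtaining manner, as in the sketch below. The manner labels (`campus_recruitment`, `referral`), the field names, the weight numbers, and the function `resolve_weight` are hypothetical and are not taken from the source.

```python
from collections.abc import Mapping
from typing import Union

# A weight entry is either a single value or one value per obtaining manner.
Weight = Union[float, Mapping[str, float]]


def resolve_weight(library: Mapping[str, Weight], field: str, manner: str) -> float:
    """Pick the weight of a standard field for the given obtaining manner."""
    entry = library[field]
    if isinstance(entry, Mapping):
        # Several weights are configured: choose by obtaining manner.
        return entry[manner]
    # Only one weight is configured: it applies to every obtaining manner.
    return float(entry)


# Hypothetical weight configuration.
weights = {
    "education": {"campus_recruitment": 0.5, "referral": 0.25},
    "language": 0.2,
}
print(resolve_weight(weights, "education", "campus_recruitment"))  # 0.5
print(resolve_weight(weights, "education", "referral"))            # 0.25
print(resolve_weight(weights, "language", "referral"))             # 0.2
```

Keeping the per-manner weights in configuration data rather than in code is consistent with the stated goal of updating only the standard field library when a new scenario appears.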
In an embodiment of the present invention, when there are a plurality of documents to be selected, the selecting module 403 is configured to perform normalization processing on the screening values corresponding to the plurality of documents to be selected, and select the target document according to the screening value after the normalization processing.
In one embodiment of the present invention, the screening conditions include: the screening value is greater than a preset threshold, and/or the screening value ranks within a preset number of top positions when the screening values are sorted in descending order.
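The two screening conditions can be sketched as a single selection function. The function name `select_targets`, its parameters, and the document identifiers are assumptions made for illustration; when both conditions are supplied, this sketch applies them together.

```python
from typing import Optional


def select_targets(
    values: dict[str, float],
    threshold: Optional[float] = None,
    top_n: Optional[int] = None,
) -> list[str]:
    """Select document ids by threshold and/or by rank.

    values maps a document id to its (possibly normalized) screening value.
    """
    # Sort from the largest screening value to the smallest.
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(doc, value) for doc, value in ranked if value > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [doc for doc, _ in ranked]


scores = {"resume_a": 0.9, "resume_b": 0.4, "resume_c": 0.7}
print(select_targets(scores, threshold=0.5))           # ['resume_a', 'resume_c']
print(select_targets(scores, top_n=1))                 # ['resume_a']
print(select_targets(scores, threshold=0.5, top_n=1))  # ['resume_a']
```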
According to the device for selecting a document provided by the embodiment of the invention, after a document to be selected containing a predefined field is obtained, the screening value of the document can be calculated according to the predefined field and a standard field library comprising a plurality of standard fields and the scores corresponding to the respective standard fields, and a target document meeting the preset screening condition is then selected from the documents to be selected according to the screening value. The screening value is thus calculated automatically and accurately from the predefined field in the document to be selected, which improves the accuracy of document selection. In addition, the documents do not need to be browsed manually one by one, which improves the efficiency of document selection.
FIG. 5 illustrates an exemplary system architecture 500 of a document selection method or apparatus to which embodiments of the invention may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 serves to provide a medium for communication links between the terminal devices 501, 502, 503 and the server 505. Network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. The terminal devices 501, 502, 503 may have various communication client applications installed thereon, such as a shopping application, a web browser application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 505 may be a server that provides various services, such as a background management server that supports shopping websites browsed by users using the terminal devices 501, 502, 503. The background management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (e.g., target push information and product information) to the terminal device.
It should be noted that the method for selecting a document provided by the embodiment of the present invention is generally executed by the server 505, and accordingly, a device for selecting a document is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use with a terminal device implementing an embodiment of the invention is shown. The terminal device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, and the like. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as necessary. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as necessary, so that a computer program read from it can be installed into the storage section 608 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes an acquisition module, a calculation module, and a selection module. The names of these modules do not in some cases constitute a limitation to the module itself, and for example, the acquiring module may also be described as a "module that acquires a document to be selected".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform the following: acquiring a document to be selected, wherein the document to be selected contains predefined fields; calculating a screening value of the document to be selected according to the predefined field and a preset standard field library, wherein the standard field library comprises a plurality of standard fields and scores corresponding to the respective standard fields; and selecting a target document meeting a preset screening condition from the documents to be selected according to the screening value.
According to the technical scheme of the embodiment of the invention, after a document to be selected containing a predefined field is obtained, the screening value of the document can be calculated according to the predefined field and a standard field library comprising a plurality of standard fields and the scores corresponding to the respective standard fields, and a target document meeting the preset screening condition is then selected from the documents to be selected according to the screening value. The screening value is thus calculated automatically and accurately from the predefined field in the document to be selected, which improves the accuracy of document selection. In addition, the documents do not need to be browsed manually one by one, which improves the efficiency of document selection.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A method of document selection, comprising:
acquiring a document to be selected, wherein the document to be selected contains predefined fields;
calculating a screening value of the document to be selected according to the predefined field and a preset standard field library; the standard field library comprises a plurality of standard fields and scores corresponding to the standard fields respectively;
and selecting a target document meeting a preset screening condition from the documents to be selected according to the screening value.
2. The method according to claim 1, wherein the calculating the screening value of the document to be selected according to the predefined field and a preset standard field library comprises:
determining one or more standard fields in the standard field library that match the predefined field and a degree of match between the predefined field and the matched standard field;
and calculating the screening value according to the standard field matched with the predefined field, the score of the standard field and the matching degree.
3. The method of claim 2, wherein the standard field library further comprises weight values corresponding to the plurality of standard fields respectively;
and the screening value is calculated according to the one or more standard fields in the standard field library that match the predefined field, the score and the weight value of each of those standard fields, and the matching degree.
4. The method of claim 2, wherein determining the standard field in the standard field library that matches the predefined field comprises:
determining one or more keywords from the predefined field, and using the standard fields containing the keywords as the standard fields matched with the predefined field.
5. The method of claim 2, further comprising:
converting the predefined field according to the format of the standard field matched with the predefined field, and calculating the matching degree according to the converted predefined field;
and/or
characterizing the matching degree by a binary value.
6. The method of claim 3, wherein when the standard field in the standard field library corresponds to a plurality of weight values, the plurality of weight values respectively correspond to a plurality of obtaining manners;
the calculating the screening value of the document to be selected according to the predefined field and a preset standard field library comprises:
and determining the weight value of the standard field matched with the predefined field according to the acquisition mode of the document to be selected, and calculating the screening value according to the determined weight value.
7. The method of claim 1, further comprising: when there are a plurality of documents to be selected,
normalizing the screening values corresponding to the respective documents to be selected, and selecting the target document according to the normalized screening values.
8. The method according to any one of claims 1 to 7,
the screening conditions include: the screening value is greater than a preset threshold, and/or the screening value ranks within a preset number of top positions when the screening values are sorted in descending order.
9. An apparatus for document selection, comprising: an acquisition module, a calculation module, and a selection module; wherein
the acquisition module is used for acquiring a document to be selected, and the document to be selected contains predefined fields;
the calculation module is used for calculating the screening value of the document to be selected according to the predefined field and a preset standard field library; the standard field library comprises a plurality of standard fields and scores corresponding to the standard fields respectively;
and the selection module is used for selecting a target document which meets a preset screening condition from the documents to be selected according to the screening value.
10. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
11. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-8.
CN202010149328.5A 2020-03-03 2020-03-03 Document selection method and device Pending CN111400444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010149328.5A CN111400444A (en) 2020-03-03 2020-03-03 Document selection method and device

Publications (1)

Publication Number Publication Date
CN111400444A true CN111400444A (en) 2020-07-10

Family

ID=71436296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010149328.5A Pending CN111400444A (en) 2020-03-03 2020-03-03 Document selection method and device

Country Status (1)

Country Link
CN (1) CN111400444A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080005165A1 (en) * 2006-06-28 2008-01-03 Martin James A Configurable field definition document
US20080126335A1 (en) * 2006-11-29 2008-05-29 Oracle International Corporation Efficient computation of document similarity
CN107909340A (en) * 2017-11-08 2018-04-13 平安科技(深圳)有限公司 Resume selection method, electronic device and readable storage medium storing program for executing
CN108629046A (en) * 2018-05-14 2018-10-09 平安科技(深圳)有限公司 A kind of fields match method and terminal device
CN109766438A (en) * 2018-12-12 2019-05-17 平安科技(深圳)有限公司 Biographic information extracting method, device, computer equipment and storage medium
CN110084571A (en) * 2019-05-08 2019-08-02 软通智慧科技有限公司 A kind of resume selection method, apparatus, server and medium
CN110263311A (en) * 2019-05-22 2019-09-20 中国平安财产保险股份有限公司 A kind of generation method and equipment of Webpage
CN110502514A (en) * 2019-08-15 2019-11-26 中国平安财产保险股份有限公司 Collecting method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221008

Address after: 25 Financial Street, Xicheng District, Beijing 100033

Applicant after: CHINA CONSTRUCTION BANK Corp.

Address before: 25 Financial Street, Xicheng District, Beijing 100033

Applicant before: CHINA CONSTRUCTION BANK Corp.

Applicant before: Jianxin Financial Science and Technology Co.,Ltd.