CN109740157A - Method, apparatus and computer storage medium for determining labels of working individuals - Google Patents

Method, apparatus and computer storage medium for determining labels of working individuals

Info

Publication number
CN109740157A
Authority
CN
China
Prior art keywords
verb
label
text data
working individual
working
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811637972.6A
Other languages
Chinese (zh)
Other versions
CN109740157B (en
Inventor
陈凤杰
任真
黄扬
单若诚
周星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Little Love Robot Technology Co Ltd
Original Assignee
Guizhou Little Love Robot Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Little Love Robot Technology Co Ltd
Priority to CN201811637972.6A
Publication of CN109740157A
Application granted
Publication of CN109740157B
Status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Machine Translation (AREA)

Abstract

A method, apparatus and computer storage medium for determining labels of working individuals. The method comprises: obtaining the content of text data; extracting working individual names, proper nouns and verbs from the text data, where the working individuals include staff members and departments; determining a verb that matches a proper noun as a label; and matching the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data. With this scheme, the determination of a working individual's abilities can be made more objective and more accurate.

Description

Method, apparatus and computer storage medium for determining labels of working individuals
Technical field
The present invention relates to the field of data processing, and in particular to a method, apparatus and computer storage medium for determining labels of working individuals.
Background
As enterprises develop ever more rapidly, the number of working individuals inside an enterprise keeps growing. To manage the enterprise well, it has therefore become very important to understand the abilities of each employee.
In the prior art, the managers of an enterprise cannot personally work and communicate with every employee. The abilities of each working individual are therefore usually determined by having each employee's immediate superior write a summary, and then aggregating those summaries.
However, when the abilities of working individuals are summarized purely by hand, the result is usually too subjective, and its objectivity and accuracy are poor. This makes it difficult to manage the enterprise better through an understanding of each employee's abilities.
Summary of the invention
The technical problem solved by the present invention is that the determination of a working individual's abilities is too subjective, with poor objectivity and accuracy.
To solve the above technical problem, an embodiment of the present invention provides a method for determining labels of working individuals, comprising: obtaining the content of text data; extracting working individual names, proper nouns and verbs from the text data, where the working individuals include staff members and departments; determining a verb that matches a proper noun as a label; and matching the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
Optionally, obtaining the content of the text data comprises: converting the text data into HTML file format; and extracting, by means of a crawler, the content of the text data in HTML file format.
Optionally, after the working individual names, proper nouns and verbs are extracted from the text data, the method further comprises: building a working individual name dictionary and a proper noun dictionary from the working individual names and proper nouns in the text data; segmenting the text data according to the working individual name dictionary and the proper noun dictionary; and removing the stop words from the text data after segmentation.
Optionally, after the working individual names, proper nouns and verbs are extracted from the text data, the method further comprises: building a verb dictionary from the verbs in the text data; selecting the effective verbs in the verb dictionary; and counting the word frequency, in the text data, of the effective verbs in the verb dictionary.
Optionally, selecting the effective verbs in the verb dictionary comprises: selecting the effective verbs in the verb dictionary by means of a semantic analysis algorithm or a preset verb screening dictionary.
Optionally, after the word frequency in the text data of the effective verbs in the verb dictionary is counted, the method further comprises: splitting or merging the effective verbs in the text data according to the word frequency.
Optionally, determining a verb that matches a proper noun as a label comprises: in a given sentence of the text data, if there is a verb that has an association relationship with a proper noun in that sentence, determining that verb to be the label matching the proper noun.
Optionally, matching the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data comprises: in a given sentence of the text data, if the working individual names are in a coordinate relationship, matching the label in the sentence to all working individual names in the sentence; and if the working individual names are independent of each other, matching the label to the working individual name that is nearest to the label in the sentence, measured in number of characters.
The present invention also provides an apparatus for determining labels of working individuals, comprising an acquiring unit, an extraction unit, a label determination unit and a label matching unit, in which: the acquiring unit is used to obtain the content of text data; the extraction unit is used to extract working individual names, proper nouns and verbs from the text data, where the working individuals include staff members and departments; the label determination unit is used to determine a verb that matches a proper noun as a label; and the label matching unit is used to match the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
The present invention also provides a computer-readable storage medium on which computer instructions are stored. The computer-readable storage medium is a non-volatile storage medium or a non-transitory storage medium, and the computer instructions, when run, execute the steps of any of the above methods for determining labels of working individuals.
The present invention also provides an electronic device comprising a memory and a processor. Computer instructions are stored on the memory, and the computer instructions, when run by the processor, execute the steps of any of the above methods for determining labels of working individuals.
Compared with the prior art, the technical solution of the embodiments of the present invention has the following advantages:
The content of text data is obtained; working individual names, proper nouns and verbs are extracted from the text data; a verb that matches a proper noun is determined as a label; and the label is matched to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data. In this scheme, taking the verbs that match proper nouns as labels removes some invalid verbs and so improves the accuracy of the selected labels. Using the positions of the label and of the working individual in the text data as constraints during matching ensures that labels and working individuals are matched correctly. In summary, the scheme makes the determination of a working individual's abilities more objective and more accurate.
Brief description of the drawings
Fig. 1 is a flow diagram of the method for determining labels of working individuals provided by an embodiment of the present invention;
Fig. 2 is a structural diagram of the apparatus for determining labels of working individuals provided by an embodiment of the present invention.
Detailed description of the embodiments
In the prior art, when an enterprise has many employees, its managers may be unable to personally work and communicate with all of them. The abilities of each working individual are therefore usually determined by having each employee's immediate superior write a summary, and then aggregating those summaries.
However, when the abilities of working individuals are summarized purely by hand, the result is usually too subjective, and its objectivity and accuracy are poor. This makes it difficult to manage the enterprise better through an understanding of each employee's abilities.
In the embodiments of the present invention, text data is obtained; working individual names, proper nouns and verbs are extracted from the text data; a verb that matches a proper noun is determined as a label; and the label is matched to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data. This scheme makes the determination of a working individual's abilities more objective and more accurate.
To make the above objects, features and beneficial effects of the present invention clearer and easier to understand, specific embodiments of the invention are described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, which is a flow diagram of the method for determining labels of working individuals provided by an embodiment of the present invention, the specific steps include:
Step S101: obtain the content of text data.
In a specific implementation, the text data obtained may be text data related to the work content of the staff members and departments inside the enterprise.
In the embodiment of the present invention, the text data may be converted into HTML file format, and the content of the text data in HTML file format may then be extracted by means of a crawler.
In a specific implementation, converting the text data into HTML file format makes it easier for electronic devices such as computers to extract the content of the text data, and effectively increases the processing speed when the volume of text data is very large.
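As a rough illustration of step S101, the sketch below wraps plain-text paragraphs in a minimal HTML document and then pulls the visible text back out with Python's standard `html.parser` module. The text does not say how the conversion or the crawler is implemented, so the function names `to_html` and `extract_content` and the class `TextExtractor` are assumptions made for this example only.

```python
from html import escape
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from an HTML document, skipping script/style."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []       # one entry per non-empty text run

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def to_html(paragraphs):
    """Hypothetical conversion step: wrap paragraphs in a minimal HTML page."""
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return f"<html><body>{body}</body></html>"


def extract_content(html_text):
    """Return the text content of an HTML document, one text run per line."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    return "\n".join(parser.chunks)
```

A real deployment would more likely fetch the HTML pages over a network and use a dedicated crawling library; the standard-library parser is used here only to keep the sketch self-contained.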
Step S102: extract the working individual names, proper nouns and verbs from the text data, where the working individuals include staff members and departments.
In a specific implementation, a proper noun may be the name of an application system or a device inside the enterprise, or a keyword of a work task, for example information system, analog circuit or product appearance. The proper nouns can be set by the user according to the application scenario.
In a specific implementation, the working individual names, proper nouns and verbs in the text data can be extracted according to a word-sense analysis algorithm and a corresponding term database.
In a specific implementation, by configuring in the word-sense analysis algorithm the word senses of the words to be extracted, the corresponding working individual names, proper nouns and verbs can be extracted from the text data.
In a specific implementation, the word-sense analysis algorithm may be the jieba ("stutter") algorithm, or the user may choose a different word-sense analysis algorithm according to the application scenario.
In a specific implementation, using a preset term database, the text data is searched for working individual names, proper nouns and verbs that match entries in the term database.
In a specific implementation, the term database can be set by the user according to the application scenario.
In the embodiment of the present invention, after the working individual names, proper nouns and verbs are extracted from the text data, a working individual name dictionary and a proper noun dictionary are built from the working individual names and proper nouns in the text data. The text data is then segmented according to the working individual name dictionary and the proper noun dictionary, and the stop words are removed from the text data after segmentation.
In a specific implementation, stop words are words that carry no practical meaning in the text data, for example words such as " ", " " and " ". Removing the stop words before further data processing can effectively improve the efficiency of that processing.
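The dictionary-driven segmentation and stop-word removal described above can be sketched as follows. This is a minimal illustration, not the patented implementation. It uses forward maximum matching, a common dictionary-based technique that the text does not name. It also works on whitespace-separated English words as a stand-in for Chinese text, where words are not separated by spaces. The dictionary entries and stop words in the usage note are hypothetical.

```python
def segment(text, dictionary):
    """Forward maximum matching over whitespace-separated words.

    At each position, take the longest run of words that forms a
    dictionary entry; otherwise emit the single word on its own.
    """
    words = text.lower().split()
    max_len = max((len(term.split()) for term in dictionary), default=1)
    tokens, i = [], 0
    while i < len(words):
        for size in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + size])
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def remove_stop_words(tokens, stop_words):
    """Drop tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in stop_words]
```

For instance, with a name dictionary `{"department b"}` and a proper noun dictionary `{"information system"}`, calling `segment` on the union of the two keeps each multi-word entry together as one token, so later steps can look up its position as a unit.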
In the embodiment of the present invention, after the working individual names, proper nouns and verbs are extracted from the text data, a verb dictionary is built from the verbs in the text data; the effective verbs in the verb dictionary are selected; and the word frequency, in the text data, of the effective verbs in the verb dictionary is counted.
In a specific implementation, because verbs are used as the labels of working individuals, building a verb dictionary makes it easier to determine later which verbs serve as labels.
In a specific implementation, a label indicates an ability that a working individual can demonstrate at work, so not every verb in the text data can serve as a label. For example, the verb "writing" in the sentence "writing words" cannot serve as a label. Selecting the effective verbs in the verb dictionary means selecting the verbs that can be used to indicate a working individual's abilities at work.
In the embodiment of the present invention, the effective verbs in the verb dictionary are selected by means of a word-sense analysis algorithm or a preset verb screening dictionary.
In a specific implementation, when a word-sense analysis algorithm is used, the effective verbs can be selected by setting the word senses of the verbs that may serve as labels. Alternatively, the unwanted verbs can be removed by setting the word senses of the verbs that may not serve as labels. Either way, the effective verbs are obtained.
In a specific implementation, when a verb screening dictionary is used, the effective verbs can be selected by entering into the verb screening dictionary the verbs that may serve as labels. Alternatively, the unwanted verbs can be removed by entering into the verb screening dictionary the verbs that may not serve as labels. Either way, the effective verbs are obtained.
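The verb screening dictionary and the word-frequency count can be sketched together. This is an assumption-laden illustration: the function name `effective_verb_frequencies`, the `allow`/`deny` parameters and the sample tokens are hypothetical. The `allow` and `deny` sets stand in for the two ways of using the screening dictionary (listing verbs that may be labels, or listing verbs that may not).

```python
from collections import Counter


def effective_verb_frequencies(tokens, verb_dictionary, allow=None, deny=None):
    """Count how often each effective verb occurs in the token stream.

    allow: if given, only these verbs count as effective.
    deny:  if given, these verbs are removed from the effective set.
    """
    verbs = [t for t in tokens if t in verb_dictionary]
    if allow is not None:
        verbs = [v for v in verbs if v in allow]
    if deny is not None:
        verbs = [v for v in verbs if v not in deny]
    return Counter(verbs)
```

Passing `deny={"writing"}` mirrors the example in the text, where "writing" is a verb but does not indicate an ability at work.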
In the embodiment of the present invention, the effective verbs in the text data are split or merged according to the word frequency.
In a specific implementation, some verbs are composed of two or more verbs. For example, the verb "braiding" is composed of the verb "volume" and the verb "knitting". When choosing verbs, the word frequency can therefore be used to decide whether to split, treating "volume" and "knitting" each as a separate verb, or to merge, treating "braiding" as one verb.
In a specific implementation, a word-frequency threshold can be set by the user according to the actual application scenario. When the word frequency of a verb is higher than the threshold, that verb is treated as a verb in its own right. For example, when the word frequency of the verb "volume" is higher than the threshold, "volume" can be treated as a separate verb; when the word frequency of the verb "braiding" is higher than the threshold, "braiding" can be treated as a separate verb. When the word frequencies of both "braiding" and "volume" are higher than the threshold, the two are compared: if the word frequency of "braiding" is greater than that of "volume", "braiding" is treated as one verb; if it is less, "volume" is treated as one verb.
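The threshold rule for choosing between a compound verb and one of its component verbs can be written as a small decision function. This is a sketch under assumptions: the name `choose_verb` is hypothetical, and the text does not say what happens on a tie or when neither form exceeds the threshold, so the function returns `None` in those cases.

```python
def choose_verb(compound, part, freq, threshold):
    """Pick the verb form to keep, following the word-frequency rule.

    freq maps each verb to its word frequency in the text data.
    Returns the chosen verb, or None where the rule gives no answer.
    """
    c = freq.get(compound, 0)
    p = freq.get(part, 0)
    c_ok, p_ok = c > threshold, p > threshold
    if c_ok and p_ok:
        if c > p:
            return compound   # merge: keep the compound as one verb
        if c < p:
            return part       # split: keep the component verb
        return None           # tie: not specified in the text
    if c_ok:
        return compound
    if p_ok:
        return part
    return None               # neither exceeds the threshold
```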
Step S103: determine a verb that matches a proper noun as a label.
In the embodiment of the present invention, in a given sentence of the text data, if there is a verb that has an association relationship with a proper noun in that sentence, that verb is determined to be the label matching the proper noun.
In a specific implementation, the matching of proper nouns and verbs is constrained by the meaning of the proper noun itself: a verb that matches a proper noun should be associated with that proper noun. For example, if the proper noun is "information system" and the verb is "maintenance", "maintenance" can be recognized as associated with "information system", so the verb "maintenance" can be taken as a label. If the proper noun is "product appearance" and the verb is "maintenance", "maintenance" can be recognized as not associated with "product appearance"; "product appearance" can instead be recognized as associated with the verb "design", so the verb "design" can be taken as a label.
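One simple way to picture step S103 is a lookup table from each proper noun to the verbs associated with it. The text does not specify how the association relationship is established, so the table `ASSOCIATIONS` and the function `labels_for_sentence` below are assumptions for illustration, populated only with the examples given in the text.

```python
# Hypothetical association table: proper noun -> verbs associated with it.
ASSOCIATIONS = {
    "information system": {"maintenance"},
    "product appearance": {"design"},
}


def labels_for_sentence(tokens, associations):
    """Return the verbs in a sentence that are associated with a proper
    noun appearing in the same sentence, in order of first appearance."""
    allowed = set()
    for token in tokens:
        allowed |= associations.get(token, set())
    labels = []
    for token in tokens:
        if token in allowed and token not in labels:
            labels.append(token)
    return labels
```

In practice the association could come from a hand-maintained table like this one or from a semantic model; the sketch only shows the sentence-level check.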
Step S104: match the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
In a specific implementation, the position of a word in the text data can be determined by means of a regular expression.
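Locating words with a regular expression can be sketched with Python's `re` module. The helper name `find_positions` is an assumption. Each term is escaped with `re.escape` so that any punctuation in a name is matched literally.

```python
import re


def find_positions(text, terms):
    """Map each term to the list of character offsets where it starts."""
    return {
        term: [m.start() for m in re.finditer(re.escape(term), text)]
        for term in terms
    }
```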
In a specific implementation, the paragraph can be used as the boundary: a label is matched with the working individual names in the paragraph in which it appears.
In a specific implementation, the sentence can be used as the boundary: a label is matched with the working individual names in the sentence in which it appears.
In the embodiment of the present invention, in a given sentence of the text data, if the working individual names are in a coordinate relationship, the label in the sentence is matched to all working individual names in the sentence. If the working individual names are independent of each other, the label is matched to the working individual name that is nearest to the label in the sentence, measured in number of characters.
For example, in the sentence "first and second department carry out product design jointly", the working individual names "first" and "second department" are in a coordinate relationship, so the label "design" is matched to both the working individual name "first" and the working individual name "second department". In the sentence "under the guidance of first, second department completes product design", the working individual names "first" and "second department" are independent of each other, and the number of characters between the label "design" and the working individual name "second department" is less than the number of characters between the label "design" and the working individual name "first", so the label "design" is matched to the working individual name "second department".
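The matching rule can be sketched as follows, using the two example sentences as given in the text. The text does not say how a coordinate relationship is detected, so the `coordinated` helper uses a simple heuristic that is purely an assumption: two names count as coordinated when they are joined directly by the word "and". As a simplification, if any two names are coordinated the label is matched to every name found in the sentence. All function names are hypothetical.

```python
import re
from itertools import permutations


def spans(sentence, term):
    """(start, end) character offsets of every occurrence of term."""
    return [(m.start(), m.end()) for m in re.finditer(re.escape(term), sentence)]


def gap(sentence, name, label):
    """Fewest characters separating an occurrence of name from one of label."""
    best = None
    for ns, ne in spans(sentence, name):
        for ls, le in spans(sentence, label):
            d = (ls - ne) if ne <= ls else ((ns - le) if le <= ns else 0)
            if best is None or d < best:
                best = d
    return best


def coordinated(sentence, names, connective="and"):
    """Heuristic (assumption): two names joined directly by the connective."""
    for a, b in permutations(names, 2):
        pattern = re.escape(a) + r"\s+" + re.escape(connective) + r"\s+" + re.escape(b)
        if re.search(pattern, sentence):
            return True
    return False


def match_label(sentence, label, names):
    """Return the working individual names the label is matched to."""
    present = [n for n in names if spans(sentence, n)]
    if not present or not spans(sentence, label):
        return []
    if len(present) > 1 and coordinated(sentence, present):
        return present  # coordinate relationship: match to all names
    # independent names: pick the one nearest to the label
    return [min(present, key=lambda n: gap(sentence, n, label))]
```

A production system would more likely rely on syntactic parsing to detect coordination; the string heuristic is only meant to make the two branches of the rule concrete.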
Referring to Fig. 2, which is a structural diagram of the apparatus 20 for determining labels of working individuals provided by an embodiment of the present invention, the apparatus specifically includes an acquiring unit 201, an extraction unit 202, a label determination unit 203 and a label matching unit 204, in which:
The acquiring unit 201 can be used to obtain text data;
The extraction unit 202 can be used to extract the working individual names, proper nouns and verbs from the text data, where the working individuals include staff members and departments;
The label determination unit 203 can be used to determine a verb that matches a proper noun as a label;
The label matching unit 204 can be used to match the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
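The four-unit structure can be pictured as a small pipeline object in which each unit is a pluggable callable. This is only a structural sketch under assumptions: the class name `LabelDeterminingDevice`, the field names and the dictionary keys are hypothetical, and the text leaves open whether the units are software or hardware modules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LabelDeterminingDevice:
    """Chains the four units; each unit is supplied as a callable."""
    acquire: Callable[[str], str]                                   # unit 201
    extract: Callable[[str], Dict[str, List[str]]]                  # unit 202
    determine_labels: Callable[[Dict[str, List[str]]], List[str]]   # unit 203
    match_labels: Callable[[str, List[str], List[str]], Dict[str, List[str]]]  # unit 204

    def run(self, source: str) -> Dict[str, List[str]]:
        text = self.acquire(source)
        extracted = self.extract(text)
        labels = self.determine_labels(extracted)
        # "names" is a hypothetical key holding the working individual names
        return self.match_labels(text, labels, extracted["names"])
```

Keeping the units as separate callables reflects the description's point that each unit can be configured or replaced independently, for example by swapping in a different extraction algorithm.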
In a specific implementation, a proper noun may be the name of an application system or a device inside the enterprise, or a keyword of a work task, for example information system, analog circuit or product appearance. The proper nouns can be set by the user according to the application scenario.
In a specific implementation, the working individual names, proper nouns and verbs in the text data can be extracted according to a word-sense analysis algorithm and a corresponding term database.
In a specific implementation, by configuring in the word-sense analysis algorithm the word senses of the words to be extracted, the corresponding working individual names, proper nouns and verbs can be extracted from the text data.
In a specific implementation, the word-sense analysis algorithm may be the jieba ("stutter") algorithm, or the user may choose a different word-sense analysis algorithm according to the application scenario.
In a specific implementation, using a preset term database, the text data is searched for working individual names, proper nouns and verbs that match entries in the term database.
In a specific implementation, the term database can be set by the user according to the application scenario.
In a specific implementation, taking the verbs that match proper nouns as labels removes some invalid verbs and so improves the accuracy of the selected labels. Using the positions of the label and of the working individual in the text data as constraints during matching ensures that labels and working individuals are matched correctly. In summary, the scheme makes the determination of a working individual's abilities more objective and more accurate.
In the embodiment of the present invention, the acquiring unit 201 further includes a content capture unit, which can be used to convert the text data into HTML file format and to extract, by means of a crawler, the content of the text data in HTML file format.
In a specific implementation, converting the text data into HTML file format makes it easier for electronic devices such as computers to extract the content of the text data, and effectively increases the processing speed when the volume of text data is very large.
In the embodiment of the present invention, the extraction unit 202 further includes a building subunit and a segmentation subunit. The building subunit can be used to build a working individual name dictionary and a proper noun dictionary from the working individual names and proper nouns in the text data. The segmentation subunit can be used to segment the text data according to the working individual name dictionary and the proper noun dictionary, and to remove the stop words from the text data after segmentation.
In a specific implementation, removing the stop words can effectively improve the efficiency of data processing.
In the embodiment of the present invention, the extraction unit 202 further includes a statistics subunit. The building subunit can also be used to build a verb dictionary from the verbs in the text data and to select the effective verbs in the verb dictionary. The statistics subunit can be used to count the word frequency, in the text data, of the effective verbs in the verb dictionary.
In a specific implementation, because verbs are used as the labels of working individuals, building a verb dictionary makes it easier to determine later which verbs serve as labels.
In a specific implementation, a label indicates an ability that a working individual can demonstrate at work, so not every verb in the text data can serve as a label. For example, the verb "writing" in the sentence "writing words" cannot serve as a label. Selecting the effective verbs in the verb dictionary means selecting the verbs that can be used to indicate a working individual's abilities at work.
In the embodiment of the present invention, the extraction unit 202 further includes a screening subunit, which can be used to select the effective verbs in the verb dictionary by means of a semantic analysis algorithm or a preset verb screening dictionary.
In a specific implementation, when a word-sense analysis algorithm is used, the effective verbs can be selected by setting the word senses of the verbs that may serve as labels. Alternatively, the unwanted verbs can be removed by setting the word senses of the verbs that may not serve as labels. Either way, the effective verbs are obtained.
In a specific implementation, when a verb screening dictionary is used, the effective verbs can be selected by entering into the verb screening dictionary the verbs that may serve as labels. Alternatively, the unwanted verbs can be removed by entering into the verb screening dictionary the verbs that may not serve as labels. Either way, the effective verbs are obtained.
In a specific implementation, selecting as effective verbs the verbs that can serve as labels removes some verbs that cannot be used to describe a working individual's abilities, which improves the accuracy with which those abilities are determined.
In the embodiment of the present invention, the extraction unit 202 further includes a word processing subunit, which can be used to split or merge the effective verbs in the text data according to the word frequency.
In a specific implementation, some verbs are composed of two or more verbs. For example, the verb "braiding" is composed of the verb "volume" and the verb "knitting". When choosing verbs, the word frequency can therefore be used to decide whether to split, treating "volume" and "knitting" each as a separate verb, or to merge, treating "braiding" as one verb.
In a specific implementation, a word-frequency threshold can be set by the user according to the actual application scenario. When the word frequency of a verb is higher than the threshold, that verb is treated as a verb in its own right. For example, when the word frequency of the verb "volume" is higher than the threshold, "volume" can be treated as a separate verb; when the word frequency of the verb "braiding" is higher than the threshold, "braiding" can be treated as a separate verb. When the word frequencies of both "braiding" and "volume" are higher than the threshold, the two are compared: if the word frequency of "braiding" is greater than that of "volume", "braiding" is treated as one verb; if it is less, "volume" is treated as one verb.
In the embodiment of the present invention, the label determination unit 203 can also be used, in a given sentence of the text data, to determine a verb to be the label matching a proper noun if that verb has an association relationship with the proper noun in the sentence.
In a specific implementation, the matching of proper nouns and verbs is constrained by the meaning of the proper noun itself: a verb that matches a proper noun should be associated with that proper noun. This improves the accuracy with which a working individual's abilities are determined.
For example, if the proper noun is "information system" and the verb is "maintenance", "maintenance" can be recognized as associated with "information system", so the verb "maintenance" can be taken as a label. If the proper noun is "product appearance" and the verb is "maintenance", "maintenance" can be recognized as not associated with "product appearance"; "product appearance" can instead be recognized as associated with the verb "design", so the verb "design" can be taken as a label.
In the embodiment of the present invention, the label matching unit 204 can also be used, in a given sentence of the text data, to match the label in the sentence to all working individual names in the sentence if the working individual names are in a coordinate relationship, and to match the label to the working individual name nearest to the label in the sentence, measured in number of characters, if the working individual names are independent of each other.
In a specific implementation, the position of a word in the text data can be determined by means of a regular expression.
In a specific implementation, the paragraph can be used as the boundary: a label is matched with the working individual names in the paragraph in which it appears.
In a specific implementation, the sentence can be used as the boundary: a label is matched with the working individual names in the sentence in which it appears.
For example, in the sentence "first and second department carry out product design jointly", the working individual names "first" and "second department" are in a coordinate relationship, so the label "design" is matched to both the working individual name "first" and the working individual name "second department". In the sentence "under the guidance of first, second department completes product design", the working individual names "first" and "second department" are independent of each other, and the number of characters between the label "design" and the working individual name "second department" is less than the number of characters between the label "design" and the working individual name "first", so the label "design" is matched to the working individual name "second department".
It should be noted that the apparatus for determining the label of a working individual according to the embodiments of the present application may be integrated into an electronic device as a software module and/or a hardware module; in other words, the electronic device may include the apparatus. For example, the apparatus may be a software module in the operating system of the electronic device, or an application program developed for it; the apparatus may equally be one of the many hardware modules of the electronic device.
In another embodiment of the present application, the apparatus and the electronic device may also be separate devices (for example, a server); the apparatus may be connected to the electronic device through a wired and/or wireless network and exchange information with it in an agreed data format.
An electronic device provided by an embodiment of the present application comprises: one or more processors and a memory; and computer program instructions stored in the memory which, when run by the processor, cause the processor to execute the method for determining the label of a working individual according to any of the above embodiments.
The processor may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the electronic device to perform desired functions.
The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor may run the program instructions to implement the steps of the method for determining the label of a working individual in the embodiments of the present application described above and/or other desired functions. Information such as light intensity, compensation light intensity, and the position of an optical filter may also be stored in the computer-readable storage medium.
In one example, the electronic device may further include an input device and an output device, and these components are interconnected through a bus system and/or another form of connection mechanism.
The output device may output various information to the outside, and may include, for example, a display, a loudspeaker, a printer, and a communication network together with the remote output devices connected to it.
In addition, depending on the specific application, the electronic device may include any other appropriate components.
In addition to the above method and device, an embodiment of the present application may also be a computer program product comprising computer program instructions which, when run by a processor, cause the processor to perform the steps of the method for determining the label of a working individual according to any of the above embodiments.
The computer program product may contain program code, written in any combination of one or more programming languages, for performing the operations of the embodiments of the present application. The programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on a remote computing device or server.
In addition, an embodiment of the present application may also be a computer-readable storage medium storing computer program instructions which, when run by a processor, cause the processor to perform the steps of the method for determining the label of a working individual according to the various embodiments of the present application, as described in the method part of this specification.
The computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of readable storage media (a non-exhaustive list) include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
Although the present invention is disclosed as above, it is not limited thereto. Any person skilled in the art may make various changes or modifications without departing from the spirit and scope of the present invention, and the protection scope of the present invention shall therefore be subject to the scope defined by the claims.

Claims (11)

1. A method for determining the label of a working individual, characterized by comprising:
obtaining the content of text data;
extracting working individual names, proper nouns, and verbs from the text data, wherein the working individuals include staff members and departments;
determining a verb that matches a proper noun as a label; and
matching the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
2. The method for determining the label of a working individual according to claim 1, characterized in that obtaining the content of the text data comprises:
converting the text data into HTML file format; and
extracting, by means of a crawler, the content of the text data in HTML file format.
3. The method for determining the label of a working individual according to claim 1, characterized in that, after extracting the working individual names, proper nouns, and verbs from the text data, the method further comprises:
constructing a working individual name dictionary and a proper noun dictionary from the working individual names and proper nouns in the text data; and
segmenting the text data into words according to the working individual name dictionary and the proper noun dictionary, and filtering out the stop words in the text data after segmentation.
4. The method for determining the label of a working individual according to claim 1, characterized in that, after extracting the working individual names, proper nouns, and verbs from the text data, the method further comprises:
constructing a verb dictionary from the verbs in the text data;
selecting the effective verbs in the verb dictionary; and
counting the word frequency, in the text data, of the effective verbs in the verb dictionary.
5. The method for determining the label of a working individual according to claim 4, characterized in that selecting the effective verbs in the verb dictionary comprises:
selecting the effective verbs in the verb dictionary by means of a semantic analysis algorithm or a preset verb screening dictionary.
6. The method for determining the label of a working individual according to claim 4, characterized in that, after counting the word frequency, in the text data, of the effective verbs in the verb dictionary, the method further comprises:
splitting or merging the effective verbs in the text data according to the word frequency.
7. The method for determining the label of a working individual according to claim 1, characterized in that determining a verb that matches a proper noun as a label comprises:
in a given sentence of the text data, if a verb having an association relationship with the proper noun exists in the sentence, determining that verb to be the label matching the proper noun.
8. The method for determining the label of a working individual according to claim 1, characterized in that matching the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data comprises:
in a given sentence of the text data, if the working individual names are in a coordinate relationship, matching the label in the sentence to all working individual names in the sentence; if the working individual names are mutually independent, matching the label to the working individual name in the sentence that is closest to the label in number of characters.
9. An apparatus for determining the label of a working individual, characterized by comprising an acquiring unit, an extraction unit, a label determination unit, and a label matching unit, wherein:
the acquiring unit is configured to obtain the content of text data;
the extraction unit is configured to extract working individual names, proper nouns, and verbs from the text data, wherein the working individuals include staff members and departments;
the label determination unit is configured to determine a verb that matches a proper noun as a label; and
the label matching unit is configured to match the label to the corresponding working individual name according to the position of the working individual name in the text data and the position of the label in the text data.
10. A computer-readable storage medium having computer instructions stored thereon, the computer-readable storage medium being a non-volatile storage medium or a non-transitory storage medium, characterized in that the computer instructions, when run, perform the steps of the method for determining the label of a working individual according to any one of claims 1 to 8.
11. An electronic device comprising a memory and a processor, the memory having computer instructions stored thereon, characterized in that the computer instructions, when run by the processor, perform the steps of the method for determining the label of a working individual according to any one of claims 1 to 8.
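
The verb dictionary, effective-verb selection, and word-frequency steps recited in claims 4 and 5 can be sketched in Python as follows. This is a minimal sketch under assumptions rather than the claimed implementation: the input is assumed to be already segmented into hypothetical (word, part-of-speech) pairs with verbs tagged "v", the function name is illustrative, and only the preset verb screening dictionary option is shown (the semantic analysis option is not).

```python
from collections import Counter


def effective_verb_frequencies(tagged_tokens, screening_dictionary):
    """Count how often each effective verb occurs in the text data.

    tagged_tokens: iterable of (word, pos) pairs; the "v" tag marking verbs
    is an assumption of this sketch.
    screening_dictionary: preset collection of verbs treated as effective.
    """
    tokens = list(tagged_tokens)
    # Build the verb dictionary from the verbs found in the text data.
    verb_dictionary = {word for word, pos in tokens if pos == "v"}
    # Keep only the verbs that also appear in the preset screening dictionary.
    effective_verbs = verb_dictionary & set(screening_dictionary)
    # Count the word frequency of each effective verb in the text data.
    return Counter(
        word for word, pos in tokens if pos == "v" and word in effective_verbs
    )


tokens = [
    ("A", "nr"), ("design", "v"), ("product", "n"),
    ("is", "v"), ("design", "v"), ("develop", "v"),
]
frequencies = effective_verb_frequencies(tokens, {"design", "develop", "test"})
print(frequencies)
```

The resulting counts are the kind of input that a later step could use when deciding whether to split or merge effective verbs, as in claim 6.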
CN201811637972.6A 2018-12-29 2018-12-29 Method and device for determining label of working individual and computer storage medium Active CN109740157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811637972.6A CN109740157B (en) 2018-12-29 2018-12-29 Method and device for determining label of working individual and computer storage medium

Publications (2)

Publication Number Publication Date
CN109740157A true CN109740157A (en) 2019-05-10
CN109740157B CN109740157B (en) 2023-08-18

Family

ID=66362296

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant