CN114443727A - Human vein data processing method, device, equipment and storage medium - Google Patents

Info

Publication number
CN114443727A
Authority
CN
China
Prior art keywords
data
personal
employee
vein
enterprise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111683522.2A
Other languages
Chinese (zh)
Inventor
黎展
陈开冉
黄俊强
蔡家成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Tungee Technology Co ltd
Original Assignee
Guangzhou Tungee Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Tungee Technology Co ltd
Priority to CN202111683522.2A
Publication of CN114443727A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24573 Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method, a device, equipment and a storage medium for processing personal-connection data. The acquired enterprise personal-connection data is processed according to a preset filtering rule to generate first enterprise personal-connection data, which includes employee personal-connection data. According to a plurality of preset dimensions, dimension data of each employee is extracted from the employee personal-connection data, generating an employee dimension data set corresponding to the employee personal-connection data. Meanwhile, identical personal-connection data in the employee personal-connection data is identified according to a preset fusion rule and fused. The fused employee personal-connection data and the employee dimension data set are then aggregated to generate an enterprise personal-connection data graph. Compared with the prior art, filtering, extracting and fusing the acquired enterprise personal-connection data to generate the graph makes enterprise personal-connection management more convenient, reduces data redundancy, and makes the displayed enterprise personal-connection information richer and more complete.

Description

Personal-connection data processing method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of big data, and in particular to a personal-connection data processing method, device, equipment and storage medium.
Background
In the prior art, when the personal connections of an enterprise's employees need to be queried, two kinds of employee connection data can be found: information about a company's senior executives disclosed by the major public platforms, and information filled in by employees themselves on social platforms. Whether on a public platform or a social platform, however, each query method is vertical: it is confined to one field and cannot cover all employee connections under the enterprise, so the employee information is scattered, isolated and incomplete. Because there are many social platforms and public platforms, it is currently impossible to tell whether employee connection records obtained from several platforms refer to the same person, so the records cannot be put in one-to-one correspondence with people; an employee's connections cannot be searched across platforms in a centralized, unified way; each platform follows its own operating strategy, so the displayed information is uneven, one-sided and limited; and the existing methods cannot obtain other connection information related to the enterprise's connections, which hinders the enterprise's management of its connection information.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method, device, equipment and storage medium for processing personal-connection data, in which the acquired enterprise personal-connection data is filtered, extracted and fused to generate an enterprise personal-connection data graph, so that the enterprise's personal relationships are made clear, enterprise management becomes more convenient, and the displayed enterprise personal-connection information is richer and more complete.
In order to solve the technical problem, the invention provides a personal-connection data processing method, which comprises the following steps:
performing data processing on the acquired enterprise personal-connection data according to a preset filtering rule to generate first enterprise personal-connection data, wherein the first enterprise personal-connection data comprises employee personal-connection data;
extracting, according to a plurality of preset dimensions, dimension data of each employee in the employee personal-connection data, and generating an employee dimension data set corresponding to the employee personal-connection data;
meanwhile, identifying identical personal-connection data in the employee personal-connection data according to a preset fusion rule, so as to fuse the identical personal-connection data;
and aggregating the fused employee personal-connection data and the employee dimension data set to generate an enterprise personal-connection data graph.
Further, performing data processing on the acquired enterprise personal-connection data according to a preset filtering rule to generate first enterprise personal-connection data specifically comprises:
dividing the acquired enterprise personal-connection data from different data sources into non-avatar data and avatar data, processing the non-avatar data and the avatar data according to their corresponding preset filtering rules, and collecting the processed non-avatar data and avatar data to generate the first enterprise personal-connection data.
Further, the plurality of preset dimensions includes educational experience, colleagues, alumni, business partners, bidding partners, investment partners and news updates.
Further, identifying identical personal-connection data in the employee personal-connection data according to a preset fusion rule, so as to fuse the identical personal-connection data, specifically comprises:
subjecting the employee personal-connection data from different data sources to multiple judgments according to the preset fusion rule, identifying the identical personal-connection data and the non-identical personal-connection data in the employee personal-connection data, excluding the non-identical personal-connection data, and fusing the identical personal-connection data.
Further, the invention also provides a personal-connection data processing device, comprising a filtering module, a dimension data extraction module, a fusion module and an aggregation module;
the filtering module is used for performing data processing on the acquired enterprise personal-connection data according to a preset filtering rule to generate first enterprise personal-connection data, wherein the first enterprise personal-connection data comprises employee personal-connection data;
the dimension data extraction module is used for extracting, according to a plurality of preset dimensions, dimension data of each employee in the employee personal-connection data, and generating an employee dimension data set corresponding to the employee personal-connection data;
the fusion module is used for identifying identical personal-connection data in the employee personal-connection data according to a preset fusion rule, so as to fuse the identical personal-connection data;
and the aggregation module is used for aggregating the fused employee personal-connection data and the employee dimension data set to generate an enterprise personal-connection data graph.
Further, the filtering module being used for performing data processing on the acquired enterprise personal-connection data according to a preset filtering rule to generate first enterprise personal-connection data specifically comprises:
dividing the acquired enterprise personal-connection data from different data sources into non-avatar data and avatar data, processing the non-avatar data and the avatar data according to their corresponding preset filtering rules, and collecting the processed non-avatar data and avatar data to generate the first enterprise personal-connection data.
Further, the plurality of dimensions preset in the dimension data extraction module includes educational experience, colleagues, alumni, business partners, bidding partners, investment partners and news updates.
Further, the fusion module being used for identifying identical personal-connection data in the employee personal-connection data according to a preset fusion rule, so as to fuse the identical personal-connection data, specifically comprises:
subjecting the employee personal-connection data from different data sources to multiple judgments according to the preset fusion rule, identifying the identical personal-connection data and the non-identical personal-connection data in the employee personal-connection data, excluding the non-identical personal-connection data, and fusing the identical personal-connection data.
Further, the invention also provides a terminal device, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the personal-connection data processing method according to any one of the above.
Further, the invention also provides a computer-readable storage medium comprising a stored computer program, wherein, when the computer program runs, the device on which the computer-readable storage medium is located is controlled to execute the personal-connection data processing method according to any one of the above.
Compared with the prior art, the personal-connection data processing method, device, equipment and storage medium of the invention have the following beneficial effects:
According to the technical solution, the acquired enterprise personal-connection data is classified and filtered to generate first enterprise personal-connection data that includes employee personal-connection data, so that the first enterprise personal-connection data is specific down to the level of individual employees and is more complete. According to a plurality of preset dimensions, dimension data of each employee is extracted from the employee personal-connection data, and an employee dimension data set corresponding to the employee personal-connection data is generated. Meanwhile, identical personal-connection data in the employee personal-connection data is identified according to a preset fusion rule and fused, which deduplicates the identical personal-connection data and reduces data redundancy. The fused employee personal-connection data and the employee dimension data set are then aggregated to generate an enterprise personal-connection data graph. Compared with the prior art, the technical solution of the invention generates the enterprise personal-connection data graph by filtering, extracting and fusing the acquired enterprise personal-connection data, so that the enterprise's personal relationships are made clear, enterprise personal-connection management becomes more convenient, data redundancy is reduced, and the displayed enterprise personal-connection information is richer and more complete.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of the personal-connection data processing method provided by the invention;
FIG. 2 is a schematic structural diagram of an embodiment of the personal-connection data processing device provided by the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to FIG. 1, which is a schematic flow chart of an embodiment of the personal-connection data processing method provided by the invention, the method includes steps 101 to 104, as follows:
Step 101: According to a preset filtering rule, data processing is performed on the acquired enterprise personal-connection data to generate first enterprise personal-connection data, where the first enterprise personal-connection data includes employee personal-connection data.
In this embodiment, enterprise personal-connection data disclosed by the major public platforms and social platforms is acquired with crawler technology. The enterprise personal-connection data includes, but is not limited to, the person's name, associated enterprise, avatar, resume, associated school and other information for each employee of an enterprise.
In this embodiment, the acquired enterprise personal-connection data from different data sources is divided by avatar into non-avatar data and avatar data; the non-avatar data and the avatar data are each processed according to their corresponding preset filtering rules; and the processed non-avatar data and avatar data are collected to generate the first enterprise personal-connection data.
In this embodiment, the preset filtering rule corresponding to the non-avatar data is as follows. Whether the non-avatar data is an English name is judged: it is judged to be an English name when it contains English. Whether the data format of the non-avatar data is str is judged: str data is retained and non-str data is filtered out. Whether the length of the non-avatar data is greater than 1 and less than 100 is judged: data within this range is retained, and data whose length is less than or equal to 1, or greater than or equal to 100, is filtered out. Whether the English name contains special characters is judged: if there are none, the non-avatar data is retained; otherwise the special characters are filtered out. Here special characters are anything other than Chinese, English, interpuncts, brackets, Chinese and English spaces, colons, punctuation marks, periods, hyphens, Chinese hyphens (-), single quotation marks, semicolons and "<->". Finally, whether there are spaces before or after the non-avatar data is judged, and any such spaces are deleted.
In this embodiment, the preset filtering rule corresponding to the non-avatar data further includes the following. Whether the non-avatar data is a Chinese name is judged: it is judged to be a Chinese name when it contains no English. Whether the data format of the non-avatar data is str is judged: str data is retained and non-str data is filtered out. Whether the length of the non-avatar data is greater than 1 and less than 50 is judged: data within this range is retained, and data whose length is less than or equal to 1, or greater than or equal to 50, is filtered out. Whether the Chinese name contains special characters is judged: if there are none, the non-avatar data is retained; otherwise the special characters are filtered out. Here special characters are anything other than Chinese, English, interpuncts, brackets, Chinese and English spaces, colons, punctuation marks, periods and hyphens. Finally, whether there are spaces before or after the non-avatar data is judged, and any such spaces are deleted.
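The name-filtering rules for non-avatar data can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `filter_name` and the whitelist `ALLOWED` are hypothetical, and the whitelist is a simplified approximation of the permitted-character lists given above.

```python
import re

# Hypothetical whitelist (an assumption, not the patent's exact list):
# CJK ideographs, ASCII letters, interpunct, brackets, spaces,
# colon, period and hyphen.
ALLOWED = re.compile(r"[\u4e00-\u9fffA-Za-z·()（）\s\u3000:：.。\-]")


def filter_name(value):
    """Return the cleaned name, or None when the record is filtered out."""
    if not isinstance(value, str):           # non-str data is filtered out
        return None
    # The text treats any value containing English as an English name.
    is_english = re.search(r"[A-Za-z]", value) is not None
    max_len = 100 if is_english else 50      # length limits stated in the text
    if not (1 < len(value) < max_len):
        return None
    # Drop every character that is not whitelisted, then trim outer spaces.
    cleaned = "".join(ch for ch in value if ALLOWED.fullmatch(ch))
    return cleaned.strip()
```

The length check is applied to the raw value before special characters are removed, which follows the order in which the rules are listed.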
In this embodiment, if the non-avatar data is of the enterprise type, whether it contains a foreign country name such as 'USA', 'Japan', 'Germany' or 'France' is judged. Data that contains such a foreign name is allowed to contain Japanese and Latin characters ('a', 'è', 'é', 'ê', 'ì', 'í', 'ò', 'ó', 'ù', 'ú' and further characters that appear only as inline images, Figures RE-GDA0003585822580000061 to RE-GDA0003585822580000068, in the original publication), and the enterprise is judged to be a non-local enterprise. Data that contains no such foreign name is not allowed to contain Japanese and Latin characters, and the enterprise is judged to be a local enterprise.
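One possible reading of this enterprise-type rule is sketched below. Everything in it is an assumption made for illustration: the Chinese country tokens in `FOREIGN_NAMES`, the character range in `JP_OR_LATIN` (kana plus only the accented letters that survive in the text), and the function name `classify_enterprise`.

```python
import re

# Assumed Chinese forms of 'USA', 'Japan', 'Germany', 'France'.
FOREIGN_NAMES = ("美国", "日本", "德国", "法国")
# Hiragana/katakana plus the accented Latin letters legible in the text.
JP_OR_LATIN = re.compile(r"[\u3040-\u30ffèéêìíòóùú]")


def classify_enterprise(name):
    """Return (label, valid) for an enterprise-type name.

    A name containing a foreign country token is labelled 'non-local' and
    may contain Japanese/Latin characters; any other name is labelled
    'local' and is only valid when it contains none of those characters.
    """
    if any(token in name for token in FOREIGN_NAMES):
        return "non-local", True
    return "local", JP_OR_LATIN.search(name) is None
```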
In this embodiment, processing the non-avatar data and the avatar data according to their corresponding preset filtering rules removes invalid data from the enterprise personal-connection data acquired from different data sources, and verifies and parses the data in a uniform way to ensure that the acquired data is correct.
Step 102: According to a plurality of preset dimensions, dimension data of each employee in the employee personal-connection data is extracted, and an employee dimension data set corresponding to the employee personal-connection data is generated.
In this embodiment, the plurality of preset dimensions includes educational experience, colleagues, alumni, business partners, bidding partners, investment partners and news updates.
In this embodiment, using big-data processing capability and AI algorithm capability, dimension data of each employee is extracted from the employee personal-connection data according to the plurality of preset dimensions. As an example in this embodiment: for the educational experience of each employee connection, the resume data of the employee connection is obtained, and the corresponding educational experience, including the school graduated from, the department and major, and the like, is identified by preset rules and an AI algorithm model; for the colleagues of each employee connection, the corresponding colleagues are extracted by obtaining the enterprise associated with the employee connection; for the alumni of each employee connection, the corresponding alumni are extracted by obtaining the school associated with the employee connection; for the business partners of each employee connection, an AI algorithm model identifies, in news about the associated enterprises, cooperation between two companies such as joining forces or signing agreements, and if the two companies cooperate, their senior executives are regarded as business partners and the business-partner information is extracted; for the bidding partners of each employee connection, whether the two companies with which two people are associated have a bidding cooperation relationship is judged, and if the enterprises have such a relationship and both people are senior executives, the two people are regarded as bidding partners and the corresponding bidding partners are extracted; for the investment partners of each employee connection, the investment situation corresponding to the connection is examined, and if two people are shareholders of the same company at the same time, they are regarded as investment partners and the corresponding investment partners are extracted; for the news updates of each employee connection, relevant news reports are crawled, the news updates of the employee connection are matched using certain rules, and news mentioning the corresponding person is extracted from the news associated with the enterprise to form the association.
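Three of these dimensions (colleagues, alumni and investment partners) reduce to pairing people who share an attribute value, which can be sketched as below. The record layout and the names `pair_by_shared`, `companies`, `schools` and `shareholdings` are hypothetical; the AI-model-based dimensions (educational experience, business partners, news updates) are not covered.

```python
from collections import defaultdict
from itertools import combinations


def pair_by_shared(records, field):
    """Pair up people who share at least one value in `field`."""
    groups = defaultdict(set)
    for rec in records:
        for value in rec.get(field, []):
            groups[value].add(rec["name"])
    pairs = set()
    for members in groups.values():
        # Sorting makes each pair appear once, in a stable order.
        pairs.update(combinations(sorted(members), 2))
    return pairs


# Toy records, invented purely for illustration.
employees = [
    {"name": "A", "companies": ["X"], "schools": ["S1"], "shareholdings": ["X"]},
    {"name": "B", "companies": ["X"], "schools": ["S2"], "shareholdings": ["X", "Y"]},
    {"name": "C", "companies": ["Y"], "schools": ["S1"], "shareholdings": ["Y"]},
]

dimension_set = {
    "colleagues": pair_by_shared(employees, "companies"),
    "alumni": pair_by_shared(employees, "schools"),
    "investment_partners": pair_by_shared(employees, "shareholdings"),
}
```

With the toy data, A and B come out as colleagues, A and C as alumni, and both A/B and B/C as investment partners.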
Step 103: Meanwhile, identical personal-connection data in the employee personal-connection data is identified according to a preset fusion rule, so as to fuse the identical personal-connection data.
In this embodiment, after the employee personal-connection data from different data sources has been subjected to multiple judgments according to the preset fusion rule, the identical personal-connection data and the non-identical personal-connection data in the employee personal-connection data are identified, the non-identical personal-connection data is excluded, and the identical personal-connection data is fused.
In this embodiment, preliminary data processing is performed on the acquired employee personal-connection data from different data sources as a sequence of judgments. Whether the person's name or the associated enterprise name in the employee personal-connection data is empty is judged: if so, the data is discarded; if not, the next judgment is made. Whether the person's name is an enterprise name is judged: if so, the data is discarded; if not, the next judgment is made. Whether the name is purely numeric is judged: if so, the data is discarded; if not, the next judgment is made. Whether the name contains none of Chinese, English, Japanese and Latin text is judged: if so, the data is discarded; if not, the next judgment is made. Whether the name contains 'mobile phone user' is judged: if not, the next judgment is made; if so, whether the name lacks two or more other Chinese characters is judged, and the data is discarded if it does and passed to the next judgment if it does not. Whether the name contains 'appointments' is judged: if not, the data is retained; if so, whether the data comes from the preset platform is judged, and if not, the data is retained; if it does come from the preset platform, whether the number of enterprises owned is greater than 1000 is judged, and the data is retained if not and discarded if so. Meanwhile, whether fields such as 'position, establishment, education experience' have the value 'none' is judged, and if so, the field is set to empty. Whether the resume contains no Chinese characters is also judged: a resume that contains Chinese characters is retained, and one that contains none is discarded.
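The first five screening judgments can be sketched as a chain of early returns. This is a simplified sketch under stated assumptions: `keep_record`, the record keys `name` and `company`, the Unicode ranges and the Chinese token `手机用户` (assumed original of 'mobile phone user') are all hypothetical, and the 'appointments' and resume checks are omitted.

```python
import re

HAN = r"\u4e00-\u9fff"
# Chinese, English, kana and accented Latin letters (approximate ranges).
SCRIPTS = re.compile(rf"[{HAN}A-Za-z\u3040-\u30ff\u00c0-\u024f]")
PHONE_USER = "手机用户"  # assumed original of 'mobile phone user'


def keep_record(rec, is_enterprise_name=lambda name: False):
    """Return True when a record survives the preliminary screening."""
    name, company = rec.get("name"), rec.get("company")
    if not name or not company:        # empty name or enterprise name
        return False
    if is_enterprise_name(name):       # the 'person' is really an enterprise
        return False
    if name.isdigit():                 # purely numeric name
        return False
    if not SCRIPTS.search(name):       # no Chinese/English/Japanese/Latin text
        return False
    if PHONE_USER in name:
        rest = name.replace(PHONE_USER, "")
        # Discard unless at least two other Chinese characters remain.
        if len(re.findall(rf"[{HAN}]", rest)) < 2:
            return False
    return True
```

The enterprise-name test is passed in as a callable so that the sketch stays independent of how that check is implemented.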
As an example in this embodiment, when judging whether a person's name is an enterprise name, if the name is 4 or more characters long and contains one or more of the preset nouns, the name is regarded as that of an enterprise or organization, where the preset nouns include: companies, schools, universities, associations, schools, groups, elementary schools, hotels, supermarkets, internet cafes, movie theaters, bookstores.
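A minimal sketch of this enterprise-name check follows. The Chinese nouns in `ENTERPRISE_NOUNS` are assumed back-translations of the listed words, and the function name is hypothetical.

```python
# Assumed Chinese originals of the preset nouns listed above.
ENTERPRISE_NOUNS = (
    "公司", "学校", "大学", "协会", "集团", "小学",
    "酒店", "超市", "网吧", "电影院", "书店",
)


def is_enterprise_name(name):
    """True when the name is 4+ characters long and contains a preset noun."""
    return len(name) >= 4 and any(noun in name for noun in ENTERPRISE_NOUNS)
```

The length threshold keeps short strings that merely equal a noun from being flagged, while four-character personal names without any preset noun pass through.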
In this embodiment, after the preliminary data processing of the acquired employee personal-connection data from different data sources, the name-processed preset platform data, internal-database business-registration data and special data sources are obtained. The preset platform data and the internal-database business-registration data are combined: based on the internal-database business-registration data, company plus person's name is matched against the preset platform data to generate a unique identifier, and whether the preset platform actually has the unique identifier is judged. If not, the employee connection is regarded as an independent business-registration connection and a corresponding connection with a connection ID is generated; if so, the connection is regarded as uniquely identified on the preset platform, the preset platform data and the internal-database business-registration data are fused, and a corresponding connection with a connection ID is generated. Whether the connections generated with these two kinds of connection ID need to be deleted is then judged: if so, the connection is discarded; if not, the connections with connection IDs are gathered and fused with the employee personal-connection data from other data sources in the form of company plus person's name. Whether a connection ID exists after the fusion by company plus person's name is judged: if so, the connection information is fused according to the connection ID; if not, the connection is regarded as one whose sources include neither the preset platform nor the internal-database business-registration data, and whether its only source is a microblog is judged. If so, the information is discarded; if not, whether its sources include none of the internal-database business-registration data, the preset platform, the stock market and duplicate names of senior executives of leading enterprises is judged. If the sources do include one of these, a new connection ID is created and the connection information is fused; if they include none of these, whether the connection comes solely from Maimai/LinkedIn is judged: if not, the information is discarded, and if so, a new connection ID is created and the connection information is fused. In this way the uniqueness of each connection is achieved.
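The core idea of this step, matching records by company plus person's name and assigning one connection ID per matched group, can be sketched as follows. The sketch is deliberately much simpler than the decision sequence described above: the source labels (`registry`, `platform`, `weibo`), the record layout and the function name `fuse` are hypothetical, and only the "single microblog source is discarded" rule is modelled.

```python
from itertools import count


def fuse(records):
    """Fuse records sharing a (company, name) key and assign connection IDs."""
    ids = count(1)
    fused = {}
    for rec in records:
        key = (rec["company"], rec["name"])
        entry = fused.setdefault(key, {"id": None, "sources": set(), "fields": {}})
        entry["sources"].add(rec["source"])
        # First value seen for a field wins; later sources only fill gaps.
        for field, value in rec.get("fields", {}).items():
            entry["fields"].setdefault(field, value)
    result = []
    for (company, name), entry in fused.items():
        if entry["sources"] == {"weibo"}:   # microblog-only data is discarded
            continue
        entry["id"] = next(ids)
        result.append({"company": company, "name": name, **entry})
    return result


# Toy records, invented purely for illustration.
records = [
    {"company": "X", "name": "张三", "source": "registry", "fields": {"position": "CEO"}},
    {"company": "X", "name": "张三", "source": "platform",
     "fields": {"school": "S1", "position": "Chief"}},
    {"company": "Y", "name": "李四", "source": "weibo"},
]
connections = fuse(records)
```

Letting the first source win on conflicting fields is a design choice made only for this sketch; the text does not state how conflicting field values are resolved.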
In this embodiment, name processing means the following: full-width characters are converted to half-width characters; all Chinese and English double and single quotation marks are converted to English single quotation marks; all brackets are converted to English brackets; all symbols are converted to spaces; if the text outside the brackets does not contain more than two Chinese characters and the text inside the brackets does, only the Chinese characters inside the brackets are kept; the brackets and the content inside them are deleted; the name may contain only Chinese, English, decimal points, spaces, and Japanese and Latin characters, and all other characters are deleted; decimal points and spaces before and after the name are deleted; if the name contains Chinese and no English, Japanese or Latin characters, decimal points and spaces are deleted; if the name contains Chinese characters and there are no English, Japanese or Latin characters between the Chinese characters, decimal points and spaces between the Chinese characters are deleted; if the name contains the Chinese characters for 'mobile phone user' and more than two other Chinese characters, the Chinese characters other than 'mobile phone user' are kept; if the name contains more than two Chinese characters, special characters are removed and only the pure Chinese characters are kept; if the name contains no Chinese but contains English, only the pure English is kept; English letter case is unified by default; and consecutive whitespace characters become a single English space.
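A simplified sketch of this name processing is given below. It covers only a subset of the rules (width conversion, bracket handling, the character whitelist, trimming, and whitespace collapsing), and it rests on several assumptions: NFKC normalization is used as a stand-in for full-width to half-width conversion, lower case is chosen as the unified English case, and the name `normalize_name` and the Unicode ranges are hypothetical.

```python
import re
import unicodedata

HAN = r"\u4e00-\u9fff"
FOREIGN = r"A-Za-z\u3040-\u30ff\u00c0-\u024f"   # English, kana, accented Latin
NOT_ALLOWED = re.compile(rf"[^{HAN}{FOREIGN}. ]")


def normalize_name(name):
    """Apply a subset of the name-processing rules to one name."""
    s = unicodedata.normalize("NFKC", name)      # full-width -> half-width
    # Convert remaining bracket styles to English parentheses.
    s = re.sub(r"[\[\]{}【】]", lambda m: "(" if m.group() in "[{【" else ")", s)
    s = re.sub(r"\([^()]*\)", "", s)             # delete brackets and content
    s = NOT_ALLOWED.sub("", s)                   # keep whitelisted characters
    s = re.sub(r"\s+", " ", s).strip(". ")       # collapse and trim
    # Pure-Chinese names lose all decimal points and spaces.
    if re.search(rf"[{HAN}]", s) and not re.search(rf"[{FOREIGN}]", s):
        s = re.sub(r"[. ]", "", s)
    return s.lower()                             # unify English letter case
```

For instance, a full-width, bracketed input such as `"　张　三（总经理）"` reduces to `"张三"`, and `"  John   SMITH. "` reduces to `"john smith"`.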
As an example in this embodiment, connections from the preset platform are not fused if their connection IDs are different and their company names are the same; similarly, if the character length of a person's name is 1 after name processing, the original value is used for connection fusion.
Step 104: The fused employee personal-connection data and the employee dimension data set are aggregated to generate an enterprise personal-connection data graph.
In this embodiment, on the basis of the connection fusion, the employee personal-connection data and the employee dimension data set corresponding to the employee connections are associated and aggregated; specifically, all connection associations and associated enterprises are integrated to form a relationship graph. As a preferred solution in this embodiment, an equity penetration chart may be generated by examining the shareholdings corresponding to a connection.
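The aggregation step can be pictured as building an adjacency structure keyed by person and relation type. The sketch below assumes fused connections are dicts with hypothetical `name` and `company` keys and that each dimension is a set of name pairs; `build_graph` is an illustrative name, not the patent's.

```python
from collections import defaultdict


def build_graph(people, dimension_set):
    """Aggregate fused connections and dimension pairs into one graph.

    The result maps person -> relation type -> set of related names.
    """
    graph = defaultdict(lambda: defaultdict(set))
    for person in people:
        graph[person["name"]]["enterprise"].add(person["company"])
    for dimension, pairs in dimension_set.items():
        for a, b in pairs:
            # Relations such as colleague or alumnus are symmetric.
            graph[a][dimension].add(b)
            graph[b][dimension].add(a)
    return graph


# Toy input, invented purely for illustration.
people = [{"name": "A", "company": "X"}, {"name": "B", "company": "X"}]
graph = build_graph(people, {"colleagues": {("A", "B")}})
```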
In this embodiment, based on visualization technology, the relationship graph formed by integrating all connection associations and associated enterprises is displayed together with the associated enterprises, the knowledge graph, business-registration and news updates, and risk information, so that the connection information is richer. Meanwhile, the display interface provides keyword retrieval: the corresponding connections can be searched by enterprise name, person's name and position name, and combined queries can be made using position category, company region, company industry and year of company establishment. Multiple kinds of data, such as related colleagues, partners, investment partners and alumni, are integrated and displayed through the display interface, so that the display page brings together senior-executive and employee information from a plurality of platforms, and a user can search and view the enterprise personal-connection data of a target enterprise across platforms in real time.
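The keyword retrieval with combined filters described here might look like the sketch below. The field names (`name`, `company`, `position`, `region`) and the function `search` are hypothetical, and an in-memory list stands in for whatever index the display interface actually uses.

```python
def search(people, keyword="", **filters):
    """Keyword match on name/company/position, plus exact-match filters."""
    results = []
    for person in people:
        text = " ".join(str(person.get(f, "")) for f in ("name", "company", "position"))
        if keyword and keyword not in text:
            continue
        # Combined query: every supplied filter must match exactly.
        if all(person.get(key) == value for key, value in filters.items()):
            results.append(person)
    return results


# Toy data, invented purely for illustration.
people = [
    {"name": "张三", "company": "X公司", "position": "CEO", "region": "广州"},
    {"name": "李四", "company": "X公司", "position": "CTO", "region": "深圳"},
]
in_guangzhou = search(people, "X公司", region="广州")
```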
Referring to fig. 2, fig. 2 is a schematic structural diagram of an embodiment of a human-vein data processing apparatus provided by the present invention, as shown in fig. 2, the apparatus includes a filtering module 201, a dimension data extracting module 202, a fusing module 203, and an aggregating module 204, which are specifically as follows:
and the filtering module 201 is configured to perform data processing on the obtained enterprise personal vein data according to a preset filtering rule, and generate first enterprise personal vein data, where the first enterprise personal vein data includes employee personal vein data.
In this embodiment, enterprise personal data disclosed by each large public platform and the social platform is obtained based on a crawler technology, where the enterprise personal data includes, but is not limited to, a personal name, an associated enterprise, a head portrait, a resume, an associated school, and other information corresponding to each employee of all employees of an enterprise.
In this embodiment, the obtained enterprise personal image data of different data sources is divided into head portraits, the divided non-head portraits and head portraits are respectively subjected to data processing according to corresponding preset filtering rules, and the non-head portraits and head portraits subjected to data processing are collected to generate first enterprise personal image data.
In this embodiment, the preset filtering rule corresponding to the non-avatar data is as follows: judge whether the non-avatar data is an English name, the data being judged to be an English name when it contains English letters; judge whether the data type of the non-avatar data is str, retaining str data and filtering out non-str data; judge whether the length of the non-avatar data is greater than 1 and less than 100, retaining data within this range and filtering out data whose length is less than or equal to 1 or greater than or equal to 100; judge whether the English name contains special characters, retaining the non-avatar data if it contains none and otherwise filtering out the special characters, where a special character is anything other than Chinese characters, English letters, interpuncts, brackets, Chinese and English spaces, colons, punctuation marks, periods, hyphens, Chinese hyphens (-), single quotation marks, semicolons, and '<->'; and judge whether the non-avatar data has leading or trailing spaces, deleting any that exist.
In this embodiment, the preset filtering rule corresponding to the non-avatar data further includes: judge whether the non-avatar data is a Chinese name, the data being judged to be a Chinese name when it contains no English letters; judge whether the data type of the non-avatar data is str, retaining str data and filtering out non-str data; judge whether the length of the non-avatar data is greater than 1 and less than 50, retaining data within this range and filtering out data whose length is less than or equal to 1 or greater than or equal to 50; judge whether the Chinese name contains special characters, retaining the non-avatar data if it contains none and otherwise filtering out the special characters, where a special character is anything other than Chinese characters, English letters, interpuncts, brackets, Chinese and English spaces, colons, punctuation marks, periods, and hyphens; and judge whether the non-avatar data has leading or trailing spaces, deleting any that exist.
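The name-filtering rules above can be sketched roughly as follows. This is only an illustrative sketch, not the patent's implementation: the function names, the `ALLOWED` character whitelist, and the choice to return `None` for filtered values are assumptions, and the upper bound of 50 for Chinese names is assumed from the retention condition stated in the text.

```python
import re

# Upper length bounds (exclusive) assumed from the text: 100 for English names, 50 for Chinese names.
MAX_LEN = {"english": 100, "chinese": 50}

# Hypothetical whitelist: CJK ideographs, ASCII letters, spaces, and a few of the
# punctuation marks named in the text. The exact character set is an assumption.
ALLOWED = re.compile(r"[\u4e00-\u9fffA-Za-z \u3000·()（）:：.。\-—';；]")


def classify_name(value: str) -> str:
    """Treat a value as an English name if it contains any ASCII letter."""
    return "english" if re.search(r"[A-Za-z]", value) else "chinese"


def filter_name(value):
    """Return the cleaned name, or None if the value is filtered out."""
    if not isinstance(value, str):            # non-str values are filtered out
        return None
    kind = classify_name(value)
    if not (1 < len(value) < MAX_LEN[kind]):  # length must lie strictly between 1 and the bound
        return None
    # Drop every character that is not on the whitelist.
    cleaned = "".join(ch for ch in value if ALLOWED.match(ch))
    # Delete leading and trailing spaces; an empty result counts as filtered.
    return cleaned.strip(" \u3000") or None
```

For instance, a value such as `" John Smith "` would be retained with its outer spaces removed, while an integer or a one-character string would be dropped.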
In this embodiment, if the non-avatar data is of the enterprise type, it is determined whether the data contains a foreign name such as 'USA', 'Japan', 'Germany', or 'France'. Enterprise names that contain such a foreign name are permitted to include Japanese characters and Latin characters (for example 'è', 'é', 'ê', 'ì', 'í', 'ò', 'ó', 'ù', 'ú'; the remaining characters of the list are reproduced only as Figures RE-GDA0003585822580000111 to RE-GDA0003585822580000118) and are judged to be non-local enterprises; enterprise names that contain no such foreign name are not permitted to include Japanese or Latin characters and are judged to be local enterprises.
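A minimal sketch of this local/non-local judgment is given below. The marker list uses the translated names from the text; the Unicode ranges chosen for "Japanese and Latin characters" (kana and the Latin-1 accented letters) and the `(label, valid)` return shape are assumptions made for illustration.

```python
import re

# Foreign-name markers as given (in translation) in the text; a real list would likely be longer.
FOREIGN_MARKERS = ("usa", "japan", "germany", "france")

# Assumed character ranges: Japanese kana plus accented Latin-1 letters.
FOREIGN_CHARS = re.compile(r"[\u3040-\u30ff\u00c0-\u00d6\u00d8-\u00f6\u00f8-\u00ff]")


def classify_enterprise(name: str):
    """Return (label, valid) for an enterprise name.

    Names containing a foreign marker are labelled "non-local" and may contain
    Japanese or accented Latin characters. Names without a marker are labelled
    "local" and are valid only if they contain no such characters.
    """
    lowered = name.lower()
    is_foreign = any(marker in lowered for marker in FOREIGN_MARKERS)
    has_foreign_chars = bool(FOREIGN_CHARS.search(name))
    if is_foreign:
        return "non-local", True
    return "local", not has_foreign_chars
```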
In this embodiment, the divided non-avatar data and avatar data are each processed according to the corresponding preset filtering rules, so that invalid data in the enterprise personal vein data obtained from different data sources is eliminated and the data is verified and analyzed in a uniform way, ensuring the correctness of the obtained data.
The dimension data extraction module 202 is configured to respectively extract dimension data of each employee in the employee personal vein data according to a plurality of preset dimensions, and generate an employee dimension data set corresponding to the employee personal vein data.
In this embodiment, the predetermined plurality of dimensions include educational experiences, colleagues, alumni, business partners, bidding partners, investment partners, and news trends.
In this embodiment, the dimension data of each employee is extracted from the employee personal vein data according to the plurality of preset dimensions, using big data processing and AI algorithm capabilities. As an example in this embodiment: for the education experience of each employee personal vein, the resume data of the employee personal vein is obtained and the education experience, including the institutions attended and the majors studied, is identified through preset rules and an AI algorithm model; for the colleagues of each employee personal vein, the corresponding colleagues are extracted from the associated enterprises of the employee personal vein; for the alumni of each employee personal vein, the corresponding alumni are extracted from the associated school of the employee personal vein; for the business partners of each employee personal vein, an AI algorithm model is used to identify, in the news of the associated enterprises, cooperation between two companies such as joining or signing an agreement, and if the two companies cooperate, the senior executives of both are regarded as business partners and the business partner information is extracted; for the bidding partners of each employee personal vein, it is judged whether the two companies to which two persons belong have a bidding cooperation relationship, and if they do and both persons are senior executives, the two persons are regarded as bidding partners and the corresponding bidding partners are extracted; for the investment partners of each employee personal vein, the investment situation corresponding to the personal vein is examined, and if two persons are simultaneously shareholders of one company, they are regarded as investment partners and the corresponding investment partners are extracted; for the news trends of each employee personal vein, relevant news reports are crawled, the news is matched to the employee personal vein using certain rules, and the news that mentions the corresponding person is extracted from the news associated with the enterprise to form an association.
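Several of these dimensions (colleagues, alumni, investment partners) reduce to "two people share a value of some attribute". The sketch below illustrates only that shared-attribute pairing; the record layout and the field names `enterprises`, `schools`, and `shareholdings` are hypothetical, and the AI-model-based dimensions such as business partners and news trends are not covered.

```python
from collections import defaultdict
from itertools import combinations


def pair_by_shared(records, key):
    """Return the set of person pairs that share at least one value of `key`.

    `records` maps a person identifier to a dict of attribute lists, e.g.
    {"A": {"enterprises": ["X"], "schools": ["S1"]}}.
    """
    groups = defaultdict(set)
    for person, attrs in records.items():
        for value in attrs.get(key, ()):
            groups[value].add(person)      # group people by the shared value
    pairs = set()
    for members in groups.values():
        # Every unordered pair inside a group is related in this dimension.
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs
```

Under these assumptions, calling `pair_by_shared(records, "enterprises")` would yield colleague pairs, `"schools"` alumni pairs, and `"shareholdings"` investment-partner pairs.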
The fusion module 203 is configured to identify the same personal vein data in the employee personal vein data according to a preset fusion rule, so as to fuse the same personal vein data.
In this embodiment, after the employee personal vein data of different data sources are subjected to multiple judgments according to a preset fusion rule, the same personal vein data and different personal vein data in the employee personal vein data are identified, the different personal vein data are excluded, and the same personal vein data are fused.
In this embodiment, preliminary data processing is performed on the employee personal vein data obtained from different data sources through the following sequence of judgments. Judge whether the personal name or the associated enterprise name in the employee personal vein data is empty; if so, discard the data, and if not, proceed to the next judgment. Judge whether the personal name is an enterprise name; if so, discard, and if not, proceed. Judge whether the personal name is purely numeric; if so, discard, and if not, proceed. Judge whether the personal name contains none of Chinese, English, Japanese, and Latin text; if so, discard, and if not, proceed. Judge whether the personal name contains 'mobile phone user'; if not, proceed; if so, judge whether the name contains fewer than two other Chinese characters, discarding if so and proceeding if not. Judge whether the personal name contains 'appointments'; if not, retain; if so, judge whether the data comes from a preset platform; if not, retain; if so, judge whether the number of owned enterprises is greater than 1000, retaining if not and discarding if so. Meanwhile, judge whether fields such as 'position, establishment, education experience' have the value 'none', and if so set the field to empty; also judge whether the resume contains no Chinese characters, retaining the resume if it does contain Chinese characters and discarding it if it does not.
As an example in this embodiment, when determining whether a personal name is an enterprise name, if the name is at least 4 characters long and includes one or more of the preset nouns, the name is regarded as an enterprise or an organization, where the preset nouns include: companies, schools, universities, associations, schools, colleges, groups, elementary schools, hotels, supermarkets, internet cafes, movie theaters, and bookstores.
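The enterprise-name test and the first few discard judgments might look like the sketch below. It is a simplified illustration under stated assumptions: the Chinese nouns in `PRESET_NOUNS` are back-translations guessed from the translated list, the character ranges are assumed, and the 'mobile phone user', 'appointments', and resume checks are omitted.

```python
import re

# Assumed back-translations of the preset nouns (company, school, university, ...).
PRESET_NOUNS = ("公司", "学校", "大学", "协会", "学院", "集团",
                "小学", "酒店", "超市", "网吧", "电影院", "书店")

# Assumed ranges for Chinese, English, Japanese kana, and accented Latin text.
HAS_TEXT = re.compile(r"[\u4e00-\u9fffA-Za-z\u3040-\u30ff\u00c0-\u00ff]")


def looks_like_enterprise(name: str) -> bool:
    """At least 4 characters long and containing one of the preset nouns."""
    return len(name) >= 4 and any(noun in name for noun in PRESET_NOUNS)


def keep_record(person_name: str, enterprise_name: str) -> bool:
    """Apply the first four discard judgments in order; True means retain."""
    if not person_name or not enterprise_name:   # empty name or enterprise
        return False
    if looks_like_enterprise(person_name):       # personal name is really an enterprise
        return False
    if person_name.isdigit():                    # purely numeric name
        return False
    if not HAS_TEXT.search(person_name):         # no Chinese/English/Japanese/Latin text
        return False
    return True
```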
In this embodiment, after the preliminary data processing has been performed on the employee personal vein data obtained from different data sources, the name-processed preset platform data, internal-library business registration data, and special data sources are obtained. The preset platform data and the internal-library business registration data are combined first: based on the internal-library business registration data, the company name plus personal name is matched against the preset platform data to generate a unique identifier, and it is judged whether the preset platform actually has that unique identifier. If not, the employee personal vein is regarded as independent and a corresponding personal vein with a personal vein ID is generated; if so, the employee personal vein is regarded as having a preset-platform unique identifier, the preset platform data and the internal-library business registration data are fused, and a corresponding personal vein with a personal vein ID is generated. It is then judged whether the personal veins generated with these two kinds of personal vein ID need to be deleted; if so, they are discarded; if not, they are merged and fused, in the form of company name plus personal name, with the employee personal vein data of the other data sources. After this fusion it is judged whether a personal vein ID exists. If so, the personal vein information is fused according to the personal vein ID. If not, the personal vein is regarded as one whose sources do not include the preset platform or the internal-library business registration data, and it is judged whether its only source is microblog; if so, the information is discarded. If not, it is judged whether the sources do not include the internal-library business registration data, the preset platform, the stock market, or duplicate names of senior executives of leading enterprises; if not, a new personal vein ID is created and the personal vein information is fused; otherwise, it is judged whether the only source is Maimai or LinkedIn, the information being discarded if not and a new personal vein ID being created to fuse the personal vein information if so. In this way the uniqueness of each personal vein is achieved.
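The core idea of fusing records "in the form of company plus name" can be sketched as keying each record on the (company, name) pair and merging the fields of records that share a key. The sketch below is a heavy simplification and not the decision flow of the embodiment: the record fields, the source label `"weibo"`, the sequential ID scheme, and the first-value-wins merge policy are all assumptions.

```python
import itertools


def fuse(records):
    """Fuse records that share (company, name) and assign each group an ID.

    Each record is a dict with "company", "name", "source" and optional extra
    fields. Groups whose only source is the hypothetical label "weibo" are
    dropped, loosely mirroring the single-microblog discard rule.
    """
    ids = {}
    fused = {}
    counter = itertools.count(1)
    for rec in records:
        key = (rec["company"], rec["name"])
        if key not in ids:                       # first time this person is seen
            ids[key] = next(counter)
            fused[ids[key]] = {"company": rec["company"],
                               "name": rec["name"],
                               "sources": set()}
        entry = fused[ids[key]]
        entry["sources"].add(rec["source"])
        for field, value in rec.items():
            if field not in ("company", "name", "source") and value:
                entry.setdefault(field, value)   # keep the first non-empty value
    return {pid: e for pid, e in fused.items() if e["sources"] != {"weibo"}}
```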
In this embodiment, name processing means the following: full-width characters are converted to half-width characters; all Chinese and English double and single quotation marks are converted to English single quotation marks, and all brackets are converted to English parentheses; all symbols are converted to spaces; if the text outside the brackets is not Chinese text of more than two characters and the text inside the brackets is Chinese text of more than two characters, only the Chinese characters inside the brackets are kept; if the content inside the brackets itself contains brackets, those brackets are deleted; a personal name may contain only Chinese characters, English letters, decimal points, spaces, and Japanese and Latin characters, and all other characters are deleted; decimal points and spaces before and after the personal name are deleted; if the name contains Chinese characters and no English, Japanese, or Latin characters, decimal points and spaces are deleted; if the name contains Chinese characters and there are no English, Japanese, or Latin characters between the Chinese characters, decimal points and spaces between the Chinese characters are deleted; if the name contains the Chinese characters for 'mobile phone user' and more than two other Chinese characters exist, the Chinese characters other than 'mobile phone user' are kept; if the name contains more than two Chinese characters, special characters are removed and only the pure Chinese characters are kept; if the name contains no Chinese characters but contains English, only the pure English is kept; the letter case of English text is made consistent by default; and consecutive whitespace characters become a single English space.
As an example in this embodiment, personal veins from the preset platform are not fused if their personal vein IDs differ, even when the company names are the same; similarly, if a name is only one character long after name processing, the original value is used for personal vein fusion.
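A few of these normalization steps are mechanical enough to sketch: full-width to half-width conversion, quotation-mark and bracket unification, whitespace collapsing, and trimming of outer decimal points and spaces. The sketch below covers only those steps; the function names and the particular quote and bracket characters mapped are assumptions, and the Chinese-specific rules are left out.

```python
import re


def to_half_width(text: str) -> str:
    """Map full-width ASCII variants (U+FF01..U+FF5E) and U+3000 to half-width."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:                 # ideographic space -> ASCII space
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:     # full-width forms sit 0xFEE0 above ASCII
            out.append(chr(code - 0xFEE0))
        else:
            out.append(ch)
    return "".join(out)


# Assumed sets of quotation marks and brackets to unify.
QUOTES = str.maketrans({"“": "'", "”": "'", "‘": "'", "’": "'", '"': "'"})
BRACKETS = str.maketrans({"【": "(", "】": ")", "[": "(", "]": ")",
                          "{": "(", "}": ")", "《": "(", "》": ")"})


def normalize_name(name: str) -> str:
    """Apply half-width conversion, quote/bracket unification, and trimming."""
    text = to_half_width(name).translate(QUOTES).translate(BRACKETS)
    text = re.sub(r"\s+", " ", text)       # consecutive whitespace -> one space
    return text.strip(" .")                # drop outer decimal points and spaces
```

Converting to half-width first means full-width parentheses and quotation marks are already ASCII before the translation tables run, so the tables only need to list the remaining variants.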
The aggregation module 204 is configured to aggregate the fused employee personal vein data and the employee dimension data set to generate an enterprise personal vein data map.
In this embodiment, on the basis of the personal vein fusion, the employee personal vein data and the corresponding employee dimension data set are associated and aggregated; specifically, all personal vein relationships and associated enterprises are integrated to form a personal vein relationship graph. As a preferred solution in this embodiment, an equity penetration chart may be generated by examining the shareholdings corresponding to a personal vein.
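One simple way to picture this aggregation is as a graph whose nodes are persons and enterprises and whose edges carry a relation label. The sketch below assumes a hypothetical input shape (a person-to-enterprise mapping plus per-dimension sets of person pairs) and a hypothetical `"works_at"` label; the text does not specify a graph representation.

```python
def build_graph(people, relations):
    """Aggregate fused persons and dimension pairs into nodes and labelled edges.

    `people` maps a person to the associated enterprise; `relations` maps a
    relation label (e.g. "colleague") to a set of (person, person) pairs.
    """
    nodes = set()
    edges = set()
    for person, enterprise in people.items():
        nodes.update({("person", person), ("enterprise", enterprise)})
        edges.add((person, enterprise, "works_at"))   # person-to-enterprise link
    for relation, pairs in relations.items():
        for a, b in pairs:
            edges.add((a, b, relation))               # person-to-person link
    return {"nodes": nodes, "edges": edges}
```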
In this embodiment, visualization technology is used to display in one place the personal vein relationship graph formed by integrating all personal vein relationships and associated enterprises, together with the associated enterprises, the knowledge graph, business registration and news updates, and risk information, so that the personal vein information is richer. In addition, the display interface provides keyword retrieval: personal veins can be searched by enterprise name, personal name, or post name, and combined queries can additionally filter by post category, company region, company industry, and company establishment year. The display interface integrates and displays multiple kinds of data such as related colleagues, partners, investment partners, and alumni, so that the display page brings together the senior-executive and employee information of multiple platforms and a user can search and view the enterprise personal vein data of a target enterprise across platforms in real time.
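The keyword retrieval with combined filters could be approximated as a substring match over the enterprise, name, and post fields followed by exact-match filters. This is only a sketch: the field names (`enterprise`, `name`, `post`, `region`, `industry`) are hypothetical, and a production system would more likely delegate this to a search engine or database query.

```python
def search(records, keyword="", **filters):
    """Return records whose enterprise, name, or post contains `keyword`
    (case-insensitive) and whose fields equal every given filter value."""
    keyword = keyword.lower()
    results = []
    for rec in records:
        haystack = " ".join(
            str(rec.get(field, "")) for field in ("enterprise", "name", "post")
        ).lower()
        if keyword and keyword not in haystack:
            continue                                   # keyword does not match
        if all(rec.get(k) == v for k, v in filters.items()):
            results.append(rec)                        # all combined filters match
    return results
```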
In this embodiment, there is also provided a personal data processing device, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, and when the processor executes the computer program, the personal data processing method is implemented.
The embodiment of the invention also provides a computer readable storage medium, which comprises a stored computer program, wherein when the computer program runs, the device where the computer readable storage medium is located is controlled to execute the human vein data processing method.
Illustratively, the computer program may be partitioned into one or more modules that are stored in the memory and executed by the processor to implement the invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program in a human-vein data processing device.
The human vein data processing equipment can be computing equipment such as a desktop computer, a notebook computer, a palm computer, a cloud server and the like. The personal data processing device may include, but is not limited to, a processor, a memory, and a display. It will be appreciated by those skilled in the art that the above components are merely examples of a personal data processing device and do not constitute a limitation of a personal data processing device and may include more or less components than those described, or some components in combination, or different components, for example the personal data processing device may also include input output devices, network access devices, buses, etc.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor or the like; the processor is the control center of the personal data processing device and connects the various parts of the entire personal data processing device through various interfaces and lines.
The memory may be used to store the computer programs and/or modules, and the processor may implement various functions of the personal data processing apparatus by running or executing the computer programs and/or modules stored in the memory and calling data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, a text conversion function, etc.), and the like; the data storage area may store data (such as audio data, text message data, etc.) created according to the use of the mobile phone, etc. In addition, the memory may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state memory device.
The integrated module of the human-vein data processing device, if implemented in the form of a software functional unit and sold or used as an independent product, can be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the above-described method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in the jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
One of ordinary skill in the art can understand and implement the present invention without inventive effort.
To sum up, the invention provides a personal vein data processing method, device, equipment and storage medium. The obtained enterprise personal vein data is processed according to a preset filtering rule to generate first enterprise personal vein data, which includes employee personal vein data; dimension data of each employee is extracted from the employee personal vein data according to a plurality of preset dimensions to generate an employee dimension data set corresponding to the employee personal vein data; meanwhile, the same personal vein data in the employee personal vein data is identified according to a preset fusion rule and fused; and the fused employee personal vein data and the employee dimension data set are aggregated to generate an enterprise personal vein data map. Compared with the prior art, the technical solution of the invention generates the enterprise personal vein data map by filtering, extracting, and fusing the obtained enterprise personal vein data, so that enterprise personal vein relationships are clear, enterprise personal vein management becomes more convenient, data redundancy is reduced, and the displayed enterprise personal vein information is richer and more complete.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and substitutions can be made without departing from the technical principle of the present invention, and these modifications and substitutions should also be regarded as the protection scope of the present invention.

Claims (10)

1. A human pulse data processing method is characterized by comprising the following steps:
according to a preset filtering rule, performing data processing on the obtained enterprise personal vein data to generate first enterprise personal vein data, wherein the first enterprise personal vein data comprises employee personal vein data;
according to a plurality of preset dimensions, respectively extracting dimension data of each employee in the employee personal vein data, and generating an employee dimension data set corresponding to the employee personal vein data;
meanwhile, identifying the same personal data in the staff personal data according to a preset fusion rule so as to fuse the same personal data;
and aggregating the fused employee personal data and the employee dimension data set to generate an enterprise personal data map.
2. The people vein data processing method according to claim 1, wherein the data processing is performed on the obtained enterprise people vein data according to a preset filtering rule to generate first enterprise people vein data, specifically:
dividing the acquired enterprise personal vein data of different data sources into non-avatar data and avatar data, performing data processing on the divided non-avatar data and avatar data according to corresponding preset filtering rules, and collecting the processed non-avatar data and avatar data to generate the first enterprise personal vein data.
3. The personal data processing method of claim 1, wherein the predetermined plurality of dimensions comprise educational experiences, colleagues, alumni, business partners, bidding partners, investment partners, and news dynamics.
4. The method for processing the personal vein data according to claim 2, wherein the same personal vein data in the staff personal vein data is identified according to a preset fusion rule so as to fuse the same personal vein data, specifically:
according to a preset fusion rule, after the employee personal vein data of different data sources are subjected to multiple judgment, the same personal vein data and the different personal vein data in the employee personal vein data are identified, the different personal vein data are eliminated, and the same personal vein data are fused.
5. A personal data processing device, comprising: the system comprises a filtering module, a dimension data extraction module, a fusion module and an aggregation module;
the filtering module is used for performing data processing on the acquired enterprise personal vein data according to a preset filtering rule to generate first enterprise personal vein data, wherein the first enterprise personal vein data comprises employee personal vein data;
the dimension data extraction module is used for respectively extracting dimension data of each employee in the employee personal vein data according to a plurality of preset dimensions to generate an employee dimension data set corresponding to the employee personal vein data;
the fusion module is used for identifying the same personal vein data in the staff personal vein data according to a preset fusion rule so as to fuse the same personal vein data;
and the aggregation module is used for aggregating the fused employee personal data and the employee dimension data set to generate an enterprise personal data map.
6. The personal data processing device according to claim 5, wherein the filtering module is configured to perform data processing on the obtained enterprise personal data according to a preset filtering rule to generate first enterprise personal data, and specifically:
dividing the acquired enterprise personal vein data of different data sources into non-avatar data and avatar data, performing data processing on the divided non-avatar data and avatar data according to corresponding preset filtering rules, and collecting the processed non-avatar data and avatar data to generate the first enterprise personal vein data.
7. The personal data processing device as claimed in claim 5, wherein the plurality of dimensions preset in the dimension data extraction module include educational experiences, colleagues, alumni, business partners, bidding partners, investment partners and news dynamics.
8. The personal vein data processing device according to claim 6, wherein the fusion module is configured to identify the same personal vein data in the employee personal vein data according to a preset fusion rule, so as to fuse the same personal vein data, and specifically:
according to a preset fusion rule, after the employee personal vein data of different data sources are subjected to multiple judgment, the same personal vein data and the different personal vein data in the employee personal vein data are identified, the different personal vein data are eliminated, and the same personal vein data are fused.
9. A terminal device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the personal data processing method according to any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the human vein data processing method according to any one of claims 1 to 4.
CN202111683522.2A 2021-12-31 2021-12-31 Human vein data processing method, device, equipment and storage medium Pending CN114443727A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111683522.2A CN114443727A (en) 2021-12-31 2021-12-31 Human vein data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111683522.2A CN114443727A (en) 2021-12-31 2021-12-31 Human vein data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114443727A true CN114443727A (en) 2022-05-06

Family

ID=81366167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111683522.2A Pending CN114443727A (en) 2021-12-31 2021-12-31 Human vein data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114443727A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115983809A (en) * 2022-11-03 2023-04-18 广州市锌云信息科技有限公司 Enterprise office management method and system based on intelligent portal platform
CN115983809B (en) * 2022-11-03 2023-09-19 广州市锌云信息科技有限公司 Enterprise office management method and system based on intelligent portal platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination