CN112434158B - Enterprise tag acquisition method, enterprise tag acquisition device, storage medium and computer equipment - Google Patents

Info

Publication number
CN112434158B
Authority
CN
China
Prior art keywords
text
candidate
candidate keyword
enterprise
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011264990.1A
Other languages
Chinese (zh)
Other versions
CN112434158A (en
Inventor
柴源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haichuanghui Technology Entrepreneurship Development Co ltd
Original Assignee
Haichuanghui Technology Entrepreneurship Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Haichuanghui Technology Entrepreneurship Development Co ltd
Priority to CN202011264990.1A
Publication of CN112434158A
Application granted
Publication of CN112434158B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an enterprise tag acquisition method, an enterprise tag acquisition device, a storage medium and computer equipment. The method extracts keywords from the enterprise basic information texts, enterprise financing texts and enterprise business model texts that describe an enterprise, and also screens the candidate keywords according to their position, part of speech, number of repetitions, ability to express a meaning independently, heat and the like. The candidate keywords that investors pay most attention to can thus be taken as enterprise tags, so that investors can quickly find target enterprises through the enterprise tags.

Description

Enterprise tag acquisition method, enterprise tag acquisition device, storage medium and computer equipment
Technical Field
The invention relates to the technical field of enterprise classification in the financial industry, and in particular to an enterprise tag acquisition method, an enterprise tag acquisition device, a storage medium and computer equipment.
Background
With the progress of science and technology and the rapid development of the economy, some enterprises need to bring in investors in order to expand. When selecting enterprises, investors often have to pick out the content that interests them from massive amounts of data, which greatly reduces the efficiency of their search.
Disclosure of Invention
The technical problem solved by the invention is to provide an enterprise tag acquisition method, an enterprise tag acquisition device, a storage medium and computer equipment, so that an investor can search for enterprises by enterprise tag and the efficiency of enterprise searching is improved.
The technical scheme adopted by the invention comprises the following specific contents:
An enterprise tag acquisition method comprises the following steps:
acquiring a text to be extracted, wherein the text to be extracted comprises at least one enterprise basic information text, at least one enterprise financing text and at least one enterprise business model text, and determining the text type of the text to be extracted according to its content;
segmenting the text to be extracted of each text type into words to obtain candidate keywords, and obtaining an initial weight of each candidate keyword;
obtaining a similarity value between each candidate keyword and the candidate keywords of other text types;
acquiring a heat value of each candidate keyword;
obtaining a weight optimization value of each candidate keyword according to its similarity value, heat value and initial weight; and
determining the candidate keywords whose weight optimization values exceed a preset threshold as enterprise tags.
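The final two steps above (combine the three quantities, then apply a threshold) can be sketched as follows. This is a minimal illustration, not the patented formula: the formula for the weight optimization value is not reproduced in this text, so the combination rule is passed in as a parameter. The names `select_enterprise_tags` and `example_combine`, and the sample numbers, are hypothetical.

```python
from typing import Callable, Dict, List


def select_enterprise_tags(
    initial_weight: Dict[str, float],
    similarity: Dict[str, float],
    heat: Dict[str, float],
    combine: Callable[[float, float, float], float],
    threshold: float,
) -> List[str]:
    """Return the candidate keywords whose optimized weight exceeds the threshold."""
    tags = []
    for keyword, w0 in initial_weight.items():
        # Missing similarity or heat values are treated as 0.0 (an assumption).
        score = combine(similarity.get(keyword, 0.0), heat.get(keyword, 0.0), w0)
        if score > threshold:
            tags.append(keyword)
    return tags


def example_combine(sim: float, heat: float, w0: float) -> float:
    # HYPOTHETICAL combination rule, used only to make the sketch runnable.
    return w0 * (1.0 + sim) * (1.0 + heat)


weights = {"battery": 0.5, "series A": 0.3, "quickly": 0.05}
sims = {"battery": 0.9, "series A": 0.4, "quickly": 0.1}
heats = {"battery": 1.0, "series A": 0.5, "quickly": 0.0}
tags = select_enterprise_tags(weights, sims, heats, example_combine, threshold=0.6)
```

Keeping the combination rule injectable means the thresholding logic stays the same whichever formula is actually used.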
As a preferable mode of the above scheme, the initial weight of each candidate keyword is obtained as follows:
obtaining a position parameter $r_{i1}$ of the candidate keyword according to its position in the text to be extracted: when the candidate keyword appears in both the title and the body of the text to be extracted, $r_{i1} = 2$; when it appears in the title or the body but not both, $r_{i1} = 1$;
obtaining a repetition parameter $r_{i2}$ of the candidate keyword according to the number of times it is repeated in the text to be extracted [formula omitted], wherein $a_i$ is the number of repetitions of the i-th candidate keyword and $n$ is the number of candidate keywords;
obtaining an expression parameter $r_{i3}$ of the candidate keyword according to its ability to express a meaning independently in the text to be extracted: when the candidate keyword can express a meaning on its own, $r_{i3} = 1$; when it cannot, $r_{i3} = 0$;
obtaining a part-of-speech parameter $r_{i4}$ of the candidate keyword according to its part of speech in the text to be extracted: when the candidate keyword is a verb, an adjective, a quantifier or a pronoun, $r_{i4} = 0$; when it is a noun, $r_{i4} = 1$;
obtaining the initial weight $\omega_{i0}$ of the candidate keyword according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter [formula omitted], wherein $n$ is the number of candidate keywords.
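The four parameters can be sketched in code as below. The rules for $r_{i1}$, $r_{i3}$ and $r_{i4}$ follow the text directly. The formula for $r_{i2}$ is not reproduced in this text, so the sketch assumes the share $a_i / \sum_k a_k$, which is only one plausible reading of the stated symbols. The `Candidate` class and all function names are hypothetical, and the initial weight $\omega_{i0}$ is deliberately left out because its formula is unknown.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    word: str
    in_title: bool
    in_body: bool
    count: int        # a_i: number of repetitions in the text
    standalone: bool  # can the word express a meaning on its own?
    pos: str          # part of speech, e.g. "noun", "verb"


def position_param(c: Candidate) -> int:
    """r_i1: 2 if the word is in both title and body, otherwise 1."""
    return 2 if (c.in_title and c.in_body) else 1


def expression_param(c: Candidate) -> int:
    """r_i3: 1 if the word can express a meaning independently, else 0."""
    return 1 if c.standalone else 0


def pos_param(c: Candidate) -> int:
    """r_i4: 1 for nouns; 0 for verbs, adjectives, quantifiers, pronouns."""
    # ASSUMPTION: parts of speech not named in the text are also scored 0.
    return 1 if c.pos == "noun" else 0


def repetition_params(cands: List[Candidate]) -> List[float]:
    """r_i2 under the ASSUMED form a_i / sum(a_k); the source formula is missing."""
    total = sum(c.count for c in cands)
    return [c.count / total if total else 0.0 for c in cands]


def parameter_vectors(cands: List[Candidate]) -> List[Tuple[int, float, int, int]]:
    """Return (r_i1, r_i2, r_i3, r_i4) for every candidate keyword."""
    reps = repetition_params(cands)
    return [
        (position_param(c), r2, expression_param(c), pos_param(c))
        for c, r2 in zip(cands, reps)
    ]


cands = [
    Candidate("battery", True, True, 6, True, "noun"),
    Candidate("develop", False, True, 2, False, "verb"),
]
vectors = parameter_vectors(cands)
```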
As a preferable mode of the above scheme, obtaining the similarity value between each candidate keyword and the candidate keywords of other text types comprises the following steps:
constructing a first vector A from the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of the candidate keyword, $A = (r_{i1}, r_{i2}, r_{i3}, r_{i4})$, wherein $r_{i1}, r_{i2}, r_{i3}, r_{i4}$ are respectively the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the i-th candidate keyword;
constructing a second vector B from the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of an associated word, $B = (r_{j1}, r_{j2}, r_{j3}, r_{j4})$, wherein $r_{j1}, r_{j2}, r_{j3}, r_{j4}$ are respectively the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the j-th candidate keyword, and the associated word is a candidate keyword of another text type;
calculating the similarity value between the candidate keyword and the associated word from the first vector and the second vector, the calculation formula of the similarity value being as follows: [formula omitted]
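The similarity formula itself is not reproduced in this text. As an illustration only, the sketch below assumes cosine similarity between the two four-component parameter vectors, a common choice for comparing vectors of this kind. The function name `cosine_similarity` and the sample vectors are hypothetical, and the patent's actual formula may differ.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between vectors a and b (ASSUMED similarity measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat the similarity as 0 (assumption).
        return 0.0
    return dot / (norm_a * norm_b)


# A = (r_i1, r_i2, r_i3, r_i4) for a keyword from one text type,
# B = (r_j1, r_j2, r_j3, r_j4) for an associated word from another text type.
A = (2, 0.75, 1, 1)
B = (1, 0.25, 0, 0)
similarity = cosine_similarity(A, B)
```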
As a preferable mode of the above scheme, obtaining the heat value of each candidate keyword comprises the following steps:
taking each candidate keyword as a statistical item to count the vocabulary heat of the candidate keyword;
taking each set of candidate keywords as a statistical item to count the aggregate heat, that is, the heat of a set of several candidate keywords that investors pay attention to at the same time;
adding the vocabulary heat and the aggregate heat to obtain the search heat of the candidate keyword.
As a preferable mode of the above scheme, the vocabulary heat and the aggregate heat are counted by the same statistical method:
setting a statistical starting time, and dividing the duration between the statistical starting time and the time at which the overall heat, the vocabulary heat or the aggregate heat is calculated into a plurality of time periods;
weighting the overall heat, the vocabulary heat or the aggregate heat so that the farther a time period is from the current time, the lower its contribution to the heat value [formula omitted], wherein $\lambda_j$ is the weight value corresponding to the j-th time period, and the closer a time period is to the time at which the heat value is calculated, the larger its weight value; $\beta_{ij}$ is the number of times the statistical item of the overall heat, the vocabulary heat or the aggregate heat is collected in the j-th time period.
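The weighting formula is not reproduced in this text. Given the definitions of $\lambda_j$ and $\beta_{ij}$, a natural reading is the weighted sum $\sum_j \lambda_j \beta_{ij}$, and the sketch below assumes exactly that. The function names, the equal-width time periods and the sample timestamps are all hypothetical.

```python
from typing import List, Sequence


def bucket_counts(
    event_times: Sequence[float], start: float, now: float, num_periods: int
) -> List[int]:
    """Count events per time period; period 0 is the oldest (beta_ij for one item i)."""
    if now <= start or num_periods <= 0:
        raise ValueError("need now > start and at least one period")
    width = (now - start) / num_periods  # ASSUMPTION: equal-width periods
    counts = [0] * num_periods
    for t in event_times:
        if start <= t <= now:
            # An event exactly at `now` falls into the last period.
            j = min(int((t - start) / width), num_periods - 1)
            counts[j] += 1
    return counts


def time_weighted_heat(counts: Sequence[int], weights: Sequence[float]) -> float:
    """ASSUMED form of the heat value: sum over periods of lambda_j * beta_ij."""
    if len(counts) != len(weights):
        raise ValueError("one weight per time period is required")
    return sum(lam * beta for lam, beta in zip(weights, counts))


# Five searches for one statistical item between t=0 and t=10, two periods.
counts = bucket_counts([1, 2, 9, 9.5, 10], start=0, now=10, num_periods=2)
# The more recent period gets the larger weight ("time cooling").
heat = time_weighted_heat(counts, weights=[0.25, 0.75])
```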
As a preferable mode of the above scheme, the weight optimization value of each candidate keyword is obtained from its similarity value, heat value and initial weight by the following calculation formula: [formula omitted]
The invention also discloses an enterprise tag acquisition device, which comprises a first acquisition module, a second acquisition module, a third acquisition module, a fourth acquisition module, a calculation module and a determination module, wherein: the first acquisition module acquires a text to be extracted, the text to be extracted comprising at least one enterprise basic information text, at least one enterprise financing text and at least one enterprise business model text, and determines the text type of the text to be extracted according to its content; the second acquisition module segments the text to be extracted of each text type into words to obtain candidate keywords, and obtains an initial weight of each candidate keyword; the third acquisition module obtains a similarity value between each candidate keyword and the candidate keywords of other text types; the fourth acquisition module acquires a heat value of each candidate keyword; the calculation module obtains a weight optimization value of each candidate keyword according to its similarity value, heat value and initial weight; and the determination module determines the candidate keywords whose weight optimization values exceed a preset threshold as enterprise tags.
The invention also discloses a computer device, which comprises a memory and a processor connected with the memory, wherein the memory stores a computer program which, when executed by the processor, implements the steps of the enterprise tag acquisition method.
The invention also discloses a computer-readable storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the enterprise tag acquisition method.
Compared with the prior art, the invention has the beneficial effects that:
The enterprise tag acquisition method disclosed by the invention extracts keywords from the texts to be extracted that describe the enterprise, such as the enterprise basic information text, the enterprise financing text and the enterprise business model text, and also screens the candidate keywords according to their position, part of speech, number of repetitions, ability to express a meaning independently, heat and the like. The candidate keywords that investors pay most attention to can thus be taken as enterprise tags, so that an investor can quickly find a target enterprise through the enterprise tags.
The foregoing is only an overview of the technical scheme of the invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the description, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is an application environment diagram of a method for acquiring enterprise tags in accordance with a preferred embodiment;
FIG. 2 is a flow chart of a method for acquiring an enterprise tag according to a preferred embodiment;
FIG. 3 is a block diagram of an enterprise tag acquisition apparatus in accordance with a preferred embodiment;
FIG. 4 is a block diagram illustrating a second acquisition module of FIG. 3;
FIG. 5 is a block diagram illustrating a third acquisition module of FIG. 3;
FIG. 6 is a block diagram illustrating a fourth acquisition module of FIG. 3;
FIG. 7 is a block diagram of the computer device of the preferred embodiment;
The reference signs are:
1. terminal; 2. server; 3. first acquisition module; 4. second acquisition module; 5. third acquisition module; 6. fourth acquisition module; 7. calculation module; 8. determination module; 9. first acquisition unit; 10. second acquisition unit; 11. third acquisition unit; 12. fourth acquisition unit; 13. first calculation unit; 14. first construction unit; 15. second construction unit; 16. second calculation unit; 17. first statistical unit; 18. second statistical unit; 19. third calculation unit.
Detailed Description
To further explain the technical means adopted by the invention to achieve its intended purpose, and their effects, the specific implementation, structure, features and effects of the invention are described in detail below with reference to the accompanying drawings and preferred embodiments:
Example 1
FIG. 1 shows the application environment of the enterprise tag acquisition method of the invention. The method is applied to an enterprise tag acquisition system comprising a terminal 1 and a server 2 connected through a network. The terminal 1 may be a desktop terminal or a mobile terminal; the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, a portable wearable device and the like. The server 2 may be implemented as an independent server or as a server cluster formed by multiple servers.
As shown in FIG. 2, in one embodiment, the invention provides an enterprise tag acquisition method. Taking its application to the server 2 in FIG. 1 as an example, the method includes the following steps:
acquiring a text to be extracted, wherein the text to be extracted comprises at least one enterprise basic information text, at least one enterprise financing text and at least one enterprise business model text, and determining the text type of the text to be extracted according to its content;
segmenting the text to be extracted of each text type into words to obtain candidate keywords, and obtaining an initial weight of each candidate keyword; because the text types differ, the candidate keywords obtained comprise basic information keywords reflecting the enterprise's basic information, financing keywords reflecting its financing information, and business model keywords reflecting its business model;
obtaining a similarity value between each candidate keyword and the candidate keywords of other text types;
acquiring a heat value of each candidate keyword;
obtaining a weight optimization value of each candidate keyword according to its similarity value, heat value and initial weight; and
determining the candidate keywords whose weight optimization values exceed a preset threshold as enterprise tags.
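The segmentation step, which yields one group of candidate keywords per text type, can be sketched as follows. The text does not say how the text type is determined or which word segmenter is used, so the sketch takes the text type as a given label and uses a simple regular-expression tokenizer as a stand-in. A real system for Chinese text would need a proper word segmenter. The names `segment` and `candidates_by_type` and the sample texts are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple


def segment(text: str) -> List[str]:
    """Stand-in tokenizer: lowercase alphanumeric runs (NOT a real word segmenter)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidates_by_type(texts: List[Tuple[str, str, str]]) -> Dict[str, Counter]:
    """Map each text type to its candidate keywords and their repetition counts.

    Each input item is (text_type, title, body); the text type is assumed to
    have been determined already.
    """
    groups: Dict[str, Counter] = {}
    for text_type, title, body in texts:
        groups.setdefault(text_type, Counter()).update(segment(title) + segment(body))
    return groups


texts = [
    ("basic_info", "Acme Battery", "Acme makes battery packs"),
    ("financing", "Series A", "Acme raised a series A round"),
]
groups = candidates_by_type(texts)
```

Keeping the groups separate by text type matters later, because the similarity step compares a candidate keyword only with keywords from the other text types.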
The enterprise tag acquisition method disclosed by the invention extracts keywords from the texts to be extracted that describe the enterprise, such as the enterprise basic information text, the enterprise financing text and the enterprise business model text, and also screens the candidate keywords according to their position, part of speech, number of repetitions, ability to express a meaning independently, heat and the like. The candidate keywords that investors pay most attention to can thus be taken as enterprise tags, so that an investor can quickly find a target enterprise through the enterprise tags.
As a preferable mode of the above scheme, the initial weight of each candidate keyword is obtained as follows:
Obtaining a position parameter $r_{i1}$ of the candidate keyword according to its position in the text to be extracted: when the candidate keyword appears in both the title and the body of the text to be extracted, $r_{i1} = 2$; when it appears in the title or the body but not both, $r_{i1} = 1$.
Obtaining a repetition parameter $r_{i2}$ of the candidate keyword according to the number of times it is repeated in the text to be extracted [formula omitted], wherein $a_i$ is the number of repetitions of the i-th candidate keyword and $n$ is the number of candidate keywords.
Obtaining an expression parameter $r_{i3}$ of the candidate keyword according to its ability to express a meaning independently in the text to be extracted: when the candidate keyword can express a meaning on its own, $r_{i3} = 1$; when it cannot, $r_{i3} = 0$.
Obtaining a part-of-speech parameter $r_{i4}$ of the candidate keyword according to its part of speech in the text to be extracted: when the candidate keyword is a verb, an adjective, a quantifier or a pronoun, $r_{i4} = 0$; when it is a noun, $r_{i4} = 1$.
Obtaining the initial weight $\omega_{i0}$ of the candidate keyword according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter [formula omitted], wherein $n$ is the number of candidate keywords.
It should be appreciated that the initial weight of each of the candidate keywords is determined based on the text to be extracted in which the candidate keyword is located.
As a preferable mode of the above scheme, obtaining the similarity value between each candidate keyword and the candidate keywords of other text types comprises the following steps:
Constructing a first vector A from the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of the candidate keyword, $A = (r_{i1}, r_{i2}, r_{i3}, r_{i4})$, wherein $r_{i1}, r_{i2}, r_{i3}, r_{i4}$ are respectively the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the i-th candidate keyword.
Constructing a second vector B from the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of an associated word, $B = (r_{j1}, r_{j2}, r_{j3}, r_{j4})$, wherein $r_{j1}, r_{j2}, r_{j3}, r_{j4}$ are respectively the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the j-th candidate keyword. The associated word is a candidate keyword of another text type; that is, the i-th and j-th candidate keywords come from texts to be extracted of different text types.
Calculating the similarity value between the candidate keyword and the associated word from the first vector and the second vector, the calculation formula of the similarity value being as follows: [formula omitted]
As a preferable mode of the above scheme, obtaining the heat value of each candidate keyword comprises the following steps:
Taking each candidate keyword as a statistical item to count the vocabulary heat of the candidate keyword. The vocabulary heat reflects how much attention investors pay to each candidate keyword, so that the candidate keywords receiving more investor attention can be accumulated.
Taking each set of candidate keywords as a statistical item to count the aggregate heat, which reflects how much attention investors pay to several candidate keywords at the same time.
Adding the vocabulary heat and the aggregate heat to obtain the investors' search heat for the candidate keyword. Counting along these two dimensions the search information that investors enter into a search engine when looking for enterprises makes the statistics more complete, because an investor may enter either a single candidate keyword or a set of candidate keywords.
It should be appreciated that the heat value of a candidate keyword is derived from the vocabulary that investors enter when searching for enterprises or financing projects with a search engine.
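The two-dimensional counting described above can be sketched from a log of search queries. The text does not state how the heat of a keyword set is attributed to an individual keyword, so the sketch makes two assumptions: the vocabulary heat of a keyword is the number of queries containing it, and its aggregate heat is the number of queries in which it appears together with at least one other candidate keyword. Time weighting is left out here for brevity. The function name `search_heat` and the sample queries are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def search_heat(queries: Iterable[List[str]], candidates: Set[str]) -> Dict[str, int]:
    """Search heat per keyword = vocabulary heat + aggregate heat (raw counts)."""
    vocabulary: Counter = Counter()    # statistical item: a single keyword
    combinations: Counter = Counter()  # statistical item: a set of keywords
    for query in queries:
        hits = frozenset(word for word in query if word in candidates)
        vocabulary.update(hits)
        if len(hits) >= 2:
            combinations[hits] += 1

    heat = {}
    for keyword in candidates:
        # ASSUMPTION: a keyword inherits the count of every set it belongs to.
        aggregate = sum(n for combo, n in combinations.items() if keyword in combo)
        heat[keyword] = vocabulary[keyword] + aggregate
    return heat


queries = [
    ["battery"],
    ["battery", "series A"],
    ["series A", "battery"],
    ["robotics"],
]
heat = search_heat(queries, {"battery", "series A", "robotics"})
```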
As a preferable mode of the above scheme, the vocabulary heat and the aggregate heat are counted by the same statistical method, which introduces the concept of "time cooling": the farther a time period is from the current time, the lower its contribution to the heat value. There are many hot-spot enterprises at any time, and once its hot period has passed, a hot-spot enterprise may quickly be replaced by others, so the hot-spot enterprises closest to the current time attract investors most. Taking this factor into account, the statistical method is as follows:
setting a statistical starting time, and dividing the duration between the statistical starting time and the time at which the overall heat, the vocabulary heat or the aggregate heat is calculated into a plurality of time periods;
weighting the overall heat, the vocabulary heat or the aggregate heat so that the farther a time period is from the current time, the lower its contribution to the heat value [formula omitted], wherein $\lambda_j$ is the weight value corresponding to the j-th time period, and the closer a time period is to the time at which the heat value is calculated, the larger its weight value; $\beta_{ij}$ is the number of times the statistical item of the overall heat, the vocabulary heat or the aggregate heat is collected in the j-th time period.
Counting the vocabulary heat and the aggregate heat with time cooling taken into account ensures that candidate keywords that are currently hot receive higher heat values.
In addition, $\lambda_j$ can take its values in different ways. For example, the weight values of the time periods may follow an arithmetic progression, in which case the weight value of the j-th time period is: [formula omitted]; or they may follow a geometric progression, in which case the weight value of the j-th time period is: [formula omitted]; or the value of $\lambda_j$ may be determined according to the rate at which hot-spot enterprises are replaced.
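The two weighting schemes can be sketched as below. The exact formulas are not reproduced in this text, so the sketch assumes weights that grow with the period index $j$ (period 1 being the oldest) and are normalized to sum to 1: $\lambda_j \propto j$ for the arithmetic case and $\lambda_j \propto q^{j-1}$ for the geometric case. The normalization, the common ratio `ratio` and the function names are assumptions.

```python
from typing import List


def arithmetic_weights(num_periods: int) -> List[float]:
    """ASSUMED arithmetic scheme: lambda_j proportional to j, normalized to sum to 1."""
    total = num_periods * (num_periods + 1) / 2
    return [j / total for j in range(1, num_periods + 1)]


def geometric_weights(num_periods: int, ratio: float = 2.0) -> List[float]:
    """ASSUMED geometric scheme: lambda_j proportional to ratio**(j-1), normalized."""
    if ratio <= 1.0:
        raise ValueError("ratio must exceed 1 so that recent periods weigh more")
    raw = [ratio ** j for j in range(num_periods)]
    total = sum(raw)
    return [r / total for r in raw]


arith = arithmetic_weights(3)          # proportions 1 : 2 : 3
geom = geometric_weights(3, ratio=2.0)  # proportions 1 : 2 : 4
```

Under these assumptions the geometric scheme discounts older periods more sharply than the arithmetic one, which would suit a market where hot-spot enterprises are replaced quickly.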
As a preferable mode of the above scheme, the weight optimization value of each candidate keyword is obtained from its similarity value, heat value and initial weight by the following calculation formula: [formula omitted]
It should be understood that, although the steps in the flowchart of FIG. 2 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in FIG. 2 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed one after another but may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
Example 2
The invention also discloses an enterprise tag acquisition device which, as shown in FIG. 3, comprises a first acquisition module 3, a second acquisition module 4, a third acquisition module 5, a fourth acquisition module 6, a calculation module 7 and a determination module 8, wherein: the first acquisition module 3 acquires a text to be extracted, the text to be extracted comprising at least one enterprise basic information text, at least one enterprise financing text and at least one enterprise business model text, and determines the text type of the text to be extracted according to its content; the second acquisition module 4 segments the text to be extracted of each text type into words to obtain candidate keywords, and obtains an initial weight of each candidate keyword; the third acquisition module 5 obtains a similarity value between each candidate keyword and the candidate keywords of other text types; the fourth acquisition module 6 acquires a heat value of each candidate keyword; the calculation module 7 obtains a weight optimization value of each candidate keyword according to its similarity value, heat value and initial weight; the determination module 8 determines the candidate keywords whose weight optimization values exceed a preset threshold as enterprise tags.
Because the text types differ, the candidate keywords obtained by segmenting the text to be extracted of each text type comprise basic information keywords reflecting the enterprise's basic information, financing keywords reflecting its financing information, and business model keywords reflecting its business model.
As a further preferable mode, as shown in FIG. 4, the second acquisition module 4 includes a first acquisition unit 9, a second acquisition unit 10, a third acquisition unit 11, a fourth acquisition unit 12 and a first calculation unit 13, wherein:
The first acquisition unit 9 obtains a position parameter $r_{i1}$ of the candidate keyword according to its position in the text to be extracted: when the candidate keyword appears in both the title and the body of the text to be extracted, $r_{i1} = 2$; when it appears in the title or the body but not both, $r_{i1} = 1$;
The second acquisition unit 10 obtains a repetition parameter $r_{i2}$ of the candidate keyword according to the number of times it is repeated in the text to be extracted [formula omitted], wherein $a_i$ is the number of repetitions of the i-th candidate keyword and $n$ is the number of candidate keywords;
The third acquisition unit 11 obtains an expression parameter $r_{i3}$ of the candidate keyword according to its ability to express a meaning independently in the text to be extracted: when the candidate keyword can express a meaning on its own, $r_{i3} = 1$; when it cannot, $r_{i3} = 0$;
The fourth acquisition unit 12 obtains a part-of-speech parameter $r_{i4}$ of the candidate keyword according to its part of speech in the text to be extracted: when the candidate keyword is a verb, an adjective, a quantifier or a pronoun, $r_{i4} = 0$; when it is a noun, $r_{i4} = 1$;
The first calculation unit 13 obtains the initial weight $\omega_{i0}$ of the candidate keyword according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter [formula omitted], wherein $n$ is the number of candidate keywords.
It should be appreciated that the initial weight of each of the candidate keywords is determined based on the text to be extracted in which the candidate keyword is located.
As a further preferred solution, as shown in fig. 5, the third obtaining module 5 includes a first construction unit 14, a second construction unit 15, and a second calculation unit 16, where:
The first construction unit 14 constructs a first vector A from the position parameter, the repetition parameter, the expression parameter, and the part-of-speech parameter of the candidate keyword, the first vector being A = (r_i1, r_i2, r_i3, r_i4), wherein r_i1, r_i2, r_i3 and r_i4 are the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the i-th candidate keyword, respectively;
The second construction unit 15 constructs a second vector B from the position parameter, the repetition parameter, the expression parameter, and the part-of-speech parameter of the associated word, the second vector being B = (r_j1, r_j2, r_j3, r_j4), wherein r_j1, r_j2, r_j3 and r_j4 are the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the j-th candidate keyword, respectively, and the associated word is a candidate keyword of another text type, that is, the i-th and j-th candidate keywords come from texts to be extracted of different text types;
The second calculation unit 16 calculates a similarity value of the candidate keyword and the associated word using the first vector and the second vector, and a calculation formula of the similarity value is:
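The similarity formula itself is not reproduced in this text. As a hedged illustration only, the sketch below assumes cosine similarity, a common way to compare two parameter vectors of equal length; the patented formula may differ. The function name, the zero-vector convention and the sample vectors are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length parameter vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an all-zero vector is treated as having no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vectors (position, repetition, expression, part of speech)
# for an i-th keyword and a j-th keyword from a different text type.
A = (2, 0.4, 1, 1)
B = (1, 0.2, 1, 1)
similarity = cosine_similarity(A, B)
```

Because all four parameters are non-negative, a cosine-based value would fall between 0 and 1, which makes it easy to combine with other scores.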
As a further preferable solution, as shown in fig. 6, the fourth obtaining module 6 includes a first statistics unit 17, a second statistics unit 18, and a third calculation unit 19, where:
The first statistics unit 17 takes each candidate keyword as a statistical item and counts the vocabulary heat of the candidate keyword;
The second statistics unit 18 takes each set of candidate keywords as a statistical item and counts the aggregate heat of a plurality of candidate keywords that investors pay attention to at the same time;
The third calculation unit 19 adds the vocabulary heat and the aggregate heat to obtain the investors' retrieval heat for the enterprise.
In this embodiment, the statistical methods of the vocabulary heat and the aggregate heat are the same as those of the first embodiment.
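The time-weighted statistics described for the vocabulary heat and the aggregate heat (a weight λ_j per time period, larger for more recent periods, applied to the count β_ij of a statistical item in that period) can be sketched as follows. This is a minimal sketch under assumptions: it takes the heat to be the weighted sum of the per-period counts, and the linearly increasing weights and the sample counts are invented for illustration.

```python
from typing import List, Sequence


def weighted_heat(counts: Sequence[int], weights: Sequence[float]) -> float:
    """Sum of per-period counts (beta_ij) scaled by period weights (lambda_j)."""
    if len(counts) != len(weights):
        raise ValueError("one weight is needed per time period")
    return sum(lam * beta for lam, beta in zip(weights, counts))


def linear_weights(num_periods: int) -> List[float]:
    """Hypothetical weights that grow toward the newest period and sum to 1."""
    total = num_periods * (num_periods + 1) / 2
    return [(j + 1) / total for j in range(num_periods)]


# Periods are ordered oldest to newest; the counts are made-up values.
vocabulary_counts = [10, 4, 6]
set_counts = [2, 0, 3]
lam = linear_weights(3)

# Retrieval heat as the sum of vocabulary heat and aggregate heat.
search_heat = weighted_heat(vocabulary_counts, lam) + weighted_heat(set_counts, lam)
```

Any weighting scheme that increases toward the calculation time would fit the description; the linear scheme here is only one option.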
As a further preferable solution, the calculation module 7 obtains a weight optimization value of each candidate keyword according to the similarity value, the heat value and the initial weight of each candidate keyword, and the calculation formula is as follows:
It should be noted that each module in the enterprise tag acquisition apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or be independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.
Example III
The invention also discloses a computer device, which can be a server, as shown in fig. 7, and comprises a processor, a memory, a network interface and a database which are connected through a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing operation behavior data, commodity information data, and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements the steps of the method of obtaining an enterprise tag.
It will be appreciated by those skilled in the art that the structure shown in FIG. 7 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In other embodiments, a computer device is provided, including a memory and a processor connected to the memory, where the memory stores a computer program, and the computer program, when executed by the processor, implements the method for obtaining an enterprise tag, which specifically includes the following steps: acquiring a text to be extracted, wherein the text to be extracted comprises at least one enterprise basic information text, at least one enterprise financing text and at least one enterprise business model text, and determining the text type of the text to be extracted according to the content of the text to be extracted; segmenting the text to be extracted of each text type to obtain candidate keywords, and obtaining initial weights of the candidate keywords; obtaining similarity values of each candidate keyword and candidate keywords of other text types; acquiring a heat value of each candidate keyword; obtaining a weight optimization value of each candidate keyword according to the similarity value, the heat value and the initial weight of each candidate keyword; and determining the candidate keywords with the weight optimization values exceeding a preset threshold as enterprise tags.
Because the text types differ, segmenting the text to be extracted of each text type yields candidate keywords that include basic information keywords reflecting the basic information of the enterprise, financing keywords reflecting the financing information of the enterprise, and business model keywords reflecting the business model of the enterprise.
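The final step, keeping the candidate keywords whose weight optimization value exceeds a preset threshold, can be sketched as below. This is an illustrative sketch only: the weight optimization formula is not reproduced in this text, so the optimized weights are taken as given inputs, and the function name, sample keywords, sample values and the descending sort are assumptions.

```python
from typing import Dict, List


def select_enterprise_tags(optimized: Dict[str, float], threshold: float) -> List[str]:
    """Keep keywords whose weight optimization value exceeds the threshold."""
    selected = [kw for kw, weight in optimized.items() if weight > threshold]
    # Sorting by weight is an added convenience, not something the text requires.
    return sorted(selected, key=lambda kw: optimized[kw], reverse=True)


# Made-up weight optimization values for three candidate keywords.
weights = {"cloud computing": 0.82, "Series A": 0.64, "subscription": 0.41}
tags = select_enterprise_tags(weights, threshold=0.5)
```

The comparison is strict because the text speaks of values "exceeding" the threshold; a value equal to the threshold is not selected in this sketch.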
In other embodiments, when the processor executes the computer program, the step of obtaining the initial weight of each candidate keyword is implemented, and specifically includes the following steps: (1) obtaining a position parameter r_i1 of the candidate keyword according to the position of the candidate keyword in the text to be extracted: when the candidate keyword appears in both the title and the body of the text to be extracted, r_i1 = 2; when the candidate keyword appears in only the title or only the body of the text to be extracted, r_i1 = 1; (2) obtaining a repetition parameter r_i2 of the candidate keyword according to the number of times the candidate keyword is repeated in the text to be extracted, wherein a_i is the number of repetitions of the i-th candidate keyword and n is the number of candidate keywords; (3) obtaining an expression parameter r_i3 of the candidate keyword according to whether the candidate keyword can convey meaning independently in the text to be extracted: when it can, r_i3 = 1; when it cannot, r_i3 = 0; (4) obtaining a part-of-speech parameter r_i4 of the candidate keyword according to the part of speech of the candidate keyword in the text to be extracted: when the candidate keyword is a verb, an adjective, a quantity word or a pronoun, r_i4 = 0; when the candidate keyword is a noun, r_i4 = 1; (5) obtaining the initial weight ω_i0 of the candidate keyword according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter, wherein n is the number of candidate keywords.
It should be appreciated that the initial weight of each of the candidate keywords is determined based on the text to be extracted in which the candidate keyword is located.
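The discrete parameters in steps (1), (3) and (4) can be sketched as small lookup functions. This is a minimal illustration, not the patented implementation: the function names and part-of-speech tag strings are hypothetical, the value 1 for the position parameter is read as the keyword occurring in exactly one of title and body, and the repetition parameter and initial-weight formulas are omitted because they are not reproduced in this text.

```python
from typing import Optional

# Hypothetical tag names for the parts of speech that map to 0.
NON_NOUN_TAGS = {"verb", "adjective", "quantity", "pronoun"}


def position_parameter(in_title: bool, in_body: bool) -> int:
    """r_i1: 2 when the keyword is in both title and body, 1 when in only one."""
    if in_title and in_body:
        return 2
    if in_title or in_body:
        return 1
    raise ValueError("a candidate keyword must occur somewhere in the text")


def expression_parameter(conveys_meaning_alone: bool) -> int:
    """r_i3: 1 when the keyword can convey meaning independently, else 0."""
    return 1 if conveys_meaning_alone else 0


def part_of_speech_parameter(tag: str) -> Optional[int]:
    """r_i4: 1 for nouns, 0 for verbs, adjectives, quantity words and pronouns."""
    if tag == "noun":
        return 1
    if tag in NON_NOUN_TAGS:
        return 0
    return None  # the text does not assign a value to other parts of speech
```

Returning `None` for unlisted parts of speech keeps the gap in the description visible instead of silently choosing a value.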
In some other embodiments, when the processor executes the computer program, the step of obtaining the similarity value of each candidate keyword and the candidate keywords of other text types is implemented, and specifically includes the following steps: (1) constructing a first vector A according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of the candidate keyword, the first vector being A = (r_i1, r_i2, r_i3, r_i4), wherein r_i1, r_i2, r_i3 and r_i4 are the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the i-th candidate keyword, respectively; (2) constructing a second vector B according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of the associated word, the second vector being B = (r_j1, r_j2, r_j3, r_j4), wherein r_j1, r_j2, r_j3 and r_j4 are the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the j-th candidate keyword, respectively, and the associated word is a candidate keyword of another text type, that is, the i-th and j-th candidate keywords come from texts to be extracted of different text types; (3) calculating the similarity value of the candidate keyword and the associated word by using the first vector and the second vector, wherein the calculation formula of the similarity value is:
In some other embodiments, when the processor executes the computer program, the step of obtaining the heat value of each candidate keyword is implemented, and specifically includes the following steps: acquiring retrieval information input by investors when searching for enterprises; performing word segmentation on the retrieval information to obtain candidate keywords, and taking each candidate keyword as a statistical item to count the vocabulary heat of the candidate keyword; taking each set of candidate keywords as a statistical item to count the aggregate heat of a plurality of candidate keywords that investors pay attention to at the same time; and adding the overall heat, the vocabulary heat and the aggregate heat to obtain the investors' retrieval heat for the enterprise.
In this embodiment, the statistical methods of the overall heat, the vocabulary heat, and the aggregate heat are the same as those of the first embodiment. The overall heat is mainly used for reflecting the attention degree of investors to the complete retrieval information.
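The three kinds of statistical items (the complete retrieval information, individual candidate keywords, and sets of keywords searched together) can be counted as in the sketch below. It is an assumption-laden illustration: whitespace splitting stands in for a real word segmentation tool, each keyword is counted at most once per query, and the function name and sample queries are invented.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def count_search_items(queries: Iterable[str], candidates: Set[str]) -> Tuple[Counter, Counter, Counter]:
    """Count whole queries, single keywords, and co-occurring keyword sets."""
    overall: Counter = Counter()       # complete retrieval information
    vocabulary: Counter = Counter()    # individual candidate keywords
    keyword_sets: Counter = Counter()  # keywords searched together
    for query in queries:
        overall[query] += 1
        # Assumption: str.split() replaces a proper word segmenter.
        hits = frozenset(tok for tok in query.split() if tok in candidates)
        vocabulary.update(hits)
        if len(hits) > 1:
            keyword_sets[hits] += 1
    return overall, vocabulary, keyword_sets


# Made-up investor queries and candidate keywords.
queries = ["fintech payments", "fintech payments", "fintech lending"]
candidates = {"fintech", "payments", "lending"}
overall, vocabulary, keyword_sets = count_search_items(queries, candidates)
```

These raw counts would then feed the per-period weighting before the overall heat, vocabulary heat and aggregate heat are added together.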
Example IV
The invention also discloses a computer readable storage medium having stored thereon a computer program which when executed by a processor realizes the steps of: acquiring a text to be extracted, wherein the text to be extracted comprises at least one enterprise basic information text, at least one enterprise financing text and at least one enterprise business model text, and determining the text type of the text to be extracted according to the content of the text to be extracted; segmenting the text to be extracted of each text type to obtain candidate keywords, and obtaining initial weights of the candidate keywords; obtaining similarity values of each candidate keyword and candidate keywords of other text types; acquiring a heat value of each candidate keyword; obtaining a weight optimization value of each candidate keyword according to the similarity value, the heat value and the initial weight of each candidate keyword; and determining the candidate keywords with the weight optimization values exceeding a preset threshold as enterprise tags.
Because the text types differ, segmenting the text to be extracted of each text type yields candidate keywords that include basic information keywords reflecting the basic information of the enterprise, financing keywords reflecting the financing information of the enterprise, and business model keywords reflecting the business model of the enterprise.
In other embodiments, when the computer program is executed by a processor, the step of obtaining the initial weight of each candidate keyword is implemented, and specifically includes the following steps: (1) obtaining a position parameter r_i1 of the candidate keyword according to the position of the candidate keyword in the text to be extracted: when the candidate keyword appears in both the title and the body of the text to be extracted, r_i1 = 2; when the candidate keyword appears in only the title or only the body of the text to be extracted, r_i1 = 1; (2) obtaining a repetition parameter r_i2 of the candidate keyword according to the number of times the candidate keyword is repeated in the text to be extracted, wherein a_i is the number of repetitions of the i-th candidate keyword and n is the number of candidate keywords; (3) obtaining an expression parameter r_i3 of the candidate keyword according to whether the candidate keyword can convey meaning independently in the text to be extracted: when it can, r_i3 = 1; when it cannot, r_i3 = 0; (4) obtaining a part-of-speech parameter r_i4 of the candidate keyword according to the part of speech of the candidate keyword in the text to be extracted: when the candidate keyword is a verb, an adjective, a quantity word or a pronoun, r_i4 = 0; when the candidate keyword is a noun, r_i4 = 1; (5) obtaining the initial weight ω_i0 of the candidate keyword according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter, wherein n is the number of candidate keywords.
It should be appreciated that the initial weight of each of the candidate keywords is determined based on the text to be extracted in which the candidate keyword is located.
In some other embodiments, when the computer program is executed by a processor, the step of obtaining the similarity value of each candidate keyword and the candidate keywords of other text types is implemented, and specifically includes the following steps: (1) constructing a first vector A according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of the candidate keyword, the first vector being A = (r_i1, r_i2, r_i3, r_i4), wherein r_i1, r_i2, r_i3 and r_i4 are the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the i-th candidate keyword, respectively; (2) constructing a second vector B according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of the associated word, the second vector being B = (r_j1, r_j2, r_j3, r_j4), wherein r_j1, r_j2, r_j3 and r_j4 are the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the j-th candidate keyword, respectively, and the associated word is a candidate keyword of another text type, that is, the i-th and j-th candidate keywords come from texts to be extracted of different text types; (3) calculating the similarity value of the candidate keyword and the associated word by using the first vector and the second vector, wherein the calculation formula of the similarity value is:
In other embodiments, when the computer program is executed by a processor, the step of obtaining the heat value of each candidate keyword is implemented, and specifically includes the following steps: acquiring retrieval information input by investors when searching for enterprises; performing word segmentation on the retrieval information to obtain candidate keywords, and taking each candidate keyword as a statistical item to count the vocabulary heat of the candidate keyword; taking each set of candidate keywords as a statistical item to count the aggregate heat of a plurality of candidate keywords that investors pay attention to at the same time; and adding the overall heat, the vocabulary heat and the aggregate heat to obtain the investors' retrieval heat for the enterprise.
In this embodiment, the statistical methods of the overall heat, the vocabulary heat, and the aggregate heat are the same as those of the first embodiment. The overall heat is mainly used for reflecting the attention degree of investors to the complete retrieval information.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware, where the program may be stored in a non-volatile computer readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include non-volatile memory and/or volatile memory, where: (1) non-volatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory; (2) volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM), among others.
It should be noted that in the description of the present invention, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The above embodiments are only preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, but any insubstantial changes and substitutions made by those skilled in the art on the basis of the present invention are intended to be within the scope of the present invention as claimed.

Claims (8)

1. A method for acquiring an enterprise tag, characterized by comprising the following steps:
acquiring a text to be extracted, wherein the text to be extracted comprises at least one enterprise basic information text, at least one enterprise financing text and at least one enterprise business model text, and determining the text type of the text to be extracted according to the content of the text to be extracted;
Segmenting the text to be extracted of each text type to obtain candidate keywords, and obtaining the initial weight of each candidate keyword, including: obtaining a position parameter r_i1 of the candidate keyword according to the position of the candidate keyword in the text to be extracted, wherein when the candidate keyword appears in both the title and the body of the text to be extracted, r_i1 = 2, and when the candidate keyword appears in only the title or only the body of the text to be extracted, r_i1 = 1; obtaining a repetition parameter r_i2 of the candidate keyword according to the number of times the candidate keyword is repeated in the text to be extracted, wherein a_i is the number of repetitions of the i-th candidate keyword and n is the number of candidate keywords; obtaining an expression parameter r_i3 of the candidate keyword according to whether the candidate keyword can convey meaning independently in the text to be extracted, wherein when it can, r_i3 = 1, and when it cannot, r_i3 = 0; obtaining a part-of-speech parameter r_i4 of the candidate keyword according to the part of speech of the candidate keyword in the text to be extracted, wherein when the candidate keyword is a verb, an adjective, a quantity word or a pronoun, r_i4 = 0, and when the candidate keyword is a noun, r_i4 = 1; and obtaining the initial weight ω_i0 of the candidate keyword according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter;
Obtaining similarity values of each candidate keyword and candidate keywords of other text types;
Acquiring a heat value of each candidate keyword;
Obtaining a weight optimization value of each candidate keyword according to the similarity value, the heat value and the initial weight of each candidate keyword;
and determining the candidate keywords with the weight optimization values exceeding a preset threshold as enterprise tags.
2. The method for obtaining an enterprise tag as claimed in claim 1, wherein obtaining a similarity value between each candidate keyword and candidate keywords of other text types comprises the steps of:
Constructing a first vector A according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of the candidate keyword, the first vector being A = (r_i1, r_i2, r_i3, r_i4), wherein r_i1, r_i2, r_i3 and r_i4 are the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the i-th candidate keyword, respectively;
Constructing a second vector B according to the position parameter, the repetition parameter, the expression parameter and the part-of-speech parameter of the associated word, the second vector being B = (r_j1, r_j2, r_j3, r_j4), wherein r_j1, r_j2, r_j3 and r_j4 are the position parameter, repetition parameter, expression parameter and part-of-speech parameter of the j-th candidate keyword, respectively, and the associated word is a candidate keyword of another text type;
calculating the similarity value of the candidate keywords and the associated words by using the first vector and the second vector, wherein the calculation formula of the similarity value is as follows:
3. The method for obtaining the enterprise tag according to claim 2, wherein obtaining the heat value of each candidate keyword comprises the steps of:
taking each candidate keyword as a statistical item to count the vocabulary heat of the candidate keyword;
Taking each set of candidate keywords as a statistical item to count the aggregate heat of a plurality of candidate keywords that investors pay attention to at the same time;
And adding the vocabulary heat and the aggregate heat to obtain the retrieval heat of the candidate keywords.
4. The method for obtaining an enterprise tag according to claim 3, wherein the statistical methods of the vocabulary heat and the aggregate heat are the same, and the statistical methods are:
setting a statistical starting time, and dividing the duration between the statistical starting time and the calculation time of the vocabulary heat or the collection heat into a plurality of time periods;
And weighting the vocabulary heat or the aggregate heat such that the farther a time period is from the current time, the lower its contribution to the heat value, wherein: λ_j is the weight value corresponding to the j-th time period, and the closer a time period is to the time at which the heat value is calculated, the larger the corresponding weight value; β_ij is the number of times the statistical item of the vocabulary heat or the aggregate heat is collected in the j-th time period.
5. The method for obtaining an enterprise tag according to claim 4, wherein the calculation formula for obtaining the weight optimization value of each candidate keyword according to the similarity value, the heat value and the initial weight of each candidate keyword is as follows:
6. An enterprise tag acquisition apparatus for implementing the enterprise tag acquisition method according to any one of claims 1 to 5, characterized by comprising a first acquisition module, a second acquisition module, a third acquisition module, a fourth acquisition module, a calculation module, and a determination module, wherein: the first acquisition module acquires a text to be extracted, wherein the text to be extracted comprises at least one enterprise basic information text, at least one enterprise financing text and at least one enterprise business model text, and determines the text type of the text to be extracted according to the content of the text to be extracted; the second acquisition module segments the text to be extracted of each text type to obtain candidate keywords, and acquires initial weights of the candidate keywords; the third acquisition module acquires similarity values of each candidate keyword and candidate keywords of other text types; the fourth acquisition module acquires the heat value of each candidate keyword; the calculation module obtains a weight optimization value of each candidate keyword according to the similarity value, the heat value and the initial weight of each candidate keyword; and the determination module determines the candidate keywords with the weight optimization values exceeding a preset threshold as enterprise tags.
7. A computer device, characterized by: comprising a memory and a processor connected to the memory, the memory storing a computer program which, when executed by the processor, implements the steps of the method of obtaining an enterprise tag according to any one of claims 1-5.
8. A computer-readable storage medium, characterized by: a computer program stored thereon, which when executed by a processor, implements the steps of the method of obtaining an enterprise tag as claimed in any one of claims 1 to 5.
CN202011264990.1A 2020-11-13 2020-11-13 Enterprise tag acquisition method, enterprise tag acquisition device, storage medium and computer equipment Active CN112434158B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011264990.1A CN112434158B (en) 2020-11-13 2020-11-13 Enterprise tag acquisition method, enterprise tag acquisition device, storage medium and computer equipment

Publications (2)

Publication Number Publication Date
CN112434158A (en) 2021-03-02
CN112434158B (en) 2024-05-28

Family

ID=74699951

Country Status (1)

Country Link
CN (1) CN112434158B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282763B (en) * 2021-06-28 2023-03-10 深圳平安智汇企业信息管理有限公司 Text key information extraction device, equipment and storage medium
CN116226213B (en) * 2023-02-22 2023-11-10 广州集联信息技术有限公司 Information recommendation system and method based on big data
CN116069938B (en) * 2023-04-06 2023-06-20 中电科大数据研究院有限公司 Text relevance analysis method

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608166A (en) * 2015-12-18 2016-05-25 Tcl集团股份有限公司 Label extracting method and device
KR101662450B1 (en) * 2015-05-29 2016-10-05 포항공과대학교 산학협력단 Multi-source hybrid question answering method and system thereof
CN106156204A (en) * 2015-04-23 2016-11-23 深圳市腾讯计算机系统有限公司 The extracting method of text label and device
CN107066599A (en) * 2017-04-20 2017-08-18 北京文因互联科技有限公司 A kind of similar enterprise of the listed company searching classification method and system of knowledge based storehouse reasoning
CN107122413A (en) * 2017-03-31 2017-09-01 北京奇艺世纪科技有限公司 A kind of keyword extracting method and device based on graph model
CN107861948A (en) * 2017-11-16 2018-03-30 百度在线网络技术(北京)有限公司 A kind of tag extraction method, apparatus, equipment and medium
CN108509569A (en) * 2018-03-26 2018-09-07 河北省科学院应用数学研究所 Generation method, device, electronic equipment and the storage medium of enterprise's portrait
CN108874992A (en) * 2018-06-12 2018-11-23 深圳华讯网络科技有限公司 The analysis of public opinion method, system, computer equipment and storage medium
CN109101477A (en) * 2018-06-04 2018-12-28 东南大学 A kind of enterprise's domain classification and enterprise's keyword screening technique
JP2019008476A (en) * 2017-06-22 2019-01-17 富士通株式会社 Generating program, generation device and generation method
CN109255118A (en) * 2017-07-11 2019-01-22 普天信息技术有限公司 A kind of keyword extracting method and device
CN109726905A (en) * 2018-12-20 2019-05-07 北交金科金融信息服务有限公司 A kind of method and system of enterprise value portrait evaluation
CN109961091A (en) * 2019-03-01 2019-07-02 杭州叙简科技股份有限公司 A kind of accident word tag of self study and summarization generation system and method
CN109992646A (en) * 2019-03-29 2019-07-09 腾讯科技(深圳)有限公司 The extracting method and device of text label
CN110147482A (en) * 2017-09-11 2019-08-20 百度在线网络技术(北京)有限公司 Method and apparatus for obtaining burst hot spot theme
CN110442704A (en) * 2019-08-13 2019-11-12 重庆誉存大数据科技有限公司 A kind of Company News screening technique and system
CN110674319A (en) * 2019-08-15 2020-01-10 中国平安财产保险股份有限公司 Label determination method and device, computer equipment and storage medium
CN111353014A (en) * 2018-12-20 2020-06-30 阿里巴巴集团控股有限公司 Method and device for extracting job keywords and updating post requirements
CN111611340A (en) * 2019-02-26 2020-09-01 广州慧睿思通信息科技有限公司 Information extraction method and device, computer equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108334533B (en) * 2017-10-20 2021-12-24 腾讯科技(深圳)有限公司 Keyword extraction method and device, storage medium and electronic device

Also Published As

Publication number Publication date
CN112434158A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
US11663254B2 (en) System and engine for seeded clustering of news events
CN112434158B (en) Enterprise tag acquisition method, enterprise tag acquisition device, storage medium and computer equipment
CN109583620B (en) Enterprise potential risk early warning method, enterprise potential risk early warning device, computer equipment and storage medium
US9626440B2 (en) Tenantization of search result ranking
US9208219B2 (en) Similar document detection and electronic discovery
CN108520041B (en) Industry classification method and system of text, computer equipment and storage medium
CN112434216B (en) Intelligent recommendation method and device for investment projects, storage medium and computer equipment
JP7451747B2 (en) Methods, devices, equipment and computer readable storage media for searching content
CN109829629A (en) Method, device, computer equipment and storage medium for generating risk analysis reports
US20100106719A1 (en) Context-sensitive search
CN108509424A (en) Institutional information processing method, device, computer equipment and storage medium
CA2956627A1 (en) System and engine for seeded clustering of news events
US20160378847A1 (en) Distributional alignment of sets
CN112256863B (en) Method and device for determining corpus intention and electronic equipment
CN110909120A (en) Resume searching/delivering method, device and system and electronic equipment
CN110442713A (en) Abstract generation method, apparatus, computer equipment and storage medium
CN112288279A (en) Business risk assessment method and device based on natural language processing and linear regression
US20090327877A1 (en) System and method for disambiguating text labeling content objects
CN112685639A (en) Activity recommendation method and device, computer equipment and storage medium
CN115392235A (en) Character matching method and device, electronic equipment and readable storage medium
CN114547257B (en) Class matching method and device, computer equipment and storage medium
CN112182390B (en) Mail pushing method, device, computer equipment and storage medium
CN111985217B (en) Keyword extraction method, computing device and readable storage medium
CN115269765A (en) Account identification method and device, electronic equipment and storage medium
CN114996215A (en) File searching method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231024

Address after: 5th Floor, Block B, Building 1, No. 151 Huizhiqiao Road, High-tech Zone, Qingdao, Shandong Province

Applicant after: Haichuanghui Technology Entrepreneurship Development Co.,Ltd.

Address before: 100022 unit 02, 10 / F, building 108, building a 108, building B 108, building 110, building 112, building 116, building 118, building a 118, building B 118

Applicant before: Beijing Chuangye Guangrong Information Technology Co.,Ltd.

GR01 Patent grant