CN110489739A - Named-entity extraction method and device for public security case texts and confession texts based on the CRF algorithm
- Publication number
- CN110489739A (application CN201910593309.9A)
- Authority
- CN
- China
- Prior art keywords
- case
- text
- confession
- public security
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A30/00—Adapting or protecting infrastructure or their operation
- Y02A30/60—Planning or developing urban green infrastructure
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Tourism & Hospitality (AREA)
- General Physics & Mathematics (AREA)
- Economics (AREA)
- Health & Medical Sciences (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- General Health & Medical Sciences (AREA)
- General Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Machine Translation (AREA)
Abstract
The present invention relates to the technical field of natural language processing and discloses a named-entity extraction method and device, based on the CRF (conditional random field) algorithm, for public security case texts and confession texts. The method comprises: obtaining the data of public security case texts and case confessions, merging each case text with its corresponding confession into a single text record, and storing the record in a data table for annotation; annotating entity words in the merged text; performing part-of-speech tagging and extracting features from the annotations to build a basic feature template; feeding the basic feature template and the corpus of case texts and confessions into a CRF model for training to obtain a named-entity extraction model; building an information data table of city street conditions within the public security monitoring range; and identifying the information in newly added case texts and confessions with the extraction model and mapping the result to the city street information data table for information extraction, which improves office efficiency.
Description
Technical field
The present invention relates to the technical field of natural language processing, and specifically discloses a named-entity extraction method and device for public security case texts and confession texts based on the CRF algorithm.
Background art
With the rapid development of natural language processing, the technology is now widely used in search engines and related industries. Public security organs have accumulated large volumes of case text data over years of informatization, and public security departments must commit ever more manpower to analyzing and classifying case texts and confession texts.
At present, because cases and confessions are described and recorded by many different officers, the wording varies subjectively and no standard descriptive terminology exists. To find relevant information accurately, officers must spend more time and effort when consulting records, which greatly increases their workload and labor costs and sharply reduces office efficiency. Moreover, when an officer needs to extract certain case information, it can only be obtained by opening the case and reading it in full; the key information of a case cannot be seen at a glance, which makes case analysis inefficient.
Therefore, the industry needs a method and device that can solve the above problems.
Summary of the invention
To overcome the shortcomings and defects of the prior art, the purpose of the present invention is to provide a named-entity extraction method and device, based on the CRF algorithm, for public security case texts and confession texts, so that public security officers can learn the relevant information of a case quickly and accurately in the course of their work.
To achieve the above purpose, the present invention adopts the following scheme.
A named-entity extraction method for public security case texts and confession texts based on the CRF algorithm, comprising:
obtaining the data of public security case texts and case confessions, merging each case text with its corresponding confession into a single text record, and storing the record in a data table for annotation;
annotating entity words in the merged text of the case text and case confession;
performing part-of-speech tagging, and extracting features from the annotations to build a basic feature template;
feeding the basic feature template and the corpus of public security case texts and case confessions into a CRF model for training to obtain a named-entity extraction model;
building an information data table of city street conditions within the public security monitoring range;
identifying the information in newly added case texts and confessions with the named-entity extraction model, and mapping the result to the city street information data table for information extraction.
Further, performing part-of-speech tagging and extracting features from the annotations to build the basic feature template comprises:
segmenting the corpus with the jieba word segmentation method and performing part-of-speech tagging with jieba.posseg;
based on the segmentation and part-of-speech tags, labeling each token with the BIEOS tagging scheme to obtain its label, where B marks the beginning of an entity, I the inside of an entity, E the end of an entity, O a token outside any entity, and S a single-token entity;
extracting features from the corpus to build the basic feature template, where the features include part-of-speech features, entity-word features, and labels.
Further, the basic feature template is a custom feature template based on U-gram (unigram) templates, and building it comprises:
establishing the feature templates:
U00:%x[-2,0]
U01:%x[-1,0]
U02:%x[0,0]
U03:%x[1,0]
U04:%x[2,0]
U05:%x[-2,1]
U06:%x[-1,1]
U07:%x[0,1]
U08:%x[1,1]
U09:%x[2,1]
U10:%x[-2,0]/%x[-1,0]/%x[0,0]
U11:%x[-1,0]/%x[0,0]/%x[1,0]
U12:%x[0,0]/%x[1,0]/%x[2,0]
U13:%x[-2,0]/%x[-1,1]
U14:%x[0,0]/%x[1,0]
U15:%x[-1,0]/%x[0,0]
U16:%x[1,1]/%x[2,1]
U17:%x[-1,1]/%x[0,1]
U18:%x[0,1]/%x[1,1]
where U00 to U09 each denote the feature token at the corresponding position, and U10 to U18 denote combinations formed from feature tokens;
the part-of-speech features, entity-word features, and labels are substituted into the custom feature template, which specifies the positions of the feature tokens and the composition of the combinations.
Further, the entity words include the crime location, lost items, tools involved in the case, and means involved in the case; the parts of speech include nouns, verbs, adjectives, pronouns, and prepositions.
Further, the method includes preprocessing before the corpus is fed into the CRF model for training, specifically:
using public security system data to build a crime-location data table, a lost-item-type data table, and a case-tool data table;
converting the corpus of public security case texts and case confessions into the input format of the CRF model, where each corpus entry has the format <word, part-of-speech feature, lost-item feature, case-tool feature, place feature, label>;
traversing every word in the corpus, and marking the lost-item feature, case-tool feature, or place feature as 1 if the word appears in the corresponding data table and as 0 otherwise.
Further, the information on city street conditions includes city street address information and the corresponding house, unit, place, and personal information.
A mobile device, comprising:
a case text and confession text integration module, for obtaining the data of public security case texts and case confessions and merging each case text with its corresponding confession into a single text record;
a database module, for recording the information on city street conditions;
a processor, adapted to execute program instructions;
a storage device, adapted to store program instructions, the program instructions being adapted to be loaded and executed by the processor to implement the above named-entity extraction method for public security case texts and confession texts based on the CRF algorithm.
A computer-readable storage device storing a computer program, the computer program being executed by a processor to implement the above named-entity extraction method for public security case texts and confession texts based on the CRF algorithm.
A named-entity extraction system for public security case texts and confession texts based on the CRF algorithm, comprising a server;
the server includes a processor and a storage device;
the processor is adapted to execute program instructions;
the storage device is adapted to store program instructions, the program instructions being adapted to be loaded and executed by the processor to implement the above named-entity extraction method for public security case texts and confession texts based on the CRF algorithm.
Beneficial effects of the present invention: a named-entity extraction method and device for public security case texts and confession texts based on the CRF algorithm are provided. The data of public security case texts and case confessions are obtained, each case text is merged with its corresponding confession into a single text record, and the record is stored in a data table for entity-word annotation and part-of-speech tagging. Once annotation is complete, features are extracted from the annotations to build a basic feature template, and the template together with the case text and confession information is fed into a CRF model for training, yielding a general named-entity extraction model. An information data table of city street conditions within the public security monitoring range is built at the same time. When new case text and confession data arrive, they are passed to the extraction model, which identifies their key information and makes case information easier for officers to query; the result is also mapped to the city street information data table and fed back to the officers, so that the extracted case information is more complete and accurate. Because the scheme builds a general extraction model through sample training, it adapts to the differences in wording among the descriptions and records of different officers, finds relevant information accurately, and greatly improves case-handling efficiency.
Brief description of the drawings
Fig. 1 is a flow diagram of an embodiment of the present invention.
Fig. 2 is a schematic diagram of the device of an embodiment of the present invention.
Fig. 3 is a schematic diagram of the corpus training format of an embodiment of the present invention.
Fig. 4 is a schematic diagram of BIEOS labeling in an embodiment of the present invention.
Fig. 5 is a schematic diagram of address information extraction in an embodiment of the present invention.
Detailed description of the embodiments
To help those skilled in the art understand the present invention, it is further described below with reference to the embodiments and drawings; the content of the embodiments does not limit the invention.
The present invention provides a named-entity extraction method for public security case texts and confession texts based on the CRF algorithm, as shown in Fig. 1. To build a general model suited to public security case text and confession information, a certain amount of sample training must first be carried out on the case texts and confession information already held in the public security system, so that the model adapts to the differences in wording among the descriptions and records of different officers and retrieves the corresponding information accurately, improving office efficiency. Therefore, the data of case texts and case confessions are first obtained from the public security system, and each case text is merged with its corresponding confession into a single text record, so that a case text and its confession are kept together; to facilitate subsequent annotation, the record is stored in a data table.
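The merging step can be sketched as follows. This is a minimal illustration, not the patented implementation: the table name `to_annotate`, the column names, and the in-memory SQLite database are all assumptions chosen so the example runs on its own.

```python
import sqlite3


def build_annotation_table(cases, confessions):
    """Merge each case text with its confessions and store one row per case.

    cases:       dict mapping case_id -> case text
    confessions: dict mapping case_id -> list of confession texts
    (Both structures and the table layout are hypothetical.)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE to_annotate (case_id TEXT PRIMARY KEY, merged_text TEXT)"
    )
    for case_id, case_text in cases.items():
        # A case with no recorded confession is stored with the case text only.
        parts = [case_text] + confessions.get(case_id, [])
        conn.execute(
            "INSERT INTO to_annotate VALUES (?, ?)", (case_id, "\n".join(parts))
        )
    conn.commit()
    return conn


cases = {"A001": "Case summary text."}
confessions = {"A001": ["Confession text."]}
conn = build_annotation_table(cases, confessions)
rows = conn.execute("SELECT case_id, merged_text FROM to_annotate").fetchall()
```

Keeping one row per case means an annotator always sees the case text and its confession together, which is the correspondence the paragraph above asks for.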
Entity words are then annotated in the merged text. The entity words mainly include the crime location, lost items, tools involved in the case, and means involved in the case. These are the key items commonly needed in office work; the aim is to extract key information rather than the full case text or a long passage, so that officers no longer need to pick the information out of a long passage by hand, which improves efficiency. These entity types are only one embodiment; other entity types may be added as different public security offices require. Entity words may be annotated manually, by the system, or by the system with manual checking; no limitation is imposed here.
Part-of-speech tagging is performed mainly to distinguish parts of speech, including but not limited to nouns, verbs, adjectives, pronouns, and prepositions. For example, the same word may act as either a noun or a verb, and tagging avoids confusion when the extraction model built later is applied.
As shown in Figs. 3 and 4, part-of-speech tagging begins by segmenting the corpus with the jieba word segmentation method, that is, splitting a long sentence into tokens. For example, the sentence "preparing to take a bus home at the junction of the Dong Keng crossing in LiaoPo town, Dongguan City" is segmented into "Dongguan City / Liao / Po / town / Dong Keng / crossing / junction / place / preparing / by bus / going home" (tokens shown in the word order of the original Chinese). Part-of-speech tagging is then performed with jieba.posseg. Next, based on the segmentation and part-of-speech tags, each token is labeled with the BIEOS tagging scheme to obtain its label, where B marks the beginning of an entity, I the inside of an entity, E the end of an entity, O a token outside any entity, and S a single-token entity; for example, the label of "Dongguan City" in Fig. 3 is B-PLACE. Labeling in this way facilitates subsequent feature extraction and speeds up building the basic feature template.
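The BIEOS labeling step can be sketched in a few lines. In practice the tokens would come from jieba (for example, `jieba.posseg.cut(text)` yields word/flag pairs); here they are hard-coded English glosses so the example runs without jieba installed. The function name `bieos_tags` and the span format are assumptions made for illustration.

```python
def bieos_tags(n_tokens, spans):
    """Return one BIEOS label per token.

    spans: list of (start, end, entity_type) with `end` exclusive and
    spans non-overlapping. Tokens outside every span are labeled "O".
    """
    tags = ["O"] * n_tokens
    for start, end, etype in spans:
        if end - start == 1:
            tags[start] = f"S-{etype}"      # single-token entity
            continue
        tags[start] = f"B-{etype}"          # beginning of the entity
        for i in range(start + 1, end - 1):
            tags[i] = f"I-{etype}"          # inside the entity
        tags[end - 1] = f"E-{etype}"        # end of the entity
    return tags


# Illustrative glosses of the segmented example sentence.
tokens = ["Dongguan City", "Liao", "Po", "town", "Dong Keng", "crossing",
          "junction", "place", "preparing", "by bus", "going home"]
# Assume an annotator marked the first eight tokens as one PLACE entity.
tags = bieos_tags(len(tokens), [(0, 8, "PLACE")])
```

With this input the first token receives B-PLACE, matching the label the text cites from Fig. 3, and the three trailing tokens stay O.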
Features are then extracted from the corpus to build the basic feature template. The template is in effect an empty form: it records only the features to be trained, which are the part-of-speech features, entity-word features, and labels described above.
In this embodiment, the basic feature template is a custom feature template based on U-gram (unigram) templates. Each macro in a template has the form %x[row,col]; because the U-gram template type is used, every template identifier begins with the letter U. Here row is the row offset relative to the current position and col is the corresponding column. Each line below is one template:
U00:%x[-2,0]
U01:%x[-1,0]
U02:%x[0,0]
U03:%x[1,0]
U04:%x[2,0]
U05:%x[-2,1]
U06:%x[-1,1]
U07:%x[0,1]
U08:%x[1,1]
U09:%x[2,1]
U10:%x[-2,0]/%x[-1,0]/%x[0,0]
U11:%x[-1,0]/%x[0,0]/%x[1,0]
U12:%x[0,0]/%x[1,0]/%x[2,0]
U13:%x[-2,0]/%x[-1,1]
U14:%x[0,0]/%x[1,0]
U15:%x[-1,0]/%x[0,0]
U16:%x[1,1]/%x[2,1]
U17:%x[-1,1]/%x[0,1]
U18:%x[0,1]/%x[1,1]
Here U00 to U09 each denote the feature token at the corresponding position, and U10 to U18 denote combinations formed from feature tokens. The numbers above are only identifiers, not actual position coordinates, and should not be taken as limiting the invention. The part-of-speech features, entity-word features, and labels are substituted into the custom template, which specifies the positions of the feature tokens and the composition of the combinations.
For clarity, an illustration combining BIEOS labels and segmentation follows; it does not limit the invention. For example (the labels follow the token order of the original Chinese sentence):
Preparing to take a bus home at the junction of the Dong Keng crossing in LiaoPo town, Dongguan City, the victim was robbed at knifepoint
B I I I I I I I E O O O O B E
The meanings of B, I, E, O, and S are given above and are not repeated here.
Suppose the current token is "Dong Keng". Then U02:%x[0,0] refers to "Dong Keng", U00:%x[-2,0] refers to the token two positions before it, U03:%x[1,0] refers to "crossing", U11:%x[-1,0]/%x[0,0]/%x[1,0] refers to "town/Dong Keng/crossing", and so on.
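How a `%x[row,col]` macro resolves against the current token can be sketched as below. This is an illustrative re-implementation, not code from the patent or from any CRF toolkit: the function name `expand`, the part-of-speech flags in column 1, and the `_B-1` / `_B+1` padding strings (modeled on the convention used by CRF++) are assumptions.

```python
import re

# Matches one macro such as %x[-2,0]: group 1 is the row offset, group 2 the column.
MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")


def expand(template, rows, pos):
    """Expand one template line at token index `pos`.

    rows: list of tuples, one per token; column 0 is the word and
    column 1 the part-of-speech flag.
    """
    def lookup(match):
        r = pos + int(match.group(1))
        c = int(match.group(2))
        if r < 0:
            return f"_B{r}"                      # before sentence start, e.g. _B-1
        if r >= len(rows):
            return f"_B+{r - len(rows) + 1}"     # after sentence end, e.g. _B+1
        return rows[r][c]
    return MACRO.sub(lookup, template)


# English glosses of the example tokens with made-up part-of-speech flags.
rows = [("Dongguan City", "ns"), ("Liao", "n"), ("Po", "n"), ("town", "n"),
        ("Dong Keng", "ns"), ("crossing", "n"), ("junction", "n")]
current = 4  # index of "Dong Keng"

unigram = expand("U02:%x[0,0]", rows, current)
right = expand("U03:%x[1,0]", rows, current)
window = expand("U11:%x[-1,0]/%x[0,0]/%x[1,0]", rows, current)
pos_flag = expand("U07:%x[0,1]", rows, current)
```

Each expanded string (for example the three-token window for U11) becomes one feature of the current token; column 0 macros draw on the words and column 1 macros on the part-of-speech flags.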
The basic feature template and the corpus of public security case texts and case confessions are then fed into the CRF model. In this process the content of the case texts and confessions is filled into the template as the basic feature template prescribes, sample training is carried out, and the named-entity extraction model is obtained.
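For orientation, the model being trained can be written in the standard textbook form of a linear-chain CRF. This formulation is general background rather than something stated in the patent; the symbols $\lambda_k$ (learned weights) and $f_k$ (feature functions produced by filling the templates) are the usual notation, not names taken from the source.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big),
\qquad
Z(x) = \sum_{y'} \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Big)
```

Here $x$ is the token sequence with its feature columns, $y$ is the BIEOS label sequence, and training chooses the weights $\lambda_k$ that maximize the likelihood of the annotated corpus.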
In this embodiment, to better suit the CRF model, the corpus is preprocessed before being fed into the model for training, specifically:
using public security system data to build a crime-location data table place_data, a lost-item-type data table Hings_data, and a case-tool data table tools_data;
as shown in Fig. 3, converting the corpus of public security case texts and case confessions into the input format of the CRF model, where each corpus entry has the format <word, part-of-speech feature, lost-item feature, case-tool feature, place feature, label>; traversing every word in the corpus, and marking the lost-item feature, case-tool feature, or place feature as 1 if the word appears in the corresponding data table and as 0 otherwise, so that the information is reflected more directly.
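The preprocessing above amounts to a dictionary lookup per token. The sketch below assumes the three data tables have been loaded into Python sets; `place_data` and `tools_data` reuse the table names from the text, while `lost_item_data`, the sample words, the part-of-speech flags, and the tab-separated layout are illustrative assumptions.

```python
def to_crf_rows(tagged_tokens, lost_item_data, tools_data, place_data):
    """Build <word, pos, lost-item, tool, place, label> rows.

    tagged_tokens: list of (word, pos_flag, label) triples.
    Each lookup feature is 1 if the word appears in the matching table, else 0.
    """
    rows = []
    for word, pos, label in tagged_tokens:
        rows.append((
            word,
            pos,
            int(word in lost_item_data),
            int(word in tools_data),
            int(word in place_data),
            label,
        ))
    return rows


def format_rows(rows):
    """Serialize rows as tab-separated lines, one token per line."""
    return "\n".join("\t".join(str(v) for v in row) for row in rows)


# Hypothetical table contents and labels, for illustration only.
place_data = {"Dong Keng"}
tools_data = {"knife"}
lost_item_data = {"wallet"}

tagged = [("Dong Keng", "ns", "I-PLACE"),
          ("knife", "n", "S-TOOL"),
          ("wallet", "n", "S-ITEM"),
          ("preparing", "v", "O")]
rows = to_crf_rows(tagged, lost_item_data, tools_data, place_data)
text = format_rows(rows)
```

The column order follows the format given in the text, so a feature template can address the part-of-speech flag as column 1 and the lookup features as columns 2 to 4.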
Once the named-entity extraction model has been built, information can be extracted from newly added cases and queried directly. To make the extracted information more accurate and meet the rigor required of public security work, the present application also builds an information data table of city street conditions within the public security monitoring range, established from public security system data. The information on city street conditions includes city street address information and the corresponding house, unit, place, and personal information. This table mainly holds the "two standards, four actuals" information advocated by the state: the two standards are the standard address library and the standard operation map, and the four actuals are actual population, actual houses, actual units, and actual facilities. After the extraction model has identified the key information, that information can be mapped to the city street information table, where the system performs a verification. For example, suppose the extraction model outputs the crime location, lost items, tools involved, and means involved, and the event is a bank robbery, while the city street information table records the location as a residential house; the system then recognizes the error and re-extracts the case information, which greatly improves accuracy. More specifically, as shown in Fig. 5, the extracted address "Ring City East Road ***, Tangxia Town, Dongguan City, Guangdong Province" (desensitized, since the data are sensitive) is mapped to the public security department's "two standards, four actuals" table, and the table returns that the property at that address is an actual rental house.
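The verification step can be sketched as a lookup followed by a plausibility check. Everything concrete here is hypothetical: the patent does not specify how the mismatch is detected, so the `PLAUSIBLE_PREMISES` rule table, the status strings, and the sample addresses are assumptions made only to illustrate the bank-robbery example.

```python
# Hypothetical stand-in for the street information table:
# address -> premises type recorded for that address.
STREET_TABLE = {
    "Ring City East Road ***": "rental house",
    "Example Road 1": "residential house",
}

# Hypothetical rule: premises types considered plausible for each case type.
PLAUSIBLE_PREMISES = {
    "bank robbery": {"bank"},
    "burglary": {"residential house", "rental house"},
}


def verify(case_type, address, street_table=STREET_TABLE):
    """Check an extracted address against the street information table.

    Returns (status, premises) where status is "ok", "re_extract",
    or "unknown_address".
    """
    premises = street_table.get(address)
    if premises is None:
        return ("unknown_address", None)
    allowed = PLAUSIBLE_PREMISES.get(case_type)
    if allowed is not None and premises not in allowed:
        # Mismatch such as a bank robbery located at a residential house.
        return ("re_extract", premises)
    return ("ok", premises)


mismatch = verify("bank robbery", "Example Road 1")
match = verify("burglary", "Ring City East Road ***")
```

A "re_extract" status corresponds to the case where the system recognizes the error and runs extraction on the case again; an "ok" status returns the premises type to the officer, as in the rental-house example.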
In addition, as shown in Fig. 2, the present invention also provides a mobile device, comprising:
a case text and confession text integration module, for obtaining the data of public security case texts and case confessions and merging each case text with its corresponding confession into a single text record;
a database module, for recording the information on city street conditions;
a processor, adapted to execute program instructions;
a storage device, adapted to store program instructions, the program instructions being adapted to be loaded and executed by the processor to implement the above named-entity extraction method for public security case texts and confession texts based on the CRF algorithm.
The present invention further provides a computer-readable storage device storing a computer program, the computer program being executed by a processor to implement the above named-entity extraction method for public security case texts and confession texts based on the CRF algorithm.
The present invention also provides a named-entity extraction system for public security case texts and confession texts based on the CRF algorithm, comprising a server;
the server includes a processor and a storage device;
the processor is adapted to execute program instructions;
the storage device is adapted to store program instructions, the program instructions being adapted to be loaded and executed by the processor to implement the above named-entity extraction method for public security case texts and confession texts based on the CRF algorithm.
The above is only a preferred embodiment of the present invention. Those of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementation and scope of application; the content of this specification should not be construed as limiting the invention.
Claims (9)
1. A named-entity extraction method for public security case texts and confession texts based on the CRF algorithm, characterized by comprising:
obtaining the data of public security case texts and case confessions, merging each case text with its corresponding confession into a single text record, and storing the record in a data table for annotation;
annotating entity words in the merged text of the case text and case confession;
performing part-of-speech tagging, and extracting features from the annotations to build a basic feature template;
feeding the basic feature template and the corpus of public security case texts and case confessions into a CRF model for training to obtain a named-entity extraction model;
building an information data table of city street conditions within the public security monitoring range;
identifying the information in newly added case texts and confessions with the named-entity extraction model, and mapping the result to the city street information data table for information extraction.
2. The named-entity extraction method for public security case texts and confession texts based on the CRF algorithm according to claim 1, characterized in that performing part-of-speech tagging and extracting features from the annotations to build the basic feature template comprises:
segmenting the corpus with the jieba word segmentation method and performing part-of-speech tagging with jieba.posseg;
based on the segmentation and part-of-speech tags, labeling each token with the BIEOS tagging scheme to obtain its label, where B marks the beginning of an entity, I the inside of an entity, E the end of an entity, O a token outside any entity, and S a single-token entity;
extracting features from the corpus to build the basic feature template, where the features include part-of-speech features, entity-word features, and labels.
3. The named-entity extraction method for public security case texts and confession texts based on the CRF algorithm according to claim 2, characterized in that the basic feature template is a custom feature template based on U-gram templates, and building it comprises:
establishing the custom feature templates:
U00:%x[-2,0]
U01:%x[-1,0]
U02:%x[0,0]
U03:%x[1,0]
U04:%x[2,0]
U05:%x[-2,1]
U06:%x[-1,1]
U07:%x[0,1]
U08:%x[1,1]
U09:%x[2,1]
U10:%x[-2,0]/%x[-1,0]/%x[0,0]
U11:%x[-1,0]/%x[0,0]/%x[1,0]
U12:%x[0,0]/%x[1,0]/%x[2,0]
U13:%x[-2,0]/%x[-1,1]
U14:%x[0,0]/%x[1,0]
U15:%x[-1,0]/%x[0,0]
U16:%x[1,1]/%x[2,1]
U17:%x[-1,1]/%x[0,1]
U18:%x[0,1]/%x[1,1]
where U00 to U09 each denote the feature token at the corresponding position, and U10 to U18 denote combinations formed from feature tokens;
the part-of-speech features, entity-word features, and labels are substituted into the custom feature template, which specifies the positions of the feature tokens and the composition of the combinations.
4. The named-entity extraction method for public security case texts and confession texts based on the CRF algorithm according to claim 1, characterized in that the entity words include the crime location, lost items, tools involved in the case, and means involved in the case; the parts of speech include nouns, verbs, adjectives, pronouns, and prepositions.
5. The named-entity extraction method for public security case texts and confession texts based on the CRF algorithm according to claim 4, characterized by further comprising preprocessing before the corpus is fed into the CRF model for training, specifically:
using public security system data to build a crime-location data table, a lost-item-type data table, and a case-tool data table;
converting the corpus of public security case texts and case confessions into the input format of the CRF model, where each corpus entry has the format <word, part-of-speech feature, lost-item feature, case-tool feature, place feature, label>;
traversing every word in the corpus, and marking the lost-item feature, case-tool feature, or place feature as 1 if the word appears in the corresponding data table and as 0 otherwise.
6. The named-entity extraction method for public security case texts and confession texts based on the CRF algorithm according to any one of claims 1-5, characterized in that the information on city street conditions includes city street address information and the corresponding house, unit, place, and personal information.
7. A mobile device, characterized by comprising:
a case text and confession text integration module, for obtaining the data of public security case texts and case confessions and merging each case text with its corresponding confession into a single text record;
a database module, for recording the information on city street conditions;
a processor, adapted to execute program instructions;
a storage device, adapted to store program instructions, the program instructions being adapted to be loaded and executed by the processor to implement the named-entity extraction method for public security case texts and confession texts based on the CRF algorithm according to any one of claims 1-6.
8. A computer-readable storage device storing a computer program, characterized in that the computer program is executed by a processor to implement the named-entity extraction method for public security case texts and confession texts based on the CRF algorithm according to any one of claims 1 to 6.
9. A named-entity extraction system for public security case texts and confession texts based on the CRF algorithm, characterized by comprising a server;
the server includes a processor and a storage device;
the processor is adapted to execute program instructions;
the storage device is adapted to store program instructions, the program instructions being adapted to be loaded and executed by the processor to implement the named-entity extraction method for public security case texts and confession texts based on the CRF algorithm according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910593309.9A CN110489739B (en) | 2019-07-03 | 2019-07-03 | Naming extraction method and device for public security cases and oral text based on CRF algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910593309.9A CN110489739B (en) | 2019-07-03 | 2019-07-03 | Naming extraction method and device for public security cases and oral text based on CRF algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110489739A (en) | 2019-11-22 |
CN110489739B (en) | 2023-06-20 |
Family
ID=68546041
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910593309.9A Active CN110489739B (en) | 2019-07-03 | 2019-07-03 | Naming extraction method and device for public security cases and oral text based on CRF algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110489739B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112925919A (en) * | 2021-03-03 | 2021-06-08 | 曲阜师范大学 | Knowledge graph driven personalized job layout method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070046982A1 (en) * | 2005-08-23 | 2007-03-01 | Hull Jonathan J | Triggering actions with captured input in a mixed media environment |
US20120330971A1 (en) * | 2011-06-26 | 2012-12-27 | Itemize Llc | Itemized receipt extraction using machine learning |
US20150186361A1 (en) * | 2013-12-25 | 2015-07-02 | Kabushiki Kaisha Toshiba | Method and apparatus for improving a bilingual corpus, machine translation method and apparatus |
CN109190110A (en) * | 2018-08-02 | 2019-01-11 | 厦门快商通信息技术有限公司 | Training method, system, and electronic device for a named entity extraction model |
CN109710925A (en) * | 2018-12-12 | 2019-05-03 | 新华三大数据技术有限公司 | Named entity recognition method and device |
Also Published As
Publication number | Publication date |
---|---|
CN110489739B (en) | 2023-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111708773B (en) | Multi-source scientific and creative resource data fusion method | |
CN111680490B (en) | Cross-modal document processing method and device and electronic equipment | |
WO2021208696A1 (en) | User intention analysis method, apparatus, electronic device, and computer storage medium | |
CN107193796B (en) | Public opinion event detection method and device | |
CN108959566B (en) | Medical text de-identification method and system based on Stacking ensemble learning | |
CN107357765B (en) | Word document flaking method and device | |
CN112035675A (en) | Medical text labeling method, device, equipment and storage medium | |
CN111241230A (en) | Method and system for identifying string mark risk based on text mining | |
CN111222330B (en) | Chinese event detection method and system | |
CN109299469A (en) | Method for identifying complex addresses in long text | |
CN115130613B (en) | False news identification model construction method, false news identification method and device | |
CN110489739A (en) | Naming extraction method and device for public security cases and oral text based on CRF algorithm | |
CN114416939A (en) | Intelligent question and answer method, device, equipment and storage medium | |
CN112416992B (en) | Industry type identification method, system and equipment based on big data and keywords | |
CN111898528B (en) | Data processing method, device, computer readable medium and electronic equipment | |
Panenghat et al. | Towards the necessity for debiasing natural language inference datasets | |
CN106649875B (en) | Public opinion big data visualization system | |
CN109271479A (en) | Resume structuring processing method | |
CN112330501A (en) | Document processing method and device, electronic equipment and storage medium | |
CN111427977B (en) | Electronic eye data processing method and device | |
CN116976321A (en) | Text processing method, apparatus, computer device, storage medium, and program product | |
CN110866394A (en) | Company name identification method and device, computer equipment and readable storage medium | |
CN106598983A (en) | Information display method and device | |
US20220075950A1 (en) | Data labeling method and device, and storage medium | |
CN112989811A (en) | BilSTM-CRF-based historical book reading auxiliary system and control method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||