CN108197087A - Character code recognition methods and device - Google Patents
Character code recognition method and device
- Publication number
- CN108197087A
- Authority
- CN
- China
- Prior art keywords
- text
- coding mode
- identified
- probability value
- identification model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
Abstract
The present invention provides a character code recognition method and device. The method includes: obtaining a text to be identified; obtaining, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to; and decoding the file to be identified according to the obtained coding mode to obtain a decoding result. In the embodiments, the text to be identified and the coding mode identification model are used to obtain the conformity probability value of the text for each preset coding mode, the coding mode that the text conforms to is determined from these conformity probability values, and decoding is then performed to obtain the decoding result. As a result, neither the coding mode nor the feature sequences needed to match a coding mode have to be set manually, which reduces workload and improves flexibility.
Description
Technical field
Embodiments of the present invention relate to the technical field of information processing, and in particular to a character code recognition method and device.
Background art
Character encoding is a basic technique in the field of computer information technology. Character encoding, or simply encoding, maps each character of a character set to an object in a specified set, so that text can be stored in a computer and transmitted over a communication network. All information stored in a computer is represented as binary numbers; to make it readable to users, it must be converted according to a character set by means of a character encoding. Common coding modes include UTF-8, GB2312, GBK and BIG5. Different languages usually have their own applicable encodings: ISO-8859-1 is mainly used for Latin characters, GBK and GB2312 are commonly used for Simplified Chinese, and BIG5 is commonly used for Traditional Chinese.
When a computer stores and displays information, the correct coding mode may be unavailable because information has been lost or modified, so the information cannot be used normally. A method and system for identifying the character encoding is therefore very important. There are three common recognition methods:
(1) Coding range. Each encoding has its own range of code values, but this method fails when the ranges of the candidate encodings overlap heavily.
(2) Feature matching. The current information is matched against keywords in a dictionary or against manually defined features. Once a match succeeds, the encoding is determined; if no match succeeds, it cannot be determined.
(3) Character distribution. A probabilistic model of characters is built in advance, the probability of the current character distribution is calculated with the model, and the encoding is judged from it. This method has limited effect on encoded information that reflects particular word usage habits or that is too short.
Summary of the invention
Embodiments of the present invention provide a character code recognition method and device, to solve the problem in the prior art that determining the coding mode relies on manual settings and is therefore inflexible.
In a first aspect, an embodiment of the present invention provides a character code recognition method, including:
obtaining a text to be identified;
obtaining, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to;
decoding the file to be identified according to the obtained coding mode to obtain a decoding result.
Optionally, obtaining the coding mode that the text to be identified conforms to according to the text to be identified and the preset coding mode identification model includes:
sending the text to be identified into the coding mode identification model for calculation to obtain the conformity probability value of the text to be identified for each preset coding mode;
determining, according to the conformity probability values, the coding mode that the text to be identified conforms to.
Optionally, obtaining the coding mode that the text to be identified conforms to according to the text to be identified and the preset coding mode identification model includes:
selecting multiple text segments from the text to be identified;
sending each text segment into the coding mode identification model for calculation to obtain the conformity probability value of each text segment for each preset coding mode, and determining, according to the conformity probability values, the coding mode that each text segment conforms to;
determining the coding mode of the text to be identified according to the coding modes of the text segments.
Optionally, determining the coding mode that the text to be identified conforms to according to the conformity probability values includes: choosing the maximum probability value from the conformity probability values, and taking the coding mode corresponding to the maximum probability value as the coding mode that the text to be identified conforms to.
In a second aspect, an embodiment of the present invention provides a character code identification device, including:
an acquisition module, configured to obtain a text to be identified;
a processing module, configured to obtain, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to;
a decoding module, configured to decode the file to be identified according to the obtained coding mode to obtain a decoding result.
Optionally, the processing module is specifically configured to:
send the text to be identified into the coding mode identification model for calculation to obtain the conformity probability value of the text to be identified for each preset coding mode;
determine, according to the conformity probability values, the coding mode that the text to be identified conforms to.
Optionally, the processing module is specifically configured to:
select multiple text segments from the text to be identified;
send each text segment into the coding mode identification model for calculation to obtain the conformity probability value of each text segment for each preset coding mode, and determine, according to the conformity probability values, the coding mode that each text segment conforms to;
determine the coding mode of the text to be identified according to the coding modes of the text segments.
Optionally, the processing module includes a computing unit and a determination unit, wherein:
the computing unit is configured to send the text to be identified into the coding mode identification model for calculation to obtain the conformity probability value of the text to be identified for each preset coding mode;
the determination unit is configured to choose the maximum probability value from the conformity probability values and take the coding mode corresponding to the maximum probability value as the coding mode that the text to be identified conforms to.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor, a memory, a bus, and a computer program stored in the memory and executable on the processor;
wherein the processor and the memory communicate with each other through the bus;
and the processor implements the method described above when executing the computer program.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program implementing the method described above when executed by a processor.
As can be seen from the above technical solutions, the embodiments of the present invention provide a character code recognition method and device. For the obtained text to be identified, the conformity probability value of the text for each preset coding mode is obtained from the text and the coding mode identification model, the coding mode that the text conforms to is determined from the conformity probability values, and decoding is then performed to obtain the decoding result. Neither the coding mode nor the feature sequences needed to match a coding mode have to be set manually, which reduces workload and improves flexibility.
Brief description of the drawings
Fig. 1 is a flow diagram of the character code recognition method provided by an embodiment of the present invention;
Fig. 2 is a framework diagram of a learning structure provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of the character code identification device provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of the electronic device provided by an embodiment of the present invention.
Detailed description
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following embodiments illustrate the present invention but do not limit its scope.
Fig. 1 shows a character code recognition method provided by an embodiment of the present invention, including:
S11, obtaining a text to be identified;
S12, obtaining, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to;
S13, decoding the file to be identified according to the obtained coding mode to obtain a decoding result.
Regarding steps S11 to S13, it should be noted that, in the embodiments of the present invention, data encoded with a given coding mode produces a corresponding sequence text.
For example, the phrase "computer technology is developing rapidly" (计算机技术快速发展), encoded in UTF-8, is written in hexadecimal as e8aea1e7ae97e69cbae68a80e69cafe5bfabe9809fe58f91e5b195; encoded in GBK, it is written in hexadecimal as bcc6cbe3bbfabcbccaf5bfeccbd9b7a2d5b9. The sequence length is limited to at most L characters (L can be set flexibly, for example 128).
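The hexadecimal sequences above can be reproduced with a few lines of Python. This is a minimal sketch rather than the patented implementation: the helper name `to_hex_sequence` is hypothetical, and it assumes that the limit L is counted in hexadecimal characters, which the description does not state explicitly.

```python
# Hex strings quoted in the description.
UTF8_HEX = "e8aea1e7ae97e69cbae68a80e69cafe5bfabe9809fe58f91e5b195"
GBK_HEX = "bcc6cbe3bbfabcbccaf5bfeccbd9b7a2d5b9"

MAX_LEN = 128  # L, assumed here to count hexadecimal characters


def to_hex_sequence(data, max_len=MAX_LEN):
    """Return the lowercase hex form of `data`, cut to at most `max_len` characters."""
    return data.hex()[:max_len]


# Recover the example phrase from the UTF-8 bytes, then re-encode it both ways.
text = bytes.fromhex(UTF8_HEX).decode("utf-8")
utf8_seq = to_hex_sequence(text.encode("utf-8"))
gbk_seq = to_hex_sequence(text.encode("gbk"))
```

The same nine characters give 54 hexadecimal digits in UTF-8 and 36 in GBK, which is the kind of difference in byte patterns the model is trained to pick up.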
It should also be noted that, in the embodiments of the present invention, the coding mode identification model can be obtained by deep learning training, specifically:
Deep learning is iterated over 100,000 or even several hundred thousand sequence data items until the training error and the accuracy reach an acceptable level. The model may use deep learning structures such as LSTM (long short-term memory, a recurrent neural network) and Text-CNN (Convolutional Neural Networks for Sentence Classification).
Fig. 2 shows a learning structure framework provided by an embodiment of the present invention.
(1) The input layer input_1 is followed by the embedding layer embedding_1 (also called the representation layer). The parameter values of the embedding layer are learned automatically by the model.
After a sequence is read in, each hexadecimal digit is first converted to a positive-integer index number for ease of calculation. The mapping table is as follows:
Character | reserved | a | b | c | d | e | f | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
For example, abc123 is converted to 1,2,3,8,9,10. The sequence of index numbers is the input layer data and is received by the embedding layer of the model. Sequences shorter than L are padded with 0.
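The mapping table and the zero padding can be sketched as follows. The function name `to_indices` and the choice to truncate sequences longer than L are assumptions made for illustration.

```python
# Index 0 is reserved and used for padding; 'a'-'f' map to 1-6 and '0'-'9' to 7-16.
HEX_TO_INDEX = {ch: i for i, ch in enumerate("abcdef0123456789", start=1)}


def to_indices(hex_seq, max_len=128):
    """Convert a hex string to index numbers, padded with 0 up to `max_len`."""
    seq = hex_seq.lower()[:max_len]          # assumed: longer input is truncated
    indices = [HEX_TO_INDEX[ch] for ch in seq]
    return indices + [0] * (max_len - len(indices))


example = to_indices("abc123", max_len=8)
```

With `max_len=8`, the six digits of abc123 become 1, 2, 3, 8, 9, 10 followed by two padding zeros.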
After the embedding layer receives the sequence of index numbers, it converts the sequence into a matrix on which operations such as convolution can be performed; that is, each index number in the sequence is initialized as a vector. Common conversion methods include random initialization, the one-hot method, and word embedding based on word2vec. The one-hot method is used as the example here. Its basic idea is that, in the vector corresponding to a character, exactly one position is 1 and all others are 0. For example, abc123 is converted to a matrix with one row per index number, each row containing a single 1.
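A minimal sketch of the one-hot conversion, assuming a vector width of 17 (the reserved index 0 plus the 16 hexadecimal digits) and that the 1 is placed at the position equal to the index number. Neither detail is spelled out in the description.

```python
VOCAB_SIZE = 17  # assumed width: reserved index 0 plus 16 hexadecimal digits


def one_hot(indices, vocab_size=VOCAB_SIZE):
    """Return one row per index, with a single 1 at the position of that index."""
    rows = []
    for idx in indices:
        row = [0] * vocab_size
        row[idx] = 1
        rows.append(row)
    return rows


# Index numbers of "abc123" according to the mapping table.
matrix = one_hot([1, 2, 3, 8, 9, 10])
```

The result is a 6 x 17 matrix; the first row has its 1 in column 1, the fourth row in column 8, and so on.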
(2) The embedding layer is followed by three one-dimensional convolutional layers with convolution kernels of different sizes: conv1d_1, conv1d_2 and conv1d_3. The three convolutional layers are arranged in parallel, and their parameters are learned automatically by the model.
(3) The three results are merged in the concatenation layer concatenate_1.
(4) After processing by the flatten layer flatten_1, the data passes through the dropout layer dropout_1 and the fully connected layer dense_1, which connects to multiple nodes representing the various coding modes. Over many rounds of iteration, the output loss function value (a measure of the difference between the predicted value and the actual value) gradually decreases until it reaches an acceptable minimum. Meanwhile, the effect of the model can be checked using the accuracy on the validation set.
Once the model achieves satisfactory results, the model structure and weight values are saved for use by the system.
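The data flow through layers (1) to (4) can be sketched as a pure-Python forward pass. This is an illustration only, not the trained model: the weights are random, and the kernel sizes (3, 4, 5), the number of filters, the ReLU activation, the softmax output and the three output labels are all assumptions that the description does not specify. Dropout is omitted because it only acts during training.

```python
import math
import random

VOCAB = 17                            # reserved index 0 plus 16 hex digits
KERNEL_SIZES = (3, 4, 5)              # assumed; the text only says "different sizes"
FILTERS = 2                           # assumed number of filters per branch
ENCODINGS = ("UTF-8", "GBK", "BIG5")  # illustrative output nodes


def one_hot(indices):
    return [[1.0 if j == i else 0.0 for j in range(VOCAB)] for i in indices]


def conv1d(x, kernels):
    """Valid 1-D convolution with ReLU; x is [steps][VOCAB], kernels is [filters][k][VOCAB]."""
    k = len(kernels[0])
    out = []
    for t in range(len(x) - k + 1):
        row = []
        for kern in kernels:
            s = sum(kern[d][c] * x[t + d][c] for d in range(k) for c in range(VOCAB))
            row.append(max(0.0, s))
        out.append(row)
    return out


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def init_weights(seq_len, rng):
    """Random stand-ins for the saved weights of the conv branches and dense layer."""
    convs = [[[[rng.uniform(-0.1, 0.1) for _ in range(VOCAB)] for _ in range(k)]
              for _ in range(FILTERS)] for k in KERNEL_SIZES]
    flat = sum((seq_len - k + 1) * FILTERS for k in KERNEL_SIZES)
    dense = [[rng.uniform(-0.1, 0.1) for _ in range(flat)] for _ in ENCODINGS]
    return convs, dense


def predict(indices, convs, dense):
    x = one_hot(indices)                                    # embedding (one-hot variant)
    branches = [conv1d(x, kernels) for kernels in convs]    # three parallel conv layers
    flat = [v for b in branches for row in b for v in row]  # concatenate, then flatten
    logits = [sum(w * v for w, v in zip(ws, flat)) for ws in dense]  # dense layer
    return dict(zip(ENCODINGS, softmax(logits)))


rng = random.Random(0)
sample = [3, 11, 1, 14, 3, 1, 4, 5, 3, 1, 4, 14, 4, 7]  # index numbers of c4a7cadecad7d0
convs, dense = init_weights(len(sample), rng)
probs = predict(sample, convs, dense)
```

Because the weights are untrained, the resulting probabilities carry no meaning; the sketch only shows how one probability per coding mode comes out of the parallel-convolution structure. In practice a deep learning framework would be used for both training and inference.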
Obtaining a coding mode identification model through deep learning is a relatively mature technique.
In the embodiments of the present invention, the system obtains the coding mode that the text to be identified conforms to according to the text to be identified and the preset coding mode identification model, which may specifically include:
11) sending the text to be identified into the coding mode identification model for calculation to obtain the conformity probability value of the text to be identified for each preset coding mode;
12) determining, according to the conformity probability values, the coding mode that the text to be identified conforms to.
Regarding steps 11) and 12), it should be noted that when the text to be identified is sent into the coding mode identification model, it is processed in the same way as during deep learning training: it is first converted into a sequence of index numbers. For example, if the sequence of the text to be identified before conversion is c4a7cadecad7d0..., it is converted according to the mapping table into 3,11,1,14,3,1,4,5,3,1,4,14,4,7.... The saved weight values are then used as the parameters of the embedding layer and the convolutional layers, and the conformity probability value of the text to be identified for each coding mode is obtained by computation. The maximum probability value is chosen from the conformity probability values, and the coding mode corresponding to the maximum probability value is taken as the coding mode that the text to be identified conforms to.
For example, given UTF-8: 0.01, GBK: 0.98, Latin1: 0.01, GBK is taken as the predicted coding mode because 0.98 is the largest value.
The file to be identified is decoded according to the obtained coding mode to obtain a decoding result.
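Choosing the maximum probability value and decoding with the result can be sketched as below. The probability values are the ones from the example above; the helper names and the table that maps the labels to Python codec names are assumptions for illustration.

```python
# Assumed mapping from the labels in the example to Python codec names.
CODEC_NAMES = {"UTF-8": "utf-8", "GBK": "gbk", "Latin1": "latin-1"}


def pick_coding_mode(probabilities):
    """Return the label whose conformity probability value is largest."""
    return max(probabilities, key=probabilities.get)


def decode_hex(hex_seq, label):
    """Decode the bytes written as `hex_seq` using the codec for `label`."""
    return bytes.fromhex(hex_seq).decode(CODEC_NAMES[label])


probabilities = {"UTF-8": 0.01, "GBK": 0.98, "Latin1": 0.01}
best = pick_coding_mode(probabilities)
decoded = decode_hex("bcc6cbe3bbfabcbccaf5bfeccbd9b7a2d5b9", best)
```

Here `best` is "GBK", and decoding the GBK sequence from the earlier example recovers the original nine-character phrase.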
In the character code recognition method provided by this embodiment, the conformity probability value of the obtained text to be identified for each preset coding mode is obtained from the text and the coding mode identification model, the coding mode that the text conforms to is determined from the conformity probability values, and decoding is then performed to obtain the decoding result. Neither the coding mode nor the feature sequences needed to match a coding mode have to be set manually, which reduces workload and improves flexibility.
Another embodiment of the present invention provides a character code recognition method, including:
S21, obtaining a text to be identified;
S22, obtaining, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to;
S23, decoding the file to be identified according to the obtained coding mode to obtain a decoding result.
Regarding steps S21 to S23, it should be noted that, in the embodiments of the present invention, data encoded with a given coding mode produces a corresponding sequence text.
The system obtains the coding mode that the text to be identified conforms to according to the text to be identified and the preset coding mode identification model, which may specifically include:
21) selecting multiple text segments from the text to be identified;
22) sending each text segment into the coding mode identification model for calculation to obtain the conformity probability value of each text segment for each preset coding mode, and determining, according to the conformity probability values, the coding mode that each text segment conforms to;
23) determining the coding mode of the text to be identified according to the coding modes of the text segments.
Regarding steps 21) to 23), it should be noted that multiple text segments are selected from the text to be identified and sent into the coding mode identification model. Each text segment is processed in the same way as during deep learning training: it is first converted into a sequence of index numbers. For example, if the sequence before conversion is c4a7cadecad7d0..., it is converted according to the mapping table into 3,11,1,14,3,1,4,5,3,1,4,14,4,7....
The saved weight values are then used as the parameters of the embedding layer and the convolutional layers, and the conformity probability value of each text segment for each coding mode is obtained by computation. For each text segment, the maximum probability value is chosen from the conformity probability values, and the corresponding coding mode is taken as the coding mode that the text segment conforms to. The coding mode that occurs most often among the text segments is then taken as the coding mode of the text to be identified.
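The segment-based variant can be sketched as follows. How the segments are chosen is not specified in the description, so the evenly spaced, byte-aligned selection in `choose_segments` is an assumption, as are the function names. The per-segment probabilities in the usage lines are made-up placeholder values, not model output.

```python
from collections import Counter


def choose_segments(hex_seq, segment_len=128, count=3):
    """Pick up to `count` evenly spaced segments of `segment_len` hex characters."""
    if len(hex_seq) <= segment_len or count < 2:
        return [hex_seq[:segment_len]]
    last_start = len(hex_seq) - segment_len
    # Round each start down to an even offset so segments begin on a byte boundary.
    starts = [(last_start * i // (count - 1)) // 2 * 2 for i in range(count)]
    return [hex_seq[s:s + segment_len] for s in starts]


def best_coding_mode(probabilities):
    return max(probabilities, key=probabilities.get)


def vote(per_segment_probabilities):
    """Return the coding mode that wins in the largest number of segments."""
    winners = [best_coding_mode(p) for p in per_segment_probabilities]
    return Counter(winners).most_common(1)[0][0]


# Placeholder probabilities for three segments.
per_segment = [
    {"UTF-8": 0.1, "GBK": 0.9},
    {"UTF-8": 0.6, "GBK": 0.4},
    {"UTF-8": 0.2, "GBK": 0.8},
]
result = vote(per_segment)
```

The segment winners are GBK, UTF-8 and GBK, so `result` is "GBK". The description does not say how ties are resolved; this sketch simply returns the winner that appears first.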
The file to be identified is decoded according to the obtained coding mode to obtain a decoding result.
In the character code recognition method provided by this embodiment, multiple text segments are selected from the obtained text to be identified, the conformity probability value of each text segment for each preset coding mode is obtained from the text segment and the coding mode identification model, the coding mode that each text segment conforms to is determined from the conformity probability values, the coding mode of the text to be identified is then determined, and decoding is performed to obtain the decoding result. Neither the coding mode nor the feature sequences needed to match a coding mode have to be set manually, which reduces workload and improves flexibility.
Fig. 3 shows a character code identification device provided by an embodiment of the present invention, including an acquisition module 31, a processing module 32 and a decoding module 33, wherein:
the acquisition module 31 is configured to obtain a text to be identified;
the processing module 32 is configured to obtain, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to;
the decoding module 33 is configured to decode the file to be identified according to the obtained coding mode to obtain a decoding result.
The processing module is specifically configured to:
send the text to be identified into the coding mode identification model for calculation to obtain the conformity probability value of the text to be identified for each preset coding mode;
determine, according to the conformity probability values, the coding mode that the text to be identified conforms to.
The processing module includes a computing unit and a determination unit, wherein:
the computing unit is configured to send the text to be identified into the coding mode identification model for calculation to obtain the conformity probability value of the text to be identified for each preset coding mode;
the determination unit is configured to choose the maximum probability value from the conformity probability values and take the coding mode corresponding to the maximum probability value as the coding mode that the text to be identified conforms to.
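The division into modules and units can be pictured as one class whose methods correspond to the acquisition module, the computing unit, the determination unit and the decoding module. This is a structural sketch only: the class name, the method names and the stub model are hypothetical, and the stub simply returns the probability values from the earlier example instead of running a trained model.

```python
class CharacterCodeRecognizer:
    """Hypothetical software counterpart of the device in Fig. 3."""

    def __init__(self, model, codec_names, max_len=128):
        self.model = model              # callable: hex string -> {label: probability}
        self.codec_names = codec_names  # label -> Python codec name
        self.max_len = max_len

    def acquire(self, raw):
        """Acquisition module: turn the raw bytes into a hex sequence."""
        return raw.hex()

    def compute(self, hex_seq):
        """Computing unit: conformity probability value for each coding mode."""
        return self.model(hex_seq[:self.max_len])

    def determine(self, probabilities):
        """Determination unit: coding mode with the maximum probability value."""
        return max(probabilities, key=probabilities.get)

    def decode(self, raw, label):
        """Decoding module: decode the bytes with the chosen coding mode."""
        return raw.decode(self.codec_names[label])

    def run(self, raw):
        label = self.determine(self.compute(self.acquire(raw)))
        return label, self.decode(raw, label)


def stub_model(hex_seq):
    # Placeholder for a trained model; always returns the example probabilities.
    return {"UTF-8": 0.01, "GBK": 0.98, "Latin1": 0.01}


recognizer = CharacterCodeRecognizer(
    stub_model, {"UTF-8": "utf-8", "GBK": "gbk", "Latin1": "latin-1"})
label, decoded = recognizer.run(bytes.fromhex("bcc6cbe3bbfabcbccaf5bfeccbd9b7a2d5b9"))
```

Keeping the computing and determination steps as separate methods mirrors the unit split in the description and makes it straightforward to substitute a segment-voting strategy for the determination step.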
Since the device of this embodiment works on the same principle as the method of the above embodiment, the more detailed explanation is not repeated here.
It should be noted that, in the embodiments of the present invention, the relevant functional modules can be implemented by a hardware processor.
In the character code identification device provided by this embodiment, the conformity probability value of the obtained text to be identified for each preset coding mode is obtained from the text and the coding mode identification model, the coding mode that the text conforms to is determined from the conformity probability values, and decoding is then performed to obtain the decoding result. Neither the coding mode nor the feature sequences needed to match a coding mode have to be set manually, which reduces workload and improves flexibility.
Another embodiment of the present invention provides a character code identification device, including an acquisition module, a processing module and a decoding module, wherein:
the acquisition module is configured to obtain a text to be identified;
the processing module is configured to obtain, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to;
the decoding module is configured to decode the file to be identified according to the obtained coding mode to obtain a decoding result.
The processing module is specifically configured to:
select multiple text segments from the text to be identified;
send each text segment into the coding mode identification model for calculation to obtain the conformity probability value of each text segment for each preset coding mode, and determine, according to the conformity probability values, the coding mode that each text segment conforms to;
determine the coding mode of the text to be identified according to the coding modes of the text segments.
Since the device of this embodiment works on the same principle as the method of the above embodiment, the more detailed explanation is not repeated here.
It should be noted that, in the embodiments of the present invention, the relevant functional modules can be implemented by a hardware processor.
In the character code identification device provided by this embodiment, multiple text segments are selected from the obtained text to be identified, the conformity probability value of each text segment for each preset coding mode is obtained from the text segment and the coding mode identification model, the coding mode that each text segment conforms to is determined from the conformity probability values, the coding mode of the text to be identified is then determined, and decoding is performed to obtain the decoding result. Neither the coding mode nor the feature sequences needed to match a coding mode have to be set manually, which reduces workload and improves flexibility.
Fig. 4 shows an electronic device provided by an embodiment of the present invention, including: a processor 401, a memory 402, a bus 403, and a computer program stored in the memory and executable on the processor;
wherein the processor and the memory communicate with each other through the bus;
and the processor implements the method described above when executing the computer program, for example including: obtaining a text to be identified; obtaining, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to; and decoding the file to be identified according to the obtained coding mode to obtain a decoding result.
An embodiment of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the method described above, for example including: obtaining a text to be identified; obtaining, according to the text to be identified and a preset coding mode identification model, the coding mode that the text to be identified conforms to; and decoding the file to be identified according to the obtained coding mode to obtain a decoding result.
In addition, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the present invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order. These words may be interpreted as names.
Those of ordinary skill in the art will appreciate that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope defined by the claims of the present invention.
Claims (10)
1. a kind of character code recognition methods, which is characterized in that including:
Obtain text to be identified;
Meet the coding staff of the text to be identified according to the text to be identified and the acquisition of preset coding mode identification model
Formula;
The file to be identified is decoded according to the coding mode of acquisition, obtains decoding result.
It is 2. according to the method described in claim 1, it is characterized in that, described according to the text to be identified and preset coding staff
Formula identification model obtains the coding mode for meeting the text to be identified, including:
The text to be identified is sent in the coding mode identification model calculate and obtains the text pair to be identified
Probability value should be met in preset each coding mode;
According to the coding mode for meeting probability value and being determined for compliance with the text to be identified.
It is 3. according to the method described in claim 1, it is characterized in that, described according to the text to be identified and preset coding staff
Formula identification model obtains the coding mode for meeting the text to be identified, including:
Multiple text chunks are chosen from the text to be identified;
Each text chunk is sent in the coding mode identification model calculate and obtains each text chunk corresponding to default
Each coding mode meet probability value, according to the coding mode for meeting probability value and being determined for compliance with each text chunk;
The coding mode of the text to be identified is determined according to the coding mode of each text chunk.
4. according to the method described in claim 2, it is characterized in that, according to it is described meet probability value be determined for compliance with it is described to be identified
The coding mode of text, including:According to it is described meet most probable value is chosen in probability value;The most probable value is corresponding
Coding mode is as the coding mode for meeting the text to be identified.
5. a kind of character code identification device, which is characterized in that including:
Acquisition module, for obtaining text to be identified;
Processing module, it is described to be identified for being met according to the text to be identified and the acquisition of preset coding mode identification model
The coding mode of text;
Decoder module is decoded the file to be identified for the coding mode according to acquisition, obtains decoding result.
6. device according to claim 5, which is characterized in that the processing module is specifically used for:
The text to be identified is sent in the coding mode identification model calculate and obtains the text pair to be identified
Probability value should be met in preset each coding mode;
According to the coding mode for meeting probability value and being determined for compliance with the text to be identified.
7. device according to claim 5, which is characterized in that the processing module is specifically used for:
Multiple text chunks are chosen from the text to be identified;
Each text chunk is sent in the coding mode identification model calculate and obtains each text chunk corresponding to default
Each coding mode meet probability value, according to the coding mode for meeting probability value and being determined for compliance with each text chunk;
The coding mode of the text to be identified is determined according to the coding mode of each text chunk.
8. device according to claim 6, which is characterized in that the processing module includes computing unit and determination unit,
Wherein:
Computing unit is carried out for the text to be identified to be sent in the coding mode identification model described in calculating acquisition
Text to be identified meets probability value corresponding to preset each coding mode;
Determination unit chooses most probable value, by the corresponding volume of the most probable value for meeting according in probability value
Code mode is as the coding mode for meeting the text to be identified.
9. a kind of electronic equipment, which is characterized in that including:Processor, memory, bus and storage on a memory and can located
The computer program run on reason device;
Wherein, the processor, memory complete mutual communication by the bus;
The processor realizes the method as described in any one of claim 1-4 when performing the computer program.
10. A non-transitory computer-readable storage medium, characterized in that a computer program is stored on the non-transitory computer-readable storage medium, and the computer program, when executed by a processor, implements the method according to any one of claims 1-4.
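The processing and decoding steps recited in claims 5 to 8 can be illustrated with a minimal Python sketch. It is not the patented implementation. The `model` argument is a hypothetical stand-in for the coding mode identification model and is assumed to map raw bytes to a dictionary of probability values per coding mode. The fixed-size segmentation and the majority vote across segments are also assumptions, since the claims do not specify how segments are chosen or how their results are combined. All function names are illustrative.

```python
from collections import Counter
from typing import Callable, Dict, List

# Assumed interface: a model maps raw bytes to {coding mode: probability value}.
Model = Callable[[bytes], Dict[str, float]]


def pick_coding_mode(probs: Dict[str, float]) -> str:
    """Return the coding mode with the largest probability value (claim 8)."""
    if not probs:
        raise ValueError("model returned no candidate coding modes")
    return max(probs, key=lambda mode: probs[mode])


def choose_segments(data: bytes, size: int = 64, limit: int = 4) -> List[bytes]:
    """Cut the input into consecutive fixed-size segments and keep the first few.

    Assumption: the claims only say that multiple segments are selected.
    """
    segments = [data[i:i + size] for i in range(0, len(data), size)]
    return segments[:limit]


def identify_coding_mode(data: bytes, model: Model,
                         size: int = 64, limit: int = 4) -> str:
    """Classify each segment, then combine the results (claim 7).

    Assumption: the per-segment results are combined by majority vote.
    """
    segments = choose_segments(data, size, limit) or [data]
    votes = Counter(pick_coding_mode(model(seg)) for seg in segments)
    return votes.most_common(1)[0][0]


def decode_text(data: bytes, model: Model) -> str:
    """Identify the coding mode, then decode the bytes with it (claim 5)."""
    mode = identify_coding_mode(data, model)
    # errors="replace" keeps the sketch from raising on a wrong guess.
    return data.decode(mode, errors="replace")
```

To try the sketch, pass any callable that returns probability values for coding mode names that Python's codec registry recognizes, such as `"utf-8"` or `"gbk"`. A trained classifier would take the place of that callable in practice.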
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810050150.1A CN108197087B (en) | 2018-01-18 | 2018-01-18 | Character code recognition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810050150.1A CN108197087B (en) | 2018-01-18 | 2018-01-18 | Character code recognition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108197087A (en) | 2018-06-22 |
CN108197087B (en) | 2021-11-16 |
Family ID: 62589725
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810050150.1A Active CN108197087B (en) | 2018-01-18 | 2018-01-18 | Character code recognition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108197087B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109064733A (en) * | 2018-09-30 | 2018-12-21 | 珠海全志科技股份有限公司 | Adaptive infrared signal coding/decoding method, computer installation and its control device |
CN110096481A (en) * | 2019-04-19 | 2019-08-06 | 福建天晴数码有限公司 | The recognition methods of document No. and computer readable storage medium |
CN110113327A (en) * | 2019-04-26 | 2019-08-09 | 北京奇安信科技有限公司 | A kind of method and device detecting DGA domain name |
CN110135566A (en) * | 2019-05-21 | 2019-08-16 | 四川长虹电器股份有限公司 | Registration user name detection method based on bis- Classification Neural model of LSTM |
CN111428484A (en) * | 2020-04-14 | 2020-07-17 | 广州云从鼎望科技有限公司 | Information management method, system, device and medium |
CN111681670A (en) * | 2019-02-25 | 2020-09-18 | 北京嘀嘀无限科技发展有限公司 | Information identification method and device, electronic equipment and storage medium |
CN111832257A (en) * | 2019-04-16 | 2020-10-27 | 三星电子株式会社 | Conditional transcoding of encoded data |
CN113627173A (en) * | 2021-08-16 | 2021-11-09 | 深圳市云采网络科技有限公司 | Manufacturer name identification method and device, electronic equipment and readable medium |
CN113807807A (en) * | 2021-08-16 | 2021-12-17 | 深圳市云采网络科技有限公司 | Component parameter identification method and device, electronic equipment and readable medium |
US11838035B2 (en) | 2019-03-15 | 2023-12-05 | Samsung Electronics Co., Ltd. | Using predicates in conditional transcoder for column store |
CN117391070A (en) * | 2023-12-08 | 2024-01-12 | 和元达信息科技有限公司 | Method and system for adjusting random character |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040078191A1 (en) * | 2002-10-22 | 2004-04-22 | Nokia Corporation | Scalable neural network-based language identification from written text |
CN104360988A (en) * | 2014-10-17 | 2015-02-18 | 北京锐安科技有限公司 | Method and device for identifying coding mode of Chinese characters |
CN104750666A (en) * | 2015-03-12 | 2015-07-01 | 明博教育科技有限公司 | Text character encoding mode identification method and system |
CN106354701A (en) * | 2016-08-30 | 2017-01-25 | 腾讯科技(深圳)有限公司 | Chinese character processing method and device |
CN107480723A (en) * | 2017-08-22 | 2017-12-15 | 武汉大学 | Texture Recognition based on partial binary threshold learning network |
Non-Patent Citations (3)
Title |
---|
ARDEN DERTAT: "Applied Deep Learning - Part 3: Autoencoders", 《TOWARDS DATA SCIENCE》 * |
李继峰等: "基于N-gram模型的高速汉字编码识别系统", 《计算机工程与应用》 * |
杨艳等: "脱机手写体汉字识别的改进算法", 《微计算机信息》 * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109064733A (en) * | 2018-09-30 | 2018-12-21 | 珠海全志科技股份有限公司 | Adaptive infrared signal coding/decoding method, computer installation and its control device |
CN111681670B (en) * | 2019-02-25 | 2023-05-12 | 北京嘀嘀无限科技发展有限公司 | Information identification method, device, electronic equipment and storage medium |
CN111681670A (en) * | 2019-02-25 | 2020-09-18 | 北京嘀嘀无限科技发展有限公司 | Information identification method and device, electronic equipment and storage medium |
US11838035B2 (en) | 2019-03-15 | 2023-12-05 | Samsung Electronics Co., Ltd. | Using predicates in conditional transcoder for column store |
CN111832257A (en) * | 2019-04-16 | 2020-10-27 | 三星电子株式会社 | Conditional transcoding of encoded data |
CN111832257B (en) * | 2019-04-16 | 2023-02-28 | 三星电子株式会社 | Conditional transcoding of encoded data |
CN113064862B (en) * | 2019-04-19 | 2022-06-07 | 福建天晴数码有限公司 | File code identification method based on forward and reverse word stock and storage medium |
CN110096481B (en) * | 2019-04-19 | 2021-03-23 | 福建天晴数码有限公司 | Method for identifying file code and computer readable storage medium |
CN113064863A (en) * | 2019-04-19 | 2021-07-02 | 福建天晴数码有限公司 | Method for automatically recognizing file code and computer readable storage medium |
CN113064862A (en) * | 2019-04-19 | 2021-07-02 | 福建天晴数码有限公司 | File code identification method based on forward and reverse word stock and storage medium |
CN113064863B (en) * | 2019-04-19 | 2022-06-07 | 福建天晴数码有限公司 | Method for automatically recognizing file code and computer readable storage medium |
CN110096481A (en) * | 2019-04-19 | 2019-08-06 | 福建天晴数码有限公司 | The recognition methods of document No. and computer readable storage medium |
CN110113327A (en) * | 2019-04-26 | 2019-08-09 | 北京奇安信科技有限公司 | A kind of method and device detecting DGA domain name |
CN110135566A (en) * | 2019-05-21 | 2019-08-16 | 四川长虹电器股份有限公司 | Registration user name detection method based on bis- Classification Neural model of LSTM |
CN111428484A (en) * | 2020-04-14 | 2020-07-17 | 广州云从鼎望科技有限公司 | Information management method, system, device and medium |
CN113627173A (en) * | 2021-08-16 | 2021-11-09 | 深圳市云采网络科技有限公司 | Manufacturer name identification method and device, electronic equipment and readable medium |
CN113807807A (en) * | 2021-08-16 | 2021-12-17 | 深圳市云采网络科技有限公司 | Component parameter identification method and device, electronic equipment and readable medium |
CN117391070A (en) * | 2023-12-08 | 2024-01-12 | 和元达信息科技有限公司 | Method and system for adjusting random character |
CN117391070B (en) * | 2023-12-08 | 2024-03-22 | 和元达信息科技有限公司 | Method and system for adjusting random character |
Also Published As
Publication number | Publication date |
---|---|
CN108197087B (en) | 2021-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108197087A (en) | Character code recognition methods and device | |
CN107798235B (en) | Unsupervised abnormal access detection method and unsupervised abnormal access detection device based on one-hot coding mechanism | |
JP6955580B2 (en) | Document summary automatic extraction method, equipment, computer equipment and storage media | |
CN107704625B (en) | Method and device for field matching | |
CN111476023B (en) | Method and device for identifying entity relationship | |
CN110825857B (en) | Multi-round question and answer identification method and device, computer equipment and storage medium | |
WO2021189844A1 (en) | Detection method and apparatus for multivariate kpi time series, and device and storage medium | |
CN108959312A (en) | A kind of method, apparatus and terminal that multi-document summary generates | |
CN113011189A (en) | Method, device and equipment for extracting open entity relationship and storage medium | |
CN112257437B (en) | Speech recognition error correction method, device, electronic equipment and storage medium | |
CN111382271B (en) | Training method and device of text classification model, text classification method and device | |
CN109522395A (en) | Automatic question-answering method and device | |
CN107451106A (en) | Text method and device for correcting, electronic equipment | |
CN116579618B (en) | Data processing method, device, equipment and storage medium based on risk management | |
CN108320778A (en) | Medical record ICD coding methods and system | |
CN114722091A (en) | Data processing method, data processing device, storage medium and processor | |
CN111340638A (en) | Abnormal medical insurance document identification method and device, computer equipment and storage medium | |
CN111126056B (en) | Method and device for identifying trigger words | |
CN112765330A (en) | Text data processing method and device, electronic equipment and storage medium | |
CN109033078B (en) | The recognition methods of sentence classification and device, storage medium, processor | |
CN110705258A (en) | Text entity identification method and device | |
CN111241843A (en) | Semantic relation inference system and method based on composite neural network | |
CN115019316A (en) | Training method of text recognition model and text recognition method | |
CN112749532A (en) | Address text processing method, device and equipment | |
CN110378457A (en) | A kind of yard of target generation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: Room 332, 3 / F, Building 102, 28 xinjiekouwei street, Xicheng District, Beijing 100088. Applicant after: Qianxin Technology Group Co.,Ltd. Address before: 100015 15, 17 floor 1701-26, 3 building, 10 Jiuxianqiao Road, Chaoyang District, Beijing. Applicant before: BEIJING QIANXIN TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | ||