CN107341135A - A parsing method and tool for a generic text format
- Publication number
- CN107341135A (application number CN201710372929.0A)
- Authority
- CN
- China
- Prior art keywords
- field
- symbol
- record
- separator
- character
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
Abstract
The invention discloses a parsing method and tool for a generic text format. The method is: 1) for a piece of data a to be parsed, first import its corresponding custom symbols into the parsing tool, then read the data a using the specified file encoding; the custom symbols include a line separator, a field enclosure, and a field separator; 2) the parsing tool uniformly converts the custom symbols used to parse the data a to the string type; 3) the parsing tool examines the characters read one by one, and if the string formed by a character and the n characters that follow it matches the line separator, the data a is split into rows at the line separator; 4) the parsing tool analyzes the resulting rows and parses out all records in the row data according to the field enclosure; 5) the parsing tool analyzes each resulting record in turn and parses out all fields in each record according to the field separator. The invention greatly improves parsing efficiency.
Description
Technical field
The present invention relates to a parsing tool for a generic text format, and belongs to the field of computer software technology.
Background
The generic text format is specified as follows. A generic text consists of any number of records, separated from one another by a custom line separator. Each record consists of fields, separated from one another by a custom field separator. A field enclosure can be customized. If a field contains a line separator, the field must be wrapped in field enclosures. If a field contains a field separator, the field must be wrapped in field enclosures. If a field contains a field enclosure, the field must be wrapped in field enclosures, and each field enclosure inside the field is written as two field enclosures. A parsing tool for the generic text format parses the fields of text that follows this specification. Current text parsing tools mainly target comma-separated values (CSV) files: the line separator is the system default newline, the field separator is a comma or a tab, and the field enclosure is a double quotation mark; the tool parses every record in the file and every field in each record.
Current text parsing tools, which mainly target CSV files, allow the field separator and the field enclosure to be customized, but the line separator is fixed to the system default newline and cannot be customized. Moreover, the field separator and the field enclosure can be set only to a single character, not to a specific string or byte array.
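The enclosing and doubling rules of the format specification can be illustrated from the writer's side. The following is a minimal sketch, not part of the patent: the function names `encode_field` and `encode_record` are hypothetical, and the symbols are assumed to be non-empty strings.

```python
def encode_field(value, line_sep, field_sep, enclosure):
    """Serialize one field following the format rules described above.

    A field is wrapped in enclosures only if it contains a line separator,
    a field separator, or an enclosure (all assumed to be non-empty strings).
    """
    needs_enclosure = any(sym in value for sym in (line_sep, field_sep, enclosure))
    if not needs_enclosure:
        return value
    # An enclosure inside the field is written as two enclosures.
    return enclosure + value.replace(enclosure, enclosure * 2) + enclosure


def encode_record(fields, line_sep, field_sep, enclosure):
    """Join the encoded fields of one record with the field separator."""
    return field_sep.join(
        encode_field(f, line_sep, field_sep, enclosure) for f in fields
    )
```

With the multi-character symbols `<EOL>`, `||`, and `$$` chosen here purely for illustration, a field `a||b` would be written as `$$a||b$$`, and a field `x$$y` as `$$x$$$$y$$`.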
Summary of the invention
The object of the present invention is to provide a parsing method and tool for a generic text format that parses the fields of text following this format specification. The invention can parse a file or stream in a specified encoding, and allows the line separator, field separator, and field enclosure to be customized as a specific character, byte, string, or byte array. These custom symbols are supplied by the user and stored inside the parsing tool.
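Reading a stream in a specified encoding might look like the sketch below. It is an illustration under assumptions rather than the patent's implementation: the helper name `read_characters` and the chunk size are hypothetical. Passing `newline=""` to `io.TextIOWrapper` keeps line endings untranslated, which matters when the line separator is user-defined.

```python
import io


def read_characters(stream, encoding):
    """Yield decoded characters from a binary stream, one at a time.

    newline="" disables newline translation, so custom line separators
    such as "\r\n" reach the parser exactly as they appear in the data.
    """
    reader = io.TextIOWrapper(stream, encoding=encoding, newline="")
    while True:
        chunk = reader.read(4096)  # hypothetical chunk size
        if not chunk:
            break
        yield from chunk
```

For a file on disk, `open(path, "r", encoding=encoding, newline="")` gives the same untranslated behavior.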
The technical solution of the present invention is as follows:
A parsing method for a generic text format, with the following steps:
1) for a piece of data a to be parsed, first import its corresponding custom symbols into the parsing tool, then read the data a using the specified file encoding; the data a is a file or a data stream, and the custom symbols include a line separator, a field enclosure, and a field separator;
2) the parsing tool uniformly converts the custom symbols used to parse the data a to the string type;
3) the parsing tool examines the characters read one by one; if the string formed by the current character and the n characters that follow it matches the line separator, the data a is split into rows at the line separator, where n is the length of the line separator minus one;
4) the parsing tool analyzes the resulting rows and parses out all records in the row data according to the field enclosure;
5) the parsing tool analyzes each resulting record in turn and parses out all fields in each record according to the field separator.
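Steps 2) and 3) can be sketched as follows. This is an illustrative sketch under assumptions: the names `to_str` and `split_rows` are hypothetical, a "byte" is assumed to arrive as an integer in the range 0 to 255, and byte-typed symbols are assumed to use the same encoding as the data.

```python
def to_str(symbol, encoding="utf-8"):
    """Step 2: convert a symbol given as str, bytes, bytearray, or a single
    byte value (an int, by assumption) to the string type."""
    if isinstance(symbol, str):
        return symbol
    if isinstance(symbol, int):
        symbol = bytes([symbol])
    return bytes(symbol).decode(encoding)


def split_rows(text, line_sep):
    """Step 3: examine characters one by one and split wherever the current
    character plus the next n characters equal the line separator."""
    if not line_sep:
        raise ValueError("line separator must not be empty")
    n = len(line_sep) - 1
    rows, start, i = [], 0, 0
    while i < len(text):
        if text[i:i + n + 1] == line_sep:
            rows.append(text[start:i])
            i += n + 1
            start = i
        else:
            i += 1
    if start < len(text):
        rows.append(text[start:])  # trailing row without a final separator
    return rows
```

At this stage a row is not yet a record: a line separator inside an enclosed field also produces a split, which is why step 4) reassembles records from rows.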
Further, the method of parsing out all records in the row data according to the field enclosure is:
21) set a record-end flag and initialize it to false; in each row, two consecutive field enclosures are parsed as one enclosure inside a field, and a single field enclosure is parsed as an enclosure of the field; then scan forward through the characters. If the field's enclosure is the last character or string of the current row and the record-end flag is true, the record has been completely parsed, and the record-end flag is set to false. If the field's enclosure is the last character or string of the current row and the record-end flag is false, determine whether the enclosure is the opening enclosure of a field in the record; if it is, set the record-end flag to true and go on to analyze the next row. If the field's enclosure is the last character or string of the row, the record-end flag is false, and the enclosure is not the opening enclosure of a field in the record, throw an error carrying the line number, offset, and error type, and set the record-end flag to false;
22) if the field's enclosure is not the last character or string of the row and the record-end flag is true, throw an error carrying the line number, offset, and error type; otherwise set the record-end flag to false. If the field's enclosure is not the last character or string of the row and the record-end flag is false, determine whether the enclosure is the opening enclosure of a field in the record; if it is, set the record-end flag to true and continue analyzing the characters of the row. If the field's enclosure is not the last character or string of the row, the record-end flag is false, and the enclosure is not the opening enclosure of a field in the record, throw an error carrying the line number, offset, and error type, and set the record-end flag to false.
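The idea behind the record-parsing step, that a row ending inside an open enclosed field must be joined with the following row, can be sketched in simplified form. This sketch is not the patent's exact flag logic: it uses a single boolean `in_enclosed` in place of the record-end flag, does not check that an opening enclosure sits at the start of a field, and reports only an unterminated field. The name `assemble_records` is hypothetical.

```python
def assemble_records(rows, line_sep, enclosure):
    """Merge rows into records: a record ends only at a row boundary that
    lies outside every enclosed field."""
    records, pending, in_enclosed = [], [], False
    k = len(enclosure)
    for row in rows:
        i = 0
        while i < len(row):
            if row.startswith(enclosure, i):
                if in_enclosed and row.startswith(enclosure, i + k):
                    i += 2 * k  # doubled enclosure: literal, state unchanged
                    continue
                in_enclosed = not in_enclosed  # opening or closing enclosure
                i += k
            else:
                i += 1
        pending.append(row)
        if not in_enclosed:
            # Restore the line separators that were inside enclosed fields.
            records.append(line_sep.join(pending))
            pending = []
    if pending:
        raise ValueError("unterminated enclosed field at end of input")
    return records
```

For example, the rows `a|"x`, `y"|b`, and `c|d` would yield two records, the first of which contains the restored line separator between `x` and `y`.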
Further, the method of parsing out all fields in each record according to the field separator is:
31) set a field-end flag and initialize it to false; if a field enclosure is read, scan forward through the characters; if the end of the record is reached during the scan and the field-end flag is true, the field has been completely parsed, and so has the record; otherwise, throw an error carrying the line number, offset, and error type;
32) if a field enclosure has been read and the forward scan has not yet reached the end of the record, continue scanning forward:
a) if a field enclosure is scanned and the field-end flag is true, parse the two consecutive field enclosures as an enclosure inside the field; otherwise, determine whether the enclosure is the opening enclosure of a field, and if so set the field-end flag to true and continue parsing; if an enclosure is scanned while the field-end flag is false and the enclosure is not the opening enclosure of a field, throw an error carrying the line number, offset, and error type;
b) if a field separator is scanned and the field-end flag is true, the enclosure is an enclosure of the field and the separator is a separator between fields; the field has been completely parsed, and the field-end flag is reset to false; otherwise, the separator is a separator inside the field, so determine whether the enclosure is the opening enclosure of a field; if so continue parsing, otherwise throw an error carrying the line number, offset, and error type;
c) if a line separator is scanned and the field-end flag is true, parse it as a line separator inside the field; otherwise, the record does not conform to the text format specification, and an error carrying the line number, offset, and error type is thrown. If any character other than a line separator, enclosure, or field separator is scanned, parse the character and continue parsing;
33) if a line separator is read and the field-end flag is true, parse it as a line separator inside the field; otherwise, throw an error carrying the line number, offset, and error type;
34) if a field separator is read and the field-end flag is true, the separator is a separator inside the field; otherwise, the separator is a separator between fields, and one field of the record has been parsed;
35) if any character other than the custom line separator, field enclosure, or field separator is read, the character is part of the field content.
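The field-splitting step can be sketched as a small state machine. This is a simplified illustration under assumptions, not the patent's exact procedure: `in_enclosed` stands in for the field-end flag, the enclosure and the field separator are assumed to be non-empty and not to overlap, and the names `FieldSyntaxError` and `split_fields` are hypothetical. Errors carry only the offset and a kind; the line number would be supplied by the caller.

```python
class FieldSyntaxError(ValueError):
    """Raised when a record violates the format; carries offset and kind."""

    def __init__(self, offset, kind):
        super().__init__(f"{kind} at offset {offset}")
        self.offset, self.kind = offset, kind


def split_fields(record, field_sep, enclosure):
    """Split one record into fields, honoring enclosures and doubling."""
    fields, buf = [], []
    i, in_enclosed, at_field_start = 0, False, True
    k = len(enclosure)
    while i < len(record):
        if record.startswith(enclosure, i):
            if in_enclosed:
                if record.startswith(enclosure, i + k):
                    buf.append(enclosure)  # doubled enclosure is a literal
                    i += 2 * k
                    continue
                in_enclosed = False  # closing enclosure
                i += k
                if i < len(record) and not record.startswith(field_sep, i):
                    raise FieldSyntaxError(i, "text after closing enclosure")
            elif at_field_start:
                in_enclosed = True  # opening enclosure
                i += k
            else:
                raise FieldSyntaxError(i, "enclosure inside unenclosed field")
            at_field_start = False
        elif not in_enclosed and record.startswith(field_sep, i):
            fields.append("".join(buf))  # separator between fields
            buf = []
            i += len(field_sep)
            at_field_start = True
        else:
            buf.append(record[i])  # ordinary content, or a symbol inside an enclosed field
            i += 1
            at_field_start = False
    if in_enclosed:
        raise FieldSyntaxError(i, "unterminated enclosed field")
    fields.append("".join(buf))
    return fields
```

Inside an enclosed field, field separators and line separators fall through to the last branch and are kept as content, which mirrors cases 33) and 34) above.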
Further, the line separator is a character, byte, string, or byte array.
Further, the field enclosure is an enclosing character, byte, string, or byte array.
Further, the field separator is a character, byte, string, or byte array.
A parsing tool for a generic text format, characterized by comprising a file parser, a row parser, a record parser, a field parser, and an exception handler; wherein:
the file parser reads the file at a specified path, or a data stream, according to the specified character encoding;
the row parser analyzes the characters read by the file parser, splits the file at the custom line separator, and parses out each row of the file;
the record parser analyzes, one by one, the characters in each row produced by the row parser and, according to the custom field enclosure and the record-end flag that has been set, parses out all records contained in the rows;
the field parser scans, one by one, the characters of each record produced by the record parser and, according to the custom field separator and the field-end flag that has been set, parses out all fields in each record;
the exception handler handles problems that arise during file parsing and records the error information in detail.
The invention provides a parsing tool for a generic text format that mainly comprises a file parser, a row parser, a record parser, a field parser, and an exception handler. The file parser reads the file at a specified path, or a specified file stream, according to the specified character encoding. The row parser analyzes the characters read by the file parser one by one, splits the file at the custom line separator, and parses out each row of the file; the line separator may be defined as a character, byte, string, or byte array. The record parser analyzes, one by one, the characters in each row produced by the row parser and, using the custom field enclosure and the record-end flag, parses out every record in the row data; the field enclosure may be defined as an enclosing character, byte, string, or byte array. The field parser scans, one by one, the characters of each record produced by the record parser and, using the custom field separator and the field-end flag, parses out all fields in each record; the field separator may be a character, byte, string, or byte array. The exception handler handles problems that arise while parsing files that do not conform to the text format specification, and records the error information in detail.
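The error information recorded by the exception handler (line number, offset, and error type) could be modeled as below. This is a sketch under assumptions: the source does not enumerate its error types or describe how errors are stored, so `ErrorType`, `FormatError`, and `ExceptionHandler` and their members are hypothetical names chosen for illustration.

```python
from enum import Enum


class ErrorType(Enum):
    # Hypothetical categories; the source does not list its error types.
    UNEXPECTED_ENCLOSURE = "unexpected enclosure"
    UNTERMINATED_FIELD = "unterminated enclosed field"


class FormatError(Exception):
    """An error carrying the line number, offset, and error type."""

    def __init__(self, line_number, offset, error_type):
        super().__init__(
            f"line {line_number}, offset {offset}: {error_type.value}"
        )
        self.line_number = line_number
        self.offset = offset
        self.error_type = error_type


class ExceptionHandler:
    """Collects the details of every format error it is given."""

    def __init__(self):
        self.entries = []

    def handle(self, error):
        self.entries.append((error.line_number, error.offset, error.error_type))
```

A parser built this way would raise `FormatError` where the text says an error is thrown, and the caller would pass the caught exception to `ExceptionHandler.handle`.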
Compared with the prior art, the present invention has the following advantages:
1. It can read a file or stream in a specified encoding;
2. The line separator, field enclosure, and field separator can each be customized as a character, byte, string, or byte array;
3. All fields of every record in the file or stream are parsed according to the custom line separator, field enclosure, and field separator;
4. Records that do not conform to the text format specification are handled as exceptions, and the error position and error type are recorded in detail;
5. Thread safety is ensured.
Brief description of the drawings
Fig. 1 is a flowchart of record parsing;
Fig. 2 is a flowchart of field parsing.
Detailed description
The present invention is described in further detail below with reference to the drawings and a specific embodiment, which does not limit the scope of the invention in any way.
Example 1: a parsing method for a generic text format
1) Read the file or stream using the specified file encoding;
2) If a custom symbol, namely the line separator, field enclosure, or field separator, is a character, byte, or byte array, convert it to a string;
3) Examine the characters read from the file or stream one by one; if the string formed by the current character and the n characters that follow it is a line separator, split the file into rows at the line separator, where n is the length of the line separator minus one;
4) Analyze the data of each row and parse out all records, as shown in Fig. 1.
Initialize the record-end flag to false. Two consecutive field enclosures in a row are parsed as one enclosure inside a field, and a single field enclosure is parsed as an enclosure of the field. When an enclosure of a field is parsed, scan forward through the characters.
If the field's enclosure is the last character (or string) of the row, check the record-end flag. If the flag is true, the record has been completely parsed, and the flag is set to false. If the flag is false, determine whether the enclosure is the opening enclosure of a field in the record: if it is, the record has not yet been completely parsed, so set the flag to true and go on to analyze the next row; if it is not, the record does not conform to the text format specification, so throw an error carrying the line number, offset, and error type, and set the flag to false.
If the field's enclosure is not the last character (or string) of the row, check the record-end flag. If the flag is true, the record does not conform to the text format specification, so throw an error carrying the line number, offset, and error type; otherwise set the flag to false. If the flag is false, determine whether the enclosure is the opening enclosure of a field in the record: if it is, the record has not yet been completely parsed, so set the flag to true and continue analyzing the characters of the row; if it is not, the record does not conform to the text format specification, so throw an error carrying the line number, offset, and error type, and set the flag to false.
5) data of every record are analyzed, parse all fields of every record, as shown in Figure 2.
Initialization field end of identification is false.If reading field surrounds symbol, character is scanned forward, if sweeping forward
Reach record ending when retouching, then judge field end of identification, if field end of identification is true, the field is parsed,
This record is also parsed;Otherwise, this record does not meet text formatting specification, the line number of error of dishing out, offset and
Type of error.
If reading field surrounds symbol, fashion is scanned forward and not up to records ending.If scanning is to symbol is surrounded forward, then
Judge field end of identification, if field end of identification is true, solve two continuous symbols that surround according to text formatting specification
Analyse as the encirclement symbol in field;Otherwise, then judge whether encirclement symbol is that the encirclement that field starts accords with, if then by field knot
Beam identification is set to true, continues to parse;If to symbol is surrounded, field end of identification is false, encirclement Fu Bushi for scanning forward
During the encirclement symbol of field beginning, this record does not meet text formatting specification, line number, offset and the wrong class of error of dishing out
Type.If field seperator is arrived in scanning forward, field end of identification is then judged, if field end of identification is true, the encirclement
The encirclement symbol for field is accorded with, the separator is the separator of interfield, and the field is parsed, field end of identification is reset to
false;Otherwise, the separator is separator in field, then judges whether encirclement symbol is that the encirclement that certain field starts accords with, if
It is to continue to parse, otherwise this record does not meet text formatting specification, line number, offset and the wrong class of error of dishing out
Type.If line Separator is arrived in scanning forward, judge field end of identification, if field end of identification is true, parse in field
Line Separator;Otherwise, this record does not meet text formatting specification, line number, offset and the type of error of error of dishing out.
If scanning line Separator, other characters surrounded outside symbol, interfield separator forward, the character is parsed, continues to parse.
If reading line Separator, field end of identification is judged, if field end of identification is true, parse in field
Line Separator;Otherwise, this record does not meet text formatting specification, line number, offset and the type of error of error of dishing out.
If reading interfield separator, field end of identification is judged, if field end of identification is true, the interfield
Separator is the separator in field;Otherwise, the separator is the separator of interfield, parses a field of record.
If reading customized line Separator, other characters surrounded outside symbol and separator, the character is in field
A part for appearance.
Claims (10)
1. A parsing method for a generic text format, with the following steps:
1) for a piece of data a to be parsed, first import its corresponding custom symbols into the parsing tool, then read the data a using the specified file encoding; the data a is a file or a data stream, and the custom symbols include a line separator, a field enclosure, and a field separator;
2) the parsing tool uniformly converts the custom symbols used to parse the data a to the string type;
3) the parsing tool examines the characters read one by one; if the string formed by the current character and the n characters that follow it matches the line separator, the data a is split into rows at the line separator, where n is the length of the line separator minus one;
4) the parsing tool analyzes the resulting rows and parses out all records in the row data according to the field enclosure;
5) the parsing tool analyzes each resulting record in turn and parses out all fields in each record according to the field separator.
2. The method of claim 1, characterized in that the method of parsing out all records in the row data according to the field enclosure is:
21) set a record-end flag and initialize it to false; in each row, two consecutive field enclosures are parsed as one enclosure inside a field, and a single field enclosure is parsed as an enclosure of the field; then scan forward through the characters. If the field's enclosure is the last character or string of the current row and the record-end flag is true, the record has been completely parsed, and the record-end flag is set to false. If the field's enclosure is the last character or string of the current row and the record-end flag is false, determine whether the enclosure is the opening enclosure of a field in the record; if it is, set the record-end flag to true and go on to analyze the next row. If the field's enclosure is the last character or string of the row, the record-end flag is false, and the enclosure is not the opening enclosure of a field in the record, throw an error carrying the line number, offset, and error type, and set the record-end flag to false;
22) if the field's enclosure is not the last character or string of the row and the record-end flag is true, throw an error carrying the line number, offset, and error type; otherwise set the record-end flag to false. If the field's enclosure is not the last character or string of the row and the record-end flag is false, determine whether the enclosure is the opening enclosure of a field in the record; if it is, set the record-end flag to true and continue analyzing the characters of the row. If the field's enclosure is not the last character or string of the row, the record-end flag is false, and the enclosure is not the opening enclosure of a field in the record, throw an error carrying the line number, offset, and error type, and set the record-end flag to false.
3. The method of claim 1, characterized in that the method of parsing out all fields in each record according to the field separator is:
31) set a field-end flag and initialize it to false; if a field enclosure is read, scan forward through the characters; if the end of the record is reached during the scan and the field-end flag is true, the field has been completely parsed, and so has the record; otherwise, throw an error carrying the line number, offset, and error type;
32) if a field enclosure has been read and the forward scan has not yet reached the end of the record, continue scanning forward:
a) if a field enclosure is scanned and the field-end flag is true, parse the two consecutive field enclosures as an enclosure inside the field; otherwise, determine whether the enclosure is the opening enclosure of a field, and if so set the field-end flag to true and continue parsing; if an enclosure is scanned while the field-end flag is false and the enclosure is not the opening enclosure of a field, throw an error carrying the line number, offset, and error type;
b) if a field separator is scanned and the field-end flag is true, the enclosure is an enclosure of the field and the separator is a separator between fields; the field has been completely parsed, and the field-end flag is reset to false; otherwise, the separator is a separator inside the field, so determine whether the enclosure is the opening enclosure of a field; if so continue parsing, otherwise throw an error carrying the line number, offset, and error type;
c) if a line separator is scanned and the field-end flag is true, parse it as a line separator inside the field; otherwise, the record does not conform to the text format specification, and an error carrying the line number, offset, and error type is thrown. If any character other than a line separator, enclosure, or field separator is scanned, parse the character and continue parsing;
33) if a line separator is read and the field-end flag is true, parse it as a line separator inside the field; otherwise, throw an error carrying the line number, offset, and error type;
34) if a field separator is read and the field-end flag is true, the separator is a separator inside the field; otherwise, the separator is a separator between fields, and one field of the record has been parsed;
35) if any character other than the custom line separator, field enclosure, or field separator is read, the character is part of the field content.
4. The method of any one of claims 1 to 3, characterized in that the line separator is a character, byte, string, or byte array.
5. The method of any one of claims 1 to 3, characterized in that the field enclosure is an enclosing character, byte, string, or byte array.
6. The method of any one of claims 1 to 3, characterized in that the field separator is a character, byte, string, or byte array.
7. A parsing tool for a generic text format, characterized by comprising a file parser, a row parser, a record parser, a field parser, and an exception handler; wherein:
the file parser reads the file at a specified path, or a data stream, according to the specified character encoding;
the row parser analyzes the characters read by the file parser, splits the file at the custom line separator, and parses out each row of the file;
the record parser analyzes, one by one, the characters in each row produced by the row parser and, according to the custom field enclosure and the record-end flag that has been set, parses out all records contained in the rows;
the field parser scans, one by one, the characters of each record produced by the record parser and, according to the custom field separator and the field-end flag that has been set, parses out all fields in each record;
the exception handler handles problems that arise during file parsing and records the error information in detail.
8. The parsing tool of claim 7, characterized in that the line separator is a character, byte, string, or byte array.
9. The parsing tool of claim 7, characterized in that the field enclosure is an enclosing character, byte, string, or byte array.
10. The parsing tool of claim 7, characterized in that the field separator is a character, byte, string, or byte array.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710372929.0A CN107341135B (en) | 2017-05-24 | 2017-05-24 | A parsing method and tool for a generic text format |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710372929.0A CN107341135B (en) | 2017-05-24 | 2017-05-24 | A parsing method and tool for a generic text format |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107341135A true CN107341135A (en) | 2017-11-10 |
CN107341135B CN107341135B (en) | 2019-11-05 |
Family
ID=60219894
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710372929.0A Active CN107341135B (en) | 2017-05-24 | 2017-05-24 | A parsing method and tool for a generic text format |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107341135B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108319589A (en) * | 2018-03-14 | 2018-07-24 | 腾讯科技(深圳)有限公司 | Parameter string processing method, apparatus, computer readable storage medium and equipment |
CN108595453A (en) * | 2017-12-20 | 2018-09-28 | 中国联合网络通信集团有限公司 | URL identity maps acquisition methods and device |
CN109033410A (en) * | 2018-08-03 | 2018-12-18 | 韩雪松 | A kind of SQL analytic method based on canonical and character string cutting |
CN111143554A (en) * | 2019-12-10 | 2020-05-12 | 中盈优创资讯科技有限公司 | Data sampling method and device based on big data platform |
CN111177484A (en) * | 2019-12-09 | 2020-05-19 | 贵阳语玩科技有限公司 | System and method for loading and managing different data sources and format character string resource files |
CN111427899A (en) * | 2020-03-17 | 2020-07-17 | 中国建设银行股份有限公司 | Method, device, equipment and computer readable medium for storing file |
CN113761283A (en) * | 2020-06-01 | 2021-12-07 | 中移(苏州)软件技术有限公司 | Method, device, equipment and storage medium for reading XML file |
CN114422498A (en) * | 2021-12-14 | 2022-04-29 | 杭州安恒信息技术股份有限公司 | Big data real-time processing method and system, computer equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010147394A3 (en) * | 2009-06-17 | 2011-03-31 | Kim Hoyon | Chinese language and chinese character input system and method |
CN102855306A (en) * | 2012-08-21 | 2013-01-02 | 飞天诚信科技股份有限公司 | Method and device for parsing source file |
CN103164538A (en) * | 2013-04-11 | 2013-06-19 | 深圳市华力特电气股份有限公司 | Method and device for analyzing data |
CN103294652A (en) * | 2012-02-27 | 2013-09-11 | 腾讯科技(深圳)有限公司 | Data conversion method and system |
CN104023018A (en) * | 2014-06-11 | 2014-09-03 | 中国联合网络通信集团有限公司 | Text protocol reverse resolution method and system |
CN106534267A (en) * | 2016-10-19 | 2017-03-22 | 中国银行股份有限公司 | File uploading and resolving method and device |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108595453A (en) * | 2017-12-20 | 2018-09-28 | 中国联合网络通信集团有限公司 | URL identity maps acquisition methods and device |
CN108595453B (en) * | 2017-12-20 | 2020-09-01 | 中国联合网络通信集团有限公司 | URL (Uniform resource locator) identifier mapping obtaining method and device |
CN108319589B (en) * | 2018-03-14 | 2021-08-10 | 腾讯科技(深圳)有限公司 | Parameter string processing method, device, computer readable storage medium and equipment |
CN108319589A (en) * | 2018-03-14 | 2018-07-24 | 腾讯科技(深圳)有限公司 | Parameter string processing method, apparatus, computer readable storage medium and equipment |
CN109033410A (en) * | 2018-08-03 | 2018-12-18 | 韩雪松 | A kind of SQL analytic method based on regular expressions and character string cutting |
CN109033410B (en) * | 2018-08-03 | 2021-10-29 | 韩雪松 | SQL (structured query language) analysis method based on regular and character string cutting |
CN111177484A (en) * | 2019-12-09 | 2020-05-19 | 贵阳语玩科技有限公司 | System and method for loading and managing different data sources and format character string resource files |
CN111143554A (en) * | 2019-12-10 | 2020-05-12 | 中盈优创资讯科技有限公司 | Data sampling method and device based on big data platform |
CN111143554B (en) * | 2019-12-10 | 2024-03-12 | 中盈优创资讯科技有限公司 | Data sampling method and device based on big data platform |
CN111427899A (en) * | 2020-03-17 | 2020-07-17 | 中国建设银行股份有限公司 | Method, device, equipment and computer readable medium for storing file |
CN113761283A (en) * | 2020-06-01 | 2021-12-07 | 中移(苏州)软件技术有限公司 | Method, device, equipment and storage medium for reading XML file |
CN113761283B (en) * | 2020-06-01 | 2023-09-05 | 中移(苏州)软件技术有限公司 | Method and device for reading XML file, equipment and storage medium |
CN114422498A (en) * | 2021-12-14 | 2022-04-29 | 杭州安恒信息技术股份有限公司 | Big data real-time processing method and system, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN107341135B (en) | 2019-11-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107341135B (en) | A kind of analytic method and tool towards generic text format | |
US5359673A (en) | Method and apparatus for converting bitmap image documents to editable coded data using a standard notation to record document recognition ambiguities | |
CN105653517A (en) | Recognition rate determining method and apparatus | |
CN109976840B (en) | Method and system for realizing multi-language automatic adaptation based on foreground and background separation platform | |
CN108021540A (en) | The analytic method and instrument of a kind of generic text form towards Hadoop | |
CN103902918B (en) | Method and device for rapidly extracting text from Word document | |
US20080179406A1 (en) | Method for the dual coding of information on physical media and in a computerized format (DOTEM) | |
US9049400B2 (en) | Image processing apparatus, and image processing method and program | |
CN110795606A (en) | Method for generating log analysis rule | |
CN104079450B (en) | Feature mode set creation method and device | |
CN104035765B (en) | A kind of analysis method of embedded system context | |
TWI557647B (en) | Two - dimensional code, generation method and recognition method with two - dimensional software installation information | |
JP5853531B2 (en) | Information processing apparatus and information processing program | |
CN107949852A (en) | Character recognition device, character identifying method and program | |
CN108021711A (en) | A kind of method of information processing | |
JP2011060268A (en) | Image processing apparatus and program | |
KR101790544B1 (en) | Information processing apparatus, information processing method, and storage medium | |
CN116361586B (en) | Method for realizing HTTP protocol request data highlighting in webpage | |
JP5673277B2 (en) | Image processing apparatus and program | |
KR101165201B1 (en) | Conversion server for a contents providing system | |
JP6260181B2 (en) | Information processing apparatus and information processing program | |
CN109145125A (en) | A kind of method and system, the storage medium of dynamic Extracting Information | |
US8256687B2 (en) | Method of coding information in a dual fashion on physical media and in DOTEM computerised form | |
CN108694229A (en) | String data analytical equipment and string data analysis method | |
JP2014235694A (en) | Document processing device, document processing method, and document processing program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||