CN109918622A - Method and system for converting Word documents to LaTeX documents, implemented in Java
- Publication number: CN109918622A
- Application number: CN201910143870.7A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Document Processing Apparatus (AREA)
Abstract
The invention discloses a method and system for converting a Word document to a LaTeX document. Given a Word file submitted by the user, the system uses JACOB to perform an initial analysis of the text, pictures, formulas, tables, and other data in the file. It then extracts the data elements from the source file using Apache POI and JACOB and records the relative position of each element. The extracted text elements are classified with a naive Bayes algorithm, and the formulas in the source file are converted using a stacked autoencoder. The relative position information is combined with each data element to form the information stream of the target LaTeX document, and this stream is written to the target file to produce the final LaTeX document. The invention reduces the difficulty and complexity of converting Word documents to LaTeX documents, provides a professional document conversion method for university teachers, students, and researchers, and improves document-processing efficiency.
Description
Technical field
The present invention relates to the field of document conversion and data processing, and more specifically to a Java-based method for converting a Word document to a LaTeX document.
Background technique
TeX provides a powerful and extremely flexible typesetting language with roughly 900 commands. TeX also has a macro facility: users can define their own commands and so keep extending the TeX system. LaTeX, developed by Leslie Lamport, is currently the most popular and most widely used TeX macro package. Microsoft Office Word, the core program of the Office suite, provides many easy-to-use document creation tools and is the word processor with the largest market share. Its native file format (.docx) has become the de facto standard for general-purpose documents. Document conversion means converting between document formats such as Word, PDF, TXT, OOXML, ODF, and HTML. Examples include a method proposed by an earlier inventor for converting OOXML and ODF documents to HTML, and the conversion between Word and PDF implemented in Adobe Acrobat Professional. Apache POI is an open-source Java library whose main goal is access to the underlying files of Word documents. JACOB is a Java-COM bridge through which a Java application can call COM components and Win32 libraries. Together, Apache POI and JACOB make it possible to read and write Microsoft Office Word files.
In the course of realizing the present invention, the inventors found that existing document conversion suffers from three kinds of problems, in the technology itself and in how users experience it. First, existing conversion techniques generally target a small number of source formats and one specific target format; their function is narrow and their practical value to users is limited. Second, converting between documents that are encoded differently, such as Microsoft Office Word and LaTeX documents, is difficult. Finally, a LaTeX document is written in the TeX markup language, and producing a complete LaTeX document requires command of nearly all TeX description rules as well as coding skill, so writing and typesetting such documents is difficult and complex for non-specialists.
Summary of the invention
In view of the above drawbacks, the technical problem to be solved by the present invention is to provide a method and system for converting a Word document to a LaTeX document.
The technical solution adopted by the present invention to solve this problem is a Java-based method for converting a Word document to a LaTeX document, comprising the following steps:
S1, opening the Word source document submitted by the user through the Word invocation module of the JACOB component;
S2, in the opened source document, performing an initial analysis of each type of data element through the JACOB component, and obtaining and recording the data information of each data element in the source document;
S3, based on the data information recorded in step S2, extracting each type of data element from the source document using the Apache POI component and the JACOB component;
S4, converting the data elements extracted in step S3 into information streams, with each type of data element forming its own corresponding information stream;
S5, combining the data information recorded in step S2 with the information stream of each type of data element to form the information stream of the target LaTeX document, while keeping the position of every data element in the source document unchanged;
S6, writing the information stream of the target LaTeX document formed in step S5 to the target file, thereby converting the Word source document into a LaTeX document.
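Steps S5 and S6 can be pictured with a minimal Java sketch. It assumes that each extracted element has already been turned into a LaTeX fragment and tagged with an integer position; the `StreamAssembler` and `Element` names, the integer position, and the `article` document class are illustrative assumptions, not details given in the patent.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of steps S5-S6: merge per-element LaTeX fragments by recorded position. */
class StreamAssembler {

    /** One extracted element: its recorded relative position and its LaTeX fragment (illustrative type). */
    static class Element {
        final int position;
        final String latex;

        Element(int position, String latex) {
            this.position = position;
            this.latex = latex;
        }
    }

    /** Orders the fragments by source position and wraps them in a minimal LaTeX document. */
    static String assemble(List<Element> elements) {
        List<Element> ordered = new ArrayList<>(elements);
        ordered.sort(Comparator.comparingInt((Element e) -> e.position));

        StringBuilder out = new StringBuilder();
        out.append("\\documentclass{article}\n\\begin{document}\n");
        for (Element e : ordered) {
            out.append(e.latex).append("\n\n");
        }
        out.append("\\end{document}\n");
        return out.toString();
    }
}
```

The returned string corresponds to the target information stream; writing it to a `.tex` file (for instance with `java.nio.file.Files.write`) would complete step S6.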
Further, the data information obtained and recorded in step S2 includes the category of each data element and its relative position in the source document; the data elements analyzed by the JACOB component include text, picture, table, and formula elements.
Further, the initial analysis of the data elements in step S2 specifically determines the storage state of every data element in the source file.
Further, in step S2 the category and relative position of each data element are recorded through the Paragraphs, Item, Text, and Table interfaces of the JACOB component.
Further, extracting each type of data element from the source document in step S3 includes:
For text elements, extracting the text elements of the source document through the get("Text"), get("Font"), and get("Size") functions of the JACOB component; a text element comprises the text content, text type, and text format;
For picture elements, extracting the picture elements of the source document using the XWPFDocument interface of the Apache POI component, and saving each extracted picture as a local file using Java's built-in FileOutputStream;
For table elements, extracting the table elements of the source document by combining the getTable and ReadTable functions of the JACOB component; the dimensions of a table are obtained through the getTableRowsCount and getTableColumnsCount methods of the JACOB component;
For formula elements, using the data information recorded in step S2, extracting the formula elements of the source document through the copy method of the JACOB component and the getContents function of the clipboard class obtained from the Toolkit utility class; the clipboard contents are held in a Transferable variable, and the data is converted through the getTransferData method;
Wherein, each time a type of data element is extracted, its relative position in the source document is recorded.
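Once a table's cells and dimensions are known, forming its LaTeX fragment is mostly string assembly. The sketch below is one possible way to do it, assuming the cells have already been read into a rectangular `String[][]`; the `TableToLatex` class, the centred `|c|` column layout, and the set of escaped characters are illustrative choices, not taken from the patent.

```java
/** Hypothetical helper: renders already-extracted table cells as a LaTeX tabular. */
class TableToLatex {

    /** Escapes a few characters that are special in LaTeX table cells. */
    static String escape(String cell) {
        return cell.replaceAll("([&%_#$])", "\\\\$1");
    }

    /** Builds a bordered tabular; rows are assumed to have the same number of columns. */
    static String toTabular(String[][] cells) {
        if (cells.length == 0 || cells[0].length == 0) {
            throw new IllegalArgumentException("empty table");
        }
        int cols = cells[0].length;

        StringBuilder sb = new StringBuilder("\\begin{tabular}{");
        for (int c = 0; c < cols; c++) {
            sb.append("|c");
        }
        sb.append("|}\n\\hline\n");

        for (String[] row : cells) {
            for (int c = 0; c < cols; c++) {
                if (c > 0) {
                    sb.append(" & ");
                }
                sb.append(escape(row[c]));
            }
            sb.append(" \\\\\n\\hline\n");   // end of row, then a horizontal rule
        }
        sb.append("\\end{tabular}");
        return sb.toString();
    }
}
```

Calling `TableToLatex.toTabular(new String[][]{{"a", "b"}, {"1", "2"}})` yields a two-column tabular with one `\hline` after each row.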
Further, the data elements include text, picture, table, and formula elements. In step S4, the extracted text elements are classified using the naive Bayes algorithm to form the corresponding LaTeX text-element information stream; the extracted formula elements are converted using a stacked autoencoder to form the corresponding LaTeX formula-element information stream; and the remaining data elements are converted directly, according to their relative position information, into the corresponding target-document format information streams.
Further, classifying the extracted text elements using the naive Bayes algorithm includes the following steps:
A1, converting the n extracted text elements, after Jieba word segmentation, into an n-dimensional feature vector X = {x1, x2, …, xn}, where xi is the i-th feature and 1 ≤ i ≤ n;
A2, treating the classification of the extracted text data as a binary classification problem, that is, any unknown text sample d belongs to the category set C = {C0, C1}, where C0 denotes body text and C1 denotes title text;
A3, identifying the type of each piece of text data, body text or title text, using the naive Bayes algorithm;
A4, calculating the probability P that the unknown text sample d belongs to category c (the formula appears as an image in the original publication and is not reproduced here);
Wherein the category with the maximum probability is taken as the category of the unknown text sample d, and the corresponding LaTeX text element is formed according to that category.
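The expression for step A4 is not available in this text. Assuming the method follows the standard naive Bayes formulation (an assumption, since the source formula cannot be checked), the posterior for a sample d with features x1, …, xn and a category c in {C0, C1}, and the resulting decision, would take the following form, where the chosen category is written as \hat{c}:

```latex
P(c \mid d) = \frac{P(c)\,\prod_{i=1}^{n} P(x_i \mid c)}{P(d)},
\qquad
\hat{c} = \arg\max_{c \in \{C_0,\, C_1\}} \; P(c)\,\prod_{i=1}^{n} P(x_i \mid c)
```

Because P(d) is the same for both categories, it can be dropped when comparing them, which is why it does not appear in the decision rule.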
Further, converting the extracted formula elements using the stacked autoencoder includes the following steps:
B1, encoding the formula elements extracted in step S3 using the stacked autoencoder algorithm;
B2, approximately matching the encoding obtained in step B1 against the existing encoded data in the formula template library;
B3, inputting the formula template with the highest matching degree into the system's formula conversion module WordMathToLaTeX, which further converts the formula format of the source file into a form that a LaTeX document can recognize.
Further, in step B3, the formula template with the highest matching degree is determined from the Euclidean distance between the stacked autoencoder output x and a known sample y (the expression appears as an image in the original publication and is not reproduced here);
Wherein x1, x2, …, xn and y1, y2, …, yn denote the values of each vector component after formula encoding.
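The distance expression itself is not available in this text. Under the standard definition of Euclidean distance, which is what the surrounding wording names, the distance between the encoded formula x = (x1, …, xn) and a known template encoding y = (y1, …, yn) is:

```latex
d(x, y) = \sqrt{\sum_{i=1}^{n} \left( x_i - y_i \right)^{2}}
```

On this reading, the template with the smallest d(x, y) is the one with the highest matching degree.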
The present invention also proposes a Java-based system for converting a Word document to a LaTeX document, which performs document conversion using any one of the above methods for converting a Word document to a LaTeX document.
In the Java-based method and system for converting a Word document to a LaTeX document of the present invention, machine learning algorithms are applied to the original Word document provided by the user to analyze the source file data intelligently and to select automatically the closest or best-matching text elements and formula elements. The overall layout of the source file data is integrated with the specific encoding of the target document to form the target file data stream, together with auxiliary streams such as the table of contents, captions, tables, and illustrations. These are written to the target file, thereby realizing conversion between documents of different types.
Implementing the Java-based method and system for converting a Word document to a LaTeX document proposed by the present invention has the following beneficial effects:
1. It reduces the difficulty and complexity of converting between document types and provides university teachers, students, researchers, and others with a convenient, professional means of document conversion;
2. It lets users convert a simple Word document into the submission format of a professional technical paper, sparing researchers and university teachers and students from learning complex LaTeX code and spending a great deal of time re-typesetting their papers; this improves working efficiency and fills the current domestic gap in conversion from Word documents to LaTeX documents.
Brief description of the drawings
The present invention is further explained below with reference to the drawings and embodiments, in which:
Fig. 1 is a flowchart of converting a Word document to a LaTeX document;
Fig. 2 is a flowchart of classifying the extracted text elements and formula elements with the naive Bayes algorithm and the stacked autoencoder;
Fig. 3 shows the table conversion result when a Word document is converted to a LaTeX document;
Fig. 4 shows the picture conversion result when a Word document is converted to a LaTeX document;
Fig. 5 shows the formula conversion result when a Word document is converted to a LaTeX document;
Fig. 6 shows the overall conversion result when a Word document is converted to a LaTeX document.
Detailed description of the embodiments
For a clearer understanding of the technical features, objects, and effects of the present invention, specific embodiments of the invention are now described in detail with reference to the drawings.
Please refer to Fig. 1, which is a flowchart of converting a Word document to a LaTeX document. The Java-based method for converting a Word document to a LaTeX document proposed by the present invention specifically includes the following steps:
S1, opening the Word source document submitted by the user through the Word invocation module of the JACOB component.
S2, in the opened source document, performing an initial analysis of each type of data element through the JACOB component, and obtaining and recording the data information of each data element. The recorded data information includes the category of each data element and its relative position in the source document; in this embodiment, the category and relative position of each data element are recorded through the Paragraphs, Item, Text, and Table interfaces of the JACOB component. The data elements analyzed by the JACOB component include text, picture, table, and formula elements. The initial analysis specifically determines the storage state of every data element in the source document.
S3, based on the data information recorded in step S2, extracting each type of data element from the source document using the Apache POI component and the JACOB component, which includes:
For text elements, extracting the text elements of the source document through the get("Text"), get("Font"), and get("Size") functions of the JACOB component; a text element comprises the text content, text type, and text format;
For picture elements, extracting the picture elements of the source document using the XWPFDocument interface of the Apache POI component, and saving each extracted picture as a local file using Java's built-in FileOutputStream;
For table elements, extracting the table elements of the source document by combining the getTable and ReadTable functions of the JACOB component; the dimensions of a table are obtained through the getTableRowsCount and getTableColumnsCount methods of the JACOB component;
For formula elements, using the data information recorded in step S2, extracting the formula elements of the source document through the copy method of the JACOB component and the getContents function of the clipboard class obtained from the Toolkit utility class;
Wherein, each time a type of data element is extracted, its relative position in the source document is recorded.
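The picture branch of step S3 ends with a `FileOutputStream` write. The following sketch shows only that final part using the Java standard library; it assumes the picture bytes have already been obtained (with Apache POI they would typically come from `XWPFDocument.getAllPictures()` and `XWPFPictureData.getData()`). The `PictureSaver` class name and the `\includegraphics` width are illustrative assumptions.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

/** Hypothetical sketch: save one extracted picture locally and emit the matching LaTeX include. */
class PictureSaver {

    /**
     * Writes the picture bytes to dir/name and returns a LaTeX fragment referring to the file.
     * The width option is an arbitrary illustrative default; the target document would
     * also need \\usepackage{graphicx} in its preamble.
     */
    static String saveAndInclude(byte[] data, File dir, String name) throws IOException {
        File target = new File(dir, name);
        try (FileOutputStream out = new FileOutputStream(target)) {
            out.write(data);   // the stream is closed automatically by try-with-resources
        }
        return "\\includegraphics[width=0.8\\linewidth]{" + name + "}";
    }
}
```

The returned fragment refers to the file by name, so the saved pictures and the generated `.tex` file are assumed to end up in the same directory.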
S4, converting the data elements extracted in step S3 into information streams, with each type of data element forming its own corresponding information stream. Specifically: the extracted formula elements are converted using a stacked autoencoder to form the corresponding LaTeX formula-element information stream; the extracted text elements are classified using the naive Bayes algorithm to form the corresponding LaTeX text-element information stream; and the remaining data elements are converted directly, according to their relative position information, into the corresponding target-document format information streams.
S5, combining the data information recorded in step S2 with the information stream of each type of data element to form the information stream of the target LaTeX document, while keeping the position of every data element in the source document unchanged;
S6, writing the information stream of the target LaTeX document formed in step S5 to the target file, thereby converting the Word source document into a LaTeX document.
Please refer to Fig. 2, which is a flowchart of classifying the extracted text elements and formula elements with the naive Bayes algorithm and the stacked autoencoder. Specifically, classifying the extracted text elements using the naive Bayes algorithm includes the following steps:
A1, converting the n extracted text elements, after Jieba word segmentation, into an n-dimensional feature vector X = {x1, x2, …, xn}, where xi is the i-th feature and 1 ≤ i ≤ n;
A2, treating the classification of the extracted text data as a binary classification problem, that is, any unknown text sample d belongs to the category set C = {C0, C1}, where C0 denotes body text and C1 denotes title text;
A3, identifying the type of each piece of text data, body text or title text, using the naive Bayes algorithm;
A4, calculating the probability P that the unknown text sample d belongs to category c (the formula appears as an image in the original publication and is not reproduced here);
Wherein the category with the maximum probability is taken as the category δ of the unknown text sample d, and the corresponding LaTeX text element is formed according to category δ;
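Steps A1 to A4 can be illustrated with a small Java classifier. This is a sketch under several assumptions that the patent does not state: a multinomial naive Bayes model with add-one (Laplace) smoothing, input that has already been split into tokens (standing in for the Jieba segmentation step), and a mapping of title text to `\section{...}`. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-class naive Bayes sketch: C0 = body text, C1 = title text. */
class TitleBodyClassifier {
    static final int BODY = 0;
    static final int TITLE = 1;

    private final Map<String, int[]> tokenCounts = new HashMap<>(); // token -> count per class
    private final int[] totalTokens = new int[2];                   // tokens seen per class
    private final int[] docs = new int[2];                          // training samples per class

    /** Adds one labelled, pre-tokenised sample to the counts. */
    void train(String[] tokens, int label) {
        docs[label]++;
        for (String t : tokens) {
            tokenCounts.computeIfAbsent(t, k -> new int[2])[label]++;
            totalTokens[label]++;
        }
    }

    /** Returns the class with the larger smoothed log-posterior. */
    int classify(String[] tokens) {
        int vocab = tokenCounts.size();
        int bestClass = BODY;
        double best = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < 2; c++) {
            // Smoothed class prior, then one smoothed likelihood term per token.
            double logP = Math.log((docs[c] + 1.0) / (docs[0] + docs[1] + 2.0));
            for (String t : tokens) {
                int[] counts = tokenCounts.get(t);
                int n = counts == null ? 0 : counts[c];
                // The extra +1 in the denominator reserves mass for unseen tokens.
                logP += Math.log((n + 1.0) / (totalTokens[c] + vocab + 1.0));
            }
            if (logP > best) {
                best = logP;
                bestClass = c;
            }
        }
        return bestClass;
    }

    /** Illustrative mapping from the predicted class to a LaTeX text element. */
    static String toLatex(String text, int label) {
        return label == TITLE ? "\\section{" + text + "}" : text + "\n";
    }
}
```

Working in log space avoids numeric underflow when a text element contains many tokens; the decision is the same as comparing the products of probabilities directly.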
Specifically, converting the extracted formula elements using the stacked autoencoder includes the following steps:
B1, encoding the formula elements extracted in step S3 using the stacked autoencoder algorithm;
B2, approximately matching the encoding obtained in step B1 against the existing encoded data in the formula template library;
B3, inputting the formula template with the highest matching degree into the system's formula conversion module WordMathToLaTeX, which further converts the formula format of the source file into a form that a LaTeX document can recognize. The formula template with the highest matching degree is determined from the Euclidean distance between the stacked autoencoder output x and a known sample y (the expression appears as an image in the original publication and is not reproduced here);
Wherein x1, x2, …, xn and y1, y2, …, yn denote the values of each vector component after formula encoding.
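The matching in steps B2 and B3 amounts to a nearest-neighbour search over the template encodings. The sketch below assumes the encodings are available as `double[]` vectors of equal length and treats the smallest Euclidean distance as the highest matching degree; the `TemplateMatcher` class is hypothetical, and the stacked autoencoder that produces the vectors is outside its scope.

```java
/** Hypothetical sketch: pick the formula template whose encoding is closest to the input encoding. */
class TemplateMatcher {

    /** Standard Euclidean distance between two vectors of equal length. */
    static double distance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Returns the index of the closest template, or -1 if the library is empty. */
    static int bestMatch(double[] encoded, double[][] templates) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int t = 0; t < templates.length; t++) {
            double d = distance(encoded, templates[t]);
            if (d < bestDist) {
                bestDist = d;
                best = t;
            }
        }
        return best;
    }
}
```

The returned index would then select the template passed on to the formula conversion module. A linear scan is adequate for a small template library; a larger one might call for an index structure, which the patent does not discuss.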
Based on the above principle, the present invention further proposes a Java-based system for converting a Word document to a LaTeX document, which implements document conversion using any of the above methods for converting a Word document to a LaTeX document.
Fig. 3 shows the table conversion result, Fig. 4 the picture conversion result, Fig. 5 the formula conversion result, and Fig. 6 the overall conversion result when a Word document is converted to a LaTeX document. As Figs. 3 to 6 show, the Java-based method for converting a Word document to a LaTeX document proposed by the present invention can effectively convert a Word document into a LaTeX document.
The embodiments of the present invention have been described above with reference to the drawings, but the invention is not limited to the specific embodiments described, which are illustrative rather than restrictive. Inspired by the present invention, those of ordinary skill in the art can devise many other forms without departing from the purpose of the invention and the scope protected by the claims, all of which fall within the protection of the present invention.
Claims (10)
1. A Java-based method for converting a Word document to a LaTeX document, characterized by comprising the following steps:
S1, opening the Word source document submitted by the user through the Word invocation module of the JACOB component;
S2, in the opened source document, performing an initial analysis of each type of data element through the JACOB component, and obtaining and recording the data information of each data element in the source document;
S3, based on the data information recorded in step S2, extracting each type of data element from the source document using the Apache POI component and the JACOB component;
S4, converting the data elements extracted in step S3 into information streams, with each type of data element forming its own corresponding information stream;
S5, combining the data information recorded in step S2 with the information stream of each type of data element to form the information stream of the target LaTeX document, while keeping the position of every data element in the source document unchanged;
S6, writing the information stream of the target LaTeX document formed in step S5 to the target file, thereby converting the Word source document into a LaTeX document.
2. The method for converting a Word document to a LaTeX document according to claim 1, characterized in that the data information obtained and recorded in step S2 includes the category of each data element and its relative position in the source document; the data elements analyzed by the JACOB component include text, picture, table, and formula elements.
3. The method for converting a Word document to a LaTeX document according to claim 1, characterized in that the initial analysis of the data elements in the source document in step S2 specifically determines the storage state of every data element in the source document.
4. The method for converting a Word document to a LaTeX document according to claim 1, characterized in that in step S2 the category and relative position of each data element are recorded through the Paragraphs, Item, Text, and Table interfaces of the JACOB component.
5. The method for converting a Word document to a LaTeX document according to claim 1, characterized in that extracting each type of data element from the source document in step S3 includes:
For text elements, extracting the text elements of the source document through the get("Text"), get("Font"), and get("Size") functions of the JACOB component; a text element comprises the text content, text type, and text format;
For picture elements, extracting the picture elements of the source document using the XWPFDocument interface of the Apache POI component, and saving each extracted picture as a local file using Java's built-in FileOutputStream;
For table elements, extracting the table elements of the source document by combining the getTable and ReadTable functions of the JACOB component; the dimensions of a table are obtained through the getTableRowsCount and getTableColumnsCount methods of the JACOB component;
For formula elements, using the data information recorded in step S2, extracting the formula elements of the source document through the copy method of the JACOB component and the getContents function of the clipboard class obtained from the Toolkit utility class;
Wherein, each time a type of data element is extracted, its relative position in the source document is recorded.
6. The method for converting a Word document to a LaTeX document according to claim 1, characterized in that the data elements include text, picture, table, and formula elements; in step S4, the extracted text elements are classified using the naive Bayes algorithm to form the corresponding LaTeX text-element information stream; the extracted formula elements are converted using a stacked autoencoder to form the corresponding LaTeX formula-element information stream; and the remaining data elements are converted directly, according to their relative position information, into the corresponding target-document format information streams.
7. The method for converting a Word document to a LaTeX document according to claim 6, characterized in that classifying the extracted text elements using the naive Bayes algorithm includes the following steps:
A1, converting the n extracted text elements, after Jieba word segmentation, into an n-dimensional feature vector X = {x1, x2, …, xn}, where xi is the i-th feature and 1 ≤ i ≤ n;
A2, treating the classification of the extracted text data as a binary classification problem, that is, any unknown text sample d belongs to the category set C = {C0, C1}, where C0 denotes body text and C1 denotes title text;
A3, identifying the type of each piece of text data, body text or title text, using the naive Bayes algorithm;
A4, calculating the probability P that the unknown text sample d belongs to category c (the formula appears as an image in the original publication and is not reproduced here);
Wherein the category with the maximum probability is taken as the category δ of the unknown text sample d, and the corresponding LaTeX text element is formed according to category δ.
8. The method for converting a Word document to a LaTeX document according to claim 6, characterized in that converting the extracted formula elements using the stacked autoencoder includes the following steps:
B1, encoding the formula elements extracted in step S3 using the stacked autoencoder algorithm;
B2, approximately matching the encoding obtained in step B1 against the existing encoded data in the formula template library;
B3, inputting the formula template with the highest matching degree into the system's formula conversion module WordMathToLaTeX, which further converts the formula format of the source file into a form that a LaTeX document can recognize.
9. The method for converting a Word document to a LaTeX document according to claim 8, characterized in that in step B3 the formula template with the highest matching degree is determined from the Euclidean distance between the stacked autoencoder output x and a known sample y (the expression appears as an image in the original publication and is not reproduced here);
Wherein x1, x2, …, xn and y1, y2, …, yn denote the values of each vector component after formula encoding.
10. A Java-based system for converting a Word document to a LaTeX document, characterized in that it performs document conversion using the method for converting a Word document to a LaTeX document according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910143870.7A CN109918622B (en) | 2019-02-27 | 2019-02-27 | Method for realizing conversion from Word document to LaTeX document based on JAVA |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109918622A true CN109918622A (en) | 2019-06-21 |
CN109918622B CN109918622B (en) | 2020-12-08 |
Family
ID=66962462
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910143870.7A Active CN109918622B (en) | 2019-02-27 | 2019-02-27 | Method for realizing conversion from Word document to LaTeX document based on JAVA |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109918622B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 20220414. Patentee after: Beijing anzhengtong Information Technology Co.,Ltd. (address: 100000 b1001-01, floor 9, block B, No. 9, Shangdi Third Street, Haidian District, Beijing). Patentee before: CHINA University OF GEOSCIENCES (WUHAN CITY) (address: 430000 Lu Mill Road, Hongshan District, Wuhan, Hubei Province, No. 388). |