CN110110270B - Parallel processing generation method and device for large genealogy lineage diagram - Google Patents

Parallel processing generation method and device for large genealogy lineage diagram

Info

Publication number
CN110110270B
CN110110270B CN201910339580.XA CN201910339580A CN110110270B CN 110110270 B CN110110270 B CN 110110270B CN 201910339580 A CN201910339580 A CN 201910339580A CN 110110270 B CN110110270 B CN 110110270B
Authority
CN
China
Prior art keywords
generation
character
page
characters
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910339580.XA
Other languages
Chinese (zh)
Other versions
CN110110270A (en)
Inventor
彭智勇
江欢
何子龙
彭煜玮
李蓉蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN201910339580.XA
Publication of CN110110270A
Application granted
Publication of CN110110270B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986 Document structures and storage, e.g. HTML extensions

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a method and a device for generating a large genealogy lineage diagram by parallel processing. The method first reads the persons in a lineage person table from a database into memory to form an sxt data table. It then creates a plurality of threads that process and analyze the logical relationships among the persons according to the stored persons and their generation numbers, and calculates the position information of each person using a dynamic page update algorithm. The person information is then written into the corresponding positions of the corresponding HTML template according to the calculated position information to form HTML files; a browser kernel merges the HTML files and generates a picture corresponding to each HTML file; and the generated pictures are superimposed in sequence to form the large genealogy lineage diagram. The large lineage diagram is thus generated automatically, the memory exhaustion caused by oversized data under serial single-threaded execution is avoided, and generation efficiency is greatly improved.

Description

Parallel processing generation method and device for large genealogy lineage diagram
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for generating a parallel-processing large family genealogy lineage diagram.
Background
A genealogy, also called a family tree or clan genealogy, is a written record of the deeds of important persons and of the migration and growth of a family in ancient China. Genealogies are one of the three major bodies of Chinese historical literature (national histories, local annals and genealogies). They are precious cultural data and play a unique, irreplaceable role in in-depth research in history, folklore, demography, sociology and economics. Genealogies record many kinds of content in non-uniform formats. In general they contain the origin of the family name, family information, biographies of persons, family events and so on, of which the most important is the record of the lineage diagram. After long exploration, research and development, lineage diagrams appear in diverse styles. By form of expression they can be divided into three kinds: the line transmission chart, the hanging chart and the quick look-up table (shown in FIG. 1).
With the continuous development of science, technology and computing, more and more groups and companies have begun to research, analyze and mine genealogical data. They digitize traditional paper genealogy books, organize the data and pictures they contain effectively and reasonably, and generate electronic documents in PDF format that are convenient to store and to distribute on the Internet.
At present there is much research at home and abroad on genealogy digitization, but all of it has shortcomings. Genealogy digitization systems can be roughly divided into two types. The first is the stand-alone mode, in which a user enters data on a computer and the system stores the entered data as files. The second outputs and displays the genealogy data entered by users so that people can view it conveniently.
The inventor of the present application finds that the method of the prior art has at least the following technical problems in the process of implementing the present invention:
In the process of generating a lineage diagram from genealogy data, the most complicated part is generating the person table of the lineage diagram. Some large families have grown over many generations and number hundreds of thousands of people, and generating the person lineage diagram shown in FIG. 1 must take into account the total number of persons, the number of generations, the number of persons in each generation and each person's biographical sketch. In the lineage table the relationships between persons form a large family tree, and the number of persons increases exponentially with the number of generations. With existing methods, the architecture and operating mode of the systems do not favor managed sharing and efficient generation of data, so efficient generation cannot be achieved in genealogy digitization and certain limitations remain. Well-known genealogy websites abroad can store large volumes of genealogy data, but their output is a printed edition; they do not actually offer users a function for generating and exporting an electronic book, so genealogy data cannot be digitized and generated efficiently. Meanwhile, research on personalized customization of the book layout is still in its infancy, and existing genealogy digitization platforms generally provide no such functional module.
It is understood from the above that the methods in the prior art have the technical problems that the electronic lineage diagram cannot be generated and the generation method is inefficient.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for generating a large family lineage diagram through parallel processing, so as to solve or at least partially solve the technical problems that the method in the prior art cannot generate an electronic lineage diagram and the generation method is inefficient.
The invention provides a method for generating a parallel-processed large genealogy lineage diagram, which comprises the following steps:
step S1: reading the persons in the lineage person table from a database into memory and storing them in an in-memory data structure to form an sxt data table, wherein the sxt data table stores the persons and their corresponding generation numbers;
step S2: creating a plurality of threads to process and analyze the logical relationships among the persons according to the stored persons and their corresponding generation numbers, and calculating the position information of the persons in each generation based on the logical relationships among the persons, the preset size of a generated page and the space occupied by the previous generation;
step S3: writing the person information into the corresponding positions of the corresponding HTML template according to the calculated position information to form an HTML file;
step S4: merging the HTML files by using a browser kernel, applying style typesetting to the merged HTML files, and generating a picture corresponding to each HTML file;
step S5: superimposing the generated pictures in sequence to form the large genealogy lineage diagram.
In one embodiment, creating a plurality of threads to process and analyze the logical relationships among the persons according to the stored persons and their corresponding generation numbers includes:
partitioning the sxt data table by generation number, wherein each sxt data table partition corresponds to one generation;
creating a separate thread to process each sxt data table partition.
In one embodiment, the step S2 of calculating the position information of the people included in each generation based on the logical relationship between the people, the size of the preset generated page and the space occupied by the previous generation includes:
representing the page number corresponding to each person by a Pid identifier, and obtaining the information in the sxt data table fields corresponding to each person;
calculating the position information of the persons included in each generation using the following formula:
Figure BDA0002040281670000031
where p denotes the current page number, $c_y$ denotes the number of pages occupied by the person data of the current generation, n denotes the rank of the person within the generation, m denotes the number of rows on a generated page, $b_{y-1}$ denotes the number of blank lines on the last page of the previous generation, and page denotes the calculated page number;
and replacing the Pid identifier with the calculated page number page to obtain the position information of the person in the lineage diagram.
In one embodiment, a flag bit is set for each created thread, where the flag bit corresponds to the generation number of the persons handled by the thread, and step S5 specifically includes:
superimposing the generated pictures according to the flag bit of each thread to form the large genealogy lineage diagram.
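The flag-based ordering in this embodiment can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: `render_generation` and `build_diagram` are hypothetical names, the "pictures" are plain strings standing in for rendered images, and superimposing is modelled as concatenation in flag order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def render_generation(generation, people):
    """Hypothetical stand-in for steps S2 to S4: one 'picture' per person."""
    return [f"gen{generation}-picture-for-{name}" for name in people]


def build_diagram(chunks):
    """chunks maps generation number -> names; returns pictures in generation order."""
    tagged = []
    with ThreadPoolExecutor() as pool:
        # Each task is tagged with its generation number, which plays the role of the flag bit.
        futures = {pool.submit(render_generation, gen, people): gen
                   for gen, people in chunks.items()}
        for future in as_completed(futures):  # completion order is arbitrary
            tagged.append((futures[future], future.result()))
    # Combine in ascending flag order rather than in completion order.
    diagram = []
    for _, pictures in sorted(tagged, key=lambda item: item[0]):
        diagram.extend(pictures)
    return diagram


chunks = {3: ["C1", "C2"], 1: ["A"], 2: ["B1"]}
diagram = build_diagram(chunks)
print(diagram)
```

Because the result of each thread carries its flag, the final order does not depend on which thread happens to finish first.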
In one embodiment, after calculating the page number, the method further includes:
and processing the data to be processed by adopting a preset scheduling algorithm.
Based on the same inventive concept, the second aspect of the present invention provides an apparatus for generating a parallel-processed large family genealogy lineage diagram, comprising:
an sxt data table forming module, configured to read the persons in the lineage person table from the database into memory and store them in an in-memory data structure to form an sxt data table, wherein the sxt data table stores the persons and their corresponding generation numbers;
a person position calculation module, configured to create a plurality of threads to process and analyze the logical relationships among the persons according to the stored persons and their corresponding generation numbers, and to calculate the position information of the persons in each generation based on the logical relationships among the persons, the preset size of a generated page and the space occupied by the previous generation;
an HTML file generation module, configured to write the person information into the corresponding positions of the corresponding HTML template according to the calculated position information to form an HTML file;
a picture generation module, configured to merge the HTML files by using a browser kernel, apply style typesetting to the merged HTML files and generate a picture corresponding to each HTML file;
and a large genealogy lineage diagram generating module, configured to superimpose the generated pictures in sequence to form the large genealogy lineage diagram.
In one embodiment, the person position calculating module is specifically configured to:
partitioning the sxt data table by generation number, wherein each sxt data table partition corresponds to one generation;
a separate thread is created to process each sxt data table chunk.
In one embodiment, the preset size of the generated page is the number of rows included in the generated page, the logical relationship between the characters includes the ranking of each character in the generation, and the character position calculation module is further configured to:
representing the page number corresponding to each person by a Pid identifier, and obtaining the information in the sxt data table fields corresponding to each person;
calculating the position information of the persons included in each generation using the following formula:
Figure BDA0002040281670000041
where p denotes the current page number, $c_y$ denotes the number of pages occupied by the person data of the current generation, n denotes the rank of the person within the generation, m denotes the number of rows on a generated page, $b_{y-1}$ denotes the number of blank lines on the last page of the previous generation, and page denotes the calculated page number;
and replacing the Pid identifier with the calculated page number page to obtain the position information of the person in the lineage diagram.
Based on the same inventive concept, a third aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed, performs the method of the first aspect.
Based on the same inventive concept, a fourth aspect of the present invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to the first aspect when executing the program.
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
The invention provides a method for generating a large genealogy lineage diagram by parallel processing. The method first reads the persons in a lineage person table from a database into memory to form an sxt data table. It then creates a plurality of threads that process and analyze the logical relationships among the persons according to the stored persons and their generation numbers, and calculates the position information of each person using a dynamic page update algorithm. The person information is then written into the corresponding positions of the corresponding HTML template according to the calculated position information to form HTML files; a browser kernel merges the HTML files and generates a picture corresponding to each HTML file; and the generated pictures are superimposed in sequence to form the large genealogy lineage diagram.
Compared with existing methods, the method of the invention exploits thread concurrency to process several generations at the same time and calculates the page on which each person is located in real time for typesetting. A large amount of data can therefore be processed simultaneously and the processed data can be output directly. This realizes automatic generation of large lineage diagrams, avoids the memory exhaustion caused by oversized data under serial single-threaded execution, and greatly improves generation efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a diagram of the genealogy styles generated in accordance with the present invention;
FIG. 2 is a flow chart of a method for generating a large genealogy lineage diagram by parallel processing according to an embodiment of the present invention;
FIG. 3 is a diagram of a genealogy person tree according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of parallel processing of large-genealogy person pages using multiple threads;
FIG. 5 is a flow chart of processing a large genealogy lineage diagram in a specific application;
FIG. 6 is a block diagram of an apparatus for generating a large genealogy lineage diagram by parallel processing according to an embodiment of the present invention;
FIG. 7 is a block diagram of a computer-readable storage medium in an embodiment of the invention;
FIG. 8 is a block diagram of a computer device in an embodiment of the present invention.
Detailed Description
The invention aims to provide a method for generating a large-scale genealogy lineage diagram by parallel processing, aiming at the technical problems that the electronic lineage diagram cannot be generated and the generation method is low in efficiency in the existing method.
The invention provides an algorithm for processing genealogy person data in parallel. It is based mainly on multithreading and addresses the excessive memory required when data is processed serially while a PDF genealogy document is generated from the genealogy data tables. The method takes full account of the time and space complexity of execution. It can generate the PDF genealogy document from the database quickly, with reasonable time and storage overhead, and greatly improves generation efficiency.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
Through a great deal of research and practice, the applicant of the present invention finds that the main problems faced in the prior art are as follows:
1. The lineage diagram is arranged by generation (ancestral order). This means that the persons in the family tree must be processed breadth-first: the next generation can be processed only after the current generation has been processed. When the last person of a generation has been processed, the system must therefore be able to return to the first person of that generation and start from that person's next generation.
2. As the generations grow, the number of persons becomes very large, and the page position of each person in the generated lineage diagram must be calculated from the number of persons and their information. The person information in the person tree then inevitably occupies a large amount of storage, and outputting the arranged positions inevitably takes considerable time. In practice, without a good algorithm the program crashes easily and generation of the genealogy book fails.
Based on the above consideration, the invention provides a method for generating a parallel-processing large family genealogy lineage diagram.
The main inventive concept is as follows. The genealogy table data is to be converted into a book format that can be printed and published; because the number of persons is large, laying out and printing each page individually by the conventional method is unrealistic. One feasible idea is to organize and typeset the pages of each lineage table as HTML pages. The original picture information of each PDF page to be generated can then be organized as HTML strings, which saves a great deal of memory. Because the number of persons is large, the HTML strings corresponding to the PDF file of the final book are still large. Parallel processing is therefore needed to speed up generation, and a large-string scheduling algorithm is used to convert the HTML strings generated in a predetermined format into PDF document pages.
The following describes a specific implementation flow of the method for generating a large family genealogy lineage diagram provided by the present invention.
The embodiment provides a method for generating a parallel-processed large family genealogy lineage diagram, please refer to fig. 2, which includes:
Step S1: reading the persons in the lineage person table from the database into memory and storing them in an in-memory data structure to form an sxt data table, wherein the sxt data table stores the persons and their corresponding generation numbers.
Specifically, the sxt data table stores the data corresponding to the person in the database, wherein the most important is the generation number of the person.
The genealogy data (i.e. the lineage person table) forms a huge person relation tree in memory. Each level of the tree corresponds to the members of one generation of the family, and adjacent levels are tightly linked logically. The person trees of different families differ in size, depth and node count. Generating the genealogy lineage diagram amounts to traversing all nodes of the tree, starting from the root, from left to right and from top to bottom, to form a complete genealogy book. The person tree structure is shown in FIG. 3.
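The top-to-bottom, left-to-right traversal described above is a level-order (breadth-first) walk. The following Python sketch shows one way to express it; the `children` mapping and the function name `level_order` are hypothetical and are not taken from the patent.

```python
from collections import deque


def level_order(children, root):
    """Yield (generation, person) pairs top to bottom, left to right."""
    queue = deque([(1, root)])
    while queue:
        generation, person = queue.popleft()
        yield generation, person
        # Children join the end of the queue, so a whole generation is
        # visited before any member of the next one.
        for child in children.get(person, []):
            queue.append((generation + 1, child))


# Hypothetical three-generation family: A has B and C; B has D; C has E and F.
children = {"A": ["B", "C"], "B": ["D"], "C": ["E", "F"]}
order = list(level_order(children, "A"))
print(order)
# [(1, 'A'), (2, 'B'), (2, 'C'), (3, 'D'), (3, 'E'), (3, 'F')]
```

The queue is what lets processing "return" to the first person of a generation once its last person has been handled.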
In a specific implementation, SQL commands may be run against the data tables to extract the person information into memory and build the key sxt data structure. The sxt data structure is a data table in XML format (in the implementation, a DataSet in C#; depending on the data to be typeset, it is obtained from the original relational tables by SQL queries, using joins and similar operations, so that it contains the complete persons and the corresponding information to be typeset). The data is chunked by generation number (one chunk per generation), and a separate thread is created to process the persons of each generation.
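As a rough illustration of the loading step, the sketch below uses Python's built-in `sqlite3` module instead of the C# DataSet mentioned above. The table names (`person`, `bio`), their columns and the sample rows are assumptions made only for this example.

```python
import sqlite3

# Hypothetical schema and sample rows; the patent does not specify the tables.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT,
                         generation INTEGER, father_id INTEGER);
    CREATE TABLE bio (person_id INTEGER, summary TEXT);
    INSERT INTO person VALUES (1, 'A', 1, NULL), (2, 'B', 2, 1), (3, 'C', 2, 1);
    INSERT INTO bio VALUES (1, 'founder'), (3, 'second son');
""")

# Join the relational tables so each in-memory row carries everything
# needed for typesetting, ordered by generation number.
rows = conn.execute("""
    SELECT p.id, p.name, p.generation, p.father_id, b.summary
    FROM person AS p
    LEFT JOIN bio AS b ON b.person_id = p.id
    ORDER BY p.generation, p.id
""").fetchall()

sxt = [dict(row) for row in rows]  # in-memory stand-in for the sxt data table
conn.close()
print(sxt[0])
```

A left join is used here so that persons without a biography still appear in the in-memory table.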
Step S2: creating a plurality of threads to process and analyze the logical relationships among the persons according to the stored persons and their corresponding generation numbers, and calculating the position information of the persons in each generation based on the logical relationships among the persons, the preset size of a generated page and the space occupied by the previous generation.
Specifically, the logical relationships among the persons include the generation number of each person and the relationships between persons. For example, when the genealogy pages are generated generation by generation, once one generation has been processed the procedure must return to the first person of that generation and continue with that person's first child. It therefore needs to know who that first child is, and processing of the next generation starts from that child. Relations such as parentage and marriage are determined from the logical relationships among the persons.
The main work of each thread is to write the person information into the corresponding positions of the corresponding HTML template to obtain the HTML code of the person pages for its generation. Meanwhile, the page number corresponding to each person is temporarily represented by a Pid identifier, and the information in the data table fields of each person node is obtained during processing.
Step S3: writing the person information into the corresponding positions of the corresponding HTML template according to the calculated position information to form an HTML file.
Specifically, each person in a thread is processed into page content, and the template code is concatenated to form the HTML file (a string) for one page.
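The template concatenation in step S3 might look like the following Python sketch. The templates, the `{{Pid:<id>}}` marker syntax used as the temporary page-number placeholder, and the function name `render_page` are all hypothetical; the patent does not disclose the actual template.

```python
from html import escape
from string import Template

# Hypothetical templates; {{Pid:<id>}} marks a page number that is not yet known.
ROW = Template("<tr><td>$name</td><td>gen $generation</td>"
               "<td>page {{Pid:$pid}}</td></tr>")
PAGE = Template("<html><body><table>$rows</table></body></html>")


def render_page(people):
    """Concatenate one table row per person into the HTML string of one page."""
    rows = "".join(
        ROW.substitute(name=escape(p["name"]),
                       generation=p["generation"],
                       pid=p["id"])
        for p in people
    )
    return PAGE.substitute(rows=rows)


html = render_page([{"id": 7, "name": "Li & Sons", "generation": 2}])
print(html)
```

Escaping the person data before substitution keeps names containing characters such as `&` from breaking the generated markup.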
Step S4: merging the HTML files by using a browser kernel, applying style typesetting to the merged HTML files, and generating a picture corresponding to each HTML file.
Specifically, this step mainly merges the previously generated HTML files.
Step S5: superimposing the generated pictures in sequence to form the large genealogy lineage diagram.
Specifically, after the pictures corresponding to each generation have been generated, they are superimposed to form the final PDF file for output.
In one embodiment, in step S2, creating a plurality of threads to process and analyze the logical relationships among the persons according to the stored persons and their corresponding generation numbers includes:
partitioning the sxt data table by generation number, wherein each sxt data table partition corresponds to one generation;
creating a separate thread to process each sxt data table partition.
Specifically, this embodiment chunks the data by generation number (one chunk per generation) and creates a separate thread to process the persons of each generation. The sxt data is partitioned according to the number of people, each partition is processed by a single thread, and a complete copy of the sxt data table is required so that the logical associations in the person information can be resolved.
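A minimal Python sketch of this partitioning follows. The row layout (`id`, `name`, `generation`, `father_id`) and the function names are assumptions for illustration; each worker receives its own chunk plus a copy of the full table so that it can resolve parents that live in another generation.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_generation(sxt):
    """Group rows by generation number, one chunk per generation."""
    chunks = defaultdict(list)
    for row in sxt:
        chunks[row["generation"]].append(row)
    return dict(sorted(chunks.items()))


def process_chunk(generation, chunk, full_table):
    """Per-thread work: look up each person's father in the full table copy."""
    by_id = {row["id"]: row for row in full_table}
    result = []
    for row in chunk:
        father = by_id[row["father_id"]]["name"] if row["father_id"] is not None else None
        result.append((generation, row["name"], father))
    return result


# Hypothetical sample data.
sxt = [
    {"id": 1, "name": "A", "generation": 1, "father_id": None},
    {"id": 2, "name": "B", "generation": 2, "father_id": 1},
    {"id": 3, "name": "C", "generation": 2, "father_id": 1},
    {"id": 4, "name": "D", "generation": 3, "father_id": 2},
]
chunks = partition_by_generation(sxt)
with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
    futures = {gen: pool.submit(process_chunk, gen, chunk, list(sxt))
               for gen, chunk in chunks.items()}
    results = {gen: future.result() for gen, future in futures.items()}
print(results)
```

In this sketch the copy is a shallow `list(sxt)`; a real implementation would choose the copying strategy according to whether threads modify the rows.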
In one embodiment, the step S2 of calculating the position information of the people included in each generation based on the logical relationship between the people, the size of the preset generated page and the space occupied by the previous generation includes:
representing the page number corresponding to each person by a Pid identifier, and obtaining the information in the sxt data table fields corresponding to each person;
calculating the position information of the persons included in each generation using the following formula:
Figure BDA0002040281670000081
where p denotes the current page number, $c_y$ denotes the number of pages occupied by the person data of the current generation, n denotes the rank of the person within the generation, m denotes the number of rows on a generated page, $b_{y-1}$ denotes the number of blank lines on the last page of the previous generation, and page denotes the calculated page number;
and replacing the Pid identifier with the calculated page number page to obtain the position information of the person in the lineage diagram.
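The final replacement of the Pid identifier can be sketched in Python as a simple substitution pass over the generated HTML string. The marker syntax `{{Pid:<id>}}` and the mapping `page_of` are hypothetical, and the page numbers are taken as given here because the page formula itself is only available as an image in this text.

```python
import re

# Hypothetical marker syntax for the temporary Pid identifier.
PID = re.compile(r"\{\{Pid:(\d+)\}\}")


def fill_page_numbers(html, page_of):
    """Replace each {{Pid:<id>}} marker with the page computed for that person."""
    return PID.sub(lambda match: str(page_of[int(match.group(1))]), html)


html = "<td>A, see page {{Pid:1}}</td><td>B, see page {{Pid:2}}</td>"
filled = fill_page_numbers(html, {1: 12, 2: 13})
print(filled)
# <td>A, see page 12</td><td>B, see page 13</td>
```

Deferring the substitution in this way lets each thread emit its HTML before the final page numbers of earlier generations are known.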
Please refer to fig. 4, which is a diagram illustrating a large genealogy character page parallel processing using multiple threads.
The user first logs in to the system and sets parameters. The sxt data table is read, the persons are split across a plurality of threads (processing thread 1, processing thread 2, ..., processing thread n), the person pages are calculated by the person position calculation algorithm (a dynamic page update algorithm), the results are superimposed to generate pictures, and the large genealogy lineage diagram is generated and finally output.
In a specific implementation, the initial page value of each thread is represented by a different symbol. If there are n threads, the person data of each thread is stored in a thread array, and the initial page values of the threads are set to $x_1, \dots, x_n$, written $x_y$ for $y = 1, \dots, n$, where each $x_y$ is 1. The last page number of the generation in each thread array is then $x_y + c_y$, where $c_y$ is the number of pages occupied by the person data of generation $y$ ($y = 1, \dots, n$).
Let the number of lines on a generated page be a fixed value m, and let the length of each thread array (the sxt data divided by generation) be $l_y$ ($y = 1, \dots, n$), i.e. the number of persons in each generation; for ease of calculation, each person is assumed to occupy one line. The row index of a person within its thread array is n (numbered from 0).
Let $a_y = l_y / m$ (Formula 1), where $a_y$ is the number of pages occupied by the data in the y-th thread array (i.e. the y-th generation).
Let $b_y = m - (l_y \bmod m)$ (Formula 2), where $b_y$ is the number of blank lines on the last page of the y-th thread array and $l_y$ is the array length defined above.
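A small worked example of Formulas 1 and 2 in Python follows. It rests on three assumptions that the text does not state: the division in Formula 1 is rounded up to whole pages, the modulus in Formula 2 is taken of the array length $l_y$, and a generation that exactly fills its pages has 0 blank lines (a literal reading of Formula 2 would give m in that case). The function name is hypothetical.

```python
def pages_and_blank_lines(l_y, m):
    """Return (a_y, b_y) for a generation of l_y one-line persons on m-line pages."""
    a_y = -(-l_y // m)              # assumed: ceiling of l_y / m (Formula 1)
    remainder = l_y % m
    # assumed: a full last page has 0 blank lines, otherwise m - (l_y mod m) (Formula 2)
    b_y = 0 if remainder == 0 else m - remainder
    return a_y, b_y


print(pages_and_blank_lines(45, 20))  # (3, 15): two full pages plus 5 lines
print(pages_and_blank_lines(40, 20))  # (2, 0): exactly two full pages
```

With 45 persons and 20 lines per page, the generation occupies 3 pages and leaves 15 blank lines on its last page, which the following generation may then fill.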
When the data is divided into several sub-datasets for parallel processing, each sub-dataset is stored in its own data array. After the data in each array has been processed, there is no guarantee that each array fills a whole number of pages. The blank lines of the arrays must therefore be compared and the array data shifted accordingly, case by case, until the data in every thread array fills whole pages, except that the data in the last thread array may leave its final page partly empty. At the same time, the Page field of each person is calculated to obtain its value.
The actual situation of each thread array determines different values, and the Page value for each person's Pid placeholder differs accordingly. Six situations must be distinguished. In the formulas below, the current thread is y, the previous thread is y-1, and each array represents one generation. The assignments listed record how the blank lines of the current and the previous generation are updated; the blank lines are passed on to the next generation one generation at a time.
(1) When $b_{y-1} = 0$ and $b_y \neq 0$, the previous array fills its last page and the next array leaves blank lines. The blank lines of the page array of the previous thread, and the blank lines and page count of the array of the current thread, are updated as follows:
$b_y = b'_y$
$a_y = l_y / m$
$b_{y-1} = b'_{y-1}$
$a_{y-1} = l_{y-1} / m$
$c_{y-1} = a_{y-1}$
$c_y = a_y$
where $b'_y$ can be obtained from Formula 1 above and $b'_{y-1}$ from Formula 2 above.
(2) When b_{y-1} ≠ 0 and b_y = 0, the next array fills its last page while the last page of the previous array has blank rows. Because the pages are typeset in sequence, the page numbers of the next array must be recalculated: the content of the next array is typeset into the blank rows of the previous page, and the blank rows of the previous array are passed on to the next array. The corresponding values are:
b_y = b_{y-1}
Figure BDA0002040281670000101
b_{y-1} = 0
Figure BDA0002040281670000102
c_{y-1} = a_{y-1}
c_y = a_y
(3) When b_{y-1} = 0 and b_y = 0, both the previous array and the next array fill their last pages, and the corresponding values are:
b_y = 0
Figure BDA0002040281670000103
b_{y-1} = 0
Figure BDA0002040281670000104
c_{y-1} = a_{y-1}
c_y = a_y
(4) When b_{y-1} ≠ 0 and b_y ≠ 0, both the previous array and the next array have blank rows:
a. If b_{y-1} + b_y < m, i.e., the blank rows on the last pages of the two threads add up to less than the number of rows of one page, the corresponding values are:
b_y = b_{y-1} + b'_y
Figure BDA0002040281670000111
b_{y-1} = 0
Figure BDA0002040281670000112
c_{y-1} = a_{y-1}
c_y = a_y
b. If b_{y-1} + b_y > m, i.e., the two blank-row values add up to more than the number of rows of one page, the corresponding values are:
b_y = b_{y-1} + b'_y - m
Figure BDA0002040281670000113
b_{y-1} = 0
Figure BDA0002040281670000114
c_{y-1} = a_{y-1}
c_y = a_y
c. If b_{y-1} + b_y = m, i.e., the two blank-row values add up to exactly the number of rows of one page, the corresponding values are:
b_y = 0
Figure BDA0002040281670000115
b_{y-1} = 0
Figure BDA0002040281670000116
c_{y-1} = a_{y-1}
c_y = a_y
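The blank-row assignments in the six cases above can be read as one rule: the previous generation hands over all of its blank rows, and the current generation ends with (b_{y-1} + b'_y) mod m blank rows. The Python sketch below writes the cases out one by one and checks them against that closed form. The function names are hypothetical, the page-count updates (given only as figures) are not modelled, and both inputs are assumed to lie in the range 0 to m-1.

```python
def carry_by_case(prev_blank: int, own_blank: int, m: int) -> tuple:
    """Return the updated (b_{y-1}, b_y), following the six cases literally."""
    if prev_blank == 0 and own_blank != 0:   # case (1): nothing to hand over
        return prev_blank, own_blank
    if prev_blank != 0 and own_blank == 0:   # case (2): blanks move to generation y
        return 0, prev_blank
    if prev_blank == 0 and own_blank == 0:   # case (3): both last pages are full
        return 0, 0
    total = prev_blank + own_blank           # case (4): both have blank rows
    if total < m:                            # (4a)
        return 0, total
    if total > m:                            # (4b)
        return 0, total - m
    return 0, 0                              # (4c): total == m


def carry_closed_form(prev_blank: int, own_blank: int, m: int) -> tuple:
    """Single-expression reading of the same update (an observation, not the patent's wording)."""
    return 0, (prev_blank + own_blank) % m


# Exhaustive check for small page sizes.
for m in range(1, 8):
    for prev in range(m):
        for own in range(m):
            assert carry_by_case(prev, own, m) == carry_closed_form(prev, own, m)
print(carry_by_case(2, 1, 4))  # (0, 3)
```

The last line corresponds to a page of 4 rows where the previous generation leaves 2 blank rows and the current generation on its own would leave 1.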
From the analysis of the cases above, the present invention can update the relative page number of each person on the page as each generation is processed, in the following manner (assuming that all previous pages have already been typeset and the current page number is p):
Figure BDA0002040281670000121
This yields the Page field value of each person on the current page.
The processed data is then processed once more; at this point, only the Pid placeholder in each person's description needs to be replaced with the corresponding page number Page.
For example, suppose the current page is p and the last page of the previous generation has 2 blank rows, i.e., b_{y-1} = 2. If the next generation (the y-th generation) has 7 persons and 4 persons are displayed on each page, then the 1st and 2nd persons of the next generation are on page p, the 3rd to 6th persons are on page p+1, and the 7th person is on page p+2. The blank rows are carried over to the end of the next generation, i.e., b_y = 2 + 1 = 3.
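The worked example above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the actual page formula appears in a figure that is not reproduced in the text, so the rule below (fill the blank rows left on page p first, then continue on whole pages) is inferred from the example, and the function name `page_of` is hypothetical.

```python
def page_of(n: int, p: int, m: int, prev_blank: int) -> int:
    """Page number of the n-th person (1-based) of the current generation.

    p          -- page on which the previous generation ended
    m          -- rows (persons) per page
    prev_blank -- blank rows left on page p by the previous generation
    """
    if n < 1:
        raise ValueError("n is a 1-based rank")
    if n <= prev_blank:
        return p                              # fits into the leftover rows of page p
    return p + (n - prev_blank + m - 1) // m  # ceiling division for the rest


p = 10
print([page_of(n, p, m=4, prev_blank=2) for n in range(1, 8)])
# [10, 10, 11, 11, 11, 11, 12]
```

Persons 1 and 2 land on page p, persons 3 to 6 on page p+1, and person 7 on page p+2, as in the example.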
FIG. 5 is a flow chart of a process for a large family lineage diagram in a specific application.
The method comprises the processes of data acquisition, data processing, data filling, genealogy generation and the like.
In one embodiment, a flag bit is set for each created thread, where the flag bit corresponds to the generation number of the persons handled by the thread, and step S5 specifically includes:
superimposing the generated pictures in the order given by the flag bit of each thread to form the large genealogy lineage diagram.
Specifically, because thread scheduling during parallel processing is decided by the operating system, the threads finish in no fixed order, whereas the genealogy data must be superimposed in sequence to form a book. A flag bit is therefore set for each thread so that the processing order of the threads can be recovered, in preparation for the sequential superimposition of the data after parallel processing.
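One way to picture the flag bit is as a tag that travels with each thread's result, so that results arriving in arbitrary order can be sorted back into generation order before they are superimposed. The sketch below uses Python's standard `concurrent.futures` module; `render_generation` is a hypothetical stand-in for the real per-generation rendering, and strings stand in for the generated pictures.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def render_generation(generation: int, persons: list) -> tuple:
    """Stand-in for rendering one generation; returns (flag, rendered output)."""
    return generation, f"gen {generation}: " + ", ".join(persons)


def render_in_order(partitions: dict) -> list:
    """Render all generations in parallel, then restore generation order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(render_generation, g, ps)
                   for g, ps in partitions.items()]
        # as_completed yields results in completion order, which is not fixed.
        tagged = [f.result() for f in as_completed(futures)]
    tagged.sort(key=lambda item: item[0])  # the flag restores the sequence
    return [output for _, output in tagged]


print(render_in_order({3: ["c"], 1: ["a"], 2: ["b1", "b2"]}))
# ['gen 1: a', 'gen 2: b1, b2', 'gen 3: c']
```

Sorting on the flag after the fact keeps the worker threads independent of one another; no thread has to wait for an earlier generation to finish.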
In one embodiment, after calculating the page number, the method further includes:
and processing the data to be processed by adopting a preset scheduling algorithm.
Specifically, once the page numbers have been calculated, the rest of the process reuses the original serial processing, combined with a memory/disk scheduling algorithm for large character strings (an algorithm for processing very long text data whose length usually exceeds the size of the memory). The core of this algorithm is as follows: a flag bit is added to the data to be processed in memory; while the system runs, each major stage of data processing is treated as an interval; and in each interval between data conversions the algorithm dynamically swaps character strings in and out so that enough free memory remains. The data flag bits are shown in Table 1:
Table 1: Flag bit data structure
Figure BDA0002040281670000122
Figure BDA0002040281670000131
Specifically, when person data is processed, a flag bit is added to each person record to indicate whether that record needs to be processed right away: the flag bit is set to true if the record, or data related to it, currently needs processing, and to false otherwise.
When the system processes person data and needs to perform cross-record superimposition and logical judgment, it checks the flag bit of each person: if the flag bit is false, the record is moved out of memory and stored on disk; otherwise it stays in memory for processing. These steps are repeated until the whole processing flow is finished.
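The flag-driven move between memory and disk described above might look roughly like the following Python sketch. The record layout (a dictionary with a boolean `flag` key), the JSON file format, and the function names are all assumptions made for illustration; the actual flag structure is given in Table 1, which is not reproduced in the text.

```python
import json
import tempfile
from pathlib import Path


def schedule(records: dict, spill_dir: Path) -> dict:
    """Keep records whose flag is true in memory; write the others to disk."""
    in_memory = {}
    for pid, record in records.items():
        if record["flag"]:
            in_memory[pid] = record
        else:
            path = spill_dir / f"{pid}.json"
            path.write_text(json.dumps(record), encoding="utf-8")
    return in_memory


def load_spilled(pid: str, spill_dir: Path) -> dict:
    """Read a record back from disk when a later stage needs it."""
    return json.loads((spill_dir / f"{pid}.json").read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    records = {
        "p1": {"name": "A", "flag": True},
        "p2": {"name": "B", "flag": False},
    }
    kept = schedule(records, Path(tmp))
    print(sorted(kept))                           # ['p1']
    print(load_spilled("p2", Path(tmp))["name"])  # B
```

In a real pipeline this step would run at each interval between processing stages, with the flags recomputed for the stage that follows.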
In general, the method disclosed by the invention for generating a large genealogy by processing data in parallel can efficiently digitize genealogy data and output it as a PDF file, by splitting a complete person tree and processing the parts in parallel. Without losing the logical relationships among the data, the person tree is divided by level into several sub-threads, and the child nodes of each subtree are processed in parallel. During this parallel processing, the physical position of the parent and child nodes of each node in the person tree, namely the actual page number of each person in the PDF file, is calculated. After all the data of each subtree has been processed, the subtrees are combined into a complete person tree for output.
Based on the same inventive concept, the application also provides a device corresponding to the method for generating the parallel-processed large family genealogy lineage diagram in the first embodiment, which is detailed in the second embodiment.
Example two
The present embodiment provides an apparatus for generating a parallel-processed large family genealogy lineage diagram. Referring to FIG. 6, the apparatus includes:
an sxt data table forming module 201, configured to read the persons in the lineage person table from the database into the memory and store them in a data structure in the memory to form an sxt data table, where the sxt data table stores the persons and their corresponding generation numbers;
a person position calculating module 202, configured to create a plurality of threads that process and analyze the logical relationships between the persons according to the stored persons and their corresponding generation numbers, and to calculate the position information of the persons contained in each generation based on the logical relationships between the persons, the preset size of the generated page, and the space occupied by the previous generation;
an HTML file generation module 203, configured to write the person information into the corresponding position of the corresponding HTML template according to the calculated position information of the person, so as to form an HTML file;
a picture generation module 204, configured to merge the HTML files by using a browser kernel, perform style typesetting on the merged HTML files, and generate a picture corresponding to each HTML file;
and a large genealogy lineage diagram generating module 205, configured to repeatedly superimpose the generated pictures to form a large genealogy lineage diagram.
In one embodiment, the person position calculation module 202 is specifically configured to:
partition the sxt data table by generation number, where each partition of the sxt data table corresponds to one generation;
create a separate thread to process each partition of the sxt data table.
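The partition-then-thread step can be sketched in Python as follows. The table is modelled as a list of (name, generation) pairs, which is an assumption about the sxt data table rather than its documented layout, and the per-partition work is a trivial stand-in (counting rows).

```python
import threading
from collections import defaultdict


def partition_by_generation(rows: list) -> dict:
    """Group (name, generation) rows into one partition per generation."""
    partitions = defaultdict(list)
    for name, generation in rows:
        partitions[generation].append(name)
    return dict(partitions)


def process_partitions(partitions: dict) -> dict:
    """Start one thread per partition and collect a result per generation."""
    results = {}
    lock = threading.Lock()

    def worker(generation: int, persons: list) -> None:
        row_count = len(persons)  # stand-in for the real per-generation work
        with lock:                # results is shared between threads
            results[generation] = row_count

    threads = [threading.Thread(target=worker, args=(g, ps))
               for g, ps in partitions.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


parts = partition_by_generation([("a", 1), ("b", 2), ("c", 2)])
print(parts)                      # {1: ['a'], 2: ['b', 'c']}
print(process_partitions(parts))  # row count per generation
```

Because each partition holds exactly one generation, the row count a thread sees is the l_y of that generation, which is the input the page calculation needs.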
In one embodiment, the preset size of the generated page is the number of rows contained in the generated page, the logical relationship between the persons includes the rank of each person within its generation, and the person position calculation module 202 is further configured to:
represent the page number corresponding to each person by a Pid identifier, to obtain the information in the sxt data table fields corresponding to each person;
calculate the position information of the persons contained in each generation using the following formula:
Figure BDA0002040281670000141
where p denotes the current page number, c_y denotes the number of pages occupied by the person data of the current generation, n denotes the rank of the person within its generation, m denotes the number of rows of the generated page, b_{y-1} denotes the number of blank rows on the last page of the previous generation, and page denotes the calculated page number;
and replace the Pid identifier with the calculated page number page to obtain the position information of the person in the lineage diagram.
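The final substitution of Pid identifiers by page numbers can be illustrated with a regular-expression replace. The placeholder syntax `{Pid:<id>}` is purely hypothetical, since the text does not say how the identifier is encoded in a person's description; only the idea of a late, purely textual substitution is taken from the description.

```python
import re

# Hypothetical placeholder syntax; the real encoding is not given in the text.
PID_TOKEN = re.compile(r"\{Pid:(\d+)\}")


def fill_pages(description: str, page_by_pid: dict) -> str:
    """Replace every known Pid placeholder with its calculated page number."""
    def substitute(match: re.Match) -> str:
        pid = int(match.group(1))
        if pid in page_by_pid:
            return str(page_by_pid[pid])
        return match.group(0)  # unknown Pid: leave the placeholder untouched

    return PID_TOKEN.sub(substitute, description)


print(fill_pages("father: see page {Pid:7}", {7: 12}))
# father: see page 12
```

Deferring the substitution means the parallel threads only need to agree on identifiers, not on final page numbers, while the text is being assembled.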
In one embodiment, a flag bit is set for each created thread, the flag bit corresponds to the generation number of the persons handled by the thread, and the large genealogy lineage diagram generating module 205 is specifically configured to:
superimpose the generated pictures in the order given by the flag bit of each thread to form the large genealogy lineage diagram.
In one embodiment, the apparatus further includes a character scheduling module, configured to, after calculating the page number page:
and processing the data to be processed by adopting a preset scheduling algorithm.
Since the apparatus described in the second embodiment of the present invention is an apparatus used for implementing the method for generating a large family genealogy lineage chart through parallel processing in the first embodiment of the present invention, a person skilled in the art can understand the specific structure and the modification of the apparatus based on the method described in the first embodiment of the present invention, and thus the details are not described herein. All the devices adopted in the method of the first embodiment of the present invention belong to the protection scope of the present invention.
Example three
Based on the same inventive concept, the present application further provides a computer-readable storage medium 300, please refer to fig. 7, on which a computer program 311 is stored, which when executed implements the method in the first embodiment.
Since the computer-readable storage medium introduced in the third embodiment of the present invention is a computer-readable storage medium used for implementing the method for generating a large family genealogy lineage diagram processed in parallel in the first embodiment of the present invention, based on the method introduced in the first embodiment of the present invention, persons skilled in the art can understand the specific structure and modification of the computer-readable storage medium, and thus, details are not described here. Any computer readable storage medium used in the method of the first embodiment of the present invention falls within the intended scope of the present invention.
Example four
Based on the same inventive concept, the present application further provides a computer device. Referring to FIG. 8, the device includes a memory 401, a processor 402, and a computer program 403 stored in the memory and executable on the processor; when the processor 402 executes the program, the method in the first embodiment is implemented.
Since the computer device introduced in the fourth embodiment of the present invention is a computer device used for implementing the method for generating a large family genealogy lineage diagram processed in parallel in the first embodiment of the present invention, based on the method introduced in the first embodiment of the present invention, persons skilled in the art can understand the specific structure and deformation of the computer device, and thus details are not described here. All the computer devices used in the method in the first embodiment of the present invention are within the scope of the present invention.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (6)

1. A method for generating a parallel-processed large family genealogy lineage diagram is characterized by comprising the following steps:
step S1: reading the characters in the generation family character table into a memory from a database, and storing the characters in the generation family character table into a data structure of the memory to form an sxt data table, wherein the sxt data table stores the characters and the corresponding generation numbers;
step S2: creating a plurality of threads to process and analyze the logical relationship between the characters according to the stored characters and the corresponding generation numbers, and calculating the position information of the characters contained in each generation based on the logical relationship between the characters, the size of a preset generated page and the space occupied by the previous generation;
step S3: writing the character information into the corresponding position of the corresponding HTML template according to the calculated position information of the character to form an HTML file;
step S4: merging the HTML files by using a browser kernel, performing style typesetting on the merged HTML files, and generating a picture corresponding to each HTML file;
step S5: repeatedly overlapping the generated pictures to form a large family genealogy graph;
in step S2, creating multiple threads to process and analyze the logical relationship between the characters according to the stored characters and the corresponding generation numbers, specifically including:
partitioning the sxt data table according to generation number, wherein each sxt data table partition corresponds to one generation;
creating a separate thread to process each sxt data table partition;
in step S2, based on the logical relationship between the people, the size of the preset generated page, and the space occupied by the previous generation, calculating the position information of the people included in each generation, which specifically includes:
representing the page number corresponding to each character by a Pid identifier to obtain the information in the sxt data table fields corresponding to each character;
the position information of the characters included in each generation is calculated using the following formula:
Figure FDA0002777353060000011
where p denotes the current page number, c_y denotes the number of pages occupied by the character data of the current generation, n denotes the rank of the character within its generation, m denotes the number of rows of the generated page, b_{y-1} denotes the number of blank rows on the last page of the previous generation, and page denotes the calculated page number;
and replacing the Pid identifier with the calculated page number page to obtain the position information of the character in the lineage diagram.
2. The method as claimed in claim 1, wherein a flag bit is set for each created thread, the flag bit corresponds to the generation number of the characters in the thread, and step S5 specifically includes:
superimposing the generated pictures in the order given by the flag bit of each thread to form the large genealogy lineage diagram.
3. The method of claim 1, wherein after calculating the number of pages page, the method further comprises:
and processing the data to be processed by adopting a preset scheduling algorithm.
4. An apparatus for generating a parallel-processed large family genealogy lineage diagram, comprising:
an sxt data table forming module, which is used for reading the characters in the lineage character table from the database into the memory and storing the characters in a data structure of the memory to form an sxt data table, wherein the sxt data table stores the characters and the corresponding generation numbers;
the character position calculation module is used for creating a plurality of threads to process and analyze the logical relationship between the characters according to the stored characters and the corresponding generation numbers, and calculating the position information of the characters contained in each generation based on the logical relationship between the characters, the size of a preset generated page and the space occupied by the previous generation;
the HTML file generation module is used for writing the character information into the corresponding position of the corresponding HTML template according to the calculated position information of the character to form an HTML file;
the picture generation module is used for merging the HTML files by utilizing the browser kernel, performing style typesetting on the merged HTML files and generating pictures corresponding to each HTML file;
the large-scale genealogy lineage diagram generating module is used for repeatedly superposing the generated pictures to form a large-scale genealogy lineage diagram;
wherein the character position calculation module is specifically configured to:
partition the sxt data table according to generation number, wherein each sxt data table partition corresponds to one generation;
create a separate thread to process each sxt data table partition;
the preset size of the generated page is the number of rows included in the generated page, the logical relationship between the characters comprises the ranking of each character in the generation, and the character position calculation module is further configured to:
represent the page number corresponding to each character by a Pid identifier to obtain the information in the sxt data table fields corresponding to each character;
calculate the position information of the characters included in each generation using the following formula:
Figure FDA0002777353060000031
where p denotes the current page number, c_y denotes the number of pages occupied by the character data of the current generation, n denotes the rank of the character within its generation, m denotes the number of rows of the generated page, b_{y-1} denotes the number of blank rows on the last page of the previous generation, and page denotes the calculated page number;
and replace the Pid identifier with the calculated page number page to obtain the position information of the character in the lineage diagram.
5. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed, implements the method of any one of claims 1 to 3.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 3 when executing the program.
CN201910339580.XA 2019-04-25 2019-04-25 Parallel processing generation method and device for large genealogy lineage diagram Active CN110110270B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910339580.XA CN110110270B (en) 2019-04-25 2019-04-25 Parallel processing generation method and device for large genealogy lineage diagram

Publications (2)

Publication Number Publication Date
CN110110270A CN110110270A (en) 2019-08-09
CN110110270B true CN110110270B (en) 2021-01-15

Family

ID=67486720

Country Status (1)

Country Link
CN (1) CN110110270B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant