CN109241012B - Sample entry method, device, computer equipment and storage medium


Publication number
CN109241012B
Authority
CN
China
Prior art keywords
file
files
group
compressed package
picture
Legal status
Active
Application number
CN201811187254.3A
Other languages
Chinese (zh)
Other versions
CN109241012A (en
Inventor
陈林
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201811187254.3A
Publication of CN109241012A
Application granted
Publication of CN109241012B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a sample entry method, a sample entry device, computer equipment and a storage medium. The method comprises the following steps: receiving a sample entry request sent by a client, and acquiring the picture files and compressed package files contained in the sample entry request; detecting the file size of each compressed package file, and dividing any compressed package file whose size exceeds a preset threshold to obtain sub-package files; respectively acquiring basic information of the picture files, the compressed package files and the sub-package files, and storing the basic information into a sample database; forming the picture files into a picture file group, and forming the compressed package files and the sub-package files into a compressed package file group; and selecting a preset number of files from each group in turn at preset time intervals and sending the files to the storage platform until all the files in each group are sent. The technical scheme of the invention solves the problem that, when a large amount of sample data is entered, the server is momentarily overloaded and cannot respond normally to entry requests from the client.

Description

Sample entry method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of information processing, and in particular, to a sample entry method, a sample entry device, a computer device, and a storage medium.
Background
The face sample library in a face recognition system is the basis for model training and face recognition. It consists of face pictures of different people, and a large face sample library can contain millions of face pictures. The larger the face sample library, the higher the probability that a face picture can be recognized against it, and the wider the coverage of the trained model.
Before a face sample library suitable for model training and face recognition is generated, the collected face pictures must undergo a series of preprocessing steps. The server that receives the sample pictures faces a large number of sample entry requests initiated by clients; that is, a large number of picture files or picture compressed packages are uploaded to the server. After receiving the sample entry requests, the server stores the picture files or compressed package files, whose data volume is huge, to the back-end storage system through the network, which consumes a great deal of network resources. Meanwhile, clients keep submitting new entry requests to be processed, so the server runs at full load or even overload, and in severe cases goes down and can no longer respond normally to client entry requests.
Disclosure of Invention
The embodiment of the invention provides a sample entry method, a sample entry device, computer equipment and a storage medium, which are used for solving the problem that, when a large amount of sample data is entered, the server is momentarily overloaded and cannot respond normally to client entry requests.
A sample entry method, comprising:
receiving a sample entry request sent by a client, and acquiring a picture file and a compressed package file contained in the sample entry request;
detecting the file size of the compressed package file, and marking the compressed package file with the file size exceeding a preset threshold as a target file;
dividing the target file according to the preset threshold value to obtain sub-package files;
respectively acquiring basic information of the picture file, the compressed package file and the sub-package file, and storing the basic information into a sample database, wherein the sample database is used for storing a profile of the sample data to be entered;
forming the picture files into a picture file group, and forming the compressed package file and the sub-package file into a compressed package file group;
and selecting a preset number of files from the picture file group and the compressed package file group in turn at preset time intervals, and sending the files to a storage platform until all the files in the picture file group and the compressed package file group are sent.
A sample entry device, comprising:
the detection module is used for detecting the file size of the compressed package file and marking the compressed package file with the file size exceeding a preset threshold value as a target file;
the segmentation module is used for dividing the target file according to the preset threshold value to obtain sub-package files;
the information extraction module is used for respectively acquiring basic information of the picture file, the compressed package file and the sub-package file and storing the basic information into a sample database, wherein the sample database is used for storing a profile of the sample data to be entered;
the grouping module is used for forming the picture files into a picture file group and forming the compressed package files and the sub-package files into a compressed package file group;
and the sending module is used for selecting a preset number of files from the picture file group and the compressed package file group in turn at preset time intervals and sending the files to the storage platform until all the files in the picture file group and the compressed package file group are sent.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the sample entry method described above when the computer program is executed.
A computer readable storage medium storing a computer program which when executed by a processor implements the steps of the sample entry method described above.
With the sample entry method, the sample entry device, the computer equipment and the storage medium described above, a sample entry request sent by a client is received and the file size of each compressed package file in the request is detected. A compressed package file exceeding the preset threshold is marked as a target file and divided into a plurality of sub-package files. After the basic information of the picture files, the compressed package files and the sub-package files is stored in the sample database, the picture files are formed into a picture file group, and the compressed package files and the sub-package files are formed into a compressed package file group. A preset number of files are then selected from the picture file group and the compressed package file group in turn at preset time intervals and sent to the storage platform until all the files in both groups have been sent. In this way, when the server processes a large number of sample entry requests, especially ones containing large compressed package files, the entry task is decomposed into small tasks that are sent in batches in a time-shared manner, which spreads the processing load of the server, reduces its instantaneous burden, and facilitates stable operation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of an application environment of a sample entry method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a sample entry method in an embodiment of the invention;
FIG. 3 is a flowchart of step S6 in a sample entry method according to an embodiment of the present invention;
FIG. 4 is a flow chart of dividing threads in a predetermined thread pool into two groups according to a second predetermined grouping ratio in a sample entry method according to an embodiment of the present invention;
FIG. 5 is a flowchart of step S64 in a sample entry method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a sample entry device in accordance with an embodiment of the invention;
FIG. 7 is a schematic diagram of a computer device in accordance with an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The sample entry method provided by the application can be applied to the application environment shown in FIG. 1. The client is a computer terminal device or a virtual terminal, and may be a browser, a mobile phone APP, a PC host and the like; the server may be a single server or a cluster of servers. The storage platform may be a server, a database, etc. The client and the server are connected through a network, which may be wired or wireless. The server responds to a sample entry request initiated by the client, divides the sample information in the sample entry request into groups, and forwards the groups to the storage platform. The sample entry method provided by the embodiment of the invention is applied to the server.
In an embodiment, as shown in fig. 2, a sample entry method is provided, and a specific implementation procedure of the sample entry method includes the following steps:
S1: receiving a sample entry request sent by the client, and acquiring the picture file and the compressed package file contained in the sample entry request.
The sample entry request comprises a picture file and a compressed package file, wherein the picture file comprises, but is not limited to, a picture file with bmp, jpg, png and the like as suffixes; the compressed package files include, but are not limited to, compressed package files with rar, zip, 7za, tar, gz, etc. as suffixes, and may include picture files or other sample files, such as voice files, etc.
The server side can receive a sample input request sent by the client side through a Web page or a special Socket channel.
Specifically, taking reception of the sample entry request through a Web page as an example, a Web server, such as one of the currently mainstream Web servers Apache, Nginx or IIS, is deployed on the server in advance. The server provides the client with a sample file submission interface through the Web page, and obtains the picture file and the compressed package file when it receives the sample entry request.
Apache, i.e. Apache HTTP Server, is an open-source web server of the Apache Software Foundation that runs on most computer operating systems and is widely used because it is cross-platform and secure; Nginx, i.e. engine x, is a lightweight Web server/reverse proxy server and email proxy server released under a BSD-like license; IIS is a web server developed by Microsoft, mainly used for running ASP applications.
S2: detecting the file size of the compressed package file, and marking the compressed package file with the file size exceeding a preset threshold value as a target file.
The preset threshold is the basis on which the server measures the size of a compressed package file. Preferably, the preset threshold is 100MB; that is, a compressed package file larger than 100MB is marked as a target file by the server.
Specifically, if the server runs in a Linux system environment, the server may use a du command to query the size of each file, and indicate the queried file size in terms of bytes, where the specific command is as follows:
du -b filename
where filename is the name of the file to be queried and -b expresses the file size in bytes.
Specifically, the server queries the sizes of all the compressed package files in a given directory by calling the du command in a loop. If the size of a compressed package file exceeds 100MB, its file name is stored in a temporary array, which thus marks the compressed package files whose file size exceeds 100MB.
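As an illustration only, the marking in step S2 can be sketched in Java. The class and method names below (TargetFileMarker, markTargets) are hypothetical, the file sizes are passed in as a map so the sketch stays self-contained, and 100MB is assumed to mean 100 * 1024 * 1024 bytes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating step S2; not part of the described system.
final class TargetFileMarker {
    // Preset threshold of 100MB, interpreted here (an assumption) as 100 * 1024 * 1024 bytes.
    static final long THRESHOLD_BYTES = 100L * 1024 * 1024;

    // Returns, in iteration order, the names of files whose size strictly exceeds the threshold.
    static List<String> markTargets(Map<String, Long> sizeByName, long thresholdBytes) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, Long> entry : sizeByName.entrySet()) {
            if (entry.getValue() > thresholdBytes) {
                targets.add(entry.getKey());
            }
        }
        return targets;
    }
}
```

In a real deployment the sizes could be read with java.nio.file.Files.size(Path) instead of the du command; a file whose size equals the threshold exactly is not marked, since the text requires the size to exceed it.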
S3: dividing the target file according to the preset threshold value to obtain sub-package files.
The preset threshold is not only the basis on which the server measures the size of a compressed package file, but also the basis on which the server divides the target file. The small files obtained by dividing the target file are called sub-package files.
Specifically, if the server runs in a Linux system environment, the server may divide the compressed package with the Linux split command:
split -b 100m filename
where filename is the name of the file to be split, -b splits the file by byte count, and 100m is the size of each piece, representing 100MB.
For example, when a 2.32GB target file is split with the split command into sub-packages of 100MB each, 24 sub-package files are obtained (taking 1GB as 1024MB, 2.32GB is about 2375.68MB, i.e. 23 full sub-packages plus a remainder); the final piece, which is smaller than 100MB, also counts as a separate sub-package file.
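The sub-package count in the example above is a ceiling division, which the following sketch makes explicit. SubPackagePlanner and its methods are hypothetical names, and sizes are assumed to be given in bytes with 1MB taken as 1024 * 1024 bytes.

```java
// Hypothetical helper illustrating the arithmetic of step S3.
final class SubPackagePlanner {
    // Number of pieces produced when a file is cut into chunks of at most chunkBytes.
    static long countSubPackages(long fileBytes, long chunkBytes) {
        if (fileBytes < 0 || chunkBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and chunk size positive");
        }
        // Ceiling division: a partial final chunk still counts as one sub-package.
        return (fileBytes + chunkBytes - 1) / chunkBytes;
    }

    // Size of the final piece; equals chunkBytes when the file divides evenly.
    static long lastSubPackageBytes(long fileBytes, long chunkBytes) {
        long remainder = fileBytes % chunkBytes;
        return (remainder == 0 && fileBytes > 0) ? chunkBytes : remainder;
    }
}
```

For a file of 2491081031 bytes (about 2.32GB) and a chunk size of 104857600 bytes (100MB), countSubPackages gives 24, matching the example: 23 full sub-packages and one smaller final sub-package.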
S4: respectively acquiring basic information of the picture file, the compressed package file and the sub-package file, and storing the basic information into a sample database, wherein the sample database is used for storing the profile of the sample data to be entered.
The basic information of the picture file, the compressed package file and the sub-package file is a basic outline of the sample files submitted by the client in the current sample entry request. It includes the numbers of picture files, compressed package files and sub-package files, the identification number of the sample entry request to which they belong, their file sizes, and the like.
Specifically, for a picture file, the basic information to be saved includes: client identification information, sample picture identification information, sample picture original size, sample picture type, sample picture file size, and so forth. The client identification information identifies the client that initiated the sample entry request; the server allocates one piece of identification information to each client so as to distinguish different clients. The sample picture identification information is the identification information allocated to the sample picture by the server, which can allocate it to the sample picture files sequentially in the order in which they are received.
For a compressed package file, the basic information to be saved includes: client identification information, compressed package identification information, compressed package size, the number of sub-package files obtained after the compressed package is divided, and the like. The client identification information is the same as that of the picture file; the compressed package identification information is the identification information allocated by the server to a compressed package whose file size is larger than 100MB.
Sample databases include, but are not limited to, various relational and non-relational databases, such as MS-SQL, Oracle, MySQL, Sybase, DB2, Redis, MongoDB, HBase, and the like. The sample database may be local to the server or may be connected to the server through a network, which may be wired or wireless. The sample database backs up the basic information of the picture files, the compressed package files and the sub-package files, so that the server can resend them if sending them to the storage platform fails.
The profile of the sample data is an information summary of the sample data to be entered, including the basic information of the picture files, the compressed package files and the sub-package files. Before the picture files, the compressed package files and the sub-package files are sent to the storage platform, the server saves the basic information to the sample database, thereby backing up the profile information of the sample data to be entered. The server can then compare the sample data sent to the storage platform against the basic information in the sample database to verify whether all the sample data to be entered has been sent to the storage platform.
Specifically, the server establishes a database for sample entry requests in the sample database through JDBC, and establishes two tables for each sample entry request: a file summary information table and a compressed package information table. The file summary information table stores the number, type, size and the like of all files submitted in the sample entry request; the compressed package information table stores the size of the compressed package files, the number of sub-package files, the file size of the sub-package files and the like. JDBC (Java Database Connectivity) is a Java API for executing SQL statements; it provides unified access to various relational databases and consists of a set of classes and interfaces written in the Java language. JDBC provides a baseline on which higher-level tools and interfaces can be built, enabling database developers to write database applications. An interface program written through JDBC works with different databases without being rewritten for each one, which greatly improves development efficiency.
S5: forming the picture files into a picture file group, and forming the compressed package file and the sub-package file into a compressed package file group.
The server groups the picture files, the compressed package files and the sub-package files.
Specifically, the server establishes two temporary arrays: one array stores the file names of all picture files in the sample entry request; the other stores the file names of all compressed package files in the sample entry request together with the file names of all sub-package files.
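The grouping in step S5 might be sketched as follows. This is a minimal sketch under assumptions: FileGrouper is a hypothetical name, pictures are recognized by the bmp, jpg and png suffixes named in step S1, and every other file name (compressed packages and sub-packages, which may carry no suffix at all) is placed in the compressed package file group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper illustrating step S5.
final class FileGrouper {
    // Picture suffixes taken from the examples in step S1; the real list may be longer.
    static final Set<String> PICTURE_SUFFIXES = new HashSet<>(Arrays.asList("bmp", "jpg", "png"));

    // A file counts as a picture when its suffix, compared case-insensitively, is in the set.
    static boolean isPicture(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot >= 0
                && PICTURE_SUFFIXES.contains(fileName.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    // Returns two lists: index 0 is the picture file group, index 1 the compressed package file group.
    static List<List<String>> group(List<String> fileNames) {
        List<String> pictureGroup = new ArrayList<>();
        List<String> compressedGroup = new ArrayList<>();
        for (String name : fileNames) {
            (isPicture(name) ? pictureGroup : compressedGroup).add(name);
        }
        List<List<String>> groups = new ArrayList<>();
        groups.add(pictureGroup);
        groups.add(compressedGroup);
        return groups;
    }
}
```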
S6: selecting a preset number of files from the picture file group and the compressed package file group in turn at preset time intervals, and sending the files to the storage platform until all the files in the picture file group and the compressed package file group are sent.
The storage platform is a computer terminal or virtual terminal for storing various sample files in the sample entry request. The storage platform can be a server or a server cluster, and can also be a cloud storage platform.
The preset time interval is a time interval for the server to execute the timing task. Preferably, the preset time interval is 1 second.
The timing task executed by the server side is used for selecting a preset number of files from the picture file group and the compressed package file group in turn and sending the files to the storage platform until all the files in the picture file group and the compressed package file group are sent. The preset number refers to the number of files transmitted simultaneously when the server transmits the files to the storage platform each time. The preset number can be determined according to the size of the files in each group, that is, the smaller the average size of the files in the group is, the larger the preset number is, otherwise, the larger the average size of the files in the group is, the smaller the preset number is, and the minimum value of the preset number is 1. For example, if the sizes of the picture files in the picture file group are all smaller than 1MB, the preset number may be more than 5; if the files in the compressed package file group are greater than 100MB, the preset number is 1.
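The text fixes only the direction of the relationship (smaller average file size, larger preset number, with a minimum of 1) and not a formula. One possible rule, shown purely as an assumption, divides a per-send byte budget by the average file size; BatchSizer, presetNumber and the budget parameter are hypothetical.

```java
// Hypothetical rule for choosing the preset number; the text does not prescribe a formula.
final class BatchSizer {
    // Returns how many files to send at once: budget divided by average size, never less than 1.
    static int presetNumber(long averageFileBytes, long budgetBytes) {
        if (averageFileBytes <= 0 || budgetBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long n = budgetBytes / averageFileBytes;
        return (int) Math.max(1L, Math.min(n, (long) Integer.MAX_VALUE));
    }
}
```

With an assumed budget of 100MB, files averaging more than 100MB yield a preset number of 1, while picture files averaging under 1MB yield a preset number well above 5, which is consistent with the examples in the text.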
Specifically, if the server runs in a Java environment, a timing task may be started by means of the TimerTask class, and at 1-second intervals one file is selected in turn from the picture file group and the compressed package file group and sent to the storage platform. For example, at second 1 one file is selected from the picture file group and sent to the storage platform, at second 2 one file is selected from the compressed package file group and sent, at second 3 one file is again selected from the picture file group and sent, and so on until all files in the picture file group and the compressed package file group have been sent.
Taking the 2.32GB target file in step S3 as an example, it has been divided into 24 sub-package files. Because the timing task alternates between the two groups, all sub-package files, 2.32GB in total, will have been sent to the storage platform after 48 seconds; the other 24 of those 48 seconds are used by the timing task to send picture files.
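The alternation described above can be simulated without a real timer to check the 48-second figure. In this sketch each list entry stands for one tick of the timing task; AlternatingSender and plan are hypothetical names, and it is an assumption that a tick belonging to an already empty group is simply left idle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical simulation of the alternating timing task in step S6.
final class AlternatingSender {
    // Returns one entry per tick; entry i lists the files sent at tick i + 1 (possibly none).
    static List<List<String>> plan(List<String> pictures, int pictureBatch,
                                   List<String> archives, int archiveBatch) {
        if (pictureBatch < 1 || archiveBatch < 1) {
            throw new IllegalArgumentException("preset number must be at least 1");
        }
        Deque<String> pictureQueue = new ArrayDeque<>(pictures);
        Deque<String> archiveQueue = new ArrayDeque<>(archives);
        List<List<String>> ticks = new ArrayList<>();
        boolean pictureTurn = true; // the text's example starts with the picture file group
        while (!pictureQueue.isEmpty() || !archiveQueue.isEmpty()) {
            Deque<String> source = pictureTurn ? pictureQueue : archiveQueue;
            int batch = pictureTurn ? pictureBatch : archiveBatch;
            List<String> sent = new ArrayList<>();
            for (int i = 0; i < batch && !source.isEmpty(); i++) {
                sent.add(source.poll());
            }
            ticks.add(sent);
            pictureTurn = !pictureTurn;
        }
        return ticks;
    }
}
```

With 24 sub-package files, a preset number of 1 and a 1-second interval, the last sub-package is sent on tick 48, which agrees with the 48 seconds stated in the text.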
In this embodiment, a sample entry request sent by a client is received and the file size of each compressed package file in the request is detected. A compressed package file exceeding the preset threshold is marked as a target file and divided into a plurality of sub-package files. After the basic information of the picture files, the compressed package files and the sub-package files is stored in the sample database, the picture files are formed into a picture file group, and the compressed package files and the sub-package files are formed into a compressed package file group. A preset number of files are then selected from the picture file group and the compressed package file group in turn at preset time intervals and sent to the storage platform until all the files in both groups have been sent. In this way, when the server processes a large number of sample entry requests, especially ones containing large compressed package files, the entry task is decomposed into small tasks that are sent in batches in a time-shared manner, which spreads the processing load of the server, reduces its instantaneous burden, and facilitates stable operation.
Further, in an embodiment, as shown in FIG. 3, step S6, that is, selecting a preset number of files from the picture file group and the compressed package file group in turn at preset time intervals and sending them to the storage platform until all files in both groups have been sent, may specifically include the following steps:
S61: detecting the number of files in the picture file group and in the compressed package file group respectively.
File names are stored in the picture file group and the compressed package file group, and the server counts the array elements in the two arrays to obtain the number of files in each group.
Specifically, if the server runs in a Java environment, the server may read the built-in length attribute of each array to obtain its length, and thereby the number of files in the picture file group and the compressed package file group.
S62: if the sum of the numbers of compressed package files and sub-package files is greater than or equal to the number of picture files, dividing the threads in the preset thread pool into two groups according to a first preset grouping proportion to obtain a first thread group and a second thread group.
Compressed package files and the sub-package files obtained by division are larger than picture files. If the sum of the numbers of compressed package files and sub-package files is greater than or equal to the number of picture files, files with high network bandwidth requirements are in the majority. The server then divides the threads in the preset thread pool into two groups according to the first preset grouping proportion, and the two groups of threads are used for sending the files in the picture file group and in the compressed package file group respectively.
The thread is the minimum unit of program execution flow, is an entity in the process, and is the basic unit independently scheduled and allocated by the system. The preset thread pool is a collection of threads that have been created by the system. After the thread is executed, the server side does not destroy the thread immediately, but returns the thread to the thread pool. In this way, taking threads from a thread pool can avoid the overhead of frequently creating and destroying threads.
The first preset grouping proportion is the proportion by which the threads in the preset thread pool are grouped, so that threads in different thread groups execute different tasks without interfering with one another. For example, the first preset grouping proportion may be adjusted between 7:3 and 9:1; that is, when the sum of the numbers of compressed package files and sub-package files is greater than or equal to the number of picture files, the threads in the preset thread pool are divided into a first thread group and a second thread group whose thread-number ratio lies between 7:3 and 9:1.
Specifically, the server compares the sizes of the two groups; if the sum of the numbers of compressed package files and sub-package files is greater than or equal to the number of picture files, the server marks the threads taken from the preset thread pool so that the ratio of the number of threads in the first thread group to that in the second thread group equals the first preset grouping proportion.
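How a pool of a given size is divided by a proportion such as 7:3 can be sketched as below. ThreadGrouping and split are hypothetical names; rounding to the nearest whole thread and keeping at least one thread in each group are assumptions, since the text does not say how fractional results are handled.

```java
// Hypothetical helper illustrating the proportional split of a thread pool.
final class ThreadGrouping {
    // Returns {first group size, second group size} for the proportion firstWeight:secondWeight.
    static int[] split(int totalThreads, int firstWeight, int secondWeight) {
        if (totalThreads < 2 || firstWeight <= 0 || secondWeight <= 0) {
            throw new IllegalArgumentException("need at least 2 threads and positive weights");
        }
        long rounded = Math.round((double) totalThreads * firstWeight / (firstWeight + secondWeight));
        // Clamp so that neither group is left without a thread.
        int first = (int) Math.max(1L, Math.min((long) totalThreads - 1, rounded));
        return new int[] { first, totalThreads - first };
    }
}
```

For a pool of 10 threads, a proportion of 7:3 gives groups of 7 and 3 threads, and 9:1 gives groups of 9 and 1.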
S63: calling the threads in the first thread group, sequentially selecting a preset number of files from the picture file group, and sending the files to the storage platform until all the files in the picture file group are sent.
The threads in the first thread group are dedicated to sending picture files. Because picture files are small and do not need to be sent in a time-shared manner by a timing task, the threads in the first thread group sequentially select a preset number of files from the picture file group and send them to the storage platform until all the files in the picture file group have been sent. The preset number is the number of files sent simultaneously each time the server sends files to the storage platform; it is the same as in step S6 and is not described here again.
Specifically, the server takes files out of the picture file group one after another in a loop and sends them to the storage platform until all the files in the picture file group are sent.
S64: calling the threads in the second thread group, sequentially selecting a preset number of files from the compressed package file group at preset time intervals, and sending the files to the storage platform until all the files in the compressed package file group are sent.
The threads in the second thread group are dedicated to sending compressed package files or sub-package files. The preset time interval and the preset number are defined in step S6 and are not described here again.
Specifically, taking a server running in a Java environment as an example, the server may start a timer through the scheduleAtFixedRate() method of the java.util.Timer class, for example:
timer.scheduleAtFixedRate(task, delay, intervalPeriod);
where task represents the task function that sends a compressed package file or a sub-package file, delay represents the delay before the first execution, and intervalPeriod represents the time interval, here 1 second.
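A fuller, runnable version of the call above might look as follows. It is a sketch under assumptions: TimedArchiveSender and sendAll are hypothetical names, sending to the storage platform is simulated by appending the file name to a list, and the delay and interval are given in milliseconds as java.util.Timer expects.

```java
import java.util.List;
import java.util.Queue;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of step S64: one queued file is "sent" per timer period.
final class TimedArchiveSender {
    // Blocks until every file has been handed to the simulated storage platform, then returns them in order.
    static List<String> sendAll(List<String> files, long delayMs, long intervalPeriodMs) {
        final Queue<String> pending = new ConcurrentLinkedQueue<>(files);
        final List<String> sent = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        final Timer timer = new Timer("archive-sender", true); // daemon thread
        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                String next = pending.poll();
                if (next != null) {
                    sent.add(next); // stand-in for the real transfer to the storage platform
                }
                if (pending.isEmpty()) {
                    cancel();       // stop further executions of this task
                    done.countDown();
                }
            }
        };
        timer.scheduleAtFixedRate(task, delayMs, intervalPeriodMs);
        try {
            long limitMs = delayMs + files.size() * intervalPeriodMs + 5000;
            if (!done.await(limitMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("timed out waiting for the timer task");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            timer.cancel();
        }
        return sent;
    }
}
```

Calling sendAll with a delay of 0 and an interval of 1000 corresponds to the 1-second interval in the text; a shorter interval is convenient when experimenting.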
In this embodiment, the threads in the preset thread pool are divided into two groups in proportion according to the relationship between the number of files in the compressed package file group and the number of files in the picture file group, so that each group of threads executes its sending task without interfering with the other, the allocation of thread resources is optimized, and the response capability of the server is improved.
Further, in an embodiment, as shown in FIG. 4, after step S61 and before step S63, that is, after the number of files in the picture file group and in the compressed package file group is detected, and before the threads in the first thread group are called to sequentially select a preset number of files from the picture file group and send them to the storage platform until all files in the picture file group have been sent, the sample entry method may further include the following steps:
S65: if the sum of the file numbers of the compressed package files and the sub package files is smaller than the file number of the picture files, calculating the sum of the file sizes of all files in the compressed package file group to obtain the sum of the file sizes of the compressed package file group, and calculating the sum of the file sizes of all files in the picture file group to obtain the sum of the file sizes of the picture file group.
For the case that the sum of the file numbers of the compressed package files and the sub package files is smaller than the file number of the picture files, the server side determines the thread grouping proportion in the preset thread pool according to the relation between the file sizes of the compressed package file group and the picture file group.
The server obtains the file size of each file in the compressed package file group and adds these sizes to obtain the sum of the file sizes of the compressed package file group; likewise, it obtains the file size of each file in the picture file group and adds these sizes to obtain the sum of the file sizes of the picture file group.
Specifically, for each of the two arrays, the server obtains the size of each file in the array through the du command, and then accumulates the sizes to obtain the sum of the file sizes of the compressed package file group and the sum of the file sizes of the picture file group.
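The accumulation described above can be sketched in Java as follows. This is only an illustration under stated assumptions: it uses java.nio.file.Files.size (logical byte length) in place of the du command named in the text, which reports disk usage, and the class name FileGroupSize and method name totalSize are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class FileGroupSize {
    // Hypothetical helper: sums the size in bytes of every file in one file group.
    // Assumption: Files.size stands in for the du command mentioned in the text.
    static long totalSize(List<Path> group) {
        long total = 0L;
        for (Path file : group) {
            try {
                total += Files.size(file);
            } catch (IOException e) {
                // Surface unreadable files instead of silently skipping them.
                throw new UncheckedIOException(e);
            }
        }
        return total;
    }
}
```

Calling totalSize once for the compressed package file group and once for the picture file group yields the two sums that step S66 compares.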
S66: if the sum of the file sizes of the compressed package file group is smaller than the sum of the file sizes of the picture file group, dividing the threads in the preset thread pool into two groups according to a second preset grouping proportion to obtain a first thread group and a second thread group.
The second preset grouping proportion is a proportion used to group the threads in the preset thread pool, so that the threads in different thread groups execute different tasks without interfering with each other. The second preset grouping proportion may be set between 3:7 and 5:5. That is, when the sum of the file numbers of the compressed package files and the sub-package files is smaller than the file number of the picture files, and the sum of the file sizes of the compressed package file group is smaller than the sum of the file sizes of the picture file group, the server divides the threads in the preset thread pool into two groups according to the second preset grouping proportion to obtain a first thread group and a second thread group, where the ratio of the number of threads in the first thread group to the number of threads in the second thread group is the second preset grouping proportion.
Specifically, if the sum of the file sizes of the compressed package file group is smaller than the sum of the file sizes of the picture file group, the server marks one thread taken out from the preset thread pool, so that the thread number ratio of the first thread group and the second thread group is a second preset grouping ratio.
In this embodiment, for the case that the sum of the file numbers of the compressed package file and the sub package file is smaller than the file number of the picture file, the server groups the threads in the preset thread pool according to the size relationship between the sum of the file sizes in the compressed package file group and the sum of the file sizes in the picture file group, so that the threads in the groups respectively process the sending tasks, the configuration of the thread resources is more optimized, and the response capability of the server is further improved.
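As an illustration of dividing a thread pool by a grouping proportion, the following minimal sketch splits a thread count into two group sizes for a ratio a:b. The rounding rule, the requirement that each group keeps at least one thread, and the names ThreadGroupSplit and split are assumptions made for this example; the text does not specify how fractional results are handled.

```java
final class ThreadGroupSplit {
    // Splits `total` threads into two groups whose sizes approximate the ratio a:b.
    // Returns {firstGroupSize, secondGroupSize}.
    // Assumption: round to the nearest integer and keep at least one thread per group.
    static int[] split(int total, int a, int b) {
        if (total < 2 || a <= 0 || b <= 0) {
            throw new IllegalArgumentException("need total >= 2 and positive ratio terms");
        }
        int first = (int) Math.round((double) total * a / (a + b));
        first = Math.max(1, Math.min(total - 1, first));
        return new int[] {first, total - first};
    }
}
```

For a pool of 10 threads and a second preset grouping proportion of 3:7, split(10, 3, 7) assigns 3 threads to the first thread group and 7 to the second.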
Further, in an embodiment, as shown in fig. 5, step S64, that is, calling the threads in the second thread group, sequentially selecting a preset number of files from the compressed package file group at preset time intervals, and sending the files to the storage platform until all files in the compressed package file group have been sent, may specifically include the following steps:
S641: comparing the file size of each file in the compressed package file group with the preset network bandwidth size, and dividing the files in the compressed package file group into big package files and small package files according to the comparison result.
The preset network bandwidth refers to the network bandwidth between the server and the storage platform. Network bandwidth can be planned: a limited network bandwidth can be divided into different functional areas, each used to transmit a different kind of sample file, such as picture files, compressed package files, or sub-package files. For example, if the network bandwidth is 1000M, it can be planned as two parts, one for transmitting the files in the picture file group and one for transmitting the files in the compressed package file group. This prevents large compressed package files from occupying the full network bandwidth.
The server classifies the files in the compressed package file group according to the preset network bandwidth: files whose size is greater than or equal to the preset network bandwidth are big package files, and files whose size is smaller than the preset network bandwidth are small package files.
Specifically, the server compares the file size of each file in the compressed package file group with the preset network bandwidth in turn, and divides the files in the compressed package file group into big package files and small package files according to the comparison result.
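The classification in step S641 can be sketched as below. The sketch assumes that the preset network bandwidth has already been converted into a byte threshold (the text does not state the unit of comparison), and the names PackageClassifier, Result, and classify are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PackageClassifier {
    // Holds the two classes of files produced by the comparison.
    static final class Result {
        final List<String> big = new ArrayList<>();
        final List<String> small = new ArrayList<>();
    }

    // Files whose size is >= thresholdBytes are big package files; the rest are small.
    // Assumption: thresholdBytes is the preset network bandwidth expressed in bytes.
    static Result classify(Map<String, Long> sizesByName, long thresholdBytes) {
        Result result = new Result();
        for (Map.Entry<String, Long> entry : sizesByName.entrySet()) {
            if (entry.getValue() >= thresholdBytes) {
                result.big.add(entry.getKey());
            } else {
                result.small.add(entry.getKey());
            }
        }
        return result;
    }
}
```

The sizes of the two resulting lists are the quantities compared in step S642 when the threads in the second thread group are divided.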
S642: if the number of the large package files is greater than or equal to the number of the small package files, dividing the threads in the second thread group into two groups according to the first preset grouping proportion to obtain a thread group for processing the large package files and a thread group for processing the small package files.
The first preset grouping proportion is a proportion of grouping threads in the preset thread pool, and the definition of the first preset grouping proportion is the same as that in step S62, and will not be described herein.
The server divides the threads in the second thread group into two groups according to the first preset grouping proportion, and the two groups are respectively used for processing the big package file and the small package file.
Specifically, the first preset grouping proportion may be set between 7:3 and 9:1. That is, when the number of big package files is greater than or equal to the number of small package files, the server divides the threads in the second thread group into two groups to obtain a thread group for processing the big package files and a thread group for processing the small package files, where the ratio of the number of threads in the thread group for processing the big package files to the number of threads in the thread group for processing the small package files is the first preset grouping proportion.
S643: calling the threads in the thread group for processing the big package files, and sending the big package files to the storage platform at preset time intervals.
The threads in the thread group for processing the big package files are dedicated to sending the big package files to the storage platform at preset time intervals. The preset time interval is defined in step S6 and is not described again here.
Specifically, taking a server running in a Java environment as an example, the server may start a timer through the scheduleAtFixedRate() method of the Timer class in the Java standard library, for example:
timer.scheduleAtFixedRate(task,delay,intevalPeriod);
where task represents the task function that sends the big package file, delay represents the delay before the first execution, and intevalPeriod represents the time interval, here 1 second (the Timer class expresses both delay and period in milliseconds).
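A fuller version of the timer call above might look like the following sketch. It uses java.util.Timer and TimerTask as named in the text; sending exactly one file per tick, cancelling the timer when the queue is empty, and the names TimedSender, start, and send are assumptions made for illustration. The send callback is a placeholder for the real upload to the storage platform.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.Timer;
import java.util.TimerTask;
import java.util.function.Consumer;

final class TimedSender {
    // Starts a timer that sends one queued file per tick.
    // delay and intevalPeriod are in milliseconds, as required by java.util.Timer.
    static Timer start(List<String> files, long delay, long intevalPeriod,
                       Consumer<String> send) {
        Queue<String> pending = new ArrayDeque<>(files);
        Timer timer = new Timer("big-package-sender", true); // daemon timer thread
        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                String next = pending.poll();
                if (next == null) {
                    timer.cancel(); // every file has been sent, stop ticking
                    return;
                }
                send.accept(next); // placeholder for the upload to the storage platform
            }
        };
        timer.scheduleAtFixedRate(task, delay, intevalPeriod);
        return timer;
    }
}
```

With delay set to 0 and intevalPeriod set to 1000, the first big package file is sent immediately and each remaining one follows at one-second intervals.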
S644: calling the threads in the thread group for processing the small package files, and sequentially sending the small package files to the storage platform.
Because the file size of a small package file is smaller than the preset network bandwidth, the server does not need to send the small package files in a time-shared manner through a timed task; the threads in the thread group for processing the small package files are dedicated to sending the small package files to the storage platform one after another.
Specifically, the server takes the small package files out of the compressed package file group one by one in a loop and sends them to the storage platform until all the small package files in the compressed package file group have been sent.
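The loop described in step S644 can be sketched as follows. The use of a fixed thread pool from java.util.concurrent to represent the thread group, the one-minute wait, and the names SmallPackageSender, sendAll, and send are assumptions for this example rather than details given in the text.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class SmallPackageSender {
    // Hands every small package file, in list order, to a pool of `threads` workers
    // and waits until all uploads have finished. Returns false if the wait times out.
    static boolean sendAll(List<String> files, int threads, Consumer<String> send)
            throws InterruptedException {
        ExecutorService group = Executors.newFixedThreadPool(threads);
        for (String file : files) {
            // No timer is involved: each file is queued as soon as the loop reaches it.
            group.execute(() -> send.accept(file));
        }
        group.shutdown();
        return group.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```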
In this embodiment, the server divides the files in the compressed package file group into the big package file and the small package file according to the size relationship between the preset network bandwidth and each file in the compressed package file group, and subdivides the threads in the second thread group into two groups according to the number relationship between the big package file and the small package file and the first preset grouping proportion, so that the division of the work among the threads for processing the sample files is more reasonable, and the response speed of the server is further improved.
Further, in an embodiment, after step S641 and before step S643, that is, after the file size of each file in the compressed package file group has been compared with the preset network bandwidth size and the files have been divided into big package files and small package files according to the comparison result, and before the threads in the thread group for processing the big package files are called to send the big package files to the storage platform at preset time intervals, the sample entry method further includes:
if the number of the large package files is smaller than that of the small package files, dividing the threads in the second thread group into two groups according to a second preset grouping proportion to obtain a thread group for processing the large package files and a thread group for processing the small package files.
The second preset grouping ratio is a ratio of grouping threads in the preset thread pool, and the definition thereof is already defined in step S66, and will not be described herein.
If the number of the large package files is smaller than that of the small package files, the server divides the threads in the second thread group into two groups according to a second preset grouping proportion, and the two groups are respectively used for processing the large package files and the small package files.
Specifically, the second preset grouping ratio may be set between 3:7 and 5:5, that is, when the number of files of the big packet file is smaller than the number of files of the small packet file, the server divides the threads in the second thread group into two groups according to the second preset grouping ratio, so as to obtain a thread group for processing the big packet file and a thread group for processing the small packet file, where the thread number ratio of the thread group for processing the big packet file to the thread group for processing the small packet file is the second preset grouping ratio.
In this embodiment, the server further divides the second thread group according to a second preset grouping proportion, so that the thread grouping mode can cover the situation that the number of files of the big package file is smaller than that of the small package file, thereby further improving the response speed of the server.
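Taken together, step S642 and the step above select a grouping proportion from the relative counts of large and small package files. A minimal sketch of that selection follows; the concrete ratios 7:3 and 3:7 are example values picked from the ranges stated in the text, and the rounding rule and the names PackageThreadPlan and plan are assumptions.

```java
final class PackageThreadPlan {
    // Example values chosen from the ranges in the text (assumed, not prescribed).
    static final int[] FIRST_RATIO = {7, 3};  // first proportion, within 7:3 to 9:1
    static final int[] SECOND_RATIO = {3, 7}; // second proportion, within 3:7 to 5:5

    // Returns {largeGroupThreads, smallGroupThreads} for the second thread group.
    static int[] plan(int largeCount, int smallCount, int secondGroupThreads) {
        if (secondGroupThreads < 2) {
            throw new IllegalArgumentException("need at least two threads to split");
        }
        // More (or equally many) large files: use the first proportion, else the second.
        int[] ratio = largeCount >= smallCount ? FIRST_RATIO : SECOND_RATIO;
        int large = (int) Math.round(
                (double) secondGroupThreads * ratio[0] / (ratio[0] + ratio[1]));
        large = Math.max(1, Math.min(secondGroupThreads - 1, large)); // keep both groups non-empty
        return new int[] {large, secondGroupThreads - large};
    }
}
```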
Further, in an embodiment, for step S643, the step of calling the thread in the thread group for processing the big package file to send the big package file to the storage platform at a preset time interval may specifically include:
adjusting the preset time interval according to the relation between the number of large package files and the number of threads in the thread group for processing the large package files.
Specifically, if the number of large package files is smaller than the number of threads in the thread group for processing the large package files, the preset time interval may be set to 2 seconds; if the number of large package files is greater than or equal to the number of threads in the thread group for processing the large package files, the preset time interval may be set to between 0.5 seconds and 2 seconds.
Taking the method of starting the timer in step S64 as an example, the server counts the number of large package files and the number of threads in the thread group for processing the large package files, and then sets the value of the parameter intevalPeriod according to which of the two is larger.
In this embodiment, the server optimizes the value of the preset time interval according to the size relationship between the number of large package files and the number of threads in the thread group for processing large package files, so that the server executes according to the default preset time interval when the number of threads in the thread group for processing large package files is sufficient; under the condition that the number of threads in a thread group for processing the big package file is insufficient, a preset time interval value is reduced, and the time for processing each big package file is averagely allocated, so that the processing response of a server to the big package file is quickened.
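One possible way to derive the interval is sketched below. The text only gives the 2-second default and the 0.5 to 2 second range; scaling the default by the ratio of threads to files is an assumption introduced here for illustration, as are the names IntervalPolicy and intervalMillis.

```java
final class IntervalPolicy {
    static final long DEFAULT_INTERVAL_MS = 2000L; // 2 seconds, from the text
    static final long MIN_INTERVAL_MS = 500L;      // 0.5 seconds, from the text

    // Returns the timer period in milliseconds for sending large package files.
    static long intervalMillis(int largeFileCount, int largeGroupThreads) {
        if (largeGroupThreads < 1) {
            throw new IllegalArgumentException("thread group must not be empty");
        }
        if (largeFileCount < largeGroupThreads) {
            return DEFAULT_INTERVAL_MS; // enough threads: keep the default interval
        }
        // Assumed rule: shrink the interval in proportion to threads per file,
        // then clamp the result to the range of 0.5 to 2 seconds.
        long scaled = DEFAULT_INTERVAL_MS * largeGroupThreads / largeFileCount;
        return Math.max(MIN_INTERVAL_MS, Math.min(DEFAULT_INTERVAL_MS, scaled));
    }
}
```

The returned value can be passed as the intevalPeriod argument of scheduleAtFixedRate.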
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present invention in any way.
In an embodiment, a sample entry device is provided, which corresponds to the sample entry method in the above embodiment one by one. As shown in fig. 6, the sample entry device includes a receiving module 61, a detecting module 62, a dividing module 63, an information extracting module 64, a grouping module 65, and a transmitting module 66. The functional modules are described in detail as follows:
the receiving module 61 is configured to receive a sample entry request sent by a client, and obtain a picture file and a compressed package file included in the sample entry request;
the detecting module 62 is configured to detect a file size of the compressed package file, and mark the compressed package file with a file size exceeding a preset threshold as a target file;
the dividing module 63 is configured to divide the target file according to the preset threshold to obtain a sub-package file;
the information extraction module 64 is configured to obtain basic information of the picture file, the compressed package file, and the sub-package file, and store the basic information in a sample database, where the sample database is used to store a profile of sample data to be recorded;
The grouping module 65 is configured to convert the picture file into a picture file group, and form a compressed package file group from the compressed package file and the sub-package file;
the sending module 66 is configured to alternately select, at preset time intervals, a preset number of files from the group of picture files and the group of compressed package files, and send the selected files to the storage platform until all the files in the group of picture files and the group of compressed package files have been sent.
Further, the transmitting module 66 includes:
the detection submodule is used for respectively detecting the number of files in the picture file group and the compressed package file group;
the first thread grouping sub-module is used for dividing the threads in the preset thread pool into two groups according to a first preset grouping proportion if the sum of the file numbers of the compressed package files and the sub-package files is larger than or equal to the file number of the picture files, so as to obtain a first thread group and a second thread group;
the first thread execution sub-module is used for calling threads in the first thread group, sequentially selecting a preset number of files from the picture file group and sending the files to the storage platform until all the files in the picture file group are sent to completion;
and the second thread execution submodule is used for calling threads in the second thread group, sequentially selecting a preset number of files from the compressed package file group at preset time intervals, and sending the files to the storage platform until all the files in the compressed package file group are sent.
Further, the transmitting module 66 further includes:
the file size calculation sub-module is used for calculating the file size sum of all files in the compressed package file group to obtain the file size sum of the compressed package file group and calculating the file size sum of all files in the picture file group to obtain the file size sum of the picture file group if the file number sum of the compressed package file and the sub-package file is smaller than the file number of the picture file;
and the second thread grouping sub-module is used for dividing the threads in the preset thread pool into two groups according to a second preset grouping proportion if the sum of the file sizes of the compressed package file groups is smaller than the sum of the file sizes of the picture file groups, so as to obtain a first thread group and a second thread group.
Further, the second thread execution submodule includes:
the package file segmentation subunit is used for comparing the file size of each file in the compressed package file group with the preset network bandwidth size and dividing the files in the compressed package file group into big package files and small package files according to the comparison result;
the first thread grouping subunit is used for dividing the threads in the second thread group into two groups according to a first preset grouping proportion if the number of the large package files is greater than or equal to that of the small package files, so as to obtain a thread group for processing the large package files and a thread group for processing the small package files;
the large package file processing subunit is used for calling the threads in the thread group for processing the large package files and sending the large package files to the storage platform at preset time intervals;
and the small package file processing subunit is used for calling the threads in the thread group for processing the small package files and sequentially sending the small package files to the storage platform.
Further, the second thread execution sub-module further includes:
and the second thread grouping subunit is used for dividing the threads in the second thread group into two groups according to a second preset grouping proportion if the number of the large package files is smaller than that of the small package files, so as to obtain a thread group for processing the large package files and a thread group for processing the small package files.
For specific limitations of the sample entry device, reference may be made to the limitations of the sample entry method above, which are not repeated here. The modules in the sample entry device described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a sample entry method.
In one embodiment, a computer device is provided that includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the steps of the sample entry method of the above embodiments, such as steps S1 to S6 shown in fig. 2. Alternatively, the processor, when executing the computer program, performs the functions of the modules/units of the sample entry device of the above embodiments, such as the functions of modules 61-66 shown in fig. 6. In order to avoid repetition, a description thereof is omitted.
In an embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the sample entry method in the above method embodiment, or which when executed by a processor implements the functions of the modules/units in the sample entry device in the above device embodiment. In order to avoid repetition, a description thereof is omitted.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-volatile computer readable storage medium, and that the program, when executed, may include the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (6)

1. A sample entry method, the sample entry method comprising:
receiving a sample entry request sent by a client, and acquiring a picture file and a compressed package file contained in the sample entry request;
detecting the file size of the compressed package file, and marking the compressed package file with the file size exceeding a preset threshold as a target file;
dividing the target file according to the preset threshold value to obtain a sub-package file;
respectively acquiring basic information of the picture file, the compressed package file and the sub-package file, and storing the basic information into a sample database, wherein the sample database is used for storing a profile of sample data to be recorded;
converting the picture file into a picture file group, and forming a compressed package file group by the compressed package file and the sub-package file;
and selecting a preset number of files from the picture file group and the compressed package file group in turn at preset time intervals, and sending the files to a storage platform until all the files in the picture file group and the compressed package file group are sent, wherein the method comprises the following steps: detecting the number of files in the picture file group and the compressed package file group respectively;
if the sum of the file numbers of the compressed package files and the sub package files is larger than or equal to the file number of the picture files, dividing the threads in a preset thread pool into two groups according to a first preset grouping proportion to obtain a first thread group and a second thread group;
If the sum of the file numbers of the compressed package files and the sub package files is smaller than the file number of the picture files, calculating the sum of the file sizes of all files in the compressed package file group to obtain the sum of the file sizes of the compressed package file group, and calculating the sum of the file sizes of all files in the picture file group to obtain the sum of the file sizes of the picture file group;
if the sum of the file sizes of the compressed package file groups is smaller than the sum of the file sizes of the picture file groups, dividing the threads in the preset thread pool into two groups according to a second preset grouping proportion to obtain the first thread group and the second thread group;
invoking threads in the first thread group, sequentially selecting the files with the preset number from the picture file group, and sending the files to a storage platform until all the files in the picture file group are sent to completion;
calling the threads in the second thread group, sequentially selecting the files with the preset number from the compressed package file group at preset time intervals, and sending the files to a storage platform until all the files in the compressed package file group are sent, wherein the method comprises the following steps:
comparing the file size of each file in the compressed package file group with the preset network bandwidth size, and dividing the files in the compressed package file group into big package files and small package files according to the comparison result;
If the number of the big package files is greater than or equal to the number of the small package files, dividing the threads in the second thread group into two groups according to the first preset grouping proportion to obtain a thread group for processing the big package files and a thread group for processing the small package files;
if the number of the big package files is smaller than that of the small package files, dividing the threads in the second thread group into two groups according to a second preset grouping proportion to obtain a thread group for processing the big package files and a thread group for processing the small package files;
calling the threads in the thread group for processing the big package file, and sending the big package file to the storage platform at preset time intervals;
and calling the threads in the thread group for processing the small packet file, and sequentially sending the small packet file to the storage platform.
2. A sample entry device adapted for use in the sample entry method of claim 1, the sample entry device comprising:
the receiving module is used for receiving a sample entry request sent by a client and acquiring a picture file and a compressed package file contained in the sample entry request;
the detection module is used for detecting the file size of the compressed package file and marking the compressed package file with the file size exceeding a preset threshold value as a target file;
the segmentation module is used for segmenting the target file according to the preset threshold value to obtain a sub-package file;
the information extraction module is used for respectively acquiring basic information of the picture file, the compressed package file and the sub-package file and storing the basic information into a sample database, wherein the sample database is used for storing a profile of sample data to be recorded;
the grouping module is used for converting the picture files into picture file groups and forming compressed package files and the sub-package files into compressed package file groups;
and the sending module is used for selecting a preset number of files from the picture file group and the compressed package file group in turn at preset time intervals and sending the files to the storage platform until all the files in the picture file group and the compressed package file group are sent.
3. The sample entry device of claim 2, wherein the transmission module comprises:
the detection submodule is used for respectively detecting the number of files in the picture file group and the compressed package file group;
the first thread grouping sub-module is used for dividing threads in a preset thread pool into two groups according to a first preset grouping proportion if the sum of the file numbers of the compressed package file and the sub-package file is larger than or equal to the file number of the picture file, so as to obtain a first thread group and a second thread group;
The first thread execution submodule is used for calling threads in the first thread group, sequentially selecting the files with the preset number from the picture file group and sending the files to the storage platform until all the files in the picture file group are sent;
and the second thread execution sub-module is used for calling the threads in the second thread group, sequentially selecting the files with the preset number from the compressed package file group at preset time intervals, and sending the files to the storage platform until all the files in the compressed package file group are sent to completion.
4. A sample entry device as claimed in claim 3, wherein the transmission module further comprises:
a file size calculation sub-module, configured to calculate a file size sum of all files in the compressed package file group if the file number sum of the compressed package file and the sub-package file is smaller than the file number of the picture file, to obtain a compressed package file group file size sum, and to calculate a file size sum of all files in the picture file group, to obtain a picture file group file size sum;
and the second thread grouping sub-module is used for dividing the threads in the preset thread pool into two groups according to a second preset grouping proportion if the sum of the file sizes of the compressed package file group is smaller than the sum of the file sizes of the picture file group, so as to obtain the first thread group and the second thread group.
5. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the sample entry method of claim 1 when the computer program is executed by the processor.
6. A computer readable storage medium storing a computer program, which when executed by a processor performs the steps of the sample entry method of claim 1.
CN201811187254.3A 2018-10-12 2018-10-12 Sample entry method, device, computer equipment and storage medium Active CN109241012B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811187254.3A CN109241012B (en) 2018-10-12 2018-10-12 Sample entry method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811187254.3A CN109241012B (en) 2018-10-12 2018-10-12 Sample entry method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109241012A CN109241012A (en) 2019-01-18
CN109241012B true CN109241012B (en) 2024-02-02

Family

ID=65053437

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811187254.3A Active CN109241012B (en) 2018-10-12 2018-10-12 Sample entry method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109241012B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114155975A (en) * 2021-11-26 2022-03-08 广州金域医学检验中心有限公司 Sample entry method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009086945A (en) * 2007-09-28 2009-04-23 Mizuho Information & Research Institute Inc File reception system, file reception method, and file reception program
WO2011105023A1 (en) * 2010-02-26 2011-09-01 Jvc・ケンウッド・ホールディングス株式会社 Processing device and write method
CN104715076A (en) * 2015-04-13 2015-06-17 东信和平科技股份有限公司 Multi-threaded data processing method and device
CN105701152A (en) * 2015-12-29 2016-06-22 浪潮(北京)电子信息产业有限公司 File writing method and apparatus, and file reading method and apparatus
CN105873022A (en) * 2015-12-07 2016-08-17 乐视移动智能信息技术(北京)有限公司 Downloading method and device for mobile terminal
CN107566463A (en) * 2017-08-21 2018-01-09 北京航空航天大学 A kind of cloudy storage management system for improving storage availability
CN108228730A (en) * 2017-12-11 2018-06-29 深圳市买买提信息科技有限公司 Data lead-in method, device, computer equipment and readable storage medium storing program for executing

Also Published As

Publication number Publication date
CN109241012A (en) 2019-01-18

Similar Documents

Publication Publication Date Title
US10831562B2 (en) Method and system for operating a data center by reducing an amount of data to be processed
US11200258B2 (en) Systems and methods for fast and effective grouping of stream of information into cloud storage files
CN110222048B (en) Sequence generation method, device, computer equipment and storage medium
US9892124B2 (en) Method and device for transferring file
CN105447046A (en) Distributed system data consistency processing method, device and system
US11169994B2 (en) Query method and query device
CN108256118B (en) Data processing method, device, system, computing equipment and storage medium
CN107689976B (en) File transmission method and device
WO2021057253A1 (en) Data separation and storage method and apparatus, computer device and storage medium
WO2022057231A1 (en) Method and apparatus for accessing server, device, and storage medium
CN112215273A (en) Intelligent building information monitoring method based on cloud platform and intelligent building system
US20190228009A1 (en) Information processing system and information processing method
CN109241012B (en) Sample entry method, device, computer equipment and storage medium
CN113676563A (en) Scheduling method, device, equipment and storage medium of content distribution network service
CN111639902A (en) Data auditing method based on kafka, control device, computer equipment and storage medium
CN113900810A (en) Distributed graph processing method, system and storage medium
CN112613271A (en) Data paging method and device, computer equipment and storage medium
US20180121135A1 (en) Data processing system and data processing method
CN113688161A (en) Cache data query method, device, equipment and storage medium
CN105426407A (en) Web data acquisition method based on content analysis
CN106844420B (en) User grouping method and device based on social network and big data analysis
CN108228365A (en) A kind of function request sending method, function request call method and device
CN113630442B (en) Data transmission method, device and system
CN114595457A (en) Task processing method and device, computer equipment and storage medium
CN109284260B (en) Big data file reading method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant