CN111870253A - Method and system for monitoring condition of tic disorder disease based on vision and voice fusion technology - Google Patents
- Publication number
- CN111870253A (application CN202010729476.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1101—Detecting tremor
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval using metadata automatically derived from the content
- G06F16/7837—Retrieval using objects detected or recognised in the video content
- G06F16/784—Retrieval where the detected or recognised objects are people
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B25/00—Alarm systems in which the location of the alarm condition is signalled to a central station, e.g. fire or police telegraphic systems
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques specially adapted for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques for extracting parameters related to health condition
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
Abstract
The invention relates to a method and a system for monitoring the condition of tic disorder based on vision and voice fusion technology. Image and sound information of the monitored person is acquired continuously, motor and vocal twitches are judged in real time, and the resulting twitch information is stored in a database according to the judgment result; after the monitored person's database information is analysed and diagnosed, the diagnosis result is fed back to the relevant persons in a timely manner while monitoring continues. The invention can discover the monitored person's condition in a natural state without drawing the person's attention, and can remind the monitored person to seek timely treatment and correction according to the severity of the condition. It solves the problems that manual monitoring easily prompts the monitored person to suppress symptoms voluntarily, so that the condition is not fully expressed, and that the onset of twitches in patients with tic disorder cannot be monitored in real time over the long term; it can therefore provide more accurate auxiliary treatment information for medical personnel.
Description
Technical Field
The invention relates to the field of big data analysis, in particular to medical monitoring, and specifically to a method and a system for monitoring the condition of tic disorder based on vision and voice fusion technology.
Background
Human action recognition is a research direction in the field of artificial intelligence with promising applications in the long-term monitoring and auxiliary treatment of monitored persons with tic disorder. The technology is already widely applied in transportation, medical treatment, public safety and other areas.
With the continued development of machine vision technology, human action recognition that fuses vision and voice has become more mature and reliable, making visual monitoring and condition recognition of monitored persons with tic disorder feasible. Big data technology can be applied to the analysis and mining of unstructured data and to the analysis of large volumes of real-time monitoring data; it has significant application value in the medical field and provides technical support for building medical health-management systems, comprehensive information platforms and the like.
For a monitored person with tic disorder, the condition manifests as a group of neuropsychiatric disorders such as motor tics or vocal tics. Symptoms are characterised by involuntary, sudden, repetitive, non-rhythmic, stereotyped motor twitches or vocalisations at a single site or multiple sites. By collecting the monitored person's movement and vocalisation changes in real time throughout the day, long-term big data on muscle twitches and vocalisations are accumulated; mining and analysing these data can reveal how the condition is developing, enabling medical staff to detect abnormalities in time and to provide detailed, reliable diagnosis and treatment information.
The prior art discloses methods of preparing medicines for treating tic disorder and physical therapeutic devices for tic disorder, neither of which is vision-based. No method has been publicly disclosed for analysing changes in a monitored person's condition through big data on twitch and vocalisation information. The present invention collects motor twitch and vocal twitch occurrence data and uses big data analysis to identify and monitor the monitored person's condition.
Disclosure of Invention
The invention aims to overcome the defects of the existing monitoring technology and provide a method and a system for monitoring the state of the tic disorder disease based on the vision and voice fusion technology.
In order to achieve the purpose, the invention is realized by adopting the following technical scheme:
the method for monitoring the condition of tic disorder based on the vision and voice fusion technology is mainly characterized by comprising the following steps of:
(1) continuously acquiring image and sound information of the monitored person and judging the movement and sound production twitch of the monitored person in real time;
(2) storing the obtained twitch information into a database according to the real-time judgment result of the motion and sound production twitch of the monitored person in the step (1);
(3) and diagnosing the state of an illness according to the database information of the monitored person, feeding back the diagnosis basis and the diagnosis result to related persons in time, and continuously monitoring.
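The three steps above form a continuous loop. A minimal sketch of that loop (in Python; the function names `acquire`, `judge_twitch`, `store` and `diagnose` are illustrative placeholders, not part of the patent):

```python
def monitor(acquire, judge_twitch, store, diagnose, cycles):
    """Sketch of the patent's three-step loop: acquire image/sound (1),
    judge and store twitch information (2), diagnose and feed back (3)."""
    for _ in range(cycles):           # bounded here; the real system runs continuously
        frame, audio = acquire()      # step (1): image + sound information
        record = judge_twitch(frame, audio)
        if record is not None:        # a motor or vocal twitch was judged to occur
            store(record)             # step (2): persist to the person's database
            diagnose(record)          # step (3): analyse, diagnose, feed back
        # otherwise return to acquisition and keep monitoring
```

With stub callbacks this runs end to end; in practice `acquire` would wrap the camera and microphone drivers.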
In the above method for monitoring the condition of tic disorder based on the vision and voice fusion technology, step (1) specifically comprises the following steps:
(1.1) acquiring image information and sound information of a monitored person in real time through an image sensor and a sound sensor;
(1.2) making a matching template of the movement twitch and the sound production twitch of the monitored person;
and (1.3) matching in real time and judging whether the monitored person generates movement twitching and sounding twitching, if so, judging that twitching occurs and continuing the step (2), and if not, returning to the step (1.1) to continue monitoring.
In the above method, step (1.1) specifically comprises the following:
the method comprises the steps of acquiring image information of limbs, facial actions and the like of a monitored person in real time through an image sensor, and acquiring sounding information of the monitored person in real time through a sound sensor.
In the above method, step (1.2) specifically comprises the following:
the method comprises the steps that limb, facial twitch information and neurogenic sounding information are manually selected as matching templates, the size and format attributes of each template can be flexibly adjusted to adapt to different monitored people, each piece of template information can be added to different action classification labels, each type of action label corresponds to more than one image template with the action characteristic, and different image information of the type of label has different twitch strengths and related information of single twitch frequency.
The real-time matching and judging whether the monitored person generates movement twitching and sounding twitching in the twitching disorder disease monitoring method based on the vision and voice fusion technology specifically comprises the following steps:
performing feature extraction and screening on image information and sound obtained by an image and sound sensor in real time, learning and matching the processed information and the matching template, if the matching similarity reaches a preset threshold value, judging that the current action is an action twitch or a sound production twitch corresponding to the template, and returning the judgment results of the motion twitch and the sound production twitch; and if no proper matching result exists, returning the judgment result that no movement twitch or sound-producing twitch occurs.
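The threshold comparison described here can be sketched as follows (the cosine similarity measure and the 0.85 threshold are illustrative assumptions; the patent only requires that similarity reach a preset threshold):

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (illustrative choice)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def match_twitch(features, templates, threshold=0.85):
    """Return the best-matching template label if similarity reaches the
    preset threshold, else None (judged as no motor/vocal twitch)."""
    best_label, best_score = None, 0.0
    for label, template_features in templates.items():
        score = cosine(features, template_features)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None
```

For example, `match_twitch([1.0, 0.1, 0.0], {"blinking": [1, 0, 0], "shrugging": [0, 1, 0]})` matches "blinking", while a feature vector dissimilar to every template yields `None`.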
In the above method, step (2) specifically comprises the following:
and storing the image information and the sound information of the monitored person which are judged as the nervous movement twitch and the vocal twitch into a database of the monitored person.
In the method for monitoring the condition of the tic disorder disease based on the vision and voice fusion technology, the condition of the disease is diagnosed according to the database information of the monitored person, and the diagnosis basis and the diagnosis result are timely fed back to related personnel, and the method specifically comprises the following steps:
after the monitored person's database information is updated and analysed, it is fed back to medical staff and the patient's family members in time; when the condition worsens severely, an alarm signal and a remote diagnosis request are sent to medical staff promptly.
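A toy version of this feedback rule (the doubling criterion below is an assumption for illustration; the patent does not define what counts as a severe change):

```python
def feedback_after_update(todays_twitch_count, baseline_count, alert_ratio=2.0):
    """After each database update, summarise for caregivers and family;
    escalate to an alarm + remote-diagnosis request on a sharp worsening."""
    report = {"count": todays_twitch_count, "alarm": False, "remote_diagnosis": False}
    if baseline_count and todays_twitch_count >= alert_ratio * baseline_count:
        report["alarm"] = True             # alarm signal to medical staff
        report["remote_diagnosis"] = True  # remote diagnosis request
    return report
```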
The tic disorder condition monitoring system based on the vision and voice fusion technology for realizing the method is characterized by comprising the following components: the system comprises an image acquisition and identification module, a sound acquisition and identification module, a template matching module, a data protection module, a dynamic database module, a data analysis module, a human-computer interaction module and a remote diagnosis module; the image acquisition and recognition module and the sound acquisition and recognition module are both connected with the template matching module, and the template matching module and the data protection module are both connected with the dynamic database module;
the image acquisition and recognition module is used for acquiring image information of the monitored person in real time and analyzing and extracting the image information to obtain the nervous movement twitch information and facial movement information of the monitored person;
the voice acquisition and recognition module is used for acquiring voice information of the monitored person and extracting and obtaining nervous sounding twitch information of the monitored person;
the template matching module is used for making standby matching templates for the monitored person before the system starts monitoring, and for matching the monitoring information against the templates in real time, by similarity, during the monitoring process;
the data protection module is used for protecting personal information of a monitored person;
the dynamic database module is used for storing the acquired image information data, the acquired sound information data and the information of twitch description, and the real-time matching unit is connected to the dynamic database module;
the data analysis module is used for analyzing the twitch frequency and intensity data of the sports twitch and the phonation twitch;
the human-computer interaction module is used for user input, display and real-time information feedback;
and the remote diagnosis module is used for initiating consultation request information to a consultation platform of a main doctor.
The template matching module in the system for monitoring the condition of the tic disorder disease based on the vision and voice fusion technology comprises a template making unit and a real-time matching unit which are sequentially connected, and the image acquisition and recognition module and the sound acquisition and recognition module are connected with the dynamic database module through the real-time matching unit.
The image acquisition and recognition module in the tic disorder disease monitoring system based on the vision and voice fusion technology comprises an image acquisition unit, a key limb part, a five sense organs recognition unit and a limb and face action detection unit which are sequentially connected, wherein the limb and face action detection unit is connected with the real-time matching unit; the voice acquisition and identification module comprises a voice acquisition unit and a signal frequency screening and analyzing unit which are sequentially connected, and the signal frequency screening and analyzing unit is connected with the real-time matching unit; the data protection module comprises an identity authentication unit, an access control unit and a data encryption unit which are mutually interacted.
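The module connections recited above can be summarised as a small directed graph (module names paraphrased; the links from the database onward to analysis and feedback are inferred from the description rather than recited verbatim):

```python
# edges: upstream module -> downstream modules
connections = {
    "image acquisition and recognition": ["template matching"],
    "sound acquisition and recognition": ["template matching"],
    "template matching": ["dynamic database"],
    "data protection": ["dynamic database"],
    "dynamic database": ["data analysis"],           # inferred link
    "data analysis": ["human-computer interaction",  # inferred link
                      "remote diagnosis"],
}

def reachable(graph, start):
    """All modules reachable downstream of `start` (simple DFS)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```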
By adopting the method and system for monitoring the condition of tic disorder based on vision and voice fusion technology, the system judges whether the monitored person exhibits neurogenic motor twitches or vocal twitches by acquiring the person's limb and facial action information and vocalisation information, and records the monitoring information in the monitored person's personal database. The condition is discovered in a natural state without drawing the monitored person's attention. The system automatically calculates the severity of the condition from the twitch frequency and intensity, feeds the condition back to medical staff in time, reminds the monitored person to seek timely treatment and correction according to the severity, and can directly initiate a remote diagnosis and treatment request in serious cases. The invention effectively solves the problems that, under manual monitoring, the monitored person tends to suppress symptoms intentionally so that the condition is not fully expressed, and that the onset of twitches in tic disorder patients cannot be monitored in real time over the long term.
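The severity calculation mentioned above is not given a formula in the patent; as an illustration, a weighted combination of twitch frequency and mean intensity (the weights are assumptions) could look like:

```python
def severity_score(events, w_freq=0.5, w_intensity=0.5):
    """Toy severity index over one observation window: more frequent and
    stronger twitches yield a higher score. The weighting is illustrative;
    the patent only states that frequency and intensity are combined."""
    if not events:
        return 0.0
    frequency = len(events)  # twitches recorded in the window
    mean_intensity = sum(e["intensity"] for e in events) / frequency
    return w_freq * frequency + w_intensity * mean_intensity
```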
Drawings
Fig. 1 is a flow chart of the method for monitoring the condition of tic disorder based on the vision and voice fusion technology.
Fig. 2 is a structural diagram of the monitoring system for tic disorder based on the vision and voice fusion technology.
Reference numerals
21 image information acquisition and identification module
22 sound collection and identification module
23 template matching module
24 data protection module
25 dynamic database module
26 data analysis module
27 human-computer interaction module
28 remote diagnostic module
211 image acquisition unit
212 key limb part and five sense organs identification unit
213 Limb and facial movement detection unit
221 sound collecting unit
222 signal frequency screening and analysis unit
231 template making unit
232 real-time matching unit
241 identity authentication unit
242 access control unit
243 data encryption unit
Detailed Description
In order to more clearly describe the technical solution of the present invention, the following description is provided for further details of the embodiments of the present invention with reference to the accompanying drawings.
The method for monitoring the condition of tic disorder based on vision and voice fusion technology comprises the following steps:
(1) continuously acquiring image and sound information of the monitored person and judging the movement and sound production twitch of the monitored person in real time; the method specifically comprises the following steps:
(1.1) image information and sound information of a monitored person are obtained in real time through an image sensor and a sound sensor, and the method specifically comprises the following steps: the method comprises the steps that image information of limbs, facial actions and the like of a monitored person is obtained in real time through an image sensor, and sound production information of the monitored person is obtained in real time through a sound sensor;
(1.2) making a matching template of the movement twitch and the sound production twitch of the monitored person; the method specifically comprises the following steps:
clearly defined neurogenic limb and facial twitch information and neurogenic vocalisation information of the monitored person are manually selected as matching templates; the size and format attributes of each template segment can be flexibly adjusted to suit different monitored persons, each template segment can be assigned to a different action classification label, each action label corresponds to one or more image templates exhibiting that action feature, and the different image information under a label carries different twitch intensities and single-twitch frequency information;
(1.3) matching in real time and judging whether the monitored person generates movement twitching and sounding twitching, if so, judging that twitching occurs and continuing the step (2), and if not, returning to the step (1.1) to continue monitoring; real-time matching and judging whether the monitored person generates movement twitch and sound production twitch specifically are as follows:
performing feature extraction and screening on image information and sound obtained by an image and sound sensor in real time, learning and matching the processed information and the matching template, if the matching similarity reaches a preset threshold value, judging that the current action is an action twitch or a sound production twitch corresponding to the template, and returning the judgment results of the motion twitch and the sound production twitch; if no proper matching result exists, returning the judgment result that no movement twitch or sounding twitch occurs;
(2) storing the obtained twitch information into a database according to the real-time judgment result of the motion and sound production twitch of the monitored person in the step (1); the method for storing the twitch information comprises the following steps of:
storing image information and sound information which are judged as neural movement twitch and vocal twitch by the monitored person into a database of the monitored person;
(3) analysing and diagnosing the monitored person's database information, feeding the diagnosis result back to relevant personnel in time, and continuing the monitoring. Specifically: after the monitored person's database information is updated and analysed, it is fed back to medical staff and the patient's family members in time; when the condition worsens severely, an alarm signal and a remote diagnosis request are sent to medical staff promptly.
The tic disorder condition monitoring system based on the vision and voice fusion technology for realizing the method comprises the following steps: the system comprises an image acquisition and identification module, a sound acquisition and identification module, a template matching module, a data protection module, a dynamic database module, a data analysis module, a human-computer interaction module and a remote diagnosis module; the image acquisition and recognition module and the sound acquisition and recognition module are both connected with the template matching module, and the template matching module and the data protection module are both connected with the dynamic database module;
the image acquisition and recognition module is used for acquiring image information of the monitored person in real time and analyzing and extracting the image information to obtain the nervous movement twitch information and facial movement information of the monitored person; the image acquisition and identification module comprises an image acquisition unit, a key limb part and facial recognition unit and a limb and facial action detection unit which are sequentially connected, wherein the limb and facial action detection unit is connected with the real-time matching unit;
the voice acquisition and recognition module is used for acquiring voice information of the monitored person and extracting and obtaining nervous sounding twitch information of the monitored person; the voice acquisition and identification module comprises a voice acquisition unit and a signal frequency screening and analyzing unit which are sequentially connected, and the signal frequency screening and analyzing unit is connected with the real-time matching unit;
the template matching module is used for making standby matching templates for the monitored person before the system starts monitoring, and for matching the monitoring information against the templates in real time, by similarity, during the monitoring process; the template matching module comprises a template making unit and a real-time matching unit connected in sequence, and the image acquisition and recognition module and the sound acquisition and recognition module are connected with the dynamic database module through the real-time matching unit;
the data protection module is used for protecting personal information of a monitored person; the data protection module comprises an identity authentication unit, an access control unit and a data encryption unit which are mutually interacted;
the dynamic database module is used for storing the acquired image information data, the acquired sound information data and the information of twitch description, and the real-time matching unit is connected to the dynamic database module;
the data analysis module is used for analyzing the twitch frequency and intensity data of the sports twitch and the phonation twitch;
the human-computer interaction module is used for user input, display and real-time information feedback;
and the remote diagnosis module is used for initiating consultation request information to a consultation platform of a main doctor.
In practical use, the method for monitoring the condition of tic disorder based on the vision and voice fusion technology comprises the following steps:
in a preferred embodiment, the real-time obtaining of the image information and the sound information of the monitored person S11 is used to install a high-definition camera and a sound sensor at a suitable position in the range of motion of the monitored person to obtain the limb, facial movement information and sound information of the monitored person.
In a preferred embodiment, the matching templates of the movement twitches and vocal twitches of the monitored person are created (S12).
Specifically, based on the obtained image information, the position of the monitored person is located by a feature extraction method, and the key points of the monitored person's limbs are located and tracked. Image segments that clearly contain the specific nervous limb and facial twitch actions of the monitored person are manually selected as matching templates; the image size and frame count of each template segment can be flexibly adjusted to suit different monitored persons, and each image segment can be assigned a different action classification label, such as: blinking, squinting, pouting, head shaking, shoulder shrugging, neck retraction, arm stretching, arm swinging, chest heaving, bending, body rotation, and the like. Each action label corresponds to one or more image templates with that action characteristic, and the different image information under a label differs in twitch intensity, single-occurrence twitch frequency and similar attributes.

Further, based on the obtained audio information, after frequency-division processing of the audio signal, segments that clearly contain the nervous vocal twitch characteristics of the monitored person's voice are manually selected as matching templates. The sound intensity and frequency of each audio template segment can be flexibly adjusted to suit different monitored persons, and each segment can be assigned a different classification label, such as: whistling, roaring, cursing, uttering obscenities, and the like. Similarly, each vocal twitch label corresponds to one or more audio templates with that vocal characteristic, and the audio information under a label differs in loudness and single-occurrence vocalization frequency.
The created matching templates of the monitored person's nervous limb twitches, facial twitches and vocal twitches are subsequently used for matching analysis of the image and audio information acquired by real-time monitoring.
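To make the template organization concrete, the labeled library described above might be sketched as follows. The `TicTemplate` fields, label names and array shapes are illustrative assumptions, not part of the patent:

```python
from dataclasses import dataclass
from typing import Dict, List

import numpy as np


@dataclass
class TicTemplate:
    # A manually selected segment with a clear twitch characteristic.
    label: str        # action or vocalization classification label
    data: np.ndarray  # image frames or audio samples (shape is illustrative)
    intensity: float  # annotated twitch/vocalization intensity
    rate: float       # single-occurrence twitch/vocalization frequency


class TemplateLibrary:
    """Stores one or more templates per classification label, mirroring the
    patent's note that each label corresponds to more than one template."""

    def __init__(self) -> None:
        self.by_label: Dict[str, List[TicTemplate]] = {}

    def add(self, tpl: TicTemplate) -> None:
        self.by_label.setdefault(tpl.label, []).append(tpl)


lib = TemplateLibrary()
# Two "blinking" image templates with different intensity/frequency,
# and one "whistling" audio template (all arrays are placeholders).
lib.add(TicTemplate("blinking", np.zeros((8, 64, 64)), intensity=0.4, rate=1.5))
lib.add(TicTemplate("blinking", np.zeros((12, 64, 64)), intensity=0.9, rate=3.0))
lib.add(TicTemplate("whistling", np.zeros(16000), intensity=0.6, rate=2.0))
```

Per-template size adjustment is reflected by each template carrying its own array, so segments of different lengths and frame counts can coexist under one label.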
In a preferred embodiment, step S13 determines whether the monitored person exhibits a nervous movement twitch or a vocal twitch.
Specifically, feature extraction is performed on the image information obtained in real time by the high-definition camera to determine and track the position of the monitored person's limbs; the processed image information is then matched against the limb and facial twitch templates created in step S12, and the template with the highest similarity is taken as the motion information of the monitored person captured in the image. Likewise, after the audio signal obtained in real time by the sound sensor is processed, it is matched against the vocalization templates created in step S12, and the template with the highest similarity is taken as the nervous vocal twitch information captured this time. If no suitable match is found, no special operation is performed on the current monitoring data.
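The matching step above can be sketched as follows. The cosine-similarity measure, the 0.8 threshold and the toy templates are assumptions for illustration only, since the patent specifies "highest similarity" matching without fixing a metric:

```python
import numpy as np


def best_match(signal, templates, threshold=0.8):
    """Return the label of the most similar template, or None when no
    template reaches the similarity threshold (in which case, per the
    method, no special operation is performed on the current data).

    Cosine similarity and the 0.8 threshold are illustrative assumptions.
    """
    best_label, best_score = None, threshold
    x = np.ravel(signal).astype(float)
    for label, tpl in templates.items():
        t = np.ravel(tpl).astype(float)
        n = min(len(x), len(t))  # crude length alignment for the sketch
        den = np.linalg.norm(x[:n]) * np.linalg.norm(t[:n])
        score = float(np.dot(x[:n], t[:n])) / den if den else 0.0
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical single-feature "templates" for two action labels.
templates = {
    "blinking": np.array([1.0, 0.0, 1.0, 0.0]),
    "shoulder shrugging": np.array([0.0, 1.0, 0.0, 1.0]),
}
```

A signal close to one template is assigned that template's label; a signal resembling no template falls below the threshold and yields `None`, matching the "no suitable matching result" branch of step S13.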
In a preferred embodiment, the obtained twitch information is saved to the database (S14).
Specifically, image information and sound information, which are determined to be the neural motor twitch and the vocal twitch of the monitored person each time, are stored in the database of the monitored person.
In a preferred embodiment, the database information of the monitored person is analyzed and diagnosed, and the diagnosis result is fed back to the relevant persons (S15).
Specifically, the database information of the monitored person can be updated and analyzed at regular intervals. The analysis mainly covers: the frequency and overall intensity of each nervous limb motor twitch category occurring daily, weekly and monthly; the frequency and overall intensity of each nervous facial motor twitch category occurring daily, weekly and monthly; and the frequency and overall intensity of each nervous vocal twitch category occurring daily, weekly and monthly. Further, after the image and audio information is analyzed, a comprehensive score is computed in combination with the monitored person's state before the initial diagnosis; the scoring results are fed back to medical staff in a timely manner, and when the condition worsens significantly, an alarm signal and a remote diagnosis request are promptly sent to the medical staff.
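A minimal sketch of the periodic aggregation and comprehensive scoring described above, assuming illustrative scoring weights and an alarm factor that the patent does not prescribe:

```python
from collections import Counter
from datetime import date


def daily_summary(events):
    """Aggregate recorded twitch events into per-day, per-label
    (count, mean intensity) pairs. Each event is (date, label, intensity)."""
    counts, intensity = Counter(), Counter()
    for day, label, inten in events:
        counts[(day, label)] += 1
        intensity[(day, label)] += inten
    return {k: (n, intensity[k] / n) for k, n in counts.items()}


def comprehensive_score(summary, freq_weight=1.0, intensity_weight=2.0):
    # The weights are illustrative; the patent only says frequency and
    # intensity feed a comprehensive score.
    return sum(n * freq_weight + mean_i * intensity_weight
               for n, mean_i in summary.values())


def needs_alarm(score, baseline, factor=1.5):
    # Alarm when the score worsens significantly versus the state
    # recorded before the initial diagnosis (factor is an assumption).
    return score > factor * baseline


events = [(date(2020, 7, 27), "blinking", 0.25),
          (date(2020, 7, 27), "blinking", 0.75),
          (date(2020, 7, 27), "roaring", 1.0)]
summary = daily_summary(events)
```

Weekly and monthly views follow the same pattern with the key widened from a single date to a (year, week) or (year, month) pair.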
The above method is used to implement the system for monitoring the condition of tic disorder based on the vision and voice fusion technology. By comparing the visual and audio monitoring results against the facial expression and voice tone of the monitored person in a normal state, thresholds are set for the severity of the tic disorder condition. Family members can monitor the condition by entering the monitored person's observed twitch manifestations, and the system automatically calculates their severity; by feeding the condition back to the family, the family gains a preliminary understanding of the monitored person's condition. More serious episodes can be recorded on video by the monitoring system, and consultation request information is automatically sent to the attending doctor's consultation platform, which facilitates remote diagnosis of the monitored person and allows medical staff to be reminded in time, according to fluctuations of the condition, to have the monitored person come to the hospital. Specifically, as shown in fig. 2, the system architecture for monitoring the condition of tic disorder based on the vision and voice fusion technology of the present invention includes an image information acquisition and recognition module 21, a voice information acquisition and recognition module 22, a template matching module 23, a data protection module 24, a dynamic database module 25, a data analysis module 26, a human-computer interaction module 27, and a remote diagnosis module 28.
The image information acquisition and recognition module 21 is used for acquiring image information of the monitored person in real time and analyzing it to obtain the nervous movement twitch information and facial movement information of the monitored person; the voice acquisition and recognition module 22 is used for acquiring voice information of the monitored person and analyzing it to obtain the nervous vocal twitch information of the monitored person; the template matching module 23 is used for making standby matching templates for the monitored person before the system is used, and for matching the monitoring information against the templates by similarity in real time during monitoring; the data protection module 24 is used for protecting the personal information of the monitored person; the dynamic database module 25 is used for dynamically storing the acquired image information data, sound information data, twitch frequency and intensity; the data analysis module 26 is used for analyzing data such as the twitch frequency and intensity of the nervous movement twitches and vocal twitches; the human-computer interaction module 27 is used for user input, display and real-time information feedback; and the remote diagnosis module 28 is used for initiating consultation request information to the consultation platform of the attending doctor.
Specifically, the image acquisition and recognition module 21 includes an image acquisition unit 211, a key limb and facial feature recognition unit 212, and a limb and facial movement detection unit 213, which are connected in sequence. The image acquisition unit 211 acquires image information of the monitored person; the key limb and facial feature recognition unit 212 locates and tracks the trunk, four limbs and facial features of the monitored person in the image; and the limb and facial movement detection unit 213 feeds the captured motion information, together with the template information from the template making unit 231 in the template matching module 23, into the real-time matching unit 232. The matching result is stored in the dynamic database module 25 and specifically includes: the matched nervous movement twitch classification label, twitch frequency, intensity, and the like;
specifically, the voice information acquisition and recognition module 22 includes a sound acquisition unit 221 and a signal frequency screening and analyzing unit 222 which are connected in sequence. The sound acquisition unit 221 acquires sound information within the monitoring range; the signal frequency screening and analyzing unit 222 screens out sound signals within the monitored person's vocal frequency range, which are then matched in the real-time matching unit 232 against the nervous vocal twitch templates in the template making unit 231 of the template matching module 23. The matching result can be stored in the dynamic database module 25 and specifically includes: the matched nervous vocal twitch classification label, twitch frequency, intensity, and the like;
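The frequency screening performed by unit 222 might be sketched as a spectral band filter; the 80-1100 Hz vocal band assumed below is not specified in the patent:

```python
import numpy as np


def screen_voice_band(samples, sr, low=80.0, high=1100.0):
    """Zero out spectral components outside an assumed human vocal
    frequency range before the signal is passed on for template
    matching. The band edges are illustrative assumptions."""
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sr)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=len(samples))


# One second of audio: a 220 Hz "voice" tone plus 3000 Hz interference.
sr = 8000
t = np.arange(sr) / sr
mixed = np.sin(2 * np.pi * 220 * t) + np.sin(2 * np.pi * 3000 * t)
clean = screen_voice_band(mixed, sr)
```

After screening, only the in-band 220 Hz component survives, so later matching against vocal twitch templates is not disturbed by out-of-band noise.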
specifically, the template matching module 23 includes a template making unit 231 and a real-time matching unit 232 connected in sequence;
specifically, the data protection module 24 includes an identity authentication unit 241, an access control unit 242 and a data encryption unit 243. The identity authentication unit 241 is used for authenticating the identity of the user when logging in to the monitoring system; the authentication methods include face recognition, fingerprint authentication, pupil authentication, account/password authentication, random SMS verification-code authentication, and the like, ensuring that only authorized users can enter the system. The access control unit 242 is used for limiting idle session duration, restricting access to certain functions, and the like. The data encryption unit 243 is used for encrypting the sensitive user data in the dynamic database module 25, preventing risks such as leakage and theft of the data. The data protection module 24 mediates interactions among the dynamic database module 25, the data analysis module 26, the human-computer interaction module 27 and the remote diagnosis module 28;
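For the account/password authentication path of the identity authentication unit 241, salted password hashing is one standard approach; PBKDF2 and the iteration count below are implementation assumptions, not part of the patent:

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Derive a salted digest for storage instead of the plain password.
    PBKDF2-HMAC-SHA256 with 100k iterations is an assumed choice."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest


def verify_password(password, salt, digest):
    """Recompute the digest and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)


salt, digest = hash_password("s3cret")
```

Only the salt and digest need to be stored by the system; the monitored person's password itself never reaches the dynamic database module.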
specifically, the data analysis module 26 is configured to extract and analyze the effective data in the dynamic database module 25, such as the occurrence, twitch frequency and intensity, and occurrence time of the monitored person's movement twitches and vocal twitches, and to produce a comprehensive score and a visualized trend of the development of the monitored person's condition by day, week and month;
specifically, the human-computer interaction module 27 is used for user input control of the monitoring system, querying the analysis results of the monitoring data, displaying the system state, and the like;
specifically, the remote diagnosis module 28 can automatically initiate a video consultation request to the attending doctor when the severity of the monitored person's twitch behavior or vocalization changes, or when a strong nervous twitch occurs; the attending doctor can also review recent monitoring video and audio information, and the consultation platform can additionally push related medical cases and treatment schemes intelligently during the consultation.
With the method and system for monitoring the condition of tic disorder based on the vision and voice fusion technology, whether the monitored person exhibits nervous movement twitches or vocal twitches is judged by acquiring the limb and facial action information and vocal information of the monitored person; the monitoring information is recorded into the monitored person's personal database; the system automatically calculates the severity of the condition according to the twitch frequency and intensity and feeds the condition back to medical staff in time, giving them a preliminary understanding of it, and for serious cases a remote diagnosis and treatment request can be initiated directly. The invention effectively solves the problems that, with manual monitoring alone, the monitored person can easily suppress symptoms deliberately so that the condition is not fully expressed, and that the condition of a patient with tic disorder cannot be monitored in real time over a long period.
In this specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Claims (10)
1. A method for monitoring the condition of tic disorder based on vision and voice fusion technology is characterized by comprising the following steps:
(1) continuously acquiring image and sound information of the monitored person and judging the movement and sound production twitch of the monitored person in real time;
(2) storing the obtained twitch information into a database according to the real-time judgment result of the motion and sound production twitch of the monitored person in the step (1);
(3) and diagnosing the state of an illness according to the database information of the monitored person, feeding back the diagnosis basis and the diagnosis result to related persons in time, and continuously monitoring.
2. The method for monitoring tic disorder condition based on vision and voice fusion technology as claimed in claim 1, wherein the step (1) comprises the following steps:
(1.1) acquiring image information and sound information of a monitored person in real time through an image sensor and a sound sensor;
(1.2) making a matching template of the movement twitch and the sound production twitch of the monitored person;
and (1.3) matching in real time and judging whether the monitored person generates movement twitching and sounding twitching, if so, judging that twitching occurs and continuing the step (2), and if not, returning to the step (1.1) to continue monitoring.
3. The method for monitoring tic disorder condition based on visual and audio fusion technology of claim 2, wherein the step (1.1) comprises:
the method comprises the steps of acquiring image information of limbs, facial actions and the like of a monitored person in real time through an image sensor, and acquiring sounding information of the monitored person in real time through a sound sensor.
4. The method for monitoring tic disorder condition based on visual and audio fusion technology of claim 2, wherein the step (1.2) comprises:
the method comprises the steps that the specific nervous limbs, facial twitch information and nervous sounding information of a monitored person are selected to serve as matching templates manually, the size and format attributes of each segment of template can be flexibly adjusted to adapt to different monitored persons, each segment of template information can be added to different action classification labels, each type of action label corresponds to more than one image template with the action characteristic, and different image information of the type of label has different twitch strengths and related information of single twitch frequency.
5. The method for monitoring the condition of tic disorder disease based on vision and voice fusion technology as claimed in claim 2, wherein the real-time matching and determining whether the monitored person has movement twitch and vocal twitch specifically comprises:
performing feature extraction and screening on image information and sound obtained by an image and sound sensor in real time, learning and matching the processed information and the matching template, if the matching similarity reaches a preset threshold value, judging that the current action is an action twitch or a sound production twitch corresponding to the template, and returning the judgment results of the motion twitch and the sound production twitch; and if no proper matching result exists, returning the judgment result that no movement twitch or sound-producing twitch occurs.
6. The method for monitoring the condition of tic disorder based on vision and voice fusion technology as claimed in claim 1, wherein the step of storing the obtained tic information into a database comprises:
and storing the image information and the sound information of the monitored person which are judged as the nervous movement twitch and the vocal twitch into a database of the monitored person.
7. The method for monitoring the condition of tic disorder disease based on vision and voice fusion technology as claimed in claim 1, wherein the condition of disease is diagnosed according to the database information of the monitored person, and the diagnosis basis and the diagnosis result are fed back to the relevant persons in time, specifically:
after the information of the monitored person database is updated and analyzed, the information can be timely fed back to medical care personnel and family members of the patient, and when the condition of an illness changes seriously, an alarm signal and a remote diagnosis request can be timely sent to the medical care personnel.
8. A tic disorder condition monitoring system based on visual and speech fusion techniques for carrying out the method of any one of claims 1 to 7, wherein said monitoring system comprises: the system comprises an image acquisition and identification module, a sound acquisition and identification module, a template matching module, a data protection module, a dynamic database module, a data analysis module, a human-computer interaction module and a remote diagnosis module; the image acquisition and recognition module and the sound acquisition and recognition module are both connected with the template matching module, and the template matching module and the data protection module are both connected with the dynamic database module;
the image acquisition and recognition module is used for acquiring image information of the monitored person in real time and analyzing and extracting the image information to obtain the nervous movement twitch information and facial movement information of the monitored person;
the voice acquisition and recognition module is used for acquiring voice information of the monitored person and extracting the nervous vocal twitch information of the monitored person;
the template matching module is used for making standby matching templates for the monitored person before the system starts monitoring, and for matching the monitoring information against the templates by similarity in real time during monitoring;
the data protection module is used for protecting personal information of a monitored person;
the dynamic database module is used for storing the acquired image information data, sound information data and twitch description information, and the real-time matching unit is connected to the dynamic database module;
the data analysis module is used for analyzing the twitch frequency and intensity data of the movement twitches and vocal twitches;
the human-computer interaction module is used for user input, display and real-time information feedback;
and the remote diagnosis module is used for initiating consultation request information to the consultation platform of the attending doctor.
9. The vision and speech fusion technology-based tic disorder disease monitoring system according to claim 8, wherein the template matching module comprises a template making unit and a real-time matching unit which are connected in sequence, and the image acquisition and recognition module and the sound acquisition and recognition module are both connected with the dynamic database module through the real-time matching unit.
10. The system for monitoring the condition of tic disorder disease based on vision and voice fusion technology as claimed in claim 9, wherein the image acquisition and recognition module comprises an image acquisition unit, a key limb part and five sense organs recognition unit and a limb and facial movement detection unit which are connected in sequence, wherein the limb and facial movement detection unit is connected with the real-time matching unit; the voice acquisition and identification module comprises a voice acquisition unit and a signal frequency screening and analyzing unit which are sequentially connected, and the signal frequency screening and analyzing unit is connected with the real-time matching unit; the data protection module comprises an identity authentication unit, an access control unit and a data encryption unit which are mutually interacted.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010729476.4A CN111870253A (en) | 2020-07-27 | 2020-07-27 | Method and system for monitoring condition of tic disorder disease based on vision and voice fusion technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111870253A true CN111870253A (en) | 2020-11-03 |
Family
ID=73201622
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010055205A1 (en) * | 2008-11-11 | 2010-05-20 | Reijo Kortesalmi | Method, system and computer program for monitoring a person |
US20100169409A1 (en) * | 2008-08-04 | 2010-07-01 | Fallon Joan M | Systems and methods employing remote data gathering and monitoring for diagnosing, staging, and treatment of parkinsons disease, movement and neurological disorders, and chronic pain |
US20180366143A1 (en) * | 2017-06-19 | 2018-12-20 | International Business Machines Corporation | Sentiment analysis of mental health disorder symptoms |
US10235998B1 (en) * | 2018-02-28 | 2019-03-19 | Karen Elaine Khaleghi | Health monitoring system and appliance |
US20190110754A1 (en) * | 2017-10-17 | 2019-04-18 | Satish Rao | Machine learning based system for identifying and monitoring neurological disorders |
WO2019081915A1 (en) * | 2017-10-24 | 2019-05-02 | Cambridge Cognition Limited | System and method for assessing physiological state |
CN109919712A (en) * | 2019-01-30 | 2019-06-21 | 上海市精神卫生中心(上海市心理咨询培训中心) | Neurodevelopmental disorder shopping training system and its training method |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113990494A (en) * | 2021-12-24 | 2022-01-28 | 浙江大学 | Tic disorder auxiliary screening system based on video data |
CN113990494B (en) * | 2021-12-24 | 2022-03-25 | 浙江大学 | Tic disorder auxiliary screening system based on video data |
WO2023116736A1 (en) * | 2021-12-24 | 2023-06-29 | 浙江大学 | Video-data-based auxiliary screening system for tourette syndrome |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20201103 |