A hybrid deep neural network for classification of schizophrenia using EEG Data

Sun, Jie; Cao, Rui; Zhou, Mengni; Hussain, Waqar; Wang, Bin; Xue, Jiayue; Xiang, Jie

doi:10.1038/s41598-021-83350-6

Download PDF

Article
Open access
Published: 25 February 2021

A hybrid deep neural network for classification of schizophrenia using EEG Data

Jie Sun¹,
Rui Cao²,
Mengni Zhou³,
Waqar Hussain¹,
Bin Wang¹,
Jiayue Xue¹ &
…
Jie Xiang¹

Scientific Reports volume 11, Article number: 4706 (2021) Cite this article

11k Accesses
56 Citations
1 Altmetric
Metrics details

Subjects

Abstract

Schizophrenia is a serious mental illness that causes great harm to patients, so timely and accurate detection is essential. This study aimed to identify a better feature to represent electroencephalography (EEG) signals and improve the classification accuracy of patients with schizophrenia and healthy controls by using EEG signals. Our research method involves two steps. First, the EEG time series is preprocessed, and the extracted time-domain and frequency-domain features are transformed into a sequence of red–green–blue (RGB) images that carry spatial information. Second, we construct hybrid deep neural networks (DNNs) that combine convolution neural networks and long short-term memory to address RGB images to classify schizophrenic patients and healthy controls. The results show that the fuzzy entropy (FuzzyEn) feature is more significant than the fast Fourier transform (FFT) feature in brain topography. The deep learning (DL) method that we propose achieves an average accuracy of 99.22% with FuzzyEn and an average accuracy of 96.34% with FFT. These results show that the best effect is to extract fuzzy features as input features from EEG time series and then use a hybrid DNN for classification. Compared with the most advanced methods in this field, significant improvements have been achieved.

A neural speech decoding framework leveraging deep learning and speech synthesis

Article Open access 08 April 2024

Xupeng Chen, Ran Wang, … Adeen Flinker

EEG is better left alone

Article Open access 09 February 2023

Arnaud Delorme

Rhythmicity of neuronal oscillations delineates their cortical and spectral architecture

Article Open access 03 April 2024

Vladislav Myrov, Felix Siebenhühner, … J. Matias Palva

Introduction

Schizophrenia is a mental disorder from which 1% of the global population suffers¹. In a clinic, doctors directly judge schizophrenia by electroencephalography (EEG). Although this method has a certain effect, it needs a substantial amount of time and energy and is not suitable for a large number of accurate diagnoses of schizophrenia. Researchers subsequently introduced a model using a computer, which reduced the workload and accelerated the diagnostic speed of EEG.

EEG is a low-cost and noninvasive measurement network tool and an effective tool for recording brain activity ². In recent years, EEG has been extensively utilized in the research and diagnosis of various nervous system diseases, including epilepsy³, Alzheimer’s disease (AD)⁴ and schizophrenia⁵. EEG signals show the complex information in the brain, which has a high dimension and contains a considerable amount of information, and are difficult to analyze directly. Therefore, some auxiliary means are necessary to extract useful information about the brain. Feature extraction is an effective method for studying EEG data. By extracting some useful features from a large quantity of original signals, the purpose of reducing feature dimensions is achieved, and the physical meaning of individual features is not destroyed during this process. For schizophrenia, some feature extraction methods (such as those employed for time-domain features⁶ and frequency-domain features⁷) have been proposed to quantify EEG signals for studying state changes in the brain. Currently, whether time-domain features or frequency-domain features can more effectively distinguish brain differences between patients with schizophrenia and healthy controls remains ambiguous. Therefore, one of the purposes of this study is to compare the ability of different features to extract EEG signals.

Machine learning (ML) can be used to develop computer-aided diagnostic tools for clinical applications and explore the pathophysiological mechanisms of diseases. ML has revolutionized the field of schizophrenia by providing a tool to solve the high complexity of EEG signals. In the past few years, traditional ML technology (that is, a non-deep learning (DL) algorithm) has been the only method of choice in EEG analysis and has been combined with various feature extraction methods ^8,9. In a relatively new development, DL algorithms have been extensively applied in medical image and signal processing and have shown high research potential. In most cases, their performance exceeds traditional machine learning techniques¹⁰. In the process of disease diagnosis and classification, an increasing number of researchers have applied DL to the field of EEG to study mental diseases^11,12,13,14. We performed a search on Web of Science and PubMed using the following group of keywords: “schizophrenia” AND “EEG” AND (“machine learning” OR “deep learning”). References from 2000 until 2019 were utilized for further analysis; the accuracy of these articles is reflected in Table 1. In addition, an increasing number of researchers have employed hybrid structures to design neural network structures. Specifically, the convolutional neural network (CNN) is used for learning task-related features, processing pictures, and mining interchannel correlation from frames via designed convolutional filters. Long-short-term memory (LSTM) networks are composed of recurrent networks that include memory to model temporal dependencies in time series problems. This approach gives us a way to structure our research.

Table 1 List of published works on schizophrenia classification using EEG signals in recent years.

Full size table

In this study, we propose DL algorithms for use in the analysis of EEG signals for schizophrenia research to improve the classification accuracy. We first divide the EEG signal into three bands and extract different domain features from each band. These bands are then constructed to input red–green–blue (RGB) images into our network. Furthermore, we construct a hybrid DL network that integrates a CNN and LSTM for processing the EEG-based schizophrenia classification and obtains a high classification accuracy.

This article is organized as follows: “New method” describes the proposed method and the DL structure, and “Results” introduces the specific baseline method. The data acquisition method is presented in “Comparison with Different Deep Learning Models”. “Discussion” describe the experimental results that are achieved and present a discussion, respectively. The conclusions are detailed in “Conclusions”.

New method

EEG signals are nonstationary signals that contain complex brain activity information, but some of these features cannot be estimated from these signals. To preserve the continuous time relationship and internal characteristics of the EEG time series, we divide each trial into 6 time windows (each time window is 400 ms, and adjacent time windows overlap by 200 ms). The EEG signals contain multiple frequency characteristics. Based on previous research, the frequency band power of the EEG in the θ, α, and β bands of schizophrenic patients is different than that of healthy controls ^15,16,17. Therefore, we divide the whole frequency spectrum into three sub-bands: theta (4–7 Hz), alpha (8–13 Hz), and beta (14–30 Hz).

Participants

Fifty-four patients with schizophrenia [36 male patients and 18 female patients, with a mean age = (37.80 ± 1.34)] and fifty-five healthy controls [31 male controls and 24 female controls, with a mean age = 41.00 ± 1.59] were included in this study. The patients were recruited from Huilongguan Hospital in Beijing, China. The schizophrenic patients were diagnosed according to the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) and World Health Organization (ICD-10, 10th revision of the International Classification of Diseases) criteria for a lifetime diagnosis of schizophrenia or schizophrenia spectrum disorder and were recruited from consecutive admissions to a psychiatric hospital. These patients had been treated with stable doses of antipsychotic medications. In addition, the healthy controls did not have any history of mental illness or drug abuse. This study complies with the Code of Ethics of the Declaration of Helsinki. The study protocol was approved by Beijing Huilongguan Hospital. All the participants provided written informed consent as approved by the institutional review board. The full name of the approving committee is Beijing Huilongguan Hospital' s ethics committee, which affirmed its approval of the study.

The characteristics of all the participants, including age, sex, illness course and age at disease onset, are shown in Table 2. Each characteristic is averaged, and the standard error (SE) is shown in parentheses. No statistically significant difference in the ages of the healthy controls and schizophrenia patients (p > 0.05) was obtained.

Table 2 Comparison of the demographic characteristics between healthy controls and schizophrenic patients.

Full size table

Feature extraction

Accurately extracting EEG signal features is not only challenging but also an essential step in classification because this extraction determines the classification accuracy. The EEG signals of schizophrenia can be extracted by time-domain feature methods and frequency-domain feature methods.

Time-domain feature extraction methods study EEG signals using variations in signal time series. The complexity of EEG reflects the irregularity or unpredictability of brain activity. With the continuous advancement and development of nonlinear theory, many researchers are applying nonlinear analysis methods to EEG data analysis. Entropy is a nonlinear analysis method that can be used to measure the complexity. Entropy is the most commonly employed feature index among the time-domain features and is extensively employed in the diagnosis of diseases. Among the commonly employed entropies, fuzzy entropy (FuzzyEn) was developed based on other entropies. Compared with other entropies, such as information entropy, sample entropy and FuzzyEn have the advantages of excellent robustness and strong antinoise ability, and the algorithm complexity is lower. The entropy value measured by fuzzy entropy is continuously stable and less sensitive to the noise of EEG data, which renders it more suitable for analyzing chaotic signals. Previous studies have proven that the ability of FuzzyEn to detect and recognize signals is superior to the ability of other entropies for both epilepsy¹⁸ and schizophrenia¹⁹.

The frequency-domain feature extraction method primarily studies the EEG signal by converting the original time-domain signal into the frequency-domain signal, which reflects the relationship between the frequency and its corresponding amplitude and can mine the deeper features of signals. Among the common frequency-domain features, the Fourier transform (FT) is extensively applied. Compared with other frequency-domain features, the FT has a low computational cost and can be easily implemented. In addition, the Fourier transform requires that the signals in the frequency domain are stable EEG samples. This type of analysis is most suitable for studying EEG signals. Signals are generally continuous, but computers cannot process continuous signals, so only continuous signals are discretized. The discrete Fourier transform (DFT) reflects the discrete form of the FT in the time and frequency domains. The fast Fourier transform (FFT) is essentially a simple DFT algorithm. In previous studies, many researchers have preferred the FFT when processing EEG signals for frequency-domain features²⁰.

According to the current research results in related fields, FuzzyEn and FFT are typical algorithms for time- and frequency-domain features, and both feature extraction methods have been proven to achieve excellent results and are extensively utilized in various biological signal research. In this study, to select better features to represent EEG signals, we compare the two features that are extensively applied in the time and frequency domains: FuzzyEn and FFT.

Fuzzy entropy

FuzzyEn is a nonlinear indicator to evaluate the occurrence probability of newly generated patterns based on fuzzy theory. In 2007, Chen et al.²¹, based on the sample entropy (SampEn) algorithm, proposed a new algorithm for measuring the complexity of time series-fuzzy entropy. By blurring the similarity measurement formula, the limitation of SampEn is eliminated^22,23.

The FuzzyEn algorithm is described as follows:

(1)
For a time series of N length, the algorithm is expressed as [u(1),u(2),⋯,u(N)].
(2)
Carry out phase space reconstruction of the original time series and define the dimension $\mathrm{m}(\mathrm{m}\le \mathrm{N}-2)$ of phase space. After reconstruction, as shown in formula (1),
$$X_{i}^{m} = \left\{ {u(1,u\left( {i + 1} \right), \ldots ,u(i + m - 1))} \right\}$$
(1)
$i=1,2,\dots ,N-m+1$, ${U}_{0}(i)$ is the average, and the formula is shown in (2)
$$U_{0} \left( i \right) = \frac{1}{m}\mathop \sum \limits_{j = 0}^{m - 1} u(i + j)$$
(2)
(3)
The distance ${\mathrm{d}}_{\mathrm{ij}}^{\mathrm{m}}$ is defined as the maximum difference between the corresponding elements of vector ${\mathrm{X}}_{\mathrm{i}}^{\mathrm{m}}$ and vector ${\mathrm{X}}_{\mathrm{j}}^{\mathrm{m}}$, that is,
$$d_{ij}^{m} = d\left[ {X_{i}^{m} ,X_{j}^{m} } \right] = \mathop {\max }\limits_{p = 1,2, \ldots ,m} \left( {\left| {u\left( {i + p - 1} \right) - u_{0} \left( i \right)} \right| - \left| {u\left( {j + p - 1} \right) - u_{0} \left( j \right)} \right|} \right)\quad (i,j = 1,2, \ldots ,N - m + 1,j \ne i$$
(3)
(4)
The similarity between vector ${\mathrm{X}}_{\mathrm{i}}^{\mathrm{m}}$ and vector ${\mathrm{X}}_{\mathrm{j}}^{\mathrm{m}}$ is defined by the fuzzy membership function $({\mathrm{d}}_{\mathrm{ij}}^{\mathrm{m}},\mathrm{n},\mathrm{r})$, as shown in formula (4):
$$D_{ij}^{m} = \mu \left( {d_{ij}^{m} ,n,r} \right) = exp\left( { - \frac{{\left( {d_{ij}^{m} } \right)^{n} }}{r}} \right)$$
(4)
$\upmu \left({\mathrm{d}}_{\mathrm{ij}}^{\mathrm{m}},\mathrm{n},\mathrm{r}\right)$ are exponential functions; n and r are the gradient and the width, respectively, of the exponential functions.
(5)
${\mathrm{\varnothing }}^{\mathrm{m}}\left(\mathrm{n},\mathrm{r}\right)$ is shown in formula (5):
$$\emptyset^{m} \left( {n,r} \right) = \frac{1}{N - m}\mathop \sum \limits_{i = 1}^{N - m} \left[ {\frac{1}{N - m - 1}\mathop \sum \limits_{j = 1,j \ne i}^{N - m} D_{ij}^{m} } \right]$$
(5)
(6)
By adding dimension m + 1, the ${\mathrm{\varnothing }}^{\mathrm{m}+1}\left(\mathrm{n},\mathrm{r}\right)$ function is obtained:
$$\emptyset^{m + 1} \left( {n,r} \right) = \frac{1}{N - m}\mathop \sum \limits_{i = 1}^{N - m} \left[ {\frac{1}{N - m - 1}\mathop \sum \limits_{j = 1,j \ne i}^{N - m} D_{ij}^{m + 1} } \right]$$
(6)
(7)
The FuzzyEn is
$$FuzzyEn\left( {m,n,r} \right) = \mathop {lim}\limits_{N \to \infty } \left[ {ln \Phi^{m} \left( {n,r} \right) - ln \Phi^{m + 1} \left( {n,r} \right)} \right]$$
(7)
(8)
However, the length of the time series N is limited in the actual operation, and the FuzzyEn is estimated as follows:
$$FuzzyEn\left( {m,r,N} \right) = ln \Phi^{m} \left( {n,r} \right) - ln \Phi^{m + 1} \left( {n,r} \right)$$
(8)

The similarity tolerance limit r and the dimension m of the phase space reconstruction parameters are the main parameters in this algorithm, and r represents the width of the boundary of the exponential function in practical application. In general, the larger is the r value, the greater is the amount of information that is lost, the smaller is the r value, and the more sensitive the result to noise will be. The r value is usually 0.1–0.25 times the standard deviation (SD) of the original time series, while the m value is usually 1 or 2. The parameters r = 0.25 and m = 2 are utilized in this study.

Fast Fourier transform

The FFT algorithm is an improved version of the DFT algorithm that involves fast implementation of the DFT method²⁴. According to previous research ²⁵, the DFT execution time is higher than the FFT execution time. Compared with the DFT algorithm, the FFT algorithm can obtain faster results when analyzing EEG signals. In previous studies, the FFT algorithm has been employed as a frequency-domain signal to extract features of various neurological disorders, including epileptic seizures²⁶ and AD²⁷.

The FFT algorithm is described as follows:

(1)
The number of sequence points is $\mathrm{N}={2}^{\mathrm{M}}$, and M is an integer; then, the DFT of $\mathrm{x}(\mathrm{n})$ is expressed as follows:
$$X\left( k \right) = \mathop \sum \limits_{n = 0}^{N - 1} x(n)e^{ - j2\pi nk/N} = \mathop \sum \limits_{n = 0}^{N - 1} x\left( n \right)W_{N}^{nk} ,\quad 0 \le k \le N - 1.$$
(9)
(2)
The DFT operation of N points is decomposed into two groups of DFT operations of $\mathrm{N}/2$ points, that is, $\mathrm{x}(\mathrm{n})$ is decomposed into two groups: the first group is the even term; the second group is the odd term. The decomposition process is shown in formula (10):
$$\begin{aligned} X\left( k \right) & = \mathop \sum \limits_{n = 0}^{N/2 - 1} x\left( n \right)W_{N}^{nk} + \mathop \sum \limits_{{n = \frac{N}{2}}}^{N - 1} x\left( n \right)W_{N}^{nk} \\ & = \mathop \sum \limits_{n = 0}^{N/2 - 1} x\left( n \right)W_{N}^{nk} + \mathop \sum \limits_{n = 0}^{N/2 - 1} x\left( {n + N/2} \right)W_{N}^{{\left( {n + N/2} \right)k}} \\ & = \mathop \sum \limits_{n = 0}^{N/2 - 1} \left[ {x\left( n \right) + W_{N}^{nk/2} x\left( {n + N/2} \right)} \right]W_{N}^{nk} \\ & \quad (W_{N}^{nk/2} = ( - 1)^{k} ,\;\;\;k = 0,1, \ldots ,N - 1) \\ \end{aligned}$$
(10)
(3)
Decompose $\mathrm{x}(\mathrm{k})$ into even and odd groups.

If k takes an even number, when $\mathrm{k}=2\mathrm{r},\mathrm{r}=\mathrm{0,1},2,\dots ,\mathrm{N}/2-1$,
$$X\left( {2r} \right) = \mathop \sum \limits_{n = 0}^{N/2 - 1} \left[ {x\left( n \right) + x\left( {n + \frac{N}{2}} \right)} \right]W_{N/2}^{nk}$$
(11)
If k takes an odd number, when $\mathrm{k}=2\mathrm{r}+1,\mathrm{r}=\mathrm{0,1},2,\dots ,\mathrm{N}/2-1$,
$$X\left( {2r + 1} \right) = \mathop \sum \limits_{n = 0}^{N/2 - 1} \left[ {x\left( n \right) - x\left( {n + \frac{N}{2}} \right)} \right]W_{N/2}^{nr} W_{N}^{n}$$
(12)
(4)
If ${\text{g}}\left({\text{n}}\right)= \text{x} \left({\text{n}}\right)\text{+x(n+N/2)}$
$$h\left( n \right) = \left[ {x\left( n \right) - x(n + N/2)} \right]W_{N}^{n} ,\;\;n = 0,1,2 \ldots ,\frac{N}{2} - 1,\;\;n = 0,1, \ldots \frac{N}{2} - 1$$
$$X\left( {2r} \right) = \mathop \sum \limits_{n = 0}^{N/2 - 1} g\left( n \right)W_{N/2}^{nr}$$
(13)
$$X\left( {2r + 1} \right) = \mathop \sum \limits_{n = 0}^{N/2 - 1} h\left( n \right)W_{N/2}^{nr}$$
(14)

Based on these calculations, the DFT of one N point can be decomposed into two DFTs according to parity because the previously mentioned N is even, and the DFT can be further decomposed to the first decomposition. The DFT of one N point can also be decomposed into two arrays according to parity. From 1 to 2, and from 2 to 4… According to this rule, the DFT can be decomposed M times into two points of addition and subtraction. This kind of process forms the butterfly algorithm of the FT. In this study, the FFT uses 200 sampling points, and the EEG data frequency is 500 Hz.

Making images

In the previous section, we introduced the features of FuzzyEn and FFT for different applications. The standard approach in EEG data analysis is to form a feature vector of all electrodes. However, this approach disregards the spatial, spectral, and temporal structure of the data. Conversely, to maintain the spatial structure, we recommend converting the measurement results into a two-dimensional (2D) image and using multiple color channels to represent the spectral dimension.

The EEG electrodes are distributed in the three-dimensional space of the cerebral cortex sphere. To convert the spatial distribution of the electrodes into a 2D image and maintain the relative distance between adjacent electrodes, we project the electrodes from the position in three-dimensional space onto the two-dimensional surface. To ensure that the distance between all points and the center point is proportional, we use the azimuthal equidistant projection (AEP) in surveying and mapping applications²⁸. In our example, the sphere can be applied to approximate the shape of the head covering or it can be used to calculate the projection on the two-dimensional surface, where the head apex is tangent to the electrode position. In addition, the use of isometric projection methods helps to interpret image and feature map visualization data and classify cognitive load levels better than standard nonspatial methods²⁹. When applying AEP to the three-dimensional electrode position, we obtained the two-dimensional projection position of the electrode (Fig. 1). The spatial distribution of cortical activity is represented by the width and height of the image. We apply the Clough-Tocher scheme³⁰ to interpolate the scattered power measurements over the scalp and estimate the values between the electrodes over a 32 × 32 mesh. The choice of 32 for the image size is a trade-off between the signal resolution and the computational cost.

Figure 2 illustrates the process of generating RGB images from the EEG data. First, we divide the EEG time series into three sub-bands (theta, alpha and beta) and calculate the FuzzyEn or FFT values for the three frequency bands. Second, the AEP process is repeated for the theta, alpha, and beta bands to produce three topographical activity maps, which are then merged to form an image with three (color) channels. Last, these three (color) channels are fed into the hybrid DNN for representation learning and classification.

Construction of the hybrid DNN

We propose a model based on hybrid DL to distinguish schizophrenic patients from healthy controls (Fig. 3). First, we divided each experiment into 6 time windows. In this way, a single RGB image can be constructed from the EEG signal in a time window. When the time window slides, an RGB image sequence can be obtained from the EEG signals. Second, we construct the hybrid DNN, which includes two types of DL structures: a CNN and an LSTM unit. The CNN unit processes images and extracts features from the RGB images. The LSTM unit is an improved recurrent neural network (RNN) structure that models context information with long-term sequences of arbitrary length. Third, the useful information obtained by the network is entered into the fully connected (FC) layer. Last, the model ends with a softmax (SF) layer to achieve the binary classification result (i.e., schizophrenia or normal).

CNN

CNNs are a subset of DL networks that have received widespread attention in recent years and are often applied for image recognition. The CNN architecture consists of three types of layers: (1) convolutional, (2) pooling, and (3) FC layers ³¹.

(1)
Convolutional layer: The convolutional layer is composed of a filter (kernel), which passes the EEG image and outputs a feature map. The convolution operation is expressed as
$$y = \frac{n + 2p - f}{s} + 1$$
(15)
where n, p, f and s denote the matrix of the input picture, the padding size, the matrix of the filter, and the stride, respectively. The size of the output matrix is $y\times y$. Next, the activation function follows the convolution layer and provides a nonlinear attribute structure, which gives the DNN the learning ability to conduct hierarchical nonlinear mapping. The most commonly employed activation function in CNNs is referred to as a rectified linear unit (ReLU, $f\left(X\right)=max(0,x)$), which provides better performance in terms of generalization and learning time.
(2)
Pooling layer: The pooling layer decreases the size of the feature map by the average (average pooling) or maximum (max pooling) while preserving the significant features, which reduces the computational intensity and prevents overfitting. The size of the output matrix is $y\times y$,
$$y = \frac{n - f}{s} + 1$$
(16)
where n denotes the matrix of the input picture, f is the matrix of the filter, and s is the number of strides.
(3)
FC layer: The convolution layer (+ ReLU) and pooling layers are followed by one or more FC layers. The following equation is used to connect each neuron in this structural layer with each neuron in the next layer:
$$y_{j} = \sum w_{j} *x_{i} + b_{j}$$
(17)
where w and b represent the weight and the deviation, respectively; x represents the output of the previous layer; y represents the output of the current layer; i represents the previous layer; and j represents the current layer. The output of the last FC layer is input into the SF function, and the class is predicted by determining the probability that each EEG signal indicates a normal person or a schizophrenic patient.where w and b represent the weight and the deviation, respectively; x represents the output of the previous layer; y represents the output of the current layer; i represents the previous layer; and j represents the current layer. The output of the last FC layer is input into the SF function, and the class is predicted by determining the probability that each EEG signal indicates a normal person or a schizophrenic patient.

We compare four CNN models with different depths and configurations in Table 3. The convolutional layer parameters are denoted by Conv < Layer i (1, 2, and 3) >—< number of kernels > , where Layer i (1, 2, 3) represents the ith convolutional layer. CNN structure A only involves two convolutional layers (Conv1–32) that are superimposed, followed by a max pooling layer (Max-pooling1). Compared to configuration A, configuration B adds two more convolutional layers (Conv2–64), which are followed by another max pooling layer (Max-pooling2). Compared to configuration B, configuration C adds a convolutional layer (Conv3–128), followed by another maximum pool layer (Max-pooling3). Compared to configuration C, Configuration D starts with 4 layers of Conv1–32 convolutional layers instead of 2 layers. An FC layer with 512 nodes (FC-512) is added to the architecture.

Table 3 Configuration information of different CNN models. The convolutional layer parameters are denoted as Conv < Layer i (1, 2, 3) >—< number of kernels > .

Full size table

We adopt an architecture that mimics a visual geometry group (VGG) network that is used in image classification tasks³². In our research, the input three color channels of the RGB image size of the networks are $32\times 32$ pixels. All the convolutional layers use small $3\times 3$ receptive fields, a stride of 1 pixel, a padding of 1 pixel and an ReLU activation function. This work uses the largest pool for subsequent operations. The maximum value in each feature map is selected to reduce the number of output neurons, which is a process that is performed over a $2\times 2$ window with a stride of 2 pixels. The main parameters of the CNN model are listed in Table 4.

Table 4 Main parameters of the CNN model.

Full size table

LSTM

LSTM is an improvement over the RNNs, which have been previously employed in EEG analyses^33,34. Compared to traditional RNNs, the innovation of LSTM networks is the addition of three control units (“cells”): (1) a forget gate, (2) an input gate and (3) an output gate. The structure of a typical LSTM unit is shown in Fig. 4, and the mechanisms of the gates are described as follows:

Forget gate: The gate decides what previous information should be forgotten. The current step’s input ${\mathrm{x}}_{\mathrm{t}}$ and the hidden state ${\mathrm{h}}_{\mathrm{t}-1}$ from the prior unit are concatenated into a new vector. Multiplying by the weight parameter ${\mathrm{W}}_{\mathrm{f}}$ of the gate, every element’s value of the output vector ${\mathrm{f}}_{\mathrm{t}}$ is scaled from 0 to 1 via the elementwise sigmoidal operation σ. A ‘0′ element enables the corresponding information in ${\mathrm{C}}_{\mathrm{t}-1}$ to be eliminated, while a ‘1′ means that the corresponding information is allowed to be passed through. The output ${\mathrm{f}}_{\mathrm{t}}$ of the gate is formalized as Eq. (18).

$$f_{t} = \sigma \left( {W_{f} \cdot \left\lceil { h_{t - 1}, x_{t}}\right\rceil + b_{f} } \right)$$

(18)

Input gate: The gate determines how much of the input ${x}_{t}$ of the network is saved to the unit state ${\mathrm{C}}_{\mathrm{t}}$. The fulfillment of the input gate’s function requires cooperation between two parallel layers. The tangent layer outputs candidate information ${\mathrm{C}}_{\mathrm{t}}$ for selection, while the sigmoidal layer acts as the forget gate and decides what candidate information will be selected by outputting the decision vector ${\mathrm{i}}_{\mathrm{t}}$. After the elementwise multiplication of the candidate information by the decision vector ${\mathrm{C}}_{\mathrm{t}}\times {\mathrm{i}}_{\mathrm{t}}$ is performed, the final update information that should be added to the cell state is determined. The function of the two layers is formalized by Eqs. (19) and (20).

$$i_{t} = \sigma \left( {W_{i} \cdot \left\lceil {h_{t - 1}, x_{t}}\right\rceil + b_{i} } \right)$$

(19)

$$\bar{C}_{t} = \tan \left( {W_{c} \cdot \left\lceil {h_{t - 1}, x_{t}}\right\rceil + b_{c} } \right)$$

(20)

Therefore, the cell state ${\mathrm{C}}_{\mathrm{t}}$ of the current chain is a combination of the reserved historical information of ${\mathrm{C}}_{\mathrm{t}-1}$, and the updating information selected from ${\mathrm{C}}_{\mathrm{t}}$ (Eq. 21).

$$C_{t} = C_{t - 1} \times f_{t} + \bar{C}_{t} \times { }i_{t}$$

(21)

Output gate: The gate decides which hidden state ${\mathrm{h}}_{\mathrm{t}}$ in the current chain to output via multiplication of the decision vector ${\mathrm{o}}_{\mathrm{t}}$ by the candidate information selected from ${\mathrm{C}}_{\mathrm{t}}$, as shown in Eqs. (22) and (23).

$$o_{t} = \sigma \left( {W_{o} \cdot \left\lceil {h_{t - 1}, x_{t}}\right\rceil + b_{o} } \right)$$

(22)

$$h_{t} = \tan \left( {C_{t} } \right) \times { }o_{t}$$

(23)

Our RNN, which is based on the LSTM structure, learns contextual time series information from the feature sequences extracted from the CNN and then determines the overall classification of schizophrenia or normal according to the LSTM output in each time step. Because brain activity is a dynamic process, changes between trials may contain additional information about potential mental states.

Training and testing

We train the hybrid DNN with the optimization algorithm adaptive moment estimation (Adam)³⁵, a learning factor of $1\times {10}^{-3}$ and decay rates of the first moment and second moment of 0.9 and 0.999, respectively. This work uses the Adam optimizer to update the parameters of the proposed network structure. Note that the Adam optimizer can make the network converge at a faster speed to improve the efficiency of the training process. The batch size is set to 32 to update the parameters of the proposed recurrent-convolutional network. To avoid overfitting and improve the generalization ability, dropout³⁶ (set to 0.5) is applied to the FC layers. The network converges after approximately 33,660 iterations and 180 epochs.

In total, 200 training epochs were run. The epoch refers to the iteration over the entire training set. This study uses a tenfold cross-validation³⁷method. First, randomly divide the EEG signals into ten equal sets. Nine groups are used as training models, and one group is used to test the system performance. By transmission testing and training of the data set, this strategy was repeated ten times. The accuracy value reported in this study is the average value obtained from ten evaluations.

Statistical test

A statistical test was performed with SPSS 16.0. For the group comparisons of the demographic and clinical variables, we used chi-square tests for categorical variables and independent-sample t-tests for continuous variables. To explore the differences among the conditions, a paired t-test was computed. All p values were two-tailed, and the significance level was set to p < 0.05 and corrected using the false discovery rate (FDR)³⁸ and Bonferroni correction ³⁹. Pearson's r coefficients were computed to investigate the correlations.

Baseline methods

Our study compared our method with various commonly employed classifiers, including the support vector machine (SVM), K-nearest neighbor (KNN) and logistics regression (LR) classifiers. All baseline methods are compared with our method using a tenfold cross-validation method. Here, we briefly describe the details and parameters used in these methods.

SVM: An SVM is based on statistical learning theory and uses kernel functions to transform linearly inseparable problems in low-dimensional space into linearly separable problems in high-dimensional space. The SVM hyperparameters, which consist of the regularization penalty parameter (C) and the radial basis function (RBF) kernel SD ($\upgamma =1/\upsigma$) inverse, are selected.

KNN: The KNN is a supervised learning algorithm that uses k-nearest examples to classify data labels. The majority vote on the sample neighbors determines the label of the sample. The Euclidean metric is used to measure distance.

LR: LR is used to describe the relationship between the independent variable X and the dependent variable Y and predict the dependent variable Y. The dependent variable Y is a real number between 0 and 1 that represents the probability of obtaining two results in the binary classification. The LR selects the optimal regularization parameter C and searches the log range of [${10}^{-2},{10}^{3}$].

Results

Data recording and preprocessing

The EEG data were recorded using a 64-channel EEG system produced by Brain Products, Germany, according to the international 10–20 system. The impedances were kept below 5 kΩ, and the sampling rate was 500 Hz. During recording, each participant was seated upright and asked to remain quiet and relaxed with open eyes in an acoustically and electrically shielded room. In this experiment, the recording time lasted 2 min. In addition, the effects of brain activity and psychological factors were disregarded.

The whole data preprocessing and analysis procedure was implemented on BrainVision Analyzer 2.0. First, new referencing was used to select the available electrodes (The following electrodes were deleted: HEOGL, HEOGR, VEOGL, and VEOGU.). Second, bandpass filtering was used to obtain the frequency bands from 0.5 to 50 Hz with a slope of 24 dB/oct. The EEG recordings were divided into 1,400-ms lengths for each segmentation and 55 trials. Third, baseline correction was performed in the range for the mean value calculation beginning at 0 ms and ending at 1400 ms to eliminate EEG noise caused by spontaneous EEG activity. Fourth, ocular correction was utilized to correct the signal interference caused by blinking or eye movements. Artifact rejection was then used to remove the false signals produced by the equipment or the action of the subject. Last, the EEG data of the 60 electrodes were exported for further analysis.

Comparison of the feature

For the group analysis, a relation analysis based on the direct contrast between the schizophrenia patients and the healthy controls was carried out to generate the brain topographic map for each group (significance level = 0.05). As shown in Fig. 5, brain topographic maps of the FuzzyEn and FFT features are displayed for the entire time series.

The FuzzyEn maps show that the FuzzyEn value of the patients with schizophrenia is significantly larger than that of the normal controls (corrected p < 0.05), and large differences exist in the frontal region compared with other regions. The FFT maps show that the FFT values for the schizophrenia patients are similar to those for the normal control group (corrected p < 0.05), and no significant differences exist between the regions of the brain from the FFT. In addition, we applied a repeated-measures analysis of variance (ANOVA) as the statistical tool. Depending on the ANOVA results, we evaluated the statistical significance of the differences in the features between the groups of subjects and observed a significant group effect [F = 334.208; p < 0.001] in FuzzyEn and a small group effect [F = 9.595; p < 0.01] in FFT.

Classification results

We present our classification results derived by extracting the FuzzyEn and FFT features from the 6 time windows. The purpose was to seek the best-performing models for the generated images. The accuracies for the validation set and the test set of our proposed methods and baseline methods are reported in Table 5.

Table 5 Comparison of the classification results.

Full size table

We discovered that the accuracy of our proposed methods for the validation set is higher than that for the test set because the test set was selected from the model with the highest accuracy for the validation set. Considering the randomness of the data, the models with the highest accuracy for the validation set may not be those with the highest accuracy for the test set, so the calculated indicators from the validation set are generally better than those from the test set.

We determine that the accuracy of the FuzzyEn feature is higher than that of the FFT feature. For the baseline methods, the FuzzyEn value of each model is higher than the FFT value in terms of the testing accuracy. For the proposed methods, the FuzzyEn value of each model is higher than the FFT value in terms of the testing accuracy and the verification accuracy. Therefore, we can conclude that the FuzzyEn features have a more substantial role in the classification results than the FFT features.

In the proposed methods, we discover that the testing accuracy is different when the convolution layer is different. The findings negate the notion that the greater is the number of layers that are included, the higher is the accuracy. In particular, using FuzzyEn features, we achieve the best performance with the B architecture (testing accuracy of 99.22%), which contains 5 convolution layers and is marginally better than the other methods.

Figure 6 shows a bar graph of the testing accuracies of the two features for all the methods. We observe that the proposed methods are superior to the baseline methods. We determined that the best result is obtained with the C < 2,2,1 > architecture and the FuzzyEn feature. The differences among the accuracy rates between the four DL methods are not statistically significant.

To further study the performance of the best-performing architecture (i.e., C architecture), we draw the accuracy and loss curves in Fig. 7. From the accuracy curve in Fig. 7A, we can observe that as the number of training epochs increases, the validation and testing accuracy show an overall upward trend. When the number of epochs of training reaches 180, the algorithm quickly converges to an ideal state, and the accuracy tends to be stable and reaches approximately 99.22%. In contrast to the accuracy curve, Fig. 7B shows that the training loss, validation loss and test loss are gradually reduced, and when the number of training epochs reaches 180, the testing loss reaches a stationary state of approximately 0.019.

Previous studies have shown that the performance of the trained model is significant for both the validation set and the test set and can be simultaneously judged by the training loss and the validation loss and reaches the stable state; thus, the base model that was trained in this experiment shows a satisfactory fit.

Comparison with different deep learning models

To quantify the importance of the 6 time windows for our results, we also applied 5 time windows and 7 time windows and retrained our network. The results show that the classification effect of using 6 time windows is better than the classification effects achieved with the other time windows for both the DL methods and the ML methods.

Table 6 compares the training times and the number of parameters among the proposed methods. We note that the more complex is the structure of the network, the longer is the training time and the higher is the number of parameters. Most network parameters are located in the last two layers (i.e., FC layer and SF layer) and require more storage and computing power during training and testing. From the obtained results, the accuracy of the C model is the highest, and the training time and number of training parameters are moderate. Therefore, we focus on the C model in the following discussion.

Table 6 Comparison of the training times and number of parameters of different structures.

Full size table

In the VGG style network, to keep the size of the output after each stack constant (filter size × number of cores), we determined the number of filters in each layer. In addition, we manually reduced the amount of data that is required to manually extract power features from EEG signals.

Discussion

Comparison of the FuzzyEn feature and FFT feature

EEG signals are complex nonlinear dynamic signals⁵ and contain the dynamic properties of brain activity⁴⁰. In this study, the eigenvalues of schizophrenic patients were higher than those of healthy controls. This finding is consistent with previous results. Fernández A et al. employed the Lempel–Ziv complexity (LZC) method to study healthy controls and patients with schizophrenia and discovered that in terms of time-domain characteristics, patients show higher performance in the entire brain complexity ⁴¹. The EEG signals of subjects with schizophrenia were more random, and therefore, had a greater approximate entropy compared to the EEG signals of healthy subjects⁴². In addition, as previously reported using multiscale entropy, the complexity of schizophrenic patients is higher than that of the control group⁴³. In the frequency-domain features, patients with schizophrenia had significantly higher theta power over the F4, F7, F8, P4 and O2 regions than healthy subjects⁴⁴. Compared with the healthy control group, schizophrenic patients have more active brain activity and are more likely to generate new EEG signal patterns⁴⁵.

The statistical test results show that both the FuzzyEn feature and the FFT feature can be used to distinguish healthy controls from patients with schizophrenia, but the FuzzyEn feature is more significant and more easily distinguished between healthy controls and patients with schizophrenia. The main differences between schizophrenic patients and healthy controls are concentrated in the frontal regions. This is consistent with previous research findings^46,47,48,49. The frontal area is mainly responsible for memory problems related to behavior regulation and cognitive perception ⁵⁰. The impairment of metacognitive function in patients with schizophrenia may be caused by the frontal region⁵¹. According to previous research on social cognition in schizophrenia⁵², the abnormality observed in this area is based on dopamine signaling to the prefrontal cortex.

In addition to the differences in brain topographic maps, the two features are also different in terms of the classification performance. For the classification results, the FuzzyEn feature performs better than the FFT feature in all methods. The EEG signals using FuzzyEn features are more effective for classification than those using FFT features. For these results, the FuzzyEn feature retains the time sequence, which adequately remembers the relative characteristics of the original signals during the sliding of the time window. The hybrid DNN, which is based on FuzzyEn features, combines the time-, frequency- and spatial-domain characteristics. Therefore, the hybrid DNN may more easily extract additional effective features based on the FuzzyEn feature than the FFT feature.

Comparison with other studies of schizophrenia

The classification problem of schizophrenia EEG signals involves extracting the discriminative features from EEG signals and then performing the classification. We compare our research with the related state-of-the-art techniques that have been developed in recent years, as shown in Table 1, which use different feature extraction and classification methods for classifying schizophrenia EEG signals.

In 2016, Bose et al. developed an SVM filter for the identification of schizophrenia based on the delta, theta, alpha, and beta bands that were extracted from EEG signals using a finite impulse response bandpass filter. In alpha power, the subject groups yielded a high classification accuracy of 83.33%. These results suggest that schizophrenic subjects can be identified by the absolute alpha ¹⁶. Johannesen et al. employed a Morlet continuous wavelet transform to extract time–frequency features in healthy communities and schizophrenic patients from EEG signals, and the SVM provided the highest classification accuracy of 87%⁵³.

In 2017, Jeong et al. developed a multimodal (audiovisual) emotion perception test. The discriminatory features were extracted using a mean subsampling technique from EEG recordings. Shrinkage linear discriminant analysis (SKLDA) can decrease the ill-conditioned covariance matrix, which provides a more accurate classification of the event-related potential (ERP), even when using an insufficient training sample size. Thus, SKLDA was employed for the classification and attained more than 98% accuracy⁵⁴. Piryatinska et al. created a low-dimensional feature space, which decomposed the EEG signals of adolescent schizophrenic and control subjects using the є-complexity of a continuous vector function. They utilized a random forest (RF) classifier and achieved an average accuracy of 85.3%⁵⁵. Chu et al. used three different types of International Affective Picture System (IAPS) pictures as visual stimuli and captured the associated brainwaves. They then employed the approximate entropy (ApEn) to extract features and classified them with an SVM. The researchers discovered that the classification accuracy of healthy controls and schizophrenic patients with obvious illness was as high as 81.5%⁵⁶.

The following year, Alimardani et al. proposed an efficient feature selection algorithm named Davies–Bouldin fast feature reduction (DB-FFR) to select the most discriminative features to enhance the classification rate. These researchers applied a modified version of the KNN classifier and achieved an 87.51% classification accuracy for the EEG features of schizophrenia patients and bipolar mood disorder (BMD) patients⁵⁷. The researchers also applied the steady-state visual evoked potential (SSVEP) of the EEG signals and extracted the power spectral densities. The feature was fed into five classifiers to characterize the EEG signals, and the KNN classifier provided the highest classification accuracy (91.30%), with the best feature set selected by the Fisher score between BMD and schizophrenic patients⁵⁸.

In 2019, Phang et al. proposed a DNN with a deep belief network (DBN) architecture for the automated classification of schizophrenia (SZ) based on the EEG effective connectivity. The structure has a multilayer architecture as an inherent feature extractor, which is able to learn hidden hierarchical representations of the complex brain network structure. These researchers employed directed connectivity (DC) based on vector autoregression (VAR), graph theory composite network (CN) metrics, and a combination of both as input features, and achieved 95% significant classification accuracy for the θ and β bands⁵⁹. In addition, for the same subjects, they applied combinations of various connectivity features as input features, including time- and frequency-domain metrics of the effective connectivity based on the VAR model and partial directed coherence (PDC), with complex network (CN) measures of network topology. The researchers designed a novel multidomain connectome CNN (MDC-CNN) based on a parallel ensemble of one dimensional (1D) and 2D CNNs to integrate the features from various domains and dimensions using different fusion strategies. The results showed that the MDC-CNN with combined connectivity features further improved the performance over single-domain CNNs and achieved a remarkable accuracy of 91.69% with a decision-level fusion⁶⁰. In the same year, Oh et al. established an eleven-layer CNN model to directly process the original EEG signals for analysis without any feature processing. The proposed model generated a classification accuracy of 98.07% using nonsample testing¹⁴.

In the comparative studies, the researchers collected the subjects themselves. Our subjects were in accordance with the usual format of the current collection: resting data for patients with small requirements, sampling rate = 500, and channels = 60. In Table 1, we also add these conditions. After observation, a unified conclusion cannot be obtained. In different studies, different information is collected for the subjects, different features are extracted, and different classifiers will affect the accuracy. Because of data limitations, the impact of the model is unknown. Given that a long time is needed to collect new data, we did not investigate it but will explore it in future studies. Therefore, we mainly focus on the accuracy of this study to show that the distinction between normal and schizophrenic patients can be achieved with a high accuracy.

It can be seen from Table 1 that most previous studies applied machine learning techniques to diagnose schizophrenia. These traditional methods are cumbersome and require feature extraction and selection before classification. In addition, these methods perform poorly when using large data sets. In some recently applied DL methods, the classification results of these studies have greatly improved compared with machine learning methods.

In this study, we propose a hybrid deep neural network model to classify schizophrenia. Compared with previous research, our experimental data are more numerous, and the amount of data is enlarged by sliding the time window. Before entering the model, we retained the brain electrodes by AEP mapping. This method has not appeared in previous research. We selected fuzzy entropy features and constructed the hybrid DNN model, which combines the advantages of a CNN and an LSTM. The classification accuracy rate reached 99.22%, and the experimental results are superior to those of previous studies. This result shows that our method provides a significant breakthrough in the classification of schizophrenia based on EEG data. In future research, the hybrid deep neural network classification method can also be applied to the diagnosis of EEG signals and other classifications of medical diseases.

Conclusions

In this study, we attempted to compare which feature of FuzzyEn and FFT is better and improve the accuracy of classification of schizophrenia in EEG signals. Our method involves two procedures. First, we convert EEG signals into a series of topology-preserving RGB images rather than standard EEG analysis techniques that disregard this spatial information. Second, we use a hybrid DNN that consists of CNN and LSTM components to address the RGB images and differentiate schizophrenic patients and healthy controls. In the hybrid structure, the CNN is used to process the RGB images and extract features from them, and an LSTM is used to structure the contextual information for long-term sequences of arbitrary length.

We compared the features of FuzzyEn and FFT and discovered that the FuzzyEn feature has a better effect than the FFT feature in terms of the classification accuracy. In addition, we determined that the maximum classification accuracy of our proposed method can reach 99.22%, which is higher than the accuracy of the baseline method. In addition, we review the latest methods of schizophrenia classification based on EEG signals. Compared with these studies, our study significantly improves the classification accuracy.

In the future, the proposed method will be trained using a more powerful graphics processing unit (GPU) to optimize the training time. A larger cohort of subjects with schizophrenia/healthy controls will be taken into account to further demonstrate and fully exploit the generalization potential of DL techniques for clinical applications.

References

Boostani, R., Sadatnezhad, K. & Sabeti, M. An efficient classifier to diagnose of schizophrenia based on the EEG signals. Expert Syst. Appl. 36, 6492–6499 (2009).
Article Google Scholar
Ahmadlou, M., Adeli, H. & Adeli, A. Fractality analysis of frontal brain in major depressive disorder. Int. J. Psychophysiol. 85, 206–211 (2012).
Article PubMed Google Scholar
Cogan, D., Birjandtalab, J., Nourani, M., Harvey, J. & Nagaraddi, V. Multi-biosignal analysis for epileptic seizure monitoring. Int. J. Neural Syst. https://doi.org/10.1142/s0129065716500313 (2017).
Article PubMed Google Scholar
Morabito, F. C. et al. A longitudinal EEG study of Alzheimer’s disease progression based on a complex network approach. Int. J. Neural Syst. https://doi.org/10.1142/s0129065715500057 (2015).
Article PubMed Google Scholar
Akar, S. A., Kara, S., Latifoğlu, F. & Bilgiç, V. Analysis of the complexity measures in the EEG of schizophrenia patients. Int. J. Neural Syst. 26, 1650008 (2016).
Article PubMed Google Scholar
Bonita, J. D. et al. Time domain measures of inter-channel EEG correlations: a comparison of linear, nonparametric and nonlinear measures. Cognit. Neurodyn. 8, 1–15 (2014).
Article CAS Google Scholar
Rozgic, V., Vitaladevuni, S. N. & Prasad, R. in IEEE International Conference on Acoustics.
Gadhoumi, K., Lina, J.-M., Mormann, F. & Gotman, J. Seizure prediction for therapeutic devices: a review. J. Neurosci. Methods 260, 270–282 (2016).
Article PubMed Google Scholar
Alotaiby, T. N., Alshebeili, S. A., Alshawi, T., Ahmad, I. & El-Samie, F. E. A. EEG seizure detection and prediction algorithms: a survey. EURASIP J. Adv. Signal Process. 2014, 183 (2014).
Article ADS Google Scholar
Min, S., Lee, B. & Yoon, S. Deep learning in bioinformatics. Brief. Bioinform. 18, 851–869 (2017).
PubMed Google Scholar
Zhou, M. et al. Epileptic seizure detection based on EEG signals and CNN. Front. Neuroinform. 12, 95 (2018).
Article PubMed PubMed Central Google Scholar
Chen, H., Song, Y. & Li, X. A deep learning framework for identifying children with ADHD using an EEG-based brain network. Neurocomputing 356, 83–96 (2019).
Article Google Scholar
Ieracitano, C., Mammone, N., Bramanti, A., Hussain, A. & Morabito, F. C. A convolutional neural network approach for classification of dementia stages based on 2D-spectral representation of EEG recordings. Neurocomputing 323, 96–107 (2019).
Article Google Scholar
Oh, S. L., Vicnesh, J., Ciaccio, E. J., Yuvaraj, R. & Acharya, U. R. Deep convolutional neural network model for automated diagnosis of schizophrenia using EEG signals. Appl. Sci. 9, 2870 (2019).
Article Google Scholar
Andersen, E., Campbell, A., Girdler, S., Duffy, K. & Belger, A. Acute stress modifies oscillatory indices of affective processing: Insight on the pathophysiology of schizophrenia spectrum disorders. Clin. Neurophysiol. 130, 214–223 (2019).
Article PubMed Google Scholar
Bose, T., Sivakumar, S. D. & Kesavamurthy, B. Identification of schizophrenia using EEG alpha band power during hyperventilation and post-hyperventilation. J. Med. Biol. Eng. 36, 901–911 (2016).
Article Google Scholar
Dierks, T. Equivalent EEG sources determined by FFT approximation in healthy subjects, schizophrenic and depressive patients. Brain Topogr. 4, 207–213 (1992).
Article CAS PubMed Google Scholar
Tibdewal, M. N., Dey, H. R., Mahadevappa, M., Ray, A. & Malokar, M. Multiple entropies performance measure for detection and localization of multi-channel epileptic EEG. Biomed. Signal Process. Control 38, 158–167 (2017).
Article Google Scholar
Xiang, J. et al. Abnormal entropy modulation of the EEG signal in patients with schizophrenia during the auditory paired-stimulus paradigm. Front. Neuroinform. 13, 4 (2019).
Article PubMed PubMed Central Google Scholar
Chan, A. M., Sun, F. T., Boto, E. H. & Wingeier, B. M. Automated seizure onset detection for accurate onset time determination in intracranial EEG. Clin. Neurophysiol. 119, 2687–2696. https://doi.org/10.1016/j.clinph.2008.08.025 (2008).
Article PubMed Google Scholar
Chen, W., Wang, Z., Xie, H. & Yu, W. Characterization of surface EMG signal based on fuzzy entropy. IEEE Trans. Neural Syst. Rehabil. Eng. 15, 266–272 (2007).
Article PubMed Google Scholar
Kosko, B. Fuzzy entropy and conditioning. Inf. Sci. 40, 165–174 (1986).
Article MathSciNet MATH Google Scholar
Cheng, H., Chen, Y. & Jiang, X. Thresholding using two-dimensional histogram and fuzzy entropy principle. IEEE Trans. Image Process. 9, 732–735 (2000).
Article ADS CAS PubMed Google Scholar
Heckbert, P. Fourier transforms and the fast Fourier transform (FFT) algorithm. Comput. Graphics 2, 15–463 (1995).
Google Scholar
Zonst, A. E. Understanding the FFT: A Tutorial on the Algorithm & Software for Laymen, Students, Technicians & Working Engineers (Citrus Press, London, 1995).
Google Scholar
AlGhayab, H. R., Li, Y., Siuly, S. & Abdulla, S. Epileptic seizures detection in EEGs blending frequency domain with information gain technique. Soft Comput. 23, 227–239 (2019).
Article Google Scholar
Durongbhan, P. et al. A dementia classification framework using frequency and time-frequency features based on EEG signals. IEEE Trans. Neural Syst. Rehabil. Eng. 27, 826–835 (2019).
Article PubMed Google Scholar
Snyder, J. P. Map Projections–A Working Manual Vol. 1395 (US Government Printing Office, Washington, 1987).
Google Scholar
Bashivan, P., Rish, I., Yeasin, M. & Codella, N. Learning representations from EEG with deep recurrent-convolutional neural networks. https://arxiv.org/abs/1511.06448 (2015).
Alfeld, P. A trivariate clough—tocher scheme for tetrahedral data. Comput. Aided Geom. Des. 1, 169–181 (1984).
Article MATH Google Scholar
Lecun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436 (2015).
Article ADS CAS PubMed Google Scholar
Simonyan, K. & Zisserman, A.. Very deep convolutional networks for large-scale image recognition. https://arxiv.org/abs/1409.1556 (2014).
Petrosian, A., Prokhorov, D., Homan, R., Dasheiff, R. & Wunsch, D. II. Recurrent neural network based prediction of epileptic seizures in intra-and extracranial EEG. Neurocomputing 30, 201–218 (2000).
Article Google Scholar
Graves, A. Supervised Sequence Labelling with Recurrent Neural Networks 5–13 (Springer, Berlin, 2012).
MATH Google Scholar
Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. https://arxiv.org/abs/1412.6980 (2014).
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
MathSciNet MATH Google Scholar
Duda, R. O., Hart, P. E. & Stork, D. G. Pattern Classification (Wiley, Hoboken, 2012).
MATH Google Scholar
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate : a practical and powerful approach to multiple testing. J. R. Stat. Soc. 57, 289–300 (1995).
MathSciNet MATH Google Scholar
Armstrong, R. A. When to use the Bonferroni correction. Ophthalic Physiol. Opt. J. Br. Coll. Ophthalmic Opt. 34, 502–508 (2015).
Article Google Scholar
Acharya, U. R. et al. A novel depression diagnosis index using nonlinear features in EEG signals. Eur. Neurol. 74, 79–83 (2015).
Article PubMed Google Scholar
Fernández, A. et al. Lempel-Ziv complexity in schizophrenia: a MEG study. Clin. Neurophysiol. 122, 2227–2235 (2011).
Article PubMed Google Scholar
Namazi, H., Aghasian, E. & Ala, T. S. Fractal-based classification of electroencephalography (EEG) signals in healthy adolescents and adolescents with symptoms of schizophrenia. Technol Health Care 27, 1–9 (2019).
Article Google Scholar
Takahashi, T. et al. Antipsychotics reverse abnormal EEG complexity in drug-naive schizophrenia: a multiscale entropy analysis. Neuroimage 51, 173–182 (2010).
Article CAS PubMed PubMed Central Google Scholar
Begić, D. et al. Quantitative electroencephalography in schizophrenia and depression. Psychiatria Danubina 23, 355–362 (2011).
PubMed Google Scholar
Zhao, Q. et al. in 2013 6th International IEEE/EMBS Conference on Neural Engineering (NER). 484–488 (IEEE).
Li, Y. et al. Abnormal EEG complexity in patients with schizophrenia and depression. Clin. Neurophysiol. 119, 1232–1241 (2008).
Article PubMed Google Scholar
Ford, J. M. et al. Out-of-synch and out-of-sorts: dysfunction of motor-sensory communication in schizophrenia. Biol. Psychiatry. 63(8), 736–743 (2008).
Article PubMed Google Scholar
Nenadic, I., Yotter, R. A., Sauer, H. & Gaser, C. Cortical surface complexity in frontal and temporal areas varies across subgroups of schizophrenia. Hum. Brain Mapp. 35, 1691–1699 (2014).
Article PubMed Google Scholar
Goldstein, M. R., Peterson, M. J., Sanguinetti, J. L., Tononi, G. & Ferrarelli, F. Topographic deficits in alpha-range resting EEG activity and steady state visual evoked responses in schizophrenia. Schizophr. Res. 168, 145–152 (2015).
Article PubMed Google Scholar
Highley, J. R. et al. Schizophrenia and the frontal lobes: post-mortem stereological study of tissue volume. Br. J. Psychiatry 178, 337–343 (2001).
Article CAS PubMed Google Scholar
Asmal, L. et al. Symptom attribution and frontal cortical thickness in first-episode schizophrenia. Early Interv. Psychiatry 12, 652–659 (2018).
Article PubMed Google Scholar
Sokunbi, M. O. et al. Nonlinear complexity analysis of brain FMRI signals in schizophrenia. PLoS ONE 9, e95146 (2014).
Article ADS PubMed PubMed Central CAS Google Scholar
Johannesen, J. K., Bi, J., Jiang, R., Kenney, J. G. & Chen, C.-M.A. Machine learning identification of EEG features predicting working memory performance in schizophrenia and healthy adults. Neuropsychiatr. Electrophysiol. 2, 3 (2016).
Article PubMed PubMed Central Google Scholar
Jeong, J. W. et al. Classifying schizotypy using an audiovisual emotion perception test and scalp electroencephalography. Front. Hum. Neurosci. 11, 450 (2017).
Article PubMed PubMed Central Google Scholar
Piryatinska, A., Darkhovsky, B. & Kaplan, A. Binary classification of multichannel-EEG records based on the ϵ-complexity of continuous vector functions. Comput. Methods Programs Biomed. 152, 131–139 (2017).
Article PubMed Google Scholar
Chu, W. L., Huang, M. W., Jian, B. L. & Cheng, K. S. Analysis of EEG entropy during visual evocation of emotion in schizophrenia. Ann. Gen. Psychiatry 16, 34 (2017).
Article PubMed PubMed Central Google Scholar
Alimardani, F. & Boostani, R. DB-FFR: a modified feature selection algorithm to improve discrimination rate between bipolar mood disorder (BMD) and schizophrenic patients. Iran. J. Sci. Technol. Trans. Electr. Eng. 42, 251–260 (2018).
Article Google Scholar
Alimardani, F., Cho, J.-H., Boostani, R. & Hwang, H.-J. Classification of bipolar disorder and schizophrenia using steady-state visual evoked potential based features. IEEE Access 6, 40379–40388 (2018).
Article Google Scholar
Phang, C.-R., Ting, C.-M., Samdin, S. B. & Ombao, H. in 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER). 401–406 (IEEE).
Phang, C.-R., Noman, F. M., Hussain, H., Ting, C.-M. & Ombao, H. A multi-domain connectome convolutional neural network for identifying schizophrenia from EEG connectivity patterns. IEEE J. Biomed. Health Inform. https://doi.org/10.1109/jbhi.2019.2941222 (2019).
Article PubMed Google Scholar

Download references

Acknowledgments

We are very grateful for the data provided by the Beijing Huilongguan Hospital Psychiatric Research Center.

Funding

This project was supported by the National Natural Science Foundation of China (61503272, 61305142, 61373101 and 61873178), the Natural Science Foundation of Shanxi (2015021090), the China Postdoctoral Science Foundation (2016M601287), the Shanxi Provincial Foundation for Returned Scholars, China (2016–037), and the Scientific Research Foundation for Returned Overseas Chinese Scholars.

Author information

Authors and Affiliations

College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
Jie Sun, Waqar Hussain, Bin Wang, Jiayue Xue & Jie Xiang
College of Software, Taiyuan University of Technology, Taiyuan, China
Rui Cao
Graduate School of Interdisciplinary Science and Engineering in Health Systems, Okayama University, Okayama, Japan
Mengni Zhou

Authors

Jie Sun
View author publications
You can also search for this author in PubMed Google Scholar
Rui Cao
View author publications
You can also search for this author in PubMed Google Scholar
Mengni Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Waqar Hussain
View author publications
You can also search for this author in PubMed Google Scholar
Bin Wang
View author publications
You can also search for this author in PubMed Google Scholar
Jiayue Xue
View author publications
You can also search for this author in PubMed Google Scholar
Jie Xiang
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

J.S. and R.C. completed the experimental process and wrote the manuscript. M.Z. and W.H. provided advice and guidance on the article. B.W. and J.X. revised the manuscript content. J.X. provided the direction and ideas of the research.

Corresponding author

Correspondence to Jie Xiang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Sun, J., Cao, R., Zhou, M. et al. A hybrid deep neural network for classification of schizophrenia using EEG Data. Sci Rep 11, 4706 (2021). https://doi.org/10.1038/s41598-021-83350-6

Download citation

Received: 07 August 2020
Accepted: 07 January 2021
Published: 25 February 2021
DOI: https://doi.org/10.1038/s41598-021-83350-6

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

A neural speech decoding framework leveraging deep learning and speech synthesis

EEG is better left alone

Rhythmicity of neuronal oscillations delineates their cortical and spectral architecture

Introduction

New method

Participants

Feature extraction

Fuzzy entropy

Fast Fourier transform

Making images

Construction of the hybrid DNN

CNN

LSTM

Training and testing

Statistical test

Baseline methods

Results

Data recording and preprocessing

Comparison of the feature

Classification results

Comparison with different deep learning models

Discussion

Comparison of the FuzzyEn feature and FFT feature

Comparison with other studies of schizophrenia

Conclusions

References

Acknowledgments

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Rights and permissions

About this article

Cite this article

Share this article

Comments

Search

Quick links