Document Type : Original article
Authors
1 Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India
2 Department of ECE, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, A.P., India
Abstract
Background: This research adds to the growing body of work demonstrating the vital role of image classification in the medical sector, where it is greatly enhancing the efficiency of illness diagnosis. A brain tumor is a collection of abnormal cells in the brain. Diagnostic precision is required when looking for a tumor in the brain because even the smallest error in human judgement can have disastrous results. To get around this problem, the medical community has implemented automated brain tumor segmentation systems. A variety of methods are employed to segment a brain tumor, but the results are inconsistent. In this paper, we discuss deep learning methods for the segmentation and classification of brain MRI images and present our findings.
Methods: In this paper, we present a robust deep learning method for image segmentation and classification. For image classification, we employ the enhanced faster R-CNN method. The ResNet50 model is used to differentiate between tumor and non-tumor images. The next step in this framework is to use Deep Residual UNET (RESUNET) for segmentation. RESUNET is a fully convolutional neural network (FCNN) that maximizes efficiency. The proposed method detects and classifies tumors accurately.
Results: The proposed technique identifies tumors in MRI scans with an accuracy of 94.23%. Using transfer learning with ResNet50 as the base model, together with discriminative learning rates, our model achieved better results than other recent models.
Conclusion: Within the scope of this study, we have integrated the residual networks with the U-Net to make the network stronger. This strategy improves the efficiency of classification and segmentation tools. To achieve a higher level of accuracy, the model may be trained further, or additional data may be applied in the training process.
Keywords: CNN, Deep Residual U-Net, Residual block, Resnet50, Segmentation
Introduction
Due to the impact of brain cellular metabolism, the rapid and uncontrolled growth of brain cells can disrupt connectivity with nearby cells, resulting in defects in structure and functionality. This abnormality in the brain is referred to as “cancer” or “tumor”. Imaging technology, which scans a portion of the brain, can be used to identify these disorders. MRI and CT are two scanning techniques utilized to determine abnormalities in the brain. The abnormalities of the brain can easily be observed in MRI scan images, so this study uses MRI rather than CT scanning to get a better picture of the brain. Figure 1 depicts a brain MR scan with tumor areas highlighted. Radiation and surgery are used to treat this tumor. Early-stage tumors are best treated with radiation therapy, while more advanced tumors are treated with surgery.
Unlike for other types of tumors in the body, a biopsy of a brain tumor cannot be performed surgically. To get the best possible treatment and avoid surgery, it is necessary to develop a proper MRI imaging diagnostic approach (1). Based on the Berkeley Wavelet Transformation (BWT), Bahadure et al developed a classification strategy using Support Vector Machines (SVMs) to identify brain tumors (2). The researchers employed a variety of techniques to increase the brightness of the brain images and thereby the likelihood of detecting tumors. An MRI brain tumor identification framework was developed by Perez et al (3). There are many ways to segment a tumor in MRI, but most of them use either generative or discriminative models (4-6).
Doctors have become increasingly concerned about brain tumors in the last few decades. Brain tumors called gliomas arise from the growth of glial cells. These tumors are categorized into four grades: grades 1 and 2 are considered low-grade, while grades 3 and 4 are regarded as cancerous (7). In modern CNN-based segmentation techniques, every pixel can be assigned a class label, which alleviates the problem of redundancy (8). In U-NET, the researchers followed a straightforward structure and made a few adjustments to the layers of the contracting and expanding pathways (9). They used a residual block instead of a regular block in each layer. Each block has two convolution units, one with a Batch Normalization (BN) layer and the other with a Parametric Rectified Linear Unit (PReLU). Although the segmentation results are improved, the pixel-wise computation time is still too high for these methods. Here, we use RESUNET, a modification of U-NET obtained by adding residual blocks to it. Another study proposed a technique to distinguish between brain tumors and non-tumors, and the technique demonstrated high accuracy (10).
Ge C et al (11) suggested using a deep form of semi-supervised learning to get the most out of the unlabeled data. 3D neural networks were proposed in another study by Ge C et al (12). Other studies proposed different deep learning methods for medical image analysis (13,14). Gajula S et al (15) improved a segmentation method for MRI brain images. Using the machine learning method known as logistic regression, automatic brain abnormality identification can be carried out with the work presented here.
Materials and Methods
RESNET 50
This network has a total of 50 layers: 48 convolution layers, one max pooling layer, and one average pooling layer. With this network, we can train ultra-deep neural networks. Image classification, object localization, and object detection can all benefit from this design. It can also be utilized for non-computer vision applications to add depth while reducing processing costs. The TCGA dataset was collected from Kaggle and consists of 2556 MRI brain images without tumor and 1373 with tumor, each with a size of 256*256. The model is implemented using the Keras framework. The split of the dataset into training, validation, and test sets is outlined in table 1. Table 2 reports the classification performance of the suggested system. The classification accuracy is compared with several available approaches in table 3.
Table 1. Image classification
Classification | Total images | Training set (70%) | Validation set (15%) | Test set (15%)
No tumor | 2556 | | |
Tumor | 1373 | | |
All images | 3929 | 2751 | 589 | 589
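The split sizes in table 1 can be reproduced with simple arithmetic. The following is a minimal sketch, assuming that the validation and test counts are rounded down and that the training set takes the remainder; the function name `split_sizes` is hypothetical and the rounding rule is an assumption, not a detail stated in the text.

```python
def split_sizes(total, val_frac=0.15, test_frac=0.15):
    """Return (train, val, test) counts for a dataset of `total` images.

    Assumption: validation and test counts are rounded down and the
    training set receives whatever is left.
    """
    n_val = int(total * val_frac)
    n_test = int(total * test_frac)
    n_train = total - n_val - n_test
    return n_train, n_val, n_test


# 2556 images without tumor plus 1373 images with tumor (table 1)
total_images = 2556 + 1373
print(split_sizes(total_images))  # (2751, 589, 589)
```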
Table 2. Classification report
Class | Precision | Recall | F1-score
0 | 0.96 | 0.95 | 0.96
1 | 0.91 | 0.93 | 0.92
Accuracy | | | 0.94
Macro avg | 0.93 | 0.94 | 0.94
Weighted avg | 0.94 | 0.94 | 0.94
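The F1-score in table 2 is the harmonic mean of precision and recall. As an illustrative check (the helper name `f1_score` is ours, not from the study), the class 1 row can be recomputed from its reported precision and recall. Because the table values are already rounded, the last digit of a recomputed score may differ slightly for other rows.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Class 1 row of table 2: precision 0.91, recall 0.93
print(round(f1_score(0.91, 0.93), 2))  # 0.92
```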
Table 3. Comparison of classification accuracy with existing methods
Method | Accuracy (%) | Dataset
Molecular-based glioma subtype classification (12) | 86.53 | TCGA
3D multi-scale convolutional network (13) | 89.47 | TCGA
Proposed | 94.23 | TCGA
RESUNET, on the other hand, is a fully convolutional network that is designed to perform at a much higher level. It combines deep residual architectures and UNET architectures to benefit from the advantages of both. Using residual blocks in a deep architecture can mitigate the vanishing gradient issue. The network is effective at segmentation tasks, and the residual blocks contribute to a more efficient flow of information between network layers. In the medical field, it has primarily been employed for brain tumor segmentation in Magnetic Resonance Imaging (MRI). The RESUNET architecture is composed of two networks: the encoding network and the decoding network. These two networks are linked together in the manner of UNET, as illustrated in figure 2. We are making use of the ReLU activation function in the UNET.
Encoder
It is made up of three pre-activated residual blocks. To keep feature maps smaller, we keep the stride at 2, which reduces the spatial dimensions by half.
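The halving can be checked with the usual convolution output-size formula. The sketch below is illustrative only: the function name `conv_output_size` and the padding of 1 are assumptions, while the 3*3 kernel and the 256*256 input are taken from table 4.

```python
def conv_output_size(size, kernel=3, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Stride 2 with an assumed padding of 1 halves the feature map.
print(conv_output_size(256, kernel=3, stride=2, padding=1))  # 128
print(conv_output_size(128, kernel=3, stride=2, padding=1))  # 64

# Stride 1 with padding 1 keeps the size unchanged.
print(conv_output_size(256, kernel=3, stride=1, padding=1))  # 256

# Without padding, a 3*3 kernel with stride 2 gives 127, not 128.
print(conv_output_size(256, kernel=3, stride=2, padding=0))  # 127
```

Under these assumptions, an exact halving with a 3*3 kernel requires some padding on the strided layers.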
Decoder
To generate a segmentation mask, the decoder collects the information from the bridge and selects the information that is needed. It is composed of three decoder blocks, after each of which the spatial dimensions are doubled by upsampling and the number of feature channels is reduced. The convolution layer of the final decoder block uses a sigmoid activation function for pixel-by-pixel classification.
Figure 3 depicts the process flow of the segmentation procedure. The brain MRI images and the corresponding tumor masks were generated using segmentation blocks, after which they were used to train the segmentation model. Tumor masks were then generated automatically using the trained segmentation model and the processed MRI images. Images were classified into different grades using a subset of the MRI images that had been processed with the generated tumor masks.
During the design of ResUnet, down sampling is performed by replacing the max pooling layer with a convolutional layer, an approach that has been found to be useful in a variety of contexts. Down sampling with convolution layers uses a stride of 2 and no padding, while the other convolution layers use a stride of 1 and zero padding to maintain the image’s original dimensions. The activation function is another important part of the network, as it provides the ability to represent non-linear relationships. To improve the amplitude of gradient backpropagation and the training process, the leaky rectified linear unit (LReLU) activation function is applied.
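The pixel-by-pixel classification step can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names and the 0.5 threshold are assumptions, and the input is a small hypothetical grid of pre-activation values rather than a real network output.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mask_from_logits(logits, threshold=0.5):
    """Turn a 2D grid of pre-activation values into a binary tumor mask.

    Assumption: a pixel is labeled tumor (1) when its sigmoid
    probability reaches the threshold.
    """
    return [[1 if sigmoid(z) >= threshold else 0 for z in row]
            for row in logits]


logits = [[-2.0, 0.5],
          [3.0, -0.1]]
print(mask_from_logits(logits))  # [[0, 1], [1, 0]]
```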
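The effect of the leaky activation on backpropagation can be seen in a small sketch. The negative slope `alpha=0.01` is an assumed value, since the text does not state the one used, and the function names are hypothetical.

```python
def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: identity for positive inputs, small slope otherwise."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x, alpha=0.01):
    """Derivative of leaky ReLU; never exactly zero, unlike plain ReLU."""
    return 1.0 if x > 0 else alpha


values = (-2.0, 0.0, 3.0)
print([leaky_relu(v) for v in values])       # [-0.02, 0.0, 3.0]
print([leaky_relu_grad(v) for v in values])  # [0.01, 0.01, 1.0]
```

Because the gradient for negative inputs is `alpha` rather than zero, some error signal still flows backward through inactive units.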
Deep neural network
Deep Neural Networks (DNNs) are particularly effective in extracting the relevant whole brain tumor and intra-tumor areas.
Convolutional neural network
Using a series of convolutional and fully connected layers, CNNs can classify complex features collected from small regions of the input data. The proposed model consists of several convolution layers, a max pooling layer, and fully connected layers. Residual units are employed to link the two paths together at the very end. Up sampling and convolution filters are used to build an upsampling path using three residual blocks.
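As an illustration of the max pooling operation mentioned above, the sketch below applies a 2*2 window with stride 2 to a small hypothetical feature map. The window size and stride are assumptions, as the text does not specify them.

```python
def max_pool_2x2(feature_map):
    """2*2 max pooling with stride 2 on a 2D list.

    Any trailing odd row or column is dropped.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 1, 7, 2],
               [3, 2, 1, 8]]
print(max_pool_2x2(feature_map))  # [[4, 5], [3, 8]]
```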
Residual network
He et al (13) proposed a residual neural network to solve the degradation problem. We employ residual networks to reduce the occurrence of the vanishing gradient problem. In a residual network, the output of every residual layer is added to its input, and the result is given as the input to the next layer. Let R(x) be the output of the residual block and F(x) the mapping learned by the stacked layers. The combination of the input and the output of the stacked layers is F(x)+x, so the resultant output from the residual block is R(x)=F(x)+x, as shown in figure 4.
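The relation R(x)=F(x)+x can be illustrated with a short sketch. The two branch functions below are hypothetical stand-ins for the stacked layers F. In the actual network F would consist of convolution, normalization, and activation layers.

```python
def residual_block(x, residual_fn):
    """Compute R(x) = F(x) + x element-wise for a list of values."""
    fx = residual_fn(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_branch(x):
    """Hypothetical F that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]


def scale_branch(x):
    """Hypothetical F that scales its input: F(x) = 0.5 * x."""
    return [0.5 * v for v in x]


x = [1.0, -2.0, 3.0]
print(residual_block(x, zero_branch))   # [1.0, -2.0, 3.0]
print(residual_block(x, scale_branch))  # [1.5, -3.0, 4.5]
```

When F(x) is zero the block reduces to the identity, which is why adding more residual blocks need not degrade the signal passed to deeper layers.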
Deep RESUNET
A semantic segmentation neural network that combines the U-Net and the residual neural network is presented here. This network has three parts: encoding, bridge, and decoding. All of these are built from residual units that consist of convolution blocks and an identity mapping. Each convolution block includes a Batch Normalization layer, a ReLU layer, and a convolution layer. The identity mapping connects the input and output of the unit. The different layers in RESUNET are shown in table 4.
Table 4. RESUNET network structure
Unit | Stage | Layer | Filter | Stride | Output size
Input | | | | | 256*256*3
Encoder | Stage 1 | conv 1 | 3*3/64 | 1 | 256*256*64
Encoder | Stage 1 | conv 2 | 3*3/64 | 1 | 256*256*64
Encoder | Stage 2 | conv 3 | 3*3/128 | 2 | 128*128*128
Encoder | Stage 2 | conv 4 | 3*3/128 | 1 | 128*128*128
Encoder | Stage 3 | conv 5 | 3*3/256 | 2 | 64*64*256
Encoder | Stage 3 | conv 6 | 3*3/256 | 1 | 64*64*256
Bridge | Stage 4 | conv 7 | 3*3/512 | 2 | 32*32*512
Bridge | Stage 4 | conv 8 | 3*3/512 | 1 | 32*32*512
Decoder | Stage 5 | conv 9 | 3*3/256 | 1 | 64*64*256
Decoder | Stage 5 | conv 10 | 3*3/256 | 1 | 64*64*256
Decoder | Stage 6 | conv 11 | 3*3/128 | 1 | 128*128*128
Decoder | Stage 6 | conv 12 | 3*3/128 | 1 | 128*128*128
Decoder | Stage 7 | conv 13 | 3*3/64 | 1 | 256*256*64
Decoder | Stage 7 | conv 14 | 3*3/64 | 1 | 256*256*64
Output | | conv 15 | 1*1 | 1 | 256*256*1
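The output sizes in table 4 follow a simple pattern: each strided stage divides the spatial size by its stride, and each decoder stage doubles it again. The sketch below traces that pattern. It assumes padding that makes a stride-2 convolution halve the size exactly and a 2x upsampling before each decoder stage; the function name `trace_resunet_shapes` is hypothetical.

```python
def trace_resunet_shapes(input_size=256, in_channels=3):
    """Return (name, spatial size, channels) per stage, as in table 4."""
    shapes = [("Input", input_size, in_channels)]
    size = input_size

    # (stage, stride of the first convolution, number of filters)
    down_stages = [("Stage 1", 1, 64), ("Stage 2", 2, 128),
                   ("Stage 3", 2, 256), ("Stage 4 (bridge)", 2, 512)]
    for name, stride, filters in down_stages:
        size //= stride  # assumed: padding keeps the halving exact
        shapes.append((name, size, filters))

    # Decoder: assumed 2x upsampling before the stride-1 convolutions
    for name, filters in [("Stage 5", 256), ("Stage 6", 128),
                          ("Stage 7", 64)]:
        size *= 2
        shapes.append((name, size, filters))

    shapes.append(("Output", size, 1))  # final 1*1 convolution
    return shapes


for name, size, channels in trace_resunet_shapes():
    print(f"{name}: {size}*{size}*{channels}")
```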
Results
In this study, data were collected from the Kaggle TCGA dataset and used for the classification and detection of brain tumors. Tumors in different MRI images from the collected data are shown in figure 5. Loss and accuracy curves for classification and segmentation are shown in figures 6 and 7. Figure 8 shows the segmentation of the tumor from MRI images.
Performance metrics
The following equations give the performance metrics used to evaluate the MRI brain tumor analysis. The accuracy and sensitivity are calculated as follows:
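In their standard confusion-matrix form, which we assume is the one intended here, the two metrics are written in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). These symbols are introduced for illustration and do not appear elsewhere in the text.

```latex
\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},
\qquad
\text{Sensitivity} = \frac{TP}{TP + FN}
\]
```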
The accuracy of the suggested model is compared with that of several other models in table 5.
Table 5. Accuracy results for different models
Task | Model | Accuracy (%) | Data
Tumor segmentation | Proposed | 91.64 | TCGA
Tumor detection | Tumor detection FLAIR MRI random forest (17) | 92 | TCGA
Tumor detection | Proposed | 94.23 | TCGA
In figure 8, the first image depicts an MRI brain image that must be processed, and the second image depicts the original mask image. By combining these two images, we can obtain the predicted mask image, which is depicted in the third image. The fourth image shows the brain MRI with the original mask, and the final image shows the brain MRI with the predicted mask.
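A predicted mask can be compared with the original mask pixel by pixel. The sketch below counts true and false positives and negatives over two small hypothetical binary masks and derives accuracy and sensitivity from them using the standard definitions. It is not the evaluation code of the study, and the masks are made-up examples.

```python
def mask_metrics(true_mask, pred_mask):
    """Pixel-wise (accuracy, sensitivity) for two binary masks."""
    tp = tn = fp = fn = 0
    for true_row, pred_row in zip(true_mask, pred_mask):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 0:
                tn += 1
            elif t == 0 and p == 1:
                fp += 1
            else:
                fn += 1
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# Hypothetical 2*4 masks: 1 marks a tumor pixel, 0 marks background.
original = [[0, 1, 1, 0],
            [0, 1, 1, 0]]
predicted = [[0, 1, 0, 0],
             [0, 1, 0, 1]]
print(mask_metrics(original, predicted))  # (0.625, 0.5)
```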
Conclusion
Tumor segmentation is a critical step in the therapy of any tumor. Deep Neural Networks are powerful tools for segmentation, but during the training process they suffer from the vanishing gradient problem. We use residual networks with identity connections to overcome this problem and, as a result, obtain better accuracy and computation time than before. The proposed method works well for identifying low-grade gliomas. To strengthen the network, we have combined the residual networks with the U-Net in this study, which improves the performance of the classification and segmentation systems. This represents a significant step towards rapid convergence, which is beneficial when high-performance computing resources are limited. In terms of accuracy and tumor identification, the method performs well.
References