Salih Beyaz¹, Koray Açıcı², Emre Sümer²

¹Department of Orthopedics and Traumatology, Başkent University Adana Turgut Noyan Training and Research Centre, Adana, Turkey
²Department of Computer Engineering, Başkent University, Ankara, Turkey

Keywords: Artificial intelligence, deep learning, femoral neck fracture, genetic algorithm

Abstract

Objectives: This study aims to detect femoral neck fractures on frontal pelvic radiographs using deep learning techniques.

Patients and methods: This retrospective study was conducted between January 2013 and January 2018. A total of 234 frontal pelvic X-ray images collected from 65 patients (32 males, 33 females; mean age 74.9 years; range, 33 to 89 years) were augmented to 2,106 images to obtain a sufficiently large dataset. Of these, 1,341 images showed fractured femoral necks and 765 showed non-fractured ones. The proposed convolutional neural network (CNN) architecture contained five blocks, each consisting of a convolutional layer, a batch normalization layer, a rectified linear unit, and a maximum pooling layer. The last block was followed by a dropout layer with a probability of 0.5. The final three layers of the architecture were a fully connected layer with two classes, a softmax layer, and a classification layer that computes the cross-entropy loss. Training used the Adam optimizer and was stopped after 50 epochs. The learning rate was dropped by a factor of 0.5 every five epochs. To reduce overfitting, a weight regularization term was added to the loss function. The training process was repeated for image sizes of 50×50, 100×100, 200×200, and 400×400 pixels. A genetic algorithm (GA) approach was employed to optimize the hyperparameters of the CNN architecture and to minimize the test error of the model produced by the CNN architecture in the training phase.
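The learning rate schedule described above (a drop by a factor of 0.5 every five epochs over 50 epochs) can be illustrated with a minimal sketch. The initial learning rate of 1e-3, the function name `step_decay_lr`, and the 0-indexed epoch convention are assumptions made for illustration only; the abstract does not state them.

```python
def step_decay_lr(epoch, initial_lr=1e-3, drop_factor=0.5, drop_every=5):
    """Return the learning rate for a 0-indexed epoch under step decay.

    initial_lr is a hypothetical value; the abstract reports only the
    drop factor (0.5) and the drop period (five epochs).
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    # Integer division counts how many drops have happened so far.
    return initial_lr * drop_factor ** (epoch // drop_every)


NUM_EPOCHS = 50  # training length reported in the abstract

# One learning rate per epoch: constant within each block of five epochs.
schedule = [step_decay_lr(epoch) for epoch in range(NUM_EPOCHS)]
```

Under these assumptions the rate is halved nine times during training, so the last five epochs run at the initial rate multiplied by 0.5 to the ninth power.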

Results: Performance in terms of sensitivity, specificity, accuracy, F1 score, and Cohen’s kappa coefficient was evaluated using five-fold cross-validation. The best performance was obtained when cropped images were rescaled to 50×50 pixels. According to the kappa metric, the classifier was more reliable with 50×50-pixel input images than with the other image sizes. Sensitivity and specificity rates were computed to be 83% and 73%, respectively. With the inclusion of the GA, this rate increased by 1.6%. The detection rate of fractured bones was found to be 83%. A kappa coefficient of 55% was obtained, indicating acceptable agreement.
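For reference, the five evaluation metrics named above can all be derived from a binary confusion matrix, with "fractured" taken as the positive class. The sketch below uses the standard definitions; the function name `binary_metrics` and the toy counts are hypothetical and are not the study's data.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute standard binary classification metrics from confusion counts.

    tp/fn: fractured images classified as fractured / non-fractured.
    fp/tn: non-fractured images classified as fractured / non-fractured.
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)          # detection rate of the positive class
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "kappa": kappa,
    }


# Toy counts chosen for illustration only (not taken from the study).
toy = binary_metrics(tp=8, fn=2, fp=3, tn=7)
```

Because Cohen's kappa corrects observed agreement for the agreement expected by chance, it is less flattering than raw accuracy on an unbalanced dataset, which is why it is useful alongside sensitivity and specificity here.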

Conclusion: This experimental study utilized deep learning techniques to detect bone fractures on radiographs. Although the dataset was unbalanced, the results can be considered promising. Using a smaller image size decreased computational cost and yielded better results on the evaluation metrics.