ORIGINAL ARTICLE
Year : 2010  Volume : 21  Issue : 6  Page : 1073-1080

A comparative study of artificial neural network and multivariate regression analysis to analyze optimum renal stone fragmentation by extracorporeal shock wave lithotripsy

Neeraj K Goyal^{1}, Abhay Kumar^{1}, Sameer Trivedi^{1}, Udai S Dwivedi^{1}, TN Singh^{2}, Pratap B Singh^{1}
^{1} Department of Urology, Institute of Medical Sciences, Banaras Hindu University, Varanasi, India
^{2} Department of Earth Sciences, Indian Institute of Technology, Mumbai, India

Correspondence Address:

To compare the accuracy of artificial neural network (ANN) analysis and multivariate regression analysis (MVRA) for renal stone fragmentation by extracorporeal shock wave lithotripsy (ESWL). A total of 276 patients with renal calculus were treated by ESWL from December 2001 to December 2006. Of them, the data of 196 patients were used for training the ANN. The predictability of the trained ANN was tested on 80 subsequent patients. The input data included age of patient, stone size, stone burden, number of sittings and urinary pH. The output values (predicted values) were number of shocks and shock power. For the same 80 patients, the input was also analyzed and the output calculated by MVRA. The output values (predicted values) from both methods were compared and the results were drawn. The predicted and observed values of shock power and number of shocks were compared using a 1:1 slope line. The results were calculated as coefficient of correlation (COC) (r²). For prediction of power, the MVRA COC was 0.0195 and the ANN COC was 0.8343. For prediction of number of shocks, the MVRA COC was 0.5726 and the ANN COC was 0.9329. In conclusion, ANN gives a better COC than MVRA and hence could be a better tool to analyze the optimum renal stone fragmentation by ESWL.
Introduction

The artificial neural network (ANN) is a branch of artificial intelligence, alongside case-based reasoning, expert systems and genetic algorithms. Classical statistics, fuzzy logic and chaos theory are also considered to be related areas. The ANN is an information processing system that simulates the structure and functions of the human intellect. It attempts to reproduce the way in which the human brain works in processes such as studying, remembering, reasoning and inducing, using a complex network built by comprehensively connecting various processing units. It is a highly interconnected structure that consists of many simple processing elements (called neurons) capable of performing massively parallel computation for data processing and knowledge representation. The paradigms in this area are based on direct modeling of the human neuronal system. [1]

A neural network can be considered as a smart hub that is able to forecast an output pattern once it has recognized and learned a given input pattern. The neural network is first trained by processing a large number of input patterns together with the output that resulted from each input pattern. After proper training, the neural network is able to recognize resemblance when presented with a new input pattern (even in imprecise data) and produces a predicted output pattern. Neural networks may be used as a direct replacement for autocorrelation, multivariable regression, linear regression, trigonometric and other statistical analysis techniques. When data are analyzed using a neural network, it is possible to detect important predictive patterns that were not previously perceptible to a non-expert system. Thus, the neural network can act like an intelligent system. A particular network can be defined using three fundamental components: transfer function, network architecture and learning law. [2] One has to define these components depending upon the problem to be solved.
Network Training

A network first needs to be trained before it can interpret new information. Numerous algorithms are available for training neural networks, but the back propagation algorithm, which is capable of solving complex prediction problems, is the most adaptable and robust technique and provides the most efficient learning procedure for multilayer neural networks. A back propagation network consists of at least three layers: input layer, hidden layer and output layer. Each layer consists of a number of elementary processing units called neurons, and each neuron is connected to the next layer through weights, i.e. neurons in the input layer send their output as an input to neurons in the hidden layer, and the connection between the hidden and output layers is similar. The number of hidden layers and the number of neurons in each hidden layer change according to the problem to be solved. The number of input and output neurons is the same as the number of input and output variables.

To make a distinction between the different processing units, values called biases are introduced in the transfer functions. A bias can be regarded as an additional weight of the neuron. Except for the input layer, all the neurons in the back propagation network are connected with a bias neuron and a transfer function. The choice of transfer function depends on the purpose of the neural network. The output layer produces the computed output vectors corresponding to the solution.

During training of the network, data are processed from the input layer through the hidden layer until they reach the output layer (forward pass). In this layer, the output is compared to the measured values (the "true" output). The difference or error between the two is processed back through the network (backward pass), updating the individual weights of the connections and the biases of the individual neurons. The input and output data are mostly represented as vectors called training pairs.
The process described above is repeated for all the training pairs in the data set until the network error converges to a threshold minimum defined by a corresponding cost function, usually the root mean squared error (RMS) or summed squared error (SSE).

As shown in [Figure 1], the jth neuron is connected with a number of inputs {Figure 1}

$x_i = (x_1, x_2, x_3, \ldots, x_n)$

The net input values in the hidden layer will be [INLINE:1] where $x_i$ = input units, $w_{ij}$ = weight on the connection of the ith input and jth neuron, $\theta_j$ = bias neuron (optional), and $n$ = number of input units. So, the net output from the hidden layer is calculated using a logarithmic sigmoid function

$O_j = f(\mathrm{net}_j) = \dfrac{1}{1 + e^{-(\mathrm{net}_j + \theta_j)}}$

The total input to the kth unit is [INLINE:2] where $\theta_k$ = bias neuron and $w_{jk}$ = weight between the jth neuron and kth output. So, the total output from the kth unit will be $O_k = f(\mathrm{net}_k)$.

In the learning process, the network is presented with a pair of patterns, an input pattern and a corresponding desired output pattern. The network computes its own output pattern using its (mostly incorrect) weights and thresholds. The actual output is then compared with the desired output. Hence, the error at any output in layer k is

$e_k = t_k - O_k$

where $t_k$ = desired output and $O_k$ = actual output. The total error function is given by [INLINE:3]

Training of the network is basically a process of arriving at an optimum weight space of the network. The descent down the error surface is made using the following rule: [INLINE:4] where $\eta$ is the learning rate parameter and $E$ is the error function. The update of weights for the (n + 1)th pattern is given as: [INLINE:5] Similar logic applies to the connections between the hidden and output layers. [3]

Each pass through all the training patterns is called a cycle or an epoch. The process is then repeated for as many epochs as required until the error falls within the user-specified goal. This error is the measure of how well the network has learned.
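The forward pass, backward pass and epoch loop described in this section can be illustrated with a minimal sketch. It is not the authors' MATLAB code: it assumes a single hidden layer, the logistic sigmoid in both layers, the bias folded into the net input, a summed squared error of the form E = 1/2 Σ (t_k − O_k)², and plain gradient descent with a fixed learning rate. The function names, the learning rate and the toy data are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass: input layer -> hidden layer -> output layer."""
    hidden = [
        sigmoid(sum(x[i] * w_ih[i][j] for i in range(len(x))) + b_h[j])
        for j in range(len(b_h))
    ]
    output = [
        sigmoid(sum(hidden[j] * w_ho[j][k] for j in range(len(hidden))) + b_o[k])
        for k in range(len(b_o))
    ]
    return hidden, output


def train(pairs, n_hidden=4, eta=0.5, epochs=2000, seed=0):
    """Train on (input, target) pairs; return the parameters and SSE per epoch."""
    rng = random.Random(seed)
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    w_ih = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)]
    w_ho = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_hidden)]
    b_h = [0.0] * n_hidden
    b_o = [0.0] * n_out
    history = []
    for _ in range(epochs):  # one epoch = one pass over all training pairs
        sse = 0.0
        for x, t in pairs:
            hidden, out = forward(x, w_ih, b_h, w_ho, b_o)
            # Output-layer error term: (t_k - O_k) times the sigmoid derivative.
            d_out = [(t[k] - out[k]) * out[k] * (1.0 - out[k]) for k in range(n_out)]
            # Hidden-layer error term: output errors propagated back through w_ho.
            d_hid = [
                hidden[j] * (1.0 - hidden[j])
                * sum(d_out[k] * w_ho[j][k] for k in range(n_out))
                for j in range(n_hidden)
            ]
            # Gradient-descent updates of weights and biases.
            for j in range(n_hidden):
                for k in range(n_out):
                    w_ho[j][k] += eta * d_out[k] * hidden[j]
            for k in range(n_out):
                b_o[k] += eta * d_out[k]
            for i in range(n_in):
                for j in range(n_hidden):
                    w_ih[i][j] += eta * d_hid[j] * x[i]
            for j in range(n_hidden):
                b_h[j] += eta * d_hid[j]
            sse += 0.5 * sum((t[k] - out[k]) ** 2 for k in range(n_out))
        history.append(sse)
    return (w_ih, b_h, w_ho, b_o), history


if __name__ == "__main__":
    # Hypothetical toy data (logical OR), not patient data.
    toy = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
    _, errors = train(toy)
    print("SSE first epoch:", errors[0], "last epoch:", errors[-1])
```

In practice, the inputs would be scaled to a comparable range before training, and the loop would stop once the error reaches the user-specified goal rather than after a fixed number of epochs.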
Multivariate Regression Analysis

The purpose of multiple regression is to learn more about the relationship between several independent or predictor variables and a dependent or criterion variable. The goal of regression analysis is to determine the values of the parameters of a function that cause the function to best fit a given set of data observations. In linear regression, the function is a linear (straight-line) equation. When there is more than one independent variable, multivariate regression analysis is used to obtain the best-fit equation. Multiple regression fits the data set by performing a least squares fit. It constructs and solves the simultaneous equations by forming the regression matrix and solving for the coefficients using the backslash operator.

Renal calculus is a common problem in daily urological practice, and despite the advances in diagnostic and therapeutic modalities, there is considerable morbidity in managing these cases. Since its introduction in 1980, extracorporeal shock wave lithotripsy (ESWL) has been the only noninvasive treatment of choice for most renal calculi despite its pitfalls. [5],[6],[7] The most distressing condition for a urologist is when the stone does not get fragmented after four sittings of ESWL and an alternative treatment is required. The goal of treatment is what is known as optimum fragmentation (defined as no remaining fragment of >4 mm). Approaches for predicting optimum fragmentation by ESWL [8],[9] before the start of treatment include the traditional statistical models. [10] We have already established the application of ANN in optimum renal stone fragmentation by ESWL. [8] Most centers follow the usual guidelines on indications and contraindications for ESWL, mainly including stone size and the presence or absence of distal obstruction. Other factors influencing stone fragmentation are invariably not considered or may be difficult to assess.
So, there is a need for a modality that identifies the patients who will be best managed by ESWL. Various studies have addressed prediction in stone disease, e.g. prediction of stone fragmentation from the computerized tomography attenuation value of the renal stone, [11] prediction of spontaneous ureteral calculus passage by an ANN [12] and a neural computational model of stone recurrence after ESWL. [13] In this study, to further validate the ANN application, we applied this modality to a larger population of patients undergoing ESWL and compared the results with a statistical model, i.e. multivariate regression analysis (MVRA).

Materials and Methods

Our institute is a tertiary health care center. A total of 328 patients with renal calculus were treated by ESWL from December 2001 to December 2006 in our institution. We followed the usual guidelines for indications and contraindications for ESWL. The patients were treated using an electrohydraulic lithotripter. As per the protocol, we delivered ≤13,000 shocks/stone over not more than four ESWL sittings (with a power range of 14-18 kV and a shock frequency of 60-90/minute). The optimum stone fragmentation was defined as no fragments (after fragmentation) of >4 mm. We included 276 patients who had successful fragmentation and in whom the criteria for ESWL were strictly followed, i.e. stone size <2 cm in a nonobstructed system. All calyceal stones which were not cleared with ESWL were excluded from the study to keep the parameters uniform. Out of 276 patients, the data of 196 patients were used for training the ANN. The network was trained using the MATLAB software system; the working code for the ANN was constructed so that it was compatible with the analysis and processing of the input data.

Network architecture

A feed forward network was adopted here, as this architecture was considered to be suitable for problems based on pattern recognition. Pattern matching is basically an input/output mapping problem.
The architecture of the network is shown in [Table 1].{Table 1}

Testing and validation of ANN model

To test and validate the ANN model, a new data set was chosen. These data were not used while training the network, so they validate the use of the ANN in a more versatile way. The results are presented in this section to demonstrate the performance of the network. The coefficient of correlation (COC) between the predicted and observed values is taken as the performance measure. The predictability of the trained ANN was tested on 80 subsequent patients. The prediction was based on the input data sets. The input variables remained the same as those used for the training, which included age of patient, stone size, stone burden, number of sittings and urinary pH. The output values (predicted values) were number of shocks and shock power. For the same 80 patients, the input was analyzed and the output was calculated by MVRA. The output values (predicted values) from both methods were compared and results were drawn. Training of the network was done using one hidden layer with four hidden neurons. As Bayesian regularization [4] was used, overfitting was not a concern. Hence, the network was trained with 700 training epochs. The performance of the ANN during training is shown in [Figure 2].{Figure 2}

Multivariate regression analysis

MVRA equation for prediction of power = 7.4993 - 0.0403*(AGE) - 0.7371*(SIZE) + 1.4020*(BURDEN) - 0.5457*(SITTINGS) + 1.7037*(pH)

MVRA equation for prediction of number of shocks = 20026 - 87*(AGE) - 258*(SIZE) + 1808*(BURDEN) + 767*(SITTINGS) + 4559*(pH)

Results

The values predicted by the ANN were highly correlated with the observed values, as shown in [Figure 3] and [Figure 4]. [Figure 5] and [Figure 6] show the poor correlation coefficient for predicted power and shocks by MVRA.
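A small sketch may help make the MVRA side of this comparison concrete: fitting a linear equation by least squares and scoring predicted against observed values with r². It is an illustration only. It solves the normal equations by Gaussian elimination instead of using the MATLAB backslash operator, it assumes the COC is the squared Pearson correlation between observed and predicted values, and the function names and toy numbers are hypothetical rather than taken from the study data.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Return [intercept, b1, b2, ...] minimizing the summed squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, row):
    """Evaluate the fitted linear equation for one set of predictor values."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


if __name__ == "__main__":
    # Hypothetical predictors and responses, not patient data.
    rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
    y = [3.2, 6.9, 6.1, 10.8, 9.0]
    coef = fit_least_squares(rows, y)
    fitted = [predict(coef, r) for r in rows]
    print("coefficients:", coef)
    print("r squared:", r_squared(y, fitted))
```

The same `r_squared` function can score any model's predictions, so ANN and MVRA outputs for the same test patients can be compared on an equal footing.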
[Figure 7] and [Figure 8] show the comparison of measured and predicted power and number of shocks by ANN and MVRA, respectively.{Figure 3}{Figure 4}{Figure 5}{Figure 6}{Figure 7}{Figure 8}

Discussion

In our study, we compared the ANN analysis and the MVRA results, and the former showed a high degree of correlation. The ANN is therefore a better option for prediction of power and number of shocks using five important and easily determined parameters, which can save time and reduce complications in patients undergoing ESWL. Being an unbiased system, it is not prone to the overfitting and underfitting of data sets that may occur with MVRA. [Figure 7] and [Figure 8] give a very clear comparison between measured and predicted shock power and number of shocks using ANN and MVRA. Our ANN model adds to the efforts to predict a successful outcome of ESWL. [9],[10],[11],[12],[13]

Conclusion

ANN gives a better COC because it overcomes the shortcomings of the statistical analysis by MVRA. Hence, ANN is probably a better tool to predict the optimum renal stone fragmentation by ESWL. Further use of this model on a larger scale will clarify its role in ESWL.[Table 2]{Table 2}

References


