Applied Mathematics & Information Sciences
An International Journal

Volumes > Volume 10 > No. 4

 
   

New Hierarchical Clustering Algorithm for Protein Sequences Based on Hellinger Distance

PP: 1541-1549
doi:10.18576/amis/100432
Author(s)
Gamil Abdel-Azim
Abstract
Clustering protein sequences by their sequence patterns has attracted considerable research effort over the last decade. The central question in most clustering systems is how to represent and interpret protein sequences, since this largely determines the performance of classifiers. In this paper, we propose a new methodology that defines a new descriptor to represent and interpret each sequence using its probability density function (PDF). The Hellinger distance is used to measure the similarity between sequences. A hierarchical algorithm is then applied to cluster the protein sequences using the Hellinger distance. Two protein data sets are used in the experiments: the first is a mixture of Influenza and Ebola virus sequences, and the second is a set of Influenza sequences. We compare two hierarchical clustering algorithms. The first uses a similarity measure based on sequence alignment (HCAWSA). The second, the proposed approach, uses a similarity measure that requires no sequence alignment (HCAWOSA). The experimental results show that the proposed methodology is feasible and achieves good accuracy.
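
As a rough illustration of alignment-free clustering with the Hellinger distance, the following Python sketch may help. The abstract does not specify the descriptor or the linkage rule, so both are assumptions here: each sequence is represented by its amino-acid composition (a discrete distribution over the 20 standard residues), and clusters are merged by average linkage. The function names `composition`, `hellinger`, and `average_linkage_clusters` are hypothetical and do not come from the paper. For two discrete distributions P and Q, the distance used is H(P, Q) = sqrt( sum_i (sqrt(p_i) - sqrt(q_i))^2 / 2 ), which lies between 0 and 1.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Assumed descriptor: relative frequency of each standard amino acid."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


def average_linkage_clusters(seqs, n_clusters):
    """Agglomerative clustering (assumed: average linkage) on Hellinger distances.

    Returns a list of clusters, each a sorted list of indices into `seqs`.
    """
    profiles = [composition(s) for s in seqs]
    n = len(seqs)
    # Pairwise distance matrix; no sequence alignment is involved.
    d = [[hellinger(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average distance over all cross-cluster pairs.
                link = sum(d[i][j] for i in clusters[a] for j in clusters[b])
                link /= len(clusters[a]) * len(clusters[b])
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Toy strings for illustration only; these are not real viral sequences.
    toy = ["AAAAAG", "AAAAGG", "WWWWWY", "WWWWYY"]
    print(average_linkage_clusters(toy, 2))
```

Because the distance is computed from fixed-length distributions, each pairwise comparison costs time proportional to the descriptor length rather than to the product of the sequence lengths, which is the usual motivation for alignment-free measures. The naive merge loop above is meant for readability, not for large data sets.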

Copyright naturalspublishing.com. All Rights Reserved