Tree-kNN: A Tree-Based Algorithm for Protein Sequence Classification
Journal Title: International Journal on Computer Science and Engineering - Year 2011, Vol 3, Issue 2
Abstract
The phylogenomic classification of protein sequences attempts to categorize a given protein within the evolutionary context of the entire family. It involves four main steps: selection of homologous sequences, multiple sequence alignment, phylogenetic tree construction, and tree-based classification. This approach assumes that the tree used as the basis for protein classification is correct. Because sequence alignment is the first step of tree construction, the accuracy of the alignment should affect the topology of the phylogenetic tree. This work proposes a kNN tree-based algorithm for protein classification, named Tree-kNN, which uses a phylogenetic tree estimated from pair-wise and multiple alignment approaches. We compare the classification performance of Tree-kNN with that of an existing method, TreeNN. Results show that Tree-kNN outperforms TreeNN. On four datasets, both algorithms classify better with pair-wise alignment than with multiple alignment.
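The abstract does not give the details of Tree-kNN, so the following is only a minimal sketch of what a k-nearest-neighbour vote on a phylogenetic tree could look like. It assumes that closeness is measured by patristic distance (the sum of branch lengths on the path between two leaves) and that the query takes the majority label of its k closest labelled leaves. The tree, the branch lengths, the family names, and the functions `patristic_distances` and `tree_knn_classify` are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def patristic_distances(tree, source):
    """Sum of branch lengths from `source` to every node of an undirected tree.

    `tree` maps each node to a dict {neighbour: branch_length}.
    A tree has exactly one path between two nodes, so a plain traversal suffices.
    """
    dist = {source: 0.0}
    stack = [source]
    while stack:
        node = stack.pop()
        for neighbour, length in tree[node].items():
            if neighbour not in dist:
                dist[neighbour] = dist[node] + length
                stack.append(neighbour)
    return dist

def tree_knn_classify(tree, labels, query, k=3):
    """Assumed rule: majority vote among the k labelled leaves nearest to `query`."""
    dist = patristic_distances(tree, query)
    nearest = sorted((dist[leaf], leaf) for leaf in labels if leaf != query)[:k]
    votes = Counter(labels[leaf] for _, leaf in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy tree: (node, node, branch length). "Q" is the unlabelled query.
edges = [
    ("root", "n1", 0.10), ("root", "n2", 0.40),
    ("n1", "A1", 0.10), ("n1", "n3", 0.05),
    ("n3", "A2", 0.10), ("n3", "Q", 0.10),
    ("n2", "B1", 0.10), ("n2", "B2", 0.10),
]
tree = {}
for u, v, length in edges:
    tree.setdefault(u, {})[v] = length
    tree.setdefault(v, {})[u] = length

# Hypothetical family labels for the known leaves.
labels = {"A1": "family_A", "A2": "family_A", "B1": "family_B", "B2": "family_B"}

predicted = tree_knn_classify(tree, labels, "Q", k=3)
print(predicted)
```

With `k=1` the same function reduces to a single-nearest-neighbour rule on the tree, which is one plausible reading of how a TreeNN-style baseline differs; the paper itself should be consulted for the actual definitions.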
Authors and Affiliations
Khaddouja Boujenfa, Nadia Essoussi, Mohamed Limam