Enhanced Approach on Web Page Classification Using Machine Learning Technique
Journal Title: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) - Year 2012, Vol 1, Issue 7
Abstract
The data set contains WWW pages collected from the computer science departments of various universities in January 1997 by the World Wide Knowledge Base project of the CMU text learning group. The 8,282 pages were manually classified into 7 classes: 1) student, 2) faculty, 3) staff, 4) department, 5) course, 6) project and 7) other. Each class contains pages from four universities (Cornell, Texas, Washington and Wisconsin); a further 4,120 miscellaneous pages come from other universities. The files are organized into a directory structure with one directory for each class. Each of these seven directories contains 5 subdirectories, one for each of the 4 universities and one for the miscellaneous pages. These subdirectories in turn contain the Web pages. The proposed work first preprocesses the data to clean the data set and transform it into patterns suitable for classification. Feature extraction then selects a minimal set of representative features, or terms, from each Web page instead of using the entire page. Finally, the FP-Growth algorithm is used to classify each page into one of the seven classes. The proposed approach is compared with an existing system based on the Apriori algorithm.
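The directory layout and the term-selection step described above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names `index_pages` and `top_terms`, the regular expressions, and the choice of raw term frequency as the selection criterion are all assumptions made for the example, and the FP-Growth classification step is not shown.

```python
import re
from collections import Counter
from pathlib import Path


def index_pages(root):
    """Yield (class_label, university, path) for each page.

    Assumes the layout root/<class>/<university-or-misc>/<page file>,
    as described in the abstract.
    """
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        for univ_dir in sorted(class_dir.iterdir()):
            if not univ_dir.is_dir():
                continue
            for page in sorted(univ_dir.iterdir()):
                if page.is_file():
                    yield class_dir.name, univ_dir.name, page


# Hypothetical, deliberately simple cleaning rules (assumptions).
TAG_RE = re.compile(r"<[^>]+>")      # crude HTML tag stripper
WORD_RE = re.compile(r"[a-z]{3,}")   # lowercase words of 3+ letters


def top_terms(html, k=10):
    """Strip tags, lowercase the text, and keep the k most frequent terms."""
    text = TAG_RE.sub(" ", html).lower()
    counts = Counter(WORD_RE.findall(text))
    return [term for term, _ in counts.most_common(k)]
```

Each page then becomes a small set of terms plus its class label, which is the kind of transaction a frequent-pattern miner such as FP-Growth or Apriori takes as input.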
Authors and Affiliations
S. Gowri Shanthi, Dr. Antony Selvadoss Thanamani