Effective Performance of Information Retrieval by using Domain Based Crawler

Journal Title: International Journal of Advanced Computer Science & Applications - Year 2013, Vol 4, Issue 7

Abstract

The World Wide Web continuously introduces new capabilities and attracts many people [1]. It consists of more than 60 billion pages online. Due to this explosion in size, information retrieval systems, or search engines, are being upgraded day by day so that information can be accessed effectively and efficiently. In this paper, we address a Domain Based Information Retrieval (DBIR) system. In this system, we crawl the web and add to the database all links that are related to a specific domain; links that are not related to that domain are simply ignored. This saves Storage Space (SS) and Searching Time (ST) and, as a result, improves the performance of the system. DBIR is an extension of the Effective Performance of Web Crawler (EPOW) system [2], in which there are two crawler modules. The first is the Basic Crawler, which consists of multiple downloaders to achieve a parallelization policy. The second is the Master Crawler, which filters the URLs sent by the Basic Crawler based on the domain and sends them back to the Basic Crawler to extract the related links. All these related links are collectively stored in the database under a unique domain name.
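The two-module arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the in-memory `TOY_WEB`, the function names `download`, `master_filter`, and `crawl`, and the keyword-match relevance rule are all assumptions made for illustration. A thread pool stands in for the Basic Crawler's multiple downloaders, and a simple term test stands in for the Master Crawler's domain filter.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://a.example/sports": (
        "football cricket scores",
        ["http://a.example/cricket", "http://b.example/recipes"],
    ),
    "http://a.example/cricket": ("cricket match report", ["http://a.example/sports"]),
    "http://b.example/recipes": ("pasta and soup recipes", ["http://a.example/sports"]),
}


def download(url):
    """Stand-in for one downloader; a real one would issue an HTTP request."""
    text, links = TOY_WEB.get(url, ("", []))
    return url, text, links


def master_filter(pages, domain_terms):
    """Assumed Master Crawler rule: keep a page if its text contains a domain term."""
    return [
        (url, links)
        for url, text, links in pages
        if any(word in domain_terms for word in text.lower().split())
    ]


def crawl(seeds, domain_name, domain_terms, workers=4):
    """Basic Crawler loop: download in parallel, filter, follow only related pages."""
    database = {domain_name: set()}  # related links stored under the domain name
    seen = set(seeds)
    frontier = list(seeds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            pages = list(pool.map(download, frontier))  # parallel downloaders
            frontier = []
            for url, links in master_filter(pages, domain_terms):
                database[domain_name].add(url)
                for link in links:  # extract links only from related pages
                    if link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return database


result = crawl(["http://a.example/sports"], "sports", {"football", "cricket"})
```

In this sketch the recipes page is downloaded once but never stored or expanded, which is where the claimed savings in storage space and searching time would come from under these assumptions.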

Authors and Affiliations

Sk. Nabi, Dr. Premchand

EP ID EP115027
DOI 10.14569/IJACSA.2013.040713