A New Hidden Web Crawling Approach

Abstract

Traditional search engines deal with the Surface Web, the set of Web pages directly accessible through hyperlinks, and ignore a large part of the Web called the hidden Web: a great amount of valuable information in online databases that is “hidden” behind query forms. To access this information, a crawler has to fill in the forms with valid data. For this reason, we propose a new approach that uses the SQLI technique to find the most promising keywords of a specific domain for automatic form submission. The effectiveness of the proposed framework has been evaluated through experiments on real web sites, and encouraging preliminary results were obtained.

Authors and Affiliations

L. Saoudi, A. Boukerram, S. Mhamedi

  • EP ID EP106530
  • DOI 10.14569/IJACSA.2015.061039

How To Cite

L. Saoudi, A. Boukerram, S. Mhamedi (2015). A New Hidden Web Crawling Approach. International Journal of Advanced Computer Science & Applications, 6(10), 293-297. https://europub.co.uk./articles/-A-106530