A Novel Semantically-Time-Referrer based Approach of Web Usage Mining for Improved Sessionization in Pre-Processing of Web Log

Abstract

Web usage mining (WUM), also known as web log mining, is the application of data mining techniques to large volumes of web log data in order to extract useful and interesting user behaviour patterns and thereby improve web-based applications. This paper aims to improve data discovery by mining usage data from log files. The work is done in three phases. The first and second phases, data cleaning and user identification respectively, are completed using traditional methods. The third phase, session identification, is done using three different methods. The main focus of this paper is sessionization of the log file, which is a critical step in extracting usage patterns. The proposed Referrer-time and Semantically-time-referrer methods overcome the limitations of traditional methods. The main advantage of the pre-processing model presented in this paper over other methods is that it can process a text or Excel log file of any format. Experiments performed on three different log files indicate that the proposed Semantically-time-referrer based heuristic approach achieves better results than the traditional time-based and Referrer-time based methods. The proposed methods are not complex to use. The web log files are collected from different servers and contain the public information of visitors. In addition, this paper discusses different types of web log formats.
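
The abstract does not spell out the sessionization rules, so the following is only a minimal sketch of how time-based and referrer-time heuristics are commonly formulated, not the authors' implementation. The 30-minute timeout, the (timestamp, url, referrer) tuple layout, the function names, and the demo requests are all assumptions made for illustration. The semantic variant is not sketched because the abstract gives no detail on it.

```python
from datetime import datetime, timedelta

# Assumed threshold: 30 minutes is a widely used convention, not a value from the paper.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize_by_time(requests, timeout=SESSION_TIMEOUT):
    """Split one user's requests into sessions wherever the gap exceeds `timeout`.

    `requests` is a list of (timestamp, url, referrer) tuples sorted by timestamp.
    """
    sessions = []
    for req in requests:
        # Continue the current session only if the gap to the previous request is small enough.
        if sessions and req[0] - sessions[-1][-1][0] <= timeout:
            sessions[-1].append(req)
        else:
            sessions.append([req])
    return sessions


def sessionize_by_referrer_time(requests, timeout=SESSION_TIMEOUT):
    """Start a new session on a timeout, or when the referrer is not a page
    already visited in the current session (hypothetical rule for illustration)."""
    sessions = []
    seen = set()  # URLs visited in the current session
    for ts, url, ref in requests:
        within = bool(sessions) and ts - sessions[-1][-1][0] <= timeout
        if within and ref in seen:
            sessions[-1].append((ts, url, ref))
        else:
            sessions.append([(ts, url, ref)])
            seen = set()
        seen.add(url)
    return sessions


# Hypothetical requests from a single, already identified user.
t0 = datetime(2017, 1, 1, 10, 0)
demo = [
    (t0, "/home", "-"),
    (t0 + timedelta(minutes=5), "/products", "/home"),
    (t0 + timedelta(minutes=10), "/news", "-"),        # direct entry, no referrer
    (t0 + timedelta(minutes=50), "/news/a", "/news"),  # 40-minute gap
]

time_sessions = sessionize_by_time(demo)
referrer_sessions = sessionize_by_referrer_time(demo)
print([len(s) for s in time_sessions])
print([len(s) for s in referrer_sessions])
```

In this sketch the time-only rule merges the direct entry to /news into the first session, while the referrer-time rule separates it because its referrer was not seen earlier in that session. This is the kind of limitation of purely time-based splitting that referrer information is meant to address.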

Authors and Affiliations

Navjot Kaur, Himanshu Aggarwal

  • EP ID EP249761
  • DOI 10.14569/IJACSA.2017.080122

How To Cite

Navjot Kaur, Himanshu Aggarwal (2017). A Novel Semantically-Time-Referrer based Approach of Web Usage Mining for Improved Sessionization in Pre-Processing of Web Log. International Journal of Advanced Computer Science & Applications, 8(1), 158-168. https://europub.co.uk./articles/-A-249761