JAMALZADEH, MOHAMMADAMIN (2011) Analysis of Clickstream Data. Doctoral thesis, Durham University.
| PDF - Accepted Version 16Mb |
Abstract
This thesis is concerned with providing further statistical development in the area of web usage analysis to explore web browsing behaviour patterns. We received two data sources: web log files and operational data files for the websites, which contained information on online purchases. There are many research question regarding web browsing behaviour. Specifically, we focused on the depth-of-visit metric and implemented an exploratory analysis of this feature using clickstream data. Due to the large volume of data available in this context, we chose to present effect size measures along with all statistical analysis of data. We introduced two new robust measures of effect size for two-sample comparison studies for Non-normal situations, specifically where the difference of two populations is due to the shape parameter. The proposed effect sizes perform adequately for non-normal data, as well as when two distributions differ from shape parameters. We will focus on conversion analysis, to investigate the causal relationship between the general clickstream information and online purchasing using a logistic regression approach. The aim is to find a classifier by assigning the probability of the event of online shopping in an e-commerce website. We also develop the application of a mixture of hidden Markov models (MixHMM) to model web browsing behaviour using sequences of web pages viewed by users of an e-commerce website. The mixture of hidden Markov model will be performed in the Bayesian context using Gibbs sampling. We address the slow mixing problem of using Gibbs sampling in high dimensional models, and use the over-relaxed Gibbs sampling, as well as forward-backward EM algorithm to obtain an adequate sample of the posterior distributions of the parameters. The MixHMM provides an advantage of clustering users based on their browsing behaviour, and also gives an automatic classification of web pages based on the probability of observing web page by visitors in the website.
Item Type: | Thesis (Doctoral) |
---|---|
Award: | Doctor of Philosophy |
Keywords: | PhD, Research, clickstream, web usage analysis, Mixture of Hidden Mrkov Models |
Faculty and Department: | Faculty of Science > Mathematical Sciences, Department of |
Thesis Date: | 2011 |
Copyright: | Copyright of this thesis is held by the author |
Deposited On: | 25 Jan 2012 15:42 |