Computing the entropy of user navigation in the web

Article English OPEN
Levene, Mark ; Loizou, George (2003)

Navigation through the web, colloquially known as "surfing", is one of the main activities of users during web interaction. When users follow a navigation trail they often tend to get disoriented in terms of the goals of their original query and thus the discovery of typical user trails could be useful in providing navigation assistance. Herein, we give a theoretical underpinning of user navigation in terms of the entropy of an underlying Markov chain modelling the web topology. We present a novel method for online incremental computation of the entropy and a large deviation result regarding the length of a trail to realize the said entropy. We provide an error analysis for our estimation of the entropy in terms of the divergence between the empirical and actual probabilities. We then indicate applications of our algorithm in the area of web data mining. Finally, we present an extension of our technique to higher-order Markov chains by a suitable reduction of a higher-order Markov chain model to a first-order one.
  • References (32)
    32 references, page 1 of 4

    P. Billingsley. Statistical methods in Markov chains. The Annals of Mathematical Statistics, 32:12{40, 1961.

    J. Borges and M. Levene. Data mining of user navigation patterns. In B. Masand and M. Spiliopoulou, editors, Web Usage Analysis and User Pro¯ling, Lecture Notes in Arti¯cial Intelligence (LNAI 1836), pages 92{111. Springer-Verlag, Berlin, 2000.

    T. Bell and A. Mo®at. A note on the DMC data compression scheme. The Computer Journal, 32:16{20, 1989.

    V. Bush. As we may think. Atlantic Monthly, 176:101{108, 1945.

    T. Bell, I.H. Witten, and J.G. Cleary. Modeling for text compression. ACM Computing Surveys, 21:557{591, 1989.

    [CDK+99] S. Chakrabarti, B. Dom, S.R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, D. Gibson, and J.M. Kleinberg. Mining the web's link structure. IEEE Computer, 32:60{67, 1999.

    In Proceedings of International World Wide Web Conference, pages 161{172, Brisbane, 1998.

    G.V. Cormack and R.N.S. Horspool. Data compression using dynamic Markov modelling. The Computer Journal, 30:541{550, 1987.

    C. Chat¯eld. Statistical inference regarding Markov chain models. Applied Statistics, 22:7{20, 1973.

    IEEE Transactions on Knowledge and Data Engineering, 10:209{221, 1998.

  • Metrics
    views in OpenAIRE
    views in local repository
    downloads in local repository

    The information is available from the following content providers:

    From Number Of Views Number Of Downloads
    Birkbeck Institutional Research Online - IRUS-UK 0 33
Share - Bookmark