Link Structure of the Web in Hong Kong and Mainland China

The graph structure of the Web has long been a topic of wide interest. Two models, the Bow Tie and the Daisy, stand out in previous research. In this work, we take a different approach and view the Web as a hierarchy of three levels: page, host, and domain. The structure at each level is analyzed and compared using a snapshot of the Chinese Web from early 2006, comprising 830 million pages, 17 million hosts, and 0.8 million domains. Some interesting results emerge. For example, at the page level the Chinese Web looks more like a teapot (a large strongly connected component, or SCC, a medium-sized IN, and a small OUT) than the classic bow tie or daisy. Some puzzling phenomena are also observed. For example, IN becomes much smaller than OUT at the host and domain levels.
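To make the SCC / IN / OUT terminology concrete, here is a minimal Python sketch of a bow-tie style decomposition on a small link graph. It is an illustration only, not the authors' pipeline: the function names `bow_tie` and `collapse_to_hosts` are hypothetical, the host-level collapse (one node per URL hostname, self-links dropped) is an assumed aggregation rule, and the SCC search is a simple quadratic method that would not scale to hundreds of millions of pages.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def _reach(starts, adj):
    """Return every node reachable from `starts` by following `adj`."""
    seen = set(starts)
    stack = list(starts)
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def bow_tie(edges):
    """Split a directed graph into SCC (largest core), IN, OUT and OTHER.

    `edges` is an iterable of (source, target) pairs.
    """
    succ, pred, nodes = defaultdict(set), defaultdict(set), set()
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
        nodes.update((u, v))

    # Find the largest strongly connected component. A node's SCC is the
    # intersection of what it can reach and what can reach it.
    # (Simple O(V * (V + E)) approach, chosen for clarity, not speed.)
    remaining, core = set(nodes), set()
    while remaining:
        n = next(iter(remaining))
        comp = _reach([n], succ) & _reach([n], pred)
        remaining -= comp
        if len(comp) > len(core):
            core = comp

    in_part = _reach(core, pred) - core   # can reach the core
    out_part = _reach(core, succ) - core  # reachable from the core
    other = nodes - core - in_part - out_part
    return {"SCC": core, "IN": in_part, "OUT": out_part, "OTHER": other}


def collapse_to_hosts(page_edges):
    """Assumed host-level view: map each page URL to its hostname."""
    host_edges = set()
    for u, v in page_edges:
        hu, hv = urlsplit(u).hostname, urlsplit(v).hostname
        if hu != hv:  # drop links that stay within one host
            host_edges.add((hu, hv))
    return host_edges
```

Calling `bow_tie` on the page-level edges and again on `collapse_to_hosts(edges)` gives component sizes at two of the three levels. A domain-level view would need an extra rule for mapping hostnames to registered domains, which is left out here.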

Major Findings:

Source: Zhu, J. J. H., Meng, T., Xie, Z. M., Li, G., & Li, X. M. (2008). A Teapot graph and its hierarchical structure of the Chinese Web. WWW’08: Proceedings of the 17th international conference on World Wide Web, pp. 1133-1134.
