Online activities categorization of mobile users


In this paper, we collected online activity data from one million mobile users on the wireless network of a national carrier and analyzed a random sample of 7,500 mobile users, who generated 20 million online requests. We note several characteristics that make characterizing a real user’s online activities challenging: a large volume of background activities, URL shortening and redirection, and widespread encrypted, dynamic, or personalized content. We propose a scheme for online activity characterization that addresses these issues. Our method relies on URL expansion to reveal the true destination URL, host/domain classification to provide the right context, and finally the extracted URL tokens to determine the most appropriate activity category. We demonstrate and validate the effectiveness of our approach on real mobile data and publicly available data such as the Yahoo Open Directory Project.
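
The three stages named above (URL expansion, host/domain classification, token-based categorization) can be illustrated with a minimal sketch. This is not the paper's implementation: the lookup tables `SHORT_URLS`, `HOST_CONTEXT`, and `TOKEN_CATEGORIES`, the example hosts, and the simple vote count are all hypothetical stand-ins, and a real system would resolve redirects over the network and use far larger dictionaries.

```python
import re
from urllib.parse import urlsplit

# Hypothetical lookup tables for illustration; no entry comes from the paper.
SHORT_URLS = {"http://sho.rt/abc": "http://news.example.com/sports/football/scores"}
HOST_CONTEXT = {"news.example.com": "news"}
TOKEN_CATEGORIES = {"sports": "sports", "football": "sports", "recipe": "food"}


def expand(url):
    """Follow known redirects until no further mapping exists (loop-safe)."""
    seen = set()
    while url in SHORT_URLS and url not in seen:
        seen.add(url)
        url = SHORT_URLS[url]
    return url


def tokens(url):
    """Split the path and query of a URL into lowercase alphanumeric tokens."""
    parts = urlsplit(url)
    text = (parts.path + " " + parts.query).lower()
    return [t for t in re.split(r"[^a-z0-9]+", text) if t]


def categorize(url):
    """Return (host, host context, activity category) for one request URL."""
    final = expand(url)
    host = urlsplit(final).hostname or ""
    context = HOST_CONTEXT.get(host, "unknown")
    votes = {}
    for token in tokens(final):
        category = TOKEN_CATEGORIES.get(token)
        if category:
            votes[category] = votes.get(category, 0) + 1
    # Fall back to the host context when no token matches a known category.
    activity = max(votes, key=votes.get) if votes else context
    return host, context, activity
```

In this sketch the host context serves only as a fallback when the URL tokens carry no recognizable category; how the paper actually combines the two signals is not stated in the abstract.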

In IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)