Paper: Anticipating Information Needs Based on Check-in Activity


In this work we address the development of a smart personal assistant that is capable of anticipating a user's information needs based on a novel type of context: the person's activity inferred from her check-in records on a location-based social network. Our main contribution is a method that translates a check-in activity into an information need, which is in turn addressed with an appropriate information card. This task is challenging because of the large number of possible activities and related information needs, which need to be addressed in a mobile dashboard that is limited in size. Our approach considers each possible activity that might follow after the last (and already finished) activity, and selects the top information cards such that they maximize the likelihood of satisfying the user's information needs for all possible future scenarios. The proposed models also incorporate knowledge about the temporal dynamics of information needs. Using a combination of historical check-in data and manual assessments collected via crowdsourcing, we show experimentally the effectiveness of our approach.
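The card-selection idea in the abstract can be sketched concretely: given an estimated distribution over possible next activities and per-activity relevance scores $P(i|a)$ for each information need, the expected relevance of a card is the probability-weighted sum over the future scenarios, and the dashboard shows the top-k cards. The function name (`select_cards`) and the toy activities, needs, and probabilities below are illustrative assumptions, not the paper's actual model:

```python
from collections import defaultdict

def select_cards(next_activity_probs, need_relevance, k):
    """Rank information needs (cards) by expected relevance over next activities.

    next_activity_probs: dict mapping a candidate next activity a' to P(a'|a_prev)
    need_relevance:      dict mapping (activity, need) pairs to P(i|a')
    Returns the k needs with the highest expected relevance.
    """
    scores = defaultdict(float)
    for activity, p_act in next_activity_probs.items():
        for (act, need), p_need in need_relevance.items():
            if act == activity:
                # Weight the need's relevance by how likely this scenario is.
                scores[need] += p_act * p_need
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy example: after leaving a restaurant, the user may go to a cinema or a bar.
next_probs = {"cinema": 0.7, "bar": 0.3}
relevance = {
    ("cinema", "showtimes"): 0.9,
    ("cinema", "tickets"):   0.6,
    ("bar", "happy hour"):   0.8,
    ("bar", "tickets"):      0.1,
}
print(select_cards(next_probs, relevance, 2))  # → ['showtimes', 'tickets']
```

Note that a need useful in several future scenarios (like "tickets" above) can outrank a need that is highly relevant in only one of them.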

Selection of related papers

  • Modeling User Interests for Zero-Query Ranking, Liu Yang, Qi Guo, Yang Song, Sha Meng, Milad Shokouhi, Kieran McDonald, W. Bruce Croft (link)
  • From Queries to Cards: Re-ranking Proactive Card Recommendations Based on Reactive Search History, Milad Shokouhi, Qi Guo (link)
  • Anticipatory Search: Using Context to Initiate Search, Daniel J. Liebling, Paul N. Bennett, Ryen W. White (link)
  • Learning Optimal Card Ranking from Query Reformulation, Liangjie Hong, Yue Shi, Suju Rajan (link)
  • A context-aware model for proactive recommender systems in the tourism domain, Matthias Braunhofer, Francesco Ricci, Béatrice Lamche, Wolfgang Wörndl (link)
  • User Interactions with Everyday Applications as Context for Just-in-Time Information Access, Jay Budzik, Kristian J. Hammond (link)
  • Just-in-Time Information Retrieval Agents, Bradley J. Rhodes, Pattie Maes (link)


On this page we provide additional resources related
to the above-mentioned paper. The resources include
experiment details, datasets to download, further
analyses, and algorithms.

Currently, the paper is under review and,
for obvious reasons, we provide only previews
of the files at the moment.

Crowdsourcing experiments

Experiment #1

  • We seek to measure the recall of the extracted information needs. We ask people to imagine being at a location from a given top-level POI category and provide us with the top three information needs that they would search for on a mobile device in that situation. (graphics/experiment01-form.png)
  • see experiment layout

Experiment #2 (textual mode)

  • The second experiment is aimed at determining how well we can rank information needs with respect to their relevance given an activity (i.e., $P(i|a)$). We ask study participants to rate the usefulness of a given information need with respect to a selected category on a 5-point Likert scale, from 'not useful' to 'very useful'. We evaluated the top 25 information needs for the 5 most visited second-level categories for each of the 9 top-level categories, amounting to 1125 distinct information-need/activity pairs.
  • see experiment layout
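One simple way to turn the 5-point judgments collected in this experiment into a relevance estimate $P(i|a)$ is to average the ratings per information-need/activity pair and rescale them to [0, 1]. This mapping (`likert_to_relevance`) is a hypothetical illustration, not necessarily the aggregation used in the paper:

```python
def likert_to_relevance(judgments):
    """Map 5-point Likert judgments (1 = 'not useful' .. 5 = 'very useful')
    for one (information need, activity) pair to a score in [0, 1]."""
    mean = sum(judgments) / len(judgments)
    return (mean - 1) / 4  # rescale the [1, 5] range to [0, 1]

# Five workers judged one information need for one category:
print(likert_to_relevance([4, 5, 3, 4, 4]))  # → 0.75
```

Scores normalized this way can then be compared across activities, e.g. to rank the 25 needs within each second-level category.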

Experiment #3 (card-based mode)

  • Identical settings as in experiment #2, with one difference: the information needs are not presented as text; instead, information cards are used.
  • see experiment layout

Experiment #4

  • Experiment #4 focuses on collecting judgments about the temporal scope of information needs. In batches of 5, we presented the 30 top-ranked information needs in each top-level category. The task for the assessors was to decide when they would search for that piece of information in the given activity context: before, during, or after performing the activity. They were allowed to select more than one answer if the information need was regarded as useful for multiple time slots.
  • see experiment layout
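One plausible way to aggregate these multi-label answers is per-slot majority voting: keep a time slot (before/during/after) if at least a given fraction of the 9 assessors selected it. The `temporal_scope` helper and the 0.5 threshold below are illustrative assumptions, not the paper's exact procedure:

```python
def temporal_scope(votes, n_assessors, threshold=0.5):
    """Multi-label aggregation of temporal judgments.

    votes: dict mapping a time slot ('before'/'during'/'after') to the
           number of assessors who selected it.
    Returns the set of slots selected by at least `threshold` of assessors.
    """
    return {slot for slot, count in votes.items()
            if count / n_assessors >= threshold}

# 9 assessors judged when they would search for one information need:
print(temporal_scope({"before": 7, "during": 4, "after": 1}, 9))  # → {'before'}
```

Because an information need may legitimately span several slots, the result is a set rather than a single label.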

Experiment #5 (top category)

  • Experiments #5 and #6 are used to evaluate how well we can anticipate (i.e., rank) information needs given a past activity. Crowd judges are tasked with evaluating the usefulness of individual information needs, presented as cards, given the transition between two activities. We collected judgments for the top 10 information needs from each of the activities in the transition. #5 considers top-level activities.
  • see experiment layout

Experiment #6 (second category)

  • Identical settings as in experiment #5, with one difference: second-level activities are used.
  • see experiment layout

Overview table:

Experiment  #Tasks  Workers/task  Payment/task  Worker satisfaction  Payment total  Download dataset
#1               9            30        10.00¢                86.0%            $27  download
#2            1125             5         0.60¢                68.0%            $34  download
#3            1125             5         0.60¢                66.9%            $34  download
#4             335             9         2.00¢                84.0%            $60  download
#5            1148             5         0.75¢                60.0%            $43  download
#6            1240             3         0.75¢                72.0%            $28  download
Total                                                                        $226  download all


Query suggestions

We retrieved query suggestions for a sample of Foursquare POIs (see Section 3.2.1). Here we provide this data after the cleansing steps described in the paper.

Normalized information needs

As described in Section 3.3.2, we normalized the information needs extracted from Google Suggestions. The clusterings provided by 3 assessors, as well as the final canonical set, are made available here.

Foursquare check-in dataset