BARAC: An Effective Presentation of Ranked Structured Datasets
| When | Mar 05, 2010 from 02:00 PM to 03:00 PM |
|---|---|
| Where | Engr IV Maxwell Room 57-124 |
Julia Stoyanovich
University of Pennsylvania
Friday, March 5, 2010 at 2:00pm
Engr IV Maxwell Room 57-124
Abstract
In online applications such as Yahoo! Personals and Yahoo! Real Estate,
users define structured profiles in order to find potentially
interesting matches. Typically, profiles are evaluated against large
datasets and produce thousands of matches. In addition to filtering,
users also specify ranking in their profile, and matches are returned in
the form of a ranked list. Top results in a ranked list are often
homogeneous, which hinders data exploration. For example, a user looking
for 1- or 2-bedroom apartments sorted by price will see a large number
of cheap 1-bedrooms in undesirable neighborhoods before seeing any
apartments with different characteristics. An alternative to ranking is
to group matches on common attribute values (e.g., cheap 1-bedrooms in
good neighborhoods, 2-bedrooms with 2 baths, etc.). However, not all
groups will be of interest to the user given his ranking criteria. We
argue here that neither single-list ranking nor attribute-based grouping
is adequate for effective exploration of ranked datasets. We formalize
rank-aware clustering and develop BARAC, a novel clustering algorithm
that enables rank-aware data exploration in domains with a large number
of heterogeneous attributes. We present results of a large-scale user
study that validate the effectiveness of our approach. We extensively
evaluate the performance of our algorithm over large datasets from
Yahoo! Personals, a leading online dating site.
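The apartment example in the abstract can be made concrete with a small sketch. The listings, field layout, and function names below are hypothetical toy values chosen for illustration. The code only contrasts single-list ranking with attribute-based grouping; it is not BARAC and does not attempt rank-aware clustering.

```python
from collections import defaultdict

# Hypothetical toy listings: (bedrooms, neighborhood quality, monthly price).
listings = [
    (1, "poor", 900),
    (1, "poor", 950),
    (1, "poor", 1000),
    (1, "good", 1400),
    (2, "poor", 1500),
    (2, "good", 2100),
]

def rank_by_price(rows):
    """Single-list ranking: cheapest listings first."""
    return sorted(rows, key=lambda row: row[2])

def group_by_attributes(rows):
    """Attribute-based grouping on (bedrooms, neighborhood).

    Listings inside each group keep their price order.
    """
    groups = defaultdict(list)
    for row in rank_by_price(rows):
        groups[(row[0], row[1])].append(row)
    return dict(groups)

# The top of the ranked list is homogeneous: only cheap 1-bedrooms
# in the "poor" neighborhood appear.
top3 = rank_by_price(listings)[:3]

# Grouping exposes every attribute combination, whether or not the
# group would rank well under the user's criteria.
groups = group_by_attributes(listings)

if __name__ == "__main__":
    print("Top 3 by price:", top3)
    for key, members in groups.items():
        print(key, members)
```

In this toy data the three cheapest listings share the same attributes, while grouping yields four groups with no indication of which ones matter under the price ranking. That gap is the one the abstract argues rank-aware clustering should fill.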
Biography
Julia Stoyanovich is a Postdoctoral Researcher and a Computing
Innovations Fellow at the University of Pennsylvania. Julia holds M.S.
and Ph.D. degrees in Computer Science from Columbia University, and a
B.S. in Computer Science and in Mathematics and Statistics from the
University of Massachusetts at Amherst. After receiving her B.S., Julia
went on to work for two start-ups and one real company in New York City,
where she interacted with a variety of massive datasets. Julia's
industry experience convinced her that many practical data management
challenges remain to be tackled, and that she does not like to wake up
early in the morning, prompting her return to academia. Julia's research
focuses on improving search, ranking, and data exploration in
semantically rich application domains. She is particularly excited
about the challenges that arise in life sciences applications and in
social information processing.
