Call for Papers (Archived)

Overview

The ubiquitous use of search engines to discover and access internet content clearly shows the success of information retrieval algorithms. However, unlike controlled collections, the vast majority of Web pages lack an authority asserting their quality. This openness of the Web has been the key to its rapid growth and success, but it is also a major source of new adversarial challenges for information retrieval methods.

Adversarial Information Retrieval addresses tasks such as gathering, indexing, filtering, retrieving and ranking information from collections wherein a subset has been manipulated maliciously. On the Web, the predominant form of such manipulation is "search engine spamming" or spamdexing, i.e., malicious attempts to influence the outcome of ranking algorithms, aimed at getting an undeserved high ranking for some items in the collection. There is an economic incentive to rank higher in search engines, considering that a good ranking on them is strongly correlated with more traffic, which often translates to more revenue.

This, the third AIRWeb workshop, builds on the previous successful meetings at Chiba, Japan as part of WWW2005, and at Seattle, USA as part of SIGIR2006. This workshop provides a focused venue for both mature and early-stage work in web-based adversarial IR.

We solicit both full and short submissions on any aspect of adversarial information retrieval on the Web. Particular areas of interest include, but are not limited to:

Link spam: nepotistic linking, collusion, link farms, link exchanges and link bombing.
Content spam: keyword stuffing, phrase stitching, and other techniques for generating synthetic text.
Cloaking: sending different content to a search engine than to regular visitors of a web site, which is often used in combination with other spamming techniques.
Comment spam in legitimate sites: in blogs, forums, wikis, etc.
Spam-oriented blogging: splogs, spings, etc.
Click fraud detection: including forging clicks for profit, or to deplete a competitor's advertising funds
Reverse engineering of ranking algorithms
Web content filtering: as used by governments, corporations or parents to restrict access to inappropriate content
Advertisement blocking: developing software for blocking advertisements during browsing
Stealth crawling: crawling the Web while avoiding detection
Malicious tagging: for injecting keywords or for self-promotion in general

Papers addressing higher-level concerns, e.g., whether "open" algorithms can succeed in an adversarial environment, whether permanent solutions are possible, how the problem has evolved over the years, what the differences are between "black-hat", "white-hat", and "gray-hat" techniques, where the line lies between search engine optimization and spamming, etc., are also welcome.

Each paper will receive at least three anonymous reviews and will be judged on the basis of relevance, originality, quality, and presentation. Proceedings of the workshop will be included in the ACM Digital Library. Substantially longer papers may also be submitted to the ACM Transactions on the Web (TWEB) special issue on adversarial issues in Web search.

Full papers are limited to 8 pages; work-in-progress papers are limited to 4 pages. See submission instructions for details.

Web Spam Challenge

This year, we are introducing a novel element: a Web Spam Challenge for testing web spam detection systems. This challenge is supported by the EU Network of Excellence PASCAL Challenge Program, and by the DELIS EU-FET research project. Participation is open to all.

We will be using the WEBSPAM-UK2006 collection for Web Spam Detection, which comprises a large set of web pages, a web graph, and human-provided labels for a set of hosts. To reduce the amount of work due to data processing, we provide a set of features extracted from the contents and links in the collection, which may be used by the participant teams in addition to any automatic technique they choose to use.

We ask that participants of the Web Spam Challenge submit predictions (normal/spam) for all unlabeled hosts in the collection. Predictions will be evaluated and results will be announced at the AIRWeb 2007 workshop. See the Web Spam Challenge website for details.
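As an illustration only, the sketch below shows one plausible way a team might score its own host-level predictions against the labeled hosts before submitting. The metric actually used by the challenge is not stated here; the function name `score_predictions`, the dictionary layout, the example host names, and the rule that a missing host counts as "normal" are all assumptions made for this example.

```python
from typing import Dict


def score_predictions(truth: Dict[str, str], predicted: Dict[str, str]) -> Dict[str, float]:
    """Precision, recall and F1 for the 'spam' class over the labeled hosts.

    Both arguments map a host name to either 'normal' or 'spam'.
    This layout is hypothetical, not the official challenge format.
    """
    tp = fp = fn = 0
    for host, label in truth.items():
        # Assumption: a host with no prediction is treated as 'normal'.
        guess = predicted.get(host, "normal")
        if guess == "spam" and label == "spam":
            tp += 1
        elif guess == "spam" and label == "normal":
            fp += 1
        elif guess == "normal" and label == "spam":
            fn += 1

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up hosts and labels, for illustration only.
    truth = {
        "a.example.co.uk": "spam",
        "b.example.co.uk": "normal",
        "c.example.co.uk": "spam",
        "d.example.co.uk": "normal",
    }
    predicted = {
        "a.example.co.uk": "spam",
        "b.example.co.uk": "spam",
        "c.example.co.uk": "normal",
    }
    print(score_predictions(truth, predicted))
```

Because spam hosts are typically a minority of a collection, per-class measures such as these tend to be more informative than plain accuracy, which is why the sketch reports them for the "spam" class.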

Participation in the challenge does not require a paper submission, and researchers submitting papers are not required to participate in the challenge. However, we encourage participants of the Web Spam Challenge to also submit research articles describing their systems, and we encourage authors submitting research articles on Web Spam detection to participate in the challenge.

Student Grants

Students who author or co-author accepted papers at AIRWeb 2007 are eligible for a grant to support their travel or registration. These grants are made possible by sponsorship from Microsoft.

Up to three students will be supported, with an expected level of support of USD 500 each. If you are a student authoring or co-authoring an accepted paper at AIRWeb 2007, please send an e-mail to the organizers by March 30th, 2007, indicating your name, school, the purpose of the grant (travel/registration) and the amount requested.

Important Dates

~~7 February 2007: E-mail intention to submit (optional, but helpful)~~
~~26 February 2007: Extended deadline for submissions~~
20 March 2007: Notification of acceptance
30 March 2007: Camera-ready copy due
30 March 2007: Challenge submissions due
8 May 2007: Date of workshop

2007 Organizing Committee

Carlos Castillo, Yahoo! Research
Kumar Chellapilla, Microsoft Live Labs
Brian D. Davison, Lehigh University

2007 Program Committee

Einat Amitay, IBM Research
András Benczúr, Hungarian Academy of Sciences
Andrei Broder, Yahoo! Research
Soumen Chakrabarti, Indian Institute of Technology Bombay
Paul-Alexandru Chirita, University of Hannover
Tim Converse, Powerset
Nick Craswell, Microsoft Research
Matt Cutts, Google
Ludovic Denoyer, University Paris 6
Aaron D'Souza, Google
Dennis Fetterly, Microsoft Research
Tim Finin, University of Maryland
Edel García, Mi Islita.com
Natalie Glance, Nielsen BuzzMetrics
Antonio Gulli, Ask.com
Zoltán Gyöngyi, Stanford University
Monika Henzinger, Google & Ecole Polytechnique Fédérale de Lausanne (EPFL)
Jeremy Hylton, Google
Ronny Lempel, IBM Research
Mark Manasse, Microsoft Research
Gilad Mishne, University of Amsterdam
Marc Najork, Microsoft Research
Jan Pedersen, Yahoo!
Tamás Sarlós, Hungarian Academy of Sciences
Erik Selberg, Microsoft Search Labs
Mike Thelwall, University of Wolverhampton
Andrew Tomkins, Yahoo! Research
Matt Wells, Gigablast
Baoning Wu, Snap.com
Tao Yang, Ask.com

AIRWeb 2007

Third International Workshop on
Adversarial Information Retrieval on the Web

AIRWeb'07
