Scholar  Results 1 - 10 of about 2,130. (0.14 sec) 

[PDF] Crawling the hidden web

psu.edu [PDF]
S Raghavan, H Garcia-Molina - … of the International Conference on Very …, 2001 - Citeseer
Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of web
pages reachable purely by following hypertext links, ignoring search forms and pages that require
authorization or prior registration. In particular, they ignore the tremendous amount of ...
Cited by 434 - Related articles - View as HTML - BL Direct - All 61 versions

The hidden web

psu.edu [PDF]
H Kautz, B Selman, M Shah - AI magazine, 1997 - aaai.org
The vast network of linked documents that make up the World Wide Web (WWW) is only one
manifestation of a larger and more profound phenomenon; namely, the social network that links
all people. In the 1960s, Stanley Milgram's (1967) pioneering work on the small-world ...
Cited by 259 - Related articles - BL Direct - All 20 versions

Distributed search over the hidden web: Hierarchical database sampling and …

psu.edu [PDF]
PG Ipeirotis, L Gravano - … of the 28th international conference on …, 2002 - portal.acm.org
Many valuable text databases on the web have non-crawlable contents that are “hidden” behind
search interfaces. Metasearchers are helpful tools for searching over many such databases
at once through a unified query interface. A critical task for a metasearcher to process a ...
Cited by 129 - Related articles - All 38 versions

Probe, count, and classify: categorizing hidden web databases

psu.edu [PDF]
PG Ipeirotis, L Gravano, M Sahami - Proceedings of the 2001 ACM …, 2001 - portal.acm.org
ABSTRACT The contents of many valuable web-accessible databases are only accessible through
search interfaces and are hence invisible to traditional web “crawlers.” Recent studies have
estimated the size of this “hidden web” to be 500 billion pages, while the size of the “ ...
Cited by 128 - Related articles - BL Direct - All 27 versions

QProber: A system for automatic classification of hidden-web databases

psu.edu [PDF]
L Gravano, PG Ipeirotis, M Sahami - ACM Transactions on …, 2003 - portal.acm.org
The contents of many valuable Web-accessible databases are only available through search
interfaces and are hence invisible to traditional Web “crawlers.” Recently, commercial Web
sites have started to manually organize Web-accessible databases into Yahoo!-like ...
Cited by 90 - Related articles - BL Direct - All 24 versions

Downloading textual hidden web content through keyword queries

psu.edu [PDF]
A Ntoulas, P Zerfos, J Cho - Proceedings of the 5th ACM/IEEE-CS …, 2005 - portal.acm.org
ABSTRACT An ever-increasing amount of information on the Web today is available only through
search interfaces: the users have to type in a set of keywords in a search form in order to access
the pages from certain Web sites. These pages are often referred to as the Hidden Web or ...
Cited by 73 - Related articles - All 14 versions

[PDF] Siphoning hidden-web data through keyword-based interfaces

psu.edu [PDF]
L Barbosa, J Freire - Proc. of SBBD, 2004 - Citeseer
Abstract In this paper, we study the problem of automating the retrieval of data hidden behind
simple search interfaces that accept keyword-based queries. Our goal is to automatically retrieve
all available results (or, as many as possible). We propose a new approach to siphon ...
Cited by 50 - Related articles - View as HTML - All 10 versions

[PDF] Searching for hidden-web databases

psu.edu [PDF]
L Barbosa, J Freire - Proceedings of WebDB, 2005 - Citeseer
ABSTRACT Recently, there has been increased interest in the retrieval and integration of hidden
Web data with a view to leverage high-quality information available in online databases. Although
previous works have addressed many aspects of the actual integration, including ...
Cited by 42 - Related articles - View as HTML - All 7 versions

Automatic generation of agents for collecting hidden web pages for data extraction


J Palmieri Lage, AS da Silva, PB Golgher, AHF … - Data & Knowledge …, 2004 - Elsevier
As the Web grows, more and more data has become available under dynamic forms of
publication, such as legacy databases accessed by an HTML form (the so-called hidden
Web). In situations such as this, integration of this data relies more and more on the fast ...
Cited by 44 - Related articles - All 5 versions

Crawling for domain-specific hidden Web resources


A Bergholz, B Chidlovskii - Proceedings of the …, 2003 - doi.ieeecomputersociety.org
The Hidden Web, the part of the Web that remains unavailable for standard crawlers, has become
an important research topic during recent years. Its size is estimated to 400 to 500 times larger
than that of the Publicly Indexable Web (PIW). Furthermore, the information on the ...
Cited by 34 - Related articles - All 6 versions



©2009 Google