Google recently announced on its product blogs that it has begun an effort to index the “invisible” web. The details point to what may be a big step in the technology for indexing online content. The announcement describes detecting online forms and filling them with suitable data to generate pages that can then be indexed.

An excerpt from the Google Webmaster Central Blog:
In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn’t find and index for users who search on Google. Specifically, when we encounter a
The invisible web is the part of the Internet that search bots and crawlers cannot reach in the normal course of indexing.
Wikipedia has a synopsis of what qualifies as the invisible web.
Deep Web resources may be classified into one or more of the following categories:
Until recently the invisible web was indexed only when sites were made available through submission. Google’s approach hints at the application of several technologies that the company has been researching for years but has seldom mentioned in its products.
The new ability of Google’s crawlers to simulate form submissions implies the application of artificial intelligence, language processing and contextual analysis, technologies that have come to Google by way of acquisition and in-house talent. For now, the process is limited to forms that use GET for data submission.
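To make the GET restriction concrete: a GET form puts its field values in the URL's query string, so every filled-in form corresponds to a plain URL that a crawler can fetch like any other link. The Python sketch below illustrates that idea only. It is not Google's method, and the function name `candidate_urls`, the example.com addresses and the field names are all hypothetical.

```python
from itertools import product
from urllib.parse import urlencode, urljoin


def candidate_urls(page_url, action, fields):
    """Build one GET URL per combination of candidate field values.

    `fields` maps each input name to the list of values to try.
    Assumes `action` carries no query string of its own.
    """
    names = list(fields)
    # Resolve the form's action against the page the form was found on.
    base = urljoin(page_url, action)
    urls = []
    for combo in product(*(fields[name] for name in names)):
        query = urlencode(list(zip(names, combo)))
        urls.append(f"{base}?{query}")
    return urls


# Hypothetical form: method="get", action="/search", one drop-down, one text box.
urls = candidate_urls(
    "http://example.com/catalog/index.html",
    "/search",
    {"category": ["books", "music"], "q": ["jazz"]},
)
for url in urls:
    print(url)
```

A drop-down offers a fixed list of values to enumerate, but a free-text box does not. Choosing sensible text to enter is the hard part, and presumably where the language processing and contextual analysis come in.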
What this means is that Google will now be able to answer users’ questions more accurately. This is another effort by Google to be the one-stop shop for all queries related to the web: it essentially automates the search process at other websites so that users get a final result page.