The "invisible Web," the millions of pages not indexed by Yahoo! or Google, is much bigger than most people thought. A company called BrightPlanet estimates that about 550 billion documents are now stored on the Web, while Internet search engines index only about one billion pages (Google claims slightly more than a billion). The problem, BrightPlanet says, lies in how search engines index the Web: they rely on technology that generally identifies "static" pages rather than the "dynamic" information stored in databases. You can get the report here.
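The static-vs-dynamic distinction is easy to see in miniature. The sketch below (hypothetical page content and function names, not from the report) shows why a naive link-following crawler harvests only the "static" pages an HTML document links to, while content that exists only as the result of a database query behind a form never appears as a crawlable link:

```python
import re

# Hypothetical page: two static links plus a search form. A crawler that
# only follows href targets will find the links, but the results pages
# generated by submitting the form live in a database and have no hrefs
# pointing at them -- they are "deep Web" content.
static_page = """
<a href="/about.html">About</a>
<a href="/contact.html">Contact</a>
<form action="/search" method="post">
  <input name="q">
</form>
"""

def discoverable_links(html):
    # A naive crawler discovers only explicit href targets.
    return re.findall(r'href="([^"]+)"', html)

print(discoverable_links(static_page))
# Only /about.html and /contact.html are found; whatever /search would
# return for a given query stays invisible to the crawler.
```

This is a toy illustration of the indexing gap the article describes, not BrightPlanet's actual methodology.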
This first-ever study describes the nature of the deep Web and quantifies its size, importance, and quality. Major sections cover the study's methodology, findings, and implications. The paper was written to peer-reviewable standards and contains 7 figures and 10 tables, with complete references and citations.
A story from CNET has more to say on this.
"The World Wide Web is getting to be so humongous that you need specialized engines. A centralized approach like this isn't going to be successful," predicted Carl Malamud, co-founder of Petaluma, Calif.-based Invisible Worlds.
Like BrightPlanet, Invisible Worlds is trying to extract data hidden from search engines, but it is also customizing the information it finds.
Malamud calls this process "giving context to the content."
Sullivan agreed that BrightPlanet's greatest challenge will be showing businesses and individuals how to effectively deploy the company's breakthrough.
"No one else has come up with something like this yet, so when they fetch people all this information on the deep Web, they are going to have to show people where to dive in. Otherwise, people will just drown."