Search Engines

Google Purchases Aardvark

On Sunday LISNEWS had a story about the search engine Aardvark. Aardvark, a social search company, is developing a new paradigm for Web searches that taps into social networks, not automated formulas, to provide answers to queries.

Today Aardvark has been purchased by Google. Story in the Washington Post.

The Importance of Word-Sense Disambiguation in Online Information Retrieval

By Jeffrey Beall

The Problem

Word-sense disambiguation is the ability of an online system to distinguish among the senses, or meanings, of words in online searching. Say, for example, that you need information on boxers, so you access an Internet search engine and enter "boxers" in the search box. The search engine then finds documents that contain the word "boxers" and returns those documents to you as search results.

You probably already see the problem here -- the word "boxers" is a homonym with several different meanings, and the search engine doesn’t know which meaning you want. Boxers are a breed of dog, a category of athlete, and a kind of men’s garment. It’s also the possessive of a surname, as in "Barbara Boxer’s bill …" Finally, boxers were those who participated in the Boxer Rebellion in China from 1899 to 1901. There may be additional meanings.
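
The ambiguity is easy to reproduce. The following is a minimal sketch, not a description of how any real search engine works: the document titles, the `sense` labels, and the function names `keyword_search` and `sense_search` are all made up for illustration. It shows plain keyword matching returning one hit for each meaning of "boxers", and a hypothetical sense label narrowing the results to a single meaning.

```python
import re

# Hypothetical mini-collection: titles and sense labels are invented for this example.
DOCUMENTS = [
    {"title": "Caring for boxers and other working dogs", "sense": "dog breed"},
    {"title": "Great heavyweight boxers of the 20th century", "sense": "athlete"},
    {"title": "Briefs or boxers: a buyer's guide", "sense": "garment"},
    {"title": "The Boxers and the siege of Peking, 1900", "sense": "Boxer Rebellion"},
    {"title": "Training tips for poodles", "sense": "dog breed"},
]


def keyword_search(query, documents):
    """Return every document whose title contains the query as a whole word."""
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    return [doc for doc in documents if pattern.search(doc["title"])]


def sense_search(query, sense, documents):
    """Keyword search restricted to documents labeled with one sense (assumed labels)."""
    return [doc for doc in keyword_search(query, documents) if doc["sense"] == sense]


# Plain keyword matching cannot tell the meanings apart: it returns all of them.
hits = keyword_search("boxers", DOCUMENTS)
for doc in hits:
    print(f'{doc["sense"]:>16}: {doc["title"]}')

# With a sense label available, the same query can be narrowed to one meaning.
dog_hits = sense_search("boxers", "dog breed", DOCUMENTS)
```

In this sketch the sense labels are assigned by hand, which is the part a real system would have to supply through cataloging or automated disambiguation.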

Information retrieval in libraries has transitioned from the high precision and recall that legacy library systems offered to the probabilistic and linguistic free-for-all that internet search engines now provide. One of the great values of legacy library databases was that they effectively handled polysemy -- the ability of a term to have multiple meanings -- in searching. Because online searching needs word-sense disambiguation to be effective and precise, it’s important for all librarians to understand the problem and its solutions.


A Search Engine That Relies on Humans

Aardvark, a social search company, is developing a new paradigm for Web searches that taps into social networks, not automated formulas, to provide answers to queries.

Article at NYT.com

Bing to Include Results from WolframAlpha

In a partnership to be initially rolled out in the United States, Bing plans to use data sets and algorithms from WolframAlpha, the computational knowledge engine, to punch up its search results. Particular emphasis is being placed on Wolfram's quick calculations when it comes to nutrition and health information.

Story at BBC News

What's Next? Twitter Search on Google

Twitter has signed deals to put messages sent via the microblogging service into the Microsoft and Google search indexes, BBC News reports.

The deals will see messages, or tweets, show up in Bing and Google search results almost as soon as they show up on Twitter.

Microsoft has moved quickly to set up a stand-alone Twitter search page accessible via its Bing site.

Google said its Twitter search service would debut within the next few months.

Google Book Search Hearing to Be Postponed

The parties in the Google Book Search Settlement have asked the court to adjourn the scheduled October 7th fairness hearing, telling the court the parties intend to amend the deal. "Because the parties, after consultation with the DOJ, have determined that the Settlement Agreement that was approved preliminarily in November 2008 will be amended, plaintiffs respectfully submit that the Fairness Hearing should not be held, as scheduled, on October 7," reads a memorandum appended to the parties' motion to adjourn.

"To continue on the current schedule would put the Court in a position of reviewing and having participants at the hearing speak to the original Settlement Agreement, which will not be the subject of a motion for final approval." The court is expected to grant the motion. Publishers Weekly reports.

Google's Book Search: A Disaster for Scholars

Whether the Google books settlement passes muster with the U.S. District Court and the Justice Department, Google's book search is clearly on track to becoming the world's largest digital library. No less important, it is also almost certain to be the last one. Google's five-year head start and its relationships with libraries and publishers give it an effective monopoly: No competitor will be able to come after it on the same scale. Nor is technology going to lower the cost of entry. Scanning will always be an expensive, labor-intensive project. Of course, 50 or 100 years from now control of the collection may pass from Google to somebody else—Elsevier, Unesco, Wal-Mart. But it's safe to assume that the digitized books that scholars will be working with then will be the very same ones that are sitting on Google's servers today, augmented by the millions of titles published in the interim.

That realization lends a particular urgency to the concerns that people have voiced about the settlement: about pricing, access, and privacy, among other things. But for scholars, it raises another, equally basic question: What assurances do we have that Google will do this right?

More from Geoffrey Nunberg at the Chronicle of Higher Education.

The Real-Time Library

Academic libraries always had elements of Web 2.0 to them, but without the 2.0 technology. Much the same, the exchange of information in real-time (think phone and F2F reference) is not new to libraries, but now we have the convenience, immediacy and community presence of the real-time web world. We are poised to move there.

Bing Keeps Rising

It may be far too early to pop the champagne on the Microsoft campus, but a celebration with a round of beers — the good stuff — may be in order.

More at the NYT Bits Blog

Victory for LGBT Websites in Tennessee School Districts

NPR's Andy Carvin reports from "All Tech Considered"...

The American Civil Liberties Union announced today that they have settled out of court with two Tennessee school districts sued on behalf of local students for blocking classroom access to lesbian, gay, bisexual and transgender Web sites. The lawsuit, as we reported last May, alleged that Metropolitan Nashville Public Schools and Knox County Schools violated the rights of three students by denying them access to LGBT sites, yet continued to allow access to sites that advocated "reparative therapy" programs that attempt to change a person's sexual orientation.

As part of the settlement, the school districts agreed to unblock the LGBT Web sites. If the districts re-block the sites at any time, the ACLU says it will bring the case back to court.
