
GOOGLE API


edith




Presentation Transcript


  1. GOOGLE API
     • Search requests: submit a query string and a set of parameters to the Google Web APIs service and receive in return a set of search results.
     • Cache requests: submit a URL to the Google Web APIs service and receive in return the contents of the URL when Google's crawlers last visited the page.
     • Spelling requests: submit a query to the Google Web APIs service and receive in return a suggested spelling correction for the query.

  2. CSC 9010: Google API
     Dr. Paula Matuszek
     Paula_A_Matuszek@glaxosmithkline.com
     (610) 270-6851

  3. Search Requests
     Some parameter/value pairs that can be passed to the search request:
     • key: Required to access the Google service. Google uses the key for authentication and logging.
     • q: Query. (See Query Terms for details on query syntax.)
     • maxResults: Number of results desired per query. The maximum value per query is 10.
     • filter: Activates or deactivates automatic results filtering.
     • restrict: Restricts the search to a subset of the Google Web index. (See Restricts for more details.)
     • safeSearch: Enables filtering of adult content.
     • lr: Language Restrict. Restricts the search to documents in one or more languages.
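To make the parameter list concrete, here is a minimal Python sketch that assembles the parameter/value pairs for one request and enforces the 10-result cap. It does not contact the Google service. The function name `build_search_params`, the argument defaults, and the placeholder key are assumptions made for illustration only.

```python
MAX_RESULTS_LIMIT = 10  # per-query maximum stated on the slide


def build_search_params(key, q, max_results=10, filter_results=True,
                        restrict="", safe_search=False, lr=""):
    """Return the parameter/value pairs for one search request as a dict.

    Hypothetical helper: it only validates and packages values, it does
    not send anything to the Google Web APIs service.
    """
    if not key:
        raise ValueError("key is required to access the service")
    if not 1 <= max_results <= MAX_RESULTS_LIMIT:
        raise ValueError("maxResults must be between 1 and 10")
    return {
        "key": key,
        "q": q,
        "maxResults": max_results,
        "filter": filter_results,
        "restrict": restrict,
        "safeSearch": safe_search,
        "lr": lr,
    }


# "YOUR-KEY" is a placeholder, not a real license key.
params = build_search_params("YOUR-KEY", "london OR paris", max_results=5)
print(params["q"], params["maxResults"])
```

Validating `maxResults` before sending anything avoids spending one of the limited daily queries on a request the service would reject.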

  4. Google Query Terms
     General information:
     • Default Search: AND. The order of the terms in the query will impact the search results.
     • Stop Words: Google ignores stop words unless enclosed in quotes, such as in the phrase "to be or not to be".
     • Special Characters: Most non-alphanumeric characters are treated as word separators. Exceptions are:
       • double quote mark ("): phrase search. May still ignore stop words.
       • plus sign (+): forces inclusion of a stop word.
       • minus sign or hyphen (-): excludes a term.
       • ampersand (&): treated as another character in the query term.

  5. Additional Google Query Terms
     Google supports a variety of other special query terms, such as:
     • Boolean OR Search: london OR paris
     • Site Restricted Search: site:www.stanford.edu
     • Date Restricted Search: daterange:2452122-2452234 (the bounds are Julian day numbers)
     • Title Search: intitle:Google search
     • URL Search (term): inurl:Google search
     • Back Links: link:www.google.com
     • File Type Filtering: Google filetype:doc
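The daterange: bounds are Julian day numbers, which are awkward to work out by hand. The following Python sketch converts calendar dates to that form and combines a few of the operators above into one query string. The helper names `julian_day` and `daterange_term` are assumptions for illustration, and the term "hamlet" is just a sample exclusion.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal (date.toordinal)
# and the Julian day number of the same calendar day.
JULIAN_DAY_OFFSET = 1721425


def julian_day(d):
    """Return the Julian day number for a calendar date."""
    return d.toordinal() + JULIAN_DAY_OFFSET


def daterange_term(start, end):
    """Build a daterange: term covering start..end (hypothetical helper)."""
    return "daterange:{}-{}".format(julian_day(start), julian_day(end))


query = " ".join([
    '"to be or not to be"',   # phrase search
    "-hamlet",                # exclude a term
    "site:www.stanford.edu",  # restrict to one site
    daterange_term(date(2001, 7, 31), date(2001, 11, 20)),
])
print(query)
# prints: "to be or not to be" -hamlet site:www.stanford.edu daterange:2452122-2452234
```

The two dates were chosen because they reproduce the daterange: example from the slide.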

  6. Search Result Format
     The API returns a number of components, such as:
     • <documentFiltering> - A Boolean value indicating whether filtering was performed.
     • <searchComments> - A text string intended for display to an end user, e.g. a note that "stop words" were removed from the search automatically.
     • <estimatedTotalResultsCount> - The estimated total number of results that exist for the query.
     • <resultElements> - An array of <resultElement> items. This corresponds to the actual list of search results.
     • <searchQuery> - The value of <q> for the request.
     • <directoryCategories> - An array of <directoryCategory> items. This corresponds to the ODP directory matches.

  7. Result Element
     The actual result returned has several fields, including:
     • <URL> - URL of the result, returned as text, absolute URL path.
     • <snippet> - A snippet which shows the query in context on the URL where it appears. This is formatted HTML and usually includes <B> tags within it. Note that the query term does not always appear in the snippet.
     • <title> - The title of the search result, returned as HTML.
     • <hostName> - When filtering occurs, a maximum of two results from any given host is returned. When this occurs, the second resultElement that comes from that host contains the host name in this parameter.
     • <directoryTitle> - If the URL for this resultElement is contained in the ODP directory, the title that appears in the directory appears here as a text string. Note that the directoryTitle may be different from the URL's <title>.
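Because <snippet> and <title> come back as HTML, they usually need cleaning before being shown as plain text. Here is a minimal Python sketch of that step. The function name `snippet_to_text` and the sample string are assumptions, and a regular expression is only adequate for simple markup like the <B> tags mentioned above, not for arbitrary HTML.

```python
import html
import re

# Matches any single HTML tag, e.g. <b>, </b>, <br>.
TAG_PATTERN = re.compile(r"<[^>]+>")


def snippet_to_text(snippet_html):
    """Strip HTML tags and decode character entities (hypothetical helper)."""
    without_tags = TAG_PATTERN.sub("", snippet_html)
    return html.unescape(without_tags)


# Made-up snippet in the style described on the slide.
sample = "The <b>Google</b> Web APIs &amp; related tools <b>...</b>"
print(snippet_to_text(sample))
# prints: The Google Web APIs & related tools ...
```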

  8. Cache Requests
     • Cache requests submit a URL to the Google Web APIs service and receive in return the contents of the URL when Google's crawlers last visited the page (if available).
     • The return type for cached pages is base64-encoded text.
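Since the cached page arrives base64-encoded, a client has to decode it before reading the HTML. The Python sketch below shows that decoding step on a made-up payload. The function name `decode_cached_page` is hypothetical, and treating the decoded bytes as UTF-8 is an assumption, since the real character encoding depends on the cached page.

```python
import base64


def decode_cached_page(encoded, encoding="utf-8"):
    """Decode a base64 payload into page text (hypothetical helper).

    The character encoding is an assumption; undecodable bytes are
    replaced rather than raising an error.
    """
    raw_bytes = base64.b64decode(encoded)
    return raw_bytes.decode(encoding, errors="replace")


# Stand-in for a cache response: encode a tiny page, then decode it again.
encoded = base64.b64encode(b"<html><body>cached copy</body></html>")
print(decode_cached_page(encoded))
```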

  9. Spelling Requests
     • Spelling requests submit a query to the Google Web APIs service and receive in return a suggested spelling correction for the query (if available). Spelling corrections mimic the behavior of Google's Web site.
     • Spelling requests are limited to 2048 bytes and 10 individual words.
     • The return type for spelling requests is a text string.
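The two limits can be checked locally before a spelling request is sent. This is a rough Python sketch: the function name `check_spelling_request` is hypothetical, and counting bytes in UTF-8 and words by whitespace are assumptions about how the service measures a phrase.

```python
MAX_SPELLING_BYTES = 2048  # limit stated on the slide
MAX_SPELLING_WORDS = 10    # limit stated on the slide


def check_spelling_request(phrase):
    """Return the phrase unchanged, or raise ValueError if it is too long."""
    size_in_bytes = len(phrase.encode("utf-8"))  # assumption: UTF-8 byte count
    word_count = len(phrase.split())             # assumption: whitespace-separated words
    if size_in_bytes > MAX_SPELLING_BYTES:
        raise ValueError("phrase is {} bytes, limit is {}".format(
            size_in_bytes, MAX_SPELLING_BYTES))
    if word_count > MAX_SPELLING_WORDS:
        raise ValueError("phrase has {} words, limit is {}".format(
            word_count, MAX_SPELLING_WORDS))
    return phrase


print(check_spelling_request("britney speers"))
```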

  10. Google API Lab
     • The Google API has been installed on our lab PCs. The documentation for it has also been installed. The most relevant doc files for an overview are README.txt and APIs_Reference.html.
     • The goal for this lab is to try out the API to conduct searches against Google. There are instructions in the readme file about various ways to try this.
     • FIRST, please read the license file (LICENSE.txt) so you know what we've agreed to in using the API.
     • Begin with the following command to try it out, then explore some of the other alternatives:
       java -cp googleapi.jar com.google.soap.search.GoogleAPIDemo <key> search <Foo>

  11. Google API Lab Administrivia
     • We will use my key for this lab. It has a limit of 1000 queries/day, which is plenty for trying by hand but would be easy to exceed programmatically; please be careful.
     • There are additional query limits; see the docs.
     • I have verified with Google that using my key for the class is fine. If you want to do a more detailed project, you should request your own key.
     • The Google API can be found at http://www.google.com/apis/
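One way to avoid exceeding the shared key's 1000 queries/day programmatically is to count queries locally and stop before the limit. The Python sketch below is only an illustration: the class name `DailyQueryBudget` is hypothetical, the counter lives in memory for a single process, and the service's own accounting remains authoritative.

```python
from datetime import date


class DailyQueryBudget:
    """In-memory counter that refuses queries past a daily limit."""

    def __init__(self, limit=1000):
        self.limit = limit
        self.day = None   # calendar day the current count belongs to
        self.used = 0

    def spend(self, today=None):
        """Record one query and return how many remain for the day."""
        today = today or date.today()
        if today != self.day:
            # First query of a new day: start counting from zero again.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            raise RuntimeError("daily query limit reached")
        self.used += 1
        return self.limit - self.used


budget = DailyQueryBudget(limit=1000)
remaining = budget.spend()  # call once before each request
print(remaining)
```

Calling `spend()` immediately before each request keeps a runaway loop from using up the class's whole allowance.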
