SEARCH ENGINES

Search engines are among the best examples of today's technology. As the name suggests, a search engine searches. But what does it search?

Technically speaking, a search engine is a program that searches through a database. Search engines are one of the primary ways by which Internet users find web sites. In the context of the web, a search engine is a program, usually reachable through a web page, used to search for a web site or a document on the World Wide Web or in USENET groups. Some of the commonly used search engines are AltaVista, Excite, HotBot, Infoseek, Lycos, WebCrawler, AOL NetFind, Northern Light, Magellan and Yahoo.

A typical search engine accepts a "keyword" or a number of "keywords". A keyword is simply a plain English word that has something to do with the subject you are interested in. A typical search might use the single keyword "cooking". You can narrow a search by using more keywords and being more specific. For example, "cooking eggless cake" will bring back far fewer links, and links much closer to what you want, than the keyword "cooking" alone.
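
The narrowing effect of extra keywords can be sketched in a few lines of Python. This is only an illustration: the page titles and texts below are invented, and the rule that a page must contain every keyword is an assumption, since real engines differ in how they combine keywords.

```python
# Hypothetical mini-collection: page title -> page text (invented for illustration).
PAGES = {
    "Cooking basics": "cooking tips for beginners",
    "Eggless cake recipe": "cooking an eggless cake at home",
    "Grilling guide": "cooking meat on a grill",
}


def search(query, pages=PAGES):
    """Return titles of pages whose text contains every keyword in the query."""
    keywords = query.lower().split()
    return [
        title
        for title, text in pages.items()
        if all(word in text.lower().split() for word in keywords)
    ]


print(search("cooking"))               # all three pages match
print(search("cooking eggless cake"))  # only the eggless cake page matches
```

Each added keyword is one more condition a page has to satisfy, which is why the result list shrinks.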

Search engines have three major elements. The first is the spider, also called the crawler (AltaVista's crawler is called "Scooter", while Excite's is the "Architext spider"). The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being "spidered" or "crawled." Some large search engines have many spiders working in parallel; Scooter alone crawls about 10 million pages each day. The spider returns to the site on a regular basis, such as every month or two, to look for changes.
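
The visit, read and follow-links loop described above might look roughly like the sketch below. It assumes a tiny made-up site held in a dictionary (SITE) in place of real downloads, and a breadth-first visiting order; neither detail comes from any particular engine.

```python
from collections import deque

# Hypothetical site: URL -> (page text, links found on that page).
SITE = {
    "/index.html": ("welcome to the cooking site", ["/cakes.html", "/bread.html"]),
    "/cakes.html": ("eggless cake recipes", ["/index.html", "/icing.html"]),
    "/bread.html": ("simple bread recipes", ["/index.html"]),
    "/icing.html": ("icing for cakes", []),
}


def crawl(start, site=SITE):
    """Visit pages breadth-first from `start`; return {url: text} for every page read."""
    seen = {start}
    queue = deque([start])
    fetched = {}
    while queue:
        url = queue.popleft()
        text, links = site[url]  # stands in for downloading the page
        fetched[url] = text
        for link in links:
            # Follow only links inside the site that have not been queued yet.
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


for url in crawl("/index.html"):
    print(url)
```

The `seen` set keeps the spider from looping forever when pages link back to each other, as "/cakes.html" and "/index.html" do here.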

Everything the spider finds goes into the second part of a search engine, the index (the database that lies behind every search engine). The index is generated and maintained by programs, called robots, that follow links on web pages, download the pages, and then index them according to the words and phrases they contain. So the index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds (AltaVista has about 100 million pages indexed, and Excite has 55 million).
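
One common way to index pages "according to the words they contain" is an inverted index, which maps each word to the pages where it appears. The sketch below assumes that structure and uses made-up pages; the text above does not say how any particular engine stores its index.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


# Hypothetical pages, as a spider might have collected them.
pages = {
    "/cakes.html": "eggless cake recipes",
    "/bread.html": "simple bread recipes",
}

index = build_index(pages)
print(sorted(index["recipes"]))  # ['/bread.html', '/cakes.html']
print(sorted(index["cake"]))     # ['/cakes.html']
```

Looking up a keyword is then a single dictionary access, with no need to reread every page for every search.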

If a web page changes, then this book is updated with new information. Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed." Until it is indexed -- added to the index -- it is not available to those searching with the search engine.
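
The gap between "spidered" and "indexed" can be pictured with a small sketch. The class and method names here (SearchIndex, add_spidered, update) are hypothetical, chosen only to mirror the wording above.

```python
class SearchIndex:
    """Toy index that separates spidered pages from indexed (searchable) ones."""

    def __init__(self):
        self.pending = {}  # spidered, but not yet added to the index
        self.indexed = {}  # available to people searching

    def add_spidered(self, url, text):
        """Record a page the spider has just read."""
        self.pending[url] = text

    def update(self):
        """Move pending pages into the index, replacing any older copies."""
        self.indexed.update(self.pending)
        self.pending.clear()

    def search(self, word):
        """Return URLs of indexed pages containing the word."""
        return [
            url
            for url, text in self.indexed.items()
            if word.lower() in text.lower().split()
        ]


idx = SearchIndex()
idx.add_spidered("/cakes.html", "eggless cake recipes")
print(idx.search("cake"))  # [] because the page is spidered but not yet indexed
idx.update()
print(idx.search("cake"))  # ['/cakes.html']
```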

Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search (i.e. the keyword) and ranks them in order of what it believes is most relevant. The results may be displayed in standard or compact form, as a summary, or as titles only. AltaVista displays 10 matching pages at a time, while Excite gives a choice of 10, 20, 30, 40 or 50.
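
As a rough illustration of matching, ranking and showing a fixed number of results at a time, consider the sketch below. The scoring rule (counting keyword occurrences) is an assumption made for simplicity; real engines use their own, usually unpublished, ranking methods. The pages are invented.

```python
def rank(pages, query):
    """Score each page by how often the keywords occur; return URLs, best first."""
    keywords = query.lower().split()
    scores = {}
    for url, text in pages.items():
        words = text.lower().split()
        score = sum(words.count(keyword) for keyword in keywords)
        if score > 0:  # pages with no keyword at all are not matches
            scores[url] = score
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


def page_of_results(results, page_number, per_page=10):
    """Return one screenful of results (page_number starts at 1)."""
    start = (page_number - 1) * per_page
    return results[start:start + per_page]


# Hypothetical indexed pages.
pages = {
    "/a.html": "cake cake cake",
    "/b.html": "cake recipes",
    "/c.html": "bread",
}

results = rank(pages, "cake")
print(results)                                 # ['/a.html', '/b.html']
print(page_of_results(results, 1, per_page=1)) # ['/a.html']
```

The `per_page` parameter plays the role of the 10, 20, 30, 40 or 50 choice mentioned above.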

Besides these, meta search engines (such as MetaCrawler) are also used. A meta search engine takes your query and sends it to several other search engines, all in parallel.
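
The idea of sending one query to several engines at once can be sketched with Python's standard thread pool. The two engine functions below are stand-ins that return invented results; a real meta search engine would contact the other engines over the network, and how it merges their answers is an assumption here.

```python
from concurrent.futures import ThreadPoolExecutor


def engine_a(query):
    """Hypothetical engine: returns made-up result URLs."""
    return [f"a.example/{query}", f"shared.example/{query}"]


def engine_b(query):
    """Hypothetical engine: returns made-up result URLs."""
    return [f"shared.example/{query}", f"b.example/{query}"]


def meta_search(query, engines=(engine_a, engine_b)):
    """Send the query to every engine in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged = []
    for results in result_lists:
        for url in results:
            if url not in merged:  # drop duplicates found by more than one engine
                merged.append(url)
    return merged


print(meta_search("cake"))
# ['a.example/cake', 'shared.example/cake', 'b.example/cake']
```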

In this manner, search engines allow you to locate specific information. The World Wide Web is the world's largest and fastest-growing storehouse of digital information, and finding anything in it can often be a frustrating experience. Thanks to search engines, it is far less so. They all have strengths and weaknesses, so it's a good idea to learn to use more than one.


Feel free to comment on the above document. To do so, mail me at wiplove@softhome.net.