A search engine is a software program or script, available over the Internet, that searches documents and files for keywords and returns the files containing those keywords. Today, there are thousands of different search engines on the Internet, each with its own abilities and features. The first search engine ever developed is considered to be Archie, which was used to search for FTP files, and the first text-based search engine is considered to be Veronica. Today, the most popular and well-known search engine is Google.
Because large search engines index millions and sometimes billions of pages, many of them do not just search the pages but also order the results by importance. This importance is commonly determined using various ranking algorithms.
The picture gives an example of how a search engine works. As can be seen in the image, the starting point of all search engines is a spider or crawler, which visits the pages that will be included in the search and grabs the contents of each of those pages.
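The crawling step described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular search engine does it: the `FAKE_WEB` dictionary, the `LinkAndTextParser` class, and the `crawl` function are all made-up names, and the "web" here is an in-memory dictionary so the example runs without a network connection. A real crawler would fetch pages over HTTP and respect rules such as robots.txt.

```python
from collections import deque
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collects link targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Remember the target of every <a href="..."> on the page.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Keep any non-empty run of text.
        if data.strip():
            self.text_parts.append(data.strip())


# Hypothetical in-memory "web" (URL -> HTML), assumed for illustration only.
FAKE_WEB = {
    "/home": '<p>Welcome home</p><a href="/about">About</a><a href="/news">News</a>',
    "/about": '<p>About this site</p><a href="/home">Home</a>',
    "/news": '<p>Latest news</p><a href="/missing">Broken</a>',
}


def crawl(start, fetch=FAKE_WEB.get):
    """Breadth-first crawl from `start`; returns {url: page text}."""
    seen = {start}
    queue = deque([start])
    contents = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # the page could not be fetched, so skip it
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        contents[url] = " ".join(parser.text_parts)
        # Queue every link that has not been seen yet.
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return contents


pages = crawl("/home")
```

The `seen` set is what keeps the spider from visiting the same page twice when pages link back to each other, as `/about` and `/home` do here.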
Once a page has been crawled, the data contained within the page is processed. Often this involves stripping out stop words and recording the location of each word in the page, how frequently each word occurs, links to other pages, images, etc. This data is used to rank the page and is the primary method a search engine uses to determine whether a page should be shown and in what order.
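As a rough sketch of this processing step, the Python below strips stop words, records where each remaining word appears, and counts how often it occurs. The stop-word list, the function names `process_page` and `score`, and the scoring rule (simply adding up word counts) are assumptions made for the example. Real search engines use far more elaborate ranking signals.

```python
import re
from collections import Counter

# A tiny, assumed stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def process_page(text):
    """Return (positions, frequencies) for the non-stop words in `text`."""
    positions = {}
    for index, word in enumerate(tokenize(text)):
        if word in STOP_WORDS:
            continue
        positions.setdefault(word, []).append(index)
    frequencies = Counter({word: len(where) for word, where in positions.items()})
    return positions, frequencies


def score(query, frequencies):
    """Naive relevance score: total occurrences of the query words."""
    return sum(frequencies[term] for term in tokenize(query))


text = "The spider visits the page and the spider grabs the contents of the page"
positions, frequencies = process_page(text)
```

With this input, `positions["spider"]` is `[1, 7]` and `score("spider page", frequencies)` is `4`, while a stop word such as "the" contributes nothing to the score.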
Finally, once the data has been processed, it is often broken up into one or more files, moved to different computers or servers, or loaded into memory where it can be accessed when users perform a search.
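One way to picture this last step is an index split into several pieces ("shards"), each of which could live in a separate file or on a separate server. The sketch below keeps all shards in memory as plain dictionaries. The shard count, the use of a CRC32 hash to pick a shard, and the names `shard_for`, `build_shards`, and `search` are assumptions chosen for illustration, not a description of any specific search engine.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # assumed number of index pieces


def shard_for(term):
    """Pick a shard for a term with a stable hash."""
    return zlib.crc32(term.encode("utf-8")) % NUM_SHARDS


def build_shards(pages):
    """Build one word -> set-of-URLs index per shard."""
    shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    for url, text in pages.items():
        for term in text.lower().split():
            shards[shard_for(term)][term].add(url)
    return shards


def search(shards, query):
    """Return the sorted URLs that contain every word in the query."""
    result = None
    for term in query.lower().split():
        # Only the shard that owns this term needs to be consulted.
        urls = shards[shard_for(term)].get(term, set())
        result = urls if result is None else result & urls
    return sorted(result or [])


sample_pages = {
    "/a": "search engine crawler",
    "/b": "search index",
    "/c": "crawler index",
}
shards = build_shards(sample_pages)
matches = search(shards, "search crawler")
```

Here `matches` is `["/a"]`, the only page containing both words. Because each word always hashes to the same shard, a lookup touches only the shards that own the query words rather than the whole index.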