Every Things You Need: Search Engine

Desember 15, 2009

How To Indexing a Site

Author: linkey - Categories: Search Engine

Before a site appears in search results, a search engine must index it. An indexed site will have been visited and analyzed by a search robot with relevant information saved in the search engine database. If a page is present in the search engine index, it can be displayed in search results otherwise, the search engine cannot know anything about it and it cannot display information from the page..

Most average sized sites (with dozens to hundreds of pages) are usually indexed correctly by search engines. However, you should remember the following points when constructing your site. There are two ways to allow a search engine to learn about a new site:

- Submit the address of the site manually using a form associated with the search engine, if available. In this case, you are the one who informs the search engine about the new site and its address goes into the queue for indexing. Only the main page of the site needs to be added, the search robot will find the rest of pages by following links.

- Let the search robot find the site on its own. If there is at least one inbound link to your resource from other indexed resources, the search robot will soon visit and index your site. In most cases, this method is recommended. Get some inbound links to your site and just wait until the robot visits it. This may actually be quicker than manually adding it to the submission queue. Indexing a site typically takes from a few days to two weeks depending on the search engine. The Google search engine is the quickest of the bunch.

Try to make your site friendly to search robots by following these rules:

- Try to make any page of your site reachable from the main page in not more than three mouse clicks. If the structure of the site does not allow you to do this, create a so-called site map that will allow this rule to be observed.

- Do not make common mistakes. Session identifiers make indexing more difficult. If you use script navigation, make sure you duplicate these links with regular ones because search engines cannot read scripts (see more details about these and other mistakes in section 2.3).

- Remember that search engines index no more than the first 100-200 KB of text on a page. Hence, the following rule – do not use pages with text larger than 100 KB if you want them to be indexed completely.

You can manage the behavior of search robots using the file robots.txt. This file allows you to explicitly permit or forbid them to index particular pages on your site.

The databases of search engines are constantly being updated; records in them may change, disappear and reappear. That is why the number of indexed pages on your site may sometimes vary. One of the most common reasons for a page to disappear from indexes is server unavailability. This means that the search robot could not access it at the time it was attempting to index the site. After the server is restarted, the site should eventually reappear in the index.

You should note that the more inbound links your site has, the more quickly it gets re-indexed. You can track the process of indexing your site by analyzing server log files where all visits of search robots are logged. We will give details of seo software that allows you to track such visits in a later section.

Desember 14, 2009

Common search engine principles

Author: linkey - Categories: Search Engine

To understand seo you need to be aware of the architecture of search engines. They all contain the following main components:

Spider - a browser-like program that downloads web pages.

Crawler – a program that automatically follows all of the links on each web page.

Indexer - a program that analyzes web pages downloaded by the spider and the crawler.

Database– storage for downloaded and processed pages.

Results engine – extracts search results from the database.

Web server – a server that is responsible for interaction between the user and other search engine components.

Specific implementations of search mechanisms may differ. For example, the Spider+Crawler+Indexer component group might be implemented as a single program that downloads web pages, analyzes them and then uses their links to find new resources. However, the components listed are inherent to all search engines and the seo principles are the same.

Spider. This program downloads web pages just like a web browser. The difference is that a browser displays the information presented on each page (text, graphics, etc.) while a spider does not have any visual components and works directly with the underlying HTML code of the page. You may already know that there is an option in standard web browsers to view source HTML code.

Crawler. This program finds all links on each page. Its task is to determine where the spider should go either by evaluating the links or according to a predefined list of addresses. The crawler follows these links and tries to find documents not already known to the search engine.

Indexer. This component parses each page and analyzes the various elements, such as text, headers, structural or stylistic features, special HTML tags, etc.

Database. This is the storage area for the data that the search engine downloads and analyzes. Sometimes it is called the index of the search engine.

Results Engine. The results engine ranks pages. It determines which pages best match a user's query and in what order the pages should be listed. This is done according to the ranking algorithms of the search engine. It follows that page rank is a valuable and interesting property and any seo specialist is most interested in it when trying to improve his site search results. In this article, we will discuss the seo factors that influence page rank in some detail.

Web server. The search engine web server usually contains a HTML page with an input field where the user can specify the search query he or she is interested in. The web server is also responsible for displaying search results to the user in the form of an HTML page.

History of search engines

Author: linkey - Categories: Search Engine

In the early days of Internet development, its users were a privileged minority and the amount of available information was relatively small. Access was mainly restricted to employees of various universities and laboratories who used it to access scientific information. In those days, the problem of finding information on the Internet was not nearly as critical as it is now.

Site directories were one of the first methods used to facilitate access to information resources on the network. Links to these resources were grouped by topic. Yahoo was the first project of this kind opened in April 1994. As the number of sites in the Yahoo directory inexorably increased, the developers of Yahoo made the directory searchable. Of course, it was not a search engine in its true form because searching was limited to those resources who’s listings were put into the directory. It did not actively seek out resources and the concept of seo was yet to arrive.

Such link directories have been used extensively in the past, but nowadays they have lost much of their popularity. The reason is simple – even modern directories with lots of resources only provide information on a tiny fraction of the Internet. For example, the largest directory on the network is currently DMOZ (or Open Directory Project). It contains information on about five million resources. Compare this with the Google search engine database containing more than eight billion documents.

The WebCrawler project started in 1994 and was the first full-featured search engine. The Lycos and AltaVista search engines appeared in 1995 and for many years Alta Vista was the major player in this field.

In 1997 Sergey Brin and Larry Page created Google as a research project at Stanford University. Google is now the most popular search engine in the world.

Currently, there are three leading international search engines – Google, Yahoo and MSN Search. They each have their own databases and search algorithms. Many other search engines use results originating from these three major search engines and the same seo expertise can be applied to all of them. For example, the AOL search engine (search.aol.com) uses the Google database while AltaVista, Lycos and AllTheWeb all use the Yahoo database.

Every Things You Need

How To Indexing a Site

Common search engine principles

History of search engines

Categories

Archives

Followers