A Poll with a Few Questions

To get a sense of what webmasters think, I have a few questions that I’d be very pleased to have answered:

1. What do you consider to be “black” and “gray” optimization? Please give examples of methods you consider black and/or gray, regardless of whether you use them.

2. How much time do you spend on SEO each day?

3. Is SEO your main job or just a hobby?

4. Do you invest in SEO, or do you treat it purely as a source of income and only earn from it? (Of course, domain names and hosting do not count as investment.)

5. How do you fight laziness? :) Yes, it is a problem for lots of people, and it is a problem of mine too. I think many people around here would appreciate some methods of self-motivation.

I hope this poll will be interesting for you, and that the answers to these questions will make you smile and give you something to consider in your further work. Let’s share useful tips that make our job better.

Aged, Used, Expired Domains

Because of the sandbox problem and various penalties and filters, some of my friends have tried using aged, used and expired domains. The results are surprisingly contradictory.

Expired domains turned out to be a bad idea in 95% of cases, because Google usually filters expired domains out for a couple of months, especially if such domains were listed in dmoz. There were only a few cases in which the domain name had been listed in dmoz.

Used domains are good if you do everything wisely. Say you have a lot of information on some subject and you manage to buy a domain that fits your needs perfectly. It is wise to buy a hosting account with the host where the domain was originally hosted and to replace the old information step by step, rather than deleting everything and setting up a completely different site. Google tracks changes of IP address, so moving to another host can cause PageRank loss and a temporary penalty. Changing all of a site’s content at once can cause the same penalty, unless it was a dynamic site that changed daily.

Aged domains are the same thing as used domains, but they are good because of their age: they usually bypass the sandbox and a couple of other penalties and filters. Again, the only condition for building a site on such a domain is to use it wisely.

This information comes from my own experience and from thoughts shared by friends. If you disagree or have something to add, I am waiting for your comments.

Page Length: How to Avoid Slow or Poor Indexing

You may know that page length affects indexing from time to time. Not so long ago, Google read only the first 100 kilobytes of a page and ignored everything after that, on the reasoning that a surfer is unlikely to read that far anyway. Of course, this applies to text content only. Images, graphic elements and invisible elements are excluded from the calculation of page length.

Remember that even though invisible elements are usually excluded from the calculation of page length, a page that is too heavy is usually read worse. For instance, I’ve seen lots of pages overstuffed with things that can be moved outside: CSS (which can take up half of the total page length), ad blocks (which can be moved into PHP include files) and so on.
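
To see how a page splits between total weight and readable text, here is a minimal Python sketch. It is only an illustration: the 100 KB constant is the figure quoted in this post, not a documented limit, the names VisibleTextCounter and page_weight are made up for the example, and treating "text content" as everything outside script and style blocks is my own simplifying assumption.

```python
from html.parser import HTMLParser

# Figure quoted in the post; an assumption, not a documented Google limit.
TEXT_LIMIT_BYTES = 100 * 1024


class VisibleTextCounter(HTMLParser):
    """Collects text that sits outside <script> and <style> blocks."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def page_weight(html):
    """Return (total_bytes, visible_text_bytes) for an HTML string."""
    parser = VisibleTextCounter()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not inflate the count.
    text = " ".join("".join(parser.chunks).split())
    return len(html.encode("utf-8")), len(text.encode("utf-8"))


if __name__ == "__main__":
    sample = ("<html><head><style>p{color:red}</style></head>"
              "<body><p>Hello  world</p></body></html>")
    total, visible = page_weight(sample)
    print("total bytes:", total, "visible text bytes:", visible)
    print("text within the quoted limit:", visible <= TEXT_LIMIT_BYTES)
```

If the total is much larger than the visible text, inline CSS and similar blocks are good candidates for moving into external files.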

I usually make pages that are no longer than one screen, especially those targeted at the majority of surfers. It may seem unbelievable, but many surfers don’t know what scrolling is, so it is better to fit everything on one screen. This site-building strategy is also good for search engines, because each short page can be devoted to its own keywords, which would be a mess on a single page. For instance, if I had a page about apples, apple pies and apple trees, I’d divide it into three pages: one about apples, one about apple pies and one about apple trees. This way I’d increase the relevancy of each page and make indexing easier, because each page is now about three times shorter.

PageRank Explained

Appraisal of link importance.
The term “Link Popularity” is a bit inaccurate. “Link Topology” would be much closer to what it means, because the method considers the relationships between links as well as their quantity. The result of the analysis, however, is the “importance” of a page. This is not the same as “relevancy”. Relevancy shows how well the contents of your page correspond to a particular search query. “Importance” shows the value of a page regardless of its contents. Every inbound link says that the page has some value and so increases its “importance”. The higher its rating, the more “important” it is.
Not all links contribute equally to a page’s rating. Some linking pages are more important than others, so an outbound link from such a page carries more weight.
So, an “important” page is a page that has links from important pages. A vicious circle? Yes, but it is rather easy to grasp intuitively. For instance, a link from NASA will be more important than a link from your cousin Kate’s homepage, not because NASA loves you more, but because there are thousands of sites linking to NASA and just a couple linking to Kate’s.
How the “importance” is measured.
Though it is easy to understand with the example of two pages, measuring the importance of billions of interlinked pages seems hopelessly complicated. It really is complicated, but not hopelessly so. Such measurements demand lots of calculations, but fortunately we don’t have to invent anything new. We can just take ready-made formulae from scientific sources.
Larry Page and Sergey Brin, the founders of Google and the first developers of its algorithm, published “The Anatomy of a Large-Scale Hypertextual Web Search Engine”. You can download it from http://www-db.stanford.edu/pub/papers/google.pdf in PDF format. The document describes the PageRank technology, a method of appraising a page’s importance based on the pages that link to it.

So, the PageRank formula. It looks complicated, but it only looks so. In practice you will need just a little knowledge of algebra (I don’t know whether algebra is studied to that extent in the schools of Her Majesty’s land and the US, but in my country math is studied from the age of 7, and algebra and higher math from the age of 10).

For instance, there is a page A, which has inbound links from other pages. Let’s call them T1, T2, T3, and so on up to Tn.
No math yet; we are just naming the things we are going to talk about. Imagine that A is your homepage and T1 to Tn are other pages that contain hyperlinks pointing to your page. For instance, T2 can be the homepage of your cousin Kate (if this helps in understanding ;) )
The PageRank of page A is calculated using the following formula:

PR(A) = (1-d)+d [PR(T1)/C(T1)+PR(T2)/C(T2)+PR(T3)/C(T3)+…+PR(Tn)/C(Tn)]
In case it looks complicated to you, let’s divide it into three parts:
PR(A) means the PageRank of page A, the value we are trying to calculate. This expression just states the problem; all calculations are on the other side of the “=”.
(1-d) + d: here d is the fade ratio (the paper calls it the damping factor). Don’t pay much attention to it. Page and Brin recommend setting it to 0.85, so we will set d = 0.85 and forget about it. It matters if you are building a search engine, but for our calculations we can take the ready value. We just calculate the expression in brackets, multiply it by 0.85 and add 0.15 (that is, 1-d) to the result, as the formula says.

Now let’s get back to the expression in brackets and write it as follows:
[ PR(T1)/C(T1)+
PR(T2)/C(T2)+
PR(T3)/C(T3)+…+
PR(Tn)/C(Tn)]
It’s easy to see that T1, T2 and T3 are the pages that link to A. I hope it’s easy to do the calculations with the simple terms you get after this division. The only obvious difficulty is the quantity of calculations.
PR means the PageRank of pages T1, T2… Tn. The only new thing in this formula is C, the number of hyperlinks on the given page. C(T2) is the total number of outbound links on page T2, i.e. links of this kind:
http://www.av.com
Such a link is an inbound link for the page it points to.
Having put back together the three components we separated earlier, we can define the sequence of actions for applying this formula to any particular page:
1. Create a list of all pages that link to this page.
2. Find the following values for each of those pages: its PageRank and its number of outbound links.
3. Divide the PageRank by the number of outbound links (e.g. if the PageRank is 6 and there are three outbound links, the result is 6:3=2).
4. Add up these results for all inbound links.
5. Apply the fade ratio (damping factor d) to the result: multiply it by 0.85 and add 0.15.
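
The steps above can be sketched in a few lines of Python. This is only an illustration of the formula for a single page, assuming the PageRank and outbound link count of every linking page are already known. The function name pagerank_of_page and the sample numbers are made up for the example.

```python
def pagerank_of_page(inbound, d=0.85):
    """Apply PR(A) = (1-d) + d * sum(PR(Ti) / C(Ti)) to one page.

    inbound: list of (pagerank, outbound_link_count) pairs,
             one pair for each page T1..Tn that links to A.
    d:       fade ratio (damping factor), 0.85 as recommended in the paper.
    """
    total = 0.0
    for pagerank, outbound_links in inbound:
        if outbound_links <= 0:
            # A page that links to A has at least one outbound link.
            raise ValueError("outbound link count must be positive")
        total += pagerank / outbound_links   # step 3: PR(Ti) / C(Ti)
    return (1 - d) + d * total               # step 5: apply the fade ratio


if __name__ == "__main__":
    # Hypothetical linking pages: (PageRank, number of outbound links).
    linking_pages = [(6, 3), (1, 4)]
    print(round(pagerank_of_page(linking_pages), 4))
```

With only the first linking page from the example (PageRank 6, three outbound links), the sum in brackets is 2 and the result is 0.15 + 0.85 × 2 = 1.85. In the paper the values of the linking pages are themselves computed with the same formula, so a full calculation repeats this step over all pages until the numbers settle.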

Altavista Characteristics

Number of indexed pages: about 500,000,000
Frame support: uses the NOFRAMES tag
Meta tag support: the contents of the title and description tags are used to define the relevancy of a page.
Database updates: completely updated once every three months
Approximate indexing time: 4-6 weeks for free submission, 1-2 weeks for paid.
Search robot name: Scooter. Current version: Scooter W3.1.2
Quick indexing price: $39 per URL. The price for additional pages is calculated on a special scale.
Paid links: three links from the Overture database (at the top of the SERP).
Updates: weekly for paid submissions, every 4 weeks for free ones.
Image and media search: indexed and counted when defining relevancy
Optimal keyword density: use one instance on each page and in each tag: title, keywords, description, body. Recommended keyword density is 0.33-5.5%.
Indexing by inbound links: the robot is quite unpredictable; the best way to get your page indexed is to submit your site manually.
Link popularity: considered when defining relevancy.
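
To check a page against the keyword density range quoted above, a rough Python sketch follows. It assumes the usual definition of density (occurrences of the keyword divided by the total number of words, as a percentage) and handles single-word keywords only. The function name keyword_density and the simple word-splitting rule are my own choices for the example, not anything Altavista documents.

```python
import re


def keyword_density(text, keyword):
    """Return the share of words in text equal to keyword, in percent.

    Assumes a single-word keyword and a simple definition of a word:
    a run of letters, digits or apostrophes, compared case-insensitively.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = words.count(keyword.lower())
    return 100.0 * hits / len(words)


if __name__ == "__main__":
    body = "Apple pie with apple sauce"
    density = keyword_density(body, "apple")
    # 0.33-5.5% is the range quoted in the list above.
    print(density, 0.33 <= density <= 5.5)
```

The tiny sample sentence is far above the quoted range (two hits in five words). On a real page the same function should be run on the body text and on each tag separately.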

Free Hosts and 3rd Level Domains

You may have noticed that SERPs have been full of 3rd level domains for the last five or six months. I get a lot of questions about this by e-mail. Most webmasters ask how Google determines that a domain is a free host and how to imitate one. Others ask why their subdomains are dropping out of the SERPs while the domains they are based on are still there.

First: there is no special definition of a free host. Any unsandboxed domain can run a lot of subdomains that will hold good positions in SERPs until Google finds them harmful and wipes them out. It is because of blog and guestbook spam that Google filters out 3rd level domains. If you use gray and white methods wisely, you can hold good positions and never drop out.

As for subdomains that have been dropped: in my opinion, Google filters such domains and their subdomains manually, after somebody’s abuse report or something like that.
However, they see that the domain itself is good: it gives free space to people who create homepages on it, or something like that. That is why they do not ban the domain but filter out all of its subdomains, which get sandboxed or lowered in SERPs.

If you have something to say about this, share your experience of managing subdomain-based projects.