You are viewing an outdated report. The latest version of this report was published on November 1st, 2024


    
WebSite Failure/Growth Report
September 1st, 2003

Report Description
This report focuses attention exclusively on those servers that are new to our survey, and those servers that have disappeared from the survey (and presumably off the net) as a result of being unreachable for 3 consecutive months.

Monthly Web Site Failure Rate
This graph illustrates the percentage of servers in our survey database that are removed each month as a result of the sites in question being non-responsive for 3 consecutive months.
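As a rough illustration, the removal rule and the resulting monthly failure rate could be sketched as follows. The data layout (one list of monthly reachability flags per site) and the names `is_failed` and `monthly_failure_rate` are assumptions made for this sketch, not the survey's actual implementation.

```python
from __future__ import annotations


def is_failed(history: list[bool], threshold: int = 3) -> bool:
    """Return True if the most recent `threshold` monthly checks all failed.

    `history` holds one flag per month, oldest first; True means the site
    responded that month. This layout is hypothetical.
    """
    if len(history) < threshold:
        return False
    return not any(history[-threshold:])


def monthly_failure_rate(histories: dict[str, list[bool]]) -> float:
    """Percentage of surveyed sites removed this month under the rule above."""
    if not histories:
        return 0.0
    failed = sum(1 for h in histories.values() if is_failed(h))
    return failed * 100.0 / len(histories)


# Illustrative data only; these hosts and flags are made up.
sample = {
    "a.example": [True, False, False, False],  # 3 misses in a row: removed
    "b.example": [True, True, False, False],   # only 2 misses: kept
    "c.example": [False, False, False],        # 3 misses in a row: removed
    "d.example": [True, True, True],           # responsive: kept
}
print(monthly_failure_rate(sample))
```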

Monthly Failure Rates by Server Type
Analyzing failed sites by the server that the site was last known to operate, we can determine the distribution of server types amongst failed sites.

By replacing the percentages in the above graph with the relative difference between the market share a web server currently enjoys and the percentage of failed sites using that server, we highlight whether a disproportionate number of sites running the specified server type are failing.

For example, if a particular server currently enjoys a market share of 50%, but only 40% of failed sites are of that server type, then that server will have a value of -20% on the graph, since the server type is losing customers at a rate 20% lower than expected. Conversely, if the market share of a server is 10%, but 12% of all failed sites are of this server type, then that server will have a value of 20%, since it is losing sites 20% faster than expected. By calculating values in this way, we have the ability to directly compare values between different server types, regardless of their current market share.
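The calculation in the example above can be written as a short function. This is a minimal sketch; the name `relative_delta` and its parameters are chosen here for illustration and do not come from the survey system.

```python
def relative_delta(market_share: float, observed_share: float) -> float:
    """Relative difference, in percent, between a server's share of a
    subset of sites (for example, failed sites) and its overall market share.

    Both arguments are percentages. A negative result means the server is
    under-represented in the subset; a positive result means it is
    over-represented.
    """
    if market_share <= 0:
        raise ValueError("market_share must be positive")
    return (observed_share - market_share) * 100.0 / market_share


# The two cases from the text:
print(relative_delta(50.0, 40.0))  # 50% market share, 40% of failed sites
print(relative_delta(10.0, 12.0))  # 10% market share, 12% of failed sites
```

Dividing by the market share is what makes the values comparable between server types of very different sizes: a 2-point gap matters far more to a server with a 10% share than to one with a 50% share.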


Monthly Web Site Growth
As part of the survey system, we have crawlers that visit web sites non-stop, looking for new sites via hypertext references. While we don't crawl the entire web each month, we do manage to crawl all sites we know of to a pre-configured depth approximately once a year.

By measuring the rate at which we find previously unknown sites in any one month, and knowing the sample size of our data sets, we can make a first-order approximation of how large the web is by estimating how many sites we would find if we crawled all of the web in one month, rather than only the approximately 10% that we currently crawl.
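The scaling step can be sketched as below, assuming a simple linear extrapolation from the crawled fraction. The function name and the sample figures are illustrative assumptions only; how the scaled figure is turned into the final size estimate is not detailed in this report.

```python
def extrapolate_new_sites(new_sites_found: int, crawl_fraction: float) -> float:
    """Estimate how many new sites a full one-month crawl would have found.

    `new_sites_found` is the number of previously unknown sites discovered
    this month; `crawl_fraction` is the share of the known web crawled in
    that month (roughly 0.10 according to the text). Linear scaling is an
    assumption of this sketch.
    """
    if not 0.0 < crawl_fraction <= 1.0:
        raise ValueError("crawl_fraction must be in the range (0, 1]")
    return new_sites_found / crawl_fraction


# Made-up figure: 5,000 new sites found while crawling about 10% of the web.
estimate = extrapolate_new_sites(5000, 0.10)
print(f"Estimated new sites for a full crawl: {estimate:,.0f}")
```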

As a result of how we crawl the web, our surveys only report on what we call the "Active" Web. That is to say, we only include sites that were important enough to be referenced by another site. This means that parked domains, personal web sites not referenced anywhere, and the like are not included in our survey. Our argument is that if we can't find a site, then it really isn't part of the "Active" Web.

The following graph depicts what we feel is a reasonably accurate estimate of the size of the active web over time:

New Web Sites by Server Type
Analyzing new sites that we find by the server signature returned by that site provides insight into the types of technologies being selected by new web site operators. This provides an important indicator as to the viability of a product. Technologies already in place tend to remain in place due to the inertial resistance to change by existing administrators. However, when a new web site is launched, it is much less likely to be constrained in that fashion. The following graph depicts web server market share of new sites. It should be noted that because of how we crawl the web, on average a site will have been up for about 6 months before we find a reference to it.

By replacing the percentages in the above graph with the relative difference between the market share a web server currently enjoys and the percentage of new web sites using that server, we highlight whether a server is doing better or worse than it has in the past in terms of acquiring new sites.

For example, if a particular server currently enjoys a market share of 50%, but only 40% of new sites found are of that server type, then that server will have a value of -20% on the graph, since the server type is underperforming its expected percentage by 20%. Conversely, if the market share of a server is 10%, but 12% of all new sites are of this server type, then that server will have a value of 20%, since it is outperforming its expected percentage by 20%. This mechanism provides a common basis of comparison among all servers, regardless of their current market share.




© 1998-2024 E-Soft Inc. All rights reserved.