What are 2020Media Webstats?
Most hosting accounts from 2020Media come with website visitor statistics. These show you useful and interesting information about visits to your website. They can also show you what’s wrong with your website.
How do I access my webstats?
On most accounts, you simply open a web browser, type in your domain name and then add /webstats or /logs on the end. The main summary page for your statistics will then open. The statistics pages can be made private if you wish. If the link doesn’t work, contact 2020Media via the portal.
The main summary screen.
The main summary page shows all the information for the last 12 months. Information older than 12 months is not kept. There are links to each month. The monthly report has detailed statistics for that month, with additional links to any URLs and referrers found.
Hits, Files, Pages and Visits
Any request made to the server which is logged is considered a ‘hit’. The requests can be for anything: HTML pages, graphic images, audio files, CGI scripts, etc. Each valid line in the server log is counted as a hit. This number represents the total number of requests that were made to the server during the reported period.
Some requests made to the server require that the server then send something back to the requesting client, such as an HTML page or graphic image. When this happens, it is considered a ‘file’ and the files total is incremented. The relationship between ‘hits’ and ‘files’ can be thought of as ‘incoming requests’ and ‘outgoing responses’.
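To make the difference between hits and files concrete, here is a minimal sketch in Python. It assumes the log is in the Common Log Format, and it assumes that a ‘file’ is any request answered with status 200; both are assumptions made for illustration, and the sample log lines are invented.

```python
import re

# Assumed layout (Common Log Format):
# host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)'
)

def count_hits_and_files(lines):
    """Return (hits, files) for an iterable of access-log lines."""
    hits = 0
    files = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a valid log line, so it is not a hit
        hits += 1
        # Assumption: a 'file' is a request answered with status 200.
        if match.group(4) == "200":
            files += 1
    return hits, files

# Invented sample data: one full response, one "not modified", one bad line.
sample = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.1 - - [10/Oct/2023:13:55:37 +0000] "GET /logo.gif HTTP/1.1" 304 -',
    'garbage line',
]
print(count_hits_and_files(sample))  # (2, 1)
```

Both valid lines count as hits, but only the first one sent content back, so it is the only file.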
Pages are, well, pages! Generally, any HTML document, or anything that generates an HTML document, would be considered a page. This number represents the number of ‘pages’ requested only, and does not include the other ‘stuff’ that goes into a page, such as graphic images, audio clips, etc. What actually constitutes a ‘page’ can vary from server to server. The default action is to treat anything with the extension ‘.htm’, ‘.html’ or ‘.cgi’ as a page. A lot of sites will probably define other extensions, such as ‘.phtml’, ‘.php3’ and ‘.pl’, as pages as well. Some people consider this number to be the number of ‘pure’ hits. Sometimes this is referred to as ‘Pageviews’.
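The extension rule above can be sketched roughly as follows. The function name is_page and the DEFAULT_PAGE_TYPES tuple are hypothetical names chosen for this example, and the sketch only covers the extension check described in the text, not any other rules a particular server may apply.

```python
import posixpath
from urllib.parse import urlsplit

# Default page extensions named in the text; sites may define more.
DEFAULT_PAGE_TYPES = ("htm", "html", "cgi")

def is_page(url, page_types=DEFAULT_PAGE_TYPES):
    """Return True if the URL's file extension is one of the page types."""
    path = urlsplit(url).path  # drops any ?query part
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return extension in page_types

print(is_page("/index.html"))             # True
print(is_page("/images/logo.gif"))        # False
print(is_page("/cgi-bin/form.cgi?x=1"))   # True
# A site that also treats .php3 as a page:
print(is_page("/shop.php3", DEFAULT_PAGE_TYPES + ("php3",)))  # True
```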
Each request made to the server comes from a unique ‘site’, which can be referenced by a name or, ultimately, an IP address. The ‘sites’ number shows how many unique IP addresses made requests to the server during the reporting time period. This DOES NOT mean the number of unique individual users (real people) that visited, which is impossible to determine using just logs and the HTTP protocol. However, this number might be about as close as you will get.
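As a small illustration, counting ‘sites’ amounts to counting distinct addresses. The helper name count_sites and the addresses below are invented for this sketch.

```python
def count_sites(client_addresses):
    """Number of distinct IP addresses (or host names) seen in the log."""
    return len(set(client_addresses))

# Three requests, but only two distinct addresses.
addresses = ["192.0.2.1", "192.0.2.1", "198.51.100.7"]
print(count_sites(addresses))  # 2
```

Two people sharing one address would still count as one site, which is why this figure is not a count of real people.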
Whenever a request is made to the server from a given IP address (site), the amount of time since the previous request from that address (if any) is calculated. If the time difference is greater than a pre-configured ‘visit timeout’ value (or the address has never made a request before), it is considered a ‘new visit’, and this total is incremented (both for the site, and the IP address). The default timeout value is 30 minutes, so if a user visits your site at 1:00 in the afternoon, and then returns at 3:00, two visits would be registered. Note: in the ‘Top Sites’ table, the visits total should be discounted on ‘Grouped’ records, and thought of as the “Minimum number of visits” that came from that grouping instead. Note: Visits only occur on PageType requests, that is, for any request whose URL is one of the ‘page’ types defined with the PageType option. Due to the limitations of the HTTP protocol, log rotations and other factors, this number should not be taken as absolutely accurate; rather, it should be considered a pretty close “guess”.
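The visit rule can be sketched in a few lines of Python. This is only an illustration of the timeout logic described above, not the program’s actual code; the function name count_visits is hypothetical, and the input is assumed to be page requests only, already in time order.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # default timeout named in the text

def count_visits(page_requests, timeout=VISIT_TIMEOUT):
    """Count visits from chronological (ip, timestamp) page requests."""
    last_seen = {}  # ip -> time of its most recent page request
    visits = 0
    for ip, when in page_requests:
        previous = last_seen.get(ip)
        # New visit: address never seen before, or idle longer than timeout.
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when
    return visits

day = datetime(2023, 10, 10)
requests = [
    ("192.0.2.1", day.replace(hour=13, minute=0)),   # first visit
    ("192.0.2.1", day.replace(hour=13, minute=10)),  # same visit
    ("192.0.2.1", day.replace(hour=15, minute=0)),   # returns: second visit
]
print(count_visits(requests))  # 2
```

This matches the example in the text: a user arriving at 1:00 and returning at 3:00 registers two visits.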
The KBytes (kilobytes) value shows the amount of data, in KB, that was sent out by the server during the specified reporting period. This value is generated directly from the log file, so it is up to the web server to produce accurate numbers in the logs. In general, this should be a fairly accurate representation of the amount of outgoing traffic the server had, regardless of the web server’s reporting quirks.
Note: A kilobyte is 1024 bytes, not 1000.
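As a rough sketch, the KBytes total is the sum of the byte counts in the log divided by 1024. The helper name kbytes_sent is hypothetical, and the sketch assumes the log writes ‘-’ when no body was sent, as the Common Log Format does.

```python
def kbytes_sent(byte_fields):
    """Sum the bytes-sent fields from a log and convert to kilobytes."""
    total_bytes = 0
    for field in byte_fields:
        if field != "-":  # assumed: '-' means no content was sent
            total_bytes += int(field)
    return total_bytes / 1024  # a kilobyte is 1024 bytes

print(kbytes_sent(["2048", "-", "1024"]))  # 3.0
```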
Monthly totals: Entry/Exit, Referrers, Search, Countries
The monthly detail page (click the link for the month on the summary page) displays more detailed information. Much of this is of the same type as on the summary page, but there are some new headings as well.
Top Entry and Exit Pages
The Top Entry and Exit tables give a rough estimate of which URLs are used to enter your site, and which pages are viewed last. Because of limitations in the HTTP protocol, log rotations, etc., these numbers should be considered a good “rough guess” of the actual figures. However, they will give a good indication of the overall trend in where users come into, and exit, your site.
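One way to picture how entry and exit pages fall out of the log is the sketch below. It assumes the first page of each visit is the entry page and the last page is the exit page, with a 30 minute visit timeout; the function name and sample data are invented, and the real program may differ in detail.

```python
from collections import Counter
from datetime import datetime, timedelta

def entry_and_exit_pages(page_requests, timeout=timedelta(minutes=30)):
    """page_requests: chronological (ip, timestamp, url) page requests."""
    last = {}  # ip -> (timestamp, url) of its most recent page request
    entries, exits = Counter(), Counter()
    for ip, when, url in page_requests:
        previous = last.get(ip)
        if previous is None:
            entries[url] += 1            # first page ever seen from this address
        elif when - previous[0] > timeout:
            exits[previous[1]] += 1      # old visit ended on the previous page
            entries[url] += 1            # new visit starts here
        last[ip] = (when, url)
    for _, url in last.values():
        exits[url] += 1                  # close visits still open at end of log
    return entries, exits

day = datetime(2023, 10, 10)
requests = [
    ("192.0.2.1", day.replace(hour=13, minute=0), "/index.html"),
    ("192.0.2.1", day.replace(hour=13, minute=5), "/about.html"),
    ("192.0.2.1", day.replace(hour=15, minute=0), "/contact.html"),
]
entries, exits = entry_and_exit_pages(requests)
print(dict(entries))  # {'/index.html': 1, '/contact.html': 1}
print(dict(exits))    # {'/about.html': 1, '/contact.html': 1}
```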
Referrers are those URLs that lead a user to your site or caused the browser to request something from your server. The vast majority of requests are made from your own URLs, since most HTML pages contain links to other objects such as graphics files. If one of your HTML pages contains links to 10 graphic images, then each request for the HTML page will produce 10 more hits with the referrer specified as the URL of your own HTML page.
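To see which referrers actually brought visitors in from elsewhere, you would set aside the ones that point back at your own site. The sketch below assumes a hypothetical site at www.example.com and invented referrer values; the function name external_referrers is made up for this example.

```python
from collections import Counter
from urllib.parse import urlsplit

def external_referrers(referrers, own_host):
    """Count referrer URLs whose host is not our own site."""
    counts = Counter()
    for referrer in referrers:
        host = urlsplit(referrer).hostname
        # Skip empty referrers ('-') and links from our own pages.
        if host is None or host.lower() == own_host.lower():
            continue
        counts[referrer] += 1
    return counts

# Ten image requests referred by our own page, one outside link, one empty.
referrers = (
    ["http://www.example.com/index.html"] * 10
    + ["http://other.example.org/links.html", "-"]
)
print(dict(external_referrers(referrers, "www.example.com")))
# {'http://other.example.org/links.html': 1}
```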
Sometimes when a user clicks on a link to your site from a search engine, information about what search terms they used will be recorded in our logs. This information is extracted and displayed here. The number of hits is quite low here, because a lot of the time the search terms are not passed to our server when the visitor clicks the search engine link. However, it does provide a useful indication of how people are finding your site.
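When search terms are passed along, they normally sit in the query string of the referrer URL. The sketch below assumes a hypothetical search engine at search.example.com that carries the terms in a parameter called q; real search engines use varying parameter names, and many pass nothing at all.

```python
from urllib.parse import parse_qs, urlsplit

def search_terms(referrer, param="q"):
    """Return the search phrase from a referrer URL, or None if absent."""
    query = urlsplit(referrer).query
    values = parse_qs(query).get(param)
    return values[0] if values else None

print(search_terms("http://search.example.com/search?q=web+hosting+london"))
# web hosting london
print(search_terms("http://search.example.com/"))
# None
```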
Countries are determined based on the top level domain of the requesting site. This is somewhat questionable, however, as there is no longer strong enforcement of domains as there was in the past. A .COM domain may reside in the US, or somewhere else. An .IL domain may actually be in Israel, but it may also be located in the US or elsewhere. The most common domains seen are .COM (US Commercial), .NET (Network), .ORG (Non-profit Organization) and .EDU (Educational). A large percentage may also be shown as Unresolved/Unknown, as a fairly large percentage of dialup and other customer access points do not resolve to a name and are left as an IP address.
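The lookup described above can be pictured like this. The table only holds the handful of domains mentioned in the text, and treating any domain missing from the table as Unresolved/Unknown is an assumption made to keep the sketch short.

```python
import ipaddress

# Only the domains mentioned in the text; a real table is much longer.
TLD_LABELS = {
    "com": "US Commercial",
    "net": "Network",
    "org": "Non-profit Organization",
    "edu": "Educational",
    "il": "Israel",
}

def country_for(host):
    """Map a requesting host to a label based on its top level domain."""
    try:
        ipaddress.ip_address(host)
        return "Unresolved/Unknown"  # bare IP address, never resolved to a name
    except ValueError:
        pass  # not an IP address, so treat it as a host name
    tld = host.rsplit(".", 1)[-1].lower()
    # Assumption: domains missing from the table are reported as unknown.
    return TLD_LABELS.get(tld, "Unresolved/Unknown")

print(country_for("fw.xyz.com"))       # US Commercial
print(country_for("host.example.il"))  # Israel
print(country_for("192.0.2.1"))        # Unresolved/Unknown
```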
Notes on accuracy
The majority of data analyzed and reported on by The Webalizer is as accurate and correct as possible based on the input log file. However, due to the limitations of the HTTP protocol, the use of firewalls, proxy servers, multi-user systems, the rotation of your log files, and a myriad of other conditions, some of these numbers cannot be calculated with absolute accuracy. In particular, Visits, Entry Pages and Exit Pages are subject to random errors due to the above and other conditions. The reason for this is twofold: 1) log files are finite in size and time interval, and 2) there is no way to tell multiple individual users apart given only an IP address. Because log files are finite, they have a beginning and an ending, which can be represented as a fixed time period. There is no way of knowing what happened before this time period, nor is it possible to predict future events based on it. Also, because it is impossible to tell individual users apart, multiple users that have the same IP address all appear to be a single user, and are treated as such. This is most common where corporate users sit behind a proxy/firewall to the outside world, and all requests appear to come from the same location (the address of the proxy/firewall itself). Dynamic IP assignment (used with dial-up internet accounts) also presents a problem, since the same user will appear to come from multiple places.
For example, suppose two users visit your server from XYZ company, which has its network connected to the Internet by a proxy server ‘fw.xyz.com’. All requests from the network look as though they originated from ‘fw.xyz.com’, even though they were really initiated by two separate users on different PCs. The Webalizer would see these requests as coming from the same location, and would record only 1 visit, when in reality there were 2. Because entry and exit pages are calculated in conjunction with visits, this situation would also record only 1 entry and 1 exit page, when in reality there should be 2.
As another example, say a single user at XYZ company is surfing around your website. They arrive at 11:52pm on the last day of the month, and continue surfing until 12:30am, which is now a new day (in a new month). Since a common practice is to rotate (save then clear) the server logs at the end of the month, you now have the user’s visit logged in two different files (current and previous months). Because of this (and the fact that the Webalizer clears history between months), the first page the user requests after midnight will be counted as an entry page. This is unavoidable, since it is the first request seen from that particular IP address in the new month.
For the most part, the numbers shown for visits, entry and exit pages are pretty good ‘guesses’, even though they may not be 100% accurate. They do provide a good indication of overall trends, and shouldn’t be far enough off from the real numbers to matter much. You should probably consider them as the ‘minimum’ amount possible, since the actual (real) values should always be equal or greater in all cases.
It is common practice to add 1/3 to the value of visits to take account of the above problems.
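As a quick worked example of that rule of thumb, adding one third is the same as multiplying by 4/3. The helper name adjusted_visits is made up for this sketch, and rounding to a whole number is simply a presentation choice.

```python
def adjusted_visits(reported_visits):
    """Add one third to the reported visits, rounded to a whole number."""
    return round(reported_visits * 4 / 3)

print(adjusted_visits(300))  # 400
```

So a month showing 300 visits would be read as roughly 400.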