Where Do I Monitor Web Usage Statistics?

Your log files contain the statistics for your site's traffic. These files are in your /logs directory and are accessed through your control panel. The raw files may be large and difficult to understand, so we offer several analysis programs, updated several times a day. We use a comprehensive log analysis program to present these statistics in a more graphic format and help identify opportunities. The output is very comprehensive. Just type in yourdomain.com/logs/, where yourdomain.com is the name of your web site. You may be prompted for your user ID and password to see the logs.

If you are looking for a more detailed analysis of your web site, we suggest you contact us for a detailed analysis of your logs with a report. We run a statistical package for this. The program we use is a "bear" to run, takes forever to produce output, and costs way too much ($1,200 per copy), but it gives good business-related data.

If you are looking to perform a more detailed analysis yourself, your selection of a program should give you information about:
The return on your investment
Measurements of the traffic on your domain
The quality and quantity of visitors to your web site
How many users visit your site daily and how that traffic is growing
The paths visitors take when they browse your web site
Which is the most active day of the week? The most active hour?
What kind of information is accessed on your server?
Which pages are the most popular?
How many users are accessing each directory?
Which forms are submitted more than others?

Our default system provides almost all of the above data, and that data can easily be related back to the other information. Thus, for most people, another program is unnecessary.


What Does It Mean and How Do I Analyze It?

If you do not include your web address on everything you send out or display (documents, advertisements, business cards, the side of your building, and more), you are missing one of the best opportunities to increase your business and find out how your hard-earned advertising and other dollars are paying off. In every study since 1999, including a web address in a publication or other item has increased the productivity of that item by 10% to 20%. It also means that if you place an advertisement in any type of media, there are ways to find out whether it is paying off, and they don't cost much either. Contact us for details on how to do this in your advertising program (print or otherwise).

Logs and their analysis can be one of the most important markers for the success or failure of your business. Many people are surprised that this can include your offline efforts too. This support discussion is NOT meant to cover all issues, as log analysis can be complex and is too much for a single support page. We have experts who can make a full analysis and report if desired.

The yearly (index) report shows statistics for a 12-month period and links to each month. The monthly report has detailed statistics for that month, with additional links to any URLs and referrers found. The various totals shown are explained below.

General: Hits versus Page Requests

Hits. Any request made to the server that is logged is considered a 'hit'. The requests can be for anything: HTML pages, graphic images, audio files, CGI scripts, etc. Each valid line in the server log is counted as a hit. This number represents the total number of requests that were made to the server during the specified report period.

Pages. Pages are, generally, any HTML document, or anything that generates an HTML document. This number represents the number of 'pages' requested only, and does not include the other 'stuff' that goes into a page, such as graphic images, audio clips, etc. Treat anything with the extension '.htm', '.html' or '.cgi' as a page.

Hits and page requests are not synonymous. When someone clicks on a link to one of your pages or types the URL of one of your pages in the address box of their browser, the browser generates a request header that is sent to the server hosting that page. The server records that request in the today log file. When the HTML page arrives at the browser, the browser reads it and builds the page on the screen. Any place on that page where a graphic is displayed remains blank until the browser generates the request headers for those graphics files and receives the files. The server also logs these requests in the today log file. Each night, the server appends that information to the mtd (month-to-date) file and starts the today file anew. A program then reads the mtd file and creates the analysis files for the month overview and month detail pages. Hits are not of much use unless you are a web designer and will not be covered further in this discussion.
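To make the difference concrete, here is a minimal Python sketch that counts hits and pages. It assumes log lines in the Common Log Format and uses the page extensions named above ('.htm', '.html', '.cgi'). The sample lines and the function name count_hits_and_pages are made up for illustration; this is not the code of the analysis program we run.

```python
import re
from urllib.parse import urlsplit

# Matches the request part of a Common Log Format line, e.g. "GET /path HTTP/1.0"
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+)[^"]*"')
PAGE_EXTENSIONS = (".htm", ".html", ".cgi")


def count_hits_and_pages(lines):
    """Return (hits, pages) for an iterable of log lines."""
    hits = pages = 0
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a valid log line, so it is not a hit
        hits += 1
        # Drop any query string before checking the extension
        path = urlsplit(match.group(1)).path.lower()
        if path.endswith(PAGE_EXTENSIONS):
            pages += 1
    return hits, pages


# Hypothetical sample: one page that pulls in two graphics
sample = [
    '10.0.0.1 - - [01/Mar/2005:10:00:00 -0500] "GET /index.html HTTP/1.0" 200 5120',
    '10.0.0.1 - - [01/Mar/2005:10:00:01 -0500] "GET /logo.gif HTTP/1.0" 200 2048',
    '10.0.0.1 - - [01/Mar/2005:10:00:01 -0500] "GET /photo.jpg HTTP/1.0" 200 8192',
]
print(count_hits_and_pages(sample))  # (3, 1)
```

One visitor viewing one page produced three hits but only one page request, which is why page requests are the more useful figure.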

Files. Some requests made to the server require that the server send something back to the requesting client, such as an HTML page or graphic image. When this happens, it is considered a 'file' and the files total is incremented. The relationship between 'hits' and 'files' can be thought of as 'incoming requests' and 'outgoing responses'.

The Month Overview:

The current (to about midnight last night) and previous month summaries are shown when you first enter your log system. The main items in your daily averages are page requests and visits. Page requests are physical viewable documents, and visits correspond to people visiting. Monthly totals show the sites (referrers), the KB of traffic, visits, pages, files, and hits. Sites, traffic, visits, and pages are the main items here.

In general, if your site is new and web marketing is in progress, your traffic, sites, and visits should be constantly increasing. The actual behavior is strongly dependent on your advertising and marketing efforts. If your web site is established, the traffic will generally increase during the fall and spring, decline in summer, and, depending on the type of site, increase or decrease during holidays. Business sites typically decline during holidays, while sales of gifts increase prior to them. Again, the actual behavior is strongly dependent on your advertising and marketing efforts. If your site is part of an AIM or other program managed by OfficeOnWeb, you will see much more variation than is typical. We purposely generate marketing efforts, remove some of the effort, and take other actions so we can improve the results of such marketing consulting services for that client. These do not affect clients without these services.

The Month Detail:

The month detail shows daily, average, maximum and much more information for the month in question. If it is the current month, the data is through the day prior to viewing (date and time of the run is shown at the top of the report). Details regarding specific issues are shown in the discussions below.

Response code. Code 200 means the visitor obtained the information OK and it was not previously cached. The assumption then is that they are a new visitor. Code 304 means the visitor obtained the information from their cache, which means they had it before. The typical assumption is a repeat visitor, but it can also be the same visitor going back to a page they viewed earlier. In general, a successful web site has a return ratio of 20% to 30% here. If people are not returning at this rate, they may not be finding what they are looking for on the site. If your 404 count is high, you should examine your site to see whether some links are broken. If the 404s are for a favicon (favicon.ico), you are missing an opportunity to make your site memorable to returning visitors. You should create a favicon and place it on the site.
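As a rough illustration, the return ratio can be computed from the response code totals. The report does not spell out an exact formula, so this Python sketch assumes the ratio is the share of 304 responses among all 200 and 304 responses. The function name return_ratio and the sample counts are hypothetical.

```python
from collections import Counter


def return_ratio(status_codes):
    """Share of 304 (cached) responses among 200 and 304 responses.

    This definition is an assumption made for illustration.
    """
    counts = Counter(status_codes)
    served = counts[200] + counts[304]
    if served == 0:
        return 0.0
    return counts[304] / served


# Hypothetical month: 70 fresh requests, 25 cached, 5 broken links
codes = [200] * 70 + [304] * 25 + [404] * 5
print(f"{return_ratio(codes):.0%}")  # 26%
```

With these made-up numbers the site falls inside the 20% to 30% range mentioned above.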

Daily usage graph. If you run an advertisement, have a nice article published about your business, or take some other action, you should make sure your web site address appears in it. Then look at your logs. If you don't see a rise on the day of publication and for about three days after it, the advertisement, article, or other item was not very effective. You can enhance this ability by using multiple domain and parking techniques.

Ratios. If you have fewer than ten pages, ratios don't tell you as much about your success potential. After all, if you have only one page, the fact that people came, looked at the page, and left doesn't say much. Whether they liked the site, felt like buying, or whatever, you cannot tell. In general, for a web site of ten or more pages, your average should be about three page requests per visit. This means people found the kind of information they expected when they came to the site. A ratio of three to one or more of pages to unique URLs and sites indicates highly successful search engine or other link placements. You should factor the size of your own site into the unique URL and referrer computations.
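The pages-per-visit figure is simple arithmetic on the monthly totals. A small Python sketch, with made-up totals and a hypothetical function name:

```python
def pages_per_visit(pages, visits):
    """Average number of page requests per visit."""
    if visits == 0:
        return 0.0
    return pages / visits


# Hypothetical monthly totals read off the month overview
print(pages_per_visit(4500, 1500))  # 3.0
```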

Hourly usage graph. If you run a time-dependent advertisement or other similar action, you should make sure your web site address appears in it. Then look at your logs. If you don't see a rise in the hour of airing and for about three hours after it, the advertisement or other item was not very effective. You can enhance this ability by using multiple domain and parking techniques. If you are not doing this, the hourly graph shows the type of visitation you get. Business-to-business sites are usually heavily visited from about 6 am to 6 pm. Shopping sites are busiest from 10 am to 10 pm. Sites of international interest have high early morning visitation. Peaks depend on the type of site too.

URL and page data. Totals show the most popular files looked at. Usually the home page is number one, but it may be another page due to an advertising or marketing program. It may also be a graphic that you are using as an ad. Entry is the most popular page that attracted people to first enter your web site. Properly done, this can identify marketing and advertising efforts too. Exit pages are the pages people left from. People tend to go to the home page just before they leave, so expect that to be the number one exit in most cases, for successful and unsuccessful sites alike. If another page is an extremely common exit point, you may want to change the page to remove a problem. Link pages are always an exit point. Remove them unless you feel comfortable with the traffic loss.

Referrer. This is a critical list. It would take a book to tell you all this can tell you. It can show the effectiveness of your advertising program and of search engines, popular information links within your own site, popular issues clients seem to like, and so much more.

Search string. If a search engine was used to find the web site, a search string was probably used in the query. This means you can see what strings are being used to find your site, and how effective your pages are at matching the terms you thought would be important. Note that bad spelling can actually get you clients.
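Search strings are recovered from the referrer URL that the search engine passes along. The following Python sketch shows the idea. It assumes the engine puts the search terms in a query parameter named 'q'; parameter names differ between engines, and the URL and function name here are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def search_string(referrer, param="q"):
    """Return the search terms from a referrer URL, or None if absent.

    The parameter name 'q' is an assumption; engines use different names.
    """
    query = parse_qs(urlsplit(referrer).query)
    values = query.get(param)
    return values[0] if values else None


print(search_string("http://search.example.com/search?q=office+web+hosting"))
# office web hosting
```

A referrer without that parameter, such as a plain link from another site, yields None.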

Sites. Each request made to the server comes from a unique 'site', which can be referenced by a name or ultimately, an IP address. The 'sites' number shows how many unique IP addresses made requests to the server during the reporting time period. This DOES NOT mean the number of unique individual users (real people) that visited, which is impossible to determine using just logs and the HTTP protocol.
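A short Python sketch shows why the sites total undercounts people. The hostnames are hypothetical, and the first two requests stand for two different people behind the same company proxy.

```python
def count_sites(hosts):
    """Number of unique requesting hosts (names or IP addresses)."""
    return len(set(hosts))


# Three people made requests, but two share one proxy address
hosts = ["proxy.example.com", "proxy.example.com", "192.0.2.10"]
print(count_sites(hosts))  # 2
```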

Visits. Whenever a request is made to the server from a given IP address (site), the amount of time since the previous request from that address (if any) is calculated. If the time difference is greater than a pre-configured 'visit timeout' value, or the address has never made a request before, the request is considered a 'new visit', and this total is incremented (both for the site and for the IP address). The default timeout value is 30 minutes, so if a user visits your site at 1:00 in the afternoon and then returns at 3:00, two visits would be registered. Visits only occur on PageType requests, that is, on any request whose URL is one of the 'page' types defined with the PageType option. Due to the limitations of the HTTP protocol, log rotations, and other factors, this number should not be taken as absolutely accurate. Rather, it should be considered a pretty close "guess" of the number of unique visitors. This number might be about as close as you will get to finding that number.
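The visit rule described above can be sketched in a few lines of Python. This is an illustration only, not the code of the analysis program. It assumes the input has already been filtered down to page requests and sorted by time, and the names count_visits and VISIT_TIMEOUT are hypothetical.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # the default timeout described above


def count_visits(page_requests, timeout=VISIT_TIMEOUT):
    """Count visits in (ip_address, datetime) pairs sorted by time."""
    last_seen = {}  # ip_address -> time of its previous page request
    visits = 0
    for ip, when in page_requests:
        previous = last_seen.get(ip)
        # New visit: address never seen, or silent for longer than the timeout
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when
    return visits


day = datetime(2005, 3, 1)
requests = [
    ("10.0.0.1", day.replace(hour=13, minute=0)),   # arrives at 1:00 pm
    ("10.0.0.1", day.replace(hour=13, minute=10)),  # same visit
    ("10.0.0.1", day.replace(hour=15, minute=0)),   # returns at 3:00 pm
]
print(count_visits(requests))  # 2
```

The user who arrives at 1:00 and returns at 3:00 is counted as two visits, matching the example in the text.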

KBytes. The KBytes (kilobytes) value shows the amount of data, in KB, that was sent out by the server during the specified reporting period. This value is generated directly from the log file, so it is up to the web server to produce accurate numbers in its logs for non-HTTP visitation. In general, this should be a fairly accurate representation of the amount of outgoing traffic the server had, regardless of the web server's reporting quirks. Note: A kilobyte is 1024 bytes, not 1000.
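As a sketch of the arithmetic, the following Python function totals the size field of each log line and converts it to kilobytes of 1024 bytes. It assumes Common Log Format, where the size field is '-' when no bytes were sent; the function name and sample values are hypothetical.

```python
def total_kbytes(size_fields):
    """Sum log size fields (strings) and return the total in KB (1024 bytes)."""
    total_bytes = 0
    for field in size_fields:
        if field != "-":  # '-' means no body was sent
            total_bytes += int(field)
    return total_bytes / 1024


print(total_kbytes(["5120", "2048", "-", "8192"]))  # 15.0
```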

Top Entry and Exit Pages. The Top Entry and Exit tables give a rough estimate of what URLs are used to enter your site, and what the last pages viewed are. These numbers should be considered a good "rough guess" that shows the overall trend in where users enter and exit your site.


Notes on Visits/Entry/Exit Figures.
The majority of data analyzed and reported is as accurate and correct as possible based on the input log file. However, due to the limitations of the HTTP protocol, the use of firewalls, proxy servers, and multi-user systems, the rotation of your log files, and a myriad of other conditions, some of these numbers cannot be calculated with absolute accuracy. In particular, Visits, Entry Pages, and Exit Pages are subject to random errors due to the above and other conditions. There is no way to tell multiple individual users apart given only an IP address. Because log files are finite, they have a beginning and an ending, which can be represented as a fixed time period. There is no way of knowing what happened before this time period, nor is it possible to predict future events based on it. Also, because it is impossible to tell individual users apart, multiple users that have the same IP address all appear to be a single user and are treated as such. This is most common where corporate users sit behind a proxy/firewall to the outside world, and all requests appear to come from the same location (the address of the proxy/firewall itself). Dynamic IP assignment (used with dial-up Internet accounts) also presents a problem, since the same user will appear to come from multiple places.

For example, suppose two users visit your server from XYZ company, which has its network connected to the Internet by a proxy server 'fw.xyz.com'. All requests from the network look as though they originated from 'fw.xyz.com', even though they were really initiated by two separate users on different PCs. The log analysis would see these requests as coming from the same location and would record only 1 visit, when in reality there were two. Because entry and exit pages are calculated in conjunction with visits, this situation would also record only 1 entry and 1 exit page, when in reality there should be 2.

As another example, say a single user at XYZ company is surfing around your website. They arrive at 11:52 pm on the last day of the month and continue surfing until 12:30 am, which is now a new day (in a new month). Since we rotate (save, then clear) the server logs at the end of the month, you now have the user's visit logged in two different files (current and previous months). Because of this, the first page the user requests after midnight will be counted as an entry page. This is unavoidable, since it is the first request seen from that particular IP address in the new month.

For the most part, the numbers shown for visits, entry pages, and exit pages are pretty good 'guesses', even though they may not be 100% accurate. They do provide a good indication of overall trends and shouldn't be far enough off from the real numbers to matter much. You should probably consider them the 'minimum' amount possible, since the actual (real) values should always be equal or greater. In all cases, they are far better and more useful than any counter system around.

Other items. There are many other items listed, and they tell you much about your web site's performance. However, their meaning is highly dependent on your marketing program and how your site is designed. Expert advice is recommended for analysis beyond the above. For example, if you design in a browser-specific system and don't avoid key problems, all your traffic will show pretty much one browser type (FrontPage->Internet Explorer on Windows). That is because it designs pages that don't look too good on other people's browsers. If your pages are compatible, you will find that 5% or more of the traffic is Apple or others, and Navigator is as much as 25% of the traffic.

 

OfficeOnWeb
261 Hines Road
Polk, PA 16342
Main Phone: 301-327-7500
FAX (call for number)
Toll free is available for clients only (call for number)
Copyright © 1993-2005 by Office on Web of Evergreen Colorado