Data Collection Part 2 - Hybrid Methodologies
Over the last five years, the web analytics industry has made significant advances in its data collection techniques. Web analytics vendors, as well as the individuals and businesses using web analytics, strive to obtain the most useful and impactful data in the most cost- and time-efficient manner. At the inception of web analytics, log files were the only available data source. Today, page tagging and network data collection are also very valuable sources of information. For definitions and additional information on page tagging, log files, and network data collection, see Data Collection Part 1 - Single Methodologies.
To overcome certain obstacles that can be caused by each of the data collection techniques, many organizations have implemented a hybrid solution. Hybrid data collection solutions have the potential to leverage the strengths of each technique while minimizing the weaknesses. Below is a review of the following four main hybrid solutions:
- Case 1: Logs and Tags
- Case 2: Logs and Network Data Collection
- Case 3: Network Data Collection and Tags
- Case 4: Network Data Collection, Tags, and Logs
Case 1: Logs and Tags
- Usage patterns from spiders (via the log files)
- Complete download data (via the log files)
- Error code data (via the log files)
- Behavioral data (via page tags)
- Specific data elements defined by the company (via page tags)
- Cached pages (via page tags)
This hybrid solution requires additional resources, including additional configuration, expertise, and people. Proper configuration is also essential: the software or tool must be set up with a process that ensures captured data is not duplicated. For example, a request counted by the tag must not be counted again from the log file.
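The duplication concern can be illustrated with a minimal sketch. It assumes, purely for illustration, that tag and log records have already been normalized to a common shape and that a log request seen within a couple of seconds of a tag hit for the same visitor and URL is the same request. The `Hit` type, the `merge_hits` function, and the two-second window are hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    visitor_id: str
    url: str
    timestamp: int  # seconds since epoch
    source: str     # "tag" or "log"


def merge_hits(tag_hits, log_hits, window_seconds=2):
    """Keep every tag hit; add a log hit only if no tag hit matches it."""
    # Index tag timestamps by (visitor, url) so each log hit can be checked.
    tag_times = {}
    for hit in tag_hits:
        tag_times.setdefault((hit.visitor_id, hit.url), []).append(hit.timestamp)

    merged = list(tag_hits)
    for hit in log_hits:
        times = tag_times.get((hit.visitor_id, hit.url), [])
        # Assumption: a nearby tag hit means the tag already counted this request.
        if any(abs(hit.timestamp - t) <= window_seconds for t in times):
            continue
        merged.append(hit)
    return sorted(merged, key=lambda h: h.timestamp)
```

In this sketch, requests that only the log file can see (for example, from spiders that do not execute tags) survive the merge, while page views already reported by a tag are counted once.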
Case 2: Logs and Network Data Collection
Case 3: Network Data Collection and Tags
Network Level Data:
Network data collection provides access to a more granular level of technical data that can be used to determine server response times to requests and identify network related issues that could be interfering with user experience.
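As a rough illustration of how response times might be derived from network-level data, the sketch below assumes the collector exposes, for each request, the time it saw the request and the time it saw the response. The tuple layout and the function name `average_response_times` are assumptions made for this example only.

```python
from collections import defaultdict
from statistics import mean


def average_response_times(observations):
    """Average server response time per URL.

    observations: iterable of (url, request_seen_at, response_seen_at),
    with both times in seconds (hypothetical record layout).
    """
    durations = defaultdict(list)
    for url, request_seen_at, response_seen_at in observations:
        durations[url].append(response_seen_at - request_seen_at)
    return {url: mean(values) for url, values in durations.items()}
```

URLs whose average stands out from the rest would be natural candidates for a closer look at server or network issues.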
Data Consolidation:
Network data collection often simplifies the process of consolidating and combining data from many servers, a task that is common when working with log files.
Additional Application Data:
Some network data collectors are capable of collecting application server variables and other additional fields of data that are not captured in log files and would be difficult or impossible to capture with page tags.
First Time Visit Cookie Setting:
Some network data collectors are capable of setting the visitor identification cookie themselves. This is a superior method of setting the cookie, because the first request the web server sees from a new visitor will not carry the appropriate visitor identification cookie.
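The first-visit case can be sketched as a small decision function: reuse the identifier if the request already carries one, otherwise mint one and return the cookie to set. The cookie name `visitor_id`, the function name, and the use of a random UUID are assumptions for illustration, not a description of any vendor's implementation.

```python
import uuid


def ensure_visitor_cookie(request_cookies, cookie_name="visitor_id"):
    """Return (visitor_id, set_cookie_value); set_cookie_value is None if no cookie is needed."""
    existing = request_cookies.get(cookie_name)
    if existing:
        # Returning visitor: the request already identifies itself.
        return existing, None
    # First request from a new visitor: assign an identifier right away so
    # that this request can be attributed to the visitor too.
    new_id = uuid.uuid4().hex
    return new_id, f"{cookie_name}={new_id}; Path=/"
```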
Search Engine Spider Reporting:
Knowing the usage patterns of spiders can be valuable when engaging in search engine optimization. This data can be utilized to optimize the technology and content of the site for those spiders.
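One simple way to surface spider usage patterns is to match the user-agent field of each request against known crawler tokens and count which pages each crawler requests. The sketch below assumes entries have already been reduced to (url, user_agent) pairs; the marker list is illustrative and far from exhaustive.

```python
from collections import Counter

# Illustrative substrings found in some crawler user agents; not a complete list.
SPIDER_MARKERS = ("googlebot", "bingbot", "slurp")


def spider_page_counts(entries):
    """Count requests per (spider marker, url) from (url, user_agent) pairs."""
    counts = Counter()
    for url, user_agent in entries:
        lowered = user_agent.lower()
        for marker in SPIDER_MARKERS:
            if marker in lowered:
                counts[(marker, url)] += 1
                break  # attribute each request to at most one spider
    return counts
```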
Complete download data:
Log files make it possible to calculate the number of file downloads that completed successfully versus the number that were not fully completed.
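A minimal sketch of this calculation, assuming each log entry records the bytes sent and that the true size of each downloadable file is known: a download is treated as complete when the bytes sent reach the file size. The entry layout and the function name are hypothetical, and partial-content (range) requests are deliberately ignored here.

```python
def classify_downloads(log_entries, file_sizes):
    """Return (complete, incomplete) download counts.

    log_entries: iterable of (path, status, bytes_sent) tuples.
    file_sizes: dict mapping a downloadable path to its size in bytes.
    """
    complete = incomplete = 0
    for path, status, bytes_sent in log_entries:
        # Only consider successful responses for files we track.
        if path not in file_sizes or status != 200:
            continue
        if bytes_sent >= file_sizes[path]:
            complete += 1
        else:
            incomplete += 1
    return complete, incomplete
```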
Server Error Code Reporting:
Error code data is automatically recorded in most log files and can provide valuable insight into site functionality and design issues that would be difficult to detect through other means.
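As an illustration, the sketch below counts 4xx and 5xx responses per path, assuming the log lines follow the widely used Common Log Format, in which the quoted request line is followed by the status code. The regular expression is a simplification and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it.
REQUEST_AND_STATUS = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})\b'
)


def error_counts(lines):
    """Count client (4xx) and server (5xx) errors per (status, path)."""
    counts = Counter()
    for line in lines:
        match = REQUEST_AND_STATUS.search(line)
        if match and match.group("status")[0] in "45":
            counts[(match.group("status"), match.group("path"))] += 1
    return counts
```

Paths that accumulate many 404 responses may point to broken links, while recurring 500 responses may point to application faults.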
Not all data collectors are able to collect all of the above information. However, the most recently developed data collectors have made many new advances and are able to collect the data listed.
Many companies would likely achieve the highest return on investment by using a hybrid solution that combines network data collection and page tagging. As in all hybrid cases, the implementation and maintenance must be carefully monitored.
Case 4: Network Data Collection, Tags, and Logs
Recently, a number of web analytics vendors have presented innovative ways to collect data from the network. In particular, ClickStream Technologies has created some interesting technology that combines network data collection and tagging so that the tagging is highly automated. Additionally, Visual Sciences has been offering creative solutions that incorporate network data collection for some time now. It can be expected that more vendors will adopt this method of data collection in the near future and develop their own approaches.
Josh Manion, Chief Executive Officer
Stratigent, LLC
For more information, please call 877-427-2900 or email info@stratigent.com.