site site-navbranch.xsl site
section no_callout.xsl no_callout
page static_html.xsl b2006_5_29
NOTE: demo is now on-line - see end of article for details.
If you are fortunate enough to have signed up for Google Analytics when it was first released, or to have been invited to join since then, then you probably know what an awesome web tracking solution it is. I was one of the fortunate ones who quickly signed up and I have instrumented a couple of my web sites, including Active Interface.
Google Analytics has top notch analytics and a powerful yet intuitive portal. The main issue you are likely to have is information overload - but that is not a bad problem to have with a free solution. In this article I will discuss my mashup strategy for dealing with this information onslaught.
What most users want, I have discovered, is a self-contained narrative - and they usually want it in PowerPoint. My clients have converted me to this way of thinking. A PowerPoint report has the advantage that it can easily be shared. Most Windows business users have PowerPoint (for those who don’t, the viewer is a free download from Microsoft). The narrative for the report can be added with bullet slides between charts or with text in the Notes panel. Being a file, such a report is easy to share within an organization. It can be emailed, placed on a file share, or put into a SharePoint repository. The presentation can be formatted to the company’s layout and branding standards. The list of benefits goes on.
My current PowerPoint Automation Toolkit can obtain data from relational databases and from SQL Analysis Services. In keeping with the mashup philosophy, I am going to add another data source interface which gets XML data from Google Analytics. Thankfully, Google provides a REST interface to the analytics reports. And the XML it returns is in a developer-friendly format.
There are a few architectural issues which quickly came to mind. First, should I add a new data source type to the PPTATK presentation manifest (the XML file which defines the presentation to be built) or should I switch to a pipeline architecture to decouple data acquisition from presentation generation? A related question is whether to create slide generation templates which directly consume Google’s XML format or if to use my existing intermediate tabular format. The final question which came to me is whether to have the Google REST URL be the format for the presentation manifest. The alternative would be to have a more user friendly set of attribute name-value pairs.
Currently there is a single WSH script which a) opens the specified manifest, b) augments the manifest with acquired data, and c) generates the PPT. It is a nice clean solution I believe. But perhaps a more modular solution would be to decouple data acquisition from PPT generation. The manifest would be passed through a series of independent scripts, each one of which would look for data sources it can handle and would get that data and stuff it into the manifest.
The alternative would be to augment the current script with additional procedural code to deal with Google Analytics as a data source. This approach would probably minimize the code and architectural changes to the current PPTGEN solution. Many of the benefits of a pipeline could be obtained by placing code for a given data source type into its own JS file which could be included in the main script.
As I write this, I haven’t come to a conclusion (usually the act of writing out the options makes one of the obvious choice). So on to the other issue.
A Google Analytics REST URL has the form
Some of the names are obvious: “bd” - begin date, “ed” - end date. The rest? well I’ll just have to reverse engineer them - at lest as far as is necessary to express they types of queries I want for my reports. A slide element in the manifest could have a query specified with just the Google URL. Or it could be done by having separate XML attributes (with more friendly names) for each of the different parts. For example, I could have:
<query type=”google analytics” report=”unique visitors” start_date=”2006/05/01” ... />
My hesitation to go the shredding route is twofold. First, I don’t really think of the XML manifest file as being a user friendly format. The user interface which generates the manifest could have the responsibility for presenting friendly property names for the user-editable components of the URL. Second, I am thinking that sticking with the Google URL for the manifest will make the mashup less brittle. A change or addition to the Google format would not require changes to the mashup script. And if a user wanted to create a slide from a report which Google added after I published my script, all he would have to do is copy and paste the URL from the Google portal.
You can’t bang out your mashups quickly if you spend time fretting over such decisions. And I’ve decided:
The results can somewhat be summarized with this listing:
F:\VSS\PPTAutomation\dev\src\xml2ppt>wc -l *.js 70 google-provider.js 128 msxml-helper.js 94 sqlxml-provider.js 58 wsh-helper.js 125 xmla-provider.js 300 xml2ppt.wsf 775 total
The PPT generator now has 475 lines of code. The new Google analytics provider is 70 lines. The SQLXML and XMLA providers were also split out an are about twice as big. But much of the work done in this mashup is accomplished using XSLT so it wouldn’t be fair to leave out those files
F:\VSS\PPTAutomation\dev\src\xml2ppt\xsl>wc -l *.xsl 126 ga2table.xsl 59 sqlxml2table.xsl 741 xml2ppt.xsl 157 xmla2table.xsl 1083 total
So there is actually more declarative than imperative code in this mashup.
And finally, the long awaited PowerPoint results. I’ve built two presentations which combined present almost all of the Google Analytics reports. One presentation has the Marketing Optimization reports and the other has the Content Optimization reports.
I generated the reports using my “landscape dark blue” template - I like dark backgrounds since they are easier on they eyes when projected. All settings (master page, fonts, colors, etc.) came from the this template fed into the generator.
I’ve decided to present the PPTs just as they come off the assembly line. Normally you would apply some “spit and polish” in terms of positions, font sizes, table layouts, etc. But I think these look respectable even without any fine tuning.
Many of the charts and table are empty - I don’t currently sell anything on my site so reports pertaining to conversion and ROI have no data to plot.
Here is a screenshot of PowerPoint with one of my Content Performance slides:
You download the presentations using these links.
The Google Analytics mashup is somewhat non-traditional in that it does not consume a “published” public XML interface. As such, I am relying on Google using the information in my browsers cookies when I make the requests.
As I start to become a web analytics junkie, I’ll probably now want to extend this concept. My next target will be WebTrends. This will be a more traditional mashup since they expose and ODBC interface - even to their hosted analytics I hear.
Another mashup I'll be doing with Google Analytics will be an integration with ThinCMS. I want to build a custom site overlay since I find the Google one pretty useless (their overlays cover up all your links).
You can see the mashup in action by opening http://www.activeinterface.com/pptatk/ga2ppt.hta
This only works in IE on Windows. You need to have MSXML4 installed (if you don't the application will tell you and give you the download link). And of course you will need to trust me enough to run an HTA.