Security News

How to set up PIWIK to track visitors' downloads

PIWIK is an amazing Open-Source Web Analytics platform and a good alternative to http://www.google.com/analytics/, as it gives you full control over your data and more detail (e.g. full IP addresses).
You have three hosting options:

  • Use their cloud service. (More info here)
  • Host it yourself online: at the back of your web server or on a different/dedicated server. (More info here)
  • Host it offline, and manually import your Apache logs. (More info here)

  • The advantage of hosting it online is that you can use PHP/JavaScript trackers within your web pages, which produce more information on your visitors (screen resolution, plugins, etc.).
    It also allows you to do certain "tricks", such as tracking who downloads a specific image, as described in this article. You can of course do that without a PHP script, but you will not get as much detailed information.

    The problem with PIWIK, at the moment, is that you cannot easily produce a report on who has downloaded a file.
    Under Actions -> Downloads, you do get an overall total number of downloads for each file being tracked/logged but no other details.

    Below is a workaround to get that missing detail:
  • Note that it only works for logs produced AFTER the following has been set up; it is not a retrospective hack
  • Go to Goals -> Add new Goals
  • Give your goal a name
  • If the file you want to track is a binary, then select "download a file"
  • Set the filter to "contains" and type the name of the file. There is no need to use a regular expression: if, for example, you want to track all of the files package1.rar, package2.rar... package50.rar, you can just enter "package". It is however advisable to choose a string that is as unique as possible and that is not part of an html page name. If you want to track a single file, just enter "package1.rar"
  • Allow the Goal to be converted more than once per visit
  • Add the goal

  • If you want to track the download of a specific html page or shell script (e.g. a .sh file), select "visit a given URL" instead of "download a file". Remember that there is no need to use the full URL path or a regular expression.
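The effect of the "contains" filter can be illustrated with a plain substring test. This is only a minimal sketch of the matching idea, not PIWIK's own code: the function name goal_matches and the example.org URLs are hypothetical, and case handling in PIWIK may differ from this sketch.

```python
def goal_matches(url: str, pattern: str) -> bool:
    """Return True if the pattern occurs anywhere in the URL (plain substring test)."""
    return pattern in url


# Hypothetical URLs, for illustration only.
urls = [
    "http://example.org/files/package1.rar",
    "http://example.org/files/package50.rar",
    "http://example.org/packages.html",
    "http://example.org/files/other.zip",
]

# A short pattern catches every numbered package, but also the html page.
broad = [u for u in urls if goal_matches(u, "package")]

# A full file name only catches that one file.
narrow = [u for u in urls if goal_matches(u, "package1.rar")]

print(len(broad), len(narrow))
```

Here "package" also matches packages.html, which is why a string that does not appear in any page name is the safer choice.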

    To generate a report you need to do the following:
  • Get your goal's ID: start adding a goal but, instead of creating a new one, click on the "manage goal" link and take note of the goal IDs (there is probably another way to reach the list of goals you have set up through the admin pages)
  • Go back to your PIWIK website dashboard
  • Click on the "ALL VISITS" filter and click on "Add a Segment"
  • On the left-hand side, select "Visit" and drag & drop "Visit Converted a specific Goal ID" to the right-hand side, then specify the Goal ID you found earlier
  • Name that Segment
  • If you want to remove bots' hits (e.g. Google and other crawlers), open "Custom Variables" on the left-hand side and drag & drop "custom variable 1" to the right-hand side, in the AND section. Specify "is" "NOT-BOT"
  • Save & apply
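The same kind of segment can, as far as I know, also be passed to the PIWIK Reporting API as a "segment" parameter. The sketch below only builds such a URL and sends nothing. It assumes the Live.getLastVisitsDetails method and the visitConvertedGoalId segment name; the host name, site ID, goal ID and token are placeholders, and build_visits_url is a hypothetical helper. The bot filter is left as an optional extra condition because the exact custom-variable segment to use depends on how your logs were imported.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_visits_url(base_url, site_id, goal_id, token, extra_condition=None):
    """Build a Reporting API URL listing today's visits that converted a goal."""
    # Assumed segment name for "Visit Converted a specific Goal ID".
    segment = f"visitConvertedGoalId=={goal_id}"
    if extra_condition:
        # ';' joins two conditions with AND in the segment syntax.
        segment += ";" + extra_condition
    params = {
        "module": "API",
        "method": "Live.getLastVisitsDetails",
        "idSite": site_id,
        "period": "day",
        "date": "today",
        "format": "JSON",
        "segment": segment,
        "token_auth": token,
    }
    return base_url.rstrip("/") + "/index.php?" + urlencode(params)


# Placeholder values: replace with your own installation, site and goal.
url = build_visits_url("http://piwik.example.org", 1, 3, "YOUR_TOKEN")

# Decode the query string again to check what was encoded.
query = parse_qs(urlsplit(url).query)
print(query["segment"][0])
```

urlencode takes care of escaping the "==" and ";" characters, so the segment can be written in its readable form.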

  • That's it: by creating different goals and segments as described above, you will be able to generate detailed reports on who downloaded specific files from your websites.
    Any new logs imported into your PIWIK installation after those goals are created will be evaluated and can be reported on.
    Now, it would be great to have a simpler way to generate such a report, and for those goals to get retrospective "hits"... Maybe in a future release?
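If the visits of such a segment are exported as JSON, a short script can reduce them to a "who downloaded what" list. This is a rough sketch under assumptions: the field names visitIp, actionDetails, type and url are what I would expect in a visitor log export and should be checked against your own data. The function downloads_by_visitor and the sample records are hypothetical.

```python
def downloads_by_visitor(visits, filename):
    """Collect (ip, url) pairs for download actions whose URL contains filename."""
    hits = []
    for visit in visits:
        for action in visit.get("actionDetails", []):
            is_download = action.get("type") == "download"
            if is_download and filename in action.get("url", ""):
                hits.append((visit.get("visitIp"), action["url"]))
    return hits


# Hypothetical records, shaped like the assumed export format.
sample = [
    {
        "visitIp": "192.0.2.10",
        "actionDetails": [
            {"type": "action", "url": "http://example.org/index.html"},
            {"type": "download", "url": "http://example.org/files/package1.rar"},
        ],
    },
    {
        "visitIp": "192.0.2.20",
        "actionDetails": [
            {"type": "download", "url": "http://example.org/files/other.zip"},
        ],
    },
]

report = downloads_by_visitor(sample, "package1.rar")
for ip, url in report:
    print(ip, url)
```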
