Web servers are the computers that host the web pages that make up the World Wide Web. They also log a range of information about the people who visit those pages, such as:
- IP address
- Type of operating system and browser
- Operating characteristics such as screen resolution and colour depth
- The URL that referred the visitor to the site
- The country the visitor is from
- Where this visitor clicks while visiting the site
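To make this concrete, here is a minimal sketch of pulling a few of these fields out of one log line. It assumes the widely used Apache/Nginx "combined" log format; the sample line, the `COMBINED` pattern and the `parse_line` helper are made up for illustration and are not taken from any particular server. Items such as screen resolution or country are not part of this format and are usually gathered or derived by other means, so they are not covered here.

```python
import re

# Pattern for the "combined" access log format (an assumption for this sketch).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return the logged fields as a dict, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# Hypothetical log entry using a documentation-only IP address.
sample = (
    '203.0.113.7 - - [10/Oct/2009:13:55:36 +1000] '
    '"GET /course/view.php?id=42 HTTP/1.1" 200 2326 '
    '"http://example.com/start" "Mozilla/5.0 (Windows NT 6.1)"'
)
record = parse_line(sample)
```

Here `record["ip"]`, `record["referrer"]` and `record["user_agent"]` hold the visitor's IP address, the referring URL and the browser string respectively.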
It is common practice for website maintainers to analyse web activity logs, turning visitors’ online behaviour into marketing intelligence. It is a similar story in higher education, where learning management systems (LMSs) like Moodle and Blackboard aggregate these logs into database tables whose records can be analysed by the institution. For example, our Indicators project is looking at ways that this recorded data can be converted into information that can inform and improve university teaching and learning. We have delivered several presentations based on the Indicators project, and a common question that arises concerns the ethics of using web server logs for this purpose.
There appear to be two main ethical concerns around the use of web data mining: privacy and individuality. According to a paper that discusses ethical issues in web data mining, “Web mining does, however, pose a threat to some important ethical values like privacy and individuality. Web mining makes it difficult for an individual to autonomously control the unveiling and dissemination of data about his/her private life” (Wel & Royakkers, 2004). The authors go on to say that web usage mining raises privacy concerns when web users are traced and their actions are analysed without their knowledge.
Privacy is a conceptually fragile and enigmatic term but in the context of web data mining it is commonly referred to as the control of information about oneself. In terms of the Indicators project we are de-identifying individuals and courses as well as aggregating data to look at patterns of activity across student groups that consist of thousands of students. This, I suspect, does not present any privacy concerns, as it is impossible to identify individuals within the data sets.
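As a rough illustration of what de-identifying and aggregating can look like, the sketch below replaces student identifiers with a keyed hash and then counts activity per student group. This is only one possible approach and not necessarily how the Indicators project does it; the key, the `pseudonymise` and `hits_per_group` helpers and the sample rows are all hypothetical.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-secret-kept-away-from-the-data"


def pseudonymise(identifier, key=SECRET_KEY):
    """Replace an identifier with a keyed hash so the raw value is not stored."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def hits_per_group(rows):
    """Count hits per student group, ignoring who made each hit."""
    return Counter(group for _student, group in rows)


# Made-up (student_id, group) rows standing in for LMS activity records.
rows = [("s001", "distance"), ("s002", "on-campus"), ("s001", "distance")]
deidentified = [(pseudonymise(student), group) for student, group in rows]
totals = hits_per_group(deidentified)
```

The same student always maps to the same pseudonym, so repeat activity can still be counted, while the analysis itself works only with group-level totals.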
De-individualisation has been defined as a tendency to judge and treat people on the basis of group characteristics instead of their own individual characteristics and merits. It has been said that when group profiles are used as a basis for decision-making and the formulation of policy, or if profiles somehow become public knowledge, the individuality of people is threatened. My interpretation is that this relates more to how the collected data is used than to how it is collected, and I can think of many situations where the use of such information could be deemed unethical.
In terms of the Indicators project, where we are endeavouring to provide research on how students are using the LMS, I do not see any issues relating to privacy, as the identity of individuals is not disclosed and cannot be inferred. The argument about individuality is more complex, as it relates to how the information is used. The following is taken from the privacy statement on the Australian Privacy Commissioner’s web site:
When an individual looks at our website, our internet service provider (WebCentral) makes a record of the individual’s visit and logs (in server logs) the following information for statistical purposes:
- the individual’s server address
- the individual’s top level domain name (for example .com, .gov, .org, .au, etc)
- the pages the individual accessed and documents downloaded
- the previous site the individual visited and
- the type of browser being used.
We do not identify users or their browsing activities except, in the event of an investigation, where a law enforcement agency may exercise a warrant to inspect the Internet service provider’s server logs. (http://www.privacy.gov.au/component/content/article/545#mozTocId230471)
This seems to be a standard inclusion in the privacy statements of most governments and organisations, including CQUniversity. There appear to be very few privacy statements attached to web sites that actually spell out how the information will be used, but I suspect that most of it will be used either to improve services and processes or simply as a marketing intelligence tool.
What do you think?
Wel, L. v., & Royakkers, L. (2004). Ethical issues in web data mining. Ethics and Information Technology, 6, 11.