Analytics and complexity

This post is a quick summary of the paper and presentation that we did for ASCILITE2012 in lovely Wellington, New Zealand. Basically the paper introduced the concept of analytics as information arising from interactions occurring within a complex adaptive system. You can read the full paper here.

Some definitions

  • Managerialism. Universities are increasingly managed as if they were businesses in a competitive marketplace. Accountability for public funding requires the rational allocation of resources and the intentional management of change. This teleological approach to the management of universities is known as managerialism, and its influence has extended to how universities manage their learning and teaching.
  • Educational data mining. “Educational Data Mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in.” (George Siemens, 2011)
  • Academic analytics. This is the use by universities of data collected through educational data mining. It marries statistical techniques and predictive modeling with the large data sets collected by higher education institutions (HEIs), including those from learning management systems. Academic analytics has been described as business intelligence for HEIs and is focused on the needs of the institution, such as recruitment, retention and pass rates. (Open University, 2012)
  • Learning analytics. Learning analytics also uses data developed through educational data mining, but it is more focused on better understanding and optimizing learning and the learning environment. According to George Siemens (2011), learning analytics is “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs”.

The Indicators Project

The Indicators project is an analytics project that has been running at CQUniversity since 2008. The project started when the members were responsible for supporting academic staff with their use of what was then the Blackboard learning management system. We found that the activity database table that holds a record of every staff and student click within the system had never been cleared. So we started looking at correlations between student activity within the LMS and their resulting grades. While these correlations based on aggregate data are somewhat interesting, their utility is perhaps limited, as we will show later.

Some simple patterns

The following charts simply correlate student activity within the LMS with their resulting grade. Students on the horizontal axis are grouped by the final grades they received. Note that at CQUniversity, the grades are: HD=high distinction, D=distinction, C=credit, P=pass, F=fail, WF=withdraw fail. These charts are simple examples from the hundreds that we have developed over the last four years.

Simple correlation

Student clicks on Moodle against the grade they received


The first day of Moodle access against their resulting grade


The number of question marks within Moodle forum contributions per grade group
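The grouping behind charts like these can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the student IDs and all the numbers are made up, and it is not the Indicators project's actual code or data. It groups students by final grade and averages their click counts.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (student_id, total_clicks, final_grade).
# Values are invented for illustration; they are not Indicators project data.
records = [
    ("s1", 820, "HD"),
    ("s2", 640, "D"),
    ("s3", 700, "D"),
    ("s4", 410, "C"),
    ("s5", 300, "P"),
    ("s6", 260, "P"),
    ("s7", 90, "F"),
    ("s8", 20, "WF"),
]

# CQUniversity grade labels, ordered from highest to lowest.
GRADE_ORDER = ["HD", "D", "C", "P", "F", "WF"]


def mean_clicks_by_grade(rows):
    """Return {grade: mean clicks} for the grades present, in GRADE_ORDER."""
    clicks = defaultdict(list)
    for _student, n_clicks, grade in rows:
        clicks[grade].append(n_clicks)
    return {g: mean(clicks[g]) for g in GRADE_ORDER if g in clicks}


summary = mean_clicks_by_grade(records)
for grade, avg in summary.items():
    print(f"{grade:>2}: {avg:.0f} clicks on average")
```

Note that each grade group collapses to a single number here, which is exactly the kind of abstraction the later sections urge caution about.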

Analytics as the next big thing

We are noticing a large increase in the amount of hype around learning analytics as evidenced by the following:

“BIG data sets showing what students do online may prove as vital to education as genome databases have been to genetics or Europe’s Large Hadron Collider to physics” (The Australian, 15th September, 2012)

“EDUCAUSE and the Bill and Melinda Gates Foundation have targeted learning analytics as one of 5 categories for funding initiatives” (Educause, 2012)

“Learning analytics promises to harness the power of advances in data mining, interpretation, and modeling to improve understandings of teaching and learning, and to tailor education to individual students more effectively” (Horizon report, 2011)

We urge some caution, as there are well-known cycles associated with the hype around new educational technologies. It also seems to us that many are reporting on the amazing potential of analytics without a corresponding balance of healthy skepticism.


Some potential problems

Based on our experience with the Indicators project we have identified a number of likely problems that universities will face with their analytics projects. These are:

  • Abstraction losing detail
  • Organisational structures
  • Confusion between correlation and causation
  • Assumptions of causality

Abstraction losing detail

We think Gardner Campbell summed this up nicely in his presentation to LAK12.

“…the nature of learning analytics and its reliance on abstracting patterns or relationships from data has a tendency to hide the complexity of reality” Gardner Campbell (2012)

For example, consider the following chart, which shows the correlation between student posts and replies in the LMS discussion forums and their resulting grades.


And compare this to the average number of student forum contributions for each course across an academic year


Or even the number of forum posts and replies for a single high achieving student


Our experience with the Indicators project is that the devil is very much in the detail when it comes to aggregated analytics data: the data aggregations we see at the macro level of analysis don’t really help us a great deal at the micro levels (single courses, students, etc.).

Organisational structures

Most universities are structured in a very deliberate, reductionist way. People are organized into units based on their task or role within the university. For example, IT folk tend to live in the IT area, the finance folk live in the finance area, and so on. While these divisions between organizational units are imaginary, I’m sure most of us have experienced the frustration associated with a lack of cross-unit cooperation. The constant battle for budgets often leads to inter-departmental rivalry, which can hinder the cross-organizational collaboration that analytics requires.

For example, because we aren’t IT and it is unusual to give non-IT folk access to the backend databases of systems such as Blackboard, Moodle and PeopleSoft, we had considerable difficulty in getting permission to access these resources in order to pursue our analytics research. The silos within universities are potentially a very significant problem, as analytics is going to require a set of skills that, given typical university structures, doesn’t usually exist in a single department. For example, database administrators, educational developers and educational technologists do not typically belong to a single organisational silo.

Confusion between correlation and causation

Consider the simple pattern from before that correlated student forum posts and replies with their resulting grade.


Now look at the following figure, where the correlation did not necessarily hold true as we moved from the macro to the micro level:


The information we are extracting from learning environments stems from a very complex interplay of variables, and it would be dangerous to assume that what happened in the past is going to happen in the future.
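To make the macro-versus-micro point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the course names and the (forum posts, final mark) pairs are invented, not drawn from the Indicators data. It computes a Pearson correlation over all students pooled together, and then separately within each course.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical (forum_posts, final_mark) pairs per course; invented numbers.
courses = {
    "COURSE_A": [(2, 55), (4, 52), (6, 50)],     # little forum use overall
    "COURSE_B": [(20, 80), (22, 78), (24, 75)],  # heavy forum use overall
}

# Macro level: pool every student from every course.
pooled = [pair for pairs in courses.values() for pair in pairs]
pooled_r = pearson([p for p, _ in pooled], [m for _, m in pooled])

# Micro level: the same calculation inside each course.
per_course_r = {
    name: pearson([p for p, _ in pairs], [m for _, m in pairs])
    for name, pairs in courses.items()
}

print(f"pooled: r = {pooled_r:+.2f}")
for name, r in per_course_r.items():
    print(f"{name}: r = {r:+.2f}")
```

With these made-up numbers the pooled correlation is strongly positive while the correlation within each course is negative. Real data will rarely be this stark, but it shows how a pattern that looks distinct in aggregate can say little about any single course.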

Assumptions of causality

One of the more worrisome problems that I can foresee is the assumption that analytics data is based on causality. To be more precise, the risk is that management introduces key performance indicators based on analytics data that is aggregated and used without reference to the context in which it was gathered. We simply cannot assume that the data we are looking at represents a universal constant when in fact the underlying system is vastly more complex.

A possible path forward

So, from the perspective of using analytics to enhance learning and teaching, we are much less concerned with the retrospective data representations and interpretations at the macro level, even though the correlations at this level often appear to be quite distinct. We are aiming to focus more on the micro levels, which are more attuned to the context in which the data is being gathered. What we are talking about here is a bottom-up approach to the representation and interpretation of analytics data and, to some extent, this stands in opposition to the way that universities and their learning environments are currently being managed. We are proposing to do this by looking at analytics through the lens of complex adaptive systems.

Complex adaptive systems (CAS)

“A CAS is a dynamic network of semi-autonomous, competing and collaborating individuals who interact and co-evolve in nonlinear ways with their surrounding environment. These interactions lead to various webs of relationships that influence the system’s performance” (Boustani, 2012)

CAS are a variation on complex systems and have been described as systems that involve many components that adapt, learn or change as they interact. Each agent within a complex adaptive system is nested within other systems, all of which are evolving and interacting, such that we cannot understand any of the agents or systems without reference to the others. In simple terms, context is king when it comes to using analytics to improve learning and teaching, so we can’t easily interpret analytics data without reference to the context from which it is derived.

Where to from here?

So, given that analytics data is extracted from a complex array of interacting systems, we are intending to focus our efforts at the course/teacher/student levels. That way the people operating within the context are the people interpreting and making decisions based on the analytics data. While we see the importance of providing analytics-derived insights to students in the future, much like Purdue University has done with its Signals project, initially (given the constraints we currently have) it’s likely to be the teacher who has the right mix of closeness and knowledge about the context from which the analytics information is extracted. So we are aiming to provide the teaching academics with better information.

To borrow David’s car analogy, it’s about using analytics to augment the driver. In a lot of modern cars, when you get out and leave the lights on, they turn off the lights for you, or they give that annoying beep when you haven’t fastened your seat belt. The car is smart enough to help the driver out with the vehicle’s operation. These sorts of augmentations are what we would like to see within the LMS. We are concerned with using analytics to nurture evolutionary improvement from the present rather than rationally targeting some idealistic future state.

