merit and demerit

When I graduated many years ago, the hall was filled with people from around the world: different colours, different races, different religions, as many women as men. Now some 30 years later, I’m often the only woman in a room full of white men. When I ask my peers where all the diversity has gone, they shrug their shoulders and say “We appoint on merit”.

Actually, they’re wrong. We don’t appoint on merit. We appoint on metrics.

rethinking merit and metrics

The accepted norms of the higher education workplace are an obsessive focus on a very narrow set of metrics as a proxy for merit, a high attrition of women, a lack of diversity in leadership, and sometimes the development of toxic unwelcoming workplaces.

We need to rethink how we measure merit and we need to consider demerit too so that we can be confident that the people we invest with power, leadership and decision-making are not sexist, racist, homophobic or bullies.

To start, we need to look at what we mean by merit. The dictionary defines merit as the “quality of being particularly good or worthy, especially so as to deserve praise or reward”. I don’t think anyone would argue that we shouldn’t appoint leaders on merit using that meaning. In a society that is diverse, like ours – 50% women, multicultural – you would expect that merit, and the power and leadership earned as a result, would be evenly distributed across diverse demographics. But it’s not.

For some reason, promotion on merit does not give everyone a fair go. Leadership, power and decision-making are concentrated almost entirely in a narrow demographic: old white men.

Yet the data show that leadership teams with greater diversity and differing life experience generate better outcomes. More diversity provides a competitive edge. If we focus on gender, for example, companies with more women on their boards make larger profits. Really, investors should only support companies with women CEOs. They’d make a lot more money. What about research? Well, teams of mixed gender produce higher-quality research, and a higher proportion of women increases a team’s collective intelligence. What’s more, when organisations improve things for women they make things better for everyone, by increasing access to parental leave, flexible work practices and better work-life balance.

The attrition of diversity impacts negatively on productivity and innovation in academia. Yet when the dominant group are challenged about the lack of diversity in senior academic positions, their defence often focuses on the word “merit”. When we probe further, we find that merit here actually means metrics. Most importantly, we don’t measure demerit at all. Let me explain.

easy to measure metrics

To assess merit in academia, we measure a few specific things: the number of publications, the number of grants, the number of PhD students. These numbers focus on a very narrow selection of the things that people and universities do. And it is no coincidence that these metrics are also easy to measure. The problem comes when we use these “easy to measure” metrics as a proxy for merit. We have now evolved ever more cryptic numbers (H-index, impact factor, etc.) that mean nothing to those outside the sector but which are avidly pursued within the sector, almost to the exclusion of everything else. The higher the numbers, the better and more valued you are. We chase after these metrics – but do they really measure what we should be measuring?

personal qualities we value

creativity   critical thinking   resilience   motivation   persistence   curiosity   endurance   reliability   enthusiasm   empathy   self-awareness   self-discipline   leadership   courage   civic-mindedness   compassion   integrity   resourcefulness   honesty   sense of beauty   sense of wonder   generosity   humour   humility   kindness   consideration   authenticity   care

(Collated by US education policy researcher Gerald Bracey, with a few extras that I threw in)

In my opinion, it is the above list of personal qualities that should be considered when rewarding merit and choosing leaders. Yet none of these are measured directly and most are not measured at all when we assess the merit of people and higher ed institutions using current metrics. That means there is a disconnect between the metrics we use and the actual merit of a person or an institution based on these qualities.

We need new metrics. Metrics that value personal qualities. We should not measure how many PhD students an institution produces, we should measure how well an institution supports their PhD students. Universities should be assessed on how inclusive they are, how diverse their senior executive is, and how well they support the work-life balance of their staff. After all, university rankings are meant to help students and staff identify the best places in the world to study, work and do research. That should mean measuring which universities provide the safest and most supportive workplaces where everyone – not just those who fit into a very narrow demographic – can succeed. Professors should be assessed, for example, on how well they sponsor and mentor others to achieve research, teaching and service goals (with more weighting given to supporting diversity), not how many people are in their group or how much money they have received in grants.

And then we also need to look at the other side of the coin.

demerit 

The dictionary defines demerit as a “fault or disadvantage”, or “a mark awarded against someone for a fault or offence”. When we measure the worth and value of someone or some institution, we ought to consider demerit alongside merit. When a professor tells a sexist, racist or homophobic joke, that should count as a demerit. When a university supports or organises a conference with an all-white, all-male list of speakers, that must count as a demerit.

Our current focus on a very narrow set of metrics as a proxy for merit sometimes leads to or supports selfish, unprofessional or even unethical behaviours that can generate toxic workplaces. Harassment is one such toxic behaviour that pushes women out. In a recent study, 64% of scientists surveyed about their experience on field trips reported sexual harassment; 22% reported sexual assault. The majority of those reporting harassment and assault were young (undergrads, postgrads, postdocs) and female. The perpetrators were predominantly male and senior. The power differential makes it very difficult for the victim to report the bad behaviour; the perpetrator may be a highly respected person with huge metrics. They are “too valuable” to lose, too powerful to challenge. The power differential silences and shames the victim. Even when unethical behaviour is reported it may not be dealt with appropriately.

Sometimes I wish there were a Demerit App – one that silences and shames the bully, harasser, or predator. So that when a married male professor won’t stop looking down the shirt of a female postdoc, she can press the thumbs down button against the professor’s name. The professor would be denied access to his laptop and portable electronic devices for an hour. If two or more people activate the app, the professor would be locked out for an even longer time and a message sent to the supervisor who would need to take action or they too would earn demerit points. Demerit points would accumulate for each individual and for each institution and would be deducted from the metrics used to calculate a person’s merit and a university’s international ranking.
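Purely as a thought experiment, the app’s escalation rule could be sketched in a few lines of code (every name, threshold and lockout length below is invented for illustration):

```python
from collections import defaultdict

# Hypothetical escalation rule for the imaginary Demerit App:
# one report locks the offender out of their devices for an hour,
# two or more reports trigger a longer lockout and notify the
# supervisor; points accumulate per person and per institution.
LOCKOUT_HOURS = {1: 1, 2: 8}  # reports -> lockout length (made up)

demerit_points = defaultdict(int)

def thumbs_down(offender, institution, reports):
    """Record demerit points; return (lockout hours, notify supervisor?)."""
    demerit_points[offender] += reports
    demerit_points[institution] += reports
    hours = LOCKOUT_HOURS[min(reports, 2)]
    return hours, reports >= 2

hours, notify = thumbs_down("professor_x", "university_y", reports=2)
print(hours, notify)  # 8 True
```

The accumulated `demerit_points` would then be deducted from whatever merit metrics the person and the institution are judged on.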

it’s time for change

We are now well into a new millennium. But we are stuck in the stereotypes of the past. This roadblock is limiting our decision-making, our progress, our innovation. To move forward, we need to challenge the current norms; define merit much more broadly; measure qualities we value in people but which are hard to measure; and we must value ethical behaviour. Most importantly, we need to assess demerit alongside merit to gauge the true worth of a person or an institute. This way we can bequeath new models of success and leadership to the next generation to help fix the problems we have inherited from the past.

In this revolutionised workplace, academics with integrity, empathy, respect and compassion – as well as critical thinking and creativity – will be rated highest and valued most of all.

______________________________________________________________

This post is based on a TedX talk I gave at the University of Queensland on 23 May 2015. The video is here. (updated with new link on 10 Jan 2016)

how to measure a professor

“Many of those personal qualities that we hold dear… are exceedingly difficult to assess. And so, unfortunately, we are apt to measure what we can, and eventually come to value what is measured over what is left unmeasured. The shift is subtle and occurs gradually”. So wrote Robert Glaser of the USA National Academy of Education in 1987.

Those words – written about the standardised tests used in American schools in the 1980s – ring so true today for the way we assess academics. The things we tend to measure, because they are easy to measure, are things like publication numbers, impact factors, H-index (regrettably not the Happiness index), citations, grant income. And we tend to value most those who have big grants and papers in big name journals. Are we “driving out the very people we need to retain: those who are interested in science as an end in itself…“? Is the current “Impact factor mania (that) benefits a few” forcing academics to participate in a “winner-takes-all economics of science“? Is the “tournament” competition model ruining science by adversely affecting research integrity and creativity? Have we fallen into the trap Glaser warned of: do we now value what we can measure at the cost of losing what is actually most valuable?

Inspired by Glaser, education policy researcher Gerald Bracey generated a list:

Personal Qualities NOT measured by Standardised Tests

creativity   critical thinking   resilience   motivation   persistence   curiosity   question-asking   humour   endurance   reliability   enthusiasm   civic-mindedness   self-awareness   self-discipline   empathy   leadership   compassion   courage   sense of beauty   sense of wonder   resourcefulness   spontaneity   humility

Do metrics for academics assess these qualities? In some respects, they do. Publications and grant success require a level of creativity, critical thinking, motivation, persistence, curiosity, question-asking, enthusiasm. But at best they are a proxy measure. And there are deeper issues. Counting grant income as well as scientific publications – well that’s double-dipping. What’s more, current metrics completely ignore many key responsibilities expected of academics. Committee work. Conference organisation. Reviewing. Mentoring. Outreach. I’ve been fortunate to work with fantastic supervisors and collaborators – people I trust, respect and like – but that’s certainly not everyone’s experience in academia. How do we ensure that academics with integrity, empathy, humility and compassion – as well as leadership, critical thinking and creativity – are rated highest and valued most of all if these personal qualities are not assessed or incentivised? In my mind, the best metrics would (1) enable a fair assessment relative to opportunity, (2) assess more of the duties expected of academics and (3) report on the personal qualities we hold dear in people we want to work with.

To address point 1, the metrics for those who have made it – full professors – ought to be different from those we use to assess academics still in the pipeline.

How might we measure a professor? Well let’s imagine a few more new metrics…..

Publication Efficiency. Currently we focus heavily on three metrics: publication quantity, publication quality and grant income – and “more is better”. Professors are expected to secure competitive grants, attract junior researchers (many bringing in their own competitive fellowships) and train scholarship-funded students. The more dollars pulled in (grants, scholarships, fellowships), the more people in the team, and hence the more outputs generated. But large teams are not necessarily better. How productive has the team leader been with those funds? Using the publication efficiency (PE) metric, publication metrics are weighted by income:

PE = PO/RI

where PO is a measure of publication output over the past 5 years (e.g. POc could be total citations over the past 5 years, POn the number of publications over the past 5 years, and so on) and RI is the total research income over the past 7 years (that is, the certified total dollar value of all grants, all scholarships and all fellowships to all team members over that time). Seven years is chosen for the research income aggregate, rather than five, because it takes time to turn funding into scientific publications. The higher the PE, the better.
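As a concrete sketch (the function name and all figures below are invented for illustration), computing PE with citations as the output measure might look like:

```python
def publication_efficiency(publication_output, research_income):
    """PE = PO / RI.
    publication_output: e.g. total citations over the past 5 years (POc).
    research_income: certified total dollar value (here in $M) of all grants,
    scholarships and fellowships to all team members over the past 7 years."""
    if research_income <= 0:
        raise ValueError("research income must be positive")
    return publication_output / research_income

# Hypothetical professor: 800 citations in 5 years, $4M income over 7 years.
pe = publication_efficiency(publication_output=800, research_income=4.0)
print(pe)  # 200.0 citations per $1M
```

The units are arbitrary; what matters is comparing like with like across candidates.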

Sponsorship Index. One of the most important roles a professor can take on is training the next generation of research leaders. Trouble is, the way we rank and assess academics leads to a hypercompetitive environment. Take for example publications, the major currency of academia. The senior author position on papers is highly coveted because it identifies the intellectual leader of the research. Future grant success (= future survival) for senior academics requires senior author papers – and the more the better. A well-established professor, leading a large group, traveling extensively and with a large admin/committee/teaching load, relies on mid-career researchers within the team to generate ideas, direct the day-to-day research, train students, analyse results, write the papers. Yet the way the system works at present, the professor needs to take the senior author positions on papers. This is justified because the work was done in the professor’s lab, using equipment or protocols they established and using grant money they brought in to cover the salaries of the team members. The sponsorship index, SI, changes the incentives. It rewards professors for supporting mid-career researchers in a team:

SI = (SAS+2M+4A) / N

where N is the total number of papers from the team in the past 5 years, SAS is the number of papers over that time for which senior authorship was shared between the professor and a team member, M is the number of papers where the professor was middle author and a team member was senior author, and A is the number of papers where a team member is senior author and the professor is gratefully thanked in the acknowledgements (and not by inclusion in the author list). Requiring that a professor maximise their sponsorship index will place greater emphasis on selflessness and in turn this will help ensure career development of the next generation of academics.
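A minimal sketch of the SI calculation (the paper counts below are hypothetical):

```python
def sponsorship_index(sas, m, a, n):
    """SI = (SAS + 2M + 4A) / N.
    sas: papers with senior authorship shared between professor and team member
    m:   papers with the professor as middle author and a team member senior
    a:   papers with a team member senior and the professor only acknowledged
    n:   total papers from the team over the past 5 years"""
    if n <= 0:
        raise ValueError("team must have at least one paper")
    return (sas + 2 * m + 4 * a) / n

# Hypothetical team: 20 papers, of which 6 shared-senior, 4 middle-author,
# 2 acknowledgement-only.
si = sponsorship_index(sas=6, m=4, a=2, n=20)
print(si)  # (6 + 8 + 8) / 20 = 1.1
```

The weights (2 and 4) encode the increasing selflessness of each category; a professor who always takes sole senior authorship scores zero.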

Good Mentorship Score. Following on directly from sponsorship is mentorship. Using current metrics “whether you are the best or worst mentor is irrelevant“. But it’s hardly irrelevant to potential team members and colleagues. How can a PhD student or postdoc find out if a professor is a person they can rely on to help them achieve their career goals (whatever they may be)? Horror stories abound of professors who treat team members appallingly (toxic academic mentors). Sadly, despite university policies that prohibit these behaviours, it’s usually the victims that suffer most. People in positions of power above the professor may not be aware of the problem (asshole behaviour is usually directed downwards), or may have an inkling but the grant income and papers generated by the professor are too valuable to risk losing. So how to address this? My solution – get references. From former team members. HR can provide a random selection of 10 diverse former team members (e.g. male/female, PhDs/postdocs, different ethnicities). These referees then use a 5-point scale, where 1 is strongly disagree and 5 is strongly agree, to rate the professor against various statements. You know the sort of thing: “My ideas for developing my research were respected and valued”, “I felt included and appreciated as a team member”, “My goals as a researcher and a person were supported”, “The professor was someone I respected and trusted and want to be like”, “I was confident to speak to the professor about issues that arose regarding my work-life balance”, “I was encouraged to explore career options outside the traditional academic path”. Perhaps we should also poll mid-career colleagues in the same school – for example “The professor actively helps more junior colleagues develop their career”, “The professor takes on a fair and equitable teaching and committee workload”, and “The professor is a positive and encouraging role model”.
To generate the good mentorship score (GMS), the scores are averaged across all questions and all reviewers. The GMS can then be used in discussions at performance reviews and considered in a mentoring component of track record assessments for grants and fellowships.
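Averaging the referees’ ratings is straightforward; here is a sketch (referee labels and scores are made up):

```python
def good_mentorship_score(ratings):
    """GMS: mean of all 1-5 ratings, across all statements and all referees.
    ratings maps each (anonymised) referee to a list of per-statement scores."""
    scores = [s for referee_scores in ratings.values() for s in referee_scores]
    if not scores:
        raise ValueError("no ratings supplied")
    return sum(scores) / len(scores)

# Three hypothetical former team members rating four statements each.
gms = good_mentorship_score({
    "referee_1": [5, 4, 5, 4],
    "referee_2": [3, 4, 4, 5],
    "referee_3": [5, 5, 4, 4],
})
print(round(gms, 2))  # 4.33
```

In practice one might also keep a per-statement breakdown, so a low overall score can be traced to, say, work-life balance rather than research support.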

Civic-Mindedness Tally. Academics are expected to do much more than research and teaching – though it is research and (to a lesser extent) teaching that are assessed, measured to the nth degree, and valued most highly. Those other things we do – contributing to department/institute committees, professional societies, conference organisation, peer review and community outreach – are difficult to measure, so they tend not to be measured or assessed and therefore are not valued highly. The civic-mindedness tally (CMT) ensures that outstanding professorial citizens, who give their time for the good of society, are recognised for their altruistic contributions. The CMT is simply a sum, for each year over the past 5 years, of each certified committee, representative role, organisational appointment, grant review panel, editorial responsibility (see also academic karma for a new take on valuing peer review), science communication and community engagement activity – and yes, I think that should include blog posts 🙂 .
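If the service record were kept as simple (year, role) pairs, the tally is just a count (the roles below are invented examples):

```python
from collections import Counter

def civic_mindedness_tally(activities):
    """CMT: count of certified service contributions per year over the
    past 5 years. Returns (total, per-year breakdown)."""
    per_year = Counter(year for year, _role in activities)
    return sum(per_year.values()), dict(per_year)

# Hypothetical five-year service record.
total, by_year = civic_mindedness_tally([
    (2011, "faculty teaching committee"),
    (2012, "grant review panel"),
    (2013, "conference co-organiser"),
    (2014, "journal editorial board"),
    (2015, "science blog posts"),
    (2015, "school outreach visits"),
])
print(total, by_year)  # 6 {2011: 1, 2012: 1, 2013: 1, 2014: 1, 2015: 2}
```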

I know, it’s too simplistic. But it’s better than nothing, which is what we do now. On its own, a high CMT won’t lead to *favourite* status for a professor. But in combination with current metrics, and the metrics described above, it should do wonders for improving the Happiness Index of institutions.

There you have it. That’s my philosophy for how we should measure a professor. It’s only a start, and no doubt there are many things that could be improved or are still missing (for other ideas see roadmap to academia beyond quantity and is competition ruining science?). So now over to you: what are the measures you think should be implemented to assess the qualities that really matter in our professoriate?