We look at how data bias can derail your digital transformation plans, and what you can do to avoid it
Data bias poses a huge danger to charities. That’s because it can place in jeopardy many of the benefits that they hope to enjoy by undergoing digital transformation and becoming more data-driven. For that reason, it is vital that charities understand data bias and take steps to ensure that their data and its use are free from it.
Put simply, data bias occurs when some of the data your charity collects or uses is more heavily represented or weighted than it should be. That means the data is unrepresentative of the world as it really is. It often occurs because humans have conscious or unconscious biases, and therefore data collected or used by humans often reflects those biases.
When charities rely on biased data to make data-driven decisions, the results are often not as desired or expected. It could mean the difference between a highly successful fundraising campaign and one which ends up losing your charity money.
To get a clearer understanding of data bias, here are some of the most common types that charities might encounter, plus some tips for dealing with them:
Confirmation bias occurs when you give greater weight to things that confirm your existing beliefs, and everyone is susceptible to it, often unconsciously. Sometimes you may even discount completely or ignore data that goes against what you believe. The result is that you end up working with data that tells you what you expected to discover, rather than what is actually happening.
For example, you might have a preconception – consciously or otherwise – that homelessness mainly occurs in a certain part of the country. So when collecting data about homelessness you might concentrate on those areas and neglect to collect data as diligently in other areas, or perhaps discount data in other areas as temporary and therefore not worth including. The result is that the collected data will not be representative of the national picture.
Preventing Confirmation Bias: It’s a good idea to write down what you expect the data to show before you start collecting it. Then try to avoid choosing data sources simply because they are likely to confirm your beliefs. It can also be helpful to imagine that your data will show you something different, and then see how your data collection methods might differ.
Selection bias may be present if you use a small data set or if you collect data that is simply not representative of whatever you are trying to understand.
For example, a charity may wish to know what types of art to display in a gallery that it runs, but only ask people who live in a particular area, or who are members of a particular ethnic group. This data would then effectively ignore those who may live in less affluent areas or who belong to different ethnic groups.
Preventing Selection Bias: The key to avoiding this type of bias is to ensure that you draw data from a representative population of people. In the example above, this might involve asking a random sample of people who live within visiting distance of the gallery. But the temptation may be to do something more convenient, like asking people who are on the gallery’s mailing list – who may be predominantly from one socio-economic group.
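As a rough illustration of the random-sample approach, here is a minimal Python sketch. The resident list, the mailing list and the sample size are all made-up assumptions for the example, not real data or a recommended survey design.

```python
import random

# Hypothetical population: 1,000 people living within visiting distance
# of the gallery. In practice this would come from your own records.
residents = [f"resident_{i}" for i in range(1000)]

# Hypothetical mailing list: the convenient group it is tempting to ask.
mailing_list = set(residents[:150])

# Draw a simple random sample so that every resident has the same chance
# of being asked, whether or not they are on the mailing list.
rng = random.Random(42)  # fixed seed so the example is repeatable
sample = rng.sample(residents, k=100)

# Count how many sampled people also happen to be on the mailing list.
on_list = sum(1 for person in sample if person in mailing_list)
print(f"{on_list} of {len(sample)} sampled residents are on the mailing list")
```

If nearly everyone you planned to survey is on the mailing list, that is a sign the survey may be leaning on the convenient group rather than the wider community.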
Victory bias, more commonly known as survivorship bias, is one that it is easy for charities to succumb to. It involves focusing on prominent success stories and ignoring the less visible data about everyone else.
For example, a charity may collect some examples of people who rarely attended school, but ended up running a successful business. The temptation is to conclude that you don’t need to go to school to become a successful business person. But this ignores the many people who rarely attended school and did not end up so successful even though this number may be much higher – it only takes account of the “victors”.
Preventing Victory Bias: To prevent this type of bias, charities need to look at the whole picture and ensure that “victors” are not given undue weighting. If victors are only 1% of the total population then the majority of attention should be on data about people who rarely attended school and ended up in less enviable situations.
Average bias often occurs as a result of a failure to appreciate how data can be distributed around what is often known as an average but which should more accurately be called a mean.
For example, you could take the average household disposable income in two different areas, and conclude that the area with the lower mean disposable income is the one that requires more help from your charity.
But the inhabitants of the lower mean income area might have incomes that are all very close to the mean, while in the higher mean income area there might be a few people with very high incomes, and a much larger group with low incomes that really need help from your charity.
Preventing Average Bias: Rather than looking at the mean average, it can often be helpful to look at another type of average called a median. This is the middle value in a data set, and if the mean and the median are very different then this indicates that relying on the mean is likely to bring in considerable Average Bias.
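To make the comparison concrete, here is a minimal Python sketch using the standard library's statistics module. The income figures for both areas are invented purely for illustration.

```python
from statistics import mean, median

# Hypothetical weekly disposable incomes (in pounds) for two made-up areas.
area_a = [290, 300, 305, 310, 295, 300, 300, 300]    # everyone close to the mean
area_b = [120, 130, 110, 125, 115, 135, 1500, 1600]  # a few very high incomes

def summarise(incomes):
    """Return the mean, the median, and the gap between them."""
    m = mean(incomes)
    med = median(incomes)
    return {"mean": m, "median": med, "gap": m - med}

summary_a = summarise(area_a)
summary_b = summarise(area_b)

print("Area A:", summary_a)
print("Area B:", summary_b)
```

With these made-up figures, Area A has the lower mean (300 against 479.375), yet Area B has by far the lower median (127.5 against 300). The large gap between Area B's mean and median is the warning sign that a few high incomes are hiding a larger group who may need help.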
Historical bias is self-perpetuating. It involves using historical data that was representative of some period in the past. Since things change over time, this data may no longer be representative of the current situation.
For example, historical data may show that unmarried mothers were likely to need certain types of help, because in the past they were likely to be single and their opportunities for employment were limited.
But this is no longer the case: marital status is largely irrelevant when it comes to parenthood today as more people choose to live together before, or without, getting married. That means that unmarried mothers today may not need the same types of help as historical data may appear to suggest.
Preventing Historical Bias: Historical bias can be tricky to catch, because preventing it involves identifying and acknowledging biases in historical data sources and, once identified, refraining from using these sources.