Ethical considerations with data
What is data ethics?
“Data ethics is a branch of ethics that evaluates data practices with the potential to adversely impact people and society – in data collection, sharing and use.” (The Open Data Institute)
Data ethics isn’t just about personal data
- Data ethics concerns itself with both personal and non-personal (public) data.
- Making public data accessible only through online services, without providing offline alternatives, can exclude people who lack internet access, especially in poorer areas, and can widen existing inequalities.
Data ethics isn’t just about compliance with the law
- Some data activities can be lawful but not ethical.
- Think of the “Emotional Contagion” study by Cornell University and Facebook, which tested whether emotions spread between users by reducing the number of posts containing either positive or negative words in the news feeds of around 700,000 Facebook users, without asking for their explicit consent.
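- Think of the Cambridge Analytica controversy, in which data from millions of Facebook profiles was harvested and used for political profiling without the users’ consent.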
Data ethics isn’t just about how data is used
- Data ethics also applies to how data is collected and shared.
- Not collecting data about certain groups of people is a problem, but collecting data only about certain groups can be worse, because it creates a higher risk of discrimination and profiling.
Data ethics isn’t about restricting access to data
- It can be argued that an ethical approach to data would lead to more openness because people will be more willing to trust entities with their data if they know that the data is collected, shared, and used ethically.
- If everyone could trust everyone else to be ethical with data, we would see more organizations opening up their datasets instead of restricting access to them.
Data ethics and data-driven decision-making
- Good data ethics practices support data-driven decision-making by addressing people’s fears about how data will be collected, maintained, and used. When those fears are addressed, people are more likely to share their data, which in turn supports effective decision-making.
Data bias
Data bias is an issue in data ethics. It refers to the way human biases are reflected, propagated, and amplified through data.
Some examples where bias can arise include:
- Survey questions constructed with a particular intent or framing
- Selective collection of data from a particular group
- Underlying bias in the data sources
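One simple way to spot the second kind of bias, selective collection, is to compare how often each group appears in a sample with its share of the wider population. The following is a minimal sketch rather than an established method: the function name `representation_gap`, the group labels, and all the numbers are hypothetical and made up purely for illustration.

```python
from collections import Counter


def representation_gap(sample_groups, population_shares):
    """Return (sample share - population share) for each group.

    A positive value means the group is over-represented in the sample;
    a negative value means it is under-represented.
    """
    total = len(sample_groups)
    if total == 0:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    # Include groups that appear in only one of the two inputs.
    groups = set(population_shares) | set(counts)
    return {
        group: counts.get(group, 0) / total - population_shares.get(group, 0.0)
        for group in groups
    }


# Made-up example: a survey that reached mostly urban respondents,
# drawn from a population assumed to be split evenly.
sample = ["urban"] * 80 + ["rural"] * 20
population = {"urban": 0.5, "rural": 0.5}

gaps = representation_gap(sample, population)
for group in sorted(gaps):
    print(f"{group}: {gaps[group]:+.2f}")
```

In this made-up example the urban group is over-represented by 30 percentage points and the rural group is under-represented by the same amount. A gap like this does not prove that conclusions drawn from the data are biased, but it is a signal that the collection process deserves a closer look.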