How data analysis can discriminate
The British mathematician Sir William Thomson is credited with saying that what can be measured can be improved. This idea, something of a mantra for professionals who work with data, underlines the importance of grounding every analysis in data.
However, just because something is data-driven doesn’t mean it’s objective. As Gemma Galdón, founder of Ética Consulting (a firm that audits algorithms so that companies are aware of the biases in the data they work with), explains, algorithms tend to reward the norm. And when the data is historically biased, we perpetuate the errors we started from.
We generate more and more data, and companies analyze it thoroughly in order, they say, to make better decisions (based on the evidence the data provides rather than on intuitions, which can be right or wrong). The goal is then to present the results in a friendly graphic form from which conclusions can be drawn and actions taken without much further explanation.
But what happens when the data used in the analyses is biased? And what if, to display it, we choose shapes, colors, labels and words that amplify these underlying flaws?
The power of details
Those of us who use language as a work tool know how much it matters to choose one word over another. Asking someone “Do you understand me?” is not the same as asking “Have I explained myself?”
Something similar happens when presenting analyzed data: choosing certain colors, shapes or chart types is not the same as choosing others.
For this reason, there are growing calls for equality and equity to be built into every analysis, and into its subsequent visualization, from the start. The idea is to be aware of possible baseline errors so that, even when the underlying data is biased, they can be corrected rather than amplified.
In fact, one of the first recommendations is to examine the data set and understand where it comes from, checking which groups are included and which are left out.
It is also necessary to look at how and why the data was collected, and whether any groups stand to benefit from or be harmed by that collection.
If some groups turn out to be missing from the data, a good practice is to add notes pointing out that the data is not inclusive or representative of them.
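As a rough illustration of that first check, the sketch below compares each group's share in a data set with its share in a reference population and lists the groups that are missing or clearly under-represented, which are the ones that would deserve a note. The function name, the 5-point tolerance, the group labels and all the figures are assumptions invented for this example; they do not come from any report.

```python
from collections import Counter


def representation_gaps(records, group_key, reference_shares, tolerance=0.05):
    """Return {group: (share_in_data, reference_share)} for every group that is
    absent from the data or falls short of its reference share by more than
    `tolerance`."""
    counts = Counter(
        r[group_key] for r in records if r.get(group_key) is not None
    )
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        # Flag groups that are entirely missing as well as under-represented ones.
        if observed == 0 or expected - observed > tolerance:
            gaps[group] = (round(observed, 3), expected)
    return gaps


# Hypothetical records and reference shares, invented for illustration only.
records = (
    [{"group": "Group A"}] * 70
    + [{"group": "Group B"}] * 28
    + [{"group": "Group C"}] * 2
)
reference = {"Group A": 0.50, "Group B": 0.30, "Group C": 0.15, "Group D": 0.05}

gaps = representation_gaps(records, "group", reference)
for group, (observed, expected) in gaps.items():
    print(f"Note: {group} is {observed:.0%} of the data "
          f"but {expected:.0%} of the reference population.")
```

With these made-up numbers, Group C and Group D are the ones flagged. In practice, the reference shares would have to come from a source the analyst trusts, such as a census, and the tolerance is a judgment call.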
In addition, especially when the aim is to display the data in a friendlier way that everyone can understand, experts point to a set of basic good-faith principles that should be followed to avoid further equality and equity mistakes.
Basic principles to keep in mind
Broadly speaking, according to the report “Do No Harm” by the Urban Institute, there are three main considerations to take into account when presenting any data-based analysis.
First, use language that everyone can understand and that puts people first. In other words, a label like “black people” is more inclusive than “black”, because it centers the analysis on people, not on the color of their skin.
In addition, the order of labels and responses should be chosen deliberately; otherwise, we end up reproducing historical biases. Instead of an order that reinforces “white” and “male” as the default categories, it is recommended to order labels by sample size or by the magnitude of the results.
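The ordering recommendation can be sketched in a few lines. The example below sorts groups from largest to smallest by a chosen field, breaking ties alphabetically so that the order never depends on how the data happened to be entered. The function, the field names and the figures are hypothetical and serve only as illustration.

```python
def order_groups(results, by="sample_size"):
    """Return the rows sorted largest-first by `by`, with ties broken
    alphabetically by label."""
    return sorted(results, key=lambda row: (-row[by], row["label"]))


# Hypothetical survey results, invented for illustration only.
results = [
    {"label": "Group A", "sample_size": 120, "rate": 0.31},
    {"label": "Group B", "sample_size": 480, "rate": 0.27},
    {"label": "Group C", "sample_size": 300, "rate": 0.42},
]

by_size = [row["label"] for row in order_groups(results, by="sample_size")]
by_rate = [row["label"] for row in order_groups(results, by="rate")]

print("By sample size:", by_size)
print("By magnitude:  ", by_rate)
```

Either criterion gives the reader an order justified by the data itself rather than by habit.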
Stereotypes even on display
Last but not least, care must be taken when choosing the colors, icons and shapes used to display the data. We must avoid stereotypes such as pink for women and blue for men, or a woman pictured as a nurse and a man as a doctor.
For example, bar charts are useful for displaying categorical data, such as data by race or gender, because they allow a direct comparison between groups. A horizontal bar chart makes it easy for viewers to compare bars and see which are longer and which are shorter.
Vertical grid lines provide a quick reference for estimating length and comparing differences, and labels provide percentages or exact numbers that are particularly helpful when comparing groups with smaller differences.
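To make the idea concrete, here is a minimal plain-text sketch of a horizontal bar chart: bars are sorted from longest to shortest and each one carries its exact percentage. It is only an approximation of what a plotting tool would produce. Grid lines are left out, bar lengths are scaled to the largest value, and the function name, group labels and shares are all invented for this example.

```python
def text_bar_chart(values, width=40):
    """Render {label: share} as horizontal text bars, longest first,
    each followed by its exact percentage."""
    if not values:
        return ""
    label_width = max(len(label) for label in values)
    largest = max(values.values())
    lines = []
    for label, share in sorted(values.items(), key=lambda kv: -kv[1]):
        # Scale each bar relative to the largest value in the chart.
        bar_len = round(width * share / largest) if largest else 0
        lines.append(f"{label:<{label_width}} | {'#' * bar_len} {share:.1%}")
    return "\n".join(lines)


# Hypothetical shares, invented for illustration only.
shares = {"Group A": 0.25, "Group B": 0.50, "Group C": 0.125}
print(text_bar_chart(shares, width=20))
```

The printed labels matter most when two bars are close in length, because the exact figures settle what the eye cannot.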
The Urban Institute, which produced this report in collaboration with Tableau (a data visualization tool), is a nonprofit research organization that provides data to help promote upward mobility and equity.
To help data analysis and the resulting visualizations have a positive impact on fairness and equality, the organization provides a checklist so that those responsible for each visualization can verify whether they are amplifying certain biases.