Operation Love Serenade: Fighting corruption in Brazil with an open-source, Machine Learning-powered robot

Motivated to fight corruption in Brazil, tech-savvy citizens used crowdfunding to start a non-profit, open-source project that uses data science to monitor public spending.

Uncommon affair for robots, common affair for politicians

The Toblerone Affair. In 1995, the Swedish newspaper Expressen revealed that Mona Sahlin, who was then serving as Deputy Prime Minister, had used her professional credit card for personal purchases, including grocery shopping. The case became known as “The Toblerone Affair” because the famed chocolate bars appeared on her credit card statement. [1]

Serenata de Amor. Toblerone bars are relatively expensive in Brazil, so most people resort to cheaper alternatives to satiate their cravings, such as the Serenata de Amor (Portuguese for “love serenade”) candy.

A craving for chocolate is not the only thing Brazil has in common with other countries, however. Brazilian politicians have also been involved in corruption scandals, albeit on a scale rarely seen elsewhere. In 2016, motivated to fight corruption, tech-savvy citizens led by Irio Musskopf used crowdfunding to start a non-profit, open-source project that uses data science to monitor public spending in Brazil. Inspired by the Toblerone Affair, the project was named Operation Serenata de Amor and is maintained by a dedicated team of 10 people, alongside hundreds of volunteers. [2]

 

Enter Rosie, an anti-corruption robot aimed at politicians

The “civic tech” project gave birth to Rosie, an application powered by machine learning and trained to identify suspicious uses of public funds. Rosie analyzes every reimbursement claimed by members of Brazil’s Congress (each of whom is entitled to a monthly expense allowance of up to 46 times the Brazilian minimum wage). It then automatically flags potential irregularities and states the reasons for suspicion. [3]

How it works. Technically, Rosie is an open-source, GitHub-hosted Python application that applies unsupervised learning algorithms to estimate a “probability of corruption” for each reimbursement receipt submitted by politicians in Brazil. For instance, one of Rosie’s functions is to apply clustering algorithms to identify outliers among meal expense reimbursements. [4] [5]
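To make the idea of clustering-based outlier detection concrete, here is a minimal sketch. It is not Rosie's actual code: it assumes a simple one-dimensional gap clustering (values are sorted, and a gap larger than a chosen threshold starts a new cluster), and it treats any value that ends up in a very small cluster as suspicious. The function names, the thresholds, and the meal amounts are all made up for illustration.

```python
from typing import List


def gap_clusters(values: List[float], max_gap: float) -> List[List[float]]:
    """Cluster 1-D values: a gap larger than max_gap starts a new cluster."""
    ordered = sorted(values)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > max_gap:
            clusters.append([cur])      # far from its neighbour: new cluster
        else:
            clusters[-1].append(cur)    # close enough: same cluster
    return clusters


def flag_outliers(values: List[float], max_gap: float,
                  min_cluster_size: int) -> List[float]:
    """Return the values that fall in clusters smaller than min_cluster_size."""
    return [
        value
        for cluster in gap_clusters(values, max_gap)
        if len(cluster) < min_cluster_size
        for value in cluster
    ]


# Hypothetical meal reimbursements in BRL (invented numbers, not real data).
meals = [32.0, 41.5, 38.0, 45.0, 29.9, 52.0, 36.4, 48.7, 910.0]

# Thresholds are assumptions chosen for this toy example only.
suspicious = flag_outliers(meals, max_gap=50.0, min_cluster_size=3)
print(suspicious)  # [910.0]
```

In this toy data set the eight ordinary meals form one dense cluster, while the 910.00 receipt sits alone and is flagged. A flag of this kind is only a reason to look closer, not proof of wrongdoing, which is consistent with the project publishing the reasons for suspicion rather than a verdict.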

The organization chose machine learning over traditional algorithms for three key reasons: the need to analyze diverse and dynamic datasets adaptively, without having to “hard code” a tailored application for each task; the ability to extract the most important features from the data set (i.e., the variables most likely to suggest fraud); and the requirement to scale efficiently given the large size of the datasets. In addition, the team made the project open-source so that it could draw on contributions from other citizens and make the technology available for use in other contexts. [6] [7] [8] [9]

Expanding the project: Twitter, Jarbas and Perfil Político. Since 2017, Rosie’s findings have been posted regularly on Twitter (@RosieDaSerenata). The project has been covered by more than 90 news outlets, including the most respected in Brazil. [10] To amplify the project’s impact in the medium term, the team developed new ways to use data to help citizens take action, such as Jarbas, a visualization tool that turns the scattered public spending database into a searchable, organized dashboard. Another new service is Perfil Político (Portuguese for “political profile”), a platform for profiling and comparing candidates in Brazil’s elections.

 

The challenge ahead: from data to actions

Despite its potential, Rosie has faced substantial barriers to its deployment. Out of thousands of cases identified, only dozens have resulted in politicians returning money to public accounts. A key challenge is that the Congress itself has to analyze the complaints, and it may refuse to do so if the value is seen as “low” (which in some cases corresponds to a threshold of as much as US$ 20,000). Another hurdle for Rosie is that the Congress website uses “captchas” to prevent bulk downloads of its data sets, which hinders the scaling of a tool that is supposed to work without human intervention. [11]

To overcome such obstacles, Operation Serenata de Amor may consider focusing on three key areas. First, it should uncover and publicize a high-profile, clear-cut case of ill-intentioned reimbursement. This would increase the credibility of the project, attract media coverage and reduce the likelihood of future cases being ignored by the Congress. Second, it should go beyond Twitter (which is not very popular in Brazil) and expand its presence on other social media channels to reach more people. Third, it should focus less on the how and more on the what and the why. The current messaging is complex and full of jargon, and should be simplified to reach a larger audience.

Despite the undeniable progress since 2016, these challenges make it clear that open questions remain that could determine the project’s future success: How can the organization circumvent the mechanisms that are currently hindering Rosie’s deployment? Does making it an open-source application make it easier for others to create barriers to its use? What are other potential applications of Rosie’s algorithms, in the public realm and in other contexts?

(799 words)

 

References:

[1] Stephen Kinzer, “The Shame of a Swedish Shopper (a Morality Tale)”, New York Times, November 14, 1995, https://www.nytimes.com/1995/11/14/world/stockholm-journal-the-shame-of-a-swedish-shopper-a-morality-tale.html, accessed November 2018.

[2] Operação Serenata de Amor, “About”, https://serenata.ai/en/about/, accessed November 2018.

[3] Chamber of Deputies, “Parliamentary Activity Charge”, http://www2.camara.leg.br/comunicacao/assessoria-de-imprensa/cota-parlamentar, accessed November 2018.

[4] Operação Serenata de Amor, “Non-tech crash course into Operação Serenata de Amor”, https://github.com/okfn-brasil/serenata-de-amor, accessed November 2018.

[5] Operação Serenata de Amor, “FAQ”, https://serenata.ai/en/faq/, accessed November 2018.

[6] Ibid.

[7] Erik Brynjolfsson and Andrew McAfee, “What’s driving the machine learning explosion?”, Harvard Business Review Digital Articles (July 18, 2017).

[8] J. Wilson, S. Sachdev, and A. Alter, “How companies are using machine learning to get faster and more efficient”, Harvard Business Review Digital Articles (May 3, 2016).

[9] Mike Yeomans, “What every manager should know about machine learning”, Harvard Business Review Digital Articles (July 7, 2015).

[10] Operação Serenata de Amor, “About”, https://serenata.ai/en/about/, accessed November 2018.

[11] Bruno Pazzim, “Como está acontecendo a hackaton de denúncias da Operação Serenata de Amor?”, January 12, 2017, https://medium.com/data-science-brigade/como-est%C3%A1-acontecendo-a-hackaton-de-den%C3%BAncias-da-opera%C3%A7%C3%A3o-serenata-de-amor-a8bd193e0c76, accessed November 2018.


2 thoughts on “Operation Love Serenade: Fighting corruption in Brazil with an open-source, Machine Learning-powered robot”

  1. This was a fascinating read, and well-organized! I thought the recommendations were clear and actionable, although they also pointed to one of the key issues with machine learning algorithms: once we have all this data, how do we create actions that utilize what we’ve learned from it? It seems like some of these challenges can be resolved with something like a marketing campaign: utilizing more popular social media channels, and also having a “product writer” simplify and standardize the messaging so that it is more accessible to the average internet user. A focused effort sustained over a few months in this regard might reap significant dividends. Because many of the barriers to Rosie’s effectiveness are put up by the very politicians it is criticizing, the project might continue to be stymied unless there is a lot of pressure on these politicians to enact change.

    This article had me thinking about whether such an application might work in other countries, like the Philippines, where corruption is also the cause of much strife and debate in politics today. I’m impressed with how Brazil was able to leverage its talented community of developers to tackle critical social issues, but I do wonder about the risk those 10 dedicated volunteers are taking on with this project.

  2. This is an awesome read, especially in our current global political climate fraught with corruption scandals. I love that they open-sourced their coding efforts, as there are plenty of highly qualified computer scientists and engineers around who are willing to devote their spare time to a greater cause. While it is a noble effort, I wonder how actionable their findings and recommendations would be. Since data input is an issue, and the government may not be cooperative enough to provide an API that makes the data easy to pull, perhaps they could expand further and crowd-source information on politicians. However, this could introduce further layers of complexity in verifying this crowd-sourced information. I think Rosie’s application could be broadened to provide other forms of information on politicians, such as ‘donations’ to their political campaigns. This could allow for increased transparency on political candidates, possibly stemming corruption even before they gain power.
