ICIJ · The International Consortium of Investigative Journalists

The Panama Papers Reading List

Panama Papers Wins Pulitzer Prize

The honor is a testament to the enterprise and teamwork of our staff and our partners here in the U.S. and around the world, ICIJ's director says.

Awards and recognition

The Panama Papers project, led by ICIJ and German newspaper Süddeutsche Zeitung working in collaboration with more than 100 media outlets, has been honored with awards and finalist mentions by more than a dozen major international prizes, including:

Where Are They Now? A Year Later, Mixed Fortunes For Panama Papers Line-Up

One year after the Panama Papers first became an international catchphrase, here’s a globe-hopping update on the people and institutions caught up in the scandal.

VIDEO: Twelve Months of Investigations, Impact and Outrage

A year ago the Panama Papers dominated newspaper headlines and brought now-iconic images to TV screens around the world. Since then, investigations have continued and outrage has grown.

Panama Prosecutor Claims 'Solid Case' Against Mossack Fonseca

The law firm at the heart of the Panama Papers affair sold shell companies and held bank accounts that were used to help conceal bribes paid across South America, a Panamanian prosecutor alleged at a press conference.

Founders of Panama Papers Law Firm Arrested on Money Laundering Charges

Police in Panama arrested the founders of Mossack Fonseca, the law firm at the center of the Panama Papers scandal, on money laundering charges Thursday after authorities raided the firm’s headquarters as part of investigations into Brazil’s largest-ever bribery scandal.

Tax Agencies Draw Up ‘Target List’ of Offshore Enablers

Tax agencies from 30 countries convened in Paris this week to take part in the largest ever simultaneous exchange of tax information and to share results and details on thousands of investigations sparked by the Panama Papers.

Panama’s Revolving Door Shows Global Challenge of Offshore Reform

In a country where top-drawer lawyers move freely between high government posts and law firms selling secrecy-cloaked shell companies, bringing lasting change to the offshore industry is a challenge.

Journalists Hang Tough in Face of Backlash Against Panama Papers Reporting

Reporters have faced consequences both in nations where media crackdowns are common and also in nations with reputations for high levels of press freedom.

Panama Papers Have Had Historic Global Effects — and the Impacts Keep Coming

The investigation has produced an almost daily drumbeat of regulatory moves, follow-up stories and calls by politicians and activists for more action to combat offshore financial secrecy.

BVI Hits Mossack Fonseca With Largest Fine Ever After Panama Papers Investigation

The $440,000 penalty followed a six-month investigation which included on-site compliance inspections and the appointment of an officer to monitor Mossack Fonseca's operations.

Experts Who Quit Panama's Transparency Commission Produce Their Own Report

Report's authors say that the U.S. and EU have the power to force other nations to embrace transparency reforms by threatening to cut off access to their financial systems.

Pakistan's PM Responds to Supreme Court Hearing on Panama Papers

Nawaz Sharif defended himself before the nation’s highest court, as opposition supporters celebrated in Islamabad.

Panama Hires PR Firm Amid Ongoing Panama Papers Fallout

A PR firm is being paid $50,000 a month to help the Panama government, while arrests, protests and more continue around the world.

Hedge Fund Sues Mossack Fonseca For Alleged Obstruction of Justice in Nevada

Confidential emails revealed in the Panama Papers have opened a new front in a bitter court battle in Nevada involving a hedge fund led by an American billionaire, new court filings show.

Experts Quit Panama's Transparency Committee Over Lack of Transparency

The committee was established in the wake of the Panama Papers to probe Panama's financial services industry, but now two out of three international members have resigned.

Continent of Secrets: Uncovering Africa's Offshore Empires

Africa receives $50 billion of foreign aid money annually, but then loses roughly the same amount through illicit outflows. Can you uncover Africa's offshore empires? Play now!

Secret Offshore Deals Deprive Africa of Billions in Natural Resource Dollars

The Panama Papers show how politicians and mining, oil and gas interests benefit from secrecy and dubious multimillion dollar transfers.

Secret Documents Expose Nigerian Oil Mogul’s Offshore Hideaways

A dealmaker’s backstage maneuverings are revealed in the Panama Papers as he hung with celebrities while criminal investigators closed in.

Diamond Mine with Offshore Ties Leaves Trail of Complaints

The Panama Papers reveal a network of shell companies linked to a mining operation that has been accused of environmental harms and unpaid taxes.

Out of Africa, Into Tax Havens

As visitors come to see what’s in Africa, some safari operators’ profits head offshore.

Reporters Warned, Inquiries Opened as African Nations Respond to Panama Papers

Mossack Fonseca targeted clients in Africa for business, but now some of those clients have become targets themselves as authorities launch investigations into the Panama Papers revelations.

Panama Papers Credited As New EU Anti Money-Laundering And Tax Abuse Rules Proposed

The European Commission has announced it will tighten the European Union’s anti-money laundering rules and increase transparency requirements for companies and trusts.

Venezuela and Panama To Launch Joint Panama Papers Investigation

The joint investigation will be the "first of its kind," and Venezuela's attorney general has hinted at a long list of suspects.

European Inquiry to Call UK Chancellor, Mossack Fonseca to Testify

A special 65-member Panama Papers committee of inquiry has been created by the European parliament to investigate potential wrongdoing exposed by ICIJ's investigation.

Mossack Fonseca's US Operations Under Pressure, Island Offices Closed

Panamanian law firm Mossack Fonseca’s local affiliate in Nevada has resigned from more than 1,000 companies and paid a penalty to the state amid investigations on multiple fronts.

US States Under Pressure As World Pushes For Financial Transparency

Nevada, Wyoming and Delaware are facing growing pressure over their lack of corporate transparency, as the United States and the international community continue to respond to fallout from the Panama Papers.

The Malefactors of Mossack Fonseca

Meet The Dutchman, the Queen of the South, the Boss of Bosses and other convicted felons and alleged wrongdoers who have benefited from services provided by the law firm.

Panama Papers Include Dozens of Americans Tied to Fraud and Financial Misconduct

Mossack Fonseca's files include offshore companies linked to at least 36 Americans accused of serious financial wrongdoing, including fraud and racketeering.

Beyond Panama: Unlocking the world’s secrecy jurisdictions

The 21 jurisdictions covered by the Panama Papers data vary from the rolling hills of Wyoming to tropical getaways like the British Virgin Islands. But all have at least one thing in common - secrecy is the rule.

Panama Papers Source Offers Documents To Governments, Hints At More To Come

The anonymous whistleblower behind the Panama Papers has conditionally offered to make the documents available to government authorities.

US Officials React to Panama Papers Disclosures With Get-Tough Proposals

The Obama administration has proposed a national registry documenting the real owners of shell companies and other measures aimed at fighting offshore chicanery.

Iceland’s First Lady Linked to Offshore Investments

Records in the Panama Papers and the Swiss Leaks leaked files tie the wife of Iceland President Ólafur Grímsson to offshore companies and accounts.

Coming Soon: ICIJ to Release Panama Papers Offshore Companies Data

The database, to be released on May 9, will likely be the largest ever release of secret offshore companies and the people behind them.

Cartel-Linked Suspects Arrested After Panama Papers Revelations

Uruguayan prosecutors are seeking to bring to trial at least five individuals detained on suspicion of laundering money for a powerful Mexican drug cartel.

US Prosecutor Opens Investigation Into 'Panama Papers Matters'

ICIJ welcomes the interest from the Manhattan U.S. Attorney's office, but has made it clear it won't be turning over its data or taking part in any investigation.

Banks Ordered to Provide Info on Panama Dealings to NY Regulator

More than a dozen banks identified in the Panama Papers investigation have been asked to hand over details of their communications with Mossack Fonseca.

Art held offshore

Pakistan's PM Leaves Country, Spanish Minister Resigns

Nawaz Sharif faces growing pressure and calls for his resignation, a Spanish minister has stepped aside, and more governments are pledging reform as fallout from the Panama Papers revelations continues.

Panama Police Raid Mossack Fonseca As Global Fallout Continues

The search of Mossack Fonseca's Panama headquarters comes after a number of raids and official action taken in response to the Panama Papers revelations.

Global joint investigation to be proposed at special tax meeting

Tax officials from 28 nations met in Paris to develop a strategy for collaborative action based on Panama Papers revelations.

British PM Announces New Transparency Measures Following Panama Papers Revelations

David Cameron appeared before parliament on Monday to address concerns about his own links to offshore holdings revealed in the Panama Papers, as well as announce reform aimed at boosting transparency.

The Art of Secrecy

Locked in the files of a Panama law firm are the answers to mysteries involving Van Goghs, Picassos, Rembrandts and other masterworks.

Panama Papers Spark High-Level FIFA Resignation and Swiss Police Raid

Swiss police searched the office of Europe's top soccer association and a member of FIFA's ethics panel resigned following Panama Papers revelations.

Leaked Files Offer Many Clues To Offshore Dealings by Top Chinese

Eight current and former members of the Politburo Standing Committee, the country's top decision makers, have relatives with secret offshore companies.

Spies and Shadowy Allies Lurk in Secret With Help From Offshore Firm

Firm helps CIA operatives and other characters — real or fanciful — from the world of espionage set up offshore companies to obscure their dealings.

Iceland Prime Minister Tenders Resignation Following Panama Papers Revelations

The prime minister of Iceland said he would resign following mass protests triggered by reports from ICIJ and partners that he had owned an offshore company in the British Virgin Islands with his wife.

Law Firm’s Files Include Dozens of Companies and People Blacklisted by U.S. Authorities

Global law firm’s customers include suspected financiers of terrorism, nuclear weapons proliferators and gunrunners.

How Family that Runs Azerbaijan Built an Empire of Hidden Wealth

Documents peel away three layers of secret ownership in a conglomerate and lead to gold mines and overseas real estate.

Global Banks Team with Law Firms To Help the Wealthy Hide Assets

Leaked records show that hundreds of banks and their subsidiaries and branches registered nearly 15,600 shell companies.

About this project

The Panama Papers is an unprecedented investigation that reveals the offshore links of some of the globe’s most prominent figures.

All Putin’s Men: Secret Records Reveal Money Network Tied to Russian Leader

Complex offshore financial deals channel money and power towards a network of people and companies linked to President Vladimir Putin.

Giant Leak of Offshore Financial Records Exposes Global Array of Crime and Corruption

Millions of documents show heads of state, criminals and celebrities using secret hideaways in tax havens.

Panamanian Law Firm Is Gatekeeper To Vast Flow of Murky Offshore Secrets

Files show client roster that includes drug dealers, Mafia members, corrupt politicians and tax evaders — and wrongdoing galore.

Leak Ties Ethics Guru to Three Men Charged in FIFA Scandal

Secret documents show how deeply the world of soccer has become enmeshed in the world of offshore havens.

Iceland’s Prime Minister Ducks Question But the Answer Catches Up with Him

He came to power after the country’s financial collapse while hiding his offshore holdings of millions in bonds from Icelandic banks.

How the One Percenters Divorce: Offshore Intrigue Plays Hide and Seek with Millions

Firm that practices no matrimonial law nonetheless plays big role when the superrich around the globe decide to split.

Wrangling 2.6TB of data: The people and the technology behind the Panama Papers

The trove of files that make up the Panama Papers is likely the largest dataset of leaked insider information in the history of journalism.

For ICIJ’s Data and Research Unit, it offered a unique set of challenges. The overall size of the data (2.6 terabytes, 11.5 million files), the variety of file types (from spreadsheets, emails and PDFs to obscure and old formats no longer in use), and the logistics of making it all securely searchable for more than 370 journalists around the world are just a few of the hurdles faced over the course of the 12 month investigation.

ICIJ member and data unit leader Mar Cabra recently spoke with journalism tech site Source about the people, the technology and the data journalism behind the Panama Papers. This post is republished with their kind permission.

Q. So, very first thing, ICIJ has said that it will release a batch of data later this spring, but not the entire dataset. Could you say a little about that, and about the way you’re timing the reporting?

The plan is that we’re actually going to keep reporting – some partners are publishing for almost two weeks for sure. Then in early May we’re going to release all the names connected to more than 200,000 offshore companies – so we’re talking about the beneficiaries, the directors, the shareholders, the intermediaries, and the addresses connected to those entities in 21 jurisdictions. We expect to have some bang around that, too.

But we’re not going to release all 11.5 million files, we’re going to release the structured data, which is the internal Mossack Fonseca database. This is especially valuable because tax havens sell secrecy, and their secrecy relies mainly on the fact that corporate registries are opaque and not accessible, so we think there’s a great public value in releasing the names of companies and who’s behind them.

We already did this in June 2013 in the Offshore Leaks database that you can access right now. We had a leak then similar to what we have now: we had internal documents and data from two offshore service providers, which is basically what Mossack does. The only difference now is that this leak includes much more information and is much bigger, and the clients are high-level clients, so that’s why this leak is very important. We’re going to merge the two databases and all of them are going to be put together at the Offshore Leaks URL. You’ll be able to search what could amount to the biggest public database of offshore companies ever.

Data forensics

Q. What was it like to work with the leaked data? What kind of processing did you have to do?

Working with this data has been challenging for many different reasons. The first reason is, it’s huge — we’re talking about 2.6TB. The second reason is that it didn’t all come at the same time; we didn’t receive a 2.6TB hard drive. We had to deal with incremental information, and we also had to deal with a lot of images. The majority of the files are emails and database files. There are also a lot of PDFs and TIFFs, so we have to do a lot of OCR-ing for millions of documents.

So first, most of the leak was unstructured data. Second, it was not easy working with the structured data. The Mossack Fonseca internal database didn’t come to us in the raw, original format, unfortunately. We had to do reverse-engineering to reconstruct the database, and connect the dots based on codes that the documents had.

We’ve had to do that with every leak we’ve received: We had to do it with Offshore Leaks in 2013, we had to do it with Swiss Leaks last year, and we had to do it again this year. Our programmer, Rigoberto Carvajal, is a true magician, because he has become an expert in reverse-engineering databases. He and Miguel Fiandor reverse-engineered the database, reconstructed the Mossack Fonseca internal files, and put it into a graph-database format. And that’s the base of what we’re going to be doing in the new Offshore Leaks database – the improved version.

The tech stack

We believe in open source technology and try to use it as much as possible. We used Apache Solr for the indexing and Apache Tika for document processing, and it’s great because it processes dozens of different formats and it’s very powerful. Tika interacts with Tesseract, so we did the OCRing on Tesseract.

To OCR the images, we created an army of 30–40 temporary servers in Amazon that allowed us to process the documents in parallel and do parallel OCR-ing. If it was very slow, we’d increase the number of servers — if it was going fine, we would decrease because of course those servers have a cost.
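The scale-up, scale-down decision described here can be sketched as a small sizing function. This is a minimal illustration and not ICIJ's actual tooling: the function name, the throughput figure and the deadline are all hypothetical, and only the upper bound of 40 servers is taken from the interview.

```python
import math


def desired_workers(backlog_docs, docs_per_worker_per_hour, target_hours,
                    min_workers=1, max_workers=40):
    """Return how many OCR workers to run so the backlog clears in target_hours.

    Every parameter is a hypothetical knob chosen for illustration.
    """
    if backlog_docs <= 0:
        # Nothing queued: fall back to the minimum to keep costs down.
        return min_workers
    needed = math.ceil(backlog_docs / (docs_per_worker_per_hour * target_hours))
    # Clamp so the fleet never drops below the floor or exceeds the budget cap.
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    print(desired_workers(100_000, 500, 10))
```

A controller would call a function like this periodically and start or stop temporary servers until the running count matches the returned value.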

Then we put the data up, but the problem with Solr was it didn’t have a user interface, so we used Project Blacklight, which is open source software normally used by librarians. We used it for the journalists. It’s simple because it allows you to do faceted search: for example, you can facet by the folder structure of the leak, by years, by type of file. There were more complex things, too. It supports queries in regular expressions, so the more advanced users were able to search for documents with a certain pattern of numbers that, for example, passports use. You could also preview and download the documents. ICIJ open-sourced the code of our document processing chain, created by our web developer Matthew Caruana Galizia.
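To make the two search styles concrete, here is a small in-memory sketch of facet counting and a regular-expression search. It does not use Solr or Blacklight. The sample records, the field names and the passport-like pattern (two capital letters followed by seven digits) are assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical index records; the fields mirror the facets mentioned above.
DOCS = [
    {"path": "clients/2009/a.pdf", "year": 2009, "type": "pdf",
     "text": "Passport no. AB1234567 issued"},
    {"path": "clients/2010/b.msg", "year": 2010, "type": "email",
     "text": "Please see attached invoice"},
    {"path": "banks/2010/c.pdf", "year": 2010, "type": "pdf",
     "text": "Holder passport XY7654321"},
]

# Assumed pattern for illustration only; real passport formats vary by country.
PASSPORT_LIKE = re.compile(r"\b[A-Z]{2}\d{7}\b")


def facet_counts(docs, field):
    """Count how many documents fall under each value of one facet field."""
    return Counter(doc[field] for doc in docs)


def search_pattern(docs, pattern):
    """Return the paths of documents whose text contains the pattern."""
    return [doc["path"] for doc in docs if pattern.search(doc["text"])]


if __name__ == "__main__":
    print(facet_counts(DOCS, "type"))
    print(search_pattern(DOCS, PASSPORT_LIKE))
```

Facets narrow a collection by metadata, while the pattern search finds documents by the shape of their content rather than by a known keyword.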

We also developed a batch-searching feature. So say you were looking for politicians in your country, you just run it through the system, and you upload your list to Blacklight and you would get a CSV back saying yes, there are matches for these names — not only exact matches, but also matches based on proximity. So you would say “I want Mar Cabra proximity 2” and that would give you “Mar Cabra,” “Mar whatever Cabra,” “Cabra, Mar,” — so that was good, because very quickly journalists were able to see… I have this list of politicians and they are in the data!
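The proximity idea can be illustrated with a simplified matcher. This sketch assumes a loose definition: every word of the name must appear, in any order, with at most `slop` other words inside the span they cover. It is not the slop calculation Solr itself performs, and the function names are hypothetical.

```python
import itertools
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def proximity_match(name, text, slop=2):
    """True if all words of name occur in text with at most slop extra words between them."""
    terms = tokenize(name)
    words = tokenize(text)
    positions = [[i for i, w in enumerate(words) if w == t] for t in terms]
    if not terms or any(not p for p in positions):
        return False
    for combo in itertools.product(*positions):
        if len(set(combo)) < len(combo):
            continue  # one word in the text cannot satisfy two terms
        extra = (max(combo) - min(combo) + 1) - len(terms)
        if extra <= slop:
            return True
    return False


def batch_search(names, documents, slop=2):
    """Return (name, doc_id) rows, the kind of data a CSV report would hold."""
    return [(name, doc_id)
            for name in names
            for doc_id, text in documents.items()
            if proximity_match(name, text, slop)]


if __name__ == "__main__":
    docs = {"d1": "Signed by Cabra, Mar in 2010", "d2": "Nothing relevant here"}
    print(batch_search(["Mar Cabra", "John Doe"], docs))
```

Under this definition, "Mar Cabra", "Mar whatever Cabra" and "Cabra, Mar" all match with a slop of 2, while a name whose words sit far apart in the text does not.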

For the visualization of the Mossack Fonseca internal database, we worked with another tool called Linkurious. It’s not open source, it’s licensed software, but we have an agreement with them, and they allowed us to work with it. It allows you to represent data in graphs. We had a version of Linkurious on our servers, so no one else had the data. It was pretty intuitive — journalists had to click on dots that expanded, basically, and could search the names.

We had the data in a relational database format in SQL, and thanks to ETL (Extract, Transform, and Load) software Talend, we were able to easily transform the data from SQL to Neo4j (the graph-database format we used). Once the data was transformed, it was just a matter of plugging it into Linkurious, and in a couple of minutes, you have it visualized in a networked way, so anyone can log in from anywhere in the world. That was another reason we really liked Linkurious and Neo4j — they’re very quick when representing graph data, and the visualizations were easy to understand for everybody. The not-very-tech-savvy reporter could expand the docs like magic, and more technically expert reporters and programmers could use the Neo4j query language, Cypher, to do more complex queries, like show me everybody within two degrees of separation of this person, or show me all the connected dots…
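The "two degrees of separation" query can be mimicked in memory to show what the graph model buys you. In Cypher this kind of question is typically written with a variable-length path pattern. The sketch below instead turns relational-style rows into an adjacency map and runs a bounded breadth-first search. The rows, names and functions are invented for illustration and do not reflect the Mossack Fonseca schema.

```python
from collections import defaultdict, deque

# Hypothetical relational rows: (person, role, company).
ROWS = [
    ("Alice", "shareholder", "Acme Ltd"),
    ("Bob", "director", "Acme Ltd"),
    ("Bob", "shareholder", "Beta Inc"),
    ("Carol", "director", "Beta Inc"),
]


def build_graph(rows):
    """Turn table rows into an undirected adjacency map (the 'transform' step)."""
    graph = defaultdict(set)
    for person, _role, company in rows:
        graph[person].add(company)
        graph[company].add(person)
    return graph


def within_hops(graph, start, max_hops=2):
    """Return {node: distance} for every node reachable within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    seen.pop(start)
    return seen


if __name__ == "__main__":
    print(within_hops(build_graph(ROWS), "Alice"))
```

Here Alice reaches Acme Ltd in one hop and Bob in two, while Beta Inc and Carol fall outside the two-hop limit.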

We’re already using the graphs from Linkurious and the database in the interactive The Power Players, which shows more than 70 politicians. Every time you see a graph in the interactive, that’s the database. Linkurious has a great feature, which is that you can make calls to the API, so we make calls to the API to draw the data from this new database. It also has a built-in widget feature, so if you’re using Linkurious for your reporting and you’d like a graph, you create a widget, publish the widget and embed it in your story, and it’s interactive — you can move the nodes around and display any kind of info panel… It’s great because we didn’t have to work on any of that ourselves.

We’re really happy with Linkurious — they were super supportive. Whenever we asked them questions, or asked for a feature we needed, two days later it was implemented! That communication was great, it was like having an expanded development team.

For communication, we have the Global I-Hub, which is a platform based on open source software called Oxwall. Oxwall is a social network, like Facebook, which has a wall when you log in with the latest in your network — it has forum topics, links, you can share files, and you can chat with people in real time.

Oxwall is designed for people who want to have a social network — in the form for user registration, one of the options we had to disable was “Are you looking for a male or for a female?” So… that we disabled, because of course it was a bit confusing! We repurposed it to use it for sharing and social networking around investigative reporting, and thanks to a grant from the Knight Prototype Fund, we improved the security around Oxwall and implemented two-step authentication on the I-Hub.

That was a bit of a nightmare, because some reporters didn’t quite get it, and there were a lot of problems, but we did two-step authentication using Google Authenticator. In the end, everybody got it! We were worried because we were working with journalists in developing countries, and we worried that maybe some reporters wouldn’t have a smartphone, but we were lucky and we didn’t have that problem.
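For readers curious about what the authenticator app computes, Google Authenticator implements the time-based one-time password scheme standardized in RFC 6238. The following is a minimal standard-library sketch of that scheme, not the I-Hub's code. The function names and the one-step tolerance window are assumptions.

```python
import base64
import hashlib
import hmac
import struct


def totp(secret_b32, unix_time, step=30, digits=6):
    """Compute a time-based one-time password (HMAC-SHA1 variant)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(unix_time) // step  # number of time steps since the epoch
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret_b32, submitted, unix_time, window=1, step=30):
    """Accept a code from the current step or `window` steps either side (clock drift)."""
    return any(
        hmac.compare_digest(totp(secret_b32, unix_time + drift * step, step), submitted)
        for drift in range(-window, window + 1)
    )


if __name__ == "__main__":
    import time
    # Demo secret only: the base32 form of the RFC test key.
    demo_secret = base64.b32encode(b"12345678901234567890").decode()
    print(totp(demo_secret, time.time()))
```

The server and the phone share the secret once, usually through a QR code, and afterwards both derive the same short-lived code from the current time.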

Everybody was using the platform to communicate and log in every day or several times every week and share the tips and exchange ideas, and when somebody found a cross-border connection… One day a colleague of ICIJ in Spain was like, “Oh my god, I found [football player Lionel] Messi!” And everybody’s like “Oh my god, Messi!”

We knew we had things connected to FIFA and to UEFA, we knew there were soccer players, we knew that sports and offshore were intimately connected, but it’s at that point that it’s so useful because he says “Oh my god I found Messi!” and all of a sudden everybody has Messi and everybody’s covering Messi. The communication was very important.

A platform three years in the making

We had already used many of these platforms before. We really have three types of platforms: the Global I-Hub for communication, the combination of Solr and Blacklight, and Linkurious.

We had already used previous versions of the communication platform, but it’s not until we did the Knight Prototype that we improved its security, and this was the first time we had put it into practice. We were using Oxwall in our previous investigations: in Luxembourg Leaks, which was published in November 2014, we were already using Oxwall. But Oxwall, again, is a social networking platform to meet people, and it had to be improved. We got the Knight Prototype Fund and started work in mid-2014. It took us six months, and then a bit more than six months because in the end there are tricks you want to do. At that point, we’re at the beginning of 2015, and publishing the Swiss Leaks data and investigation. And then a year ago in April we got the call from Süddeutsche Zeitung. We were testing the platform at that time with ICIJ members, and then saw the perfect opportunity in this project to put it to work.

The other two platforms we had already used in Swiss Leaks, but we’ve improved them. The biggest problem was all the file formats that we had. Before, we had used Blacklight and Solr with all PDFs or all XLS files, but in here, you cannot imagine! There are formats I’ve never heard of, there are things you can’t even find in Google. We got around 99% of the data OCRed and indexed — I think that’s amazing given the great variety of formats that we encountered.

I think something that is very important to have in mind is that my users, who are journalists, range from the super-techie reporter who has covered the Snowden files, knows everything about encryption, and works with a great developer, to the other side of the spectrum, the very good traditional investigative reporter who has sources and is great at digging into documents and talking to people but has a hard time dealing with technology. So every tool we produce and use has to cover both fronts. We have to go for simple tools that also allow for more complex work.

The team behind the data

There’s something important to know about ICIJ: we’re a very small team. I’ve been working with ICIJ since 2011. I’m from Spain, and I studied at Columbia doing investigative reporting and data journalism there, and ICIJ hired me to come back to Spain and work here. When I started in 2011, ICIJ had a team of four people, and the team expanded or not depending on the project — we hired contractors. Back then, we didn’t have any in-house data capabilities.

We released the Offshore Leaks database in June of 2013 in cooperation with La Nación in Costa Rica, which had a very good data team with two great programmers and a great leader, Giannina Segnini. After Offshore Leaks, and especially after that database release, we realized, oh my god, we need to stop doing this externally! We need to have experts and developers in-house that work with us. I had been specializing in data journalism, so when we made that decision, it became evident that I was a good fit to lead the team. Giannina had left her position at La Nación to teach at Columbia, and the two programmers on her team, Rigoberto Carvajal and Matthew Caruana Galizia, came to work with us, and we started a data team at ICIJ. ICIJ today has 12 staff members and the Data and Research unit is half of the total staff. We have four developers and three journalists. Emilia Díaz Struck, a great data-oriented researcher, is the research editor, and I lead the team.

I’m not a programmer. I’m a journalist who discovered data journalism at Columbia University back in 2009, 2010. I thought if I could tell stories in a systematic way, it was much better than telling random stories of random victims. I’ve been pushing for data journalism in Spain, and I co-created the first-ever Master’s degree in Spain on investigative reporting, data journalism, and visualization. With a colleague, I also created Jornadas Periodismo de Datos, “the NICAR conference of Spain.” It’s an annual data journalism conference for around 500 people trying to learn about data. So I’m very tuned into data, and I’m a data journalist myself, not from a developer background, but I have great developers on my team.

One thing that is very important with this work is trust. So whenever we hire somebody, it has to be someone who’s been highly recommended by colleagues and people we know, because we cannot trust this data to just anybody. We have to have references from very close people when dealing with this.

An alliance built on trust

Q. So, speaking of trust, there is this central mystery to me about this project — how did you keep it secret for all this time, with so many people working on it around the world?

I have to say, I’m amazed myself that we haven’t had any major problems with this, but it makes me believe in the human race, because it’s really about trust. That’s why choosing and picking the team is so important. We need media organizations that want to collaborate, we need journalists that we can trust and that follow the agreement. Every person that joins the project needs to sign an agreement saying they’re going to respect the embargo and we’re all going to go out at the same time. And the journalists who have worked with us before know that it’s for their own benefit, to keep it quiet, because if we all publish at the same time, there’s a big bang. But if there’s a leak, it loses that power.

And you just have to look at the impact, you know? If we had not published all together at the same time in more than 100 media organizations, it would not have been the same! Here in Spain, the two media organizations we work with were amazed about the world reaction. But again, it’s all about trust.

My boss, Marina Walker Guevara, always says that this is like bringing guests to a dinner party. You need to choose the ones who are not going to cause trouble, the ones who are going to have a nice conversation. If you know that two of them don’t get along very well, you seat them on opposite sides of the table.

To me, it’s the most amazing thing — it may sound harsh, but it makes me believe in humans, in us, in our power as individuals and the power we can achieve working together. And I think that’s the way to go. There’s no way we could have analyzed 11.5 million files, 2.6 TB of information without a collaborative effort.

The future of leak reporting

Q. So what happens after the Panama Papers?

Something we’ve realized is that journalists are starting to get big collections of documents on their computers. So for example, in Argentina, journalists had all the official gazettes of Argentina and some other documents, and they have this big searchable database of documents. In Switzerland, they had the same thing: they had a lot of documents from their investigation all in one place. Right now, they have to download the documents from us and feed them into their system to see if there are connections. We want to get the platforms of the media organizations and our platforms to talk to each other and do massive matches. Right now we were only able to do targeted searches or searches through spreadsheets, and the next step is to get collections of documents talking to each other.

In parallel with the Panama Papers, ICIJ is already working on DataShare, something we presented for a Knight News Challenge grant. We didn’t get the money, so we’re funding it through other means — the Open Society Foundations have given us some money to work on it, and we’re looking for more funding.

So we’re actually working on a program that they’ll be able to install on their computers in, like, “Tabula mode,” right? So as with Tabula, you install it on your computer. It works in your browser, and it allows you to extract the entities in your documents. What you share with the network are the entities in your documents, and then our search engine basically does the matches using fuzzy matching between the entities. It tells you if there’s somebody else across the network with the same entities, and then you two have to get together and start talking to share the documents.
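The entity-level matching described here can be sketched with the standard library's `difflib`. This is only an illustration of the idea: DataShare's real matching logic is not described in the interview, and the newsroom names, entity lists and 0.85 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def normalize(entity):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(entity.lower().split())


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 for two entity names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cross_matches(shared, threshold=0.85):
    """Compare entity lists between newsrooms; only entity names are shared, never documents.

    shared maps a newsroom name to the list of entities it extracted.
    Returns (newsroom_a, entity_a, newsroom_b, entity_b) tuples.
    """
    rooms = sorted(shared)
    hits = []
    for i, room_a in enumerate(rooms):
        for room_b in rooms[i + 1:]:
            for ent_a in shared[room_a]:
                for ent_b in shared[room_b]:
                    if similarity(ent_a, ent_b) >= threshold:
                        hits.append((room_a, ent_a, room_b, ent_b))
    return hits


if __name__ == "__main__":
    shared = {
        "newsroom_a": ["Acme Holdings Ltd", "John Smith"],
        "newsroom_b": ["ACME Holdings Ltd.", "Maria Lopez"],
    }
    print(cross_matches(shared))
```

A hit tells two newsrooms that they hold documents about what looks like the same entity. Whether they then exchange the underlying documents remains a decision for the journalists.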

So we need big collections of documents to talk to each other, and we’re trying to solve that at the level of the entities, because journalists don’t want to share everything they have — they have exclusive documents. But if you create an index of entities in their documents, it’s not so much of a problem, and everybody can benefit from those matches. And of course, we’re having a lot of headaches with natural language processing, and that’s something we’re dealing with inside the Panama Papers project as well.

Q. Do you already have your next data project in view?

Yes! We have a team working on the next project, and scoping it — we have a few themes and we’re seeing which of the themes is the next one.
