Text-as-Data Analysis of Trade Agreements

Over recent decades, trade agreements have increased not only in number but also in depth. Trade agreements now seek to regulate a larger number of trade policy instruments than in the past.

Conditionalities, exceptions and concessions also add to the complexity of regulatory texts. As a consequence, the average agreement text is now about ten times longer than it was 25 years ago.

This growth makes it increasingly difficult to analyze and compare the content of trade agreements, which is necessary for estimating their impact on international trade and welfare. Big data and text-as-data methods can help researchers, policy-makers and other stakeholders systematically extract information and data from trade regulatory texts.

To give researchers easy access to this type of data, UNCTAD has created Texts of Trade Agreements (ToTA), a publicly available repository of trade agreement texts in HTML format, in a joint project with The Graduate Institute, University of Ottawa and European University at St. Petersburg.

Texts of Trade Agreements (ToTA)

Text-as-data methods comprise a variety of research tools that allow us to gain new insights into trade agreements. Textual similarity measures, for example, can capture fine-grained differences in treaty design. So-called dimensionality reduction techniques, which compress the information contained in a text into a small set of abstract variables, can help predict the effects of trade agreements more accurately than previously available measures.
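To make the idea of a textual similarity measure concrete, here is a minimal sketch in Python. It assumes one common choice, the Jaccard similarity over word bigrams, which is not necessarily the measure used in the ToTA research. The function names and the three sample clauses are hypothetical and written only for illustration; they are not taken from the ToTA corpus.

```python
import re


def shingles(text, n=2):
    """Return the set of word n-grams in a text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(text_a, text_b, n=2):
    """Share of word n-grams common to both texts, between 0.0 and 1.0."""
    a, b = shingles(text_a, n), shingles(text_b, n)
    if not a and not b:
        return 1.0  # two texts with no n-grams are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical clauses, invented for this example (not real treaty text).
clause_a = "Each Party shall accord national treatment to the goods of the other Party."
clause_b = "Each Party shall accord national treatment to the investors of the other Party."
clause_c = "This Agreement shall enter into force on the date of signature."

sim_ab = jaccard_similarity(clause_a, clause_b)  # near-identical wording
sim_ac = jaccard_similarity(clause_a, clause_c)  # unrelated wording
print(f"a vs b: {sim_ab:.2f}")
print(f"a vs c: {sim_ac:.2f}")
```

Two clauses that differ in a single word score high, while clauses on unrelated topics score close to zero. Applied to every pair of agreements or chapters, a measure of this kind yields a similarity matrix that can then be inspected or clustered.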

Treating the texts of trade agreements as data can help us find better answers to a large number of policy questions, such as:

  • What is the impact of trade agreements on international trade?

  • Which features of trade agreements are decisive in increasing trade? Under which conditions?

  • How similar are trade agreements across countries and regions?

  • What are the similarities and differences between investment chapters in international trade agreements and bilateral investment treaties?

  • How similar are provisions on particular topics, for example labor and the environment, across different treaties?

Initial UNCTAD research on the ToTA text corpus has shown that:

  • Trade agreements are more heterogeneous as a group than bilateral investment agreements

  • Trade agreements converge in regional or inter-regional clusters of similarly worded agreements

  • Even agreements that are similar in overall design display important textual variation in specific chapters

Trade Analysis: Text as data


The Texts of Trade Agreements (ToTA) project makes a machine-readable and annotated full-text corpus of preferential trade agreements (PTAs) publicly available to scholars and policy-makers and uses state-of-the-art text-as-data techniques to analyze it.
