In the IGF Report to the 65th UN General Assembly in 2010, Ban Ki-moon, the present UN Secretary-General, referred to the main agents in Internet governance 57 times in 11 pages, using the words ‘multistakeholder’, ‘stakeholders’, or ‘government, private sector, civil society and technical community’.
The principle of multistakeholderism has been present in the Internet Governance Forum (IGF) since its inception in 2006. It was introduced as the single most important principle in the Tunis Agenda for the Information Society, which laid the groundwork for the IGF in the form we know today. In the Tunis Agenda, the concept of multistakeholderism was explicitly invoked 56 times, not counting the occurrences in the Annex. While it is recognised as one of the main principles of Internet governance, multistakeholderism remains one of the controversial issues in the Internet governance debate.
Since its introduction into theoretical discourse in the 1980s, the concept of multistakeholderism has been the subject of thorough theoretical analyses in political and organisational science and political philosophy, and of more specific analyses closely related to the issues of Internet governance. The concept has been treated analytically under various theoretical and evaluative assumptions, resulting in conclusions that range from positive connotations and open acceptance (as an almost central concept in the contemporary, postmodern understanding of governance) to sceptical views that question the legitimacy of non-governmental stakeholders and highlight the risk of multistakeholderism being hijacked by lobbyists and business interests.
In this analysis, we mapped the differences and similarities between the ways different stakeholders express their positions in the discourse of the emerging language of Internet diplomacy. The analysis is based on word frequencies counted from the IGF text corpus. We used the following six groups of stakeholders: governments, international organisations, non-governmental organisations, business sector, technical communities, and academic organisations. International organisations and academic organisations were added to the stakeholder typology used in the WSIS documents because of their frequent participation in IGF deliberations.
Next, we counted the most frequently used words by the representatives of these six groups of stakeholders. The analyses we present here start with a basic overview of the volume of contribution and the most frequently used words; we then begin to rely on progressively more complicated quantitative, analytical approaches. [Those of you interested in the details of statistical analysis should check the footnotes.]
Volume of contribution
First, let’s take a look at the number of participants whose affiliations to different stakeholders could be identified for the purposes of the following analyses. Figure 1a presents the absolute number of participants identified in the IGF text corpus from 2006 to 2012, while Figure 1b presents the total (overall 2006–2012) percentage of representatives of each stakeholder group that made verbal statements and were identified in the text corpus.
Figure 1a. Number of participants who contributed verbally to the IGF 2006–2012 and represented different stakeholders.
Figure 1b. Number of participants who spoke at the IGF sessions between 2006 and 2012 (stakeholder representation).
Now let’s take a look at the overall and relative volume of contribution. As in the case of gender analysis previously introduced, we counted the absolute number of words produced by each stakeholder group, and then divided that number by the number of participants representing a particular stakeholder to find out which group was more talkative. Figure 2a presents the absolute count of words produced by each group, while Figure 2b presents the relative volume of contribution as explained.
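The calculation of relative contribution described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `relative_contribution` and all figures in the usage example are hypothetical and are not taken from the IGF text corpus.

```python
def relative_contribution(word_counts, participant_counts):
    """Average number of words per identified participant, per stakeholder group.

    Both arguments are dictionaries keyed by stakeholder group name.
    Groups with no identified participants are skipped to avoid division by zero.
    """
    return {
        group: word_counts[group] / participant_counts[group]
        for group in word_counts
        if participant_counts.get(group, 0) > 0
    }


# Made-up figures, for illustration only (not values from the corpus).
example_words = {"governments": 120000, "technical communities": 90000}
example_participants = {"governments": 400, "technical communities": 150}
example_relative = relative_contribution(example_words, example_participants)
```

With these invented numbers, the group with the larger absolute word count has the smaller count per participant, which is the kind of contrast that Figures 2a and 2b are meant to separate.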
Figure 2a. The volume of verbal interventions on behalf of each stakeholder group 2006–2012; the scale is given in thousands of word occurrences.
Figure 2b. The relative volume of contribution on behalf of each stakeholder group 2006–2012; the scale is given in hundreds of word occurrences. The dashed grey line represents the mean relative contribution in each year and merely provides a reference point for comparison among different stakeholders.
Interestingly, the relative volume of contributions by government representatives did not change much as the IGF meetings progressed. Government representatives were the least talkative. We found them to have the lowest relative verbal contribution in 2011, far behind the remaining five groups, which cluster close to the mean relative contribution in that year. A possible explanation is that government representatives made official statements instead of ad lib interventions, which tend to be longer. Half of the time (i.e. in 2007, 2008, and 2010), the representatives of technical communities dominated the discussion. In general, when we look at the total number of words spoken and the average number of words produced by each stakeholder group in the period 2006–2012, a picture emerges in which government representatives contributed the most in absolute terms, while the representatives of technical communities tended to be the most talkative group. This is depicted in Figures 3a and 3b.
Figure 3a. Total absolute verbal contribution, IGF 2006–2012. The scale is given in thousands of words contributed.
Figure 3b. Total relative verbal contribution, IGF 2006–2012. The scale is given in hundreds of words contributed.
Positioning of the stakeholders
What can word usage, measured in terms of word frequency in the transcripts, tell us about the similarities and differences between different stakeholders in the Internet governance arena? The general assumption of the Emerging Language of Internet Diplomacy project is that language, as the manifestation of thought, embodies information about the ways participants have cognitively framed (represented) the concepts they have used in the Internet governance debate. Thanks to the information available from the IGF text corpus, we were able to develop a map of the debate spanning 2006–2011 from the perspective of each stakeholder group. In order to represent a spontaneous coordination of different perspectives in the multistakeholder framework, we conducted a joint analysis that compared the use of the most frequent words in each year by each stakeholder group.
The following analysis is based on the 100 most frequently used words by each stakeholder group. We first counted the words used; these words varied across stakeholder groups. We then recognised the set of unique, most frequently used words shared by all stakeholder groups. This enabled us to compute the patterns of word usage similarity between all stakeholder groups. In order to understand what follows, think of the similarity in word usage between any two different stakeholder groups as the distance between any two everyday objects you might imagine; let’s say, cities. The distance between two cities in the world is expressed visually on a geographical map. Now, imagine you have the opportunity to express the similarity in the way two stakeholder groups use the most frequently used words as you would express the distance between two cities on a map. The result is usually called a semantic map (or a semantic space): it is a map where any two objects of analysis (stakeholder groups, in our case) are represented as points, with the distance between them representing their dissimilarity: the more distant they are from each other on the map, the more dissimilar they are, and vice versa. 
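To make the idea of distance between word usage patterns concrete, here is a minimal Python sketch of how a table of pairwise distances could be built from word counts. It is not the procedure used to produce the figures. The use of relative frequencies and of Euclidean distance are assumptions made for illustration (the text does not state which distance measure was used), and the function names and toy counts are hypothetical.

```python
import math
from collections import Counter


def frequency_vectors(groups, n=100):
    """Map each group's word counts onto a shared list of unique top-n words.

    `groups` maps a label such as "G2006" to a Counter of word counts.
    Returns the shared vocabulary and one relative-frequency vector per group.
    Relative frequencies are an assumption; raw counts could be used instead.
    """
    vocab = sorted({w for c in groups.values() for w, _ in c.most_common(n)})
    vectors = {}
    for label, counts in groups.items():
        total = sum(counts.values())
        vectors[label] = [counts[w] / total for w in vocab]
    return vocab, vectors


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(vectors):
    """Pairwise distances between all groups, in sorted label order."""
    labels = sorted(vectors)
    matrix = [[euclidean(vectors[a], vectors[b]) for b in labels] for a in labels]
    return labels, matrix


# Toy counts, invented for illustration only.
toy = {
    "G2006": Counter({"internet": 50, "government": 30, "policy": 20}),
    "B2006": Counter({"internet": 50, "business": 30, "policy": 20}),
    "N2006": Counter({"internet": 50, "business": 25, "policy": 25}),
}
vocab, vectors = frequency_vectors(toy, n=3)
labels, matrix = distance_matrix(vectors)
```

In this toy example the business and NGO points end up much closer to each other than either is to the government point. A table of distances of this kind is what a multidimensional scaling algorithm takes as input in order to place the points on a map.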
Figure 4a presents the whole semantic map of the IGF 2006–2011 for various stakeholders. Figure 4b zooms in on the dense rectangular region encompassed by the red dotted line in Figure 4a in order to make it clearly visible. Take a look at the map first and then read through the discussion that follows. Figures 4a and 4b represent, by points, all particular stakeholder groups as defined by the patterns of the words they used most frequently in the IGF 2006–2011. In order to simplify the representation, we have used the following abbreviation schema: G stands for governments, I for international organisations, N for NGOs, T for technical communities, B for business representatives, and A for academia. Thus, if we take a look at the distance between the points G2006 and A2008, for example, the distance between them corresponds to the (dis)similarity between the pattern of word usage by government representatives in 2006 and academia in 2008. The closer the two points representing particular stakeholders in particular years stand, the more similar their patterns of word usage were in the respective years.
Figure 4a. Positioning of stakeholders: similarity in word usage, IGF 2006–2011. The dense rectangular region encompassed by the red dotted line is zoomed in on in Figure 4b.
Figure 4b. Positioning of stakeholders: similarity in word usage, IGF 2006–2011. The figure zooms in on the dense rectangular region encompassed by the red dotted line in Figure 4a.
First take a look at the region on the left of Figure 4a, outside of the area bounded by a red dotted line. This region contains only the points representing the word usage patterns of the representatives of governments and academic institutions. For the sake of ease of interpretation, we have connected the points representing the patterns of word usage of government representatives (by a dotted black line) and the representatives of academic institutions (by a solid black line). Obviously, the first result is that the patterns of word usage by these two stakeholder groups significantly differ from the patterns characteristic of other stakeholder groups. Namely, all other stakeholder groups are represented by points in the rectangular area on the right of Figure 4a. This region is magnified in Figure 4b.
In Figure 4b, we connected all points representing the patterns of word usage across the years for international organisations by a solid black line, and the patterns of word usage across the years for technical communities by a dotted black line. These lines look like they encircle the core of the semantic map, where we find the points representing the patterns of word usage for NGOs and the representatives of the business sector, which are found to have the highest mutual similarity according to our analysis. Overall, technical communities seem to stand closer to the average of NGOs and the representatives of the business sector than the representatives of international organisations do. Taking the overall picture (Figure 4a), roughly speaking, international organisations seem to stand somewhere between the representatives of academia and the government sector, with technical communities revealing patterns of word usage across the years somewhat more similar to the patterns exhibited by representatives of the business sector and NGOs. The discussion we have presented is a slight idealisation: the points representing a particular stakeholder really bounce around across IGF years, sometimes matching more closely the position of one stakeholder group and sometimes the position of another. These variations are much more pronounced in the left region of Figure 4a, for the points representing governments and academia. In 2008 and 2009, representatives of the government sector tended to be more similar to the representatives of other stakeholder groups in the way they used the most frequent words at the IGF; the representatives of academic institutions, however, came close in 2010, bouncing back in 2011 to where the majority of their patterns were found in previous years.
This pilot analysis provides the first insights into the similarity of the language used by different stakeholders. The full study will combine further developed quantitative analysis with qualitative analysis by Internet governance experts. We invite you to send your comments and suggestions on whether the pattern of similarities presented here could be linked to the overall perception of relations among various stakeholders in the multistakeholder framework of the IGF.
Figure 5 presents the 30 most frequently used words in total (2006–2011) for each stakeholder group.
Figure 5. Thirty most frequently used words by different stakeholders in the IGF 2006–2011.
How often do you change? The dynamics of stakeholder word usage patterns
A final word in this section introduces a measurement of how dynamic the pattern of word usage was on behalf of each stakeholder group across 2006–2011. Let us quickly explain what exactly we mean by ‘dynamic’ in the context of the present analysis. A particular stakeholder group develops its own IGF-specific discourse by placing different emphasis on different words: thus it tends to use different words with different frequency, i.e. more or less often.
Imagine a particular group of participants who tend to use a selected number of words, say 100, with very different probabilities of usage associated with each word. Most of the time they use the word ‘Internet’, then much less often they use the words ‘think’, ‘IGF’, and ‘development’, each with decreasing frequency, and so on. In a particular (probabilistic) sense, the word usage of such a group of participants is predictable: we know they will use the word ‘Internet’ much more often than the word ‘development’, and after some time we can develop our own expectations about the future word usage by the observed group of participants. Now, imagine a group of participants who use the 100 most frequently used words with equal, or almost equal, probability. For example, in a particular year they may have used the word ‘Internet’ 230 times, the word ‘think’ 228 times, the word ‘IGF’ 220 times, the word ‘development’ 219 times, and so on. In a particular sense, their word usage is not predictable: since all the words they use tend to be used with approximately the same (or at least similar) probability, we can never develop a reliable expectation of what word they might use. Namely, at any time, they could use any of the words they most frequently use overall. These two examples are, of course, extremes that will not be met in real life.
The point is that different stakeholders (or any other properly defined groups of IGF participants) can be compared in the sense expressed by the discussion of these two examples. Whether they tend to diversify the frequency of word usage, or whether they tend to have a more uniform distribution of word usage, tells us something important about how dynamic – or diversified, or (un)predictable – their patterns of word usage are.
Figure 6 presents such variations in the dynamics of the patterns of word usage by all six stakeholder groups in our analysis of the IGF 2006–2011. The higher the value on the Y-axis for a particular stakeholder in a particular year, the more diversified and dynamic their word usage in that year. The value that represents this diversity in the pattern of word usage in Figure 6 is linked to the information entropy of the word usage probability distribution. If you are interested in the technical details, see the note on Shannon’s information entropy at the end of this section.
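The two extreme cases discussed above can be compared with a short Python sketch of an entropy-based diversity measure. It assumes that the relative information entropy is Shannon’s entropy divided by its maximum possible value (the logarithm of the number of words), which is one common way of rescaling it and may differ from the rescaling actually used. The function names are hypothetical, the near-uniform counts repeat the example figures given above, and the skewed counts are invented.

```python
import math


def normalised_entropy(counts):
    """Shannon entropy of a word-usage distribution, rescaled to the range 0..1.

    The rescaling divides by log(number of words); this is an assumption
    about how the relative information entropy is obtained.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))


def diversity(counts):
    """Diversity measure: 1 minus the normalised entropy.

    Values near 0 indicate near-uniform usage; higher values indicate
    more strongly differentiated word frequencies.
    """
    return 1.0 - normalised_entropy(counts)


# Near-uniform usage, using the example counts from the text.
near_uniform = [230, 228, 220, 219]
# Strongly skewed usage; these counts are invented for illustration.
skewed = [900, 50, 30, 20]

diversity_near_uniform = diversity(near_uniform)
diversity_skewed = diversity(skewed)
```

Under these assumptions the near-uniform group scores close to zero and the skewed group scores considerably higher, which matches the reading of the Y-axis in Figure 6.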
All analyses presented in this section refer to only a fraction of the useful structure of information that can be extracted from the IGF text corpus. First, our present analyses use actors – different stakeholders – as basic units of analysis. We have studied how similar or dissimilar are the uses of language by different stakeholders by comparing the patterns of word usage they have produced at the IGFs. We could have posed a symmetrical, and no less interesting question: how do different concepts, embodied in characteristic words and phrases, group together on the basis of their co-occurrences in the interventions of different stakeholders at the IGF? Our next step will be based on this and similar analyses that will provide an insight into the structure of the content of communication. We will be looking at semantic maps of words, just as we have examined a semantic map of actors this time.
Furthermore, we have used a single analytical approach that is well known in the cognitive sciences, attributive multidimensional scaling, to conduct the analysis discussed in this section. Although this technique is readily applied in the field of statistical natural language processing, other methods are also at hand for examining the structure of relations among actors, texts, and concepts. The full development of the Emerging Language of Internet Diplomacy project will rely on a wider spectrum of analytical methods and approaches than those used to produce the current analysis. Of course, in order to make full use of the results of exact, scientific procedures, the inclusion of qualitative analyses by Internet governance experts and the participants of the IGFs themselves will be invaluable; after all, it remains true that only the human mind can be the judge of whether scientific procedures do or do not produce meaningful results that can be relied on and used in practice.
 Kleinwächter W (Ed.) (2011) Internet Policy Making #2. Co:Llaboratory Discussion Paper Series No. 1. Mind: Multistakeholder Internet Dialogue. September 2011. Available at http://dl.collaboratory.de/mind/mind_02_neu.pdf [accessed 28 October 2012].
 United Nations, International Telecommunication Union (2005) Tunis Agenda for the Information Society. Document: WSIS-05/TUNIS/DOC/6(Rev.1)-E. World Summit on the Information Society, Geneva 2003 – Tunis 2005. Available at http://www.itu.int/wsis/docs2/tunis/off/6rev1.pdf [accessed 28 October 2012].
 For a discussion of the interplay between multistakeholderism and modern diplomacy, consult: Katrandzijev V and Kurbalija J (Eds) (2006) Multistakeholder Diplomacy - Challenges and Opportunities. Malta and Geneva: DiploFoundation. Available at http://www.diplomacy.edu/resources/books/multistakeholder-diplomacy-challenges-and-opportunities [accessed 29 October 2012].
 Antonova S (2007) Power and multistakeholderism in internet global governance. Towards a synergetic theoretical framework. Available at http://muir.massey.ac.nz/handle/10179/653 [accessed 27 October 2012].
 Chenou J (2011) Is Internet Governance a Democratic Process? Multistakeholderism and Transnational Elites. 6th European Consortium for Political Research General Conference, University of Iceland, 25–27 August 2011. Available at http://www.ecprnet.eu/MyECPR/proposals/reykjavik/uploads/papers/3135.pdf [accessed 27 October 2012].
 The semantic map presented in Figures 4a and 4b was generated by attributive multidimensional scaling. We selected the 100 most frequently used words by each stakeholder group in each of the six IGF years. From the obtained word lists we generated a list of unique words that have been most frequently used by all stakeholders. There were 492 such words. The patterns of word frequency were then mapped onto the list of unique words for each stakeholder and each year, and the distance matrix was computed from the resulting vectors of word frequencies. An algorithm for non-metric multidimensional scaling was used to generate a three-dimensional solution with a satisfactory value of Stress; Figure 4a (and 4b) presents the first two dimensions from the final solution.
 The connectedness of points referring to the same stakeholder group is not a result of any analytical procedure; it is given for the purposes of the ease of interpretation of Figures 4a and 4b only.
 Shannon’s information entropy was computed for each word usage probability distribution (for each stakeholder group and each year) and then rescaled to relative information entropy. Since high entropy indicates a more uniform distribution, the diversity measure used in Figure 6 is simply 1-entropy; we used this measure in order to make use of the Y-axis naturally, with higher values representing more diversified patterns of word usage.