The Role of Domestic Knowledge Pools

In a previous post, I hypothesized that internationality in innovation networks is negatively related to the size of the domestic knowledge pool. Countries with small domestic knowledge pools can be expected to depend on foreign co-operation in many areas of research, whereas countries with large innovative capacities probably have many of the necessary innovative resources within national borders. This post aims to substantiate this logic with some quantitative arguments.

How to measure the size of knowledge pools?

I assume that the number of annual patent applications is a good measure of the size of a country’s knowledge pool in a given year. Thus, the independent variable in the subsequent simple linear regression analysis is the count of patent applications per year and per country. Naturally, this is not completely accurate, as countries have different propensities to patent, but the patent count should provide a reasonable proxy. Furthermore, the data was reduced to contain only those countries with a significant patent output: all countries with fewer than 20 patent applications in the year 2013 were excluded. 2013 was chosen as the reference year for this filter because its data is relatively recent and relatively complete (unlike data for the most recent years).
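The filtering step can be sketched as follows. This is a minimal illustration on a made-up toy data frame, not the actual preprocessing code: the country codes and counts are hypothetical, and only the column names `country`, `year` and `count_inv` mirror those used later in this post.

```r
# Hypothetical toy data (NOT the real patent data)
toy <- data.frame(
  country   = c("AA", "AA", "BB", "BB", "CC", "CC"),
  year      = c(2012, 2013, 2012, 2013, 2012, 2013),
  count_inv = c(150, 180, 25, 12, 8, 40),
  stringsAsFactors = FALSE
)

# Countries with at least 20 patent applications in the reference year 2013
keep <- toy$country[toy$year == 2013 & toy$count_inv >= 20]

# Keep all years for the retained countries
toy_filtered <- toy[toy$country %in% keep, ]
```

Note that only the 2013 count decides inclusion: the hypothetical country "BB" is dropped despite 25 applications in 2012, while "CC" is kept despite only 8 in 2012.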

For simplicity, only inventor data is considered here. The knowledge pool should be best represented by the inventors residing in a given country, not the companies that operate in the country.

Analysis

The histogram of patent application counts per country in the year 2013 is strongly right-skewed. There are many countries with small patent output and very few countries with very large patent output.

library(ggplot2)
#dispdata is a previously created dataframe that contains patent count and dispersion data
attach(dispdata)

qplot(count_inv[year == 2013], geom = "histogram", binwidth = 500, col = I("black"),
      xlab = "patent applications in 2013", ylab = "count", 
      main = "Distribution of patent applications per country (2013)")

The log-transformed patent count is much closer to a normal distribution and is therefore more suitable for the linear regression analysis.

qplot(log(count_inv[year == 2013]), geom = "histogram", binwidth = 1, col = I("black"),
      xlab = "log(patent applications in 2013)", ylab = "count", 
      main = "Distribution of logarithmized patent applications per country (2013)")
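The effect of the log transformation on skewness can also be checked numerically. The sketch below uses simulated log-normal draws as a stand-in for patent counts (an assumption made purely for illustration, not the real data) and a small hand-written `skewness` helper, which is not part of base R.

```r
set.seed(42)

# Simulated stand-in for patent counts (NOT the real data): log-normal draws
# give many small values and a few very large ones.
sim_counts <- rlnorm(1000, meanlog = 5, sdlog = 1.5)

# Sample skewness: third standardized moment (about 0 for symmetric data)
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

skew_raw <- skewness(sim_counts)       # strongly positive: right-skewed
skew_log <- skewness(log(sim_counts))  # close to zero: roughly symmetric
```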

As the relationship between the size of national knowledge pools and internationality is investigated per year here, a series of annual regression analyses serves to illustrate the persistence of the hypothesized relationship over time. The series of scatterplots with fitted regression lines below shows that there is a clear negative relationship between countries’ patent count and average annual country dispersion.

ggplot(data = dispdata[year %in% seq(1985, 2015, 6), ], aes(x = log(count_inv), y = invd)) +
  geom_point() +
  geom_smooth(method = 'lm',formula = y~x) +
  facet_wrap(~ year, ncol = 3) +
  xlab("log(patent count)") + ylab("average country dispersion") +
  ggtitle("Patent count and country dispersion over time") +
  theme_light()

The following graph illustrates the relationship in 2013 in more detail and visualizes the positions of individual countries in the scatterplot.

library(ggrepel)

ggplot(data = dispdata[year == 2013, ], aes(x = log(count_inv), y = invd, label = country)) +
  geom_point() +
  geom_smooth(method = 'lm',formula = y~x) +
  geom_text_repel() +
  xlab("log(patent count)") + ylab("average country dispersion") +
  ggtitle("Patent count and country dispersion in 2013") +
  theme_light()

Fitting linear model: invd[year == 2013] ~ log(count_inv[year == 2013])
	Estimate	Std. Error	t value	Pr(>|t|)
(Intercept)	0.2774	0.03495	7.938	9.375e-09
log(count_inv[year == 2013])	-0.02246	0.005783	-3.885	0.0005466
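Because the regressor is the natural logarithm of the patent count, the slope is easiest to read in terms of multiplicative changes in patent output. The following short calculation takes the slope estimate from the table above and converts it into the expected change in country dispersion for a doubling and for a tenfold increase in patent count. It is a reading aid under the fitted model, not an additional result.

```r
# Slope estimate for 2013, copied from the regression table above
slope <- -0.02246

# log() in R is the natural logarithm, so multiplying the patent count by k
# shifts the predicted dispersion by slope * log(k)
change_per_doubling <- slope * log(2)   # about -0.0156
change_per_tenfold  <- slope * log(10)  # about -0.0517
```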

The negative relationship is highly significant (p < 0.001). The Q-Q plot shows that the residuals are approximately normally distributed, which supports the assumptions behind the linear model.
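A Q-Q plot of regression residuals can be produced with base R along the following lines. The sketch runs on simulated data so that it is self-contained: the data frame `sim` and the coefficients used to generate it are arbitrary illustrative choices, not the data or estimates of this post.

```r
set.seed(1)

# Simulated example data (NOT the data used in this post); the coefficients
# below are arbitrary illustrative choices.
n <- 31
sim <- data.frame(count_inv = round(exp(runif(n, log(20), log(50000)))))
sim$invd <- 0.28 - 0.02 * log(sim$count_inv) + rnorm(n, sd = 0.04)

fit <- lm(invd ~ log(count_inv), data = sim)
res <- resid(fit)

# Points close to the reference line suggest approximately normal residuals
qq <- qqnorm(res, main = "Normal Q-Q plot of regression residuals")
qqline(res)
```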

Having established the significance of the relationship under investigation, the next step examines how countries perform in terms of internationality when controlling for the size of their domestic knowledge pools. To do this, the residuals of the regression models for the years 2003 to 2013 are calculated and averaged per country. Ordering the countries by their averaged residuals shows how far they deviate from the country dispersion values predicted by the regression model.
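In symbols, the procedure can be written roughly as follows. The notation is introduced here only for illustration: d_{c,t} stands for the average country dispersion of country c in year t and n_{c,t} for its patent count.

```latex
% One regression per year t = 2003, ..., 2013
d_{c,t} = \alpha_t + \beta_t \log n_{c,t} + \varepsilon_{c,t}

% Residual of country c in year t
\hat{\varepsilon}_{c,t} = d_{c,t} - \hat{\alpha}_t - \hat{\beta}_t \log n_{c,t}

% Average residual of country c over the 11 years
\bar{e}_c = \frac{1}{11} \sum_{t=2003}^{2013} \hat{\varepsilon}_{c,t}
```

A positive average residual means that a country's innovation network is more international than its patent output alone would predict, a negative one that it is less international.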

allres <- data.frame(sapply(2003:2013, function (x) {
  model <- lm(invd[year == x] ~ log(count_inv[year == x]))
  res <- resid(model)
  names(res) <- country[year == x]
  return(res)
}))

meanres <- data.frame(apply(allres, 1, mean))
names(meanres) <- "residual_d"

library(pander)

# import a list of two-letter ISO country codes to make the country names explicit
isocountries <- read.csv("../../datasource/2digit ISO country codes.csv")
# look up each country's name by its code; match() keeps the row order of meanres
meanres$country <- isocountries$Name[match(row.names(meanres), isocountries$Code)]

# sort by mean residual, largest first (full column name instead of partial matching on $r)
pander(meanres[order(-meanres$residual_d), c(2, 1)], row.names = FALSE)

country	residual_d
Switzerland	0.09129
Belgium	0.09007
Singapore	0.0767
Canada	0.05237
India	0.04803
Ireland	0.04522
Russian Federation	0.04308
Poland	0.03315
United Kingdom	0.0327
Austria	0.02034
China	0.01951
Netherlands	0.01821
Germany	0.01649
France	0.009478
United States	0.006218
Spain	0.003483
Sweden	-0.01945
Australia	-0.02178
Denmark	-0.02238
Italy	-0.02672
New Zealand	-0.0342
Brazil	-0.0344
Saudi Arabia	-0.03747
Turkey	-0.03942
Norway	-0.04077
Taiwan, Province of China	-0.04281
Japan	-0.04595
Israel	-0.04672
Finland	-0.05372
South Africa	-0.05745
Korea, Republic of	-0.08312

The above table shows that the countries that perform best in terms of the internationality of their innovation networks (when controlling for domestic knowledge pool size) tend to be those with close ties to other large innovators via shared language, culture, and geographical proximity (e.g. Switzerland, Belgium, Canada, Ireland).
