21.04.2021
Existential Risks and Global Governance. Interview with philanthropist and technologist Jaan Tallinn

by Olena Boytsun, Founder of the
Tech & Society Communication Group
Jaan Tallinn (Photo: Annika Metsla)
Jaan Tallinn, a founding engineer of Skype and Kazaa, has chosen the prevention and reduction of existential risks to humanity as the main focus of his philanthropic activities. Jaan co-founded the Cambridge Centre for the Study of Existential Risk and the Future of Life Institute and financially supports other existential risk research organisations.

Jaan is an active angel investor and a partner at Ambient Sound Investments. He has served on the High-Level Expert Group on AI at the European Commission and the Estonian President's Academic Advisory Board, and is a former investor director of DeepMind, the AI company that became world-famous after developing AlphaZero, a computer program that achieved remarkable results in the games of chess and Go.

Olena Boytsun, an impact investor and the founder of the Tech & Society Communication Group, spoke with Jaan Tallinn about the current state of artificial intelligence development and global governance systems.

— Jaan, for more than a decade you have been working on the subject of existential risks. You identified this field as the focus of your philanthropic commitments, with the primary goal of reducing existential risks to humanity from advanced technologies. How would you define existential risks?

— There is a broader and more complicated definition of existential risk as something that catastrophically reduces the maximum potential of humanity. Let us imagine a future among the stars, with trillions of people all over the universe living happy lives and doing fun things. And then one year, say 2040 or thereabouts, that kind of vision suddenly becomes impossible to reach due to some catastrophe. Under the complicated definition, existential risk refers to such a situation. The simpler definition, though, is just the risk of everyone dying.

— Such risks certainly concern everyone.

— Existential risks fall into two categories. The first is natural risks. We know that every 10 million years or so a big enough rock comes along and hits the planet. That is what the dinosaurs found out, apparently, the hard way. And it's very possible that if we wait another 10, 20, 30 million years, there will be a big enough rock to wipe out humanity as well. Also, every once in a while there is a risk of a supervolcanic eruption big enough to drastically change the environment in a way that might actually lead to the extinction of the human species. These are typical natural risks.

The second category is about technological risks.
If you want to predict the future of the planet, the most important factor is going to be what kind of technology we will have.
Technology in some ways is shaping the future, and in these processes there might be things that are really bad for human survival. Another way of looking at this question is to realize that the size of the planet is not increasing, while the effective radius of every new technology is, on average, increasing. It's much harder to kill everyone using stone axes than to kill everyone using nuclear weapons, or using something that hasn't been invented yet, like synthetic biology.

So the risk is that, through continued technological development, we are entering a regime where the world is increasingly fragile, and there are more and more things that fewer and fewer people can do, either accidentally or deliberately, to completely disrupt the future of the human species.

— Would you define natural risks as those that humanity cannot prevent?

— People actually can do, and have done, a lot to mitigate natural risks. For example, we are monitoring asteroids. The "positive" thing about natural risks is that we roughly know how bad they are. And as far as we know, they are not going to get much worse in the next century or so, whereas with technological risks we know that they will get worse with every year.

— So natural risks are not caused by human activity, at least as far as we can understand, while technological progress is driven entirely by people.

— Exactly. We can reduce natural risks, whereas with technological risks we can both increase and reduce them, so they are much higher leverage.

— Of all the dangerous technologies that humanity can create, you focus on the development of artificial intelligence (AI). Why do you think it is important to concentrate on mitigating the risks connected to AI deployment?

— There are a couple of reasons. When it comes to current technologies, artificial intelligence is a sort of meta-technology: a technology that is potentially able to invent, develop, and deploy other technologies itself.

Whatever concerns we have about technology, as we increasingly delegate technological development to AI, we should also transfer those concerns to AI. By default, AI is not going to have any concerns. In particular, by default AI does not have the environmental considerations that humans have. Biologically, we require a very narrow range of parameters in order to remain alive, whereas robots don't care. That's why we send robots into space and into radioactive areas: they do not care about the environment.

AI is a very high leverage technology. Whatever technological concerns we, the humans, have, we can either address them using AI or at least ensure that AI-driven technological development continues to take into account our problems, concerns and limitations.


Jaan Tallinn (Photo: Annika Metsla)
— When you are talking about AI in this context, do you mean AGI — Artificial General Intelligence?

— AGI is one term that points to the core of the problem. At the moment, humans control AI by relying on the fact that AI is narrow. For example, if you are playing a chess game against an AI, you're not really concerned about the AI killing you during the match, although that would be useful for the AI, because then it would no longer have an opponent and would get the victory. Depending on its value function, killing you might be a positive thing to do from its perspective; however, the AI is simply not aware of this option. It is not general intelligence in the sense that it can't look up from the chess board and see what's going on in the world.
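To make the narrowness point concrete, here is a minimal, hypothetical sketch of what a chess engine's value function looks like; the board representation, `legal_moves` and `apply_move` below are illustrative assumptions rather than anything Tallinn describes. The point is only that both the score and the search space are defined entirely over the board, so an option like "get rid of the opponent in the real world" cannot even be represented.

```python
# Hypothetical illustration: a toy value function for a narrow chess engine.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def material_value(board, side):
    """Score a position as material balance from `side`'s point of view.

    `board` is an assumed list of (piece_letter, colour) tuples; nothing
    outside this data structure can ever influence the score.
    """
    score = 0
    for piece, colour in board:
        value = PIECE_VALUES[piece.upper()]
        score += value if colour == side else -value
    return score

def choose_move(board, legal_moves, side, apply_move):
    """Pick the legal move with the best material outcome.

    The search space is `legal_moves` only, so the system cannot represent,
    let alone prefer, any action outside the game.
    """
    return max(legal_moves, key=lambda move: material_value(apply_move(board, move), side))
```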

A concerning thesis is that once we have AIs that are general enough (and AIs are becoming more general every day), we will no longer have the ability to control them, because right now all the mechanisms we use for AI control are predicated on the fact that the AI is not aware of the control mechanism. That is why AGI is a potentially dangerous thing, at least as long as we don't figure out how to control it.

In addition, it is possible to have major disasters from AI that isn't general. A friend and colleague of mine, Andrew Critch from Berkeley, together with David Krueger from Montreal, has written a detailed paper called ARCHES: AI Research Considerations for Human Existential Safety.

They define the term Prepotent AI: an AI that may or may not be general, but that has two properties. First, once we have deployed such an AI, it will have an environmental impact at least as big as humanity's has been. Second, once we have deployed it, we can no longer turn it off. The reason for this is unimportant: it might be because it's not easy to turn off an AGI that is aware of the control mechanism, but it might also be some mundane, administrative reason, just as it is very hard to turn off the entire Internet, or some other systemic or economic reason why people are not willing to turn off such a Prepotent AI. Logically, this means that once we have a Prepotent AI out there, we are going to have a massive environmental impact that might actually kill us.

— This is a fascinating example about chess. I have devoted a number of years to the game of chess, and I know there have been discussions in the professional chess community about computers "killing" the game, especially with the development of the AlphaZero program. But chess players did not suspect that the risks could be so much higher.

— The really interesting thing about AlphaZero is the ending "zero". It means that in training, AlphaZero did not use a single bit of information that human civilization has ever produced. If you think about it, it is an alien. When you're playing chess, or Go, or shogi against this program, you're playing against an entity that has never interacted with human civilization. This is very interesting.

— You have mentioned a very interesting term, Prepotent AI. Currently, what is called AI is generally far from being AGI. Modern systems are machine learning systems: algorithms that make limited decisions based on databases. There is no flexibility in them.

— There is one important heuristic that I've been pushing people to use: whenever you see the term AI, just mentally replace it with "delegating human decisions to machines", and both the opportunities and the risks immediately become much clearer.

— Do you like the term "artificial intelligence" at all?

— It is a convenient shorthand. In the rationality community that I'm part of, there is a mental trick called "taboo the word". Very often people have slightly different definitions for words: you think you're saying one thing, but the listener hears something completely different.

In situations where the definition of a word gets in the way of communication, it is useful to deliberately declare that "we no longer use this term". Whenever you feel the temptation to use it, you stop and use some other explanation instead. In the context of AI, it is often productive to simply not use the term AI, because people understand different things by it.

— Why do you concentrate on the risks instead of the ideal powers of AI? As far as I understand, there is already a concept of Beneficial AI. Why not promote the idea of the potential benefits of AI that could save the planet?

— I'm a technologist, right? I do think that AI is a high leverage technology. One of the good reasons I am focused on AI is that whenever we fix the problems with AI, those fixes will positively influence other problems. If we fix AI risk, we automatically fix biological or natural existential risks, whereas if we fix the asteroid risk or the bio risk, we still have the AI risk to contend with.

The reason why I really focus on the downside is that if we don't pay attention to the downside, we are not going to have the upside. The problem is that the positive result is conditional on avoiding the negative one, so if a negative outcome happens, you will not get the upside. Focusing only on the upside does not meaningfully change how good the outcome is.

Even if focusing on the upside makes the positive outcome dozens of times better, the downside will cut off most of the scenarios in which that improvement could actually manifest.

It is valuable to focus on both the downside and the upside, but the order matters. If you focus only on the upside and think about the downside later, you might not be around to address the downside, because it will kill you first. Whereas if you fix the downside first, then you will have the rest of the universe to focus on the upside. That's why I think it's important to focus on the downside.

— The target group you are working with is the developer community: highly specialized experts who are aware of the great risks of their work. But as far as I understand, you are also trying to involve the general public in the discussion and understanding of these risks.

— Sort of.

— We should also be involved in the discussion about existential risks, shouldn't we? We, the humans.

— I would say that I don't have a very deliberate agenda when it comes to informing the public. I see that there is strong public interest in this topic, but I am not tackling it proactively. I agree to give interviews when approached, but I think it's much more important to make sure that the people who are actually developing these technologies are aware of the risks. At the same time, people who work on these technologies have friends, and even if I cannot reach a person directly, it is still useful for their friends to know about the risks that person may face in their work.
Sometimes people who are working on technology deliberately try not to think about the downsides of their work, because that is quite hard psychologically.
Therefore, in some sense, it can be easier to get their friends on board than the person themselves. There are multiple examples in the field of AI of people refusing to acknowledge very simple truths.

— It seems that the technological community you are talking about is both very developed and very closed. When I was studying this topic ahead of our conversation, I was struck by how developed this community is and how closed off it is from the outside world. It reminded me of the chess community, which is also self-sufficient. But chess players are not developing technology that could have such important consequences for the general public as AI.

— As my friend Max Tegmark sometimes says, it's quite possible that the future of humanity for the rest of the history of the universe is going to be decided by some guy on Red Bull at 3:00 AM in a server farm.

I'm not saying that's going to happen, but definitely there's risk of that. Very, very few people are deciding now what is going to happen with humanity. It would be great to avoid such a negative scenario, but it's not clear yet how to do that.

— This is a very interesting subject. Do you think society should pay more attention to the ongoing research?

— It would be valuable to think about some kind of governance mechanisms. There is a nice group in Oxford that I support, directly or indirectly, called GovAI, the Centre for the Governance of AI. It is an academic group thinking about what methods regulators, or the public in general, can use to shape the technology when it comes to AI.

I do think it is unlikely that the guy with a Red Bull at 3:00 AM actually wants to decide the future of humanity. So in some sense it is in the interest of the AI companies to have a wider mandate. I expect that if more reasonable governance ideas surface, AI companies will also be interested in seeing how they can adopt and cooperate with them. We definitely need ideas on how to govern better.

— When you talk about governance, do you mean governance of the AI development process, or building management systems, for example for global coordination, with the help of AI?

— Our common goal is actually this: how can we take into account the views of the wider set of stakeholders, to use a popular word nowadays. Almost 8 billion people have a stake in the future, and it's unfair that almost none of them have a voice in what is going to happen now. I think that on a philosophical level, AI governance is really trying to solve the problem of how to give people back a voice over their own future, rather than leaving it to the guy with the Red Bull at 3:00 AM in some server farm. That is the strategic, philosophical question.

The tactical, pragmatic question is about the mechanisms we can use to channel that voice. Some people say that we need to democratize AI development. Sure, but if we are looking for the most efficient way of solving this problem, it is not feasible to train all 8 billion people in AI development.

It is much better if the people who are interested and well positioned advance AI development, while we have mechanisms, whether facilitated by AI itself, by regulation, or by other coordination tools, to make sure that the people developing AI actually channel the interests of the people who are not developing it.

— The question is whether this group of developers truly believes it can develop the best system for the whole of humanity. It still seems to me that the chances that a random programmer will develop general artificial intelligence overnight are not that high. Perhaps there is a greater likelihood that a billionaire will decide to invest in the project and push an entire team of developers to create such a product, the outcome of which, unfortunately, will not be possible to control afterwards.

— You are saying the same thing that I am, only on a different level. What does the team consist of? The team is composed of young guys with Red Bull. You can model it as a billionaire doing something, but at the end of the day, the billionaire is not the one who pushes the button.

I do think it would be very valuable if there were mechanisms for delegating crucial decisions, and if the actual engineers were aware of those mechanisms, just as there is a long tradition of whistleblowing.

If a crazy billionaire, or more likely some corporate entity led by executives who only care about making a profit (and they were selected to care about profits), insists on continuing with the deployment of a potentially dangerous technology, then the engineers could say: shouldn't we be following global AI regulations? For such cases, having regulatory mechanisms and a high level of awareness would also be useful.

— During one of your public talks you said that capitalism creates a feeling of safety around tech development, which, in my opinion, fits well with our discussion of billionaires versus developers. Do you think that the lack of regulation could be a problem?

— I grew up in the Soviet Union, with constant propaganda about rotting capitalism. Luckily, in Estonia we could also watch Finnish TV and develop some doubts about how much truth there was in that picture.

Capitalism is clearly superior to anything the Soviet planned economy ever was, and the important factor here is the positive feedback loop between consumers and producers. It is not very valuable to develop technologies that consumers do not want, and consumers can vote with their money, thereby pulling the economy towards what is useful and good for people. But it is important to realise that this does not solve all the problems that may arise, for two reasons: externalities that the market does not naturally price in, for example the impact on the environment, and things that do not depend on consumers, for example military technology.

I can observe and compare the US and China: one of them, at least nominally, is a communist regime, and the other is free-market capitalism.

Jaan Tallinn (Photo: Annika Metsla)
Capitalism creates an almost complete illusion that the future is going to get better and better, because companies are always on the side of consumers. And this is where the important exceptions come into play: very large externalities, sudden large-scale disasters that the market has not taken into account, or other strong market constraints.
When I think about how to address the potential risks associated with advanced technology, I see that China is now in a much better position to regulate, because there is much stronger regulation there in general. If a really dangerous situation arises, they can stop it, while in the US almost everyone may agree that something is going to be bad, yet it could still take decades to stop the process. The inability to deal effectively with such external factors is further evidence of a weakness of capitalism.

— Modern economists debate a lot about different systems. For example, the new term "state capitalism" was introduced for China, as opposed to traditional market capitalism. But if capitalism has its challenges, and the planned Soviet economy, as we know for certain, does not work, then what system, in your opinion, would be the best for a country, or even for the global level of coordination?

— I try not to hold very strong views here. As a technologist, I'm very fascinated by blockchain. In this framing, the most important thing that blockchain has brought to the world is the ability to agree globally on a piece of data without trusting any single party to maintain that piece of data. I have done a couple of workshops with AI safety and blockchain people to think about whether there are any positive use cases that could now be implemented for people to coordinate in the cheapest possible way, in terms of how much trust they need to invest in the system.
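As a rough sketch of the property Tallinn points to, the toy code below (an illustrative assumption, not any production blockchain) shows how a hash-linked, append-only record lets anyone verify a shared history without trusting whoever stores it; real blockchains add consensus and incentive mechanisms on top of this tamper-evidence.

```python
# Minimal hash-chain sketch: any retroactive edit breaks every later link.
import hashlib
import json

def chain_append(chain, record):
    """Append `record`, linking it to the previous entry by hash."""
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    body = {"prev": prev_hash, "record": record}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})
    return chain

def chain_is_valid(chain):
    """Recompute every link; tampering with past data is immediately visible."""
    prev_hash = "genesis"
    for entry in chain:
        body = {"prev": entry["prev"], "record": entry["record"]}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != digest:
            return False
        prev_hash = entry["hash"]
    return True

ledger = []
chain_append(ledger, {"agreement": "draft treaty, version 1"})
chain_append(ledger, {"agreement": "draft treaty, version 2"})
assert chain_is_valid(ledger)
ledger[0]["record"]["agreement"] = "tampered text"  # someone rewrites history...
assert not chain_is_valid(ledger)                   # ...and everyone can detect it
```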

In general, I believe that in global governance, concepts such as cooperation and transparency of processes are important, and blockchain, at least at some level and from some angle, brings these two things together.

But I don't really hold any strong opinion about a particular system; I just think it's important for us to increase our ability to coordinate.

— Do you feel that there is still no valuable, structured idea for a global governance mechanism?

— The world has never been coordinated globally. Sometimes there are important "tragedy of the commons" situations between states.

For example, let's take the arms race. The reason you are in an arms race is that other people are engaged in the arms race, so it is a vicious circle. That is why it is so important to have international treaties that limit the things that push you towards competing interests. This is the classic "prisoner's dilemma" or "tragedy of the commons".

It is in the common interest of all people that no one invests in, for example, ever more powerful weapons, but at the same time it is even more profitable for one particular group to be the only ones who do. This means that you are in a situation that either doesn't have a stable Nash equilibrium or has a bad Nash equilibrium.
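A toy payoff matrix makes this structure concrete. The numbers below are invented purely for illustration; they encode the standard prisoner's-dilemma shape in which arming is each side's best reply whatever the other does, so the only Nash equilibrium is the collectively worse outcome.

```python
# Toy arms-race game with made-up payoffs; higher numbers are better.
from itertools import product

# payoffs[(choice_A, choice_B)] = (payoff to A, payoff to B)
payoffs = {
    ("refrain", "refrain"): (3, 3),   # resources go to welfare instead of weapons
    ("refrain", "arm"):     (0, 4),   # the lone armer dominates
    ("arm",     "refrain"): (4, 0),
    ("arm",     "arm"):     (1, 1),   # costly standoff
}

def is_nash(a, b):
    """Neither player can gain by unilaterally switching strategy."""
    pa, pb = payoffs[(a, b)]
    best_a = all(payoffs[(alt, b)][0] <= pa for alt in ("arm", "refrain"))
    best_b = all(payoffs[(a, alt)][1] <= pb for alt in ("arm", "refrain"))
    return best_a and best_b

equilibria = [(a, b) for a, b in product(("arm", "refrain"), repeat=2) if is_nash(a, b)]
print(equilibria)  # [('arm', 'arm')] -- the collectively worse outcome is the stable one
```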

— You had an idea about the Global Preference Discovery System.

— I used this term in some preliminary work that I did. Right now, governance in a democracy, for example, in some ways conflates two different concerns: one is figuring out what people want from the future, and the other is how to get it.

Politicians usually say they know how to solve both: we want X and we are going to do Y to get it, so vote for us. But I think it would be useful to have a system that lets people vote on what they think a good future looks like, so that people determine X independently. The easiest way to do this is to establish a system of regular polls, for example, randomly asking people about their lives.

If we had sufficient randomised information from around the world about how people are feeling right now, we could build a World Wellness Index and use it to make predictions. For example, what would happen to this global index if the US completely opened its borders? This could be an interesting tool for policy debates and regulation, one that would decouple the question of what we want from the future from the deliberation about how to get to that bright future.
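As a sketch of how such an index might be computed (the World Wellness Index is only an idea here, and the country names and figures below are invented), randomised life-satisfaction polls could be aggregated into a single population-weighted number that can be tracked over time or under hypothetical policy scenarios.

```python
# Hypothetical sketch of a "World Wellness Index"; all data is simulated.
import random

def poll_country(sample_size, true_mean, noise=1.5):
    """Simulate randomly asking `sample_size` residents to rate their life from 0 to 10."""
    return [min(10.0, max(0.0, random.gauss(true_mean, noise))) for _ in range(sample_size)]

def wellness_index(samples):
    """Population-weighted average of per-country mean ratings.

    `samples` maps country name -> (population, list of individual ratings).
    """
    total_pop = sum(pop for pop, _ in samples.values())
    return sum((pop / total_pop) * (sum(ratings) / len(ratings))
               for pop, ratings in samples.values())

samples = {
    "Country A": (330e6, poll_country(1000, true_mean=6.9)),
    "Country B": (1.4e9, poll_country(1000, true_mean=5.8)),
    "Country C": (44e6,  poll_country(1000, true_mean=6.2)),
}
print(round(wellness_index(samples), 2))  # one comparable number to track over time
```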

— I understand the mechanism, though, to be honest, I was deeply impressed for the second time by the fact that these issues are discussed within a closed technology community, as well as by the way they are discussed. And I am a Ukrainian economist; it is not so easy to impress me at all.

— Recently in the rationalist community, Scott Alexander, a very well known figure in the community, and Glen Weyl, a researcher from Microsoft, had a super interesting and friendly debate on exactly the issue you are talking about. Glen Weyl said: yes, I am a mechanism designer, but I do not trust mechanisms, because at the end of the day you can just keep shooting yourself in the foot with them. Scott Alexander's response, perhaps overly critical but still very good, was that this doesn't mean we should always rely on our gut feeling.

At the end of the day, it is a spectrum of how much you trust people versus how much you trust mechanisms. If you only trust people, then you still have to have some constraints that guide people, such as the rule of law. And that is already a mechanism; democracy is already a mechanism. It's important to strike a balance.

— The concern is that a very small group of technology specialists is developing a mechanism for understanding, for example, such important questions as universal human values.

— I don't think I should oversell this thing; it was just an example of a mechanism that the world doesn't have yet but could have, and I think it could be very useful.

When I talk about a mechanism for preference discovery, I want to point to the fact that we need mechanisms that give people more voice in the outcomes, and I am open to any ideas. The current tools we have are several hundred years old; they were invented and put into place when horses were still our main means of transportation. Our results in giving people more opportunities to express their opinions could be better, but at the same time I understand very well the dangers associated with such a system, such as exaggeration or delusion, or manipulation aimed at pretending that people want, for example, to maximize the profits of one particular company.

One can look at the example of Amazon's review and preference discovery system. In theory, the system is supposed to discover preferences based on the information people enter, but in practice it gets scammed all the time. It is very important to verify that such a system is robust and resilient enough to actually do what it was designed to do.

— Thank you for reassuring me a bit that you are aware of this risk. It was also fascinating to learn about the AI alignment movement. How would you describe the goals of the movement for the general public?

— There are several ways to look at it. If we think about AIs as machines to which we delegate our human decisions, then we need to ensure that our ideas about what a good future looks like are fully transferred to the AI.

What is really important to realize is that AIs are more alien than aliens. They don't have any biological background; they did not evolve biologically, and they don't have any concerns about the environment.

We shouldn't underestimate the difficulty of transferring human values to AIs, because AIs are as autistic as one can get. People are prone to think that AIs are basically human, and that as they get smarter they will become even more human-like. No, they will not.

— Why are you so sure about it?

— There are very good arguments for that. For example, humans were shaped by biological evolution in a social context, in groups. There is a great book called "Moral Tribes: Emotion, Reason, and the Gap Between Us and Them" by Joshua Greene from Harvard.

Joshua looks at how human morality developed, and it was indeed in small tribes of 50 to 100 people. Morality was something that appeared automatically because it helped those tribes become more competitive: people looked after each other and behaved altruistically. On some level, altruism is a kind of self-sacrifice, but on the tribal level it makes the group more competitive.
It's a very interesting possibility that AI might have something like that happening automatically, but it's not going to develop in 50-agent groups in an ancestral environment in Africa 100,000 or 200,000 years ago. It's just vanishingly unlikely that AI is going to be human-like after it has been trained on servers for a couple of months.

— Unless AI is the next step of evolution, and since nature is rather balanced, organized and mathematically precise...

— Exactly, it's possible that there is some kind of almost mathematical attractor towards cooperation. But if you ask anyone whether they are ready to put the lives of their children on the line in the hope that AI will automatically develop a system of morality similar to human morality, the answer will most likely be negative.

The argument that there is a general tendency towards morality in AI development is extraordinarily attractive, but you need really strong evidence for it, rather than just brushing the question off and assuring everyone that everything will be fine, which AI developers quite often do.

— If the task is to transfer the best of human values into artificial intelligence systems, wouldn't it work to use the structures that humanity has already developed? For example, the Ten Commandments, or Isaac Asimov's three laws of robotics?

— The classic answer regarding the laws of robotics is that they don't even work in the novels. Asimov invented those laws to show how they would fail once you put them into real systems. Why would we expect them to work in real life?


Jaan Tallinn (Photo: Annika Metsla)
The more general answer is that whenever you want to constrain something that is smarter than you, the constraints are unlikely to work, because what you consider to be constraints and what it considers to be constraints will be different. Your ability to come up with such limitations is weaker than its ability to find ways around them, so imposing constraints of any kind is difficult. You need a much more nuanced approach. There are really interesting ideas on the website called the Alignment Forum.

— Humanity has been thinking about the question of values for thousands of years, and a number of concepts have been developed to better understand both the individual mind and soul and society as a whole. For example, there is the "collective unconscious" of Carl Gustav Jung, the "pure reason" of Immanuel Kant, and the "noosphere" of Vladimir Vernadsky. Is it possible to transform these metaphysical concepts in some way for practical use in AI alignment?

— I am not very familiar with the concepts you are talking about. From the sound of it, I would expect a really big problem with using ideas from continental philosophy: it would be a massive ontological problem. These concepts are built on things that do not translate into machine code, and it is difficult to program things that you cannot specify down to the level of assembly language. It would be a translation issue.

— I firmly believe that one can describe almost everything with mathematics, which means that one can program almost everything as well.


— As long as you can put it into math, then of course it is no problem to program it at all. There could be certain engineering problems, but that is already a technical issue. In general, however, people who talk about continental philosophy and the like usually do not know mathematics at all.

— Challenge almost accepted, and maybe in 35 years' time I will come back to you with the solution.

— I think it's valuable to take many perspectives. The first challenge would be how to make the ideas concrete enough to be mathematical and compatible with computer science. Blockchain serves as a great example.

— I brought up this question because I think that, perhaps, for AI alignment it would be useful to include and apply achievements from different fields, such as philosophy, sociology, biology, genetics and art, not to mention physics, chemistry and astronomy.

— I agree, and a big problem is keeping this productive. Usually, when people realize that it is necessary to collect different perspectives, they decide to hold a big conference. Now, during the pandemic, we have virtual conferences. People spend a bunch of hours talking and then nothing happens. An important challenge is to make sure that all the ideas can be taken up in a productive manner, rather than trying to convince people to read a lot of documents and spend hours on something that will not lead to any result.

— It probably makes sense to think about how to make the processes more inclusive and transparent in general, with the participation of all stakeholders. I'm not talking about inviting non-specialists to your professional conferences, but perhaps by now your community and its ideas have grown enough to open up a bit and involve society to some extent as well.

— I'm a co-founder of the Future of Life Institute, and the biggest thing we do is a conference every two years. It was supposed to take place in 2021, but because of Covid it didn't happen. The event helps bring together different perspectives on the situation. But it is really important to select the participants, because if there is even one person in the room who does not understand what is happening but actively tries to influence the process, it can really ruin productivity for everyone.

— You've built a great community to solve really important problems. However, it seems that you have come to the point where you need to involve other groups of experts and society at large. In the same way that you are deep into AI development, there are groups working on, for example, direct democracy mechanisms or civic tech systems.

— I agree, and that's why I'm going to reach out to the mechanism design community, which is different from programmers.

— Thank you so much for your time and conversation, it was extremely useful and interesting. I am glad that we will be able to continue discussing this topic at the first meeting of the Tech & Society Communication Group.
References
  1. Andrew Critch, David Krueger. AI Research Considerations for Human Existential Safety (ARCHES). https://arxiv.org/pdf/2006.04948.pdf

  2. Max Tegmark. Life 3.0: Being Human in the Age of Artificial Intelligence. https://www.goodreads.com/book/show/34272565-life-3-0

  3. A debate between Scott Alexander and Glen Weyl about the technocratic approach to policy development (mentioned in the interview). https://astralcodexten.substack.com/p/contra-weyl-on-technocracy

  4. Joshua Greene. Moral Tribes: Emotion, Reason, and the Gap Between Us and Them. https://www.goodreads.com/book/show/17707599-moral-tribes

  5. Rationality Community. https://www.lesswrong.com/tag/rationalist-movement

  6. The Centre for the Governance of AI (GovAI), part of the Future of Humanity Institute at the University of Oxford. https://www.fhi.ox.ac.uk/govai/

  7. AI Alignment Forum. www.alignmentforum.org

  8. Future of Life Institute. futureoflife.org

  9. Cambridge Centre for the Study of Existential Risk. cser.org
© 2021-2023 Tech&Society Communication group. All rights reserved.