In this episode of our “Political Construction of AI” series, we present our interview with Alex Hanna from the Distributed AI Research (DAIR) Institute, which also counts notable figures such as Timnit Gebru and Emily Bender among its contributors.
A sociologist and AI researcher, Hanna left Google’s ethical AI team and joined DAIR, where she works on developing a critical approach to AI that addresses data exploitation, social biases, environmental impacts, and corporate monopolies. She also contributes to DAIR by championing participatory and distributed research models, aiming to build AI that places social benefit at its core.
In this interview, you’ll find Hanna’s views on the “bubble” nature of AI, data monopolization by big tech, financial policies that threaten academic independence, and new forms of labor exploitation. We also explore the alternatives offered by independent initiatives like DAIR and discuss how to shape a more equitable, participatory, and transparent approach to technology. Alongside her critical perspectives on the AI field, Hanna provides a thought-provoking answer to the question, “Do we really need AI?”
'AI is both overhyped and a bubble'
Do you think the concept of artificial intelligence is just hype or a “bubble”?
AI is not a coherent set of technologies, but the current wave of "AI" centers on what's known as "generative AI": large language models, which generate text from text, and diffusion models, which generate images from text. It's both overhyped and a bubble. Hype surrounds technologies which overpromise and underdeliver. And it's a bubble because massive amounts of capital have been poured into AI with very little to show for it. Even venture capitalists like Sequoia Capital have admitted that, to pay off the investment, AI technology has to generate something like $600 billion in revenue.
Critical perspectives describing large-scale language models as “statistical repetition mechanisms” argue that these systems do not offer genuine comprehension or insight. What kinds of misconceptions might arise when AI is attributed with “deep understanding,” and what social and political repercussions could such illusions entail?
That turn of phrase is a good one, as it highlights that models repeat, in different configurations, what is in their training data. These models do not have any sense of self, and they do not understand the world in any real sense. Attributing understanding to them suggests that they are like humans and can replace real humans in important social and political configurations, which is far from the truth.
Centralization of computing power and data
How do the massive data collection and processing policies of big tech companies create a power asymmetry in the field of AI? In what ways does data monopolization, combined with profit-driven AI narratives, exacerbate global inequalities and power imbalances?
To be at all effective, these technologies need huge amounts of computing and data, which means that the institutions which can support them are often either big tech companies or companies which specialize in collecting that data and training models with huge amounts of compute. Many of the latter, like OpenAI and Anthropic, have been subsidized with computing power by Microsoft, Google, and Amazon. AI tools are not necessarily desirable in themselves, but their dependence on centralized compute and data exacerbates the existing concentration of computing power in a small set of firms.
In what ways can AI models trained on large datasets reproduce social biases and discriminatory patterns?
AI models are trained on all the internet has to offer, including its racism, sexism, colonialism, and other forms of discrimination. There are some methods of cleaning them up on the output end, but companies do not spend time cleaning the data themselves and often do not even have a good idea of what's in their datasets. So these biases can be reproduced: in text, in images, and in video.
'Many models don't have the data in the format they need'
Through what mechanisms do AI and algorithms exploit human labor, and what kinds of transformations do they bring about in the world of work? (For instance, job losses due to automation, precarious “gig economy” employment, and so on.)
AI requires huge amounts of labor to run in the first instance. There are immense amounts of labor needed to label, filter, and screen out any type of harmful content which may come out of these models. In addition, many of the models do not have data in the format they need to operate in the first place. This work, which has been called data labor, has become more pronounced as more models are created by more companies.
In addition, employers are chomping at the bit to replace workers with these tools, as they perceive that much of the work office workers and creative workers do can be automated with synthetic media. For instance, Bill Gates believes that in the future we won't have doctors and teachers, as they'll be replaced by these models. But in fact, these tools cannot adequately replace this work, especially the important social work of doctoring and teaching. That hasn't stopped many organizations from promising that the work can be automated and laying people off because of it. We're already seeing this in industries like graphic and video game design, where many workers have been laid off from full-time work, only to be hired back as gig workers.
It is claimed that data collection and processing increasingly rely on low-wage, insecure, and often invisible forms of labor. How does this situation reinforce the devaluation of labor, and what can be done to build a more equitable system?
This is partly addressed in my answer about data workers above, so I want to speak to the second part of the question in particular. If these workers were treated well at an industry level, with sufficient wages, breaks, and mental health care, it could help prevent the worst elements of this kind of work. But it's also the case that this work follows the same patterns we see in global clothing, coffee, and chocolate production.
Accountability issues
In what ways does the dominance of big tech companies in funding AI research threaten academic independence? What long-term risks arise when research agendas are shaped by commercial interests, and what measures can be taken to address this?
Big tech has become a revolving door for academics, especially in computer science and AI research. Large industry grants are given to academics who do this research, these researchers spend time at industry labs, and they send their students there after graduation. CS and AI research has become dominated by deep learning approaches, to the detriment of other types of research which are not considered as popular and do not receive Big Tech dollars. The long-term risk is that other research questions and paradigms within CS are ignored. AI is poorly justified theoretically, but throwing more data at these models has become the dominant paradigm. Transparency around funding sources could help, but there are few accountability mechanisms available. Alternative funding mechanisms could also help, but it's very hard to compete with large tech firms.
The lack of transparency regarding how companies train and operate their models is a major source of concern. To what extent can greater transparency and accountability solve these problems, or do we need more structural regulations?
Transparency is a floor, not a ceiling. We need much stronger regulation around data transparency, but also the ability for collectives to act when corporations run afoul of regulation.
Is 'another AI' possible?
Independent and decentralized AI research initiatives, such as DAIR, enable collaboration among experts from diverse geographies and disciplines. How does this approach enrich AI studies, and what alternatives does it offer in the face of big tech monopolies?
DAIR strives to be an institution where communities can research questions that are important to them, outside the influence of big tech. Some other institutions, like Te Hiku Media, work to secure data sovereignty and control of the means of computation within their communities, as well as hold themselves accountable to those communities.
Critical perspectives focusing on data exploitation, environmental harm, social prejudices, and corporate monopolies are exploring ways to build more equitable and responsible AI. Within this framework, how should design principles, regulation, and governance models be shaped? And in light of everything we’ve discussed, do you believe that “another AI” is possible?
I don't think we need "AI." Rather, we need technology which works for people, outside of corporate imperatives and their authoritarian tendencies. Principles, regulation, and governance should be organized around broader participation, especially from people who are typically outside the development cycle of most tech. (DS/VC/VK)