Sometimes I'll mention that people in the humanities don't use "data" to describe sources, and ppl get mad at me? I guarantee you it's true.— Miriam Posner (@miriamkp) April 26, 2017
data specialists and data librarians in particular get mad at me, I should say.— Miriam Posner (@miriamkp) April 26, 2017
Big Data, analytics, data analysis, databases - all these have been with us for years. But when Miriam Posner tweeted the above yesterday, I began to wonder if we aren’t seeing a “data turn” similar to the linguistic turn following the adoption of (primarily French) cultural theory in the US in the 1970s and 80s. The inaugural issue of the Journal of Critical Library and Information Studies contained “A Case for Critical Data Studies in Library and Information Studies” by Tami Oliphant, and the “Collections as Data” project recently released the Santa Barbara Statement, suggesting that there is more to “data” than the narrow definition that might be provided by, say, a database administrator.
The use of “data” instead of “sources” reminds me of the resistance to using the word “text” to describe a non-textual object of interpretation. Yes, we can understand what Derrida meant, and we can recognize the characteristics shared by, say, a literary text and a non-textual object like a sound recording. But we can also recognize that there is something metaphorical about the use of the word “text” here. Traditional scholars resisted describing non-textual objects as texts, and so it would be easy for us to suggest that today’s traditional scholars are also resisting the description of non-data objects as data.
But what is non-data? There are common-sense understandings of “text” and “data” that might indicate the dividing line between text and non-text, or data and non-data. But where Derrida’s “il n’y a pas de hors-texte” could be dismissed as cultural-theory obscurantism, it’s much more difficult for us to dismiss the idea that something can be “not data”. We could describe non-data as something not amenable to computation; we could also describe non-data as something which is incapable of being used as the basis for information, but both of these definitions seem particularly slippery. On the one hand, what’s resistant to computation today may not be tomorrow (witness the recent, albeit narrowly focused, advances in machine learning); on the other hand, defining data in terms of information seems too circular (“what is information? Something you drew from the data”).
In The Prison-House of Language, Fredric Jameson’s 1974 study of Russian Formalism and Structuralism, he talks about the use of language as a model for non-linguistic objects of study (the basis of structuralism and post-structuralism).
The history of thought is the history of its models […] The lifetime of any given model knows a fairly predictable rhythm. Initially, the new concept releases quantities of new energies, permits hosts of new perceptions and discoveries, causes a whole dimension of new problems to come into view, which result in turn in a volume of new work and research. (v)
Language as a model! To rethink everything through once again in terms of linguistics! What is surprising, it would seem, is only that no-one ever thought of doing so before; for of all the elements of consciousness and of social life, language would appear to enjoy some incomparable ontological priority, of a type yet to be determined. (vii)
What I see in the discourse around “data”, and in the funding priority given to, for example, data and digital librarianship, at the expense of other, more traditional fields, is a commitment to data as a model, to an understanding of data as having “ontological priority” in the world of late/digital/cybernetic capitalism. And given that so much of our daily life - from social media, to financial transactions, to industrial production and circulation - is data-driven, perhaps there is something to this. The focus on metadata quality, linked data, and computation is, perhaps, nothing more than the necessary response to cyber-capital’s use of data as foundational infrastructure. But we have to remember that data is both “real” data and “metaphorical” data, just as a text was sometimes textual and sometimes non-textual. And we have to bear in mind not only the audience for our discourse, but the users of our data, recognizing when they need, or are already using, real data or metaphorical data.
In a sense, we have to work at several different levels at once. Jameson goes on to say that
We may say that as a method, Structuralism may be considered one of the first consistent and self-conscious attempts to work out a philosophy of models (constructed on the analogy with language): the presupposition here is that all conscious thought takes place within the limits of a given model and is in that sense determined by it. (101)
We must be aware that while we think of things in terms of the model (either a specific data model or data as a model), scholarship, research, teaching, learning and other modes of life go on without an interest in our model. The tribes studied by Lévi-Strauss had no need for his structural model of their society; they were just living. So we don’t need to force our “data turn” on other kinds of researchers; rather, we need to develop practices and construct systems that work on real data models that are flexible enough to afford many different kinds of engagement. This is not new to anyone who works with library systems and data, but it is important, I think, to bear the “unreal” nature of all models in mind as we go along.
And this is where I think the problem lies. There’s a hermeticism to a lot of technical library work (whether that’s systems, cataloguing, or metadata) that tries to ignore the broader social context of the decisions being made. We might be user-focused, we might try to future-proof our decisions, but fundamentally, we follow best practices that are based on professional practice and knowledge (metadata and cataloguing) or institutional culture (systems and development), all of which are slow to change and averse to changing too radically. If, indeed, we are going through a data turn, then in Marxist terms, our knowledge and practices are becoming fetters on the work that we need to do. But the main culprit, from my perspective, is an organizational culture that has no model, that is a collection of ad-hoc decision-making processes (almost exclusively top-down) focused on getting as much out of the neoliberal dynamics of a university or municipality as possible, rather than on leading an organization into a position where we can explore and support the data infrastructure which enables scholarship and research, and which is both methodology and object of study.
Fundamentally, as Rachel M Fleming noted this morning, we need a much firmer grasp on the economics of our situation, both at the granular budgetary level (“are digital initiative units better funded than public service units?”) and at the level of political economy. How we sign contracts, how we work with our vendors, how we engage with our administrations and our parent organizations, all of this is being done too positively, too hermetically, too naively. If we seriously want to change our library culture in order to support library work that will allow us to engage and work with researchers, teachers, and students, we are going to have to make some serious changes; we have to recognize our collective power to force things to change (a lesson we need to learn on the labour front as well); and we will likely need to get our hands dirty. For, even if we are going through a turn towards (digital) data as the overarching model, we need to be materialists too, and understand, challenge, and employ the material conditions in which we work.