A Business Prototype for the Digital Humanities

Sep 3, 2015. | By: Dean Irvine

do not change IBM into International Busa Machines

– Thomas J. Watson, in conversation with Father Roberto Busa

Contrary to anachronistic origin stories of the digital humanities, Father Roberto Busa’s earliest experiments in humanities computing were conducted using analogue technologies and mechanical instruments. In the preface to his first machine-generated concordance, the Varia Specimina of 1951, Busa foregrounds the analogue mechanics of its computation and production: “The concordance which I am presenting as an example is precisely an off-set reproduction of tabulated sheets turned out by the accounting machine.” Already a specialized type of counting, his concordances enlisted, and evolved into, instruments of accounting.

Busa’s partnership with IBM, which extended over six decades, inaugurated a prototypical business model for humanities computing. “I could recompense IBM in any way except financially,” Busa recalled telling then IBM chairman and CEO Thomas J. Watson prior to their now-legendary meetings in 1949. There were two meetings: the first, at which Busa made his pitch and Watson requested a formal proposal to distribute to his engineers, and the second, at which Watson was prepared to reject the proposal based on a report from his technical team, but changed his mind and decided to support it for a trial period. Busa’s initial partnership was premised, from the outset, on IBM’s ownership of Herman Hollerith’s patents for the card punch machine, as well as card tabulating and sorting machines. Once he entered into this partnership, there was no economically feasible means of porting his proprietary data to another company’s processing machines. Like IBM’s business clients, Busa effectively entered into a licensing agreement, which guaranteed IBM recurrent – if unpredictable – returns on its investment. To end the agreement would not only have brought an end to their partnership; it also would have brought an end to his career-long investment in IBM’s proprietary systems. Exemplary of a sustainable business model, Busa’s partnership with IBM underwent massive technological and institutional changes over the course of more than half a century – a period from 1949 to 2010 that saw their collaborations transition from punch cards to magnetic tape, card readers to mainframes, RAM to CD-ROM, and multivolume bound concordances to Web-based databases.

Any reconstruction of the details of the agreement between the Jesuit priest and the CEO of IBM can be based only on published accounts. To take Busa’s recollection of the meetings at face value, as it were, the agreement resembles what we would now call an “angel investment” – that is, an investment by affluent individuals, companies, or trusts that demonstrates a willingness “to assume bigger risks and accept lower rewards when they are attracted by the nonfinancial characteristics of an entrepreneur’s proposal” (see Dilek Cetindamar, The Growth of Venture Capital, p. 42). Notably, the concept of angel investment has its origins in the history of early Broadway, where the arts and business meet directly, when so-called “angels” would finance theatrical productions. Unlike angel investors who engage more directly in their investments, Watson assigned IBM executive Paul Tasman to oversee the company’s partnership with Busa. In any event, it is undeniably serendipitous that one of the period’s most powerful capitalists should have played the role of “angel” to the priest. All the better that Watson and IBM backed Busa’s work on machine- and computer-generated concordances and not a Broadway show about St. Thomas.

As the proof of concept for Busa’s magnum opus, Index Thomisticus, the Varia Specimina is a signal example of production models adopted by mid-century engineers and, eventually, programmers. This stage is a typical business-model requirement and a step toward securing additional investment. As it happens, the Varia Specimina could also be classified as a prototype, since the proof of concept had already been worked out in trials on Dante’s Cantos. The forensic detail that Busa provides in documenting his procedures in the Varia Specimina is akin to a laboratory report – or, given its potential audience at IBM, a marketing report. Perhaps Busa’s most prominent finding is the predictable, yet telling, discovery that the “greatest hindrance” in conducting trials with punch card technologies is “transposing the system from the commercial and statistical uses to the sorting of words from a literary text.” For IBM, the capitalization on these trials would require the realization of the obverse: to convert the literary text into a data system for commercial and statistical use. This, as it happens, was the advent of natural language processing and machine translation.

Working to create the successor to the analogue Varia Specimina, Busa collaborated with Tasman on digital projects in the mid-1950s, which included programming a machine-readable index to the Dead Sea Scrolls on magnetic tape read by an IBM 705. After working for several years out of the IBM offices in New York and Milan, Busa raised enough capital by 1956 to found the Centro Automazione Analisi Linguistica (CAAL) at Gallarate, Italy. Reporting on their experiments at Gallarate in the July 1957 issue of IBM’s in-house research and development journal, Tasman offered a prescient account and made explicit some of the ways in which IBM expected to derive value from their collaboration: “The indexing and coding techniques developed by this method offer a comparatively fast method of literature searching, and it appears that the machine searching application may initiate a new era of language engineering.” Although Tasman’s predictions were relatively modest, suggesting that the algorithmic processes that they had developed could “lead to improved and more sophisticated techniques for use in libraries, chemical documents, and abstract preparation, as well as in literary analysis,” the history of information technology that has since transpired would support far more ambitious outcomes. If Busa’s legendary 1949 meetings with Watson initiated a business model for humanities computing, the returns on that investment would prove far more substantial than either man could reasonably have anticipated. Tasman’s promise of a “new era of language engineering” is the one in which we live now: it is the era of IBM as big data corporation. Watson’s meetings with Busa may well have launched the priest on a course to seek investors in his own research laboratory and data-driven empire, one that effectively translated “IBM into International Busa Machines,” but the greater empire would be built on the investment in Busa, one that leads to the computational transvaluation of linguistic data into capital. Making it new, for IBM, is making it count.

By the time Busa wrote the foreword to A Companion to Digital Humanities (2004), the story of a humanities scholar making the kind of pitch that he delivered in 1949 had become legend. Who could imagine a sequel? Writing in the aftermath of the attacks on September 11, 2001, he observed that we were living in “an unforeseen season of lean kine,” that he had witnessed “reductions in public funds for research,” but held out the promise that the “period will pass” and that “cutbacks in finance” could lead to “the according of priority to a definitive solution … which could facilitate the fulfillment of the globalization of economic exchange.” With these words, the priest became the CEO of a field that had recently rebranded itself as the “digital humanities.” These prognostications resounded with the belt-tightening neoliberal rhetoric of austerity economics and the triumphalist discourse of global capitalism. After six decades of working with IBM, it comes as no surprise that Busa could convincingly play the spokesman for the business empire that had backed his enterprise.

[This post is an abridgment and adaptation of my open-access article “From Angel to Agile: The Business of the Digital Humanities,” Scholarly and Research Communication 6.4 (2015)]

