Linguists vs Computer Scientists
Computational linguistics and natural language processing ought to be a field which combines the talents of two very different academic disciplines, linguistics and computer science. I somewhat stumbled into this fascinating area by chance (by taking a computing subject as ‘breadth’ while studying towards a linguistics major) and I’ve been hooked ever since, but in this post I want to talk about the curious cross-discipline characteristics of linguistics and computer science.
Now having passed the half-way mark in my undergraduate course work, this 40-year-old nuisance student has finally arrived in the depths of linguistics and, this semester, a formal course of study in NLP called Language and Computation. At UniMelb this subject is marked as ‘breadth’ which means that no matter if you’re studying science or arts, you can opt to study the subject as part of your compulsory cross-school breadth units. Linguistics is a very popular major. I don’t have statistics but I can tell you that in the core subjects we’re talking about the larger lecture theatres. There’s probably a couple of hundred linguistics majors in any one core subject.
It would follow, then, that Language and Computation would be an obvious subject choice for linguistics majors, right? In fact I’m the only linguistics major in the entire class; 90% or so are science majors: engineering, computer science, that sort of thing. If I may make some sweeping generalisations, not edge cases but overwhelmingly true by mere observation:
1. Linguistics is female dominated. I’d say at least 70% of the students. Most of the staff.
2. The school of languages (European) and linguistics appears to have a very low competency with technology.
3. Computer science is male dominated. The L&C class has I think one girl in it. *
4. Computational Linguistics is dominated by computer science-type problems of a practical nature rather than technology applied to the study of language itself.
* My first-year comp-sci subjects appeared to perform better than this, with many more women. Curiously, Asian students made up at least 3/4 of the mix. Most of the students appeared to be economics and business majors though.
I’m not claiming that this is in any way empirical, but I think the basic trends I describe would be recognisable to people who work in those fields. My sense, and this is rampant speculation, is that the field of NLP sprang out of a necessity to come to grips with language, insofar as many tricky and highly practical problems in computer science are bound up with language comprehension: everything from search engines to voice mail systems, sentiment analysis, automated agents and AI.
Conversely at UniMelb there’s a strong theme of studying Aboriginal languages (which are linguistically fascinating) which really is the other end of the scale in not being very practical (a few thousand speakers in remote Australia) but rather seeking to grow the body of human knowledge around the fundamental forms that language may take, how it arises and how we teach and acquire it.
Computer scientists and practitioners of NLP must find formal and rigid ways to analyse language, devise ingenious mechanisms and machines to achieve better results, with those results gauged in hard percentage terms, then ultimately make them practical by building them into some system with a tangible benefit. Linguists live in a stunningly obscure and diverse world of trying to describe an ever-shifting, exception-laden, measurement-error-prone study of what is essentially an aspect of human nature.
The long and the short of it is that the means, goals and focus of Linguistics reside within the approach of the school of Arts, and those of Computer Science within the school of Science: female and male dominated areas respectively. I find that fascinating, but there are also some concerning imbalances which I think have given rise to vast black spots, where the collective natural inclinations of these different schools of thought, in a more literal sense, mean that many areas where linguists and computer scientists could really make a difference go untouched.
For a start, there should be a lot more linguists taking computational linguistics courses. They would find programming difficult to start with, but it is just another language, really, and in a very short period of time they could gain some extraordinarily powerful skills and tools which they can apply to virtually anything else they do in linguistics. I feel so strongly about this that I’m going to see what I can do to improve matters at UniMelb.
I’m not really sure what can be done about the low technical competence within the linguistics department, and I don’t really feel it’s my place to make waves. I also haven’t the faintest idea what can be done about the almost depressingly narrow focus of the NLP-related research I see coming out of the field, other than to just hope more linguists make an entrance.
I’m also pretty tired of the focus on European languages. Alright, obviously I’m biased but honestly you just don’t see anything other than bloody English while some cavernous problems lie totally untackled such as the utterly diabolical state of machine translation for Asian languages.
Again, this is because computer science people mostly don’t speak any other language, so they have little interest in tackling those problems – and even those who do speak one don’t seem to work on them. Is there an image problem?
Most of the NLP work on Asian languages appears to come out of a very narrow set of universities in Asia to the point that they have their own tools and approaches. (I say this on the basis of reading a few papers as I tried to solve word segmentation for Chinese, noting that the citations were almost invariably scholars from the same three or four universities in China and Singapore).
There’s a gulf between languages here that you don’t see in linguistics as a whole. I don’t really have sufficient insight into the field at this point to say more, but I already have the sense that a few more linguists, a few more ladies, would do wonders for driving the field forward.

You’re spot on – I recruit people from Psychology, Linguistics, Neuroscience and Computing to work with me in Computational Linguistics/Computational Language Learning. Most of my Postdocs have an interdisciplinary Cognitive Science background, but such a degree ends up being a political bunfight because there is really too little time to cover everything. So now I/Flinders have replaced a single CogSci undergraduate offering by a double degree covering all four disciplines, and am putting up scholarships (undergrad to start or swap into the double degree or postgrad to follow on with a PhD).
Basically, there’s too small a pool of people that I can really accept for a PhD because they don’t have both the linguistics/psychology/humanities background and the science/computing background.
My point is, it’s not the fault of computer scientists or linguists that they know only their own area, because the system is biased towards becoming superspecialized – knowing almost everything about next to nothing – and interdisciplinary projects are notoriously hard to get funded, and similarly hard to get people for.
So it’s great to see someone trying to connect the dots.
As far as Chinese goes, I’m an Australian learning Chinese (and writing this from Beijing where I spend several months a year on Chinese NLP projects) after many years of European languages, and needing to see/understand how our techniques work in Asian languages, and what new twists we get…
Anyway, if you’re ever in the neighbourhood (Adelaide or Beijing), look me up… Alternatively, drop me a line and I can put you in touch with a Computational Linguist in Melbourne who works on Asian languages…
David
Wow, thank you David. I’ll be in touch!