All we’ve got to do is develop indicators of how people use words when they’re trying to be deceitful. Then we apply statistical measures similar to those I described in my column on computer translation to see how these measures play out in the real world of daily communication and disinformation.
One fascinating attempt at doing this is a recent analysis of 289,695 publicly available Enron e-mails by Queen’s University computer science professor David Skillicorn.
“We were expecting to find small groups in the company who were specifically colluding on things,” he said in explaining the hypothesis underlying his work.
His method for documenting collusion was to measure the use of first-person versus third-person pronouns – “I” versus “one,” for example – as well as exclusive words such as “but,” “except” and “without,” words with negative emotion – think “hate,” “anger,” “greed” – and action verbs such as “go,” “carry” and “run.”
Laboratory experiments run by University of Texas psychologist James Pennebaker showed that when people were told to lie, their writing contained fewer first-person pronouns and exclusive words and more negative emotion words and action verbs. Why this pattern occurs is not precisely clear, though one suggestion is that saying “It is thought” rather than “I think” lets you dissociate yourself from the clear meaning of any expressed intention.
Furthermore, the use of the negative emotion words may indicate in a très Freudian way some unconscious discomfort with the intrinsic veracity of what you are saying.
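For the curious, the counting itself is simple enough to sketch in a few lines of Python. What follows is only a toy illustration of the four word categories described above, not Skillicorn’s or Pennebaker’s actual model: the word lists are limited to the examples quoted in this column (plus a few obvious first-person forms, which are an assumption), and the unweighted scoring rule in `deception_score` is a made-up stand-in for whatever weighting the researchers really use.

```python
import re
from collections import Counter

# Illustrative word lists only. The real research relies on much larger
# dictionaries; these are the examples quoted in the column.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}  # assumption: forms beyond "I"
EXCLUSIVE = {"but", "except", "without"}
NEGATIVE_EMOTION = {"hate", "anger", "greed"}
ACTION_VERBS = {"go", "carry", "run"}


def category_rates(text):
    """Return each category's share of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty text
    counts = Counter(words)

    def rate(vocabulary):
        return sum(counts[w] for w in vocabulary) / total

    return {
        "first_person": rate(FIRST_PERSON),
        "exclusive": rate(EXCLUSIVE),
        "negative_emotion": rate(NEGATIVE_EMOTION),
        "action_verbs": rate(ACTION_VERBS),
    }


def deception_score(text):
    """Hypothetical unweighted score: higher means more 'deceptive-looking'.

    Rewards negative emotion words and action verbs, penalizes
    first-person pronouns and exclusive words, per the pattern above.
    """
    r = category_rates(text)
    return (r["negative_emotion"] + r["action_verbs"]) - (
        r["first_person"] + r["exclusive"]
    )
```

Under this toy rule, a sentence such as “I think I made a mistake, but I will fix it” scores lower than “It is thought that they hate the plan and will run,” which is the direction the laboratory findings would predict. A score of this kind says nothing on its own about whether any particular writer is lying.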
What was fascinating about the Queen’s study is that it presented evidence of what Sir Arthur Conan Doyle described as the dog that didn’t bark. That is, there was no evidence that any of the 176 former Enron executives and workers whose e-mails were looked at operated with a guilty mind.
“Everybody knew about them [the wrongdoings] and everybody was talking about them completely freely within the corporation,” Skillicorn said.
Thus, from their word use alone it appears that for some people in the Enron corporation a criminal culture had developed in which there was no perceived need to lie to one another about misdeeds.
Wow, you might respond, this word analysis thing sounds as though it might also be of interest to the Western intelligence community, which intercepts an estimated three billion messages of one sort or another a day and has to decide which ones pose a real threat.
“There is an awful lot of stuff that sounds the same, but suppose there is a difference between the word use of those people who are 14 and have an excess of testosterone and other people who are genuinely thinking about a suicide bombing or an attack?” Skillicorn said.
“The rational content of those messages might look a lot the same: ‘It’s all America’s fault, and we are going to attack them.’ But what we are hoping to pick up are the tonality differences between, ‘I am frustrated and this is how I vent,’ with, ‘I am actually planning something and this is me talking to my fellow conspirators.’ ”
How good the language-use tool will be at picking out the real thing from teenage bravado is not entirely clear, but people are working – and working hard – on the problem.
“We can’t hope to get the boundary exact enough to pick out just the anti-terrorist messages,” Skillicorn said. “But we can hope to reduce a billion potential troubling communications down to a few thousand and then pass them on to a human or a clever system for further analysis.”
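The winnowing step Skillicorn describes, boiling a huge pile of messages down to a short list for closer inspection, can be sketched as a simple “keep the top scorers” filter. This is a minimal illustration only: the function name `triage` and the (message ID, score) pairs are hypothetical, and the sketch assumes some scoring tool has already assigned each message a number. How real intelligence systems do this is not described in the source.

```python
import heapq


def triage(scored_messages, keep):
    """Return the `keep` highest-scoring (message_id, score) pairs.

    `scored_messages` may be any iterable, including a generator over a
    very large stream; heapq.nlargest holds only `keep` items at a time.
    """
    return heapq.nlargest(keep, scored_messages, key=lambda pair: pair[1])


# Hypothetical scores, invented for illustration.
sample = [("msg-1", 0.02), ("msg-2", 0.91), ("msg-3", 0.10)]
shortlist = triage(sample, keep=2)
```

The shortlist, not the full stream, is what would then go to “a human or a clever system for further analysis.”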
Finally, there is political doublespeak.
Skillicorn made news during the past federal election when he analyzed candidates’ speeches against 88 words that had been linked by Pennebaker to deception and found that Paul Martin had a spin ranking of 124, Jack Layton 88 and Stephen Harper 73. Not terribly surprising.
“I think it’s expected that any party in power is going to use spin more than the challengers: They have a track record to defend,” he said at the time.
But his analysis was likely just the beginning of “spinometrics.”
Be prepared for word analysis counters to produce a spin reading for each candidate immediately after a political debate. “It is within the bounds of possibility that we will be able to do that by the next U.S. presidential campaign,” Skillicorn said.
OK, so what does all this massive word analysis mean?
The computer as we know it today is never going to be as intuitive as a trained human psychoanalyst.
But it may well turn out that computers, simply by counting the words we use and the context in which we use them, turn up deep truths about human intentions that our less precise enumerating minds miss.