Tuesday, February 25, 2014

Embiggen and other go-figure puzzles in English

Click to embiggen
You see it all the time: a smallish image posted on a web page, and an instruction telling visitors how to view it at higher resolution. Maybe: "Click to enlarge." Or: "Click for a larger image."

Ho hum.

But Tom Tomorrow (a.k.a. Dan Perkins) doesn't leave memos-to-readers that are as pedestrian as those. Nope. Not only is Tom Tomorrow's This Modern World consistently in my top tier of Best Progressive Political Comic Strips, but when his material appears on Daily Kos (which is where I look for his work nowadays), a visitor is instructed that to see a larger image of the comic, s/he should:
Click to embiggen.
This warms my heart.

I've met plenty of neologisms I loathe: to Facebook or to friend, for example. Or to calendar, as in "Let's calendar a meeting with the marketing people. Dick, can you PowerPoint the product positives by next week?"

OTOH, there are as many others that I've adopted whole hog, like zillions of other English speakers: to Google, for one. Or internet, for that matter. Or grok, my personal favorite among neologisms of the '60s (though "Bogart" was pretty good too, as in Don't Bogart that joint, my friend).

But embiggen? There's something about embiggen that feels so right I want to grin every time I see the word in action.

You may already be familiar with the origin of "embiggen" ... but I wasn't until I decided recently to suss out where Tom Tomorrow found it. There's nothing secret about the word: it came from The Simpsons. Not originally, exactly, but epidemiologically speaking. Sort of.

Here's how the word's origin is described on Wikipedia, in an entry about the episode of The Simpsons in which "embiggen" occurs:
"Lisa the Iconoclast" is the sixteenth episode of The Simpsons' seventh season. [...]

The episode features two neologisms: embiggen and cromulent. [...] The Springfield town motto is "A noble spirit embiggens the smallest man." Schoolteacher Edna Krabappel comments that she never heard the word embiggens until she moved to Springfield. Miss Hoover, another teacher, replies, "I don't know why; it’s a perfectly cromulent word." [...]

Embiggen—in the context it is used in the episode—is a verb that was coined by Dan Greaney in 1996. The verb previously occurred in an 1884 edition of the British journal Notes and Queries: A Medium of Intercommunication for Literary Men, General Readers, Etc. by C. A. Ward, in the sentence "but the people magnified them, to make great or embiggen, if we may invent an English parallel as ugly. After all, use is nearly everything." The literal meaning of embiggen is to make something larger. The word has made its way to common use [...]
Here's the relevant excerpt from the show itself:

So I was thinking about how much I lurve the word "embiggen" on my way to work the other day, and when I got there I found the usual daily e-mail from the Chronicle of Higher Education (I work for a university). In that e-missive I found a link to an article by a linguistics professor at the University of Edinburgh, Geoffrey Pullum. The article is titled Coming and Going and it appeared in the CHE on 19 Feb 2014. It's about how English doesn't behave. And how there's not a ding-dang thing to be done about it.

The article started me considering the probability that, for people who speak English as a second or third or fourth language, words like "embiggen" must be crazymaking. Not even a teensy-weensy bit heartwarming.

Excerpting from Pullum's piece:
I heard a Brazilian iron-ore magnate speaking on a BBC news program about how he had become so rich, and he said that at one point "the price of iron ore came from $10 a ton to $180 a ton." I realized that there was a subtle mistake in English usage here: Even if the price is still $180 now, we do not say that the price came from $10 to $180; we say the price went from $10 to $180. But why?

Come is standardly used for motion (including metaphorical motion) toward the notional location providing the utterer’s reference point: We talk about going away but coming back. It would be quite reasonable to imagine talking about a price starting at some remote point in past time and climbing up the metaphorical price curve, while proceeding along the time axis, toward its present point on the graph. Visualizing ourselves as located at the current price point, we could see the price as climbing up toward where we are now.

But we don’t. In fact we never seem to do anything like that. It is the future that comes; the past goes away.
The future comes and the past goes away? That's not what Creedence Clearwater Revival sang.

But more to the question of price movements, does the matter of iron ore prices going from $10 to $180/ton make more sense to me than coming from $10 to $180/ton because, having had my consciousness shaped in the United States, I understand that the coming and going of prices has nothing to do with my own superfluous presence at the location of a price point, but with movement that occurs from the price's own point of view? Here in 'Merica, corporations are people. Why shouldn't prices themselves have consciousness, and even agency? Perhaps even souls, by gum!


The word "embiggen" seems so cozy to me, so on the mark, so that's not a word, but boy is it cute! because ...
  • Embiggen is a little bit "enlarge" and a little bit "enlighten."
  • It's a monosyllabic Anglo-Saxon word bracketed by a Latinate prefix and an Old English suffix; so it's kind of awkward, but in a funkalicious way.
  • It's a word that you can easily imagine being spoken by a wide-eyed, ebullient four year old who just watched a blimp inflate.

And so on.

In an early post to One Finger Typing, I paraphrased my ninth-grade English teacher, Miss Barbara Ballou, who scolded the whelps in her charge if we dared claim a stylistic right to break the rules of grammar in essays on Billy Shakespeare, say, or Nate Hawthorne: You have no right to break the rules until you know what they are and how to apply them, she informed us.

I admired Miss Ballou a great deal. She was one of the best teachers I ever had, and I've had some doozies. But here's what Geoffrey Pullum has to say about rules, logic, common sense, and speaking English:
The important lesson, to me, is that it isn't logic or common sense that prevents us from saying that [the iron-ore price came from $10 to $180]. It just isn't how we use the language, that’s all.

Don’t ask me why. I genuinely don't know. What I do know is that English lexical semantics (and, I assume, the lexical semantics of any other language) is extraordinarily complex. It continues to astonish me that I learned the meanings of the words I know. Even simple words like come and go. 

[T]here is no guarantee that English will or ever could be logical. English is the way it is: Its rules, some of them quite strict, evolved the way they did over the past millennium without being under any constraint of a directly logical nature.

The user of the language is constrained only by the hundreds of millions of their fellow speakers, who unwittingly negotiate every day about how to set the conventions of usage that define them too as English speakers. Railing against the decision of a few tens of millions of our fellow speakers who have adopted or abandoned some expression is, to put it in terms of the old joke, like trying to teach a pig to sing: It not only wastes your time, it also annoys the pig.
Professor Pullum has a cromulent point.

Related posts on One Finger Typing:
Google Translate, AI, and Searle's Chinese Room
Linguistics, semantics, pragmatics: words, meaning, and wacky translations
Are computer languages really languages?
Raising a glass to Miss Ballou

Thanks to wordle.net for the word cloud of Lewis Carroll's "Jabberwocky," from Through the Looking-Glass.

Monday, February 10, 2014

Is data security worth it? Depends who's counting.

An article published by Reuters a couple weeks back caught my eye: Davos executives see data theft as too costly, too hard to beat. From the article, dated 24 January 2014:
Fighting online data fraudsters is almost impossible as their ability to hack into new technology often outpaces companies' efforts to protect it, senior businessmen and bankers gathering for the World Economic Forum (WEF) said.

The mammoth data breach at U.S. No. 3 retailer Target (TGT.N) has made executives even more aware of the need to improve safety standards, but the cost is often prohibitive.


While losses on complex derivatives transactions could punch a big hole in a banks' balance sheet or even compromise its stability, the potential losses resulting from the theft of retail customers' data are often minimal.
Really? Minimal on whose balance sheet?

A study sponsored by security behemoth Symantec and conducted by the Ponemon Institute measured the costs of data breaches to business. From 2013 Cost of Data Breach Study: Global Analysis (PDF), published in May 2013 and reporting on cost per data breach victim in calendar year 2012:
As the findings reveal, the average per capita cost of data breach (compiled for nine countries and converted to US dollars) differs widely among the countries. Many of these cost differences can be attributed to the types of attacks and threats organizations face as well as the data protection regulations and laws in their respective countries. In this year’s global study, the average consolidated data breach increased from $130 to $136. However, German and US organizations on average experienced much higher costs at $199 and $188, respectively.
Contrast that with an NBC article (on Today.com) published the following month, Data breaches cost consumers billions of dollars:
A new report from Javelin Strategy and Research released on Wednesday concludes that a single massive data breach can result in “billions of dollars” in consumer fraud losses. [...]

Hackers were after Social Security numbers when they attacked the South Carolina Department of Revenue last year. They got 3.6 million of them. Javelin puts the total loss from this fraud at $5.2 billion dollars, making the breach one of the most costly ever.

The average fraud victim in this case will spend $776 out of pocket and take 20 hours to resolve their problems, the report estimated.
$188 is the cost to businesses per victim per data breach incident in the United States, according to the Symantec-sponsored study. NBC reports, from an incident in South Carolina, a cash cost to consumers of $776 plus whatever 20 of your hours are worth, hours spent following up some company's compromise of your data by contacting banks, writing to credit agencies, trying to get the attention of law enforcement, and other such entertainments.

What are those 20 hours worth? The Social Security Administration calculated average U.S. wage data at $42,498.21 for 2012, or a little over $20/hr for a full 52-week year of 40-hour work weeks.

So let's peg the worth of those 20 hours at $400, for a total cost to data breach victims of $1,176 per incident.

Admittedly, this is arithmetic, not methodologically sound statistics I'm batting around here. But by my rough and sketchy comparisons, a data breach costs U.S. individuals over six times what such an incident costs a U.S. business for each affected person.
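
For anyone who wants to check my back-of-the-envelope math, here is a minimal Python sketch of it. The dollar figures are the ones quoted above; the variable and function names are my own inventions for illustration, and the 2,080-hour work year is the same simplifying assumption I made in the text. It is a rough comparison, not a statistical model.

```python
# Rough per-victim cost comparison, using only figures quoted in this post.
# All names here are illustrative; nothing comes from the cited studies' code.

AVERAGE_ANNUAL_WAGE = 42_498.21    # SSA average U.S. wage for 2012
HOURS_PER_YEAR = 52 * 40           # assumption: full year of 40-hour weeks
OUT_OF_POCKET = 776                # Javelin estimate, dollars per fraud victim
HOURS_LOST = 20                    # Javelin estimate, hours to resolve problems
BUSINESS_COST_PER_VICTIM = 188     # Ponemon/Symantec U.S. per-capita figure


def victim_cost(hourly_rate):
    """Cash out of pocket plus the value of the victim's lost time."""
    return OUT_OF_POCKET + HOURS_LOST * hourly_rate


# The unrounded hourly wage implied by the SSA annual figure.
hourly_wage = AVERAGE_ANNUAL_WAGE / HOURS_PER_YEAR

# The post rounds the wage down to $20/hr before multiplying.
rounded_total = victim_cost(20)
ratio = rounded_total / BUSINESS_COST_PER_VICTIM

print(f"Implied hourly wage: ${hourly_wage:.2f}")
print(f"Cost to individual:  ${rounded_total:,}")
print(f"Individual cost / business cost: {ratio:.2f}")
```

Rounding the wage down to $20/hr gives the $1,176 total used above; plugging the unrounded wage into victim_cost nudges the total, and the ratio, slightly higher.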

And yet: Davos executives see data theft as too costly, too hard to beat.

Uh huh.

In case it's not obvious yet that what "Davos executives see" is different from what you, an individual, are at risk of experiencing, let's go back to that Symantec-sponsored study for a moment.

From the study's Executive Summary, bold emphasis added:
Factors that increase the cost. US companies realized the greatest increase in data breach costs if caused by a third party error or quick notification of data breach victims, regulators and other stakeholders. [...]
And from the Key Findings section of the report, bold emphasis added again:
In many countries, regulations dictate the notification of data breach victims. However, if organizations are too fast in contacting individuals it can actually result in higher costs. In this year’s study, in the US quick notification added as much as $37 per record, as shown in Figure 11c. It is understandable that this factor would have little impact on Brazil and India, because data breach notification regulations are non-existent.
No regulations, no need to notify data breach victims. No need to notify, lower cost to business. Hmmmm.... I believe what we're seeing here is what a certain category of spin-doctor might call, with respect to the United States, unfriendly business environments resulting from over-regulation, no?

The World Economic Forum's February 2013 report, Unlocking the Value of Personal Data: From Collection to Usage (PDF), contains an airbrushed sound bite framing the old and insidious concept that what's good for the CEOs attending WEF meetings is good for the countries from which they extract wealth. From a chapter cozily titled "The World is Changing," here's the last point in a figure summarizing "New perspectives on the use of data":
Traditional approach: Policy framework focuses on minimizing risks to the individual

New perspective: Policy focuses on balancing protection with innovation and economic growth
Balance. We like balance, right?

Full disclosure: I am over-simplifying some long and complex analyses.

For example, just a couple of pages past the bit quoted above from the WEF report, a series of figures asserts that health care outcomes for individuals are significantly improved by "personalised individual interventions based on health data" and "public disclosure of aggregated, anonymized patient outcome data."

Yes, there are not only costs, but benefits as well that accrue to individuals when vast data stores are aggregated and mined. It's complicated, and I acknowledge that.

The WEF report contains, for example, this reasonable and nuanced passage in Chapter 2:
This new approach also needs to carefully distinguish between using data for discovery to generate insight and the subsequent application of those insights to impact an individual. Often in the process of discovery, when combining data and looking for patterns and insights, possible applications are not always clear. Allowing data to be used for discovery more freely, but ensuring appropriate controls over the applications of that discovery to protect the individual, is one way of striking the balance between social and economic value creation and protection.

However, just as the discovery of new opportunities for growth is unknown, so are the possibilities for unleashing unintended consequences. Principled and flexible governance is required to assess the risk profile of actions taken in the use of data analytics.
But I would argue that this nuance is used as a self-interested prop to justify current and contemplated data collection and retention practices, on the grounds that, paraphrasing, we'll figure out how to protect people eventually.

I'm skeptical, okay? YMMV.

But here, setting aside reasonable nuance, figures and appendices, footnotes, and kumbaya use cases, let's consider this unsettling video, circa 2009, courtesy of the ACLU. What happens when you, an individual, call up a retailer to place the simplest order -- for takeout pizza -- and they know pretty much everything about your home, habits, relationships, work, and health? To wit:

It's a perspective worth balancing against the carefully groomed reports coming out of Davos.

I'll close with a report from just yesterday, 9 Feb 2014, Reuters again, titled Barclays launches investigation after customer data leak:
Barclays said it had launched an investigation after a newspaper reported that the personal details of 27,000 customers had been stolen and sold, raising the prospect of new fines for the bank. [...]

Barclays thanked the Mail on Sunday for bringing the data leak to its attention.

"Protecting our customers' data is a top priority and we take this issue extremely seriously," Barclays said in its statement.

"We would like to reassure all of our customers that we have taken every practical measure to ensure that personal and financial details remain as safe and secure as possible."

Yessiree, Bob. Every practical measure.

Related posts on One Finger Typing:
Six ways your electronica owns you
Pimped by our own devices: electronica, the cloud, and privacy piracy
Monoculture v complexity; agribusiness and deceit

Thanks to Wikimedia Commons for the image of the Davos Congress Centre, site of the World Economic Forum meetings since 1971; and also for the pile of cash image, contributed to WC by Moritz Wickendorf. And thanks to the ACLU, for all that organization's fine work and principled tenacity.