Write Fielder

Clicks Dig the Long Blog.


The Computer that wants to be a Sportswriter

ST. LOUIS - There was a legend when I was a young baseball writer that eventually every reporter covers enough games, sees enough results, and writes enough copy that he could keep a Rolodex of gamers. They could be indexed by outcome (Rout, seven runs or greater; Walk-off), theme (Injury, back from; Redemption, veteran) or feat (Home Runs, three hit; Shutout, one hit allowed). Simply spin the directory, thumb the appropriate gamer and fill in the new names, appropriate score, location and perhaps spruce it up with some hip new verbs. And, voila! -30-.

Far from being something to achieve, the Rolodex of gamers was a cautionary tale, something to work diligently to avoid before you became repetitive and obsolete.

Apparently the Rolodex is real. It’s coming for our jobs.

In Sunday’s New York Times, a business column by Steve Lohr explored the growing efficiency and effectiveness of “robot journalists,” or artificial intelligence programs that are writing – ahem, producing – articles. Lohr’s story focuses on Narrative Science, a tech company in Evanston, Ill., home of that other top journalism school in America, Northwestern’s Medill School of Journalism. Narrative Science specializes in computer-generated content, and that content is getting less computer-like all the time. For his lede, Lohr quotes from an article written by the Narrative Science program. Of course, it’s a sports story.

“WISCONSIN appears to be in the driver’s seat en route to a win, as it leads 51-10 after the third quarter. Wisconsin added to its lead when Russell Wilson found Jacob Pedersen for an eight-yard touchdown to make the score 44-3 … .”

Sportswriting is the natural place for the roboreporters to start their revolution. Games are easy to distill into numbers, right down to the integers that are the very definition of sport – the final score. What happened in a baseball game can be conveyed in a box score or strings of code that detail play by play. We are able to quantify everything these days – right down to the millimeter of break on Mariano Rivera’s cut fastball – and all that info plays right into the roboreporter’s wheelhouse. The computer doesn’t have any problem taking this data and transforming it into a paint-by-numbers game story that tells what happened.
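
For the curious, here is roughly what a paint-by-numbers gamer looks like in code. This is only a minimal sketch of the Rolodex idea, not Narrative Science's method, which the Times column does not spell out. Every name in it (`Result`, `TEMPLATES`, `classify`, `write_gamer`) is hypothetical, and the seven-run threshold for a rout comes from the Rolodex legend, not from any real system.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """The bare numbers of a game (hypothetical structure for illustration)."""
    winner: str
    loser: str
    winner_runs: int
    loser_runs: int
    walk_off: bool = False


# The "Rolodex": one canned sentence per outcome card (all wording invented here).
TEMPLATES = {
    "walk_off": "{winner} walked off with a {winner_runs}-{loser_runs} win over {loser}.",
    "rout": "{winner} routed {loser}, {winner_runs}-{loser_runs}.",
    "shutout": "{winner} shut out {loser}, {winner_runs}-{loser_runs}.",
    "default": "{winner} beat {loser}, {winner_runs}-{loser_runs}.",
}


def classify(result: Result) -> str:
    """Spin the directory: pick a card from the numbers alone."""
    if result.walk_off:
        return "walk_off"
    if result.winner_runs - result.loser_runs >= 7:  # "Rout, seven runs or greater"
        return "rout"
    if result.loser_runs == 0:
        return "shutout"
    return "default"


def write_gamer(result: Result) -> str:
    """Thumb the appropriate gamer and fill in the new names and score."""
    return TEMPLATES[classify(result)].format(
        winner=result.winner,
        loser=result.loser,
        winner_runs=result.winner_runs,
        loser_runs=result.loser_runs,
    )


if __name__ == "__main__":
    print(write_gamer(Result("Team A", "Team B", 9, 1)))
```

Note what the sketch can and cannot do: it reports What happened from a score and a flag, and nothing in it knows Why or How.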

The improving quality of articles from these AI programs prompted Businessweek.com to ask in August 2010, “Are Sportswriters Really Necessary?” The article, by Justin Bachman, used press releases from college sports information departments – again press releases, not game stories from beat writers; press releases! – to compare the flesh-and-blood copy against the J-bot’s. In the computer-generated story, the program writes (relatively speaking) about a college baseball game: “The Hawkeyes (16-21) were unable to overcome a four-run sixth inning deficit. The Hawkeyes clawed back in the eighth inning, putting up one run.”

“There’s no human author and no human editing,” Narrative Science’s CEO Stuart Frankel told Bachman more than a year ago. “But the stories sound really good.”

No, news flash, they don’t.

They sound formulaic. They sound stilted. They are, by the nature of their learning database, going to rely on cliché instead of toying with cliché. They are dull.

But, here’s the problem: It may not matter.

Narrative Science has 20 customers, according to The New York Times article. One of them is the Big Ten Network, which used computer-generated coverage from football and basketball games to update its Web site’s content. About halfway through Lohr’s chilling article is this heart-stopper:

Those reports helped drive a surge in referrals to the Web site from Google’s search algorithm, which highly ranks new content on popular subjects, (Big Ten Network official Michael) Calderon says. The network’s Web traffic for football games last season was 40 percent higher than in 2009.

Traffic rules. Clicks matter.

We have all been schooled in the importance of SEO, Search Engine Optimization. It is why on first reference the Cardinals are always the St. Louis Cardinals and never the Cards. It’s why online headlines seem so cumbersome, less conversational. Tags are way more important than the byline, but the byline can be a tag to make it easier for Google to find a specific writer. Now, here is a program that specializes in producing articles tailor-made for SEO success. Writers with hearts have to force themselves to make headlines and sentences more SEO-friendly and, thus, draw traffic through Google and Yahoo! and other sites. J-bots do it innately. It is literally what they were created to do – cater to the clicks.

“The leaders of Narrative Science emphasized that their technology would be primarily a low-cost tool for publications to expand and enrich coverage when editorial budgets are under pressure,” Lohr wrote in Sunday’s Times.

Well, that’s comforting. So these automated writing programs are designed to help newspapers or news organizations that have dwindling travel budgets, limited ability to pay reporters overtime, shrinking staffs, reduced manpower on the copy desk and a readership that is ever-hungrier for free content.

That narrows it down to only every newspaper that, um, still exists.

As the news business changes and rolls with punch after punch, the hope is that not just any content wins, but quality content does. Several long-form sportswriting sites (Grantland, The Classical) are out to prove this (again). It’s not enough for a consumer to know the difference between a computer-generated game story and one written by a beat writer. That consumer has to value the game story written by the beat writer. The legend of the Rolodex wasn’t a goal; it was a reminder of something to avoid – no matter how many thousands of games you cover, don’t fall into the trap of repetition. As the Narrative Science advances illustrate, any old program can retell What happened. It’s the other tenets of journalism that are missing. A deft beat writer can also explain Why it happened and How it happened. Context isn’t the computer’s strength. The two S’s are – speed and salary.

As journalism students at Mizzou, a few of us on the sports desk had a derisive name for stories that focused only on What happened and didn’t offer anything stylish, substantive or, you know, human. The name came from the small type that makes up box scores, and it implied that the writer just dropped adjectives, active words, names and transitions into the raw stats. We called these stories “Agate with Verbs.” It was a joke.

It’s not so funny anymore.

-30-

Filed under journalism mizzou newspapers sportswriting

  1. derrickgoold posted this