Copyediting and Computer Code

Every now and again I’ll find myself in a conversation with copyeditors about the future of their craft. One point I often bring up is that a big part of the job in online newsrooms needs to be overall QA of the site. And one of the most challenging workflows to support that is the copyediting of computer code. The example I always use to illustrate the point is the AP style on state abbreviations. If the Web developers define the abbreviation for California as “CA” instead of “Calif.” … well that’s something that should stick in the craw of every copyeditor until the code gets changed.

And now I have an actual piece of code to illustrate the example. (This comes from the code that runs OpenBlock, the much-awaited open-source version of Adrian Holovaty’s EveryBlock. This isn’t meant to pick on that community. They’re doing difficult and needed work. And this could happen anywhere… which makes it a good anecdote.)

What’s the workflow in your newsroom for making sure that this gets changed to “Reporting Officers’ Names” before launch? Should the designers give editors a mock-up of all the static text elements (including words-as-graphics) on the page? Should the developers give editors printouts of all the tables that contain datafields that might get on the live site? Or do you just publish and come up with some sort of sampling scenario?

How does it work in your newsroom? How should it?
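
One possible answer, offered only as a sketch: let a script do the first pass before an editor ever sees the page. The Python below scans a dictionary of static site strings for postal-style state abbreviations and suggests the AP-style form. The mapping is a small illustrative subset rather than the full stylebook list, and `site_strings`, its keys and the function name are hypothetical stand-ins for whatever labels your developers can export.

```python
import re

# Partial, illustrative mapping of postal codes to AP-style abbreviations.
# A real check would need the complete stylebook list.
AP_STYLE = {"CA": "Calif.", "NY": "N.Y.", "FL": "Fla.", "NC": "N.C."}

# Match a comma, one whitespace character, then a postal code as a whole word,
# as in "Los Angeles, CA".
POSTAL_PATTERN = re.compile(r",\s(" + "|".join(AP_STYLE) + r")\b")


def find_style_problems(strings):
    """Return (key, postal_code, suggestion) for each postal abbreviation found."""
    problems = []
    for key, text in strings.items():
        for match in POSTAL_PATTERN.finditer(text):
            code = match.group(1)
            problems.append((key, code, AP_STYLE[code]))
    return problems


# Hypothetical static strings pulled from templates or database tables.
site_strings = {
    "dateline_label": "Los Angeles, CA",
    "footer": "Published in Chapel Hill, N.C.",
}

problems = find_style_problems(site_strings)
for key, code, suggestion in problems:
    print(f"{key}: found '{code}', AP style suggests '{suggestion}'")
```

A check like this only catches what it has been told to look for, so it supplements an editor's read of the mock-ups rather than replacing it.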


Ryan Thornburg is the author of the new online journalism textbook and newsroom manual, Producing Online News, available from CQPress.com

Article Comments Are Alienated Experience

Jaron Lanier, one of the pioneers of virtual reality, once said something that I often use when thinking or speaking about online journalism: “Information is alienated experience.” A blog post from one of my students at UNC has done a nice job recording an anecdote from the 2010 Online News Association conference that I think brings into focus the role of comments as a form of alienated shared experience.

Michelle Cerulli, a second-year MA student, told me this story and I encouraged her to blog about it. The short version is this: While attending a session about article comments, she watched a mild-mannered man use Twitter to quietly excoriate one of the speakers. This man didn’t stand up and confront or question the speaker in person. Instead he used this virtual soapbox to disagree with her — in what Michelle described to me as incredibly rude terms — about the role of comments on online news articles.

What was his beef with NPR ombudsman Alicia Shepard? She was saying that online comments tended to be more vitriolic than you hear in “the real world.” His words on Twitter said that Shepard was wrong. But his behavior said that she was dead on. And, according to Michelle, he appeared to be oblivious to the irony.

And while this story so far might seem to some a perfect set-up for a conclusion in which I rail against online comments, that’s not where I’m heading. Online comments are important because it is there that our collective id gets revealed. Many of us reveal in anonymous or pseudonymous comments our fears and hopes in ways that most of us would deny if we were ever confronted with them. Online comments show how we (or at least some non-representative sample of us) experience the world in a way that we have alienated from ourselves and the polite company around us.

And that unfiltered id — that alienated experience — is a happy hunting ground for a reporter who hopes to more clearly explain to his readers our increasingly complicated and interconnected world. The problem with comments is not that they are mean. The problem is that there are too few people mining them for hidden hopes and fears and too few people willing to patiently ask probing questions of the crowd.

More and more news organizations are hiring “social media producers.” I hope they’re given the challenge of not just distributing the news to the crowd, but also diving into it and finding individuals who are able to articulate why they’re much more scared, angry or jealous than they are willing to admit in a room full of their peers.

Audience Engagement Starts With Audience Tracking

In the match-making game that is the summer internship and job hunt now getting underway at J-schools across America, I always warn students to never take a job working for an editor who talks about how many “hits” her site gets. And I train my students so that they’ll never be the person whose resume gets tossed for doing the same.

Chapter 3 of Producing Online News and its related tipsheet provide a good overview of the who, what, when, where and why of online news audiences. (And that’s something that’s always changing: Pew reported yesterday that 4 percent of online adults, and 10 percent of Hispanic online adults, use geosocial tools such as Gowalla or Foursquare.)

But students can begin to learn both mass communication research concepts and practical skills if they have the chance to use Google Analytics (or the high-priced and industry-dominating Omniture service) on a real, live news site. That will prove to be one of the strengths of UNC’s new Reese Felts Digital Newsroom, an opportunity that is still pretty rare for journalism students.

The good news is that the Web is full of good free guides to using Google Analytics on a news site. Here are four good guides to get you started:

  • Tracking Your Users (j-learning.org)
  • The Journalists’ Guide to Analytics (Mark S. Luckie)
  • Google Analytics – Adding Tracking Code (Brett Atwood)
  • Installing Google Analytics

    Lessons From ONA ’10: What It Takes, Part 2

    Aggregation continued to be one of the online news community’s big buzzwords at the 2010 Online News Association conference last week. The idea behind aggregation is that individual news organizations can achieve comparative advantages and that the entire information economy can function more efficiently if the news organization links to reliable information from bloggers, sources and other news organizations rather than replicating the information with its own take.

    But aggregation isn’t free. You can either automate it, which might cost a newsroom $25,000 to $100,000 in up-front costs, plus constant tweaking of the algorithms and processes that gather, organize and automatically publish news stories from external sources. Or, you can put humans and their infinitely superior cognitive flexibility on the task.

    But what does that cost? Here are some estimates I’ve put together based on conversations at ONA:

    * It takes an average of 8 minutes for a news producer to read a blog post or news story, write a summary and categorize it by location and subject.
    * Based on a VERY limited sample that desperately needs further research, you can estimate pulling in one blog post per week for every 4,500 people in your market. (Please send me any data you have that would help me solidify this number.)

    In my home market of Raleigh-Durham, which has about 1.5 million people, aggregating local content might take about one full-time position and cost a news organization maybe $35,000 a year plus benefits.
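
For anyone who wants to plug in their own market, here is a rough back-of-the-envelope sketch of that arithmetic in Python. The 8 minutes per post and the one post per week for every 4,500 people are the estimates above; the function and parameter names and the 40-hour work week are my own assumptions.

```python
def weekly_aggregation_hours(market_population,
                             people_per_weekly_post=4500,
                             minutes_per_post=8):
    """Estimate the producer hours per week needed to aggregate local posts."""
    posts_per_week = market_population / people_per_weekly_post
    return posts_per_week * minutes_per_post / 60


# Raleigh-Durham, with about 1.5 million people.
hours = weekly_aggregation_hours(1_500_000)

# Assumed 40-hour week for one full-time position.
full_time_positions = hours / 40

print(f"{hours:.1f} hours per week, or {full_time_positions:.2f} full-time positions")
```

With the Raleigh-Durham figure this works out to roughly 44 hours a week, which is where the "about one full-time position" comes from. Change the population or either estimate and the staffing number moves with it.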

    How does that match your experience with aggregation? What am I missing?

    Lessons From ONA ’10: What It Takes, Part 1

    At least three national news organizations approached me at last weekend’s Online News Association conference to see whether I could recommend any students with great news judgment and programming skills. That’s what news organizations are desperate to hire today. Why? Well, as former president George W. Bush will tell you, some things (like learning how to program) are just hard work.

    Lunch with a friend last week helped me put some numbers on just how hard it is. I was meeting with him so that he could show me the server he set up and the computational journalism he had been doing since we last had a chance to catch up. At heart, he is a writer and a reporter, yearning during our conversation for the chance to do more long-form narrative text stories. But in his newsroom, he is the resident programmer/journalist and has been asked by his editors to hire more people like him.

    Here’s what it took for him to become “tech savvy.”
    * In high school, he took one computer programming class. He didn’t study or use computer programming at all in college. He wrote and edited stories at the campus paper. After graduation, he was hired in jobs as a researcher or blogger.
    * During the last two years, he taught himself how to code. He set up his own Ubuntu server, with PHP and MySQL. He learned some ActionScript, JavaScript and XML. He uses Excel, Visual Basic and screen-scraper.com to report stories and build interactive editorial Web applications.
    * He works 60 to 75 hours per week.
    * He spends 90 percent of his time working with and learning about computer coding.
    * It took him two years to get to this point of technical proficiency.
    * That is a total of 5,500 hours.

    He was not born with the IT chromosome. He did not wish himself to a state of savvy. He has clearly been blessed with an incredible brain that was nurtured in an environment that valued education and intellectual curiosity. But that didn’t get him his job. He got his job because. He. Worked. Hard.

    Let’s point out how difficult it is to get 5,500 hours of computer time under your belt.
    * College students spend about 15 hours a week in class. Good ones will spend another 25 hours reading and working outside of class. That’s 480 hours a semester, 560 hours a year. At that rate, taking ONLY coding classes, you’ll get to 5,500 hours in just under 10 years. Which makes you this guy. Nobody wants to be that guy, so it’s time to accept that editorial programmers are committed to life-long learning.

    * Let’s say you knock out a few coding classes in school — 500 hours worth — enough to get hired by a big news organization as a developer. That leaves you with just 5,000 hours to go. Working a standard 40-hour week, you’ll burn through those in 125 weeks. That’s about 2.5 years, after various and sundry holidays, illnesses and vacations.

    * Or, maybe you were a good liberal arts student and didn’t blow any of your tuition on coding classes. But your smarts and broad-based knowledge land you a job at one of a very few news organizations that commit seriously to career development. Google spurs innovation with its famous “20 percent time,” which allows its developers to spend a day a week working on projects that are not part of their job descriptions. So, your boss lets you play with computers for one day a week. You’ve got 5,500 hours to make up. And by the time you’re celebrating your 35th birthday you’ll probably be at the point where you can start developing your own editorial applications.
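
If you want to run the scenarios above against your own schedule, here is a minimal sketch in Python. The 5,500-hour target, the 560 hours a year of coursework, the 500 banked hours and the 40-hour week come from the bullets above; the 8-hour day for "20 percent time," the 52-week year and all the names are my assumptions.

```python
TARGET_HOURS = 5500


def years_to_reach(hours_needed, hours_per_week, weeks_per_year=52):
    """Years of steady practice needed to log the given number of hours."""
    return hours_needed / (hours_per_week * weeks_per_year)


# Student taking only coding classes, at the 560 hours a year cited above.
student_years = TARGET_HOURS / 560

# Developer hired with 500 hours of coursework, coding 40 hours a week.
developer_weeks = (TARGET_HOURS - 500) / 40
developer_years = years_to_reach(TARGET_HOURS - 500, 40)

# Liberal arts grad with one 8-hour day a week to play with computers.
one_day_a_week_years = years_to_reach(TARGET_HOURS, 8)

print(f"Student: {student_years:.1f} years")
print(f"Developer: {developer_weeks:.0f} weeks, about {developer_years:.1f} years before time off")
print(f"One day a week: {one_day_a_week_years:.1f} years")
```

The one-day-a-week case comes out to a little over 13 years, which is how a new graduate ends up near that 35th birthday.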

    What the conversation with my friend made me realize is why it irks me so much when people come to me saying that they can’t perform some computing task because they are “technically illiterate” or “not a computer person.” My friend isn’t a computer person. I’m not a computer person either. But we try. We hack our ways through incredibly frustrating failures by simply doing this. And so can you. If you want.

    The List: Quotes and Notes From TBD.com at ONA ’10

    Having a pithy quote or an insightful stat is key to being a good panelist at a conference. TBD.com’s Jim Brady and Erik Wemple have both in spades, so it was good to open the Online News Association conference with a session on their new Web site. Here, in list form, are the best insights from the panel.

    1. Jim Brady, on the need for diverse revenue streams for digital news media: “There’s no silver bullet, there’s shrapnel.”

    2. Erik Wemple, on the importance of linking to other news sources: “With 12 reporters and 5.3 million people in our market, our editorial vision is smoke and mirrors.” (Note to self: I haven’t seen anyone ask TBDers about how much risk they see in incumbent media orgs putting their news behind pay walls. They must have calculated that either the odds of that happening are pretty low or that the impact on revenue wouldn’t be a company killer.)

    3. Wemple, expanding on the editorial vision of the site: “We want to be a place where, if you hear a siren, you go to #tbd and you find out what’s wrong.” (That news sensibility speaks to TBD’s legacy partner, the local news channel formerly known as Newschannel 8. It does, at least at first blush, seem to ignore anything that’s not event-based news.)

    3. Brady, on selling local ads: The biggest challenge in local advertising is getting business online. TBD is developing service models, network models and Paper G to help bridge that gap. A quarter to a third of the blogs that TBD aggregates participate in its ad network.

    4. TBD aggregates 196 blogs from the Washington area. This includes professional media, amateur media, and corporate, government and non-profit organizations. That’s one blog for every 27,000 people in TBD’s market. I wonder what is the smallest market size that could sustain an operation like TBD? It seems to me that this indicates a role for news organizations in medium-sized markets to cultivate local bloggers to reach some minimal threshold that would make aggregation useful. I also wonder what the typical blog-to-person ratio is in a typical media market? (Calling all grad students. There’s a good research question for you.)

    5. The panel of TBDers recounted the news organization’s coverage of its first big breaking news story. Bloggers in TBD’s network as well as the site’s general audience provided important eyewitness accounts of the scene. This anecdote illustrated two important elements of crowd sourcing: First, that crowds are best at stenographic journalism: they are good at supplying answers to the who, what, when and where questions when those answers come from immediate observation of events or documents. That means that crowds are relatively efficient at feeding editors information during breaking news, especially stories that develop across a broad geographic area all at once. Second, that if you want to use crowd sourcing when news breaks you have to develop relationships with your audience BEFORE news breaks. Anyone who’s ever cultivated a source knows that means a lot of chit-chat that appears to have nothing to do with the news value of the source. Same on social media.

    6. Wemple, on the power of fertile failure: “If you have a Web site that doesn’t have something terrible on it, you’re not trying hard enough.”

    7. TBD social media producer Mandy Jenkins, on ignoring your critics: “In the age of social media, that’s something we can’t do anymore.” (More of her thoughts here.)

    8. Steve Buttry recounted his tale of interviewing Jenkins for the job. It was a good reminder that journalism job candidates who display curiosity always move their resumes to the top of the pile. In online media, this means your ability to show that you try to hack devices and services just to see how they might be able to solve a problem other than the one their developers intended them to solve.

    9. Jenkins, raising suspicion that she may be the Mike Allen of TBD, said she has 22 Tweetdeck columns, follows about 200 feeds and that “We follow anyone who’s ever given us a tip.” (Note to journalism grad students looking for an academic research question: It would be very interesting to see whether news Twitter accounts with high follow/follower ratios yield significantly different levels of trust or relevance among the audience.)

    10. TBD.com’s corporate parent expects the site to take about as long as its sibling, Politico, to become profitable. That took about three years. So, a note to the laid-off reporters and editors who’ve called me with dreams of starting your own news site to compete with your former employer. Step 1: Gather up three years of operating costs…

    11. Writing brief, smart, newsy lists without an editor… is hard. Perhaps something we could be teaching our students.

    Job Post: OpenBlock, Django and Community Newspapers

    Request for Proposals:
    Specifications for Community News Tool Using Python and Django.

    The School of Journalism & Mass Communication at the University of North Carolina at Chapel Hill, with funding from the McCormick Foundation, is developing business models and editorial products to help community newspapers transition to the digital age.

    We are seeking someone who has experience with Python and the Django Web development framework to install a Django application called OpenBlock on a Web server and write a report that details the technical challenges, specifications and scope required for integrating OpenBlock into newspaper websites hosted by TownNews.com. The report would also propose potential alternatives that would be more efficient than using OpenBlock.

    In order to write the report, the person we hire will need to perform these tasks:
    1. Install the OpenBlock application on a server, and become familiar with its codebase.

    2. Identify technical specifications for transforming data formats given to students by city and county government into geo-coded data formats optimized for use in OpenBlock. (See https://developer.openblockproject.org/wiki/Ideal%20Feed%20Formats) These technical specs might include the technical specs for building a site scraper (See https://developer.openblockproject.org/wiki/ScraperScripts) to retrieve the data, a feed parser or a program to impute the latitude and longitude of data types that are vaguely described in their original format from the government.

    3. Identify high-level technical specifications for integrating an OpenBlock installation with the CSS styles, site navigation and URL structure of the news organizations so that users and search engines perceive the TownNews.com content and the OpenBlock content as a single site.

    4. Contribute findings back to the OpenBlock project developers wiki at https://developer.openblockproject.org/wiki

    We intend to select a candidate by December 1. The project would start immediately upon selection.

    Please e-mail your proposals – including a proposed timeline, cost bid, resume, cover-letter and three references — to Christine Shia at shia AT email DOT unc DOT edu. Please include “Proposal – OpenBlock RFP” in your subject line.

    Questions about this RFP can be addressed to Assistant Professor Ryan Thornburg at 919-962-4080 or ryan DOT thornburg AT unc DOT edu. Please include: “Query – OpenBlock RFP” in your subject line.

    Sex With Me and Porn on YouTube

    I met yesterday with two old friends who work at two news organizations in Washington, D.C., where the Online News Association annual conference is being held. I asked both about what had been their big online stories and what was driving traffic to their sites. One said that the biggest story of their year was about porn on YouTube. The other said his site’s most trafficked story was about “sex with me”.

    So what?

    To me, it highlights two trends: the audience is in control, and it is much more difficult to force-feed it its civic vegetables. And having a well-designed homepage, while important, isn’t as important as writing water cooler stories.

    One of the news organizations said that they had noticed a lot of traffic on stories about pot. And that definitely played into their conversations about what to cover.

    As the news environment gets more and more competitive and news organizations get more sophisticated, the pressure to cover sex, drugs and rock and roll is only going to increase.

    How is your organization balancing that with journalistic ethics? How are you teaching the ethics of these situations?

    Triangle’s Media Ecosystem Needs Tributaries and Mainstream

    Sitting next to News & Observer editor John Drescher last Friday during a forum about the Triangle’s media landscape, I had to feel a bit sorry for him. Of the nearly 20 representatives of news media in the region, he was the most prominent representative of the mainstream media and drew all the fire from the bloggers, entrepreneurs, do-gooders and pontificators who had him easily outnumbered and whose smaller organizations had often beaten his Goliath newsroom on important stories.

    But I also envied Drescher. He was also the only one at the table who had ever dropped $200,000 of his company’s money on an investigation of a state agency. And the only one who knew what it was like to spend four years pinging the government for public records before he had a story solid enough to sell to his subscribers and advertisers.

    One other thing made Drescher an enviable character in the Triangle’s media ecosystem. Despite their valid criticisms of increasing gaps in The News & Observer’s coverage of our communities, many noted without irony in their voices that the small, independent and non-profit news operations had the most impact on public policy when they got the attention of Drescher’s paper or one of the local television stations.

    And that made me realize that if our state is going to retain its generation-long reputation as a home for journalism that gives voice to the voiceless and holds powerful people accountable, then we must find a way to foster dozens of new and diverse tributaries of news and information that flow into the big, slow-moving mainstream media. Without the tributaries, the MSM seems likely to evaporate entirely. Without a larger channel into which they can empty, the tributaries seem likely to overwhelm us with a flood of disconnected datapoints.