Category: Online Newsrooms
Knight News Challenge Proposal: Crowdsourcing Data to Bring OpenBlock to Rural America
At the top of my To Do List this week is the completion of one of the proposals I’ve submitted to the Knight News Challenge this year. I’m posting it here in the hope that you’ll have some feedback on whether/how a service like this would be technically feasible, editorially useful and financially viable. I’m especially interested in hearing from editors of small papers, public records experts, civic/community organizers and anyone who’s worked on the OpenBlock code.
Under what conditions would you volunteer to help a project like this in your community? News organizations: how much would you pay for a service like this? What characteristics would it need to have to make it worth your money? What else do you see here that needs further clarification?
(And a big hat-tip here to Penny Abernathy, the Knight Chair in Digital Media Economics here at the UNC-Chapel Hill School of Journalism and Mass Communication. She got this project kicked off with a grant from the McCormick Foundation and is my co-pilot on this application.)
Here’s our draft pitch:
Crowdsourcing Data to Bring OpenBlock to Rural America
This project would create a co-op to develop and deploy public records databases at news organizations, especially those serving communities of fewer than 75,000 people, preparing those records for presentation and integration in an OpenBlock format.
These rural news organizations are struggling to move to the digital age in part because their staffs are so small they don’t have the capacity to identify, digitize, re-aggregate and map all the various public records available at the state and local level into databases that can be accessed intelligently by both reporters and the reading public.
The project would tackle the lack of capacity at rural papers from two directions. It would create a centralized repository of state, county and city schemas and datafeeds that could be easily used in OpenBlock. This is a job well-suited for a small group of experts. In addition, the project will create a statewide corps of amateur data-checkers and records requesters. Data quality assurance and data gathering are jobs well-suited for a crowd of many people, each working on a small piece of the puzzle.
These volunteer citizen-journalists would actually be member-owners of a co-op business. Each task they perform would earn them additional shares in the company’s annual profits. We would generate revenue by charging rural newspapers a fee. The more records and the better their accuracy, the more news organizations would sign on for the service.
In some cases, volunteers would pick up CDs of data from county offices. In others, volunteers would scan and upload PDFs of hand-written police incident reports. In still other cases, people would key into a database the information on those PDFs. This job is so big that no single small news organization could do it. But with a corps of member-owners working together, we could create a model for gathering valuable public records from rural America. To individual communities, these records are necessary to foster an informed civic dialog and healthy economy. But in aggregate, these records may also be able to shed light on trends in rural America that would otherwise go unreported.
Improving Delivery of News and Information to Geographic Communities
In small towns and rural America, the local newspaper is more than just a source of information and an engine of commerce. More importantly, it fosters and builds geographic community and sets the agenda for public policy debate. This project will foster civic and community engagement — first, by forming a network of knowledgeable volunteer citizen-journalists, and also, by making public records readily available and organized to support decision-making and accountability at all levels of government.
In many cases, data that is readily available in GeoRSS or at least CSV format from big cities (such as this example from San Francisco) is simply not available even in print from rural governments. For example, journalism students at the University of North Carolina working last semester to gather and organize public records in two rural counties for an OpenBlock application met with a number of obstacles (which they describe in their blogs) – ranging from significant photocopying fees to inappropriate redactions and denial of access to public information.
Even when acquisition of public datasets is relatively simple (for example, public health restaurant inspections), someone must request that data from a specific county be exported in fielded data format. It is inefficient for each rural news organization to make separate requests for this data in each of North Carolina’s 100 counties. In these cases, our public records co-op would outline an initial request for the data for each county.
Currently there is no tool or service that can efficiently gather, format and publish public records on rural news organizations’ sites. In part, this is a technology problem that may soon be overcome with the alpha rollout of OpenBlock later in 2011. But a much bigger piece of the problem is the data itself – neither OpenBlock nor any other technology has the ability to obtain public records as fielded digital data and create a newsworthy user interface for all the various types of records a news organization might need.
Without a project like this there is no indication that OpenBlock will be a viable option for papers in rural communities.
What Will Change?
By the end of the project, we will have
• at least one member-owner in each county in North Carolina
• at least 12 news organizations subscribing to the service
• at least one type of schema for which we’ve collected data from each county
Most importantly, we will have raised public awareness of open government and we will start seeing rural counties and towns publish public data in standardized, machine-readable formats on the Web.
What tasks/benchmarks need to be accomplished to develop your project and by when will you complete them?
How will you measure progress?
Do you see any risk in the development of your project?
How will people learn about what you are doing?
Is this a one-time experiment or do you think it will continue after the grant?
Breaking News Emails: An Under-Appreciated Art
I have a tumultuous relationship with breaking news e-mails. One day we have a strong relationship that I value. And the next thing I know they get all high-maintenance on me. Sheesh.
So today I unsubscribed from breaking news e-mail alerts from CNN and NPR. I kept the alert from the New York Times for two reasons:
After thinning the herd on the national news, I planned to dump my alerts from either the News & Observer or WRAL. But when I went to do it, I just couldn’t choose. Looking over the past six months of alerts, their news judgment seems to be radically different. It’s almost as if one news organization will not send an alert if the other organization already has. So in order to get a complete range of local news alerts, I need both. But the downside to that arrangement is that I consider probably only 50 percent of the local alerts from either provider important enough to merit an interruption in my inbox.
So now what strikes me is how little time I spend talking with students about “good” news judgment and writing style for e-mail alerts. And how difficult it is to teach a technique that seems to have no consistent application among professionals. This is the perfect example where we in the classroom need to document the editorial processes around writing and distributing breaking news alerts in various newsrooms. In each newsroom, what do the journalists say are the goals of the alerts? Is there internal or external agreement on those goals? And then we in the classroom need to develop quantitative research that can help the professionals know which news judgment and writing styles best meet those goals. And then we in the classroom need to develop experimental editorial products that do a better job meeting the goals — maybe change the way news judgment and style could be tailored to the needs of individual users based on their demographics, location or behavior.
In the end, the common email alert seems to be a great example of a place where academics and industry could work together to build a better product and foster a more informed society.
Copyediting and Computer Code
Every now and again I’ll find myself in a conversation with copyeditors about the future of their craft. One point I often bring up is that a big part of the job in online newsrooms needs to be overall QA of the site. And one of the most challenging workflows to support that is the copyediting of computer code. The example I always use to illustrate the point is the AP style on state abbreviations. If the Web developers define the abbreviation for California as “CA” instead of “Calif.” … well that’s something that should stick in the craw of every copyeditor until the code gets changed.
And now I have an actual piece of code to illustrate the example. (This comes from the code that runs OpenBlock — the much awaited open-source version of Adrian Holovaty’s EveryBlock. This isn’t meant to pick on that community. They’re doing difficult and needed work. And this could happen anywhere… which makes it a good anecdote.)
What’s the workflow in your newsroom for making sure that this gets changed to “Reporting Officers’ Names” before launch? Should the designers give editors a mock-up of all the static text elements (including words-as-graphics) on the page? Should the developers give editors printouts of all the tables that contain datafields that might get on the live site? Or do you just publish and come up with some sort of sampling scenario?
How does it work in your newsroom? How should it?
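One way to picture a "sampling scenario" for this kind of QA is an automated check that flags postal-style state codes wherever they turn up in text or code headed for the live site. The Python sketch below is only an illustration under stated assumptions: the function name and the three-state lookup table are hypothetical, the table is a tiny subset of AP state abbreviations, and none of it comes from the OpenBlock code.

```python
import re

# Hypothetical lookup: a small illustrative subset mapping postal codes
# to AP-style state abbreviations. A real check would need the full list.
AP_STATE_ABBREVIATIONS = {
    "CA": "Calif.",
    "NC": "N.C.",
    "NY": "N.Y.",
}

# Match any of the postal codes above, but only as a whole word.
POSTAL_PATTERN = re.compile(r"\b(" + "|".join(AP_STATE_ABBREVIATIONS) + r")\b")


def find_postal_abbreviations(text):
    """Return (postal_code, ap_abbreviation) pairs for each postal code found."""
    return [
        (match.group(1), AP_STATE_ABBREVIATIONS[match.group(1)])
        for match in POSTAL_PATTERN.finditer(text)
    ]


# Example: a line of code that a copyeditor would want changed before launch.
for postal, ap in find_postal_abbreviations('state_label = "CA"'):
    print(f"Found {postal}; AP style calls for {ap}")
```

A check like this would only catch what it has been told to look for, so it would supplement an editor's read of the static text, not replace it.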
Ryan Thornburg is the author of the new online journalism textbook and newsroom manual, Producing Online News, available from CQPress.com
Article Comments Are Alienated Experience
Jaron Lanier, one of the pioneers of virtual reality, once said something that I often use when thinking or speaking about online journalism: “Information is alienated experience.” A blog post from one of my students at UNC has done a nice job recording an anecdote from the 2010 Online News Association conference that I think brings into focus the role of comments as a form of alienated shared experience.
Michelle Cerulli, a second-year MA student, told me this story and I encouraged her to blog about it. The short version is this: While attending a session about article comments, she watched a mild-mannered man use Twitter to quietly excoriate one of the speakers. This man didn’t stand up and confront or question the speaker in person. Instead he used this virtual soapbox to disagree with her — in what Michelle described to me as incredibly rude terms — about the role of comments on online news articles.
What was his beef with NPR ombudsman Alicia Shepard? She was saying that online comments tended to be more vitriolic than you hear in “the real world.” His words on Twitter said that Shepard was wrong. But his behavior said that she was dead on. And, according to Michelle, he appeared to be oblivious to the irony.
And while this story so far might seem to some a perfect set-up for a conclusion in which I rail against online comments, that’s not where I’m heading. Online comments are important because it is there that our collective id gets revealed. Many of us reveal in anonymous or pseudonymous comments our fears and hopes in ways that most of us would deny if we were ever confronted with them. Online comments show how we (or at least some non-representative sample of us) experience the world in a way that we have alienated from ourselves and the polite company around us.
And that unfiltered id — that alienated experience — is a happy hunting ground for a reporter who hopes to more clearly explain to his readers our increasingly complicated and interconnected world. The problem with comments is not that they are mean. The problem is that there are too few people mining them for hidden hopes and fears and too few people willing to patiently ask probing questions of the crowd.
More and more news organizations are hiring “social media producers.” I hope they’re given the challenge of not just distributing the news to the crowd, but also diving into it and finding individuals who are able to articulate why they’re much more scared, angry or jealous than they are willing to admit in a room full of their peers.
Lessons From ONA ’10: What It Takes, Part 2
Aggregation continued to be one of the online news community’s big buzzwords at the 2010 Online News Association conference last week. The idea behind aggregation is that individual news organizations can achieve comparative advantages and that the entire information economy can function more efficiently if the news organization links to reliable information from bloggers, sources and other news organizations rather than replicating the information with its own take.
But aggregation isn’t free. You can either automate it, which might cost a newsroom $25,000 to $100,000 in up-front costs, plus constant tweaking of the algorithms and processes that gather, organize and automatically publish news stories from external sources. Or, you can put humans and their infinitely superior cognitive flexibility on the task.
But what does that cost? Here are some estimates I’ve put together based on conversations at ONA:
* It takes an average of 8 minutes for a news producer to read a blog post or news story, write a summary and categorize it by location and subject.
* Based on a VERY limited sample that desperately needs further research, you can estimate pulling in one blog post per week for every 4,500 people in your market. (Please send me any data you have that would help me solidify this number.)
In my home market of Raleigh-Durham, which has about 1.5 million people, aggregating local content might take about one full-time position and cost a news organization maybe $35,000 a year plus benefits.
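For anyone who wants to plug in their own market size, here is a minimal Python sketch of that back-of-the-envelope estimate. The function name and the 40-hour work week are assumptions made for illustration; the 8 minutes per item and one post per week per 4,500 people are the rough estimates listed above.

```python
def weekly_aggregation_hours(market_population,
                             people_per_weekly_post=4_500,
                             minutes_per_post=8):
    """Estimate the producer hours per week needed to aggregate local posts."""
    # One post per week for every `people_per_weekly_post` residents.
    posts_per_week = market_population / people_per_weekly_post
    # Each post takes `minutes_per_post` to read, summarize and categorize.
    return posts_per_week * minutes_per_post / 60


# Raleigh-Durham, at about 1.5 million people.
hours = weekly_aggregation_hours(1_500_000)

# Assumption: one full-time position is a 40-hour week.
full_time_positions = hours / 40

print(f"{hours:.1f} hours per week, about {full_time_positions:.1f} full-time positions")
```

For a market of 1.5 million, that works out to roughly 44 hours a week, or a little more than one full-time producer.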
How does that match your experience with aggregation? What am I missing?
Lessons From ONA ’10: What It Takes, Part 1
At least three national news organizations approached me at last weekend’s Online News Association conference to see whether I could recommend any students with great news judgment and programming skills. That’s what news organizations are desperate to hire today. Why? Well, as former president George W. Bush will tell you, some things, like learning how to program, are just hard work.
Lunch with a friend last week helped me put some numbers on just how hard it is. I was meeting with him so that he could show me the server he set up and the computational journalism he had been doing since we last had a chance to catch up. At heart, he is a writer and a reporter, yearning during our conversation for the chance to do more long-form narrative text stories. But in his newsroom, he is the resident programmer/journalist and has been asked by his editors to hire more people like him.
Here’s what it took for him to become “tech savvy.”
* In high school, he took one computer programming class. He didn’t study or use computer programming at all in college. He wrote and edited stories at the campus paper. After graduation, he was hired in jobs as a researcher or blogger.
* He works 60 to 75 hours per week.
* He spends 90 percent of his time working with and learning about computer coding.
* It took him two years to get to this point of technical proficiency.
* That is a total of 5,500 hours.
He was not born with the IT chromosome. He did not wish himself to a state of savvy. He has clearly been blessed with an incredible brain that was nurtured in an environment that valued education and intellectual curiosity. But that didn’t get him his job. He got his job because. He. Worked. Hard.
Let’s point out how difficult it is to get 5,500 hours of computer time under your belt.
* College students spend about 15 hours a week in class. Good ones will spend another 25 hours reading and working outside of class. That’s 480 hours a semester, 560 hours a year. At that rate, taking ONLY coding classes, you’ll get to 5,500 hours in just under 10 years. Which makes you this guy. Nobody wants to be that guy, so it’s time to accept that editorial programmers are committed to life-long learning.
* Let’s say you knock out a few coding classes in school — 500 hours worth — enough to get hired by a big news organization as a developer. That leaves you with just 5,000 hours to go. Working a standard 40-hour week, you’ll burn through those in 125 weeks. That’s about 2.5 years, after various and sundry holidays, illnesses and vacations.
* Or, maybe you were a good liberal arts student and didn’t blow any of your tuition on coding classes. But your smarts and broad-based knowledge land you a job at one of a very few news organizations that commit seriously to career development. Google spurs innovation with its famous “20 percent time,” which allows its developers to spend a day a week working on projects that are not part of their job descriptions. So, your boss lets you play with computers for one day a week. You’ve got 5,500 hours to make up. And by the time you’re celebrating your 35th birthday you’ll probably be at the point where you can start developing your own editorial applications.
What the conversation with my friend made me realize is why it irks me so much when people come to me saying that they can’t perform some computing task because they are “technically illiterate” or “not a computer person.” My friend isn’t a computer person. I’m not a computer person either. But we try. We hack our ways through incredibly frustrating failures by simply doing this. And so can you. If you want.
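The last two scenarios above can be checked with a few lines of Python. This is a rough sketch, not a precise model: the function name is hypothetical, and it assumes 52 working weeks a year and an 8-hour day for the one-day-a-week case, so time off would stretch both answers.

```python
HOURS_NEEDED = 5_500  # the total hours cited in this post


def years_to_reach(hours_remaining, hours_per_week, weeks_per_year=52):
    """Years needed to log `hours_remaining` at a steady weekly pace."""
    return hours_remaining / (hours_per_week * weeks_per_year)


# Hired as a developer with 500 hours of classes done, coding 40 hours a week.
developer_years = years_to_reach(HOURS_NEEDED - 500, 40)

# Liberal arts grad coding one day a week (assumed to be an 8-hour day).
one_day_a_week_years = years_to_reach(HOURS_NEEDED, 8)

print(f"Full-time developer: {developer_years:.1f} years")
print(f"One day a week: {one_day_a_week_years:.1f} years")
```

Before any holidays or sick days, the full-time path comes to about 2.4 years and the one-day-a-week path to about 13 years. For someone who starts at around 22 (another assumption), that lands close to a 35th birthday.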
The List: Quotes and Notes From TBD.com at ONA ’10
Having a pithy quote or an insightful stat is key to being a good panelist at a conference. TBD.com’s Jim Brady and Erik Wemple have both in spades, so it was good to open the Online News Association conference with a session on their new Web site. Here, in list form, are the best insights from the panel.
1. Jim Brady, on the need for diverse revenue streams for digital news media: “There’s no silver bullet, there’s shrapnel.”
2. Erik Wemple, on the importance of linking to other news sources: “With 12 reporters and 5.3 million people in our market, our editorial vision is smoke and mirrors.” (Note to self: I haven’t seen anyone ask TBDers about how much risk they see in incumbent media orgs putting their news behind pay walls. They must have calculated that either the odds of that happening are pretty low or that the impact on revenue wouldn’t be a company killer.)
3. Wemple, expanding on the editorial vision of the site: “We want to be a place where, if you hear a siren, you go to #tbd and you find out what’s wrong.” (That news sensibility speaks to TBD’s legacy partner, the local news channel formerly known as Newschannel 8. It does, at least at first blush, seem to ignore anything that’s not event-based news.)
4. Brady, on selling local ads: The biggest challenge in local advertising is getting businesses online. TBD is developing service models, network models and Paper G to help bridge that gap. A quarter to a third of the blogs that TBD aggregates participate in its ad network.
5. TBD aggregates 196 blogs from the Washington area. This includes professional media, amateur media, and corporate, government and non-profit organizations. That’s one blog for every 27,000 people in TBD’s market. I wonder what is the smallest market size that could sustain an operation like TBD? It seems to me that this indicates a role for news organizations in medium-sized markets to cultivate local bloggers to reach some minimal threshold that would make aggregation useful. I also wonder what the typical blog-to-person ratio is in a typical media market. (Calling all grad students. There’s a good research question for you.)
6. The panel of TBDers recounted the news organization’s coverage of its first big breaking news story. Bloggers in TBD’s network as well as the site’s general audience provided important eyewitness accounts of the scene. This anecdote illustrated two important elements of crowd sourcing: First, that crowds are best at stenographic journalism. They are good at supplying answers to the who, what, when and where questions when those answers come from immediate observation of events or documents. That means that crowds are relatively efficient at feeding editors information during breaking news, especially stories that develop across a broad geographic area all at once. Second, that if you want to use crowd sourcing when news breaks you have to develop relationships with your audience BEFORE news breaks. Anyone who’s ever cultivated a source knows that means a lot of chit-chat that appears to have nothing to do with the news value of the source. Same on social media.
7. Wemple, on the power of fertile failure: “If you have a Web site that doesn’t have something terrible on it, you’re not trying hard enough.”
8. TBD social media producer Mandy Jenkins, on ignoring your critics: “In the age of social media, that’s something we can’t do anymore.” (More of her thoughts here.)
9. Steve Buttry recounted his tale of interviewing Jenkins for the job. It was a good reminder that journalism job candidates who display curiosity always move their resumes to the top of the pile. In online media, this means your ability to show that you try to hack devices and services just to see how they might be able to solve a problem other than the one their developers intended them to solve.
10. Jenkins, raising suspicion that she may be the Mike Allen of TBD, said she has 22 Tweetdeck columns, follows about 200 feeds and that “We follow anyone who’s ever given us a tip.” (Note to journalism grad students looking for an academic research question: It would be very interesting to see whether news Twitter accounts with high follow/follower ratios yield significantly different levels of trust or relevance among the audience.)
11. TBD.com’s corporate parent expects the site to take about as long as its sibling, Politico, to become profitable. That took about three years. So, a note to the laid-off reporters and editors who’ve called me with dreams of starting your own news site to compete with your former employer. Step 1: Gather up three years of operating costs…
12. Writing brief, smart, newsy lists without an editor… is hard. Perhaps something we could be teaching our students.
Welcome to JOMC 491: Public Affairs Reporting for New Media
With only a few weeks left before the start of the fall semester, I wanted to quickly give registered and prospective students a little bit of an idea about what we’ll be doing in Public Affairs Reporting for New Media this semester. Seats are still available, so act now!
The goal of the class will be to develop a new online editorial product for the newspapers in Whiteville and Washington, N.C., that will help them be a comprehensive and highly engaging source of news and information for their communities. (Perhaps something like Everyblock.com.)
So, the first thing to know about the class is that you will be expected to go to those cities — both about 2.5 hours from Chapel Hill — at least once and probably more during the semester. I’ll pick up the tab for your trips, but you will need to arrange your own transportation and schedule.
The reason we’ll be working with these two towns is that they are part of a larger effort being led by Knight journalism professor Penny Abernathy and funded by The McCormick Foundation (founding family of The Chicago Tribune) aimed at helping small newspapers make a financially sound transition to a digital economy.
So do you need to know anything about computer programming, or media economics or news reporting and editing? Not really, but you’ll probably be much better off if you’ve had exposure to at least one of those topics. If you haven’t then you’ll need to rely on your own curiosity, self-motivation and time commitment to ensure your success and happiness in the course.
The class is going to be structured probably unlike any other class you’ve taken at Carolina. First, it has the experiential service-learning component. That means less reading and note-taking from lectures. It means more class discussion and hands-on group projects. My goal is for this class to teach you, as much as anything else, how to clearly articulate and creatively solve messy, complex real-world problems. To do that, we’ll be using the context of improving public affairs reporting for the people of North Carolina by using new digital news tools and concepts.
What will you do in the class?
The first half of the class will be an introduction to the problem with the second half focused on trying out different solutions. In class, we’ll be discussing articles, brainstorming and prototyping (making models that can give us a better idea of how people might use our website). Outside of class, you’ll be keeping a 2x/week blog of reflections, reading articles, and working in groups to figure out what barriers stand in our way of building a great site and then figuring out for yourself how you will overcome those barriers. I promise to be your guide.
How will you be graded?
30% – You’ll launch your own blog and update it twice a week. Some weeks I will give you specific assignments (write a descriptive report about Whiteville, discuss the readings, etc.) but most of the time you’ll simply write about your experiences.
30% – Prototyping. In many classes, you may have been asked to write or create one big final project that demonstrates your knowledge of what you learned. But in this class, you’ll practice the art of “fertile failure” — trying a lot of ideas, making a lot of mistakes and learning from them. You will be rewarded for failing fast and failing smart. We will use everything from toothpicks to MySQL to build our prototypes. You’ll start by using the materials with which you’re comfortable and end the semester by using tools that terrified you just three months earlier. These will be different tools for each student.
30% – Participation. Come to every class with a lot of questions, fulfill your service obligation, participate in online discussions outside of class.
10% – Data management and public records assignments. A big part of our prototyping and brainstorming will be around how to obtain public records and make them useable in an online database. You’ll have a few projects to get you familiar with the basics of the technology and issues surrounding this topic.
I hope that gives you a rough idea of the class. I’ll be posting a full syllabus and calendar soon. But in the meanwhile, enjoy the rest of your summer and let me know if you have any questions.
Convergence in the Classroom, Metamorphosis in the Newsroom
Newsrooms still have people who specialize – some in new skills and some in old. But they also have folks who have a wider variety of skills and duties. Journalism schools have to give students the opportunity to prepare for both kinds of roles.
“Convergence” has always been my least favorite word to use to talk about newsrooms. Yesterday’s AEJMC conference presentation by John Russial and Arthur Santana reminded me why.
Oh, their presentation was very good. Russial’s research about newsroom technology and roles is always enlightening. But a blog post from Alfred Hermida (who, by the way, is the conference’s best tweeter @Hermida) picked up on the presentation’s use of the word “convergence” and made me realize how broad of a definition that word can have. Hermida’s headline was “AEJMC: Newsrooms slow to move toward convergence” and he goes on to report that “Russial concluded that job specialisation remained the dominant organizing principle, with editors prizing depth rather than breadth.”
On Twitter, the unfortunate headline has been in circulation. I say it’s unfortunate because I think it misrepresents Russial’s presentation in a way that the rest of the blog post does not. My impression was that Russial’s research found that convergence IS happening in newsrooms, but that it is happening at the organizational level rather than at the individual level. He didn’t address whether convergence was happening at the story level.
And if you had to read that last paragraph a few times, you know why I don’t like to use the word convergence.
That said, I think Russial is right about the level at which convergence is happening. His findings are supported by the paper that Ying Du and I presented at the same session and they are supported, too, by an earlier unpublished study I did of online journalists in North Carolina.
The North Carolina study found that, on average, online journalists say they have had nine different duties at least once in the last three months. More often than anything else, a respondent said he or she had five different duties. But it also found that not everyone is doing everything. There is specialization of “new media” skills.
And in the paper we presented yesterday, online journalists said that the concept most important to their job was “multitasking.” (Journalism instructors, however, ranked multitasking as seventh out of 10 concepts. Leading to the challenging question: How do you teach multitasking?)
I didn’t research this, but I suspect that photographers are also shooting video. Reporters are blogging. Designers are animating. Copyeditors are producing story packages in a CMS. It’s not convergence as much as it is metamorphosis. And we aren’t seeing caterpillars becoming ducks. Not surprisingly, we’re seeing caterpillars becoming butterflies.
There are some roles in the newsroom that AREN’T converging. In the North Carolina survey, journalists who write original stories for the Web, edit text for content, and work with databases tend to perform very few other tasks.
I don’t have enough data to support this, but I also suspect that role convergence is much more likely to take place at small news organizations while specialization (and diversity) of roles is more common at the largest news organizations. And because students tend to start at small organizations and later join large organizations, this distinction is important (if indeed true). Understanding it can help journalism educators better frame the choices they have when dealing with curriculum change.
So, what does that mean for journalism education and curriculum change? I think a few things:
- Every journalism student should have a basic introduction to a broad variety of skills – writing/editing, reporting, photography/design, computer programming/algorithmic thinking and law/ethics.
- Journalism students should become proficient in a particular set of concepts and skills that we define as being similar.
- “New media” skills should be incorporated into core classes. That means squeezing audio-video information gathering into reporting and design classes. It means that every class should talk about using social media for gathering and distributing news. If there are specific classes in “social media” or “animated graphics” or even “magazine design” or “sports writing,” they should be advanced courses that students take after getting a basic introduction to those topics in earlier classes.
- Incorporating new skills and concepts into core classes comes at the cost of spending less time on the traditional skills that are still so valuable. That’s why further specialization is so important.
- Journalism students should also have a broad education that introduces them to economics, art, history, science, politics and all the rest. And students should also specialize in a subject area. (Again, I suspect that as newspaper staffs shrink, beat assignments are where we’ll find the most convergence. At the same time, the brand disloyalty of the online news audience is promoting beat specialization and the development of new niche topical expertise.)
- The purpose of the broad-based core curriculum – and the reason for including “new media” skills and concepts in those courses – is to give journalism students the vocabulary and news judgment they need to collaborate with specialists.
- Finally, as Russial pointed out in his presentation, the adoption of newsroom technology has tended to follow a pattern. First, technology leads to automation. Journalists whose careers are built around their expertise in quickly and accurately performing a rote task and not around thinking creatively and critically will lose their jobs. But then, technology leads to specialization. As new tools become available not everyone can be equally skilled at each one.
Dealing with the unresolved debate over convergence or specialization was one of the biggest challenges of writing my textbook. I dealt with it in a way that supports the solution I’ve begun to outline here: we need both. How’s that for convergence?