First of all, let’s not allow the alluring alliteration to distract from what we’re really talking about: not robot reporters, but robot writers.
Mashable’s Lance Ulanoff asked me what I thought about the news that Durham’s Automated Insights would be writing automated business stories for the Associated Press.
This trend excites me about the future of journalism. I’ve been talking with folks about it for about five years, since I first saw similar work that was being incubated by Northwestern’s journalism school. That effort grew into the company Narrative Science, which has been writing earnings preview stories for Forbes.com. The Los Angeles Times uses an algorithm to write earthquake stories. The Washington Post has looked into using Narrative Science for high school sports stories.
The Guardian learned how hard it is to build a robot writer, but the automated stories I’ve seen written by both Automated Insights and Narrative Science are pretty good. And 46 media and communications undergrads couldn’t distinguish a computer-written story from one written by a human.
The trend in automation should free up the best writers and best reporters to add the how and why context that still needs to be done by humans. If I were a beat reporter at a newspaper I’d be working as fast as I could to convince my editor to let a computer write the scut stories I have to write and free me up to do more explanatory and accountability reporting, or to craft beautifully written narratives.
One significant risk is that for the last decade we’ve seen “good enough” journalism growing in popularity. News organizations that continue to have a strategy of harvesting profits rather than investing in growth will no doubt cut reporters if machines can write commodity news at a lower cost.
If I were a young journalist looking for my first job, I’d be looking for news organizations that are sustaining a small margin and growing both expenses and revenues — the ones that are using both bots and humans.
The trend toward automation will result in an emphasis on the news value of impact. Mass customization is going to change the nouns in the leads of stories from the third person to the second — “investors” will become “you.”
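Here is a toy sketch of that shift from third person to second. It is not how Automated Insights or Narrative Science actually work; the function names, the company and every number are made up for illustration. The same earnings data fills either a generic lead or, given a reader's own holdings, a lead addressed to "you."

```python
def generic_lead(company, change_pct):
    """Third-person lead: the same sentence for every reader."""
    direction = "rise" if change_pct >= 0 else "fall"
    return (f"Investors saw {company} shares {direction} "
            f"{abs(change_pct):.1f} percent after earnings.")


def personalized_lead(company, change_pct, shares_held, price_before):
    """Second-person lead: computed from the reader's own (assumed) holdings."""
    direction = "gained" if change_pct >= 0 else "lost"
    dollars = abs(shares_held * price_before * change_pct / 100)
    return (f"Your {shares_held} shares of {company} {direction} "
            f"about ${dollars:,.0f} after earnings.")


# Hypothetical company and numbers, for illustration only.
print(generic_lead("Acme Corp.", -2.5))
print(personalized_lead("Acme Corp.", -2.5, shares_held=200, price_before=40.0))
```

The data and the template barely change between the two versions. What changes is that the second one needs to know something about the reader, which is where the business questions start.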
The trick is how to make money off this. News organizations that continue to see themselves as manufacturers of goods will probably increase the volume of digital commodity content they publish and continue to drive down ad rates.
But smart content companies are evolving from a manufacturing industry to a service industry, and trying to create, explain and capture the value they provide to each client by getting the right information to the right people at the right time.
What we see now as data is as unsophisticated as what many of us thought of data when Google first made its mission organizing all of it. We think of data now as numbers in tables — scores, money, temperatures, but we’ll soon see data as behavior and content metadata. And we will see automated stories that incorporate the user’s data and the data of her social network as well.
That level of concierge news service, though, is going to come at a price for users. If we’ve seen the democratization of media, this automation trend has the potential to create a world of media haves and have-nots: the haves will pay premium subscription fees to get highly personalized news from bots. The have-nots will get generic news (maybe written by bots as well).
The one thing from which I think everyone will benefit is an increase in the quality and frequency of narrative writing, and of explanatory and accountability reporting.
To aid that transition I’m working on the idea that we can use digital public records to build a newsroom dashboard system that will alert beat reporters to possible story ideas. Automated Insights and Narrative Science are scaling commodity news stories. I want to see if we can lower the human reporters’ opportunity cost of pursuing enterprise stories that land with much bigger and much longer lasting impact.
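The dashboard doesn't exist yet, so what follows is only a minimal sketch of one way an alert could work, under assumptions of my own: public records have already been counted by topic and by month, and a "possible story" is any topic whose newest count sits far from its own history. The function name, the threshold and the sample counts are all hypothetical.

```python
from statistics import mean, stdev


def story_alerts(counts_by_topic, threshold=2.0):
    """Flag topics whose latest count is unusually far from their own history.

    counts_by_topic maps a topic name to a list of periodic counts,
    oldest first; the last item is the newest period.
    """
    alerts = []
    for topic, counts in counts_by_topic.items():
        history, latest = counts[:-1], counts[-1]
        if len(history) < 3:
            continue  # too little history to judge what is normal
        spread = stdev(history)
        if spread == 0:
            continue  # flat history; skip rather than divide by zero
        score = (latest - mean(history)) / spread
        if abs(score) >= threshold:
            alerts.append((topic, latest, round(score, 1)))
    return alerts


# Made-up monthly counts for one hypothetical county.
records = {
    "burglary reports": [12, 15, 11, 14, 13, 29],
    "failed restaurant inspections": [4, 5, 3, 4, 5, 4],
}
for topic, latest, score in story_alerts(records):
    print(f"Possible story: {topic} at {latest} ({score} std. devs. from usual)")
```

A flag like this is a tip, not a story. The reporter still has to find out why the number moved.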
If you want a pithy quote from a journalism prof. on the effect that robot writers are going to have on the job market for journalism students, here it is: “My C students are probably screwed. My A students are going to do better than ever.”
Students in my “Social Media for Reporters” class have been working with Storify this semester, most recently on an assignment to cover University Day — UNC’s birthday. This year, the usually low-attention affair was interrupted by the news of the death of one of UNC and higher education’s most influential people of the last half century — Bill Friday.
Here’s what we learned about using Storify as a reaction piece:
- Students agreed they would be inclined to use it primarily as a tool for summarizing reaction or public sentiment, rather than a tool for replaying the tick-tock of a breaking news event. That may have been because of the nature of the assignment.
- Don’t write a placeholder headline for your Storify. Once you save it, it permanently becomes the URL of your Storify.
- If possible, lead with a photo. Perhaps this is a visual convention that comes over from news article Web pages. But images often set the scene for reaction pieces. Images can quickly show the reader the “what” in what might otherwise be called the lead of the piece, allowing tweets and other content to focus on the “so what”. The images that work best are also images with some text on top of them.
- Transitions are critically important, making the difference between a narrative and what otherwise is merely a spreadsheet of quotes. The transitions you write between Storify elements must introduce the immediate next item. If you describe a state of shock at the news of Friday’s death, the next piece of content can’t be a photo of students standing in line for waffles, which is then followed by a tweet reacting to Friday’s death.
- Transitions are the paragraph that sets up the quote. Be sure not to repeat in your transition all the information in the quote. Your writing is the “what” and the following tweet is the “so what”.
- Tweets work better than Facebook posts as content components of Storify, perhaps because of the visual nature of Storify and the brevity of the tweets.
- Also, we found the interaction between Storify and Facebook to be both unwieldy and unreliable. For example, Storify links would disappear from Facebook pages as we navigated through the site, increasing significantly the amount of time it took us to add content from Facebook. We also experienced unpredictable reliability of Storify links on Facebook — sometimes they would work and sometimes they wouldn’t. We tested across browsers, platforms, privacy settings of content and types of content. But we couldn’t discern a clear pattern.
- We had two reports of students who said they had to essentially do the assignment twice because of technical difficulties curating and editing the pieces of content in Storify. One had to do with an inability to arrange a YouTube video within Storify once it had been added. The solution was to close the browser and re-open Storify.
- Note that there’s an important ethical issue to consider when using Storify on Facebook content — you can Storify any content that you can see and then make it public — not “Friend” public, or even Facebook-only public. But Everyone public. You cannot do this with tweets.
- The ending of your Storify is critical. Good endings seemed to be a tweet that either spun the story forward, summarized public sentiment or drove the reader to further interactivity and engagement — a place where the reader could react to the story.
- As in any reaction piece, you have to be aware of the diversity of your sources. First, consider which kinds of diversity might contribute to different viewpoints. Sometimes it’s racial diversity and sometimes geographic and sometimes political, depending on the story. When journalists mix personal and professional uses on their social networks, they are more likely to see content that looks like them. Compounding that social and cultural bias is the algorithmic bias: Facebook and Twitter are going to try to give you content their algorithms think you will like. This will be based, presumably, on your previous interactions with content as well as your demographics and the demographics of the people you follow. When using Storify to create a reaction piece, journalists have to go out of their way to look for different viewpoints. Using custom lists on Twitter and Facebook, geo-targeted searching in HootSuite, and following partisan hashtags or accounts can help mitigate algorithmic and personal tendencies toward homogeneity.
- It’s an old conversation, but one worth bringing up again in this context — understand that reaction pieces on Storify are inherently anecdotal and not a valid survey of public opinion. That said, consider whether you should give an equal amount of space to competing points of view regardless of how frequently you see each perspective. Or, whether you should instead try to weight the balance of space in your piece to reflect the frequency of each point of view you found in your search for content.
- Finally, the mix between “inside” and “outside” sources can dramatically change the tone of your Storify piece. In our case, we had reactions to Friday’s death from University officials as well as students and alumni. Reaction from officials adds the news value of prominence to your piece, but broad public reaction can increase the news values of magnitude and impact.
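The choice raised above between equal space and frequency-weighted space is easier to see with numbers. This is a small sketch with a made-up tally and a hypothetical helper. It splits a fixed number of Storify elements among viewpoints both ways and gives any leftover slots to the largest fractional remainders.

```python
def allocate_slots(viewpoint_counts, total_slots, equal=False):
    """Split a fixed number of Storify slots among viewpoints.

    equal=True gives every viewpoint the same share; otherwise shares
    follow how often each viewpoint turned up in the search.
    """
    if equal:
        weights = {name: 1 for name in viewpoint_counts}
    else:
        weights = dict(viewpoint_counts)
    total_weight = sum(weights.values())
    exact = {name: total_slots * w / total_weight for name, w in weights.items()}
    slots = {name: int(share) for name, share in exact.items()}
    # Hand out leftover slots to the largest fractional remainders.
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - slots[n], reverse=True)
    for name in by_remainder[:leftover]:
        slots[name] += 1
    return slots


# Made-up tally of 60 collected tweets on a hypothetical issue.
found = {"supportive": 40, "critical": 15, "undecided": 5}
print(allocate_slots(found, 10))              # weighted by frequency
print(allocate_slots(found, 10, equal=True))  # equal space
```

Neither split makes the piece a survey. The weights only describe what your search turned up, with all the biases noted above.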
We didn’t discuss these in class, but here are a few use cases for Storify I’d like to see:
- Virtual debate between two or more people on opposing sides of an issue. Take unprompted content from a community and splice it together to create the kind of conversations that seem to be less common in our disaggregated media world.
- Fact checking of what people say on social media. Use tools to determine assertions that are both common on social media and appear to be based on fact. Even better would be to find assertions made by different sides on an issue, but based on the same set of facts. It might be interesting to note whether we spin each other just as much as our candidates do.
Linking in news articles is a wildly under-appreciated craft, with many news organizations turning over this critical editorial task to algorithms rather than editors. Today’s sentencing of punk band Pussy Riot in Russia is top news this morning on most national sites, and is a perfect example of how links contribute to the editorial voice of a brand. For example, a quick look at a few of the top sites at 10:30 a.m. ET shows that nytimes.com is the only site linking from their story straight to the video that caused the ruckus in the first place.
The New York Times:
“The saga began in February when the women infiltrated Moscow’s main cathedral wearing colorful balaclavas, and pranced around in front of the golden Holy Doors leading into the altar, dancing, chanting and lip-syncing for what would later become a music video of a profane song in which they beseeched the Virgin Mary to rid Russia of Mr. Putin.
Security guards quickly stripped them of their guitars, but the video was completed with splices of footage from another church.”
These two links are among the five in the story. The other three are links to Times Topics pages on Russia, Putin and the Russian Orthodox Church.
The departure text in the story is a bit unclear to me. Links should usually be brief nouns, and I know what I’m getting when I click on “music video”, but I wasn’t sure what I’d be getting when I clicked on “video was completed”. That verb in the departure text made me think it would be a video of the act of completing, or a story about the completing done at the time of completion.
I need to follow up with two questions to the Times. I wonder why there are two links to what at first glance appear to be essentially the same videos. But I also wonder what was the editorial thinking behind linking to the videos in the first place — since clearly others either chose not to do so, or simply didn’t think about the option — and the reason they link to the Russian language versions rather than versions that have an English subtitle. The words in the video are offensive — ones that I’ll bet a good deal of money that the Times wouldn’t print online or on paper. Did they have concerns about linking to offensive language? Is that a reason they didn’t link to the English version?
Newspapers often make choices about what not to link. The Times — and most other papers — didn’t link to video of reporter Daniel Pearl’s beheading. The differences may seem obvious to you. Even if they do, it is important for journalism students and professionals to be able to articulate to themselves and others the differences between a Pussy Riot video and a beheading video.
Finally, I wonder whether there was conversation about the implications that linking to these videos might have on The Times’ ability to distribute the story in Russia or on its ability to report there.
The site has a very brief story right now, with only three internal links to topics pages — one on the Virgin Mary, another on Putin and another on Pussy Riot. An interesting mix.
The Huffington Post
Often held up as an example of “webby” news production, The Huffington Post has no links in its story. It seems as if its readers would be better served with links.
Yahoo’s story has more links than any other, and most of them go offsite. And while they link to a video of another video that got Pussy Riot in trouble, they don’t link to the video that’s at the center of today’s story.
The lead graph links to the band’s Live Journal website, and several links throughout the story go to source material, such as Paul McCartney’s statement in support of Pussy Riot, or go to other media sites. Yahoo links to NPR, The New York Times, The Guardian, the BBC, The Week, Wikipedia, Gazeta.ru and Reuters. It’s an example of one media company benefiting from the reporting of others by curating their stories, and I say that without judgment one way or the other.
NBC News & CNN
This site has long had a style of putting links to related stories in-line between paragraphs, rather than off to the side, where the click-through rate is (or at least was at one time) about half of the in-line click-through rate. It continues that editorial voice here, which is a decision to make links not part of the narrative, but a diversion from it. This is not a single multi-linear story, but several adjacent pieces of related articles.
CNN’s story follows the same editorial construction.
What do you think?
What’s the right way to link in this story? Why? Share your comments below, or send them to @rtburg
For More on Linking
See Chapter 7 of my book, Producing Online News
As part of the Knight News Challenge grant for OpenBlock Rural, I’d like to build capacity of North Carolina journalism students to contribute to the application’s code. It’s not the main point of the project, but it’s an element that will help the long-term sustainability of the community: both the OpenBlock community and the rural communities we hope to serve.
But building that capacity from scratch is no short task. As I’ve begun to map out a class or workshop on it, I was reminded of a book that I read to my kids.
- If you’d like to learn how to use OpenBlock, you need to know Django …
- If you want to work with Django, you’re going to need to understand how to edit files with nano or some other text editor, and you’ll need to know PostgreSQL, and you’ll need to know some Python …
- If you want to use Python in any meaningful way, you’re going to need to install some Python packages, or modules …
- If you want to install Python packages, you have to know how Python works on your computer’s operating system (Mac, Windows, and Unix) …
- If you want to know how Python works on your system, you have to be comfortable using the command line of Windows or Unix. You need to be able to list directory contents, change directories, read and change file permissions, manage Linux users, download and decompress files using gunzip and tar commands.
- … and you’ll need to know HTML and CSS
The paradox of teaching these things to students is that as the user interfaces of Web applications and computers get easier, and their use becomes more ubiquitous, the proportion of students with the hacker ethic they need to approach projects like this is reduced. That’s not a dig on students. The better something works out of the box, the less need there is to tear it apart, fix it, improve it. It’s like me and my car. Wheels turn. Radio works. Doors open. I couldn’t care less how the gears actually shift or how the “snow” traction works.
But I hope we’re not just training college students to be users of technology. College journalism students need an entrepreneurial mindset. It’s not just about teaching the technology. It’s about cultivating an entrepreneurial spirit, a way of skeptical knowing, and a hacker ethic.
It might have been an offhand comment, but the idea that “print is the new vinyl” is a rich analogy. It was made last week by the Knight Foundation’s John Bracken speaking at the Asian American Journalists Association conference.
“Bands … have recognized that vinyl encourages exclusivity, maximizes design potential and creates a depth of involvement that 0s and 1s cannot. Vinyl’s renaissance is not due to nostalgia — it’s due to the fact that musicians, labels and fans have built a creative and consumer experience based on what the format does well.
“I don’t want to beat this metaphor to death. Here’s the core of the comparison: as more and more of the content we consume is based on bits, the ability to engage with atom-based media will, for some, gain value.”
Based on a lot of time talking with people who prefer print to digital, I’d say its tangibility must be the top reason people prefer print.
The analogy is good because it not only deals with consumption habits, but also production.
Albums, in vinyl or CD, are a product, a good, a widget. They are a complete package. Digital music is disaggregated. It’s becoming increasingly a social experience. I suspect that one day soon we’ll be paying more for the service of digital music storage and delivery than we do for the content itself. This is going to be the same for news. Maybe it’s always been the same for both news and music businesses.
In any case I do think that print is going to be primarily for “hipsters.” The presence of high quality print is going to become a social signal — “I’m considerate. I invest time and money into my collection of knowledge. I enjoy learning about the world around me, not because it helps me make or save money but because I enjoy being aware of the world. I’m not a news junkie; I’m a news connoisseur.”
Vinyl signals the same things. Both the person with 100 records and the person with 10,000 digital songs can legitimately say “I’m REALLY into music.” But they mean different things.
As I wait for the Knight Foundation to give OpenBlock Rural the official “Go!” I’ve been talking with developers and working on nailing down an estimate of what it would take to get OpenBlock up and running. For that conversation, please tune in to this discussion thread on the eb code Google Group for developers.
I posted an earlier draft of this proposal already, but here is the full version of the proposal that has been selected as one of the finalists by the Knight Foundation in their 2011 Knight News Challenge. Let me know your thoughts. Thanks to everyone who has helped spawn and cultivate this project. Every conversation has me more excited about what we’ll be able to do for rural newspapers in North Carolina and across the country.
Describe your project
We will build a not-for-profit clearinghouse of data from state, county and municipal governments in North Carolina and deploy them through pilot OpenBlock installations at the websites of nine rural newspapers in the state. The datasets will include public records of particular interest, such as crime reports, real estate and restaurant inspections.
We have already conducted research, funded by the McCormick Foundation, that indicates deployment of OpenBlock on the websites of small and mid-sized papers could provide significant digital revenue potential, given the interest readers have in understanding their communities. But it also indicates that the main barrier to implementing OpenBlock is a lack of technical expertise at small papers, along with the high cost of ongoing data collection.
At the end of our 27-month funding period, we will have reduced the costs of acquiring, aggregating and publishing public data at community newspapers. We will also have developed one or more revenue models that demonstrate how meeting the information needs of a community can also be good business, even in small towns and rural counties of fewer than 75,000 people.
Rural news organizations are struggling to move to the digital age in part because their staffs are so small that they don’t have the capacity to identify, digitize, re-aggregate and map all the various public records available at the state and local level into databases that can be accessed intelligently by both reporters and the public.
This project tackles the lack of capacity at rural papers from two directions. It will create a centralized clearinghouse of state, county and city schemas and datafeeds that could be easily used in OpenBlock. It will also create compelling editorial content that will draw new, young readers to community information presented in a format and medium they want. Audiences for this kind of editorial product are loyal. They generate repeat visits by returning to seek updates on crime especially. And they also generate page-views and increased time-on-site as they search and sort the information.
We expect through this project to lower the costs of data acquisition and organization through a variety of methods that we will be able to assess and compare. In some cases, volunteers will pick up CDs of data from county offices. In others, journalists may scan and upload PDFs of hand-written police incident reports. In still other cases, people would key into a database the information on those PDFs. This job is so big that no single small news organization could do it. But with the support of a not-for-profit organization that provides centralized technical, editorial and advertising expertise, we could create a model for gathering valuable public records from rural America. To individual communities, these records are necessary to foster an informed civic dialog and healthy economy. But in aggregate, these records may also be able to shed light on trends in rural America that would otherwise go unreported.
This project will demonstrate one way that universities can support and advance journalistic activity – by providing a launchpad for new ventures that draws upon broad faculty expertise and student workers to lower the costs of professional, independent public affairs journalism and by absorbing some of the risk associated with new editorial product development.
Knight funding will get us off the ground and put us in a position to be a self-sustaining not-for-profit company, serving North Carolina journalists and citizens and providing a model for other states and regions to adopt.
Improving Delivery of News and Information to Geographic Communities
In small towns and rural America, the local newspaper is more than just a source of information and an engine of commerce. More importantly, it fosters and builds geographic community and sets the agenda for public policy debate. By making public records readily available and well-organized, we will support decision-making and accountability in local and state government.
This project most clearly improves the delivery of news and information to geographic communities by helping rural community newspapers make the transition to the digital age and remain relevant for younger audiences that are less informed and engaged in their own communities.
We expect several community newspapers to incorporate crowd-sourcing – a technique once known in their newsrooms as “neighbors editors” – into the process of data acquisition. Where this happens, we expect an increase in civic and community engagement: first, by forming a network of knowledgeable volunteer citizen-journalists, and second, by creating greater demand for truly open government records.
In many cases, data that is readily available in GeoRSS or at least CSV format from big cities is simply not available even in print from rural governments. For example, journalism students at the University of North Carolina working last semester to gather and organize public records in two rural counties for an OpenBlock application met with a number of obstacles – ranging from significant photocopying fees to inappropriate redactions and denial of access to public information.
Even when acquisition of public datasets is relatively simple – for example, public health restaurant inspections — someone must request that data from a specific county be exported in fielded data format. It is inefficient for each rural news organization to make separate requests for this data in each of North Carolina’s 100 counties. In these cases, our centralized organization would outline an initial request for the data for each county.
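To illustrate what “fielded data” makes possible, here is a minimal sketch that turns a small CSV export into Atom entries carrying a GeoRSS-Simple point. It is not OpenBlock code. The column names (`name`, `result`, `lat`, `lon`), the function name and the sample row are assumptions for illustration, and the output is bare-bones: a complete Atom feed would also need `id` and `updated` elements.

```python
import csv
import io
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
GEORSS = "http://www.georss.org/georss"
ET.register_namespace("atom", ATOM)
ET.register_namespace("georss", GEORSS)


def csv_to_georss(csv_text, feed_title):
    """Turn fielded rows (name, result, lat, lon) into a bare-bones
    Atom feed whose entries carry a GeoRSS-Simple point."""
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = feed_title
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}title").text = row["name"]
        ET.SubElement(entry, f"{{{ATOM}}}summary").text = row["result"]
        # GeoRSS-Simple writes a point as "latitude longitude".
        point = ET.SubElement(entry, f"{{{GEORSS}}}point")
        point.text = f"{row['lat']} {row['lon']}"
    return ET.tostring(feed, encoding="unicode")


# Made-up inspection record; the column names are assumptions.
sample = "name,result,lat,lon\nExample Diner,Grade A,34.33,-78.70\n"
print(csv_to_georss(sample, "Restaurant inspections"))
```

The conversion itself is the easy part. The hard part, as described above, is getting each county to export the rows in the first place.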
When Rick Thames, the editor of The Charlotte Observer, reviewed our proposal, he offered his enthusiastic endorsement. “There is no question but what this would fill a need,” he said. “Small papers can’t do this sort of work on their own. So, sadly, it just isn’t getting done. What a gift this would be for those communities. A very worthy effort that would be warmly received by the editors and publishers of every small and mid-sized paper that I know.”
Currently there is no tool or service that can efficiently gather, format and publish public records on rural news organizations’ sites. In part, this is a technology problem that may soon be overcome with the alpha rollout of OpenBlock later in 2011. But a much bigger piece of the problem is the data itself – neither OpenBlock nor any other technology has the ability to obtain public records as fielded digital data and create a newsworthy user interface for all the various types of records that a news organization might need.
Without a project like this there is no indication that OpenBlock will be a viable option for papers in rural communities.
What Will Change?
By the end of the project, we will have …
1. About 95 up-to-date feeds of local government data in standardized, fielded formats such as GeoRSS. These feeds will be available under a Creative Commons Attribution, Share Alike license. By providing public information in this format, we will lower the barriers to North Carolinians interested in researching trends or patterns in public policy and we’ll provide the raw material for the development of mashups or entrepreneurial applications we haven’t even thought of yet.
2. Nine community newspapers using OpenBlock to publish fresh, local government data to their audiences. These newspapers will be on the frontlines of a statewide effort to get complete and current government datasets in open, machine-readable formats. They will demonstrate multiple approaches to implementation that will be relevant to others during the broader roll-out.
3. Identified new revenue opportunities structured around the presentation and analysis of this data that will support their journalism.
4. Journalists and citizens interested in public policy issues will have a new tool for analyzing trends and patterns in rural issues such as environmental stewardship, public health, crime and justice, education, and economic development. Community newspapers will be able to more easily compare the experiences of their communities with the experiences in other places across the state.
5. A cost-effective model for building similar independent, not-for-profit data repositories in other states.
6. Most importantly, we will have raised public awareness of open government and we will start seeing rural counties and towns publish public data in standardized, machine-readable formats on the Web.
Why are you the right person?
This project would be tested in North Carolina and rolled out nationally. While the Raleigh-Chapel Hill area is a hub for information technology, the state has a high percentage of rural counties (roughly 70 out of 100) and a strong tradition of quality community news organizations.
The project builds on extensive and longstanding collaborations between the University of North Carolina and North Carolina Press Association.
“WOW! This is an interesting and ambitious project and I know there will be many Carolina newspapers that will want this service,” Beth Grace, the director of the N.C. Press Association, told us. “At a time when papers have lost staff and have had to postpone in-depth/investigative and trend reporting, this could bring some of that information back to papers and their readers. The North Carolina Press Association stands ready to assist — we can work with you to help assess what records most papers –and importantly, their readers — would want.”
This project will address a critical need that’s been identified through the work of UNC Knight Chair Penny Muse Abernathy with three rural papers, and in partnership with professor Ryan Thornburg, whose students have already begun collecting digital public data in these rural counties. The project was funded by the McCormick Foundation to develop sustainable business models for community news organizations.
“Our newspaper has worked with Ryan Thornburg for the past year as we try to figure out how to take advantage of OpenBlock for Whiteville.com,” said Les High, the editor and publisher of The Whiteville News-Reporter. “As is the case in most rural communities in this state, the public information we plan to display is not readily downloadable to the site. This project would provide an important community service to residents of all of North Carolina’s 100 counties – bringing the benefits of the digital highway to even the most remote areas. And just as important, OpenBlock could well be an important source of new revenue for community newspapers everywhere. This is a very important first step in making OpenBlock economically feasible for small papers to implement and use.
“A database of local information – and we believe OpenBlock is the best solution at this point – is a central component of the financial strategy in the digital age. Yet, the obstacles in collecting and digitizing loom as a barrier to successful implementation.”
What tasks/benchmarks need to be accomplished to develop your project and by when will you complete them?
The project has three phases, each with its own tasks and benchmarks. We have developed a detailed timeline and budget that are available immediately upon request.
Phase I is underway with funding from the McCormick Foundation to install the OpenBlock codebase on a virtual machine, to format, ingest and publish two datasets from North Carolina local governments, and to write a public report on the technical risks of the project.
The report and any code we develop will be shared with the OpenBlock community. We will publish this report by April 15. (We understand this is before funding would be available from the Knight News Challenge.)
This summer, with Knight funding, journalism students and community newspaper reporters around the state will conduct a census of public records. We will pay participants to complete forms describing the location and characteristics of state and local datasets.
At the completion of Phase I, we will publish a directory of the datasets and a report that describes the economic cost to journalism of governments not publishing data in machine readable format.
Phase I will end September 30, 2011.
Phase II: The focus in this Phase will be on reducing the costs of deploying OpenBlock at rural papers as well as the costs of acquiring, organizing and maintaining data feeds that can be easily integrated into the OpenBlock application.
By the end of this Phase, we will install eight additional OpenBlock sites, publish relevant data feeds and make them freely available under a Creative Commons non-commercial, share alike license.
We will also design, test, iterate and document sample data-collection processes for a variety of scenarios we expect to encounter during the statewide deployment of OpenBlock installations. The documentation will be critical to news organizations across the country as they plan and budget their own efforts.
Phase II will run from October 2011 through September 2012.
Phase III: During the final phase of the project we will focus on developing revenue models for community newspapers that will support and encourage the continuing maintenance and development of our OpenBlock installations.
During this phase we will begin a phased, statewide rollout of OpenBlock to community newspapers and we will have a comprehensive, statewide collection of public records feeds available from our clearinghouse.
Phase III will also see the incorporation of a not-for-profit organization that will house the project after the end of the grant. It will be funded with the annual membership fees from community newspapers at which we have installed OpenBlock. This organization will also maintain the clearinghouse of public data, some of which may come from places where we don’t have media partners. Finally, it will provide editorial guidance to anyone interested in using the data to create their own data tools or to write stories about trends or patterns revealed by the aggregated data.
Phase III will run from October 2012 to August 2013.
How will you measure progress?
We will measure progress primarily by meeting our benchmarks on deadline and within budget. We will recruit partners, successfully install OpenBlock at community news websites, and collect and distribute feedback from partner newspapers.
We have developed a detailed timeline and budget that are available immediately upon request.
Ultimately, we hope to see a statewide movement to support laws and systems that make government documents and data more easily accessible to North Carolina citizens. With those public policy shifts, we believe we will see more and better public affairs journalism as well as faster and more equitable resolution of civic debates.
The risks of our project fall into three categories: data acquisition, data management and publication, and revenue generation.
Data Acquisition – The goal of the project is to reduce the cost of acquiring current and complete local government data in small communities. Current costs make widespread deployment of the OpenBlock application prohibitively expensive for small publishers.
Challenges to low-cost data acquisition are technical, political and legal. The technical problems are all surmountable – at some cost, perhaps higher than we hope. Early on, we anticipate that many data sources will require manual entry. The risks with these data sources will be accuracy and efficiency. We hope to test various quality assurance methods across our nine initial sites.
But the real challenge, we believe, will be reticent government agencies and uncooperative vendors that make their money through government contracts for digital data storage and management.
Our experiences with student efforts to collect digital, fielded data in rural communities give us a pretty good idea of the types of challenges, if not their scope. For this project, we intend to employ reporters within each community to leverage their community-based knowledge and relationships to help overcome these challenges.
Based on conversations with attorneys for the N.C. Press Association, we see no legal reason that we cannot gather the data we need for the feeds to be editorially meaningful.
Data Management and Publication – This project depends significantly on the successful alpha launch of the OpenBlock installations at The Columbia Daily Tribune and The Boston Globe. We expect these Knight-funded launches to happen in late spring 2011.
To mitigate this risk, we have already consulted with developers to help us see more clearly the technical challenges that might stand between data collection and our goal of deploying the OpenBlock application at nine community newspapers by the end of August 2012.
As we understand it now, the technical challenges involve scraping data, developing locally meaningful schemas for various datatypes, building a simple user interface for data editing on the back end, and customizing the front-end look and feel of OpenBlock to match the websites of existing community newspapers, many of which use the commercial TownNews CMS/service.
To ensure these challenges are addressed by people with the technical experience to assess and solve them, we will hire a qualified and cost-effective group of developers to help us.
Revenue Generation – Community newspaper publishers will participate in this project only if we can demonstrate a positive return on their investment. While Foundation support is essential to the launch of this project, sustaining it will only be worthwhile if we can help small newspapers generate revenue. On the cost side, we must quickly discover the most efficient strategies for data acquisition and maintenance. On the revenue side, Penny Abernathy has been working with three small newspapers to develop sponsorship models that we believe will yield enough revenue for publishers to justify the annual cost of the service.
How will people learn about what you are doing?
There are three critical audiences for this project. First is a national audience that we will reach through trade websites and conferences, as well as through the OpenBlock community that is being so well nurtured by OpenPlans and its Knight-funded efforts.
In North Carolina, we have a statewide audience of newspaper publishers, editors and engaged citizens. Our affiliations with the N.C. Press Association, the N.C. Open Government Coalition, and the School of Journalism and Mass Communication at the University of North Carolina will help us identify partner newspapers and datasets that are editorially significant.
Our most important audiences will be the local news site users and advertisers. We expect and need these citizens to become consumers of public records and advocates for digital, fielded local government data. In many cases, we expect that these audiences will also be our collaborators and key elements of the data collection workflow. For these audiences, the local newspaper partners will be our most important channels of communication.
But even if one of the risks we’ve outlined prevents us from creating a self-funding not-for-profit, the journalism community at the end of this grant will have several hard deliverables that will be used to guide further efforts:
- A description of state and local datasets in one of the nation’s most populous states. (August 15, 2011)
- A paper that describes the economic cost to journalism of governments not publishing data in machine-readable format, compared with the cost to governments, and taxpayers, of doing so. (September 30, 2011)
- A clearinghouse of state and local government datasets, in open, machine-readable format. (September 30, 2012)
- A handbook of data collection processes suitable for six different public records request scenarios. (September 30, 2012)
- Nine installations of the OpenBlock application at community newspapers. (July 2011 to August 2012)
- Scraper scripts, schemas and other contributions to the OpenBlock Project. (April 2011 to August 2013)
When I walk into the classroom to teach my introductory news writing students at UNC, I remind myself that I’m giving a map to people who have always driven sports cars, but never out of their neighborhood.
Some of the students are younger than Mosaic, and throughout their lives, their access to information technology has outpaced their understanding of it.
The answer to the question of “What is news?” for many of them is “Whatever my friends share on Facebook.” And that means popularity — and for many of them it’s popularity among a narrow subset of people who look, act and see the world similarly — trumps all the traditional news values of impact, proximity, prominence, timeliness, emotional appeal, oddity and conflict.
But rather than try to replace one with the other, I’m trying a technique that I hope will use their familiarity with social media to get them to think more about their audience. Try the following and let me know how it works for you, too.
1. Have the students organize their Facebook friends into various lists, using traditional news values. So, for example, students might organize their friends by geography, shared experiences, relationship status, number of friends they have, frequency of posting, or a combination of those. Instructions for Creating a Facebook List
2. Throughout the semester, your students are already required to read the news. But this technique also asks them to share the stories they read with their friends on Facebook. Instructions for Sharing a Link on Facebook
3. The key is that they can’t share a link with ALL their friends. They have to pick no more than two lists with which they share each story. This gets the students thinking about how different audiences value different information. Or how different audiences value the same information, but for different reasons. Instructions for Sharing Links With Specific Lists
4. Finally, with each link that a student posts she is required to “Say something about this link …” It doesn’t count if the annotation is merely a re-phrasing of the facts in the story. And it doesn’t count if the student merely writes about why she likes the story. The annotation must answer the question “So What?” for that particular list. The goal here is to get students to change their belief that writing is about self-expression into a journalistic mindset in which writing is selfless expression.
Journalists have to give audiences what they want and need, and often must go to great lengths to explain to them why they need it. This isn’t paternalism. This is a service, and it’s the same one that attorneys and physicians and financial advisers provide. The choice remains in the customer’s hands. But we — as journalists — have a professional obligation to provide the best advice on the most relevant information possible.
Grading: You have two choices for grading this assignment. One option is to get a Facebook account and require that all of your students friend you and put you on every list they’ve created for the class. That way you’ll be able to see what they’re doing and use your own rubric to score their efforts. The other option is to have the students write a weekly reflection about their experiences sharing stories with their friends. What did they share with whom? How did they describe it? What didn’t they share? Why not? What responses did they get from their friends?
(For the sake of ease, you may consider creating a mock version of this assignment in which students simply write Word documents using imaginary friends, imaginary lists, imaginary stories or an imaginary social network. But do not do that. It smacks of being phony. And students, like journalists, hate phonies.)