Knight News Challenge Proposal: Crowdsourcing Data to Bring OpenBlock to Rural America

At the top of my To Do List this week is the completion of one of the proposals I’ve submitted to the Knight News Challenge this year. I’m posting it here in the hope that you’ll have some feedback on whether/how a service like this would be technically feasible, editorially useful and financially viable. I’m especially interested in hearing from editors of small papers, public records experts, civic/community organizers and anyone who’s worked on the OpenBlock code.

Under what conditions would you volunteer to help a project like this in your community? News organizations: how much would you pay for a service like this? What characteristics would it need to have to make it worth your money? What else do you see here that needs further clarification?

(And a big hat-tip here to Penny Abernathy, the Knight Chair in Digital Media Economics here at the UNC-Chapel Hill School of Journalism and Mass Communication. She got this project kicked off with a grant from the McCormick Foundation and is my co-pilot on this application.)

Here’s our draft pitch:

Crowdsourcing Data to Bring OpenBlock to Rural America

This project would create a co-op to develop and deploy public records databases at news organizations, especially those serving communities of fewer than 75,000 people, and to prepare those records for presentation and integration in an OpenBlock format.

These rural news organizations are struggling to move to the digital age in part because their staffs are so small they don’t have the capacity to identify, digitize, re-aggregate and map all the various public records available at the state and local level into databases that can be accessed intelligently by both reporters and the reading public.

The project would tackle the lack of capacity at rural papers from two directions. It would create a centralized repository of state, county and city schemas and datafeeds that could be easily used in OpenBlock. This is a job well-suited for a small group of experts. In addition, the project would create a statewide corps of amateur data-checkers and records requesters. Data quality assurance and data gathering are jobs well-suited for a crowd of many people, each working on a small piece of the puzzle.

These volunteer citizen-journalists would actually be member-owners of a co-op business. Each task they perform would earn them additional shares in the company’s annual profits. We would generate revenue by charging rural newspapers a fee. The more records and the better their accuracy, the more news organizations would sign on for the service.
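To make the share idea concrete, here is a rough sketch of how tasks might translate into a split of annual profits. Everything in it is an assumption for illustration: the task names, the number of shares each task earns and the strictly proportional split are not part of the proposal, and a real co-op would set these in its bylaws.

```python
# Hypothetical share weights per task type; the proposal does not specify these.
SHARES_PER_TASK = {
    "request_records": 3,
    "check_data": 1,
}


def member_shares(task_log):
    """Add up the shares each member has earned from (member, task) pairs."""
    totals = {}
    for member, task in task_log:
        totals[member] = totals.get(member, 0) + SHARES_PER_TASK[task]
    return totals


def split_profit(task_log, annual_profit_cents):
    """Split a profit (in whole cents) in proportion to shares earned.

    Amounts are rounded down to the cent, so a few leftover cents
    may remain undistributed.
    """
    totals = member_shares(task_log)
    all_shares = sum(totals.values())
    if all_shares == 0:
        return {}
    return {
        member: annual_profit_cents * shares // all_shares
        for member, shares in totals.items()
    }
```

Under these made-up weights, a member who files one records request and checks one dataset holds four shares, and a member who checks two datasets holds two, so the first member would receive two-thirds of whatever profit is distributed.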

In some cases, volunteers would pick up CDs of data from county offices. In others, volunteers would scan and upload PDFs of hand-written police incident reports. In still other cases, people would key into a database the information on those PDFs. This job is so big that no single small news organization could do it. But with a corps of member-owners working together, we could create a model for gathering valuable public records from rural America. To individual communities, these records are necessary to foster an informed civic dialog and a healthy economy. But in aggregate, these records may also be able to shed light on trends in rural America that would otherwise go unreported.

Improving Delivery of News and Information to Geographic Communities

In small towns and rural America, the local newspaper is more than just a source of information and an engine of commerce. More importantly, it fosters and builds geographic community and sets the agenda for public policy debate. This project will foster civic and community engagement in two ways: first, by forming a network of knowledgeable volunteer citizen-journalists, and second, by making public records readily available and organized to support decision-making and accountability at all levels of government.

Unmet Needs

In many cases, data that is readily available in GeoRSS or at least CSV format from big cities (such as this example from San Francisco) is simply not available even in print from rural governments. For example, journalism students at the University of North Carolina working last semester to gather and organize public records in two rural counties for an OpenBlock application met with a number of obstacles (which they describe in their blogs) – ranging from significant photocopying fees to inappropriate redactions and denial of access to public information.

Even when acquisition of public datasets is relatively simple (public health restaurant inspections, for example), someone must request that data from a specific county be exported in fielded data format. It is inefficient for each rural news organization to make separate requests for this data in each of North Carolina’s 100 counties. In these cases, our public records co-op would outline an initial request for the data for each county.
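For readers wondering what "fielded data format" means in practice, here is a minimal sketch of the kind of check the co-op might run on a county's export before loading it. The column names and sample rows are made up for illustration; they are not taken from any real county's data or from an OpenBlock schema.

```python
import csv
import io

# Made-up sample rows; the column names are assumptions, not a real county export.
SAMPLE_EXPORT = """county,establishment,inspection_date,score
Example County A,Sample Diner,2011-01-15,96.5
Example County B,Sample Cafe,2011-02-03,88.0
"""

# Hypothetical minimum set of fields a restaurant inspection export should have.
REQUIRED_FIELDS = ("county", "establishment", "inspection_date", "score")


def load_inspections(text):
    """Parse a CSV export into a list of dicts, one per inspection.

    Raises ValueError if any required column is missing, so an
    incomplete export is caught before it is loaded anywhere.
    """
    reader = csv.DictReader(io.StringIO(text))
    present = reader.fieldnames or []
    missing = [field for field in REQUIRED_FIELDS if field not in present]
    if missing:
        raise ValueError(f"export is missing fields: {missing}")
    rows = []
    for row in reader:
        row["score"] = float(row["score"])  # store scores as numbers, not text
        rows.append(row)
    return rows
```

The point of a check like this is that the same request wording and the same expected columns could be reused in every county, so a volunteer only has to confirm that what came back matches.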

What’s New?

Currently there is no tool or service that can efficiently gather, format and publish public records on rural news organizations’ sites. In part, this is a technology problem that may soon be overcome with the alpha rollout of OpenBlock later in 2011. But a much bigger piece of the problem is the data itself – neither OpenBlock nor any other technology has the ability to obtain public records as fielded digital data and create a newsworthy user interface for all the various types of records a news organization might need.

Without a project like this, there is no indication that OpenBlock will be a viable option for papers in rural communities.

What Will Change?

By the end of the project, we will have

• at least one member-owner in each county in North Carolina

• at least 12 news organizations subscribing to the service

• at least one type of schema for which we’ve collected data from each county

Most importantly, we will have raised public awareness of open government and we will start seeing rural counties and towns publish public data in standardized, machine-readable formats on the Web.

What tasks/benchmarks need to be accomplished to develop your project and by when will you complete them?

How will you measure progress?

Do you see any risk in the development of your project?

How will people learn about what you are doing?

Is this a one-time experiment or do you think it will continue after the grant?