Repositories Update Continued: VoteStream Dominates

August 22, 2015 E. John Sebes

Today we provide another follow-up to our continuing report on our Repositories and source code development efforts. As others on the Core team have mentioned when contributing posts to the OSET blog (versus the TrustTheVote Project Blog), we appreciate that the audience is diversifying over here, and want to forewarn you that parts of what follows get kinda geeky, but we try to provide links for those curious to learn more. (Also note: the TrustTheVote site is about to be re-launched within the next month, so we're trying to limit blog posts over there.) Anyway, we suspect that what makes it geekish more than anything is the code names and acronyms. We’ll try to minimize the alphabet soup. OK, here we go…

We’ve been crazy busy, and I’ve not had a chance to follow up on the rest of my posting about our repositories and our work on VoteStream from nearly a month ago, although one of our code contributors recently provided a review of another element, Horatio. So, I add my next installment below. This is a brief supplement, plus a summary of work in progress on related standards committee work.

In my last post I explained a 3-tier architecture for our open-source reference implementations of common data formats emerging from standards work. The repositories I mentioned earlier are VEDaSpace, VEDaStore, and VEDapiService. These software packages are the 3-tier set for the existing standard on election results reporting.

Here, I want to point you to early, parallel work related to election participation and performance (“P&P”) reporting, a key new area of capability for VoteStream in our 2nd phase of work funded by the Knight Foundation. The next VoteStream-related post will have more on P&P generally, but for now let me explain the repositories (as that’s more the intent here as part of our periodic update). One of them is VITALspace. Like VEDaSpace, it's a library for parsing a common data format to enable direct data manipulation, but in this case the data set consists of log data about election administration transactions (described below). We use this package in VoteStream already, and as we progress, we will add VITALstore and VITALapiService to complete the three tiers. The same is true of the new repo VODaSpace, to be supplemented by VODaStore and VODapiService.

To understand how these common data formats—and our reference software for them—relate to election participation and performance, here are some examples of the various types of VITAL records. Each record has a privacy-preserving unique ID for the voter, and describes a particular type of event about the voter:

Voter cast ballot in person;
Voter's absentee ballot was accepted;
Voter's absentee ballot was rejected (and why);
Voter's provisional ballot was accepted;
Voter's provisional ballot was rejected (and why);
Voter's absentee ballot request was accepted (or rejected and why);
Voter's change-of-address update was accepted (or rejected and why);
Voter's registration was suspended or canceled, and why.
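
For the data-heads, here is a rough sketch of how records like these might be modeled in code. It is only an illustration: the names `EventType`, `VitalRecord`, and `count_events`, the field names, and the sample values are all made up for this post, and none of it is the actual VITAL format or the VITALspace API.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class EventType(Enum):
    """Hypothetical event names, one per record type in the list above."""
    BALLOT_CAST_IN_PERSON = "ballot_cast_in_person"
    ABSENTEE_BALLOT_ACCEPTED = "absentee_ballot_accepted"
    ABSENTEE_BALLOT_REJECTED = "absentee_ballot_rejected"
    PROVISIONAL_BALLOT_ACCEPTED = "provisional_ballot_accepted"
    PROVISIONAL_BALLOT_REJECTED = "provisional_ballot_rejected"
    ABSENTEE_REQUEST_ACCEPTED = "absentee_request_accepted"
    ABSENTEE_REQUEST_REJECTED = "absentee_request_rejected"
    ADDRESS_CHANGE_ACCEPTED = "address_change_accepted"
    ADDRESS_CHANGE_REJECTED = "address_change_rejected"
    REGISTRATION_SUSPENDED = "registration_suspended"
    REGISTRATION_CANCELED = "registration_canceled"


@dataclass(frozen=True)
class VitalRecord:
    """One logged event about one voter (assumed shape, not the real schema)."""
    voter_id: str                 # privacy-preserving unique ID, not a name
    event: EventType              # what happened
    reason: Optional[str] = None  # the "and why" for rejections and the like


def count_events(records: Iterable[VitalRecord]) -> Counter:
    """Tally how many records of each event type appear."""
    return Counter(record.event for record in records)


# Made-up sample data for illustration only.
records = [
    VitalRecord("v-001", EventType.BALLOT_CAST_IN_PERSON),
    VitalRecord("v-002", EventType.ABSENTEE_REQUEST_ACCEPTED),
    VitalRecord("v-002", EventType.ABSENTEE_BALLOT_REJECTED, "signature missing"),
    VitalRecord("v-003", EventType.BALLOT_CAST_IN_PERSON),
]

summary = count_events(records)
for event, n in summary.items():
    print(f"{event.value}: {n}")
```

Keeping the reason as an optional field is just one way to capture the "and why" part; a real format may well use coded reasons instead of free text.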

VODa records complement VITAL records by using the same privacy-preserving unique ID for the voter, and providing demographic characteristics such as year of birth, zip code of residence, party affiliation, voter status (regular, absentee, military, overseas, etc.), and voting precinct assignment. These data enable the construction of demographic profiles down to the precinct level.
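
To make the shared-ID idea concrete, here is a small sketch of joining the two kinds of data to get a precinct-level turnout profile. Everything in it is assumed for illustration: the dictionary layout, the field names, the event names, and the choice to treat every demographic row as one registered voter. It is not the VODaSpace or VITALspace API.

```python
from collections import defaultdict

# Hypothetical VODa-style rows, keyed by the shared privacy-preserving voter ID.
voda = {
    "v-001": {"birth_year": 1950, "zip": "00001", "precinct": "P-1"},
    "v-002": {"birth_year": 1988, "zip": "00001", "precinct": "P-1"},
    "v-003": {"birth_year": 1992, "zip": "00002", "precinct": "P-2"},
}

# Hypothetical VITAL-style events as (voter_id, event_name) pairs.
vital = [
    ("v-001", "ballot_cast_in_person"),
    ("v-003", "absentee_ballot_request_accepted"),
    ("v-003", "absentee_ballot_accepted"),
]

# Assumption: these event names are the ones that mean a ballot was counted.
COUNTED_EVENTS = {
    "ballot_cast_in_person",
    "absentee_ballot_accepted",
    "provisional_ballot_accepted",
}


def turnout_by_precinct(voda_rows, vital_events):
    """Join the two data sets on voter ID and tally turnout per precinct."""
    voted = {vid for vid, event in vital_events if event in COUNTED_EVENTS}
    totals = defaultdict(lambda: {"registered": 0, "voted": 0})
    for vid, demo in voda_rows.items():
        row = totals[demo["precinct"]]
        row["registered"] += 1  # assumption: one VODa row = one registered voter
        if vid in voted:
            row["voted"] += 1
    return dict(totals)


profile = turnout_by_precinct(voda, vital)
for precinct, row in sorted(profile.items()):
    print(precinct, row)
```

The same join could group by year of birth or party affiliation instead of precinct; the point is only that the common voter ID is what lets the two record sets be combined.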

Finally, many of the dozens of state and local election officials who attended the Election Assistance Commission’s Data Summit recently will immediately recognize both these kinds of common data formats as exactly the basis for the raw data behind the federally required election participation reporting, called the Election Administration and Voting Survey or “EAVS.” This periodic report is an imperative tool for analyzing the effectiveness of public elections. We need to innovate how that data is compiled, analyzed, and published. Of course, the first step is standardized formats. There's a strong sentiment that if we can standardize a data format for the base data, then not only will this type of reporting get much easier, but other kinds of reporting will be enabled as well.

In fact, additional innovative types of reporting are exactly what we're doing with this P&P data in VoteStream. We're collaborating with election officials on the data format, harvesting data in these formats, including the data in VoteStream, and presenting P&P views, and we're working on combined views of election results and P&P data: think “an election results map combined with a voter turnout map combined with a map showing your choice of voter demographic.”

That's why we're spinning up standards groups in both areas, under the auspices of the NIST-managed interoperability working group. These groups are open, so for the data-heads among the readers, I urge you to learn more here, and even participate!

As NIST's new standards program grows to include more working groups on topics like these, we'll share them with you here.