Unleashing the Potential of Election Data

Section 2: Open Election Data Principles

Principle 6: Election data is open when it is in a non-proprietary format


Proprietary formats, by definition, restrict either the ability to use information or the ability to share it. For election data to be open, it must therefore be in a format over which no entity has exclusive control. The Open Knowledge Foundation's Open Definition defines an open (i.e., non-proprietary) format as one "which places no restrictions, monetary or otherwise, upon its use" and goes on to specifically state that, "at the very least, [the data] can be processed with at least one free/libre/open-source software tool." Beyond the goal of making election data widely available, there is a practical reason to avoid proprietary formats: they are not created with permanence as an explicit goal. A proprietary format is controlled by a specific software company or other entity, and that entity may stop supporting the format or go out of business, at which point support for the format would end.

Recommended open file formats

As mentioned in the principle that open election data is analyzable, CSV, XML and JSON are considered open formats. In contrast, file formats such as XLS and DOC are proprietary formats owned by Microsoft and are thus not open. PDF was originally a proprietary format, developed so that a document's layout would stay roughly the same regardless of the operating system. In 2008, Adobe released PDF as an open (i.e., non-proprietary) standard. Even so, as noted earlier, PDF files are not recommended for data that is to be analyzed, although they are useful for laying out information that is meant to be printed or read.
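
As a rough illustration of how little is needed to publish data in these open formats, the sketch below writes the same small set of results as CSV, JSON and XML using only Python's standard library, which is itself free and open-source software. The records, field names and element names (results, result, district, candidate, votes) are invented for this example and do not represent any official schema.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample records; names and figures are invented for illustration.
RESULTS = [
    {"district": "North", "candidate": "Candidate A", "votes": 1200},
    {"district": "North", "candidate": "Candidate B", "votes": 950},
]

FIELDS = ["district", "candidate", "votes"]


def to_csv(rows):
    """Return the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Return the rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_xml(rows):
    """Return the rows as XML, one <result> element per row."""
    root = ET.Element("results")
    for row in rows:
        item = ET.SubElement(root, "result")
        for field in FIELDS:
            ET.SubElement(item, field).text = str(row[field])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_csv(RESULTS))
    print(to_json(RESULTS))
    print(to_xml(RESULTS))
```

Because all three outputs are plain text in openly documented formats, anyone can read them back with freely available tools, which is the minimum the Open Definition asks for.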