Open Data: Providing access to historical Government of Canada studies

Canada’s Action Plan on Open Government details how the federal government is promoting transparency and accountability and encouraging citizen engagement by releasing unrestricted government data and information. Releasable information falls under two categories: structured data (machine readable) and open information (unstructured documents and multimedia assets). To make this information easily discoverable and reusable, it will be located on the Open Government website and made available under the unrestricted Open Government Licence. Structured data is made available through the Open Data portal of the website and unstructured information through the Open Information portal.

Library and Archives Canada is in the process of extracting and preserving datasets related to studies undertaken by federal departments; these datasets are currently held on outdated storage devices. The studies cover a wide range of topics, such as the environment, health and immigration. The digital content from these studies, acquired since the early 1970s, is being converted from its outdated file structures and encoding schemes so that it can be used on contemporary computers, which are based on the ASCII encoding scheme.
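To give a rough idea of what converting an encoding scheme can involve, here is a minimal Python sketch. It assumes, purely for illustration, that a legacy file was written in EBCDIC code page 037; the post does not name the encodings actually involved, and the function name `legacy_to_ascii` is hypothetical.

```python
def legacy_to_ascii(raw: bytes, source_encoding: str = "cp037") -> str:
    """Decode legacy bytes and return text restricted to ASCII characters.

    ASSUMPTION: the source encoding is EBCDIC code page 037 ("cp037" in
    Python's standard codecs). The real datasets may use other schemes.
    """
    # Step 1: interpret the legacy bytes using the assumed source encoding.
    text = raw.decode(source_encoding)
    # Step 2: re-encode as ASCII; characters with no ASCII equivalent
    # are replaced with "?" rather than raising an error.
    return text.encode("ascii", errors="replace").decode("ascii")


# Build a small EBCDIC sample from known text so the round trip is visible.
sample_bytes = "SURVEY 1972".encode("cp037")
converted = legacy_to_ascii(sample_bytes)
print(converted)
```

Replacing unmappable characters keeps the sketch short; an archival workflow would more likely log such characters for review so that nothing is altered silently.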

Once the datasets are migrated, they will be made available on the Open Data portal. Codebooks that describe the file structure of the data and define the variables contained in each field will also be supplied. These migrated datasets will be in the form of raw data. To interpret and analyze the content in each file, you will require specialized software, such as a spreadsheet or a statistical tool. Raw data preserves the integrity of this archival content and will allow you to perform your own interpretation and analysis.
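As an illustration of how a codebook and a raw data file work together, the following Python sketch reads fixed-width records using field positions taken from a codebook. Everything in it is hypothetical: the field names, column positions, value labels and sample records are invented for the example, and the released datasets may not use a fixed-width layout at all.

```python
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    name: str               # variable name, as a codebook would define it
    start: int              # first column of the field (1-based, inclusive)
    end: int                # last column of the field (1-based, inclusive)
    labels: Dict[str, str]  # optional mapping from coded values to labels


# HYPOTHETICAL codebook, invented for this example only.
CODEBOOK: List[Field] = [
    Field("respondent_id", 1, 4, {}),
    Field("province", 5, 6, {"35": "Ontario", "24": "Quebec"}),
    Field("age", 7, 8, {}),
]


def parse_record(line: str, codebook: List[Field]) -> Dict[str, str]:
    """Slice one raw line into named fields and apply any value labels."""
    record = {}
    for field in codebook:
        # Convert 1-based inclusive columns to a Python slice.
        raw_value = line[field.start - 1:field.end].strip()
        # Use the label when the codebook defines one; otherwise keep the raw value.
        record[field.name] = field.labels.get(raw_value, raw_value)
    return record


# HYPOTHETICAL raw data lines: ID (4 columns), province code (2), age (2).
sample_lines = ["00013542", "00022467"]
records = [parse_record(line, CODEBOOK) for line in sample_lines]
for record in records:
    print(record)
```

A spreadsheet or statistical tool performs the same step when you supply it with the column positions from the codebook; the sketch only makes that step explicit.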

Stay tuned in the coming months for news about dataset releases.

Linked Open Data sets for the First World War

We are proud to report that Library and Archives Canada (LAC) has recently released a new dataset on the Canadian Government Open Data portal as part of a First World War collaborative initiative with the Muninn Project. The project involved the partial transcription of the service records of soldiers who served in the Canadian Expeditionary Force (CEF) during the First World War. LAC provided the digitized service files of 1,000 soldiers, while the Muninn Project organized the crowdsourcing for the transcription and data linking of these historical documents. As a pilot project, the scope was limited to a specific medical form, the medical case sheet, which is found in most of the files and contains information recorded by hospital staff on a soldier’s medical history.

Image: An example of a medical case sheet from the LAC collection – Private Addison Baker. The colour reproduction of the form provides information on the soldier’s medical history; in this case, the soldier suffered a gunshot wound to the eye.

The information gathered from the transcriptions represents the spectrum of health issues one would expect in a large group of men. Some of the medical cases are directly related to combat, such as gunshot wounds, shell wounds or shell shock. Others are related to living conditions in the trenches, which aggravated respiratory ailments and contributed to outbreaks of diseases such as influenza. A large proportion of the recorded information concerns the everyday health issues of the time, such as toothaches and measles.

To learn more about the information that was gathered from the service files, visit the First World War Linked Open Data project. The raw data is also available on the Canadian Open Data Portal in Linked Open Data and plain text formats.