Bilingual census data: a better search experience for all Canadians

Web banner for The 1931 Census series. On the right, typed text: "The 1931 Census". On the left, a moving train going by a train station.

By Julia Barkhouse

This article contains historical language and content that some may consider offensive, such as language used to refer to gender, racial, ethnic and cultural groups. Please see our historical language advisory for more information.

Library and Archives Canada (LAC) is the guardian of Canada’s distant past and recent history. It holds the historical census returns for Canada, including some dating back to New France and some for Newfoundland. We have indexed some dating from 1825 to 1926, and these are available online through Census Search.

Before Confederation, censuses were generally collected in either English or French, depending on the location. The Dominion Bureau of Statistics (now Statistics Canada) phased in bilingual forms after Confederation in 1867.

Example of a bilingual Census 1921 form enumerated in English and French:

A census-taking sheet from Census 1921. This particular image is page 6 for the sub-district of Scots Bay in Kings District, Nova Scotia.

Census 1921 form enumerated in English (e002910991).

A census-taking sheet from Census 1921. This particular image is page 19 for the sub-district of Wolfestown (Township) in Richmond-Wolfe District, Quebec.

Census 1921 form enumerated in French (e003096782).

The language used to record answers to census questions may reflect the language preference of the enumerator or the language in which the answers were provided. The historical census data that we hold reflects our linguistic duality as a nation. Census returns from Quebec and some parts of New Brunswick and Manitoba were written (or enumerated) in French, while the rest of Canada was enumerated in English.

When our partners, including Ancestry and FamilySearch, indexed the censuses from 1825 to 1926, we produced a wealth of data with names of individuals, their gender, marital status, etc. However, we were faced with a serious challenge: census data could be collected in either English or French depending on the personal preference of the enumerator. So how did we handle this?

Life as an Enumerator

Let us detour for a moment and describe the journey of the enumerator. Enumerators were Canadians hired by the Dominion Bureau of Statistics to collect census data in one or more sub-districts. They received a book of instructions (such as this one for Census 1921) that detailed what they were supposed to write on the form depending on what people answered. They were also given a booklet of census return forms and told which sub-districts to enumerate. Each enumerator then had a set timeframe to cover those sub-districts and mail the completed forms back to the government department. By 1921, you can imagine this person going from door to door in a horse-drawn carriage or perhaps an early automobile (maybe a Ford Model T).

The enumerator knocked on the door and asked to speak to the head of the household (typically the father and/or husband). They might be invited in to sit at the kitchen table as they asked questions. If the family was not home, there might be a notice or calling card left on the door with contact details to follow up and meet the enumerator by a given date to be counted in the census.

Depending on the province, the enumerator wrote down information either in the language of the person speaking or in the enumerator’s own preferred language. Therefore, it is possible that French-speaking regions in Quebec, New Brunswick and Manitoba were enumerated in one or both languages, depending on the enumerator’s personal preference.

Fast forward: the data captured on the forms was transcribed by our partners around 92 years after the census taking and put online.

Language Barriers

When LAC put these databases online, we noticed that we had data in both languages. If you wish to search for your ancestor, you have to search in the language of the enumerator. Did the enumerator write your grandmother’s information in English or French? Does your name have an accent (é, è …) that might have been misheard (or not captured) by the enumerator? Does your uncle’s name have a silent “h” that might have been omitted? This creates a language barrier for our researchers, who want to find people but do not speak the language used at the time. Some of our Francophone researchers have to search in English to find their French-speaking ancestors. This is an unbalanced search experience for Canadians who access our Census Search interface in French.

Creation of Census Search

When the Digital Access Agile Team reorganized and consolidated the 17 census databases into Census Search last November, we wanted to deliver a better search experience for all Canadians. Our aim was to provide the same search experience for Francophones as for Anglophones, so that any of our clients who use the French Census Search interface can search and get the same results as if they were searching in English.

So how do you do this? How do you translate information like gender, marital status, ethnic origin and occupation for over 44 million individuals to offer an equal experience for all Canadians? It’s actually very simple. The solution? Data cleanup.

A Peek Under the Hood

Let’s go behind the scenes for a moment to look at how census data is saved. Census Search is the public interface that LAC clients can use to search. The census data for each individual in Census Search is saved in one master table called EnumAll.

Census data saved in a table in SQL Management Server.

Screenshot of Census.EnumAll from SQL Management Server (Library and Archives Canada).

In this table, each line represents an individual person. The data captured about that person is separated into columns. If we do not have data in a particular column, it says NULL.

Creation of Common Data Pools

Census.EnumAll acts as the master data table. From this, we created common data groups (or, pools). What do I mean by this? We copied all of the data for one of the columns (Gender, Marital Status, Ethnic Origin, Religion, etc.) into a separate table. The only information in this separate table is a list of Genders or options for Marital Status, etc. We call this a common data pool, meaning that all the data in this table (or, pool) relates to one piece of information.

The common data table separates the data (e.g., “Male” or “Female”) from the individual person. If you look at 44 million individuals, you see the same data repeated, such as the number of times the enumerator wrote “Male” or “Married.” In a common table, you see “Male” only once, with a value count for the number of people with this information (which we call an attribute).
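As a rough illustration of the idea, the sketch below builds a common data pool for one column and counts how often each value appears. It is not LAC's actual code: the sample rows, the lowercase column names and the `build_pool` helper are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical rows standing in for Census.EnumAll (the real table has
# many more columns and over 44 million rows).
enum_all = [
    {"id": 1, "gender": "Male"},
    {"id": 2, "gender": "Female"},
    {"id": 3, "gender": "M"},      # a variant spelling of "Male"
    {"id": 4, "gender": None},     # shown as NULL in the database
    {"id": 5, "gender": "Male"},
]


def build_pool(rows, column):
    """Return each distinct value in `column` with the number of people who have it."""
    return Counter(row[column] for row in rows if row[column] is not None)


gender_pool = build_pool(enum_all, "gender")

# Each attribute appears once, with its value count.
for value, count in gender_pool.most_common():
    print(value, count)
```

In this toy pool, "Male" is listed once with a count of 2, while the variant "M" still appears as its own entry until the data is cleaned up.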

This is where the magic happens.

The Gender table in the back end of Census Search. Of note, there are variances (Male and M, Female and F) and two columns titled TextLongEn (English display) and TextLongFr (French display).

Screenshot of T_Gender from SQL Management Server (Library and Archives Canada).

Separating the data into its own common table lets us do more with it. With codes, we establish one way to write each Gender (in this example). This is called an authority. We then perform cleanup so that all the variants point to this one authority. In the screenshot above, you’ll notice variants such as “Male” and “M.”

Once we have this authority, we create columns for how we want to display the information in Census Search. We create an English (TextLongEn) and a French (TextLongFr) display. We then add the bilingual translation once and it applies to everything. In this case, we translated “Male” to “Homme,” and it applies to all 20,163,488 people who identified as “Male” across 17 censuses.
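The authority and its bilingual display columns can be pictured with a small sketch like the one below. Only the column names TextLongEn and TextLongFr and the displayed terms come from the article. The codes, the variant list and the `to_code` and `display` helpers are hypothetical, and treating missing or unrecognized values as "Unknown" is an assumption.

```python
# Hypothetical authority table: one code per Gender, with a display text
# in each official language.
AUTHORITY = {
    "M": {"TextLongEn": "Male", "TextLongFr": "Homme"},
    "F": {"TextLongEn": "Female", "TextLongFr": "Femme"},
    "U": {"TextLongEn": "Unknown", "TextLongFr": "Inconnu"},
}

# Cleanup step: every variant spelling points to a single authority code.
VARIANT_TO_CODE = {
    "male": "M",
    "m": "M",
    "female": "F",
    "f": "F",
}


def to_code(raw):
    """Map a raw transcribed value to its authority code ("U" if missing or unrecognized)."""
    if raw is None:
        return "U"
    return VARIANT_TO_CODE.get(raw.strip().lower(), "U")


def display(raw, lang):
    """Return the display text for a raw value in English ("en") or French ("fr")."""
    column = "TextLongFr" if lang == "fr" else "TextLongEn"
    return AUTHORITY[to_code(raw)][column]


print(display("M", "fr"))     # Homme
print(display("Male", "fr"))  # Homme
print(display("Male", "en"))  # Male
```

Because the translation lives on the authority and not on each person's record, it is written once and applies to every record that points to that code.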

We then put all the tables back together and index the records to display in Census Search. So depending on the language of your choice, the interface and the data itself will now translate for you.
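One way to picture this last step is a join between the person table and the common data table, picking the display column that matches the interface language. The sketch below uses an in-memory SQLite database purely for illustration. The table names EnumAll and T_Gender and the columns TextLongEn and TextLongFr appear in the article, but the GenderCode key, the sample rows and the `search` function are assumptions, and the real system may be organized quite differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE T_Gender (
        GenderCode TEXT PRIMARY KEY,   -- hypothetical authority code
        TextLongEn TEXT,
        TextLongFr TEXT
    );
    CREATE TABLE EnumAll (
        PersonId   INTEGER PRIMARY KEY,
        Name       TEXT,
        GenderCode TEXT REFERENCES T_Gender (GenderCode)
    );
    INSERT INTO T_Gender VALUES
        ('M', 'Male', 'Homme'),
        ('F', 'Female', 'Femme'),
        ('U', 'Unknown', 'Inconnu');
    -- Placeholder people, not real census records.
    INSERT INTO EnumAll VALUES
        (1, 'Person A', 'M'),
        (2, 'Person B', 'F');
    """
)


def search(conn, name, lang):
    """Return (name, gender label) rows, labelled in the requested language."""
    # The column name is chosen from a fixed pair, never from user input.
    label = "TextLongFr" if lang == "fr" else "TextLongEn"
    sql = f"""
        SELECT e.Name, g.{label}
        FROM EnumAll AS e
        JOIN T_Gender AS g ON g.GenderCode = e.GenderCode
        WHERE e.Name = ?
    """
    return conn.execute(sql, (name,)).fetchall()


print(search(conn, "Person A", "en"))
print(search(conn, "Person A", "fr"))
```

The same record comes back for both searches. Only the label attached to it changes with the language.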

English Census Search interface showing Gender drop-down with values for Female, Male and Unknown alongside the French Census Search interface showing the same drop-down with Gender values for Femme, Homme and Inconnu.

Screenshots of Census Search in English and French (Library and Archives Canada).

Now, when I search for my great-grandfather, Henry D. Barkhouse, and display any of his census entries, the data translates as well.

Two screenshots, one in English and one in French, of a Census 1911 record for Henry D. Barkhouse, with arrows pointing out where the data translates.

English and French display of Census 1911 record for Henry D. Barkhouse (e001973146).

Progress Check-in

As you can imagine, this work takes time as we diligently clean up and translate our data. Our first priority was to create drop-down menus on Census Search for Gender and Marital Status. Now, if you wish to search by either of these fields, you will see a short list of terms that are translated and available in both official languages. As we continue this work, our next priorities are Ethnic Origin and Place of Birth. We are about 60–70% finished with these two, and our clients should see new options coming to Census Search in 2024. After these two priority fields, we will continue to translate other fields like Religion, Relationship to head of household, Occupation, etc.

Conclusion

Consolidating all 17 censuses into one platform, Census Search, gave us the opportunity to create a bilingual display for our census data by cleaning up the data. Since its launch, our platform has delivered a more equal search experience in the language of your choice. I encourage you to try it out and tell us if your search experience has improved.

As always, we love to read your feedback and ideas via our email or you can sign up for a 10-minute feedback session with us.


Julia Barkhouse has worked at Library and Archives Canada in data quality, database management and administration for the last 14 years. She is currently the Collections Data Analyst on the Digital Access Agile Team.

Found in translation: discovering Canadian literary translations

By Liane Belway

Discovering new and exciting books and authors is a rewarding experience for most readers. In Published Heritage—the library side of Library and Archives Canada (LAC)—we connect with the publishers who bring us these works and make our diverse published Canadian heritage accessible to a wider audience.

When Canadian publishers make material available, they deposit copies with LAC with the help of our Legal Deposit team. What kinds of material do we acquire in Legal Deposit? A wealth of Canadian content: books, music, spoken-word recordings, magazines and other serials, and digital material as well. Each offers a unique perspective on Canadian society and culture, reflecting the publisher’s vision, interests and identity. One source of new knowledge and literary artistry is the translation of such works, making these publications available to a completely new audience.

Canadian Translations

One way of making great literature available to wider audiences is through literary translation, an often overlooked literary skill but a highly valuable one in a multicultural and multilingual society. Translations offer a window into new perspectives and styles, and a chance to discover literary traditions and innovations often not otherwise easily accessible. In fact, the Governor General’s Awards have a category for Translation, acknowledging the value of bringing French-language works to new readers in English when they would not ordinarily have the chance to read them. Each year, this award recognizes the translation of a work into English for its literary excellence and cultural contribution.

Award Winners

The 2017 Governor General’s Literary Award for Translation was awarded to Readopolis, translated into English by Oana Avasilichioaei and published by BookThug in Toronto. It is a translation of Lectodôme by Bertrand Laverdure, published by Le Quartanier, a francophone publishing house in Montreal. The Peer Assessment Committee had high praise for Avasilichioaei: “In Readopolis, Oana Avasilichioaei has risen to and matched the stylistic acrobatics of Bertrand Laverdure’s Lectodôme. The many voices of Quebecois writing sing through in this intelligent translation – a vertiginous ode to the pure, if rarely rewarded, pursuit of literature.”

David Clerson’s Brothers, a worthy finalist for the same award in 2017, also offers an excellent introduction to a new publisher’s vision. QC Fiction, an imprint of Baraka Books with a fresh perspective, is a Quebec-based English-language book publisher in Montreal. Recognizing the value of translations, QC Fiction’s goal is to publish contemporary Quebec fiction originally published in French, in English translations for a wider Canadian and international audience. Another QC Fiction title, I Never Talk About It, contains 37 stories and as many translators. As Fiction editor Peter McCambridge states, “37 different translators to translate each of the short stories published in a collection by Véronique Côté and Steve Gagnon. It’s a reminder that there are at least 37 different ways to translate an author’s voice—something to consider the next time you pick up a book in translation!”

Six colourful book covers with similar designs laid out side by side, displaying all titles: Listening for Jupiter, I Never Talk About It, Behind the Eyes We Meet, Brothers, The Unknown Huntsman, Life in the Court of Matane.

A selection of publications from QC Fiction, including Brothers (2016), a finalist for the Governor General’s Literary Award for Translation. Image used with permission from QC Fiction.

Providing works in translation allows audiences outside of Canada access to a large and, in our ever more connected world, growing national literature, and Canadian authors are enjoying an increasingly international audience. QC Fiction is also a great example of Canadian fiction’s global appeal. Says McCambridge: “So far the formula seems to be working: 3 of our first 5 books have been mentioned in The Guardian newspaper in England and bloggers from Scotland to Australia have picked up on what we’re doing and praised our ‘intriguing light reads.’”

With these award-winning publishers—just two examples of the innovative work in the world of Canadian literary translations—Canadian publishing remains a creative, varied, and thriving world that LAC strives to collect and preserve for readers now and in the future. To see what else LAC has in its collections, try our new search tool at: http://www.collectionscanada.gc.ca/lac-bac/search/all.


Liane Belway is the Acquisitions Librarian for English monographs in the Published Heritage Branch at Library and Archives Canada.

Introducing Co-Lab: your tool to collaborate on historical records

A turquoise banner with the words Co-Lab: Your collaboration tool

Crowdsourcing has arrived at Library and Archives Canada (LAC). You can now transcribe, add keywords and image tags, translate content from an image or document and add descriptions to digitized images using Co-Lab and the new Collection Search (BETA).

Take on a challenge

To make it even more interesting, we will launch what we call “challenges”. These challenges are content put together under a theme. For example, one of our first challenges is on Rosemary Gilliat (Eaton). Your challenge will be to transcribe her diary and describe her photographs from her Arctic travels. Or instead, try your hand at transcribing the love letters from Sir Wilfrid Laurier to his sweetheart and future wife, Zoé – another challenge now available.

A screenshot of the Co-Lab Challenges page showing what challenges are available.

Contribute using Collection Search (BETA)

When you are conducting research using our new search tool and find images, you’ll see that you have the option to “enable this image for Co-Lab contributions”. After answering just a few short questions, you can enable an image found in Collection Search (BETA) for Co-Lab use and transcribe/translate/tag/describe to your heart’s content. If an image has already been enabled for Co-Lab use, you’ll be able to add your own contributions or edit those of others. If you create a user account, you will be able to keep track of your contribution history and hear about new challenges and updates to Co-Lab.

A new way to view images

A screenshot of an excerpt of a handwritten letter in a window and on the right-hand side there’s a space to transcribe the letter and underneath is a box with the transcription status.

The launch of Co-Lab also introduces a new image viewer – which lets you scroll to zoom in on different parts of the image, or click and drag to move around the image itself. This is particularly useful when looking to transcribe or add keywords and image tags to describe small details!

What if something’s wrong?

It’s inevitable that mistakes will be made, especially when transcribing handwritten documents. Every image in Co-Lab is subject to review by other crowd members. If you see something written incorrectly, go ahead and edit it yourself, or mark it as “Needs review” for others to take a second or third look.

The best thing about this new tool is that every contribution made by the public directly benefits fellow researchers and improves access. Every addition to a record becomes new metadata – which is searchable within 24 hours, helping LAC’s records become more “discoverable” day after day. Transcription of textual material that was previously just digital images also becomes accessible to those who use text-to-speech machines or screen readers, and translation of transcribed documents opens the door to unilingual Canadians.

For more info and frequently asked questions, you can read the About Co-Lab page. If you’re ready to start contributing, give a hand to history and try Co-Lab now.

The Governor General’s Literary Awards for 2013

The Governor General’s Literary Awards are given annually to the best English-language and the best French-language book in each of the seven categories of Fiction, Literary Non-fiction, Poetry, Drama, Children’s Literature (text), Children’s Literature (illustration) and Translation.

Every year, Library and Archives Canada works to ensure that each Canadian nominee is acquired, catalogued and made available prior to the final announcement of the winners. Usually, this is done through legal deposit, but in some cases the nominated books are not published in Canada and need to be acquired through other means so that a complete selection of the Governor General’s nominees is preserved for future generations.

Congratulations to all!

Fiction
English
The Luminaries, by Eleanor Catton (AMICUS 41787649)
French
Quand les guêpes se taisent, by Stéphanie Pelletier (AMICUS 40915742)

Poetry
English
North End Love Songs, by Katherena Vermette (AMICUS 40823688)
French
Pour les désespérés seulement, by René Lapierre (AMICUS 40824154)

Drama
English
Fault Lines, by Nicolas Billon (AMICUS 41530643)
French
Bienveillance, by Fanny Britt (AMICUS 41316358)

Non-Fiction
English
Journey with No Maps: A Life of P.K. Page, by Sandra Djwa (AMICUS 40812690)
French
Aimer, enseigner, by Yvon Rivard (AMICUS 40909709)

Children’s Text
English
The Unlikely Hero of Room 13B, by Teresa Toten (AMICUS 41749214)
French
À l’ombre de la grande maison, by Geneviève Mativat (AMICUS 40696767)

Children’s Illustration
English
Northwest Passage, by Matt James (AMICUS 40320781)
French
Jane, le renard et moi, by Isabelle Arsenault (AMICUS 41921688)

Translation
English
The Major Verbs, by Donald Winkler (AMICUS 40717619)
French
L’enfant du jeudi, by Sophie Voillot (AMICUS 40772400)