
Recording native speakers speaking their languages is critical to humanity. Language documentation helps keep historical records of a society, protect cultures and further the growth of languages through practical use. OpenSpeaks is a lab that offers learning resources for the digital documentation of languages.

These resources are divided into four interrelated chapters: a) Consent, Copyright and Open Licensing, b) Multimedia (audiovisual) recording, c) Metadata collection and publication, and d) Accessibility.

Practitioners use different recording methodologies because the purpose of their documentation differs. For instance, linguistic documentation as a practice focuses on collecting rich linguistic data of a language (Seyfeddinipur and Rau [1]), such as recording an everyday conversation in a marketplace, whereas documentary filmmakers focus on aesthetics and storytelling.


    Chapter 1: Consent, Rights, Copyright and Open Licensing

    A 2020 Creative Commons Global Summit session titled “Building consent from bottom up” helped brainstorm ideas for this chapter. Many of the collected ideas contributed to furthering this chapter, and its development is still in progress. Please submit your ideas here to improve it. The development of this chapter was also made possible through a Creative Commons Global Network Communities Activity Fund in 2020.

    Consent is about ethically acquiring someone’s permission to record. In the context of language documentation, it means that the person who is interviewed has voluntarily given permission for the recording and for the publication of the recorded media content. This chapter covers how, when, where and from whom to acquire consent. Rights are a broader area that covers the different kinds of rights that individuals and communities are entitled to. Human rights and digital rights are some of the relevant areas under rights.


    Rights are tied to ownership of the content. Copyright is the legal ownership of a recording. It protects the content from being used without a legal contract or agreement with the copyright holder. A copyright holder is the person or organization that has the legal ownership of the content. Please note that copyright applies only to content that is originally created by someone (often referred to in English as an “author” or a “content producer”, e.g. a singer or performer).

    There is also a term called “moral rights”. Moral rights always stay with the “original creator” of the content, but copyright is transferable. Let us look at a common example. If an original song is created and sung by a performer, then they naturally have the moral rights over that song. Now imagine a record company purchases the sole right to that song from the performer, and both the company and the performer sign a contract to that effect. The copyright is then transferred from the performer to the record company. Moral rights are an ethical right whereas copyright is a legal right. Because of moral rights, many works that are out of copyright are still attributed to the original creator.

    Open licensing
    Creative Commons is a nonprofit and a steward of open licensing. (logo trademarked by Creative Commons)

    Open licensing is about publishing the recorded content under a license that allows others to openly access, use and reuse it, with or without attribution depending on the license. A set of licenses called the Creative Commons licenses (or CC licenses) is widely used for content (and sometimes data). Depending on the license chosen, they allow users to use and redistribute a work and to make derivative versions or remixes. These licenses range from CC0 (most open and least restrictive) to CC-BY-NC-ND (the most restrictive CC license, though it still permits non-commercial sharing of the unmodified work, unlike “All Rights Reserved”). CC0 is a public domain dedication: it allows anyone to use a work, including commercially, without permission and without having to attribute the author. On the other hand, other Creative Commons licenses like CC-BY-SA require attribution to the author and require derivative works to be shared under the same or a compatible license.

    Chapter 2: Multimedia (audiovisual) recording

    This chapter details the process of audiovisually recording languages in use, particularly first languages. This four-module toolkit is designed for beginner- and intermediate-level archivists who are building digital archives for any language, though the focus has been on underrepresented communities and indigenous languages. You need some basic understanding of audiovisual documentation to begin with this toolkit.

    Module 1: Basics of audio-visual recording

    An overview of what the recording process aims to achieve and how to go about it.


    1. Be honest and ask your interviewee to be honest
    Language is a very sensitive element of a society. When mistakes such as mispronunciations get recorded and shared publicly, whether anyone noticed them at the time or not, native speakers might take offense. So, please check with your interviewee and make sure that you document any unintended mistakes in the description of the video/audio when publishing. You might not always be able to delete the portions containing such mistakes, but you can always acknowledge that an unintended mistake got recorded. Similarly, if the interviewee is not a native speaker and is trying to learn the language, you should mention that clearly. Native speakers will welcome such honesty.
    2. Imagine yourself out in the field interviewing someone speaking a language that you probably don’t understand
    Think of the challenges that you might face: what gets lost in translation, and your limited understanding of their cultural/linguistic nuances. Are you going to use a language that is mutually intelligible to you both, get the questions translated, or have a translator along with you to assist?
    3. Plan in advance and practice well
    Planning for documentation starts with knowing your interviewee(s) well. Do some research about their language and culture, and maybe learn a few commonly used phrases in their language that you can say to delight them during the interview. People generally appreciate it when an outsider makes an effort to speak their language. Use a spreadsheet or even an app to keep a rough, flexible plan. Things might change during the interview and you need to be prepared for that. Also, have a plan B in case anything fails. If you’re someone who gets cold feet when meeting a stranger, write down your questions and practice them with a friend/family member or in front of a mirror.
    4. Know your hardware and software
    As you are going to rely on your recording equipment and software (you will learn about them in the next module), it’s important that you know them well. But how well is well? Well enough that you know the ins and outs of your gear and can do some troubleshooting in case of an emergency. For instance, if you’re planning to use your phone for the audio and video recording, check which apps are best for your workflow. It’s advisable to use apps (e.g. Filmic Pro for iOS devices) that show the audio levels on screen while recording so that you know for sure that the audio is indeed being recorded.
    5. Keep a notebook/note-taking app to capture some important data
    Physical/digital note-taking while recording always helps during post-production. You also need to capture some metadata (more in Chapter 3), for which you can use your notes or a printed template. But please keep in mind that the noise you make while writing might get recorded, so choose your pen carefully.
    6. Ensure you get to record in a quiet place
    The most challenging aspect of any recording is finding a quiet place for clean audio and a well-lit place for good-quality video. Check below to know what to avoid:
    Noise source: Ambient noise (Audio)
    Possible solutions:
    1. Talk to the interviewee before recording to find the least noisy place available for the recording
    2. If you can, get a lavalier microphone (also known as lav mic, lapel mic, clip mic, etc.); because it is placed close to the interviewee’s face, it gives you a nice clean sound
    Noise source: LED and other home electric lights (Video)
    Possible solutions: Most home lights appear to flicker when captured on camera, which is distracting. You’ll learn more about solutions for such issues in the next module; meanwhile, avoid home lighting and, if you can afford them, use lights that are recommended for filming (more here). Alternatively, if you’re filming during the day, you can sit close to a window so that the subject’s face is lit with natural light.

    Interview process

    • Friendliness and empathy: The best emotion is captured when your interviewee trusts you the most. Try to be empathetic and friendly, relate to them on a human level and keep a check on their comfort level. They will open up and share something that they care about only when they think they can trust you. Trust is built over time. How do you build it in a short interview?
    • Ice-breaker questions: You can always ask some light ice-breaking questions in the beginning and slowly move towards more personal questions.
    • Body language: In an in-person interview, your body language matters much more than in a telephone or voice/video call. A positive body posture can set the mood of the subject. So a rule of thumb is to be a good listener and show curiosity to learn from the interviewee. Even when you’re interviewing someone speaking an endangered language that is unfamiliar to you, you can still start with the same body posture. Even though you won’t understand the vocabulary, you can be empathetic and try to relate by observing the interview’s emotional flow. You could reflect that with the right kind of camera moves.
    • Motion is emotion: Documenting a language is not just about placing a camera on a tripod and interviewing someone, though that’s a good starting point. You need to capture someone’s life on camera if you’re recording them talking about their life. If a picture is worth a thousand words, a video is worth a million! So, take ample time to shoot some b-roll. For instance, if your interviewee has narrated a bedtime story during the interview, capture some relevant shots, like kids sitting around an old person, or parents with kids. B-roll clips are generally short, so shoot really short videos (30 seconds to 1 minute max.) and cover a wide range of subjects because you never know where you can use them. You can use the b-roll as cutaway shots.

    Module 2: Hardware and software for recording, and recording process

    a. Audio recording


    Different scenarios:

    1. Home studio: If you’re recording at home, try to create a minimal setup. You need a microphone to be able to record the audio. If you can, I would suggest recording in a small home studio setup like the one pictured above (consisting of a USB microphone, a computer, and monitor headphones).
    2. Field recording with a recorder or phone: The recording setup will vary considerably if you are meeting someone outside your home for a field recording. In that case you will need to carry an audio recorder or a smartphone (with a recording app installed) and earphones. If you’re using a portable recorder, make sure you cover the top of the mic with a soft cotton cloth or fake fur to a) keep dust out, and b) reduce the sound of the wind during outdoor recording. Use a rubber band to tighten the cover at the base and never touch the cloth/fur while recording. Mics can capture even tiny movements, which can completely distort the audio.
    3. Recording from phone: The earphones that come with phones generally work with both phones and computers, and their microphone generally works better than the default built-in microphone. However, avoid sitting in an open space, as there is a high probability of a lot of noise being captured unless you are using a shotgun microphone.
    4. Audio editing software: If you are editing on a computer, Audacity, a free and open source audio editor, is the first choice of many seasoned recording artists. It is robust, easy to use and available on multiple platforms. If you are using your phone or tablet to record and edit the audio, use your native recording app or try to find a good free alternative in your app store. Ideally, the recording/editing app should allow you to record in a decent lossless quality (minimum requirement is 44100 Hz, above 16 bit PCM i.e. 24 or 32 bit, above 220 kbps; check your settings to find these). Save the audio in .WAV or .FLAC (Audacity supports both). If your recorder/phone does not support these formats, try to use an app/online converter like this (MP3→FLAC or M4A→FLAC) to convert the audio into .FLAC.
    How to make your vocals clear and loud in Audacity (tutorial in English, watch an Odia-language version here)
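The minimum audio requirements listed under audio editing software can also be checked on a saved recording. The following is a minimal sketch in Python using the standard-library `wave` module; the function name `check_wav` and the constant names are illustrative and not part of OpenSpeaks, and the sketch assumes an uncompressed PCM .WAV file (other formats, such as .FLAC, would need a different library).

```python
import wave

# Thresholds taken from the minimum requirements named in the text above.
MIN_SAMPLE_RATE_HZ = 44100
MIN_BIT_DEPTH = 24        # "above 16 bit", i.e. 24 or 32 bit
MIN_BITRATE_KBPS = 220


def check_wav(path):
    """Return (sample_rate, bit_depth, bitrate_kbps, problems) for a PCM .WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        bit_depth = wav.getsampwidth() * 8   # sample width is reported in bytes
        channels = wav.getnchannels()

    # Uncompressed PCM bitrate = samples per second x bits per sample x channels.
    bitrate_kbps = sample_rate * bit_depth * channels / 1000

    problems = []
    if sample_rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate is {sample_rate} Hz, below {MIN_SAMPLE_RATE_HZ} Hz")
    if bit_depth < MIN_BIT_DEPTH:
        problems.append(f"bit depth is {bit_depth} bit, below {MIN_BIT_DEPTH} bit")
    if bitrate_kbps <= MIN_BITRATE_KBPS:
        problems.append(f"bitrate is {bitrate_kbps:.0f} kbps, not above {MIN_BITRATE_KBPS} kbps")
    return sample_rate, bit_depth, bitrate_kbps, problems
```

Calling `check_wav("interview.wav")` (a hypothetical file name) returns an empty `problems` list when all three thresholds are met; otherwise each entry describes one setting to change before recording again.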

    b. Video recording

    Which camera to use
    Frankly speaking, the video is less important here than the audio. Viewers can still manage with low-quality video if the audio is loud and clear. So if you are keen on investing, invest in a good-quality microphone that can either be connected to the camera or be used as a secondary recorder. Do not trust your camera’s default microphone; it can jeopardize your hard work. As far as the camera goes, you can use any camera that allows you to record in a decent quality, i.e. 720p (1280×720 px) or higher, from your phone to a point-and-shoot camera to a dSLR.

    a) Using a camera: Use a shotgun microphone that can be connected directly to your camera so that you don’t need to spend much effort on audio syncing during post-production.
    b) Using a phone for recording video: These days most phones come with high-quality hardware that is capable of recording good video. But the real key to recording quality video on a phone lies in stabilizing the shot while recording. You can only do that by investing in a small tripod that can hold your phone (they are generally really cheap and do the job). For this particular project, tripods will be the best option.

    How to edit the videos: You need to compress the video using free software like HandBrake, and upload it to YouTube or something similar without making it public. We will download it and then ask you to delete it, so that you don’t have to worry about the amount of space it takes up on your hard drive.

    Chapter 3: Metadata collection and publication

    Annotation, subtitling of audio/video, translation of transcription and other content

    Download Content Release form (editable document in .odt and .docx, fillable form in .pdf); Metadata Documentation Sheet (in .ods, .xlsx)

    Metadata is a set of important information that is often collected and

    Transcription and annotation

    Transcription is the process of converting an oral recording into text. It simply means that the transcribed text matches what an interviewee has said. Hence, transcription can currently be done only in a language that has an established writing system or script, and whose script is encoded in the universal standard Unicode. There are two kinds of transcription that are generally used: a) verbatim transcription, which includes even the mistakes (like fillers and stutters) a speaker makes while speaking naturally, and b) non-verbatim transcription, where many common mistakes are simplified to make the text more meaningful and readable. Transcribing audio or video content also helps with translating the recorded content and makes the content easily discoverable and searchable.
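The difference between the two kinds of transcription can be illustrated with a small sketch in Python that derives a rough non-verbatim line from a verbatim one by dropping filler words and immediate word repetitions. This is only an illustration, not an OpenSpeaks tool: the function name `to_non_verbatim` and the English filler list are assumptions, every language needs its own list drawn up with a native speaker, and intentional repetitions would need manual review.

```python
import re

# Hypothetical filler list for English; replace it for the language being documented.
FILLERS = {"um", "uh", "er", "hmm"}


def _bare(word):
    """Lowercase a word and strip punctuation so that words can be compared."""
    return re.sub(r"[^\w']", "", word).lower()


def to_non_verbatim(verbatim):
    """Drop filler words and immediate repetitions (stutters) from one transcript line."""
    cleaned = []
    for word in verbatim.split():
        if _bare(word) in FILLERS:
            continue  # skip fillers such as "um"
        if cleaned and _bare(cleaned[-1]) == _bare(word):
            continue  # skip an immediate repetition such as "my my"
        cleaned.append(word)
    return " ".join(cleaned)
```

For example, `to_non_verbatim("um my my grandmother uh told this story")` keeps only "my grandmother told this story". The verbatim line should still be archived alongside the simplified one, since it is the faithful record of what was said.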

    See the W3C Web Accessibility Initiative (WAI) guidelines for more details on how to create better transcriptions.

    Annotation is the process of collecting certain metadata that are not necessarily transcriptions. Audio/video content will need subtitles in widely spoken languages like English for a wider reach. Transcriptions are generally created to have a verbatim version of the interview. Ideally, you need to work with a native speaker after the interview to create the transcription and ensure that no information is lost in the process. However, a transcription is not easily digestible. So you need to create summaries for each section of the interview, which capture the highlights and sometimes the details (for instance, how a game is played or a story).

    Chapter 4: Accessibility

    Captioning and subtitling

    Making the recorded content accessible for people with disabilities is crucial. A deaf person cannot listen to an audio recording or the sound of a video recording. So, it is very important to create captions or subtitles for your audio or video recording. Captioning is a time-consuming process and needs extra time and budget. There are many ways to add captions. We recommend Aegisub (user manual) for captioning on a computer as it supports all platforms (Windows, Mac and other Unix operating systems). Many modern video editors also support captioning. If you are collaborating with remote translators, then Amara is a recommended option. It is an open source video subtitling platform (learn how to use it from here). Popular platforms like Internet Archive, Vimeo and YouTube are supported on Amara. YouTube also has built-in closed captioning. We strongly recommend the comprehensive guides that the BBC has created (short version here, long version here) to learn how to create accessible captioning.
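If your transcription already has start and end times for each sentence, a caption file can also be generated with a short script. The following is a minimal sketch in Python that writes captions in the SubRip (.srt) layout, a plain-text caption format that Aegisub can open and YouTube accepts as an upload. The function names and the sample captions are illustrative assumptions, not part of any of the tools named above.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as the HH:MM:SS,mmm form used in .srt files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(segments):
    """Build .srt text from (start_seconds, end_seconds, caption_text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        # Each caption block: running number, time range, caption text.
        blocks.append(f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)  # blocks are separated by one blank line


# Hypothetical captions for the first seconds of an interview.
captions = build_srt([
    (0, 2.5, "Hello, and thank you for joining us."),
    (2.5, 5, "Could you tell us about your village?"),
])
```

Saving `captions` to a text file with the .srt extension, encoded as UTF-8, gives a file that can be loaded into a captioning tool and refined there.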


    1. What is OpenSpeaks?
      OpenSpeaks is a set of free and open resources that are intended to help anyone who is documenting a language. It includes guides on asking for consent before recording, how to record a language in a multimedia format, the process of selecting a copyright license for a recording, recording metadata (important information that is useful for archival), and publishing the content. It also contains downloadable forms and other templates.
    2. How is this project maintained?
      OpenSpeaks was originally started on Wikimedia Commons, a sister project of Wikipedia, by Subhashish Panigrahi. Later, it was housed here at the O Foundation. To make it a truly open project, it was mirrored on Wikiversity so that anyone can edit and improve it. Both versions are synchronized on a regular basis.
    3. How is it different from other resources?
      We think of OpenSpeaks as a directory of resources and a platform that is complementary to other similar platforms. Many useful resources that are developed by language documentation organizations and other leaders are included and attributed here as well.
    4. I am interested in contributing to OpenSpeaks. How can I help?
      You can certainly help grow OpenSpeaks. No skill is a small skill and your contribution would be valuable. Please go to the Wikiversity site and log in using your existing Wikipedia or other Wikimedia project credentials, or create a new account there (a different Privacy Policy applies, as Wikiversity is a non-OFDN site).
    5. Will I be attributed when someone uses my contributed work?
      Yes. Both the license terms and the attribution guide below encourage attributing the authors with a hyperlink to the list of authors.
    6. Can I use the content of this website or the OpenSpeaks page on Wikiversity?
      Yes. We encourage everyone to make use of this content in their own work, translate, and even distribute for commercial reproduction. However, when you do that, please attribute (see next answer for details) properly.
    7. What license is OpenSpeaks available under, and how do I attribute it when I use any content?
      OpenSpeaks is currently licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) License (see license terms here). If you use content from this website, cite it as below:
      O Foundation (OFDN) (January 15, 2021) OpenSpeaks Multimedia Toolkit. Retrieved from
      "OpenSpeaks Multimedia Toolkit." O Foundation (OFDN) - January 15, 2021,
      O Foundation (OFDN) August 26, 2019 OpenSpeaks Multimedia Toolkit., viewed January 15, 2021,<>
      O Foundation (OFDN) - OpenSpeaks Multimedia Toolkit. [Internet]. [Accessed January 15, 2021]. Available from:
      "OpenSpeaks Multimedia Toolkit." O Foundation (OFDN) - Accessed January 15, 2021.
      "OpenSpeaks Multimedia Toolkit." O Foundation (OFDN) [Online]. Available: [Accessed: January 15, 2021]

      However, if you’re citing anything from Wikiversity, you need to attribute to all the contributors as below:
      Wikiversity contributors. OpenSpeaks Multimedia Toolkit. [OER] Published September 23, 2020. Accessed MMM DD, YYYY.
      (NOTE: Replace YYYY above with the actual year, e.g. 2020. Similarly, replace MMM with the month, e.g. September, and DD with the day, e.g. 23.)
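The date replacement described in the note can also be done programmatically, for example when many citations are generated at once. The following is a minimal sketch in Python; the function name `wikiversity_citation` is illustrative, and the sketch assumes the default (English) locale so that the month name is spelled out as in the example above.

```python
from datetime import date


def wikiversity_citation(accessed):
    """Fill the 'Accessed MMM DD, YYYY' slot of the Wikiversity attribution line."""
    # %B gives the full month name, e.g. September, in the default locale.
    accessed_text = f"{accessed:%B} {accessed.day}, {accessed.year}"
    return (
        "Wikiversity contributors. OpenSpeaks Multimedia Toolkit. [OER] "
        f"Published September 23, 2020. Accessed {accessed_text}."
    )
```

Passing `date.today()` fills in the current date; passing `date(2020, 9, 23)` reproduces the example values given in the note.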


    1. Seyfeddinipur M, Rau F. Keeping it real: Video data in language documentation and language archiving. Language Documentation & Conservation. 2020;14:17.

    Recommended resources

    1. [Course] “Archiving for the Future: Simple Steps for Archiving Language Documentation Collections”. Accessed 30 September 2020.
    2. [Online guide] “Language Sustainability Toolkit”. Living Tongues Institute. Accessed 30 September 2020. (Archive, also see other recommended educational resources by Living Tongues)