
JavaScript Speech to Text and Text to Speech Note Taking App

In this tutorial, you will learn how to implement Speech to Text and Text to Speech functionality using JavaScript. Basically, we will build a Voice Note-Taking App from scratch.

This voice-powered note app uses the Web Speech API to do three things:

  • Take notes by using voice-to-text or keyboard input.
  • Save voice notes to localStorage.
  • Display all of the saved notes, with the option to either listen to each note or delete it.

The source code of this Voice Note App is written using HTML5, CSS3, JavaScript (with jQuery), Bootstrap, and the Web Speech API. The complete source code is provided in this article, but you have to download the image files using the download link given at the end of this tutorial.


Intro to the Web Speech API and Where I Used It

The Web Speech API provides two distinct areas of functionality.

  • Speech Recognition
  • Speech Synthesis (also known as text to speech, or TTS)

Our Voice Note App uses both of these interfaces.

The first is “Speech Recognition”: speech is received through a device’s microphone and checked by a speech recognition service against a grammar list (basically, the vocabulary you want to have recognized in a particular app). When a word or phrase is successfully recognized, it is returned as a text string (or a list of results), and further actions can be triggered from it.

The second is “Speech Synthesis” (aka text-to-speech, or TTS): text contained within an app is synthesized into speech and played through a device’s speaker or audio output connection.
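
Since the two interfaces are exposed separately, a browser may support one and not the other. The sketch below shows one way to check for each before using it. The helper name detectSpeechSupport is hypothetical and is not part of the app's script.js; it takes the global object as a parameter, so it can be tried outside a browser as well.

```javascript
// Hypothetical helper (not in the app's script.js): reports which halves
// of the Web Speech API a given global object exposes.
function detectSpeechSupport(globalObject) {
    // Chrome exposes the recognition constructor under a webkit prefix.
    var Recognition = globalObject.SpeechRecognition || globalObject.webkitSpeechRecognition;

    return {
        recognition: typeof Recognition === 'function',
        synthesis: typeof globalObject.speechSynthesis === 'object' &&
            globalObject.speechSynthesis !== null &&
            typeof globalObject.SpeechSynthesisUtterance === 'function'
    };
}

// In a browser, pass window. The empty-object fallback only keeps the
// sketch runnable in environments without a window.
var support = detectSpeechSupport(typeof window !== 'undefined' ? window : {});
console.log(support);
```

With a result like this, the app could hide only the recording buttons when recognition is missing and still let the user type notes and listen to them.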


JavaScript Speech to Text and Text to Speech Note Taking App – Full Source Code

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <title>Voice Note App</title>
    <meta name="description" content="A Voice Note App that allows you to take voice and/or text notes and play them back.">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="shortcut icon" href="assets/ico/favicon.ico">
    <!-- Custom Style Sheet -->
    <link rel="stylesheet" href="assets/css/style.css">
    <!-- Bootstrap CDN CSS -->
    <link href="https://stackpath.bootstrapcdn.com/bootswatch/4.1.1/cerulean/bootstrap.min.css" rel="stylesheet" integrity="sha384-0Mou2qXGeXK7k/Ue/a1hspEVcEP2zCpoQZw8/MPeUgISww+VmDJcy2ri9tX0a6iy" crossorigin="anonymous">

</head>

<body>
    <div class="container-fluid align-center">
        <img src="assets/img/code-mic-150.png" alt="Voice Note App Logo">
        <h1>Voice Note App</h1>
        <p class="page-description">This app allows you to take voice and/or text notes and play them back.</p>
        <hr>
        <h3 class="no-browser-support">Sorry, Your Browser Doesn't Support the Web Speech API. Try Opening This Demo In Google Chrome.</h3>
        <div class="app">
            <div class="row">
                <div class="col-md-6 align-center">
                    <h3>Add New Voice Note</h3>
                    <div class="input-single">
                        <textarea id="note-textarea" placeholder="Create a new note by typing or using voice recognition." rows="6"></textarea>
                    </div>
                    <button id="start-record-btn" class="btn-success" title="Start Recording">Start Recognition</button>
                    <button id="pause-record-btn" class="btn-warning" title="Pause Recording">Pause Recognition</button>
                    <button id="save-note-btn" class="btn-info" title="Save Note">Save Note</button>
                    <p id="recording-instructions">Press the
                        <strong>Start Recognition</strong> button and allow access.</p>
                </div>
                <div class="col-md-6 align-center">
                    <h3>My Voice Notes</h3>
                    <ul id="notes">
                        <li>
                            <p class="no-notes">You don't have any notes.</p>
                        </li>
                    </ul>
                </div>
            </div>
            <!-- /row -->
        </div>
        <!-- /app -->
    </div>

    <div id="footer">
        <div class="clearfix1">
            <div class="container">
                <div class="row">

                    <div class="center">
                        <p>Speech to Text Voice Note App 2018</p>
                    </div>

                </div>
                <!-- /row -->
            </div>
            <!-- /container -->
        </div>
    </div>
    <!-- /footer -->

    <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
    <script src="assets/js/script.js"></script>

</body>

</html>

/assets/css/style.css

ul {
    list-style: none;
    padding: 0;
}

p {
    color: #444;
}

button {
    margin-bottom: 20px;
    padding: 10px 10px;
}

button:focus {
    outline: 0;
}

.container {
    max-width: 700px;
    margin: 0 auto;
    padding: 30px 50px;
    text-align: center;
}

.container h1 {
    margin-bottom: 20px;
}

.page-description {
    font-size: 1.1rem;
    margin: 0 auto;
}

.tz-link {
    font-size: 1em;
    color: #1da7da;
    text-decoration: none;
}

.no-browser-support {
    display: none;
    font-size: 1.2rem;
    color: #e64427;
    margin-top: 35px;
}

.app {
    margin: 40px auto;
}

#note-textarea {
    margin: 20px 0;
    width: 80%;
}

#recording-instructions {
    margin: 15px auto 60px;
}

#notes {
    padding-top: 20px;
}

.note .header {
    font-size: 0.9em;
    color: #888;
    margin-bottom: 10px;
}

.note .delete-note,
.note .listen-note {
    text-decoration: none;
    margin-left: 15px;
}

.note .content {
    margin-bottom: 30px;
}

.align-center img {
    padding-top: 10px;
}

.align-center {
    text-align: center;
}

.center {
    text-align: center;
    width: 100%;
}

@media (max-width: 768px) {
    .container {
        padding: 50px 25px;
    }
    button {
        margin-bottom: 20px;
        padding: 10px 10px;
        width: 80%;
    }
}

/assets/js/script.js

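try {
    var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    var recognition = new SpeechRecognition();
} catch (e) {
    console.error(e);
    $('.no-browser-support').show();
    $('.app').hide();
    // Fall back to a no-op stub so the rest of the script does not throw
    // when it sets handlers on an undefined recognition object.
    recognition = { start: function() {}, stop: function() {} };
}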


var noteTextarea = $('#note-textarea');
var instructions = $('#recording-instructions');
var notesList = $('ul#notes');

var noteContent = '';

// Get all notes from previous sessions and display them.
var notes = getAllNotes();
renderNotes(notes);



/*-----------------------------
      Voice Recognition 
------------------------------*/
// If false, the recording will stop after a few seconds of silence.
// When true, the silence period is longer (about 15 seconds),
// allowing us to keep recording even when the user pauses. 
recognition.continuous = true;

// This block is called every time the Speech API captures a line.
recognition.onresult = function(event) {

    // event is a SpeechRecognitionEvent object.
    // It holds all the lines we have captured so far. 
    // We only need the current one.
    var current = event.resultIndex;

    // Get a transcript of what was said.
    var transcript = event.results[current][0].transcript;

    // Add the current transcript to the contents of our Note.
    // There is a weird bug on mobile, where everything is repeated twice.
    // There is no official solution so far so we have to handle an edge case.
    var mobileRepeatBug = (current == 1 && transcript == event.results[0][0].transcript);

    if (!mobileRepeatBug) {
        noteContent += transcript;
        noteTextarea.val(noteContent);
    }
};

recognition.onstart = function() {
    instructions.text('Voice recognition activated. Try speaking into the microphone.');
}

recognition.onspeechend = function() {
    instructions.text('You were quiet for a while so voice recognition turned itself off.');
}

recognition.onerror = function(event) {
    if (event.error == 'no-speech') {
        instructions.text('No speech was detected. Try again.');
    }
};



/*-----------------------------
      App buttons and input 
------------------------------*/
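$('#start-record-btn').on('click', function(e) {
    if (noteContent.length) {
        noteContent += ' ';
    }
    // start() throws an InvalidStateError if recognition is already running,
    // for example when the button is clicked twice.
    try {
        recognition.start();
    } catch (err) {
        console.warn(err);
    }
});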


$('#pause-record-btn').on('click', function(e) {
    recognition.stop();
    instructions.text('Voice recognition paused.');
});

// Sync the text inside the text area with the noteContent variable.
noteTextarea.on('input', function() {
    noteContent = $(this).val();
})

$('#save-note-btn').on('click', function(e) {
    recognition.stop();

    if (!noteContent.length) {
        instructions.text('Could not save empty note. Please add a message to your note.');
    } else {
        // Save note to localStorage.
        // The key is the dateTime with seconds, the value is the content of the note.
        saveNote(new Date().toLocaleString(), noteContent);

        // Reset variables and update UI.
        noteContent = '';
        renderNotes(getAllNotes());
        noteTextarea.val('');
        instructions.text('Note saved successfully.');
    }

})


notesList.on('click', function(e) {
    e.preventDefault();
    var target = $(e.target);

    // Listen to the selected note.
    if (target.hasClass('listen-note')) {
        var content = target.closest('.note').find('.content').text();
        readOutLoud(content);
    }

    // Delete note.
    if (target.hasClass('delete-note')) {
        var dateTime = target.siblings('.date').text();
        deleteNote(dateTime);
        target.closest('.note').remove();
    }
});



/*-----------------------------
      Speech Synthesis 
------------------------------*/
function readOutLoud(message) {
    var speech = new SpeechSynthesisUtterance();

    // Set the text and voice attributes.
    speech.text = message;
    speech.volume = 1; // Valid range: 0 to 1.
    speech.rate = 1;   // Valid range: 0.1 to 10.
    speech.pitch = 1;  // Valid range: 0 to 2.

    window.speechSynthesis.speak(speech);
}



/*-----------------------------
      Helper Functions 
------------------------------*/
function renderNotes(notes) {
    var html = '';
    if (notes.length) {
        notes.forEach(function(note) {
            html += `<li class="note">
        <p class="header">
          <span class="date">${note.date}</span>
          <a href="#" class="listen-note" title="Listen to Note">Listen to Note</a>
          <a href="#" class="delete-note" title="Delete">Delete</a>
        </p>
        <p class="content">${note.content}</p>
      </li>`;
        });
    } else {
        html = '<li><p class="content">You don\'t have any notes yet.</p></li>';
    }
    notesList.html(html);
}


function saveNote(dateTime, content) {
    localStorage.setItem('note-' + dateTime, content);
}


function getAllNotes() {
    var notes = [];
    var key;
    for (var i = 0; i < localStorage.length; i++) {
        key = localStorage.key(i);

        if (key.substring(0, 5) == 'note-') {
            notes.push({
                date: key.replace('note-', ''),
                content: localStorage.getItem(key)
            });
        }
    }
    return notes;
}


function deleteNote(dateTime) {
    localStorage.removeItem('note-' + dateTime);
}
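
One thing to be aware of in renderNotes: the note text is placed into the page as HTML, so a note that contains characters such as < or & would be interpreted as markup instead of being shown as typed. A minimal sketch of one way to handle this is to escape the text first. The helper name escapeHtml is hypothetical and is not part of the script above.

```javascript
// Hypothetical helper (not in the original script.js): replaces the
// characters that have a special meaning in HTML with their entities.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;') // Must run first so later entities are not double-escaped.
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

console.log(escapeHtml('<b>Buy milk</b> & "eggs"'));
// prints: &lt;b&gt;Buy milk&lt;/b&gt; &amp; &quot;eggs&quot;
```

Assuming this helper is added to the script, the template in renderNotes could use ${escapeHtml(note.content)} and ${escapeHtml(note.date)} in place of the raw values. The “Listen to Note” handler reads the text back with .text(), so it would still receive the original characters.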

How to Use the Voice Note App

Add A New Voice or Text Note

  1. Click on the “Start Recognition” button, give the app permission to use your microphone, and start speaking your note. (If you have no microphone or don’t want to use it, you can type into the text box instead.)
  2. When you are done speaking, click on the “Pause Recognition” button, and then click the “Save Note” button. (If you typed your note into the text box, you do not need to click “Pause Recognition”; just click “Save Note”.)

Listen To Notes

Click on the “Listen to Note” link next to the date of the note that you want to listen to.

Delete Voice Notes

Click on the “Delete” link next to the date of the note that you want to delete.


Important Note

On Google Chrome, using Speech Recognition on a web page involves a server-based recognition engine. Your audio is sent to a web service for recognition processing, so it won’t work offline.
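
Because recognition depends on a remote service, it can fail for reasons other than silence. The app's onerror handler only covers the no-speech case, so the sketch below shows how a few more of the error codes defined by the Web Speech API could be turned into messages. The helper name describeRecognitionError and the message wording are assumptions made for illustration.

```javascript
// Hypothetical helper (not in the original script.js): maps a
// SpeechRecognitionErrorEvent.error code to a message for the user.
var RECOGNITION_ERROR_MESSAGES = {
    'no-speech': 'No speech was detected. Try again.',
    'network': 'The recognition service could not be reached. Check your internet connection.',
    'not-allowed': 'Microphone access was blocked. Allow access and try again.',
    'audio-capture': 'No microphone was found.'
};

function describeRecognitionError(errorCode) {
    if (Object.prototype.hasOwnProperty.call(RECOGNITION_ERROR_MESSAGES, errorCode)) {
        return RECOGNITION_ERROR_MESSAGES[errorCode];
    }
    // Fall back to showing the raw code for anything not listed above.
    return 'Voice recognition error: ' + errorCode;
}

console.log(describeRecognitionError('network'));
```

In the app, this could be wired in with recognition.onerror = function(event) { instructions.text(describeRecognitionError(event.error)); };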

Most APIs that require user permission don’t work on non-secure hosts. Make sure you are serving your Web Speech apps over HTTPS.
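
Browsers expose this as the isSecureContext property on window, which is true for HTTPS pages (and, in general, for pages served from localhost during development). The sketch below is one way to warn the user up front. The helper name microphoneAccessHint and its message are hypothetical and not part of the app.

```javascript
// Hypothetical helper (not in the original script.js): returns a warning
// when the page is not running in a secure context, or an empty string.
function microphoneAccessHint(globalObject) {
    if (globalObject.isSecureContext === true) {
        return '';
    }
    return 'Voice recognition needs a secure page. Open this app over HTTPS.';
}

// In a browser, pass window. The empty-object fallback only keeps the
// sketch runnable in environments without a window.
var hint = microphoneAccessHint(typeof window !== 'undefined' ? window : {});
console.log(hint);
```

In the app, a non-empty hint could be shown with instructions.text(hint) before the user presses “Start Recognition”.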


Download JavaScript Voice Note App (including images)

Download

Furqan

Well. I've been working for the past three years as a web designer and developer. I have successfully created websites for small to medium sized companies as part of my freelance career. During that time I've also completed my bachelor's in Information Technology.
