In this tutorial, you will learn how to implement Speech to Text and Text to Speech functionality using JavaScript. Basically, we will build a Voice Note-Taking App from scratch.
This app uses the Web Speech API to do three things: take notes by voice or by typing, save them to localStorage, and list the saved notes so you can listen to them.
The source code of this Voice Note App is written using HTML5, CSS3, JavaScript, jQuery, Bootstrap, and the Web Speech API. I have provided the complete source code in this article, but you have to download the image files using the download link given at the end of this tutorial.
The Web Speech API provides two distinct areas of functionality.
Our Voice Note App is divided into two separate interfaces.
The first is “Speech Recognition”, which involves receiving speech through a device’s microphone. The speech is then checked by a speech recognition service against a grammar (basically, the vocabulary you want to have recognized in a particular app). When a word or phrase is successfully recognized, it is returned as a text string (or a list of results), and further actions can be initiated from it.
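As a minimal sketch of how a page might locate the recognition interface, the snippet below looks for the standard constructor name and then the webkit-prefixed one. The helper name `resolveSpeechRecognition` and the `en-US` language tag are assumptions made for illustration; they are not part of the app's source.

```javascript
// Return the speech recognition constructor exposed by a window-like object,
// or null when neither the standard nor the webkit-prefixed name exists.
// `resolveSpeechRecognition` is a hypothetical helper name.
function resolveSpeechRecognition(globalObject) {
  if (!globalObject) {
    return null;
  }
  return globalObject.SpeechRecognition || globalObject.webkitSpeechRecognition || null;
}

// Browser usage; this part is skipped outside a browser (for example under Node.js).
if (typeof window !== 'undefined') {
  var Recognition = resolveSpeechRecognition(window);
  if (Recognition) {
    var recognizer = new Recognition();
    recognizer.lang = 'en-US'; // assumed language tag
    recognizer.onresult = function (event) {
      // Log the most recent transcript returned by the recognition service.
      console.log(event.results[event.resultIndex][0].transcript);
    };
  }
}
```

Passing the global object in as a parameter keeps the lookup easy to check with a plain object instead of a real browser window.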
The second is “Speech Synthesis” (also known as text-to-speech, or TTS), which involves synthesizing text contained within an app into speech and playing it through a device’s speaker or audio output connection.
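An utterance carries volume, rate, and pitch settings, and the Web Speech API specification defines a range for each (volume 0 to 1, rate 0.1 to 10, pitch 0 to 2). The sketch below, assuming hypothetical helper names `clamp`, `normalizeVoiceSettings`, and `speakText`, keeps the values inside those ranges before speaking.

```javascript
// Keep a number inside the closed interval [min, max].
function clamp(value, min, max) {
  return Math.min(Math.max(value, min), max);
}

// Hypothetical helper: force utterance settings into the ranges
// defined by the Web Speech API specification.
function normalizeVoiceSettings(settings) {
  return {
    volume: clamp(settings.volume, 0, 1),
    rate: clamp(settings.rate, 0.1, 10),
    pitch: clamp(settings.pitch, 0, 2)
  };
}

// Hypothetical helper: speak a message in the browser with safe settings.
// It does nothing when speech synthesis is unavailable (for example under Node.js).
function speakText(message, settings) {
  if (typeof window === 'undefined' || !('speechSynthesis' in window)) {
    return false;
  }
  var safe = normalizeVoiceSettings(settings);
  var utterance = new SpeechSynthesisUtterance(message);
  utterance.volume = safe.volume;
  utterance.rate = safe.rate;
  utterance.pitch = safe.pitch;
  window.speechSynthesis.speak(utterance);
  return true;
}
```

For example, `speakText('Note saved', { volume: 1, rate: 1, pitch: 1 })` could be called from a click handler in the browser.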
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <title>Voice Note App</title>
  <meta name="description" content="A Voice Note App that allows you to take voice and/or text notes and play them back.">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <link rel="shortcut icon" href="assets/ico/favicon.ico">
  <!-- Custom Style Sheet -->
  <link rel="stylesheet" href="assets/css/style.css">
  <!-- Bootstrap CDN CSS -->
  <link href="https://stackpath.bootstrapcdn.com/bootswatch/4.1.1/cerulean/bootstrap.min.css" rel="stylesheet" integrity="sha384-0Mou2qXGeXK7k/Ue/a1hspEVcEP2zCpoQZw8/MPeUgISww+VmDJcy2ri9tX0a6iy" crossorigin="anonymous">
</head>
<body>
  <div class="container-fluid align-center">
    <img src="assets/img/code-mic-150.png" alt="Voice Note App Logo">
    <h1>Voice Note App</h1>
    <p class="page-description">This app allows you to take voice and/or text notes and play them back.</p>
    <hr>
    <h3 class="no-browser-support">Sorry, Your Browser Doesn't Support the Web Speech API. Try Opening This Demo In Google Chrome.</h3>
    <div class="app">
      <div class="row">
        <div class="col-md-6 align-center">
          <h3>Add New Voice Note</h3>
          <div class="input-single">
            <textarea id="note-textarea" placeholder="Create a new note by typing or using voice recognition." rows="6"></textarea>
          </div>
          <button id="start-record-btn" class="btn-success" title="Start Recording">Start Recognition</button>
          <button id="pause-record-btn" class="btn-warning" title="Pause Recording">Pause Recognition</button>
          <button id="save-note-btn" class="btn-info" title="Save Note">Save Note</button>
          <p id="recording-instructions">Press the <strong>Start Recognition</strong> button and allow access.</p>
        </div>
        <div class="col-md-6 align-center">
          <h3>My Voice Notes</h3>
          <ul id="notes">
            <li>
              <p class="no-notes">You don't have any notes.</p>
            </li>
          </ul>
        </div>
      </div> <!-- /row -->
    </div> <!-- /app -->
  </div>
  <div id="footer">
    <div class="clearfix1">
      <div class="container">
        <div class="row">
          <div class="center">
            <p>Speech to Text Voice Note App 2018</p>
          </div>
        </div> <!-- /row -->
      </div> <!-- /container -->
    </div>
  </div> <!-- /footer -->
  <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
  <script src="assets/js/script.js"></script>
</body>
</html>
ul {
  list-style: none;
  padding: 0;
}

p {
  color: #444;
}

button {
  margin-bottom: 20px;
  padding: 10px 10px;
}

button:focus {
  outline: 0;
}

.container {
  max-width: 700px;
  margin: 0 auto;
  padding: 30px 50px;
  text-align: center;
}

.container h1 {
  margin-bottom: 20px;
}

.page-description {
  font-size: 1.1rem;
  margin: 0 auto;
}

.tz-link {
  font-size: 1em;
  color: #1da7da;
  text-decoration: none;
}

.no-browser-support {
  display: none;
  font-size: 1.2rem;
  color: #e64427;
  margin-top: 35px;
}

.app {
  margin: 40px auto;
}

#note-textarea {
  margin: 20px 0;
  width: 80%;
}

#recording-instructions {
  margin: 15px auto 60px;
}

#notes {
  padding-top: 20px;
}

.note .header {
  font-size: 0.9em;
  color: #888;
  margin-bottom: 10px;
}

.note .delete-note,
.note .listen-note {
  text-decoration: none;
  margin-left: 15px;
}

.note .content {
  margin-bottom: 30px;
}

.align-center img {
  padding-top: 10px;
}

.align-center {
  text-align: center;
}

.center {
  text-align: center;
  width: 100%;
}

@media (max-width: 768px) {
  .container {
    padding: 50px 25px;
  }

  button {
    margin-bottom: 20px;
    padding: 10px 10px;
    width: 80%;
  }
}
try {
  var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  var recognition = new SpeechRecognition();
} catch (e) {
  console.error(e);
  $('.no-browser-support').show();
  $('.app').hide();
}

var noteTextarea = $('#note-textarea');
var instructions = $('#recording-instructions');
var notesList = $('ul#notes');

var noteContent = '';

// Get all notes from previous sessions and display them.
var notes = getAllNotes();
renderNotes(notes);

/*-----------------------------
      Voice Recognition
------------------------------*/

// Skip the setup when the constructor above failed and recognition is undefined.
if (recognition) {
  // If false, the recording will stop after a few seconds of silence.
  // When true, the silence period is longer (about 15 seconds),
  // allowing us to keep recording even when the user pauses.
  recognition.continuous = true;

  // This block is called every time the Speech API captures a line.
  recognition.onresult = function(event) {
    // event is a SpeechRecognitionEvent object.
    // It holds all the lines we have captured so far.
    // We only need the current one.
    var current = event.resultIndex;

    // Get a transcript of what was said.
    var transcript = event.results[current][0].transcript;

    // Add the current transcript to the contents of our Note.
    // There is a weird bug on mobile, where everything is repeated twice.
    // There is no official solution so far so we have to handle an edge case.
    var mobileRepeatBug = (current == 1 && transcript == event.results[0][0].transcript);

    if (!mobileRepeatBug) {
      noteContent += transcript;
      noteTextarea.val(noteContent);
    }
  };

  recognition.onstart = function() {
    instructions.text('Voice recognition activated. Try speaking into the microphone.');
  };

  recognition.onspeechend = function() {
    instructions.text('You were quiet for a while so voice recognition turned itself off.');
  };

  recognition.onerror = function(event) {
    if (event.error == 'no-speech') {
      instructions.text('No speech was detected. Try again.');
    }
  };
}

/*-----------------------------
      App buttons and input
------------------------------*/

$('#start-record-btn').on('click', function(e) {
  if (noteContent.length) {
    noteContent += ' ';
  }
  recognition.start();
});

$('#pause-record-btn').on('click', function(e) {
  recognition.stop();
  instructions.text('Voice recognition paused.');
});

// Sync the text inside the text area with the noteContent variable.
noteTextarea.on('input', function() {
  noteContent = $(this).val();
});

$('#save-note-btn').on('click', function(e) {
  recognition.stop();

  if (!noteContent.length) {
    instructions.text('Could not save empty note. Please add a message to your note.');
  } else {
    // Save note to localStorage.
    // The key is the dateTime with seconds, the value is the content of the note.
    saveNote(new Date().toLocaleString(), noteContent);

    // Reset variables and update UI.
    noteContent = '';
    renderNotes(getAllNotes());
    noteTextarea.val('');
    instructions.text('Note saved successfully.');
  }
});

notesList.on('click', function(e) {
  e.preventDefault();
  var target = $(e.target);

  // Listen to the selected note.
  if (target.hasClass('listen-note')) {
    var content = target.closest('.note').find('.content').text();
    readOutLoud(content);
  }

  // Delete note.
  if (target.hasClass('delete-note')) {
    var dateTime = target.siblings('.date').text();
    deleteNote(dateTime);
    target.closest('.note').remove();
  }
});

/*-----------------------------
      Speech Synthesis
------------------------------*/

function readOutLoud(message) {
  var speech = new SpeechSynthesisUtterance();

  // Set the text and voice attributes.
  speech.text = message;
  speech.volume = 1;
  speech.rate = 1;
  // The valid pitch range is 0 to 2; 1 is the default.
  speech.pitch = 1;

  window.speechSynthesis.speak(speech);
}

/*-----------------------------
      Helper Functions
------------------------------*/

function renderNotes(notes) {
  var html = '';
  if (notes.length) {
    notes.forEach(function(note) {
      html += `<li class="note">
        <p class="header">
          <span class="date">${note.date}</span>
          <a href="#" class="listen-note" title="Listen to Note">Listen to Note</a>
          <a href="#" class="delete-note" title="Delete">Delete</a>
        </p>
        <p class="content">${note.content}</p>
      </li>`;
    });
  } else {
    html = '<li><p class="content">You don\'t have any notes yet.</p></li>';
  }
  notesList.html(html);
}

function saveNote(dateTime, content) {
  localStorage.setItem('note-' + dateTime, content);
}

function getAllNotes() {
  var notes = [];
  var key;
  for (var i = 0; i < localStorage.length; i++) {
    key = localStorage.key(i);

    // Only keys created by saveNote carry the 'note-' prefix.
    if (key.substring(0, 5) == 'note-') {
      notes.push({
        date: key.replace('note-', ''),
        content: localStorage.getItem(key)
      });
    }
  }
  return notes;
}

function deleteNote(dateTime) {
  localStorage.removeItem('note-' + dateTime);
}
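The renderNotes helper above places note text straight into an HTML string, so a note that contains characters such as < or & may be interpreted as markup. One possible hardening, sketched here with the hypothetical helpers `escapeHtml` and `noteToHtml`, is to escape the text before building the list item.

```javascript
// Hypothetical helper: replace the characters that have special meaning in HTML.
// The ampersand is replaced first so that later entities are not escaped twice.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: build the markup for one note with escaped fields.
// The class names match the ones used by the app's click handler.
function noteToHtml(note) {
  return '<li class="note">' +
    '<p class="header">' +
    '<span class="date">' + escapeHtml(note.date) + '</span>' +
    '<a href="#" class="listen-note" title="Listen to Note">Listen to Note</a>' +
    '<a href="#" class="delete-note" title="Delete">Delete</a>' +
    '</p>' +
    '<p class="content">' + escapeHtml(note.content) + '</p>' +
    '</li>';
}
```

Inside renderNotes, the template string could then be swapped for `html += noteToHtml(note);`.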
Click on the “Listen to Note” link next to the date of the note that you want to listen to.
Click on the “Delete” link next to the date of the note that you want to delete.
On Google Chrome, using Speech Recognition on a web page involves a server-based recognition engine. Your audio is sent to a web service for recognition processing, so it won’t work offline.
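Because recognition depends on a remote service, a lost connection reaches the page as an error event rather than as a result. The sketch below maps some of the error codes defined by the Web Speech API to user-facing messages; the helper name `describeRecognitionError` and the message wording are assumptions.

```javascript
// Messages for a few of the error codes reported by SpeechRecognition.
// The wording is illustrative only.
var RECOGNITION_ERROR_MESSAGES = {
  'no-speech': 'No speech was detected. Try again.',
  'network': 'The recognition service could not be reached. Check your connection.',
  'not-allowed': 'Microphone access was blocked. Allow access and try again.',
  'audio-capture': 'No microphone was found.'
};

// Hypothetical helper: turn an error code into a message, with a generic fallback.
function describeRecognitionError(code) {
  if (Object.prototype.hasOwnProperty.call(RECOGNITION_ERROR_MESSAGES, code)) {
    return RECOGNITION_ERROR_MESSAGES[code];
  }
  return 'Voice recognition failed (' + code + ').';
}

// Possible use inside the app's error handler:
// recognition.onerror = function (event) {
//   instructions.text(describeRecognitionError(event.error));
// };
```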
Most APIs that require user permission don’t work on non-secure hosts. Make sure you serve your Web Speech apps over HTTPS.
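A page can check whether it is running in a secure context before asking for the microphone. The sketch below reads the standard `isSecureContext` property and falls back to the page address when that property is missing; the helper name `isSecureForSpeech` and the fallback rule are assumptions.

```javascript
// Hypothetical helper: report whether a window-like object is a secure context.
function isSecureForSpeech(win) {
  if (!win) {
    return false;
  }
  // Modern browsers expose a boolean isSecureContext property.
  if (typeof win.isSecureContext === 'boolean') {
    return win.isSecureContext;
  }
  // Assumed fallback for older browsers: HTTPS pages and localhost.
  var location = win.location || {};
  return location.protocol === 'https:' || location.hostname === 'localhost';
}

// Browser usage; skipped outside a browser (for example under Node.js).
if (typeof window !== 'undefined' && !isSecureForSpeech(window)) {
  console.warn('This page is not served securely, so voice recognition may be blocked.');
}
```

Browsers generally treat http://localhost as a secure context, which is convenient while developing the app locally.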