ShadowSpeak

A full-stack ESL learning platform where students practice pronunciation by recording themselves mimicking YouTube video segments, and teachers review the submissions and provide feedback. Students receive AI-scored pronunciation evaluations powered by the Azure Cognitive Services Speech API.

Project Goal

Build a platform that replaces the scattered workflow ESL teachers use today: Google Drive for storage, email for submissions, and screen recording software for practice. Students can loop specific YouTube segments, record themselves in-browser, and submit directly. Teachers create lessons, assign them to students, and review submissions with threaded feedback. The Azure Speech API automatically scores each recording for pronunciation accuracy.

Tech Stack

Next.js 15 with App Router, React 19, TypeScript
Express.js backend with PostgreSQL database (8 tables)
Azure Blob Storage for audio recordings and images
Azure Cognitive Services Speech SDK for pronunciation scoring
Cloudinary for teacher video uploads and audio extraction
JWT authentication with role-based access control
SWR for data fetching with automatic revalidation
Material UI components and WaveSurfer.js waveform visualization
next-intl for multi-language support
Resend for email notifications, Sentry for error monitoring
Husky + lint-staged for pre-commit code quality enforcement

Solo Project

This is my most comprehensive project. I built everything from scratch:

Azure Speech API integration: evaluates recorded audio and returns accuracy, fluency, completeness, and overall pronunciation scores, with per-word accuracy detail
Browser audio recording using MediaRecorder API with start, pause, resume, and stop controls
YouTube segment looping via custom 100 ms polling (the YouTube embedded player has no native segment looping)
Cloud storage pipeline: recordings converted to base64, uploaded to Azure Blob Storage, URLs persisted to database
Phrase practice pipeline — audio segments with start/end timestamps linked to lessons so students can practice individual phrases
Threaded feedback system — teachers and students exchange replies per submission stored in a feedback_replies table
Student progress tracking: a status machine (new → in_progress → submitted → completed) plus a collapsible lesson archive
JWT auth with RBAC: teachers and students have different permissions enforced on both frontend and backend
Email notifications via Resend API alert teachers when students submit recordings
PostgreSQL schema with 8 tables (users, lessons, assignments, audio_segments, practice_words, practice_results, lists, feedback_replies), proper foreign keys and cascade deletes
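
The student progress flow in the list above (new → in_progress → submitted → completed) can be sketched as a small transition table. This is a minimal illustration that assumes statuses only ever move forward one step; the names NEXT_STATUS, canTransition, and advance are hypothetical and not taken from the project code.

```typescript
// Status names come from the list above; everything else is illustrative.
type LessonStatus = "new" | "in_progress" | "submitted" | "completed";

// Each status maps to the single status allowed to follow it (null = terminal).
const NEXT_STATUS: Record<LessonStatus, LessonStatus | null> = {
  new: "in_progress",
  in_progress: "submitted",
  submitted: "completed",
  completed: null,
};

// True only for the one forward step allowed from `from`.
function canTransition(from: LessonStatus, to: LessonStatus): boolean {
  return NEXT_STATUS[from] === to;
}

// Returns the next status, or throws if the lesson is already completed.
function advance(current: LessonStatus): LessonStatus {
  const next = NEXT_STATUS[current];
  if (next === null) {
    throw new Error(`"${current}" is a terminal status`);
  }
  return next;
}
```

Keeping the allowed transitions in one table means the same check can be reused on the backend before an assignment row is updated, so a request cannot skip from new straight to completed.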

Technical Challenges Solved

YouTube Segment Looping

The YouTube embedded player does not support native segment looping. I built a custom solution around a useLoopButtons hook that polls the playback position every 100 ms. When playback reaches the end timestamp, the hook automatically seeks back to the start. A state machine manages the transitions between four states: idle, start_set, ready, and looping.
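
The polling approach described above might look roughly like the sketch below. It is not the actual useLoopButtons hook: the four state names are from the text, but the event names, the PlayerLike interface, and the tick and startPolling helpers are assumptions made for illustration. PlayerLike mirrors the getCurrentTime and seekTo methods of the YouTube IFrame player so that a real player object could be passed in.

```typescript
// State names are from the text; event names are assumed for this sketch.
type LoopState =
  | { status: "idle" }
  | { status: "start_set"; start: number }
  | { status: "ready"; start: number; end: number }
  | { status: "looping"; start: number; end: number };

type LoopEvent =
  | { type: "SET_START"; time: number }
  | { type: "SET_END"; time: number }
  | { type: "TOGGLE_LOOP" }
  | { type: "RESET" };

function loopReducer(state: LoopState, event: LoopEvent): LoopState {
  switch (event.type) {
    case "SET_START":
      // Choosing a new start point always restarts the selection.
      return { status: "start_set", start: event.time };
    case "SET_END":
      // An end point needs an existing start point and must come after it.
      if (state.status === "idle" || event.time <= state.start) return state;
      return { status: "ready", start: state.start, end: event.time };
    case "TOGGLE_LOOP":
      if (state.status === "ready") return { ...state, status: "looping" };
      if (state.status === "looping") return { ...state, status: "ready" };
      return state;
    case "RESET":
      return { status: "idle" };
  }
}

// Minimal subset of the player surface this sketch relies on.
interface PlayerLike {
  getCurrentTime(): number;
  seekTo(seconds: number, allowSeekAhead: boolean): void;
}

// One polling step: seek back to the start once playback passes the end.
// Returns true when a seek was issued.
function tick(player: PlayerLike, state: LoopState): boolean {
  if (state.status !== "looping") return false;
  if (player.getCurrentTime() >= state.end) {
    player.seekTo(state.start, true);
    return true;
  }
  return false;
}

// Runs tick on a timer (100 ms by default) and returns a stop function.
function startPolling(
  player: PlayerLike,
  getState: () => LoopState,
  intervalMs = 100
): () => void {
  const id = setInterval(() => tick(player, getState()), intervalMs);
  return () => clearInterval(id);
}
```

In a React hook, startPolling would typically be called inside an effect and its returned stop function used as the cleanup. With a 100 ms interval, playback can overshoot the end timestamp by up to roughly that amount before the seek fires.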

Browser Audio Recording Pipeline

Recording audio in the browser and uploading it to cloud storage required multiple integration steps: capture the audio stream with the MediaRecorder API, convert the resulting Blob to base64, send it to the Express backend, convert it to a Buffer, stream it to Azure Blob Storage, return the URL, and persist that URL to PostgreSQL. All of this is managed through a reducer pattern in RecorderPanelContext.
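
A reducer for this kind of flow could be sketched as follows. The real RecorderPanelContext is not shown here, so the phase names, action names, and the stripDataUrlPrefix helper are all assumptions. The sketch also assumes the Blob is read with FileReader.readAsDataURL, which produces a string of the form data:<mime>;base64,<payload>; only the payload after the comma is what the server would decode (for example with Node's Buffer.from(payload, "base64")).

```typescript
// Phase and action names are hypothetical, chosen to mirror the steps above.
type RecorderState =
  | { phase: "idle" }
  | { phase: "recording" }
  | { phase: "paused" }
  | { phase: "stopped"; audioBase64: string }
  | { phase: "uploading"; audioBase64: string }
  | { phase: "uploaded"; url: string }
  | { phase: "error"; message: string };

type RecorderAction =
  | { type: "START" }
  | { type: "PAUSE" }
  | { type: "RESUME" }
  | { type: "STOP"; audioBase64: string }
  | { type: "UPLOAD_START" }
  | { type: "UPLOAD_SUCCESS"; url: string }
  | { type: "UPLOAD_FAILURE"; message: string }
  | { type: "RESET" };

// Actions that are not valid in the current phase leave the state unchanged.
function recorderReducer(state: RecorderState, action: RecorderAction): RecorderState {
  switch (action.type) {
    case "START":
      if (state.phase === "idle") return { phase: "recording" };
      return state;
    case "PAUSE":
      if (state.phase === "recording") return { phase: "paused" };
      return state;
    case "RESUME":
      if (state.phase === "paused") return { phase: "recording" };
      return state;
    case "STOP":
      if (state.phase === "recording" || state.phase === "paused") {
        return { phase: "stopped", audioBase64: action.audioBase64 };
      }
      return state;
    case "UPLOAD_START":
      if (state.phase === "stopped") {
        return { phase: "uploading", audioBase64: state.audioBase64 };
      }
      return state;
    case "UPLOAD_SUCCESS":
      if (state.phase === "uploading") return { phase: "uploaded", url: action.url };
      return state;
    case "UPLOAD_FAILURE":
      if (state.phase === "uploading") return { phase: "error", message: action.message };
      return state;
    case "RESET":
      return { phase: "idle" };
  }
}

// Drops the "data:<mime>;base64," prefix that readAsDataURL adds,
// leaving only the base64 payload to send to the backend.
function stripDataUrlPrefix(dataUrl: string): string {
  const comma = dataUrl.indexOf(",");
  return comma === -1 ? dataUrl : dataUrl.slice(comma + 1);
}
```

Modelling each phase as its own variant means the recording payload only exists in the phases that need it, and an out-of-order action such as UPLOAD_START before STOP is simply ignored.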

Role-Based Access Control

I implemented two-tier access control: JWT tokens carry a role claim ("teacher" or "student"), middleware validates the token on every API request, and server-side route protection rejects destructive operations when req.user.role !== "teacher". Students who try to access teacher endpoints receive 403 Forbidden.
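
The role check could be expressed as a small Express-style middleware factory, sketched below. To stay self-contained it declares minimal request and response shapes instead of importing Express types, and it assumes an earlier middleware has already verified the JWT and attached the decoded claims as req.user. The requireRole name and the error bodies are hypothetical.

```typescript
type Role = "teacher" | "student";

// Minimal stand-ins for the Express request/response objects.
interface AuthedRequest {
  // Assumed to be set by an earlier JWT-verification middleware.
  user?: { id: number; role: Role };
}

interface ResponseLike {
  status(code: number): ResponseLike;
  json(body: unknown): ResponseLike;
}

type Next = () => void;

// Returns middleware that lets the request through only for the given role.
function requireRole(required: Role) {
  return (req: AuthedRequest, res: ResponseLike, next: Next): void => {
    if (!req.user) {
      // No verified token on the request at all.
      res.status(401).json({ error: "Not authenticated" });
      return;
    }
    if (req.user.role !== required) {
      // Authenticated, but the role claim does not permit this route.
      res.status(403).json({ error: "Forbidden" });
      return;
    }
    next();
  };
}
```

In an Express app, this would be placed after the token-verification middleware on teacher-only routes, for example on a lesson delete route. Frontend checks can hide teacher controls from students, but the server-side check is the one that actually enforces the rule.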

What I Learned

How to integrate AI/speech recognition APIs (Azure Cognitive Services) into a production full-stack application
How to build a complete full-stack application from database schema design to deployed production frontend
How to work with browser APIs (MediaRecorder) and multiple cloud services (Azure Blob, Azure Speech, Cloudinary)
How to implement proper authentication and authorization with JWT and role-based access control
How to manage complex UI state using reducer patterns and context
How to design relational database schemas with proper foreign keys, constraints, and indexes for query performance
How to set up production monitoring (Sentry), transactional email (Resend), and pre-commit code quality enforcement (Husky)