A web application that helps users extract text from images, summarize the content, and listen to it through built-in accessibility features.
- 📄 Image Upload: Upload images (PNG, JPG, JPEG)
- 🔍 Text Extraction: Extract text from images using OCR
- 📝 Text Summarization: Generate summaries of extracted text
- 🔊 Text-to-Speech: Listen to extracted text and summaries
- ❓ Q&A: (Coming soon) Ask questions about the content
- Python 3.7+
- Tesseract OCR: Required for text extraction
  - Windows: Download from GitHub
  - macOS:
    brew install tesseract
  - Linux:
    sudo apt-get install tesseract-ocr
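
To confirm the Tesseract prerequisite before starting the backend, a small standard-library check along these lines may help. It is only a sketch: the helper name `find_tesseract` is illustrative and not part of this project.

```python
import shutil


def find_tesseract(binary="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary)


path = find_tesseract()
if path is None:
    print("Tesseract not found on PATH; install it and reopen your terminal.")
else:
    print("Tesseract found at", path)
```

If the check prints a path, running `tesseract --version` in the same terminal should also work.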
- Install Python dependencies:
  pip install -r requirements.txt
- Run the Flask backend:
  python app.py
  The backend will run on http://localhost:5000
- Navigate to the frontend directory:
  cd frontend
- Install Node.js dependencies:
  npm install
- Start the React development server:
  npm start
  The frontend will run on http://localhost:3000
- Open your browser and go to http://localhost:3000
- Upload an image containing text
- Click "Extract Text" to extract text from the image
- Click "Summarize" to generate a summary of the extracted text
- Click "Listen to Text" to hear the extracted text
- Click "Listen to Summary" to hear the summarized text
- POST /api/upload - Upload an image file
- POST /api/extract - Extract text from uploaded image
- POST /api/summarize - Summarize extracted text
- POST /api/synthesize - Convert text to speech (returns audio file)
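
The endpoints can also be called from a script instead of the React frontend. The sketch below uses only the standard library and prepares requests against the backend address from the setup steps. The request format is not documented here, so the JSON body is an assumption, and the names `ENDPOINTS`, `endpoint_url`, `build_json_request`, and `send` are made up for this example; check app.py for what each route actually expects.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # backend address from the setup steps

# Paths as listed above; the keys are short labels chosen for this sketch.
ENDPOINTS = {
    "upload": "/api/upload",
    "extract": "/api/extract",
    "summarize": "/api/summarize",
    "synthesize": "/api/synthesize",
}


def endpoint_url(name, base_url=BASE_URL):
    """Build the full URL for one of the endpoints listed above."""
    return base_url.rstrip("/") + ENDPOINTS[name]


def build_json_request(name, payload, base_url=BASE_URL):
    """Prepare (but do not send) a POST request with a JSON body.

    Assumption: the endpoint accepts JSON. The real format may differ.
    """
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url(name, base_url),
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and return the raw response bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

For example, `send(build_json_request("summarize", {"text": extracted_text}))` would post text for summarization, provided the backend accepts a JSON field named `text` (an assumption). Since /api/upload takes an image file, it would most likely need a multipart form body rather than JSON, and the bytes returned by /api/synthesize can be written to an audio file.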
The application includes AWS S3 integration for file storage. To use AWS features:
- Configure AWS credentials via environment variables or AWS CLI
- Update the S3 bucket name in aws_utils.py if needed
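
Before enabling the AWS features, it can be useful to see which credential source is present on the machine. The sketch below checks the two sources mentioned above: the standard `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` environment variables and the shared credentials file written by `aws configure`. The function name `credential_sources` is hypothetical, and the check only tests for presence, not validity; other sources such as instance roles are not covered.

```python
import os
from pathlib import Path


def credential_sources(env=None, home=None):
    """List which standard AWS credential sources appear to be configured."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    sources = []
    # Environment variables read by the AWS SDKs and the AWS CLI.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment")
    # File written by `aws configure`; AWS_SHARED_CREDENTIALS_FILE overrides its location.
    cred_file = Path(env.get("AWS_SHARED_CREDENTIALS_FILE") or home / ".aws" / "credentials")
    if cred_file.is_file():
        sources.append("credentials file")
    return sources


print(credential_sources() or "No AWS credentials found")
```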
- "Tesseract not found" error: Make sure Tesseract OCR is installed and in your system PATH
- CORS errors: The backend includes CORS support, so first make sure both the frontend (port 3000) and the backend (port 5000) are running
- File upload issues: Check that the uploads/ directory exists and has write permissions
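
For the file upload issue, a minimal sketch like the following creates the directory when it is missing and reports whether the current user can write to it. It assumes `uploads` is resolved relative to the directory where app.py is started; the helper name `ensure_upload_dir` is illustrative.

```python
import os


def ensure_upload_dir(path="uploads"):
    """Create the directory if it is missing and report whether it is writable."""
    os.makedirs(path, exist_ok=True)
    # Creating files inside a directory needs both write and execute (search) permission.
    return os.access(path, os.W_OK | os.X_OK)
```

Calling `ensure_upload_dir()` from the backend directory returns `True` when uploads can be saved there; a `False` result points to a permissions problem rather than a missing directory.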