A local AI tool for analyzing medical reports using OCR and machine learning. Processes scanned medical reports (images or PDFs) and provides automated analysis including value extraction, diagnosis prediction, and case similarity matching.
- 📄 Multi-format Support: Analyze image files (JPG, PNG) and PDF documents
- 🔍 OCR Text Extraction: Uses Tesseract OCR for text extraction from medical documents
- 🩺 Medical Value Parsing: Extracts vital signs, lab values, and symptoms automatically
- AI-Powered Analysis: Uses SentenceTransformer embeddings for semantic understanding
- 📊 Case-Based Learning: Finds similar past cases using machine learning
- 🌍 Multi-Language Support: Works with English and Albanian medical documents
- 🔒 Privacy-First: 100% local processing - no cloud services or data sharing
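As an illustration of the value-parsing feature, here is a minimal sketch of pulling a few vital signs out of OCR text with regular expressions. The pattern set and the `extract_vitals` function are hypothetical names chosen for this example; the project's actual parser may use different patterns and cover more fields.

```python
import re

# Hypothetical patterns for illustration; a real parser would cover more fields and formats.
PATTERNS = {
    "blood_pressure": re.compile(
        r"\b(?:BP|blood pressure)\s*[:=]?\s*(\d{2,3})\s*/\s*(\d{2,3})", re.IGNORECASE
    ),
    "heart_rate": re.compile(
        r"\b(?:HR|heart rate|pulse)\s*[:=]?\s*(\d{2,3})", re.IGNORECASE
    ),
    "temperature": re.compile(
        r"\b(?:temp|temperature)\s*[:=]?\s*(\d{2,3}(?:\.\d+)?)", re.IGNORECASE
    ),
}


def extract_vitals(text):
    """Return a dict with whichever vital signs were found in the text."""
    found = {}
    bp = PATTERNS["blood_pressure"].search(text)
    if bp:
        # Systolic and diastolic readings as a pair of integers
        found["blood_pressure"] = (int(bp.group(1)), int(bp.group(2)))
    hr = PATTERNS["heart_rate"].search(text)
    if hr:
        found["heart_rate"] = int(hr.group(1))
    temp = PATTERNS["temperature"].search(text)
    if temp:
        found["temperature"] = float(temp.group(1))
    return found


sample = "Patient vitals: BP 120/80, HR: 72 bpm, Temp 37.5 C"
print(extract_vitals(sample))
# {'blood_pressure': (120, 80), 'heart_rate': 72, 'temperature': 37.5}
```

Fields that do not appear in the text are simply left out of the result, so callers can tell "not found" apart from a zero reading.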
- Python 3.8 or higher (check with python --version)
- Git (to clone the repository)
- Tesseract OCR (for text extraction)
Windows:
- Download from: https://github.com/UB-Mannheim/tesseract/wiki
- Install the executable
- Add Tesseract to your system PATH
- Verify installation:
tesseract --version
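On any platform, the same PATH check can be done from Python, which is handy when the application fails to find Tesseract even though the installer ran. This is a small sketch using only the standard library; `tesseract_version` is a hypothetical helper name, not part of the project.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if it is not on PATH."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some Tesseract builds print the version banner to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


version = tesseract_version()
print(version or "Tesseract not found on PATH")
```

If this prints "Tesseract not found on PATH", reopen the terminal after editing PATH so the change takes effect.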
macOS:
brew install tesseract
tesseract --version # Verify installation
Linux (Ubuntu/Debian):
sudo apt-get update
sudo apt-get install tesseract-ocr
tesseract --version # Verify installation
Option A: Clone with Git
git clone https://github.com/Dielldev/MedAI.git
cd MedAI
Option B: Download ZIP
- Download the ZIP file from GitHub
- Extract to your desired folder
- Open terminal/command prompt in that folder
# Make sure you're in the MedAI directory
cd MedAI
# Install required packages
pip install -r requirements.txt
# Test if everything is working
python tests/test_installation.py
# Run the GUI application
python gui.py
# Run the web app
python web_app.py
Then open your browser and go to: http://localhost:5000
Features:
- Upload medical documents through web interface
- View analysis results in your browser
- Download reports as JSON or text files
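To make the JSON and text downloads concrete, the sketch below writes one analysis result in both formats. The result fields and the `export_report` function are assumptions made for this example; the README does not document the actual report schema.

```python
import json
import tempfile
from pathlib import Path


def export_report(result, out_dir, name="report"):
    """Write `result` as <name>.json and <name>.txt in out_dir and return both paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    json_path = out / f"{name}.json"
    text_path = out / f"{name}.txt"
    # ensure_ascii=False keeps non-ASCII text (for example Albanian characters) readable.
    json_path.write_text(
        json.dumps(result, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    lines = [f"{key}: {value}" for key, value in result.items()]
    text_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return json_path, text_path


# Hypothetical result; written to a temporary directory so the demo leaves no files behind.
sample_result = {"heart_rate": 72, "temperature": 37.5, "notes": "temperaturë normale"}
with tempfile.TemporaryDirectory() as tmp:
    json_file, text_file = export_report(sample_result, tmp)
    print(text_file.read_text(encoding="utf-8"))
```

In the project itself the exports/ folder would be the natural target directory.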
# Run enhanced demo with sample data
python tests/enhanced_demo.py
MedAI/
├── main.py # Main application logic
├── gui.py # Graphical user interface
├── web_app.py # Web application
├── medical_analyzer.py # Core medical analysis engine
├── requirements.txt # Python dependencies
├── config.yaml # Configuration settings
├── README.md # This file
├── setup.py # Installation script
├── data/ # Database and embeddings
│ ├── cases_db.json # Case database
│ └── case_embeddings.npy # ML embeddings
├── exports/ # Analysis results
├── uploads/ # Input documents
├── templates/ # Web app templates
│ └── index.html
├── tests/ # Test scripts
│ ├── test_installation.py
│ └── enhanced_demo.py
└── poppler-23.08.0/ # PDF processing tools
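The data/ folder pairs a case database with precomputed embeddings, which is what makes similar-case lookup possible. The sketch below shows one common way to rank stored cases against a new report, using cosine similarity over plain Python lists. The README does not say which metric the project uses, so cosine similarity and the function names here are assumptions. Real embeddings would typically be loaded from the .npy file with NumPy.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, case_embeddings, top_k=3):
    """Return (case_index, score) pairs for the top_k cases closest to the query."""
    scored = [
        (index, cosine_similarity(query, embedding))
        for index, embedding in enumerate(case_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors; real sentence embeddings have hundreds of dimensions.
cases = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.1, 0.0]
for index, score in most_similar(query, cases, top_k=2):
    print(f"case {index}: similarity {score:.3f}")
```

The returned indices can then be used to look up the matching entries in the case database.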
- All processing is done locally (no cloud services)
- No data is sent to external servers
- Medical data remains on your local machine
- Check the test files:
  python tests/test_installation.py
  python tests/enhanced_demo.py
- Enable debug mode in your Python scripts:
  medai = MedAI(debug=True)
- Check log files in the project directory
- sentence-transformers: Text embeddings
- pytesseract: OCR text extraction
- opencv-python: Image preprocessing
- scikit-learn: Similarity calculations
- pdf2image: PDF processing
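When an import error appears at startup, a quick check of which of these packages are installed can narrow things down. The sketch below maps each pip package name to the module name it is imported under (for example, opencv-python is imported as cv2) and reports what is missing. `check_dependencies` is a hypothetical helper, not part of the project.

```python
import importlib.util

# pip package name -> module name used in import statements
REQUIRED = {
    "sentence-transformers": "sentence_transformers",
    "pytesseract": "pytesseract",
    "opencv-python": "cv2",
    "scikit-learn": "sklearn",
    "pdf2image": "pdf2image",
}


def check_dependencies(required=REQUIRED):
    """Return {package_name: True/False} depending on whether its module can be found."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in required.items()
    }


for package, installed in check_dependencies().items():
    print(f"{package}: {'installed' if installed else 'MISSING'}")
```

Anything reported as MISSING can usually be fixed by rerunning pip install -r requirements.txt in the same Python environment.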
🏥 Medical Disclaimer: This tool is designed for educational and research purposes only. Always consult qualified healthcare professionals for medical decisions.




