SmartDocs Architect

Final Project for EMGT 308 - Solutions Architecture and the Cloud

Project Overview

SmartDocs Architect is a cloud-hosted, AI-powered document intelligence system built on Retrieval-Augmented Generation (RAG).

It allows users to upload documents, convert them into vector embeddings, retrieve semantically relevant context, and ask grounded questions based only on uploaded content.
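As a rough illustration of what "grounded" means here, the sketch below assembles a prompt that restricts the model to retrieved text. The function name `build_grounded_prompt` and the prompt wording are assumptions made for this example; the actual prompt used in `app/rag.py` may differ.

```python
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved chunks.

    Hypothetical helper for illustration; not taken from the project code.
    """
    # Number each chunk so an answer can point back at its source.
    context = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(
        build_grounded_prompt(
            "What is the refund window?",
            ["Refunds are accepted within 30 days."],
        )
    )
```

Numbering the chunks also makes it straightforward to show the same chunks to the user as sources next to the answer.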

The current MVP is deployed on AWS EC2 as a single-instance cloud workload.

Live Demo

Cloud Deployment (AWS EC2): http://54.226.223.159:8501/

Course Alignment

This project demonstrates core EMGT 308 concepts:

Cloud deployment on AWS EC2
Workload hosting in cloud infrastructure
Layered solution architecture
Infrastructure and application separation
Scalable architecture planning

Features

Upload PDF and TXT files
Parse and chunk document text
Generate embeddings using the Gemini API
Store vectors in ChromaDB
Ask grounded questions
Display retrieved source chunks
Clear knowledge base
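The chunking feature listed above can be pictured with a minimal sketch like the following. It splits text into fixed-size character windows with a small overlap; the function name `chunk_text` and the default sizes are assumptions for illustration, and the project's own chunking strategy in `app/ingest.py` may work differently.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Hypothetical helper; sizes are illustrative defaults, not project settings.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():  # skip windows that are only whitespace
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of storing some text twice.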

Implemented Architecture


User
↓
Internet
↓
AWS EC2 Instance
↓
Streamlit Application
↓
Gemini API + ChromaDB

Proposed Scalable Architecture


User
↓
Application Load Balancer
↓
Auto Scaling Group
↓
Multiple EC2 Instances
↓
Amazon S3

Technology Stack

Python
Streamlit
Gemini API
ChromaDB
PyPDF
python-dotenv
AWS EC2

Project Structure

SmartDocs Architect/
│
├── app.py
├── requirements.txt
├── README.md
├── .gitignore
├── .env.example
│
├── app/
│   ├── __init__.py
│   ├── ingest.py
│   ├── rag.py
│   ├── parsers.py
│   └── utils/
│       ├── __init__.py
│       └── helpers.py
│
├── docs/
│   └── architecture-notes.md
│
├── screenshots/
└── chroma_db/

System Flow

Upload document
Parse text
Chunk content
Generate embeddings
Store vectors
Ask question
Retrieve relevant chunks
Generate grounded answer
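The store-and-retrieve steps of this flow can be sketched without any external services. The example below is a toy stand-in only: `embed` uses word counts instead of Gemini embeddings, and `InMemoryStore` is a plain list instead of ChromaDB. All names are hypothetical and chosen for this illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Toy stand-in for a vector database (assumption, not ChromaDB)."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        # "Generate embeddings" and "Store vectors" steps.
        self.items.extend((chunk, embed(chunk)) for chunk in chunks)

    def query(self, question: str, k: int = 2) -> list[str]:
        # "Retrieve relevant chunks": rank stored chunks by similarity.
        q = embed(question)
        ranked = sorted(
            self.items, key=lambda item: cosine(q, item[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]

    def clear(self) -> None:
        # Mirrors the "Clear knowledge base" feature.
        self.items.clear()


if __name__ == "__main__":
    store = InMemoryStore()
    store.add(
        [
            "The security group opens port 8501 for Streamlit.",
            "Embeddings are stored as vectors.",
        ]
    )
    print(store.query("Which port does Streamlit use?", k=1))
```

A real embedding model captures meaning rather than exact word overlap, so it can match a question to a chunk even when they share no words; the ranking logic stays the same.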

AWS Deployment

Ubuntu EC2 instance
Security Group inbound rules:
- SSH (TCP 22)
- Streamlit (TCP 8501)
Streamlit served externally (bound to 0.0.0.0 on port 8501)

Mac / Linux:

python3 -m streamlit run app.py --server.port 8501 --server.address 0.0.0.0

Windows:

python -m streamlit run app.py --server.port 8501 --server.address 0.0.0.0

Or, using the py launcher:

py -m streamlit run app.py --server.port 8501 --server.address 0.0.0.0
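After starting Streamlit, it can help to confirm that the port is actually reachable, since a missing security group rule and a stopped app look the same from a browser. The following is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and the host shown is a placeholder to be replaced with the instance's public address.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder host: replace with the EC2 instance's public IP or DNS name.
    print(port_is_open("127.0.0.1", 8501))
```

Running the check on the instance against 127.0.0.1 and then from another machine against the public address helps separate an application problem from a network rule problem.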

Public Access

http://54.226.223.159:8501
