A complete system consisting of an Android camera app and a REST API backend that uses OpenAI's GPT-4o Vision API for intelligent image analysis. Capture photos with your Android device and get AI-powered insights within seconds (analysis typically takes 10-30 seconds).
This project consists of two main components:
┌──────────────────┐   HTTPS/ngrok   ┌──────────────────┐    OpenAI API    ┌──────────────────┐
│   Android App    │ ──────────────► │   Backend API    │ ───────────────► │    GPT-4o API    │
│                  │                 │                  │                  │                  │
│ • Camera UI      │                 │ • Image Upload   │                  │ • Vision Model   │
│ • Photo Capture  │                 │ • File Handling  │                  │ • Analysis       │
│ • Custom Prompts │ ◄────────────── │ • OpenAI Client  │ ◄─────────────── │ • Response       │
│ • Results Display│  JSON Response  │ • Auto Cleanup   │                  │                  │
└──────────────────┘                 └──────────────────┘                  └──────────────────┘
- Android App (`/app`): Kotlin-based camera interface with CameraX
- Backend API (`/backend`): Node.js REST API that processes images with OpenAI
- 📱 Modern Android Camera App with real-time preview using CameraX
- 📸 One-tap photo capture with an intuitive Material Design UI
- 🌐 Universal connectivity via ngrok tunneling (works on emulator + physical devices)
- 🤖 GPT-4o Vision Analysis for intelligent image processing
- 💬 Customizable AI prompts with persistent settings
- ⚡ Optimized timeouts for reliable API communication
- 🛡️ Robust error handling and automatic file cleanup
- 🔒 HTTPS support through ngrok for secure communications
- Node.js (v14 or higher) for the backend
- Android Studio with Android SDK for the app
- OpenAI API Key from OpenAI Platform
- ngrok account (free) from ngrok.com
cd backend
npm install

# Copy environment template and edit
cp env.example .env

Add your OpenAI API key to .env:

OPENAI_API_KEY=sk-your-actual-api-key-here
PORT=8080

npm run dev

Server will run on http://localhost:8080
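
The two settings above are easy to get wrong (a missing key, a malformed port), so it can help to check them once at startup. The following is a minimal sketch, assuming the values end up in `process.env` via dotenv; `loadConfig` and its error messages are hypothetical and not taken from the project's `server.js`.

```javascript
// Hypothetical helper: validate the settings from .env before starting the server.
// `loadConfig` and its error messages are illustrative, not taken from server.js.
function loadConfig(env = process.env) {
  const apiKey = (env.OPENAI_API_KEY || '').trim();
  if (!apiKey) {
    throw new Error('OPENAI_API_KEY is missing; add it to backend/.env');
  }

  // Fall back to 8080, the port used throughout this README.
  const port = Number.parseInt(env.PORT || '8080', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be a number between 1 and 65535, got "${env.PORT}"`);
  }

  return { apiKey, port };
}

// Example with an explicit object instead of process.env:
const config = loadConfig({ OPENAI_API_KEY: 'sk-your-actual-api-key-here', PORT: '8080' });
console.log(`Backend would listen on port ${config.port}`);
```

Failing fast here turns a confusing runtime error on the first upload into a clear message at launch.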
- Download ngrok from ngrok.com
- Sign up for a free account and get your auth token
- Install ngrok and authenticate
# In a new terminal window
ngrok http 8080

Copy the HTTPS URL (e.g., https://abc123.ngrok.io)
- Open `app/src/main/java/com/example/myapplication/MainActivity.kt`
- Update the `backendBaseUrl` with your ngrok HTTPS URL:

private val backendBaseUrl = "https://your-ngrok-url.ngrok.io"

- Open the project in Android Studio
- Wait for Gradle sync to complete
- Connect an Android device or start an emulator
- Click Run (▶️) to build and install the app
- Grant camera permission when prompted
- Point camera at something interesting
- Tap the red capture button
- Wait for AI analysis (may take 10-30 seconds)
- View results in the dialog that appears
- App Launch: Camera preview starts immediately using CameraX
- Photo Capture: User taps the red capture button to take a photo
- Image Upload: Photo is sent to backend via HTTPS multipart form data through ngrok
- AI Processing: Backend forwards image to OpenAI GPT-4o Vision API with custom prompt
- Response: AI analysis is returned and displayed in an alert dialog
- Cleanup: Backend automatically deletes uploaded files after processing
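
The AI Processing step above can be sketched as building a chat completions request that carries both the prompt and the image. This is an illustration under stated assumptions, not the project's actual code: `buildVisionRequest` is a hypothetical helper, and the `max_tokens` value is a guess.

```javascript
// Sketch of the request body the backend could send to OpenAI's chat completions API.
// `buildVisionRequest` is a hypothetical helper; max_tokens is an assumed value.
function buildVisionRequest(imageBuffer, mimeType, prompt) {
  // The image is inlined as a base64 data URL, so no public image hosting is needed.
  const dataUrl = `data:${mimeType};base64,${imageBuffer.toString('base64')}`;

  return {
    model: 'gpt-4o',
    max_tokens: 500,
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: prompt },
          { type: 'image_url', image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}

// Example: three bytes stand in for real JPEG data.
const body = buildVisionRequest(
  Buffer.from([0xff, 0xd8, 0xff]),
  'image/jpeg',
  'Describe what you see in detail'
);
console.log(body.messages[0].content[1].image_url.url);
```

With the official Node.js client, a body like this would typically be passed to `openai.chat.completions.create(body)`, with the analysis text read from `choices[0].message.content`.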
Android App → CameraX Preview → Capture Photo →
ngrok HTTPS → Backend API → OpenAI GPT-4o Vision →
JSON Response → Android Dialog → File Cleanup
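
The final File Cleanup stage is worth guarding with `try`/`finally`, so the upload is removed even when analysis fails. A minimal synchronous sketch follows; `withCleanup` is a hypothetical wrapper, and the temporary file only stands in for a real upload.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical wrapper: run `analyze` on an uploaded file and always delete the file afterwards.
function withCleanup(filePath, analyze) {
  try {
    return analyze(filePath);
  } finally {
    // force: true ignores "file not found", so cleanup never masks the original error.
    fs.rmSync(filePath, { force: true });
  }
}

// Example: a temporary file stands in for an upload.
const uploadPath = path.join(os.tmpdir(), `upload-${process.pid}-${Date.now()}.jpg`);
fs.writeFileSync(uploadPath, Buffer.from([0xff, 0xd8, 0xff]));

const sizeInBytes = withCleanup(uploadPath, (p) => fs.statSync(p).size);
console.log(sizeInBytes, fs.existsSync(uploadPath));
```

In an async request handler the same shape applies, with `await` inside the `try` and `fs.promises.rm` in the `finally`.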
camgpt/
├── app/                                   # Android Application
│   ├── src/main/
│   │   ├── java/.../MainActivity.kt       # Main camera activity
│   │   ├── res/layout/activity_main.xml   # UI layout
│   │   └── AndroidManifest.xml            # App permissions
│   └── build.gradle.kts                   # Android dependencies
├── backend/                               # REST API Backend
│   ├── server.js                          # Express.js server
│   ├── package.json                       # Node.js dependencies
│   ├── Dockerfile                         # Docker configuration
│   ├── docker-compose.yml                 # Docker Compose setup
│   └── README.md                          # Backend documentation
├── gradle/                                # Gradle configuration
└── README.md                              # This file
- Camera (Android app): To capture photos
- Internet (Android app): To communicate with backend API via HTTPS
- Internet (backend): To communicate with OpenAI API
- File System (backend): For temporary image storage (auto-cleanup)
- Language: Kotlin
- Camera: CameraX (modern camera API)
- HTTP Client: OkHttp with extended timeouts
- JSON Parsing: Gson
- UI: Material Design Components
- Architecture: Single Activity with coroutines
- Runtime: Node.js
- Framework: Express.js
- File Upload: Multer
- AI Integration: OpenAI official client
- Cross-Origin: CORS enabled
- Environment: dotenv configuration
Update the `backendBaseUrl` in `MainActivity.kt`:

private val backendBaseUrl = "https://your-ngrok-url.ngrok.io"

- In-app: Tap the settings button (✏️) to modify prompts
- Default: Edit the `defaultPrompt` variable in `MainActivity.kt`
- Examples:
  - "Describe what you see in detail"
  - "Extract all text from this image"
  - "List all objects and their colors"
Adjust HTTP timeouts in MainActivity.kt:

private val client = OkHttpClient.Builder()
    .connectTimeout(30, TimeUnit.SECONDS)
    .writeTimeout(30, TimeUnit.SECONDS)
    .readTimeout(60, TimeUnit.SECONDS)
    .build()

Configure in .env file:
OPENAI_API_KEY=your-key
PORT=8080
NODE_ENV=production

Add specialized endpoints in server.js:
// Text extraction endpoint
app.post('/api/extract-text', upload.single('image'), async (req, res) => {
  const prompt = "Extract all text from this image";
  // ... processing logic
});

const rateLimit = require('express-rate-limit');
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100 // requests per IP
});
app.use('/api/', limiter);

# Terminal 1: Start backend
cd backend
npm run dev
# Terminal 2: Start ngrok tunnel
ngrok http 8080

cd backend
# Build and run
docker build -t camgpt-backend .
docker run -p 8080:8080 -e OPENAI_API_KEY=your-key camgpt-backend
# Or use docker-compose
echo "OPENAI_API_KEY=your-key" > .env
docker-compose up -d

- Heroku: Deploy with environment variables
- Railway: Connect GitHub repo with env vars
- Render: Deploy Node.js service
- DigitalOcean App Platform: Use environment variables
- AWS/GCP: Deploy with proper environment configuration
- Development: Install APK via Android Studio
- Distribution: Build signed APK or AAB for distribution
- Google Play: Follow standard Android publishing process
| Issue | Solution |
|---|---|
| Camera permission denied | Grant camera permission in app settings |
| Timeout errors | Check ngrok tunnel is active and backend is running |
| Connection refused | Verify backend URL in MainActivity.kt |
| Build errors | Update Android SDK and sync Gradle |
| Issue | Solution |
|---|---|
| Port 8080 in use | Kill process: npx kill-port 8080 |
| OpenAI API errors | Verify API key and account credits |
| File upload fails | Check image size (max 10MB) |
| CORS errors | Ensure CORS is enabled in server.js |
| Issue | Solution |
|---|---|
| Tunnel expired | Restart ngrok and update Android app URL |
| 404 errors | Ensure backend is running on port 8080 |
| HTTPS required | Always use HTTPS ngrok URL in Android app |
- OpenAI GPT-4o Vision: ~$0.01 per image analysis
- ngrok: Free tier with 2-hour sessions (paid plans available)
- Hosting: Free tiers available on most cloud platforms
- Development: Completely free for local development
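
To turn the per-image figure above into a budget, a rough estimate is enough. The sketch below assumes the README's ~$0.01 per image as a flat rate; actual OpenAI pricing depends on image size and response length, and `estimateMonthlyCostUsd` is a hypothetical helper.

```javascript
// Rough cost model using the README's ~$0.01 per image figure (an estimate, not a quoted price).
const COST_PER_IMAGE_USD = 0.01;

function estimateMonthlyCostUsd(imagesPerDay, daysPerMonth = 30) {
  // Work in whole cents to avoid floating-point drift when multiplying small amounts.
  const cents = Math.round(COST_PER_IMAGE_USD * 100) * imagesPerDay * daysPerMonth;
  return cents / 100;
}

console.log(estimateMonthlyCostUsd(50)); // prints 15
```

At 50 images per day, that comes to about $15 per month under this assumption.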
- Monitor OpenAI API usage in dashboard
- Implement request caching for repeated images
- Add rate limiting to prevent abuse
- Use image compression before sending to API
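
Request caching for repeated images, as suggested above, could be keyed on a hash of the image bytes plus the prompt. This is a minimal in-memory sketch, not the project's implementation: `analyzeWithCache` and `cacheKey` are hypothetical names, the cache is unbounded, and entries are lost on restart.

```javascript
const crypto = require('crypto');

// In-memory cache keyed by image content and prompt (hypothetical sketch).
const analysisCache = new Map();

function cacheKey(imageBuffer, prompt) {
  // The NUL separator keeps image bytes and prompt text from running together.
  return crypto.createHash('sha256').update(imageBuffer).update('\0').update(prompt).digest('hex');
}

// `analyze` stands in for the OpenAI call; it only runs on a cache miss.
function analyzeWithCache(imageBuffer, prompt, analyze) {
  const key = cacheKey(imageBuffer, prompt);
  if (!analysisCache.has(key)) {
    analysisCache.set(key, analyze(imageBuffer, prompt));
  }
  return analysisCache.get(key);
}

// Example: the second identical request does not call `analyze` again.
let calls = 0;
const fakeAnalyze = () => {
  calls += 1;
  return 'a red apple on a table';
};
const image = Buffer.from('same image bytes');
analyzeWithCache(image, 'Describe what you see in detail', fakeAnalyze);
analyzeWithCache(image, 'Describe what you see in detail', fakeAnalyze);
console.log(calls); // prints 1
```

If `analyze` returns a promise, the promise itself is cached, so concurrent identical requests would share one API call.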
- ✅ Never commit API keys to version control
- ✅ Use environment variables for all sensitive configuration
- ✅ Enable HTTPS via ngrok for secure communication
- ✅ Implement rate limiting for production APIs
- ✅ Validate file uploads (size, type, content)
- ✅ Auto-cleanup uploaded files after processing
- ✅ Monitor API usage and set spending limits
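
Upload validation (size, type, content) from the checklist above can be sketched by checking the file size and the leading "magic bytes" rather than trusting the file extension. The accepted formats here are an assumption, and `validateUpload` is a hypothetical helper rather than the backend's actual check.

```javascript
// 10 MB, matching the upload limit mentioned in the troubleshooting table.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Leading "magic bytes" for the formats this sketch accepts (an assumption; the real
// backend may allow other types).
const SIGNATURES = {
  'image/jpeg': [0xff, 0xd8, 0xff],
  'image/png': [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
};

// Returns the detected MIME type, or throws if the buffer is empty, too large,
// or does not start with a known image signature.
function validateUpload(buffer) {
  if (buffer.length === 0 || buffer.length > MAX_UPLOAD_BYTES) {
    throw new Error(`Image must be between 1 byte and ${MAX_UPLOAD_BYTES} bytes`);
  }
  for (const [mimeType, signature] of Object.entries(SIGNATURES)) {
    if (signature.every((byte, i) => buffer[i] === byte)) {
      return mimeType;
    }
  }
  throw new Error('Unsupported file type: expected JPEG or PNG');
}

console.log(validateUpload(Buffer.from([0xff, 0xd8, 0xff, 0xe0]))); // prints image/jpeg
```

Checking content instead of the client-supplied name or MIME type means a renamed non-image file is rejected before it is sent to the API.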
// Add to server.js for basic logging
app.use((req, res, next) => {
  console.log(`${new Date().toISOString()} - ${req.method} ${req.path}`);
  next();
});

- Access ngrok web interface at http://localhost:4040
- Monitor requests, response times, and errors
- Useful for debugging API communication
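
The same local interface also serves a small JSON API, which can save copying the tunnel URL by hand after each ngrok restart. The sketch below assumes the response shape of `http://localhost:4040/api/tunnels` (a `tunnels` array whose entries carry `proto` and `public_url`); `pickHttpsUrl` is a hypothetical helper and the sample data is made up.

```javascript
// Hypothetical helper: pick the HTTPS public URL out of the JSON served by the local
// ngrok agent at http://localhost:4040/api/tunnels (field names assumed from that API).
function pickHttpsUrl(tunnelsResponse) {
  const tunnel = (tunnelsResponse.tunnels || []).find((t) => t.proto === 'https');
  return tunnel ? tunnel.public_url : null;
}

// Example with a trimmed-down, made-up sample response:
const sample = {
  tunnels: [
    { proto: 'https', public_url: 'https://abc123.ngrok.io', config: { addr: 'http://localhost:8080' } },
  ],
};
console.log(pickHttpsUrl(sample)); // prints https://abc123.ngrok.io
```

On a Node.js version with a built-in `fetch`, the live value could be read with `fetch('http://localhost:4040/api/tunnels').then((r) => r.json()).then(pickHttpsUrl)` and pasted into `backendBaseUrl`.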
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly on both Android and backend
- Submit a pull request
This project is open source and available under the MIT License.