Give this web app a URL and it will attempt to fetch the site's contents for you asynchronously. Use the returned job ID to retrieve those contents at a later time.
- `POST /api/job` - kick off a job to fetch web content

```shell
$ curl -X POST "http://localhost:4000/api/job" \
    -H "Content-Type: application/json" \
    -d '{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ"}'
{
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "title": null,
  "status": "processing",
  "id": "2d80bd8dc50140089ae1ce6766f38c57",
  "content": null
}
```
- `GET /api/job/:id` - return the status of a previously run job

```shell
$ curl "http://localhost:4000/api/job/2d80bd8dc50140089ae1ce6766f38c57"
{
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up - YouTube",
  "status": "success",
  "id": "2d80bd8dc50140089ae1ce6766f38c57",
  "content": "<!DOCTYPE html><html..."
}
```
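Since a freshly created job comes back as `"processing"`, a client needs to poll the `GET` endpoint until the job settles. A minimal Python sketch of that loop (the base URL comes from the examples above; the `fetch` and `sleep` parameters are illustrative extras so the loop can be exercised without a running server):

```python
# Sketch of a client that polls the job endpoint until the job settles.
# `fetch` and `sleep` are injectable so the loop can be tested offline.
import json
import time
import urllib.request

BASE = "http://localhost:4000/api/job"

def fetch_job(job_id):
    """GET one job record from the API described above."""
    with urllib.request.urlopen(f"{BASE}/{job_id}") as resp:
        return json.load(resp)

def await_job(job_id, fetch=fetch_job, interval=1.0, sleep=time.sleep):
    """Poll until the job's status is no longer "processing", then return it."""
    while True:
        job = fetch(job_id)
        if job["status"] != "processing":
            return job
        sleep(interval)
```

With a live server you would simply call `await_job("2d80bd8dc50140089ae1ce6766f38c57")` and get back the finished job record.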
Thanks to the magic of Phoenix, it also provides an admin view of the jobs that have been run.
A detailed display provides job status and a thumbnail.
You'll need Elixir installed, along with a running PostgreSQL server.
Once configured, update `config/dev.exs` with your database creds:

```elixir
config :fetch_me_if_you_can, FetchMeIfYouCan.Repo,
  adapter: Ecto.Adapters.Postgres,
  username: "postgres",
  password: "postgres"
```
Then:

- Install mix dependencies with `mix deps.get`
- Create and migrate your database with `mix ecto.create && mix ecto.migrate`
- Start the Phoenix endpoint with `mix phoenix.server`
Now you can visit `localhost:4000` from your browser. To view the jobs admin, visit `localhost:4000/jobs`.
- Building this just involved connecting a few frameworks and libraries. The bulk of the development work was in:
- the worker
- the controller
- and, to a lesser extent, the view and the model.
- The service treats every request as a brand-new job, so data accumulates indefinitely. It probably wants some sort of cleanup service that prunes old jobs periodically. Better yet, why store the data in a database at all? Just store it in Redis with a reasonable TTL.
- Testing and security were not considerations, so don't use this in production.
- This was pretty fun to put together!