Gowap (Wappalyzer implementation in Go)
- JS analysis (using Rod)
- DNS scraping
- Confidence rate
- Recursive crawling
- Rod browser integration (Colly can still be used: faster, but it does not load JS)
- Can be used as a command-line tool (technologies.json file embedded)
- Test coverage 100%
- robots.txt compliance
```sh
go get github.com/unstppbl/gowap
```
Call the Init() function with a Config object created by NewConfig(). It returns a Wappalyzer object on which you can call the Analyze method with a URL string as argument:
```go
package main

import (
	"fmt"
	"log"

	"github.com/unstppbl/gowap"
)

func main() {
	// Create a Config object and customize it
	config := gowap.NewConfig()
	// Path to override the default technologies.json file
	config.AppsJSONPath = "path/to/my/technologies.json"
	// Timeout in seconds for fetching the URL
	config.TimeoutSeconds = 5
	// Timeout in seconds for loading the page
	config.LoadingTimeoutSeconds = 5
	// Do not analyze pages deeper than this number. Default (0) means no recursion (only the first page is analyzed)
	config.MaxDepth = 2
	// Maximum number of pages to visit; exit when reached
	config.MaxVisitedLinks = 10
	// Delay in ms between requests
	config.MsDelayBetweenRequests = 200
	// Choose the scraper: rod (default) or colly
	config.Scraper = "colly"
	// Override the user-agent string
	config.UserAgent = "GoWap"
	// Output the result as a JSON string
	config.JSON = true

	// Initialization
	wapp, err := gowap.Init(config)
	if err != nil {
		log.Fatal(err)
	}

	// Scraping
	url := "https://scrapethissite.com/"
	res, err := wapp.Analyze(url)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(res)
}
```

You can build the command-line tool with:
```sh
go build -o gowap cmd/gowap/main.go
```
Then run the compiled binary. You must specify a URL to analyze:
```
Usage : gowap [options] <url>
  -delay int
    	Delay in ms between requests (default 100)
  -depth int
    	Don't analyze page when depth superior to this number. Default (0) means no recursivity (only first page will be analyzed)
  -file string
    	Path to override default technologies.json file
  -h	Help
  -loadtimeout int
    	Timeout in seconds for loading the page (default 3)
  -maxlinks int
    	Max number of pages to visit. Exit when reached (default 5)
  -pretty
    	Pretty print json output
  -scraper string
    	Choose scraper between rod (default) and colly (default "rod")
  -timeout int
    	Timeout in seconds for fetching the url (default 3)
  -useragent string
    	Override the user-agent string
```
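For example, to crawl up to 10 pages with the Colly scraper and pretty-print the JSON output (the target URL is only a placeholder):

```sh
./gowap -scraper colly -maxlinks 10 -pretty https://scrapethissite.com/
```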
Some ideas for future improvements:
- analyze robots (field robots)
- analyze certificates (field certIssuer); see the first sketch after this list
- analyze CSS (field css)
- analyze XHR requests (field xhr)
- scrape a list of URLs from a file passed as an argument; see the second sketch after this list
- ability to choose what is scraped (DNS, cookies, HTML, scripts, etc.)
- more "real life" testing
- performance: the HTML regex matching seems slow
- should the output match the original Wappalyzer's, including its ordering?
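For the certificate idea, here is a minimal sketch (not gowap's actual code) of how the issuer of a site's TLS certificate can be read with Go's standard library; this is presumably the value a certIssuer pattern would be matched against. The host is a placeholder:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"log"
)

func main() {
	// Connect and perform a TLS handshake; the host is a placeholder
	conn, err := tls.Dial("tcp", "scrapethissite.com:443", &tls.Config{})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// The leaf certificate comes first in the peer's chain
	cert := conn.ConnectionState().PeerCertificates[0]
	// Issuer common name, i.e. the CA that signed the certificate
	fmt.Println(cert.Issuer.CommonName)
}
```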
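For the URL-list idea, a rough sketch of a wrapper around the existing API: it reads one URL per line from a file named in the arguments and analyzes each in turn. Only NewConfig, Init, and Analyze come from gowap; everything else is hypothetical glue:

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"

	"github.com/unstppbl/gowap"
)

func main() {
	// Hypothetical invocation: the first argument is a file with one URL per line
	f, err := os.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	wapp, err := gowap.Init(gowap.NewConfig())
	if err != nil {
		log.Fatal(err)
	}

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		url := scanner.Text()
		res, err := wapp.Analyze(url)
		if err != nil {
			log.Printf("error analyzing %s: %v", url, err)
			continue
		}
		fmt.Println(url, res)
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}
```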