Gowap (Wappalyzer implementation in Go)
- JS analysis (using Rod)
- DNS scraping
- Confidence rate
- Recursive crawling
- Rod browser integration (Colly can still be used: faster, but it does not load JS)
- Can be used as a command-line tool (technologies.json file embedded)
- Test coverage 100%
- robots.txt compliance
```sh
go get github.com/unstppbl/gowap
```
Call the Init() function with a Config object created by NewConfig(). It returns a Wappalyzer object on which you can call the Analyze method with a URL string as argument:
```go
package main

import (
	"fmt"
	"log"

	"github.com/unstppbl/gowap"
)

func main() {
	// Create a Config object and customize it
	config := gowap.NewConfig()
	// Path to override the default technologies.json file
	config.AppsJSONPath = "path/to/my/technologies.json"
	// Timeout in seconds for fetching the URL
	config.TimeoutSeconds = 5
	// Timeout in seconds for loading the page
	config.LoadingTimeoutSeconds = 5
	// Do not analyze pages deeper than this number. Default (0) means no recursion (only the first page is analyzed)
	config.MaxDepth = 2
	// Maximum number of pages to visit; exit when reached
	config.MaxVisitedLinks = 10
	// Delay in ms between requests
	config.MsDelayBetweenRequests = 200
	// Choose the scraper: rod (default) or colly
	config.Scraper = "colly"
	// Override the user-agent string
	config.UserAgent = "GoWap"
	// Output the result as a JSON string
	config.JSON = true

	// Initialization
	wapp, err := gowap.Init(config)
	if err != nil {
		log.Fatal(err)
	}

	// Scraping
	url := "https://scrapethissite.com/"
	res, err := wapp.Analyze(url)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(res)
}
```

You can build the command-line tool with:
```sh
go build -o gowap cmd/gowap/main.go
```
Then run the compiled binary. You must specify a URL to analyze:
```
Usage : gowap [options] <url>
  -delay int
    	Delay in ms between requests (default 100)
  -depth int
    	Don't analyze page when depth superior to this number. Default (0) means no recursivity (only first page will be analyzed)
  -file string
    	Path to override default technologies.json file
  -h	Help
  -loadtimeout int
    	Timeout in seconds for loading the page (default 3)
  -maxlinks int
    	Max number of pages to visit. Exit when reached (default 5)
  -pretty
    	Pretty print json output
  -scraper string
    	Choose scraper between rod (default) and colly (default "rod")
  -timeout int
    	Timeout in seconds for fetching the url (default 3)
  -useragent string
    	Override the user-agent string
```
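For example, to crawl up to 10 pages with the Colly scraper and pretty-print the JSON output (the target URL is only a placeholder):

```sh
./gowap -scraper colly -maxlinks 10 -pretty https://scrapethissite.com/
```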
Some ideas for future improvements:
- analyze robots (field robots)
- analyze certificates (field certIssuer); see the first sketch after this list
- analyze CSS (field css)
- analyze XHR requests (field xhr)
- scrape a list of URLs from a file passed as an argument; see the second sketch after this list
- ability to choose what is scraped (DNS, cookies, HTML, scripts, etc.)
- more "real life" testing
- performance: the HTML regex matching seems slow
- should the output match the original Wappalyzer's, including its ordering?
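For the certificate idea, here is a minimal sketch (not gowap's actual code) of how the issuer of a site's TLS certificate can be read with Go's standard library; this is presumably the value a certIssuer pattern would be matched against. The host is a placeholder:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"log"
)

func main() {
	// Connect and perform a TLS handshake; the host is a placeholder
	conn, err := tls.Dial("tcp", "scrapethissite.com:443", &tls.Config{})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// The leaf certificate comes first in the peer's chain
	cert := conn.ConnectionState().PeerCertificates[0]
	// Issuer common name, i.e. the CA that signed the certificate
	fmt.Println(cert.Issuer.CommonName)
}
```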
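For the URL-list idea, a rough sketch of a wrapper around the existing API: it reads one URL per line from a file named in the arguments and analyzes each in turn. Only NewConfig, Init, and Analyze come from gowap; everything else is hypothetical glue:

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"

	"github.com/unstppbl/gowap"
)

func main() {
	// Hypothetical invocation: the first argument is a file with one URL per line
	f, err := os.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	wapp, err := gowap.Init(gowap.NewConfig())
	if err != nil {
		log.Fatal(err)
	}

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		url := scanner.Text()
		res, err := wapp.Analyze(url)
		if err != nil {
			log.Printf("error analyzing %s: %v", url, err)
			continue
		}
		fmt.Println(url, res)
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}
```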