A Python 3 web scraper integrating requests and BeautifulSoup.
It is intended to be used from the shell, passing arguments that select the desired part of a website.
Params:
- -u url # Target URL
- -t tag # Target tag, for example a, li, p, span
- -c classes # Target classes of the current target tag, as a single string
- -g text, href, img # Searches for all elements matching the tag/class params. text returns the text value of each element, href returns the href value of a tags, and img downloads the file referenced by the src param of img tags
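As a rough illustration of how these four flags could be read, here is a minimal sketch using the standard-library `argparse` module. It is not taken from Scrapper.py: the function name `build_parser`, the long option names, the choice to make `-u` and `-t` required, and the `text` default for `-g` are all assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the -u, -t, -c and -g flags (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Scrape one kind of element from a page")
    # Assumption: URL and tag are mandatory, classes are optional.
    parser.add_argument("-u", "--url", required=True, help="target URL")
    parser.add_argument("-t", "--tag", required=True, help="target tag, e.g. a, li, p, span")
    parser.add_argument("-c", "--classes", default=None,
                        help="classes of the target tag, as a single string")
    # Assumption: only the three documented values are accepted, defaulting to text.
    parser.add_argument("-g", "--get", choices=["text", "href", "img"], default="text",
                        help="return text, return href, or download img src")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-u", "https://es.wikipedia.org", "-t", "a", "-c", "external", "-g", "href"]
)
print(args.url, args.tag, args.classes, args.get)
```

Restricting `-g` with `choices` makes a mistyped value fail with a usage message before any request is made.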
- Getting href value from all a tags:
python3 Scrapper.py -u https://es.wikipedia.org -t a -c external -g href
- Download all images from src param in img tags:
python3 Scrapper.py -u https://es.wikipedia.org -t img -g img
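What the two commands above select can be sketched with BeautifulSoup's `find_all`. This is only an approximation of the behaviour described here, not the code in Scrapper.py: the function name `extract` and the sample HTML are hypothetical, and the sketch works on an HTML string rather than fetching the page (in practice the HTML would come from something like `requests.get(url).text`). For `img` it only collects the `src` values and leaves the actual download out.

```python
from typing import List, Optional

from bs4 import BeautifulSoup


def extract(html: str, tag: str, classes: Optional[str], get: str) -> List[str]:
    """Return the text, href or src of every element matching tag and classes."""
    soup = BeautifulSoup(html, "html.parser")
    # Filter by class only when one was given; otherwise match on the tag alone.
    elements = soup.find_all(tag, class_=classes) if classes else soup.find_all(tag)
    if get == "text":
        return [el.get_text(strip=True) for el in elements]
    # href is read from a tags, src from img tags; elements lacking it are skipped.
    attr = "href" if get == "href" else "src"
    return [el[attr] for el in elements if el.has_attr(attr)]


# Hypothetical page fragment used only to exercise the function.
sample = """
<a class="external" href="https://example.org/a">First</a>
<a href="/internal">Second</a>
<img src="/static/logo.png">
"""

print(extract(sample, "a", "external", "href"))  # ['https://example.org/a']
print(extract(sample, "img", None, "img"))       # ['/static/logo.png']
```

Note that a relative `src` such as `/static/logo.png` would still need to be resolved against the page URL before downloading, for instance with `urllib.parse.urljoin`.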