Dynamic URL Scraper
The Dynamic URL Scraper discovers and collects links starting from the provided webpages. It crawls across multiple linked pages to gather URLs, making it suitable for websites whose content or URLs change frequently.
Purpose:
1. Automatically gather links from multiple webpages on dynamic websites.
2. Explore multiple levels of linked pages for comprehensive data collection.
3. Adapt to websites where URLs or content change frequently.
Parameters
| Parameter | Description |
|---|---|
| Main URLs | Specifies the webpage(s) from which the scraper begins crawling. The value can be hardcoded or provided as a variable, either as a credential or a generic value. To enter multiple values, separate each URL with a comma. |
| URL Keyword Filter | Specifies keywords used to filter URLs during crawling. Only URLs containing these keywords are collected. The value can be hardcoded or provided as a variable, either as a credential or a generic value. To enter multiple values, separate each keyword with a comma. |
| Crawl Depth | Specifies how many levels of links the scraper follows from the main URL(s). Use the increase or decrease icons in the field to adjust the value. |
| Max Pages | Specifies the maximum number of pages to scrape during the process. Use the increase or decrease icons in the field to adjust the value. |
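The parameters above can be read as the controls of a breadth-first crawl: start from the main URLs, follow links up to the crawl depth, stop after the page limit, and keep only URLs matching a keyword. The sketch below is illustrative only, not the tool's actual implementation; the `crawl` and `fetch` names are hypothetical, and `fetch` stands in for a real HTTP client (e.g. `requests.get(url).text`).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(main_urls, fetch, keywords=None, crawl_depth=1, max_pages=10):
    """Breadth-first crawl from main_urls, following links up to
    crawl_depth levels and visiting at most max_pages pages.
    Only URLs containing one of the keywords (if given) are collected.
    `fetch` is a callable url -> HTML string (hypothetical stand-in
    for an HTTP client)."""
    queue = deque((u, 0) for u in main_urls)
    seen = set(main_urls)
    collected = []
    pages_visited = 0
    while queue and pages_visited < max_pages:
        url, depth = queue.popleft()
        pages_visited += 1
        if keywords is None or any(k in url for k in keywords):
            collected.append(url)
        if depth >= crawl_depth:
            continue  # crawl depth reached: do not expand further
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                queue.append((absolute, depth + 1))
    return collected

# Usage with a fake in-memory site instead of live HTTP requests.
pages = {
    "https://example.com/": '<a href="/blog/post1">p1</a><a href="/about">a</a>',
    "https://example.com/blog/post1": '<a href="/blog/post2">p2</a>',
    "https://example.com/about": "",
    "https://example.com/blog/post2": "",
}
urls = crawl(["https://example.com/"], pages.get,
             keywords=["blog"], crawl_depth=2, max_pages=10)
# → ['https://example.com/blog/post1', 'https://example.com/blog/post2']
```

Note how the keyword filter only controls which URLs are *collected*; non-matching pages (like `/about`) are still visited so their links can be explored.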
Output
Choose any of the following output formats:
| Output Type | Description |
|---|---|
| Scraped URLs | A list of collected URLs that match the specified filters. |
| URL DataFrame | Tabular representation of the scraped URLs for analysis or reporting. |
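As a rough sketch of the URL DataFrame output, the snippet below turns a list of scraped URLs into a pandas DataFrame. The column names and the derived `domain` column are illustrative assumptions, not the tool's actual schema:

```python
from urllib.parse import urlsplit

import pandas as pd

# Hypothetical scraper output: a flat list of collected URLs.
scraped_urls = [
    "https://example.com/blog/post1",
    "https://example.com/blog/post2",
]

# One row per URL, ready for analysis, filtering, or export.
url_df = pd.DataFrame({"url": scraped_urls})
url_df["domain"] = url_df["url"].apply(lambda u: urlsplit(u).netloc)
```

From here, standard DataFrame operations apply, e.g. `url_df.to_csv("urls.csv", index=False)` for reporting.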