- Maven 3
- JDK 21
mvn clean install
- Your crawler should extend the WebCrawler base class
- The DTO class that describes the collected data should implement the CrawlerData marker interface
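The two requirements above can be sketched roughly as follows. The real signatures of WebCrawler and CrawlerData are not shown in this README, so the stand-in declarations below (the generic parameter, the `crawl` method, and the `ExampleData` and `ExampleCrawler` names) are assumptions made only so that the snippet compiles on its own. Check the actual base class before copying anything.

```java
import java.util.List;

// Hypothetical stand-in for the project's marker interface (no methods).
interface CrawlerData {}

// Hypothetical stand-in for the project's base class; the real one
// may expose different methods and type parameters.
abstract class WebCrawler<T extends CrawlerData> {
    abstract List<T> crawl(String query);
}

// DTO describing one collected item; it implements the marker interface.
record ExampleData(String title, String url) implements CrawlerData {}

// A crawler extends the base class and returns its own DTO type.
class ExampleCrawler extends WebCrawler<ExampleData> {
    @Override
    List<ExampleData> crawl(String query) {
        // A real crawler would fetch and parse pages here;
        // this sketch returns one fixed item.
        return List.of(new ExampleData("Result for " + query, "https://example.org/1"));
    }
}
```

Using a marker interface lets shared code accept any DTO without knowing its fields.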
Crawler for the Orthodox torrent tracker pravtor.ru
See PravtorRuWebCrawler for details.
To run a search, use the run-search script in the pravtor.ru-crawler folder.
Collected data is written to the result.xls file in the sandbox folder.
Crawler for the job vacancy aggregator rabota.by
See RabotaByWebCrawler for details.
To run a search, use the run-search script in the rabota.by-crawler folder.
Crawler for the Onlíner CPU catalog (AM4)
Check OnlinerByCpuCrawler for details.
It reads the JSON-LD ItemList from catalog pages (filters: socket_cpu[0]=am4, price[from]=1).
To run it, use the run-search script in the onliner.by-crawler folder after running mvn package (the output JSON path and optional arguments are set in the script).
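As an illustration of the catalog filters and the JSON-LD step, here is a minimal sketch that uses only the JDK. It is not the code from OnlinerByCpuCrawler: the class name `JsonLdSketch`, both method names, and the regex-based extraction are assumptions. A real implementation would more likely hand each extracted block to a JSON library to read the ItemList entries.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project.
class JsonLdSketch {
    // Matches the body of <script type="application/ld+json"> elements.
    private static final Pattern JSON_LD = Pattern.compile(
        "<script[^>]*type=[\"']application/ld\\+json[\"'][^>]*>(.*?)</script>",
        Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Builds a URL-encoded query string, e.g. for socket_cpu[0]=am4.
    // Brackets in parameter names are percent-encoded.
    static String buildQuery(Map<String, String> filters) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : filters.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Returns the raw JSON text of every JSON-LD block found in the page.
    static List<String> extractJsonLd(String html) {
        List<String> blocks = new ArrayList<>();
        Matcher m = JSON_LD.matcher(html);
        while (m.find()) {
            blocks.add(m.group(1).trim());
        }
        return blocks;
    }
}
```

Pass an insertion-ordered map (such as LinkedHashMap) to `buildQuery` if the parameter order matters.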