Java / Crawlers

webmagic 🌿
9483 (+3) ⭐

A scalable web crawler framework for Java.

crawler4j 🌿
4023 (+1) ⭐

Open Source Web Crawler for Java

WebCollector 🌿
2740 (+0) ⭐

WebCollector is an open-source web crawler framework based on Java. It provides simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.

storm-crawler 🌿
685 (+0) ⭐

A scalable, mature and versatile web crawler based on Apache Storm

395 (+0) ⭐

Open-source Enterprise Grade Search Engine Software

sitemapgen4j 🌿
139 (+0) ⭐

SitemapGen4j is a library to generate XML sitemaps in Java.

woothee-java 🌿
55 (+0) ⭐

Woothee Java implementation and Hive UDF

201 (+0) ⭐

The Common Crawl Crawler Engine and Related MapReduce code (2008-2012)

TACIT 🌿
103 (+0) ⭐

We introduce TACIT: an open-source Text Analysis, Crawling and Interpretation Tool. TACIT's plugin architecture has three main components: (1) crawling plugins, (2) corpus management, and (3) analysis plugins. TACIT's open-source plugin platform allows the architecture to adapt easily to rapid developments in text analysis.

14 (+0) ⭐

REST and streaming crawlers for Twitter (Java)
