Java / Crawlers

webmagic 🌿
9095 (+3) ⭐

A scalable web crawler framework for Java.

WebCollector 🌿
2655 (+1) ⭐

WebCollector is an open-source web crawler framework for Java. It provides simple interfaces for crawling the Web, so you can set up a multi-threaded web crawler in less than 5 minutes.

crawler4j 🌿
3930 (+0) ⭐

Open Source Web Crawler for Java

storm-crawler 🌿
655 (+0) ⭐

A scalable, mature and versatile web crawler based on Apache Storm

373 (+0) ⭐

Open-source Enterprise Grade Search Engine Software

sitemapgen4j 🌿
131 (+0) ⭐

SitemapGen4j is a library to generate XML sitemaps in Java.

woothee-java 🌿
53 (+0) ⭐

Woothee Java implementation and Hive UDF

200 (+0) ⭐

The Common Crawl Crawler Engine and Related MapReduce code (2008-2012)

TACIT 🌿
102 (+0) ⭐

TACIT is an open-source Text Analysis, Crawling and Interpretation Tool. Its plugin architecture has three main components: (1) crawling plugins, (2) corpus management, and (3) analysis plugins. The open-source plugin platform lets the architecture adapt easily to rapid developments in text analysis.

14 (+0) ⭐

REST and streaming crawlers for Twitter (Java)
