Commit a7d9c8a

Update the program to allow parallel execution of the queries
Author: Fernando López
1 parent 91784bd, commit a7d9c8a

8 files changed

Lines changed: 458 additions & 52 deletions

.env

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
GITHUB_TOKEN=your_github_token_here

Cargo.toml

Lines changed: 9 additions & 5 deletions
@@ -1,10 +1,14 @@
11
[package]
2-
name = "github_statistics"
3-
version = "0.1.0"
2+
name = "github_stats"
3+
version = "0.2.0"
44
edition = "2021"
55

66
[dependencies]
7-
reqwest = { version = "0.12.8", features = ["json"] } # reqwest with JSON parsing support
8-
futures = "0.3.31" # for our async / await blocks
9-
tokio = { version = "1.40.0", features = ["full"] } # for our async runtime
7+
axum = "0.7"
108
serde = { version = "1.0", features = ["derive"] }
9+
serde_json = "1.0"
10+
reqwest = { version = "0.12", features = ["json", "gzip", "stream"] }
11+
tokio = { version = "1", features = ["full"] }
12+
futures = "0.3"
13+
dotenv = "0.15"
14+
openssl-sys = { version = "0.9", features = ["vendored"] }
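
The `tokio` and `futures` dependencies above back the parallel queries named in the commit message. The fan-out idea itself can be sketched with the standard library alone. The block below is a stand-in, not the commit's code: it uses scoped OS threads instead of `FuturesUnordered`, and `fetch_all` and the `fetch` callback are hypothetical names chosen for illustration.

```rust
use std::thread;

/// Runs `fetch` once per repository on its own scoped thread and returns the
/// results in input order. `fetch` stands in for a blocking API call
/// (hypothetical; the real program uses async tasks instead of threads).
pub fn fetch_all<F>(repos: &[&str], fetch: F) -> Vec<(String, usize)>
where
    F: Fn(&str) -> usize + Sync,
{
    // Share the callback by reference so every worker can call it.
    let fetch = &fetch;
    thread::scope(|scope| {
        // Spawn every worker first so they all run concurrently.
        let handles: Vec<_> = repos
            .iter()
            .map(|&repo| scope.spawn(move || (repo.to_string(), fetch(repo))))
            .collect();
        // Join afterwards; a panicking worker surfaces its panic here.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect::<Vec<_>>()
    })
}
```

Calling `fetch_all(&["FIWARE/apollo", "OPSILab/Idra"], |repo| repo.len())` returns one `(slug, value)` pair per repository. An async version would trade the threads for futures polled on one runtime, which scales better when most of the time is spent waiting on the network.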

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -186,7 +186,7 @@
186186
same "printed page" as the copyright notice for easier
187187
identification within third-party archives.
188188

189-
Copyright [yyyy] [name of copyright owner]
189+
Copyright 2025 FIWARE Foundation e.V.
190190

191191
Licensed under the Apache License, Version 2.0 (the "License");
192192
you may not use this file except in compliance with the License.

README.md

Lines changed: 0 additions & 2 deletions
This file was deleted.

Readme.md

Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
1+
# GitHub Statistics Collector
2+
3+
A high-performance, asynchronous Rust application that collects engagement statistics from FIWARE GitHub repositories — including **stargazers**, **contributors**, **forks**, **watchers**, and **issue authors**.
4+
5+
This project parallelizes API calls across repositories and forks using **Tokio** and **FuturesUnordered**, ensuring efficient data retrieval even for large organizations such as FIWARE.
6+
7+
---
8+
9+
## 🚀 Features
10+
11+
- ✅ Collects GitHub repository data:
12+
- Stargazers (users who starred)
13+
- Contributors (including forked repos)
14+
- Watchers (subscribers)
15+
- Issue authors
16+
- ⚡ Fully asynchronous using `tokio` and `reqwest`
17+
- 🔀 Parallel processing across multiple repositories and forks
18+
- 🧱 Built with the Axum web framework (includes a simple HTTP status route)
19+
- 🕒 Handles GitHub rate limits gracefully (auto-sleep on 403)
20+
- 🔐 Secure GitHub API access via personal token
21+
22+
---
23+
24+
## 🧩 Tech Stack
25+
26+
| Component | Purpose |
27+
|------------|----------|
28+
| **Rust** | Core programming language |
29+
| **Axum** | Lightweight async web server |
30+
| **Reqwest** | HTTP client for GitHub API calls |
31+
| **Tokio** | Async runtime |
32+
| **Futures** | Parallel async task management |
33+
| **Serde / Serde JSON** | JSON parsing and serialization |
34+
| **Dotenv** | Environment variable management |
35+
36+
---
37+
38+
## 🧰 Installation & Setup
39+
40+
### 1. Prerequisites
41+
- Rust (v1.70+)
42+
- A GitHub Personal Access Token (read-only access to public repositories is sufficient)
43+
- `cargo` build tool
44+
45+
### 2. Clone the repository
46+
```bash
47+
git clone https://github.com/yourusername/github-stats-collector.git
48+
cd github-stats-collector
49+
```
50+
51+
### 3. Create your .env file
52+
53+
```bash
54+
GITHUB_TOKEN=your_github_token_here
55+
```
56+
57+
The program only reads public repository data, so it does not need write
58+
or admin permissions. There are two ways to generate the token: a
59+
Personal Access Token (classic) or a fine-grained token (recommended by
60+
GitHub).
61+
62+
When creating a Personal Access Token (classic):
63+
64+
- Go to GitHub Settings → Developer Settings → Personal Access Tokens
65+
→ Tokens (classic)
66+
- Click “Generate new token (classic)”. Set:
67+
- Expiration: reasonable (e.g., 90 days or 1 year).
68+
- Scopes: leave every box unchecked (a classic token with no scopes
68+
already has read-only access to public repository data)
70+
71+
Copy the generated token and save it securely. If you prefer Fine-grained
72+
Tokens (recommended by GitHub):
73+
74+
- Choose “Fine-grained personal access token”
75+
- Under Repository access, select “All public repositories”
76+
- Under Permissions → Repository Permissions, set:
77+
- Metadata → Read-only
78+
- Contents → Read-only
79+
- Issues → Read-only (for issue authors)
80+
- Pull requests → Read-only (optional)
81+
- No other permissions are required.
82+
83+
### 4. Define the repositories to analyze
84+
85+
Create a repos.json file in the project root:
86+
```json
87+
[
88+
"FIWARE/context.Orion-LD",
89+
"FIWARE/tutorials.Step-by-Step"
90+
]
91+
```
92+
93+
### 5. Run the collector
94+
```bash
95+
cargo run
96+
```
97+
98+
## 📊 Example Output
99+
100+
```text
101+
Fetching stats for FIWARE/context.Orion-LD...
102+
[FIWARE/context.Orion-LD] Stargazers: 20, Developers: 10, Total Users: 45
103+
Fetching stats for FIWARE/tutorials.Step-by-Step...
104+
[FIWARE/tutorials.Step-by-Step] Stargazers: 35, Developers: 15, Total Users: 60
105+
106+
Total FIWARE users: 90
107+
Total FIWARE developers: 22
108+
```
109+
110+
## 🌐 Web Endpoint
111+
112+
The project includes a minimal Axum server that exposes a health-check route:
113+
114+
http://localhost:8080/
115+
116+
117+
Response:
118+
119+
GitHub Stats Collector Running 🚀
120+
121+
## 🧩 Directory Structure
122+
123+
github-stats-collector/
124+
├── Cargo.toml
125+
├── src/
126+
│ └── main.rs
127+
├── repos.json
128+
├── .env
129+
├── Readme.md
130+
└── Roadmap.md
131+
132+
## Roadmap
133+
134+
For an overview of the roadmap defined for this component, please
135+
see the [Roadmap.md](./Roadmap.md) document.
136+
137+
## 🤝 Contributions
138+
139+
Pull requests, feature suggestions, and improvements are welcome!
140+
Please open an issue before submitting major changes.
141+
142+
## 📧 Contact
143+
144+
Maintained by [Your Name or Organization]
145+
If you have questions, reach out via GitHub Issues or email.
146+
147+
## ⚖️ License
148+
149+
This project is licensed under the [Apache 2.0 License](./LICENSE).
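
In the README's example output, the overall total (90) is lower than the sum of the per-repository totals (45 + 60). That suggests each user is counted once across repositories, though the README does not say so. A minimal sketch of that kind of aggregation follows. The map shape and both function names are assumptions made for illustration.

```rust
use std::collections::{BTreeMap, HashSet};

/// Counts distinct logins across all repositories.
/// `per_repo` maps a repository slug to the logins seen for it
/// (an illustrative shape, not taken from the program).
pub fn total_distinct_users(per_repo: &BTreeMap<String, Vec<String>>) -> usize {
    let mut seen: HashSet<&str> = HashSet::new();
    for logins in per_repo.values() {
        for login in logins {
            // Inserting an already-seen login leaves the set unchanged.
            seen.insert(login.as_str());
        }
    }
    seen.len()
}

/// Sums the per-repository list lengths without de-duplicating, for comparison.
pub fn naive_total(per_repo: &BTreeMap<String, Vec<String>>) -> usize {
    per_repo.values().map(|logins| logins.len()).sum()
}
```

With two repositories that share one user, `naive_total` counts that user twice while `total_distinct_users` counts them once. The comparison here is case-sensitive, so a real implementation might normalize logins first.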

Roadmap.md

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
1+
# Project Roadmap
2+
3+
This document outlines the planned improvements and future goals for the **GitHub Statistics Collector**.
4+
5+
---
6+
7+
## ✅ Current Status
8+
9+
- Parallelized GitHub API calls for:
10+
- Stargazers
11+
- Contributors (including forks)
12+
- Watchers
13+
- Issue creators
14+
- Automatic rate-limit handling (1-hour cooldown)
15+
- Minimal Axum HTTP service
16+
- CLI-based data aggregation for multiple repositories
17+
- JSON configuration support (`repos.json`)
18+
19+
---
20+
21+
## 🧭 Short-Term Goals (Q4 2025)
22+
23+
| Goal | Description | Status |
24+
|------|--------------|--------|
25+
| **1. REST API for results** | Expose results as `/stats` JSON endpoint with repository-level breakdowns. | 🔜 Planned |
26+
| **2. Caching layer** | Store fetched results to minimize redundant API calls and respect rate limits. | 🔜 Planned |
27+
| **3. Configurable concurrency** | Allow tuning of async task parallelism (e.g., via `TOKIO_MAX_CONCURRENCY`). | 🔜 Planned |
28+
| **4. Improved error handling** | Replace panics with structured error responses using `anyhow` or `thiserror`. | ⚙️ In progress |
29+
30+
---
31+
32+
## 🧱 Medium-Term Goals (2026)
33+
34+
| Goal | Description | Status |
35+
|------|--------------|--------|
36+
| **1. Web Dashboard** | Build a small frontend (React + Axum API) to visualize collected statistics. | 🔜 Planned |
37+
| **2. Data persistence** | Store collected data in SQLite or Postgres using `sqlx`. | 🔜 Planned |
38+
| **3. GitHub Actions Integration** | Automate daily stats collection and store results in repository artifacts. | 🔜 Planned |
39+
| **4. CSV/JSON Export** | Add endpoints or CLI flags to export aggregated results. | 🔜 Planned |
40+
| **5. Parallel rate-limit management** | Implement intelligent queueing with per-token rate tracking. | 🔜 Planned |
41+
42+
---
43+
44+
## 🌍 Long-Term Vision
45+
46+
- Build a **“GitHub Insights Service”** that aggregates ecosystem-wide engagement metrics for open-source projects (especially FIWARE-related).
47+
- Offer both **CLI** and **API** modes for automation.
48+
- Integrate authentication and user dashboards.
49+
- Support other platforms (GitLab, Bitbucket) through modular API adapters.
50+
51+
---
52+
53+
## 🧠 Potential Enhancements
54+
55+
- [ ] Add tracing/logging with `tracing` crate and `flexi_logger`
56+
- [ ] Include unit tests and integration tests with mocked GitHub API
57+
- [ ] Provide Docker container for easy deployment
58+
- [ ] Generate periodic reports (PDF/HTML) using a scheduler
59+
- [ ] Integrate with FIWARE’s internal analytics system
60+
61+
---
62+
63+
## 💬 Feedback & Collaboration
64+
65+
Suggestions and contributions are highly appreciated.
66+
Open a GitHub issue or reach out if you want to collaborate on:
67+
68+
- API design
69+
- Dashboard UI
70+
- Data persistence
71+
- Performance optimizations
72+
73+
---
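
The roadmap lists a fixed 1-hour cooldown today and per-token rate tracking as a later goal. One possible intermediate step is to size the pause from GitHub's `x-ratelimit-remaining` and `x-ratelimit-reset` response headers. The sketch below assumes the caller has already parsed those headers into integers. The `cooldown` function and its parameters are hypothetical. It also differs from the documented behavior in that it does not sleep on a 403 that still has quota left.

```rust
use std::time::Duration;

/// Fallback wait when no reset time is available (the roadmap's 1-hour cooldown).
pub const FALLBACK_COOLDOWN: Duration = Duration::from_secs(3600);

/// Decides how long to pause after a response (hypothetical helper).
/// - `status`: HTTP status code
/// - `remaining`: parsed `x-ratelimit-remaining` header, if present
/// - `reset_epoch`: parsed `x-ratelimit-reset` header (UTC epoch seconds), if present
/// - `now_epoch`: current time in UTC epoch seconds
/// Returns `None` when the response does not look rate limited.
pub fn cooldown(
    status: u16,
    remaining: Option<u64>,
    reset_epoch: Option<u64>,
    now_epoch: u64,
) -> Option<Duration> {
    // A 403 or 429 with quota left is some other failure, not a rate limit.
    let limited = (status == 403 || status == 429) && remaining == Some(0);
    if !limited {
        return None;
    }
    match reset_epoch {
        // Wait until the advertised reset, plus one second of slack.
        Some(reset) if reset > now_epoch => Some(Duration::from_secs(reset - now_epoch + 1)),
        // The reset time has already passed: retry almost immediately.
        Some(_) => Some(Duration::from_secs(1)),
        // No reset header: fall back to the fixed cooldown.
        None => Some(FALLBACK_COOLDOWN),
    }
}
```

Keeping the decision in a pure function like this makes it testable without network access. The caller would pass the returned duration to whatever sleep primitive its runtime provides.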

data/repos.json

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
1+
[
2+
"telefonicaid/fiware-orion",
3+
"FIWARE/context.Orion-LD",
4+
"ScorpioBroker/ScorpioBroker",
5+
"stellio-hub/stellio-context-broker",
6+
"Ficodes/ngsijs",
7+
"telefonicaid/fiware-sth-comet",
8+
"telefonicaid/fiware-cygnus",
9+
"ging/fiware-draco",
10+
"ging/fiware-cosmos-orion-flink-connector",
11+
"ging/fiware-cosmos-orion-spark-connector",
12+
"orchestracities/ngsi-timeseries-api",
13+
"Engineering-Research-and-Development/fiware-orion-pyspark-connector",
14+
"FIWARE/CanisMajor",
15+
"FIWARE/apollo",
16+
"Engineering-Research-and-Development/o2k-connector",
17+
"telefonicaid/iotagent-json",
18+
"telefonicaid/lightweightm2m-iotagent",
19+
"telefonicaid/iotagent-ul",
20+
"Atos-Research-and-Innovation/IoTagent-LoRaWAN",
21+
"telefonicaid/sigfox-iotagent",
22+
"FIWARE/iotagent-isoxml",
23+
"telefonicaid/iotagent-node-lib",
24+
"OpenMTC/OpenMTC",
25+
"Kurento/kurento-media-server",
26+
"OpenVidu/openvidu",
27+
"eProsima/Fast-DDS",
28+
"eProsima/Micro-XRCE-DDS",
29+
"iml130/firos",
30+
"Engineering-Research-and-Development/iotagent-opcua",
31+
"yalewkidane/FIWARE_EPCIS_Mediation_Gateway",
32+
"Wirecloud/wirecloud",
33+
"smartfog/fogflow",
34+
"telefonicaid/perseo-fe",
35+
"telefonicaid/perseo-core",
36+
"ging/fiware-idm",
37+
"ging/fiware-pep-proxy",
38+
"authzforce/server",
39+
"Engineering-Research-and-Development/fiware-true-connector",
40+
"telefonicaid/fiware-pep-steelskin",
41+
"telefonicaid/fiware-keypass",
42+
"telefonicaid/fiware-keystone-scim",
43+
"telefonicaid/fiware-keystone-spassword",
44+
"FIWARE/trusted-issuers-list",
45+
"FIWARE/dsba-pdp",
46+
"FIWARE/VCVerifier",
47+
"FIWARE/keycloak-vc-issuer",
48+
"FIWARE/credentials-config-service",
49+
"FIWARE/trusted-issuers-registry",
50+
"orchestracities/anubis",
51+
"conwetlab/FIWARE-CKAN-Extensions",
52+
"OPSILab/Idra",
53+
"FIWARE-TMForum/Business-API-Ecosystem",
54+
"FIWARE-TMForum/business-ecosystem-charging-backend",
55+
"FIWARE-TMForum/business-ecosystem-logic-proxy",
56+
"FIWARE-TMForum/business-ecosystem-rss",
57+
"coatrack/coatrack",
58+
"FIWARE/endpoint-auth-service",
59+
"FIWARE/kong-plugins-fiware",
60+
"FIWARE/helm-charts"
61+
]
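
Every entry in this list follows the `owner/name` form that GitHub API paths use. A loader could check that shape before issuing any requests. The sketch below is illustrative only: `RepoSlug`, `parse_slug`, and `partition_slugs` are hypothetical names, and JSON decoding is left to the caller.

```rust
/// A repository reference split into its owner and name parts.
#[derive(Debug, PartialEq)]
pub struct RepoSlug<'a> {
    pub owner: &'a str,
    pub name: &'a str,
}

/// Parses an "owner/name" entry (hypothetical helper, not taken from the commit).
/// Returns `None` unless there is exactly one '/' with non-empty text on both sides.
pub fn parse_slug(entry: &str) -> Option<RepoSlug<'_>> {
    let (owner, name) = entry.trim().split_once('/')?;
    if owner.is_empty() || name.is_empty() || name.contains('/') {
        return None;
    }
    Some(RepoSlug { owner, name })
}

/// Splits a list of entries into valid slugs and rejected raw strings.
pub fn partition_slugs<'a>(entries: &[&'a str]) -> (Vec<RepoSlug<'a>>, Vec<&'a str>) {
    let mut valid = Vec::new();
    let mut rejected = Vec::new();
    for &entry in entries {
        match parse_slug(entry) {
            Some(slug) => valid.push(slug),
            None => rejected.push(entry),
        }
    }
    (valid, rejected)
}
```

Rejected entries are returned instead of dropped so the caller can report them all at once before any API quota is spent.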
