Commit 0091daa

Add canonical docs for JanusQL and baselines
1 parent 6a83929 commit 0091daa

7 files changed

Lines changed: 656 additions & 190 deletions

docs/ANOMALY_DETECTION.md

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
# Anomaly Detection

Janus already supports anomaly-oriented extension functions, but they are stateless functions evaluated within one query execution context.

That distinction matters.

## What Extension Functions Are Good At

Current extension functions are sufficient for:

- fixed thresholds
- relative change checks
- z-score style checks when mean and sigma are already present
- simple outlier or divergence predicates over current bindings

This works well when the query already has everything it needs in one evaluation context.
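
To make "stateless" concrete, here is a minimal Python sketch of the three kinds of check listed above. It is not Janus code: the function names are hypothetical, and the `is_outlier` semantics (distance from the mean greater than `k` sigmas) are an assumption about what a z-score style predicate would do, not a statement of how `janus:is_outlier` is implemented.

```python
def exceeds_threshold(value: float, limit: float) -> bool:
    """Fixed threshold: true when the reading is strictly above a constant limit."""
    return value > limit


def relative_change(current: float, previous: float) -> float:
    """Fractional change from previous to current; previous must be non-zero."""
    return (current - previous) / abs(previous)


def is_outlier(reading: float, mean: float, sigma: float, k: float = 3.0) -> bool:
    """Z-score style check (assumed semantics): more than k sigmas from the mean."""
    if sigma <= 0:
        return False  # no spread information, so nothing can be scored
    return abs(reading - mean) > k * sigma
```

Each function depends only on its arguments, which is why checks like these fit inside a single evaluation context: the caller must already hold `mean` and `sigma` as bindings.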

## Where Baselines Help

Baselines help when live anomaly scoring depends on historical context such as:

- deviation from normal behavior
- per-sensor baselines
- volatility comparison
- recent historical trend

In those cases, Janus can bootstrap compact historical values into live static data and let the live query compare current readings against them.

## What Janus Does Not Do

Janus does not currently maintain a full continuously updated hybrid historical/live relation.

So if you need:

- long-running stateful models
- full seasonal context
- large retained historical buffers inside the engine

you will need one of the following:

- external model state
- future dedicated baseline refresh logic
- more specialized stateful operators

## Recommended Pattern

For a first anomaly-detection pipeline in Janus:

1. Use a historical query that emits one compact row per anchor.
2. Materialize baseline values such as `mean` and `sigma`.
3. Join those values in the live query using `baseline:*` predicates.
4. Apply extension functions on the live side.
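
The four steps can be sketched outside Janus in a few lines of Python. This is an illustration of the data flow only: `build_baselines` and `score_live` are hypothetical names, the use of the population standard deviation is an assumption, and the 3-sigma rule stands in for whatever extension function the live query applies.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def build_baselines(history: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Steps 1-2: emit one compact (mean, sigma) row per sensor anchor."""
    return {
        sensor: (mean(values), pstdev(values))
        for sensor, values in history.items()
        if values
    }


def score_live(
    readings: List[Tuple[str, float]],
    baselines: Dict[str, Tuple[float, float]],
    k: float = 3.0,
) -> List[Tuple[str, float]]:
    """Steps 3-4: join on sensor, then keep readings more than k sigmas from the mean."""
    flagged = []
    for sensor, reading in readings:
        if sensor not in baselines:
            continue  # no baseline for this anchor, so the join yields no match
        mu, sigma = baselines[sensor]
        if sigma > 0 and abs(reading - mu) > k * sigma:
            flagged.append((sensor, reading))
    return flagged
```

Note that a sensor without a baseline row is silently skipped, which mirrors how a join against missing static data produces no result rather than an error.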

Example:

```sparql
PREFIX ex: <http://example.org/>
PREFIX janus: <https://janus.rs/fn#>
PREFIX baseline: <https://janus.rs/baseline#>

REGISTER RStream ex:out AS
SELECT ?sensor ?reading
FROM NAMED WINDOW ex:hist ON LOG ex:store [START 1700000000000 END 1700003600000]
FROM NAMED WINDOW ex:live ON STREAM ex:stream1 [RANGE 5000 STEP 1000]
USING BASELINE ex:hist LAST
WHERE {
  WINDOW ex:hist {
    ?sensor ex:mean ?mean .
    ?sensor ex:sigma ?sigma .
  }
  WINDOW ex:live {
    ?sensor ex:hasReading ?reading .
  }
  ?sensor baseline:mean ?mean .
  ?sensor baseline:sigma ?sigma .
  FILTER(janus:is_outlier(?reading, ?mean, ?sigma, 3))
}
```

## Choosing LAST vs AGGREGATE

- Use `LAST` when you care about the most recent historical regime before live execution.
- Use `AGGREGATE` when you want a more stable summary across multiple historical sliding windows.
- Prefer fixed historical windows unless you have a clear reason to derive a baseline from many historical subwindows.

docs/BASELINES.md

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
# Baselines

Baseline support in Janus is meant for hybrid anomaly-style queries where historical data initializes context for live scoring.

It is not a full hybrid-state engine.

## What Baseline Bootstrap Does

When a query has:

- at least one historical window
- at least one live window
- a baseline-aware query shape, typically with `baseline:*` joins

Janus can evaluate the historical side, collapse the result into compact baseline statements, and insert those statements into the live processor as static data.

The live query then joins against those static triples.

## How It Is Enabled

Preferred query-level form:

```sparql
USING BASELINE ex:hist LAST
```

or:

```sparql
USING BASELINE ex:hist AGGREGATE
```

If the clause is missing, registration can still supply the mode as a fallback:

- `baseline_mode = aggregate`
- `baseline_mode = last`

The query-level clause takes precedence when present.
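
The precedence rule can be written down as a small resolution function. This is a sketch, not the Janus implementation: `resolve_baseline_mode` is a hypothetical name, and returning `None` when neither source specifies a mode is an assumption, since the behavior in that case is not described here.

```python
from typing import Optional

VALID_MODES = {"last", "aggregate"}


def resolve_baseline_mode(
    query_clause_mode: Optional[str],
    registration_mode: Optional[str],
) -> Optional[str]:
    """Pick the baseline mode, preferring the USING BASELINE clause over registration."""
    for candidate in (query_clause_mode, registration_mode):
        if candidate is not None:
            mode = candidate.lower()
            if mode not in VALID_MODES:
                raise ValueError(f"unknown baseline mode: {candidate!r}")
            return mode
    return None  # assumption: no mode given anywhere means no baseline bootstrap
```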

## LAST vs AGGREGATE

### LAST

For a historical sliding window:

- only the final sliding-window result snapshot is retained
- earlier window outputs are discarded for baseline collapse

This is useful when you want:

- the most recent historical regime
- a low-ambiguity startup baseline

### AGGREGATE

For a historical sliding window:

- all historical sliding-window outputs are folded into one compact baseline
- numeric values are averaged per `(anchor, variable)`
- non-numeric values fall back to the latest seen value

This is useful when you want:

- a broader recent historical summary
- less sensitivity to the last historical subwindow
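
The difference between the two modes is easiest to see on a tiny input. The Python sketch below models each sliding-window output as a list of rows (dicts of variable name to value) and collapses them both ways. The data model and the names `collapse_last` and `collapse_aggregate` are assumptions made for illustration; only the collapse rules themselves come from the description above.

```python
from numbers import Real
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]            # variable name -> bound value, including the anchor
Snapshot = List[Row]            # rows produced by one historical window evaluation
Baseline = Dict[Tuple[str, str], Any]  # (anchor, variable) -> baseline value


def _is_numeric(value: Any) -> bool:
    return isinstance(value, Real) and not isinstance(value, bool)


def collapse_last(snapshots: List[Snapshot], anchor_var: str = "sensor") -> Baseline:
    """LAST: keep only the values from the final snapshot."""
    baseline: Baseline = {}
    if not snapshots:
        return baseline
    for row in snapshots[-1]:
        anchor = row[anchor_var]
        for var, value in row.items():
            if var != anchor_var:
                baseline[(anchor, var)] = value
    return baseline


def collapse_aggregate(snapshots: List[Snapshot], anchor_var: str = "sensor") -> Baseline:
    """AGGREGATE: average numeric values per (anchor, variable); else keep the latest."""
    sums: Dict[Tuple[str, str], float] = {}
    counts: Dict[Tuple[str, str], int] = {}
    latest: Baseline = {}
    for snapshot in snapshots:
        for row in snapshot:
            anchor = row[anchor_var]
            for var, value in row.items():
                if var == anchor_var:
                    continue
                key = (anchor, var)
                if _is_numeric(value):
                    sums[key] = sums.get(key, 0.0) + value
                    counts[key] = counts.get(key, 0) + 1
                else:
                    latest[key] = value  # later snapshots overwrite earlier ones
    baseline = dict(latest)
    for key, total in sums.items():
        baseline[key] = total / counts[key]
    return baseline
```

In this model, each `(anchor, variable)` entry of the result would correspond to one compact baseline statement, for example a `baseline:mean` value for one sensor.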

## Fixed Historical Windows

For a fixed historical window, the distinction between `LAST` and `AGGREGATE` is much smaller because there is only one historical result set.

In practice:

- a fixed historical baseline is usually the simplest and clearest baseline path
- a historical sliding baseline is more advanced and can cost more at startup

## Async Warm-Up

Janus now warms baseline state asynchronously.

Behavior:

1. live execution starts immediately
2. query status becomes `WarmingBaseline`
3. baseline bootstrap runs in a background thread
4. baseline triples are inserted into live static data
5. query status moves to `Running`

Effect on query results:

- a live query that depends on baseline joins typically produces no matches until the baseline is ready
- once baseline static data exists, future live evaluations can match those joins
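
The warm-up sequence and its effect on results can be modelled with a background thread. The Python sketch below is a toy model under stated assumptions: the `LiveQuery` class, its methods, and the initial `Registered` status are hypothetical, and the triple layout is simplified. Only the status names `WarmingBaseline` and `Running` and the ordering of steps come from the list above.

```python
import threading
from typing import Callable, List, Tuple

Triple = Tuple[str, str, float]  # (subject, predicate, numeric object)


class LiveQuery:
    """Toy model of a live query whose baseline arrives asynchronously."""

    def __init__(self) -> None:
        self.status = "Registered"  # assumed initial status
        self.static_data: List[Triple] = []
        self._lock = threading.Lock()

    def start(self, bootstrap: Callable[[], List[Triple]]) -> threading.Thread:
        # Steps 1-2: live evaluation may begin at once; status shows a pending baseline.
        self.status = "WarmingBaseline"
        worker = threading.Thread(target=self._warm, args=(bootstrap,), daemon=True)
        worker.start()
        return worker

    def _warm(self, bootstrap: Callable[[], List[Triple]]) -> None:
        triples = bootstrap()  # step 3: historical evaluation in the background
        with self._lock:
            self.static_data.extend(triples)  # step 4: insert baseline triples
            self.status = "Running"           # step 5

    def evaluate(self, sensor: str, reading: float) -> List[Tuple[str, float, float]]:
        """Join one live reading against the baseline mean for the same sensor."""
        with self._lock:
            means = [o for (s, p, o) in self.static_data
                     if s == sensor and p == "baseline:mean"]
        return [(sensor, reading, m) for m in means]
```

Calling `evaluate` before the worker finishes returns an empty list, which is the "no matches until the baseline is ready" behavior; the same call after warm-up returns the joined row.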

## What Janus Stores

Janus does not retain all historical events or all historical sliding-window outputs as permanent runtime state.

For baseline bootstrap it retains:

- a compact accumulator keyed by `(anchor, variable)` during bootstrap
- then final static baseline triples inside live processing

It does not retain:

- all raw historical events in memory
- all sliding-window result batches after bootstrap
- a continuously merged historical/live relation

## Anchor Selection

Baseline values are materialized per anchor subject.

The current implementation prefers binding variables named:

- `sensor`
- `subject`
- `entity`
- `s`

If none of those exist, Janus falls back to the first IRI-like binding it can find.

This means historical baseline queries work best when they explicitly return a stable anchor variable such as `?sensor`.
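
A rough Python sketch of this selection rule follows. It makes two assumptions that are not confirmed here: that the preferred names are tried in the order listed, and that "IRI-like" can be approximated by a simple scheme-prefix test. `select_anchor` and `looks_like_iri` are hypothetical names.

```python
from typing import Dict, Optional

# Assumption: names are tried in the order they are listed in the documentation.
PREFERRED_ANCHOR_NAMES = ("sensor", "subject", "entity", "s")


def looks_like_iri(value: object) -> bool:
    """Crude IRI test (an assumption): a string with a common scheme, optionally in <...>."""
    if not isinstance(value, str):
        return False
    text = value.strip("<>")
    return text.startswith(("http://", "https://", "urn:"))


def select_anchor(bindings: Dict[str, object]) -> Optional[str]:
    """Return the name of the variable to use as the baseline anchor, or None."""
    for name in PREFERRED_ANCHOR_NAMES:
        if name in bindings:
            return name
    for name, value in bindings.items():
        if looks_like_iri(value):
            return name  # fallback: first IRI-like binding in iteration order
    return None
```

The fallback depends on binding order, which is one reason an explicit `?sensor` variable gives more predictable results than relying on it.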

## Recommended Usage

- Prefer fixed historical windows first.
- Use historical sliding windows only when you need a baseline derived from multiple historical subwindows.
- Keep baseline queries compact, ideally one row per anchor.
- Start with baseline values such as `mean` and `sigma`; add `slope` or quantiles later if needed.

docs/DOCUMENTATION_INDEX.md

Lines changed: 62 additions & 87 deletions
@@ -1,87 +1,62 @@
-# Janus HTTP API - Documentation Index
-
-## Getting Started
-
-1. **START_HERE.md** - 🚀 BEGIN HERE - Quick start guide
-2. **scripts/test_setup.sh** - Automated setup script
-3. **docker-compose.yml** - MQTT broker configuration
-
-## Quick Reference
-
-4. **QUICK_REFERENCE.md** - One-page cheat sheet
-5. **FINAL_TEST.md** - Test verification steps
-6. **RUNTIME_FIX_SUMMARY.md** - Runtime panic fix explanation
-
-## Complete Guides
-
-7. **SETUP_GUIDE.md** - Comprehensive setup with MQTT
-8. **README_HTTP_API.md** - Complete API documentation
-9. **COMPLETE_SOLUTION.md** - Full implementation details
-10. **HTTP_API_IMPLEMENTATION.md** - Technical architecture
-
-## Code
-
-11. **src/http/server.rs** - HTTP server implementation (537 lines)
-12. **src/http/mod.rs** - Module exports
-13. **src/bin/http_server.rs** - Server binary (111 lines)
-14. **examples/http_client_example.rs** - Client example (370 lines)
-15. **examples/demo_dashboard.html** - Interactive dashboard (670 lines)
-
-## Configuration
-
-16. **docker/mosquitto/config/mosquitto.conf** - MQTT broker config
-17. **Cargo.toml** - Dependencies (axum, tower-http, tokio-tungstenite, etc.)
-
-## How to Use This Documentation
-
-### If you're brand new:
-→ Read **START_HERE.md**
-
-### If you want quick commands:
-→ Read **QUICK_REFERENCE.md**
-
-### If you see runtime panics:
-→ Read **RUNTIME_FIX_SUMMARY.md**
-
-### If you need detailed setup:
-→ Read **SETUP_GUIDE.md**
-
-### If you want to understand the API:
-→ Read **README_HTTP_API.md**
-
-### If you need implementation details:
-→ Read **COMPLETE_SOLUTION.md** or **HTTP_API_IMPLEMENTATION.md**
-
-### If you want to verify everything works:
-→ Follow **FINAL_TEST.md**
-
-## File Sizes
-
-```
-START_HERE.md ~1 KB (Quick start)
-QUICK_REFERENCE.md ~2 KB (Cheat sheet)
-RUNTIME_FIX_SUMMARY.md ~3 KB (Fix explanation)
-FINAL_TEST.md ~3 KB (Testing guide)
-SETUP_GUIDE.md ~18 KB (Detailed setup)
-README_HTTP_API.md ~15 KB (API guide)
-COMPLETE_SOLUTION.md ~9 KB (Solution summary)
-HTTP_API_IMPLEMENTATION.md ~19 KB (Technical details)
-
-src/http/server.rs ~15 KB (Server code)
-examples/demo_dashboard.html ~20 KB (Dashboard)
-examples/http_client_example.rs ~11 KB (Client example)
-```
-
-## Priority Reading Order
-
-1. START_HERE.md
-2. QUICK_REFERENCE.md
-3. SETUP_GUIDE.md (if needed)
-4. README_HTTP_API.md (for API details)
-
-The rest are reference materials for specific needs.
-
----
-
-**Total: ~115 KB of documentation + ~50 KB of code**
-**Everything you need to use Janus HTTP API successfully!**
+# Janus Documentation Index
+
+This is the shortest path to understanding the current Janus implementation.
+
+## Core Reading Order
+
+1. [../README.md](../README.md)
+2. [JANUSQL.md](./JANUSQL.md)
+3. [QUERY_EXECUTION.md](./QUERY_EXECUTION.md)
+4. [BASELINES.md](./BASELINES.md)
+5. [HTTP_API_CURRENT.md](./HTTP_API_CURRENT.md)
+6. [ANOMALY_DETECTION.md](./ANOMALY_DETECTION.md)
+
+## What Each File Covers
+
+- [JANUSQL.md](./JANUSQL.md)
+  - query structure
+  - supported window types
+  - `USING BASELINE <window> LAST|AGGREGATE`
+  - how live and historical queries are derived
+
+- [QUERY_EXECUTION.md](./QUERY_EXECUTION.md)
+  - registration and parsed metadata
+  - `start_query()` flow
+  - historical workers
+  - live workers and MQTT subscription
+  - result multiplexing and runtime status
+
+- [BASELINES.md](./BASELINES.md)
+  - what baseline bootstrap does
+  - `LAST` vs `AGGREGATE`
+  - async warm-up behavior
+  - what state is and is not retained
+
+- [HTTP_API_CURRENT.md](./HTTP_API_CURRENT.md)
+  - current REST endpoints
+  - WebSocket result flow
+  - request and response shapes
+  - `baseline_mode` registration fallback
+
+- [ANOMALY_DETECTION.md](./ANOMALY_DETECTION.md)
+  - when extension functions are enough
+  - when baseline state helps
+  - recommended query patterns
+
+## Legacy Material
+
+The following files remain useful as background, but they are not the main entrypoint for the current code:
+
+- [ARCHITECTURE.md](./ARCHITECTURE.md)
+- [EXECUTION_ARCHITECTURE.md](./EXECUTION_ARCHITECTURE.md)
+- [HTTP_API.md](./HTTP_API.md)
+- [README_HTTP_API.md](./README_HTTP_API.md)
+- [SETUP_GUIDE.md](./SETUP_GUIDE.md)
+
+## Related Code
+
+- [../src/parsing/janusql_parser.rs](../src/parsing/janusql_parser.rs)
+- [../src/api/janus_api.rs](../src/api/janus_api.rs)
+- [../src/http/server.rs](../src/http/server.rs)
+- [../src/stream/live_stream_processing.rs](../src/stream/live_stream_processing.rs)
+- [../src/execution/historical_executor.rs](../src/execution/historical_executor.rs)
