Integrating clickhouse for metrics.

This commit is contained in:
2026-04-06 23:05:04 -04:00
parent 109ef398ee
commit a33ec14c5f
14 changed files with 577 additions and 2 deletions

View File

@@ -0,0 +1,49 @@
defmodule Mixer.ClickhouseRepo.Migrations.CreatePostEvents do
use Ecto.Migration
@doc """
Creates the `post_events` table using a MergeTree engine.
Key design decisions:
* `LowCardinality(String)` for `event_type` — the cardinality is tiny
(510 values), so ClickHouse can store it as a dictionary, giving both
compression and faster filtering.
* `Nullable(UUID)` / `Nullable(String)` for optional columns — ClickHouse
handles NULLs differently from PostgreSQL; we make the nullable fields
explicit so the schema is unambiguous.
* `ORDER BY (occurred_at, event_type, tweet_id)` — optimises the two most
common query patterns:
1. Time-range scans (`WHERE occurred_at >= now() - interval 24 HOUR`)
2. Per-tweet aggregations (`WHERE tweet_id = ?`)
* `PARTITION BY toYYYYMM(occurred_at)` — monthly partitions make it cheap
to drop old data with `ALTER TABLE … DROP PARTITION`.
* `TTL occurred_at + INTERVAL 1 YEAR DELETE` — automatically reclaim disk
space after two years. Adjust as required.
"""
def up do
execute("""
CREATE TABLE IF NOT EXISTS post_events
(
event_type LowCardinality(String),
tweet_id UUID,
user_id Nullable(UUID),
occurred_at DateTime,
ip_address Nullable(String)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(occurred_at)
ORDER BY (occurred_at, event_type, tweet_id)
TTL occurred_at + INTERVAL 1 YEAR DELETE
SETTINGS index_granularity = 8192
""")
end
def down do
execute("DROP TABLE IF EXISTS post_events")
end
end