Skip to content
··閱讀時間4分鐘

我嘅Database係「Correct」嘅。佢亦慢咗296倍。

我發現我嘅Postgres database有89個foreign keys但零個indexes——將millisecond queries變成843ms嘅噩夢,差啲kill咗我嘅alpha launch。

2025年10月25日。我嘅SaaS application終於feature-complete。九個major features、multi-tenant architecture、progressive learning——所有嘢都work。但佢好慢。真係好慢。

Dashboard queries喺度crawl。應該即刻load嘅lists要幾秒。Alpha testers開始問「個site係咪broken?」

我唔止擔心performance。我擔心survival

我花咗兩日optimize queries、rewrite RLS policies、加caching。冇任何嘢make a dent。Frustration不斷累積——離planned alpha launch得5日,我搞唔明一個technically-correct嘅database點解perform得咁差。

然後我run咗一個diagnostic query令我個胃沉落去。

89個foreign keys。零個indexes。

呢個係一個Postgres「feature」冇人同你講嘅故事——差啲kill咗我alpha launch嘅2個禮拜——同埋4分鐘嘅fix救返晒嘅經過。

Stakes:當Slow等於Dead

Research顯示每100ms嘅latency cost 1%嘅conversions。2-3秒嘅load times,我唔止係deliver差嘅experience——我係destroy緊會make or break我alpha嘅first impressions。

即時stakes:

- Alpha testers質疑platform夠唔夠stable去recommend

- 同early users嘅credibility lost,佢哋俾咗我trust

- 如果呢個persist到full launch:industry research suggest 3秒load time有40% bounce rate

- 「呢個好promising」同「呢個feel broken」之間嘅分別

- Database CPU去到80-90%(貴同越嚟越差)

- 離planned launch得5日,完全blocked

嗰2個禮拜嘅actual cost:

- 40+個鐘debug slow queries、profiling code、blame RLS policies

- Launch delay咗1個禮拜(miss咗我internal deadline)

- Lost opportunity去build features

- 同early testers嘅credibility damage

Technical debt變成business debt。而我唔知點解。

---

調查:Follow嗰啲Slow Queries

到10月底,我嘅platform有real complexity:30+個tables、80+個RLS policies、幾百個foreign key relationships。應該instant嘅simple operations慢到痛——load lists、fetch dashboard data、filter by org_id

我由pg_stat_statements開始搵culprits:

```sql
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```

每個slow query都有同一個pattern——filter by foreign keys:

```sql
SELECT * FROM user_data
WHERE org_id = '...' AND resource_id = '...';
```

應該即刻。我有呢啲columns。我有RLS policies。點解咁慢?

我check咗execution plan:

```sql
EXPLAIN ANALYZE
SELECT * FROM user_data
WHERE org_id = 'abc-123' AND resource_id = 'xyz-789';
```

結果令我個胃沉落去:

```
Seq Scan on user_data  (cost=0.00..2847.23 rows=15 width=1024)
  Filter: ((org_id = 'abc-123') AND (resource_id = 'xyz-789'))
  Rows Removed by Filter: 12,834
Execution Time: 843.271 ms
```

Sequential scan。Postgres讀緊table入面每一行,然後filter。一個應該幾milliseconds嘅query。

但我喺org_idresource_id有foreign keys。佢哋唔應該被indexed嗎?

---

頓悟:Foreign Keys唔會Auto-Index

以下係我嗰日hard way學到嘅:

Postgres自動為PRIMARY KEYs同UNIQUE constraints create indexes。

Postgres唔會自動為foreign keys create indexes。

等我再講一次,因為呢個single misunderstanding cost咗我2個禮拜嘅生命:

FOREIGN KEYS ≠ INDEXES

當你寫呢個:

```sql
CREATE TABLE user_data (
  id UUID PRIMARY KEY,  -- ✅ Automatically indexed
  org_id UUID REFERENCES organizations(id),  -- ❌ NOT indexed!
  resource_id UUID REFERENCES resources(id),  -- ❌ NOT indexed!
  created_at TIMESTAMPTZ
);
```

Postgres create咗foreign key constraint(referential integrity),但佢****唔會****喺org_idresource_idcreate index。

點解?因為Postgres唔可以assume你會點query data。可能你永遠唔filter by foreign keys。可能你always join喺specific direction。所以佢將decision留俾你。

問題? 我唔知我需要做呢個decision。我assume「foreign key」等於「indexed for queries」。佢唔係。

---

診斷:有幾差?

我寫咗一個query搵每個冇index嘅foreign key:

```sql
SELECT
  c.conrelid::regclass AS table_name,
  a.attname AS column_name
FROM pg_constraint c
JOIN pg_attribute a ON a.attnum = ANY(c.conkey) AND a.attrelid = c.conrelid
WHERE c.contype = 'f'
  AND NOT EXISTS (
    SELECT 1 FROM pg_index i
    WHERE i.indrelid = c.conrelid AND a.attnum = ANY(i.indkey)
  );
```

結果好似freight train咁hit我:

89行。

89個foreign keys across 32個tables。零個indexes。

每個join、每個filter、每個RLS policy check都做緊full table scans。難怪所有嘢都慢。

---

Fix:Index Apocalypse

2025年10月25日。朝早7:51。我知我要做咩。

一個migration。89個indexes。我叫佢「The Index Apocalypse」。

```sql
-- The Index Apocalypse Migration
-- Purpose: Fix performance by indexing all foreign key columns

-- Organization-scoped tables
CREATE INDEX idx_users_org_id ON users(org_id);
CREATE INDEX idx_resources_org_id ON resources(org_id);
CREATE INDEX idx_documents_org_id ON documents(org_id);

-- Resource-scoped tables
CREATE INDEX idx_user_data_resource_id ON user_data(resource_id);
CREATE INDEX idx_tasks_resource_id ON tasks(resource_id);

-- Multi-column indexes for common query patterns
CREATE INDEX idx_user_data_org_resource ON user_data(org_id, resource_id);

-- Soft-delete queries
CREATE INDEX idx_resources_org_archived ON resources(org_id, archived_at);

-- Total: 89 indexes across 32 tables
```

我run咗migration。我隻手literally hover喺Enter key上面一陣,wondering我係咪就嚟令嘢變得更差。T.T

Migration time:4分鐘

---

結果:由Disaster到Launch-Ready

Before(冇Indexes)

```sql
Seq Scan on user_data
  Execution Time: 843.271 ms
  Rows Removed by Filter: 12,834
```

After(有Indexes)

```sql
Index Scan using idx_user_data_org_resource
  Execution Time: 2.847 ms
```

843ms → 2.8ms

嗰個係單一query 296x更快

Real-world impact:

- Dashboard load:2-3秒 → 120ms

- Resource lists:1+秒 → 45ms

- Data queries:850ms → 12ms

Average improvement:20-40x更快。有啲有multiple foreign keys嘅complex joins?100x更快

差別係天與地。Platform由「呢個係咪broken?」變成「Wow,呢個好快。」:D

---

點解呢個對Multi-Tenant SaaS重要

如果你喺度build有Row-Level Security嘅multi-tenant SaaS,呢個absolutely critical。

RLS policies喺每個query都run:

```sql
CREATE POLICY resources_org_isolation ON resources
  USING (org_id = get_user_org_id());
```

org_id嘅index,呢個policy force每個query做sequential scan。即使simple SELECT statements都變得painfully slow。

Cost: 冇indexes嘅multi-tenant isolation = 貴嘅infrastructure + 慢嘅user experience = churn。

Business lesson: 冇performance optimization嘅security features唔係actually secure——因為users會喺experience到你嘅security之前就走。

---

Pattern:Index咩

經過呢個expensive lesson,以下係我每個新table嘅checklist:

Always index呢啲:

1. Foreign key columns——每一個

2. RLS policies入面嘅columns——特別係multi-tenant apps嘅org_id

3. WHERE clauses入面嘅columns——如果你frequently filter by佢,index佢

4. ORDER BY入面嘅columns——冇indexes sorting會kill performance

考慮multi-column indexes:

```sql
-- For: WHERE org_id = X AND resource_id = Y
CREATE INDEX idx_table_org_resource ON table(org_id, resource_id);
```

唔好over-index: 每個index cost storage、write performance、同maintenance。Rule of thumb:如果佢frequently出現喺WHERE/JOIN/ORDER BY,index佢。Otherwise,唔好。

---

點樣Check你嘅Database

而家run呢個diagnostic query:

```sql
SELECT
  c.conrelid::regclass AS table_name,
  a.attname AS column_name
FROM pg_constraint c
JOIN pg_attribute a ON a.attnum = ANY(c.conkey) AND a.attrelid = c.conrelid
WHERE c.contype = 'f'
  AND NOT EXISTS (
    SELECT 1 FROM pg_index i
    WHERE i.indrelid = c.conrelid AND a.attnum = ANY(i.indkey)
  );
```

如果你有results,你有missing indexes。

然後搵你嘅slow queries:

```sql
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
WHERE mean_exec_time > 100
ORDER BY mean_exec_time DESC;
```

EXPLAIN ANALYZE喺slow嘅嗰啲上面verify你見到sequential scans。

---

學到嘅嘢

1. Postgres assumptions好危險

我assume foreign keys係indexed嘅。佢哋唔係。Always verify。

2. RLS需要indexes

Row-Level Security冇proper indexes係useless——actually比useless更差。佢add overhead到每個query。

3. Multi-tenant = index org_id everywhere

喺multi-tenant architecture入面,org_id出現喺幾乎每個query。喺每個table都index佢。No exceptions。

4. Performance係business problem

Users唔care你嘅SQL technically correct如果page要3秒先load。Fast queries = better conversion = revenue。

5. 早啲加indexes

喺empty tables加indexes係instant。喺有millions of rows嘅tables加要幾個鐘仲lock個table。喺initial development做。

6. Measure everything

EXPLAIN ANALYZE before同after。用data prove improvement。

---

你嘅Checklist

你create嘅每個新table:

```sql
CREATE TABLE new_table (
  id UUID PRIMARY KEY,
  org_id UUID REFERENCES organizations(id),
  resource_id UUID REFERENCES resources(id),
  created_at TIMESTAMPTZ,
  archived_at TIMESTAMPTZ
);

-- ✅ Index foreign keys
CREATE INDEX idx_new_table_org_id ON new_table(org_id);
CREATE INDEX idx_new_table_resource_id ON new_table(resource_id);

-- ✅ Index RLS columns + common filters
CREATE INDEX idx_new_table_org_archived ON new_table(org_id, archived_at);

-- ✅ Index sort columns
CREATE INDEX idx_new_table_created ON new_table(created_at DESC);

-- ✅ Create RLS policies
CREATE POLICY new_table_org_isolation ON new_table
  FOR ALL TO authenticated
  USING (org_id = get_user_org_id());
```

---

最後諗法

89個missing indexes。2個禮拜debugging。一個diagnostic query搵晒佢哋。4分鐘fix。

Irony?Postgres俾晒你所有tools嚟diagnose呢個。我只係唔知要點睇。而家我create嘅每個table都即刻indexed——foreign keys、RLS columns、sort fields,所有嘢。

嗰個加咗89個indexes嘅4分鐘migration省咗幾個禮拜嘅future optimization work同prevent咗一個launch disaster。

For founders: Performance唔止係engineering problem。佢係conversion problem、retention problem、同credibility problem。嗰個「technically correct」但太慢用唔到嘅database?佢cost緊你customers。早啲invest in performance。

For engineers: Postgres唔會auto-index你嘅foreign keys。但你應該。你嘅users(同你嘅database CPU)會多謝你。

你有冇試過「technically correct but painfully slow」嘅database moment?Fix係咩——你花咗幾耐先搵到?

祝好,

Chandler

STRAŦUM architecture series: 呢個performance crisis係multi-tenancy puzzle嘅最後一塊,由第2日build multi-tenancy開始,繼續喺第67日complete schema rebuild,同埋fix 31個blank screens因為lost navigation context

---

用Postgres build multi-tenant SaaS? 我hard way學咗呢個lesson等你唔使。而家check你嘅database先,趁你嘅users未notice。

https://stratum.chandlernguyen.com/request-invitation申請alpha access

---

仲喺學緊「it works」同「it works fast」係唔同嘅goals——而只有一個ship到。

---

繼續閱讀

我嘅旅程
聯繫
語言
偏好設定