An Upgrade to My Current Chatbot

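Update (2026): Sydney has evolved a lot since this Weaviate experiment. After testing FAISS, Weaviate, and other options, I ultimately chose Supabase pgvector for the production version. Sydney now runs on Claude and supports streaming responses and tool calling. The hybrid search and query structuring ideas described below directly shaped how Sydney works today.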

Ask Sydney →

The original post from May 2024 is preserved below for reference.

As I mentioned in my previous post, I am building a self-evaluating Financial Chatbot. So why take a detour to upgrade my current chatbot, Sydney? There are a few main reasons:

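Test the Weaviate vector store end to end:
- I want to use Weaviate as the main vector store for the financial chatbot, but I am running into scaling and memory-size problems. With the 10-K and 10-Q filings of only about 20 companies over the past 10 years, the Weaviate collection is already 2GB. Extending the same approach to the entire S&P 500 would push the collection to 50+ GB. That is too large, and it would be expensive to run and maintain. So I am testing different Product Quantization (PQ) parameters.
- After trying several vector stores, I still lean toward Weaviate because it is very fast at "metadata filters + hybrid search".
- While working on PQ in Weaviate, I also ran into deployment questions: in production, should I choose Weaviate Cloud or host it myself on AWS/Google Cloud? How complex is the deployment? What will it cost?
- Because of these questions, I decided to deploy my current chatbot, Sydney, on Weaviate first. In other words, I am replacing FAISS with Weaviate.
- The current version uses Weaviate Cloud.
Implement query translation and query structuring, and make sure the output is usable JSON.
- The goal of query translation: break the input question down into sub-questions/sub-tasks that can be answered independently.
- The goal of query structuring: I care not only about generating the search terms/phrases used for hybrid search, but also about generating the correct metadata filters.
  - This is critical because the financial chatbot must filter correctly by year, industry, and similar conditions when needed.
Work out how to run hybrid search with multiple filters and return both the content and the metadata.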
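
To make the last point concrete, here is a minimal sketch of a hybrid search restricted to a date range, written against the Weaviate Python client v4. It is not the code behind Sydney: the `publish_date` property, the `alpha=0.5` weighting, and the helper names `date_bounds` and `hybrid_search` are assumptions made for illustration, and `collection` is assumed to be a collection handle from an already connected client.

```python
from datetime import date, datetime, time, timezone


def date_bounds(start_date: str, end_date: str) -> tuple[datetime, datetime]:
    """Turn two YYYY-MM-DD strings into an inclusive, timezone-aware UTC range."""
    start = datetime.combine(date.fromisoformat(start_date), time.min, tzinfo=timezone.utc)
    end = datetime.combine(date.fromisoformat(end_date), time.max, tzinfo=timezone.utc)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return start, end


def hybrid_search(collection, content_search: str, start_date: str, end_date: str, limit: int = 5):
    """Run a hybrid (keyword + vector) search limited to a date range.

    `publish_date` is a hypothetical date property on the collection.
    """
    # Imported lazily so date_bounds() stays usable without the client installed.
    from weaviate.classes.query import Filter, MetadataQuery

    start, end = date_bounds(start_date, end_date)
    # Combine several conditions; every filter in the list must match.
    filters = Filter.all_of([
        Filter.by_property("publish_date").greater_or_equal(start),
        Filter.by_property("publish_date").less_or_equal(end),
    ])
    response = collection.query.hybrid(
        query=content_search,
        alpha=0.5,  # assumed equal weighting of keyword and vector scores
        filters=filters,
        limit=limit,
        return_metadata=MetadataQuery(score=True),
    )
    # Return the stored properties (content, URL, ...) together with the score.
    return [
        {"properties": obj.properties, "score": obj.metadata.score}
        for obj in response.objects
    ]
```

With a connected client, a call such as `hybrid_search(client.collections.get("BlogPost"), "Ray Dalio", "2020-01-01", "2020-12-31")` would return the matching chunks with their properties and hybrid scores (`BlogPost` is a made-up collection name). Further conditions, such as an industry filter, could be appended to the same `Filter.all_of` list.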

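As you can see, all of the capabilities above are requirements for the financial chatbot, so it makes perfect sense to validate them first on the smaller-scale Sydney :)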

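I am happy to say you can now try Sydney directly. It already has the capabilities described above. You can ask questions like the ones below, and the chatbot should return relevant answers along with links to the source blog posts.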

What did chandler write about Kevin Rudd in 2020?
Tell me everything that Chandler wrote about Ray Dalio between 2020 until now
what did chandler write about Health Savings Accounts in 2022?
What did chandler do in 2015?

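That is all for now. I need to get back to working on the financial chatbot :P

If you have also worked with Weaviate or with hybrid search using metadata filters, I would love to hear which practices have worked for you.

Regards,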

Chandler

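September 2024 update

Sydney is now a multi-capability agent that can:

Answer questions about current S&P 500 companies, including what they have filed with the SEC over the past 10 years.
Provide insights from my blog content over the past 15 years.

Check it out here.

P.S.: Below are sample prompts I use for query translation and query structuring.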

"You are a helpful assistant that generates multiple sub-questions related to an input question. "
             "The current year is 2024. "
             "The goal is to break down the input into a set of specific sub-problems / sub-questions that can be answered in isolation. "
             "Each specific sub-question will be used to retrieve relevant content from a vector store, using similarity search with score. "
             "Phrase the wording of the questions appropriately for this purpose.\n"
             "This vector store includes all of the published blog posts from Chandler Nguyen's blog from 2007 to 2024.\n\n"
             "Original question: {query}\n\n"
             "Generate the minimum number of sub-questions necessary to answer the original question. "
             "Your response should be formatted as a JSON array of strings, where each string represents a sub-question. "
             "Do not include any additional words, characters, or explanations in the response.\n\n"
             "Example response:\n"
             '[\n'
             '  "sub-question 1",\n'
             '  "sub-question 2"\n'
             ']'
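
The prompt above asks for a bare JSON array of strings, but a model can still wrap the array in extra text. A minimal sketch of a defensive parser follows; the function name `parse_sub_questions` is hypothetical and this is not necessarily how Sydney handles the reply.

```python
import json


def parse_sub_questions(raw: str) -> list[str]:
    """Extract a JSON array of sub-question strings from a model reply."""
    # Tolerate stray text around the array by slicing from the first "["
    # to the last "]".
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model output")
    data = json.loads(raw[start:end + 1])
    if not isinstance(data, list) or not all(isinstance(q, str) for q in data):
        raise ValueError("expected a JSON array of strings")
    # Drop empty entries and surrounding whitespace.
    return [q.strip() for q in data if q.strip()]
```

Raising `ValueError` on malformed output (note that `json.JSONDecodeError` is a subclass of it) gives the caller a single exception type to catch, for example to retry the model call.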

"""You are a helpful assistant that generates a structured query related to an input question.
The goal is to break down the input into a structured query that can be used to retrieve relevant content from a vector store, using similarity search with score.
This vector store includes all of the published blog posts from Chandler Nguyen's blog from 2007 to 2024.
Original question: {query}

You must generate a response in JSON format as described below without any additional words or characters:
{
    "content_search": Similarity search query used to apply to the content of the Chandler Nguyen published blog posts to find similar documents related to the sub-question(s). Ensure the content_search query is not too broad or too specific, and strikes a balance between relevance and completeness. \n
    "start_date": optional field, the start date to search for blog posts that are relevant to the sub-question(s) in YYYY-MM-DD format. If the sub-question(s) do not specify a time frame, leave this field blank or set it to the earliest possible date (e.g., 2007-01-01) to cover a broader range. \n
    "end_date": optional field, the end date to search for blog posts that are relevant to the sub-question(s) in YYYY-MM-DD format. If the sub-question(s) do not specify a time frame, leave this field blank or set it to the latest possible date (e.g., 2024-12-31) to cover a broader range. \n
}

If the sub-question(s) include multiple years or a specific time range, generate 1 response for each year or time range, enclosed in separate JSON objects within the outer array.

Example responses:

For an open-ended sub-question without a specific time frame:
Sub-questions: ["What are the key insights Chandler wrote about Health Savings Accounts (HSA)?"]
{
    "content_search": "Chandler Nguyen blog posts about Health Savings Accounts",
    "start_date": "2007-01-01",
    "end_date": "2024-12-31"
}

For a sub-question specifying a year:
Sub-questions: ["What blog posts did Chandler write in 2018?", "Which blog posts written by Chandler in 2018 mention Kevin Rudd?"]
[
    {
        "content_search": "Chandler Nguyen blog posts in 2018",
        "start_date": "2018-01-01",
        "end_date": "2018-12-31"
    },
    {
        "content_search": "Kevin Rudd mentioned in Chandler Nguyen blog posts in 2018",
        "start_date": "2018-01-01",
        "end_date": "2018-12-31"
    }
]

For a sub-question specifying a time range:
Sub-questions: ["What did Chandler write about Ray Dalio in 2020?", "What did Chandler write about Ray Dalio in 2021?", "What did Chandler write about Ray Dalio in 2022?"]
[
    {
        "content_search": "Chandler Nguyen blog posts about Ray Dalio in 2020",
        "start_date": "2020-01-01",
        "end_date": "2020-12-31"
    },
    {
        "content_search": "Chandler Nguyen blog posts about Ray Dalio in 2021",
        "start_date": "2021-01-01",
        "end_date": "2021-12-31"
    },
    {
        "content_search": "Chandler Nguyen blog posts about Ray Dalio in 2022",
        "start_date": "2022-01-01",
        "end_date": "2022-12-31"
    }
]
"""

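The reply to the query structuring prompt above still has to be turned into something a search call can use. Here is a minimal sketch, assuming the model returns either a single JSON object or a JSON array of objects with the three fields named in the prompt. The `StructuredQuery` class and `parse_structured_queries` function are hypothetical names, and the fallback dates simply mirror the 2007-01-01 and 2024-12-31 defaults mentioned in the prompt.

```python
import json
from dataclasses import dataclass
from datetime import date

# Fallback bounds taken from the defaults named in the prompt.
EARLIEST = date(2007, 1, 1)
LATEST = date(2024, 12, 31)


@dataclass
class StructuredQuery:
    content_search: str
    start_date: date
    end_date: date


def parse_structured_queries(raw: str) -> list[StructuredQuery]:
    """Parse the model reply into validated structured queries."""
    data = json.loads(raw)
    if isinstance(data, dict):
        # Tolerate a single object instead of an array of objects.
        data = [data]
    queries = []
    for item in data:
        # Blank or missing dates fall back to the widest possible range.
        start = date.fromisoformat(item["start_date"]) if item.get("start_date") else EARLIEST
        end = date.fromisoformat(item["end_date"]) if item.get("end_date") else LATEST
        if start > end:
            raise ValueError(f"start_date {start} is after end_date {end}")
        queries.append(StructuredQuery(item["content_search"], start, end))
    return queries
```

Converting the date strings to `date` objects early means a malformed date fails at parse time rather than silently producing an empty search result later.
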
P.P.S.: I know the front end is still fairly slow, so I probably need to keep brushing up on front-end development.
