<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Blog on Yerong Li</title>
        <link>https://24ce4b33.yerong-li.pages.dev/tags/blog/</link>
        <description>Curious about how the world works. Building, writing, learning.</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <managingEditor>ping@yerong.li (Yerong Li)</managingEditor>
        <webMaster>ping@yerong.li (Yerong Li)</webMaster>
        <copyright>© 2024-2026 · CC BY-NC 4.0</copyright>
        <lastBuildDate>Mon, 08 Jun 2026 14:22:32 +1000</lastBuildDate>
        <atom:link href="https://24ce4b33.yerong-li.pages.dev/tags/blog/index.xml" rel="self" type="application/rss+xml" />
        <item>
            <title>Site Log 03: Shelf and Hoot</title>
            <link>https://24ce4b33.yerong-li.pages.dev/posts/2026/06/site-log-03-shelf-and-hoot/</link>
            <pubDate>Mon, 08 Jun 2026 14:22:32 +1000</pubDate>
            <author>ping@yerong.li (Yerong Li)</author>
            <guid>https://24ce4b33.yerong-li.pages.dev/posts/2026/06/site-log-03-shelf-and-hoot/</guid>
            <description>Explore Yerong Li&#39;s personal blog updates featuring the &#39;Shelf&#39; tool for Notion integration and &#39;Hoot&#39;, an AI assistant enhancing site interaction and contact methods.</description>
            <content type="html"><![CDATA[<h4 id="shelf">Shelf</h4>
<p>Shelf 这个功能复用了我之前的一个小项目。</p>
<p>我一直有在 Notion 里记录生活轨迹的习惯，主要是书和电影。早些时候，为了把 Notion 里的数据库改造得更好看，也更符合自己的使用习惯，我用 AI 帮忙写了一套网页信息抓取功能。后端主要由 Cloudflare Worker 处理抓取和写入。因为浏览器 extension 需要上架和审核，为了更轻量，前端触发方式最后做成了一个 bookmarklet。</p>
<p>再往后，我把授权部分拆成了 Notion OAuth。这样其他人如果复制了我的 Notion 模板，也可以一步步 set up，接入自己的 Notion workspace。这本来是一个独立的小工具。但在改进个人站时，我也想把这部分信息展示出来，而且最好能复用现有的东西。于是我在 Notion 数据库里多加了一个字段，用来选择某条记录是否 publish 到网站上，再通过 Worker 触发同步。</p>
<p>前端部分，尤其是效果和审美的实现，主要是靠 Claude 和 Gemini 一起讨论出来的。我发现 Codex 在这类视觉想象和风格探索上还欠缺许多，但它在把明确的方案落地成代码时很可靠。</p>
<p>6 月 7 号的这次更新，是在书籍和电影页面里加入了一个 Gesture Mode。此前我没有真正接触过视觉识别相关的东西，做这个的过程中才知道，浏览器里已经有很多成熟、免费的手部识别和 landmark detection 库可以使用，比如 MediaPipe Hands。这个功能本身只是一个小实验，但它让我很具体地意识到，在很近的未来，人与电子信息的日常交互未必还要被鼠标和键盘这套模式牢牢限制住。</p>
<h4 id="hoot-">Hoot 🦉</h4>
<p>Hoot 的产生，最开始是因为我想在这个个人站里更显性地引入 AI。尽管在写作流程里，其实已经有 AI 生成 metadata 的环节，但那更像是后台流程的一部分，并不是访客可以直接感知到的东西。</p>
<p>创造大概有两种方式。一种是先有问题，然后寻找解决问题的工具。这个工具可以是现有的，也可以是从无到有生出来的。另一种方式是先有工具，然后再考虑它可以解决什么问题。后一种方式里的问题通常没有那么迫切，否则大概早就会采取第一种方式了。</p>
<p>在这里，我是先有了 AI 这个工具，才很自然地开始想：它能做什么？大语言模型的特性是能“说人话”，而现在的检索、索引和工具调用能力，又让它有机会说一些有依据的人话。于是便有了 Hoot：一个可以回答我公开信息的 assistant。</p>
<p>AI 还有调用工具的能力，所以它也很自然地改进了 contact 功能。以前传统博客通常是在 Contact 页面放一个 email 地址，或者触发邮件客户端。而 Hoot 可以在相关对话里触发 workflow，把信息发送到我的 Telegram bot。额外的，我也可以知道联系者在发起联系前问过哪些问题，从而理解这个联系发生的上下文。</p>
<h2 id="english-version">English version</h2>
<h4 id="shelf-1">Shelf</h4>
<p>The Shelf feature reuses a small project I had built earlier.</p>
<p>I have long had the habit of recording traces of my life in Notion, mostly books and films. Earlier, to make my Notion database look better and fit the way I actually use it, I had AI help me build a web-scraping workflow. The backend is mainly handled by Cloudflare Workers, which fetch the information and write it into Notion. Because a browser extension has to be listed in a store and pass review, I kept the front-end trigger lightweight and made it a bookmarklet instead.</p>
<p>Later, I split the authorization step out into a Notion OAuth flow. This means that anyone who copies my Notion template can set it up step by step and connect it to their own Notion workspace. Originally, this was a standalone tool. But while improving my personal site, I wanted to show some of this information here as well, ideally by reusing what already existed. So I added a field to the Notion database that marks whether a record should be published to the website, and a Worker triggers the sync.</p>
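<p>As a rough illustration of the publish field, here is a minimal Python sketch of the filtering step. The post does not show the Worker's code, so everything here is an assumption: the field is taken to be a checkbox named <code>Publish</code>, the title property is taken to be <code>Name</code>, and the helper names are hypothetical. The record shape follows the way the Notion API returns checkbox and title properties.</p>

```python
def is_published(page: dict, flag: str = "Publish") -> bool:
    """Return True when the page's checkbox property `flag` is ticked.

    `Publish` is an assumed property name, not the one used on the real site.
    """
    prop = page.get("properties", {}).get(flag, {})
    return prop.get("type") == "checkbox" and prop.get("checkbox") is True


def page_title(page: dict, title_prop: str = "Name") -> str:
    """Join the plain-text fragments of a Notion title property."""
    fragments = page.get("properties", {}).get(title_prop, {}).get("title", [])
    return "".join(fragment.get("plain_text", "") for fragment in fragments)


def select_for_site(pages: list[dict]) -> list[str]:
    """Keep only records flagged for publication and return their titles."""
    return [page_title(page) for page in pages if is_published(page)]


# Two made-up records: one flagged for the site, one kept private.
sample_pages = [
    {"properties": {
        "Name": {"type": "title", "title": [{"plain_text": "Dune"}]},
        "Publish": {"type": "checkbox", "checkbox": True},
    }},
    {"properties": {
        "Name": {"type": "title", "title": [{"plain_text": "Private notes"}]},
        "Publish": {"type": "checkbox", "checkbox": False},
    }},
]

print(select_for_site(sample_pages))  # ['Dune']
```

<p>A real sync would presumably run inside the Worker and query Notion directly. The sketch only shows the idea that one boolean field decides what leaves the private database.</p>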
<p>The front-end part, especially the interaction and visual direction, came mostly from discussions with Claude and Gemini. I find that Codex is still weaker at this kind of visual imagination and style exploration, but it is reliable when the plan is already clear and needs to be turned into code.</p>
<p>The June 7 update added a Gesture Mode to the books and films pages. Before this, I had not really worked with computer vision. Along the way, I learned that mature, free hand-tracking and landmark-detection libraries already run in the browser, such as MediaPipe Hands. The feature itself is only a small experiment, but it made me realize very concretely that, in the near future, everyday interaction with digital information may no longer need to be so tightly bound to the familiar mouse-and-keyboard model.</p>
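<p>To make the landmark idea concrete, here is a minimal Python sketch of how a gesture could be read from hand landmarks. It assumes the 21-point hand model that MediaPipe Hands produces, where index 4 is the thumb tip and index 8 is the index fingertip, with coordinates normalized to the image size. The pinch gesture, the <code>0.05</code> threshold, and the helper names are hypothetical. The post does not say which gestures Gesture Mode recognizes or how.</p>

```python
import math

# Landmark indices in the 21-point MediaPipe hand model.
THUMB_TIP = 4
INDEX_FINGER_TIP = 8


def is_pinching(landmarks, threshold=0.05):
    """Return True when thumb tip and index fingertip are close together.

    `landmarks` is a list of 21 (x, y) pairs in normalized image coordinates.
    `threshold` is an assumed value that would need tuning on a real camera.
    """
    thumb_x, thumb_y = landmarks[THUMB_TIP]
    index_x, index_y = landmarks[INDEX_FINGER_TIP]
    distance = math.hypot(thumb_x - index_x, thumb_y - index_y)
    return threshold > distance


def make_hand(thumb_tip, index_tip):
    """Build a 21-point hand with placeholder points except the two tips."""
    points = [(0.5, 0.5)] * 21
    points[THUMB_TIP] = thumb_tip
    points[INDEX_FINGER_TIP] = index_tip
    return points


pinched = make_hand((0.40, 0.40), (0.42, 0.41))
spread = make_hand((0.30, 0.50), (0.50, 0.20))

print(is_pinching(pinched), is_pinching(spread))  # True False
```

<p>In the browser the same check would run on each video frame, with the detection library supplying the landmark list.</p>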
<h4 id="hoot--1">Hoot 🦉</h4>
<p>Hoot began with a simple desire: I wanted to introduce AI into this personal site in a more visible way. AI was already part of my writing workflow, especially in generating metadata, but that was more of a background process. It was not something visitors could directly experience.</p>
<p>There are roughly two ways to create things. One starts with a problem, then looks for a tool to solve it. The tool can already exist, or it can be built from scratch. The other starts with a tool, then asks what problems it might solve. Problems in the second category are usually less urgent; otherwise, they would probably have already triggered the first path.</p>
<p>Here, I had AI as a tool first, and it was natural to start asking: what can it do? The defining quality of large language models is that they can speak in human language. With today&rsquo;s retrieval, indexing, and tool-calling capabilities, they can also speak with some grounding. That became Hoot: an assistant that can answer questions about my public information.</p>
<p>AI can also call tools, so it naturally improved the contact flow as well. A traditional blog usually puts an email address on the Contact page, or opens an email client. Hoot can trigger a workflow during a relevant conversation and send the message to my Telegram bot. As an extra benefit, I can also see what questions the person asked before contacting me, which gives me more context for why the contact happened.</p>
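<p>A minimal Python sketch of what the contact step might assemble is shown here. It builds a request body for the Telegram Bot API's <code>sendMessage</code> method, which takes a <code>chat_id</code> and a <code>text</code> of at most 4096 characters. The function names, the message layout, and the idea of appending the visitor's earlier questions as a list are assumptions for illustration, not Hoot's actual implementation. No network request is made.</p>

```python
import json

# Telegram limits message text to 4096 characters.
TELEGRAM_TEXT_LIMIT = 4096


def build_contact_payload(chat_id, sender, message, prior_questions):
    """Assemble a sendMessage body that carries the conversation context.

    The layout of the text is hypothetical; only `chat_id` and `text`
    are fields defined by the Telegram Bot API.
    """
    lines = [f"New contact from {sender}", "", message]
    if prior_questions:
        lines += ["", "Asked before contacting:"]
        lines += [f"- {question}" for question in prior_questions]
    text = "\n".join(lines)[:TELEGRAM_TEXT_LIMIT]
    return {"chat_id": chat_id, "text": text}


def send_message_url(bot_token):
    """Return the Bot API endpoint for sendMessage for a given token."""
    return f"https://api.telegram.org/bot{bot_token}/sendMessage"


payload = build_contact_payload(
    chat_id=12345,  # placeholder chat id
    sender="Alex",
    message="Hello, I would like to talk about Shelf.",
    prior_questions=["What is Shelf?", "Is the Notion template public?"],
)
print(json.dumps(payload, indent=2))
```

<p>Sending would be a single POST of this JSON to the URL from <code>send_message_url</code>. Keeping the earlier questions in the same message is what lets the recipient see why the contact happened.</p>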
]]></content>
        </item>
        
    </channel>
</rss>

