Tasklet is the "o1 moment" for long-horizon AI agents that learn on the job
A couple of weeks ago, Andrew Lee unveiled Tasklet, an AI agent with a two-tier design: a long-lived, high-level agent curates the system prompt, toolset, and memories for individual “sub-agents”, i.e., individual task runs.
Memories and results are stored in an SQL database and made available for the sub-agents to explore via SQL queries (agentic search in the DB) to build the context for the specific task. For example, in a customer-relations tasklet, a sub-agent tasked with responding to a customer’s e-mail may search the database for past interactions with that specific customer; if this is a new customer inquiring about product X, the sub-agent may instead search for past sales of product X.
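To make the idea concrete, here is a minimal sketch of the kind of query such a sub-agent might run against its memory database. Tasklet’s actual schema is not public, so the table and column names (interactions, customer_email, occurred_at, summary) and the connection setup are purely illustrative assumptions.

```typescript
import { Client } from "pg";

// Hypothetical sketch: pull the most recent interactions with one customer so
// the sub-agent can ground its e-mail reply in prior context.
async function recallCustomerHistory(customerEmail: string): Promise<string[]> {
  const client = new Client({ connectionString: process.env.MEMORY_DB_URL });
  await client.connect();
  try {
    const { rows } = await client.query(
      `SELECT occurred_at, summary
         FROM interactions
        WHERE customer_email = $1
        ORDER BY occurred_at DESC
        LIMIT 20`,
      [customerEmail],
    );
    return rows.map((r) => `${r.occurred_at.toISOString()}: ${r.summary}`);
  } finally {
    await client.end();
  }
}
```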
At the end of each task run, the system invites feedback from the user, which the higher-level agent uses to curate the data/memories/results and to improve the system instructions for future runs.
Please read or listen to Andrew Lee’s interview by Nathan Labenz for more details. I highly recommend it.
Also recently, Anthropic introduced Agent Skills for Claude. Agent Skills are a move away from MCP servers towards simple .md instructions for using a given app or API. These instructions are downloaded by the user and can be edited by the user or by AI agents, unlike MCP servers, which expose static, predefined commands and prompts that the user can only turn on or off, not adjust to their needs.
The adaptable Agent Skills resonate with Tasklet’s system design: Tasklet also prefers plain HTTP APIs over MCPs, and builds context-specific instructions for using those APIs effectively.
I believe that these two announcements are a kind of “o1 moment” (by analogy with OpenAI’s o1 model announcement) for long-horizon, continually learning AI agency, the topic that Dwarkesh Patel has turned into a meme. (For more recent context, see Nathan Lambert’s contra opinion from two months ago.)
That is, I believe that Tasklet’s two-tier design, combined with frontier AI labs post-training LLMs to be more effective at picking up “skills”, i.e., lengthy and possibly multi-file text instructions for using a given app or API (Anthropic is obviously first, but there is no doubt that the other labs are doing this too or will soon follow), is sufficient for prosaic continual learning, and that within a few months many providers will replicate and advance this AI agent architecture.
Above, I referred to this as an “o1 moment” because Tasklet’s two-tier system design is not surprising in itself, just as it was an “open secret” throughout 2024 that OpenAI was doing RL on LLMs. What matters is a credible demonstration that the design works, at which point many players rush to replicate it. Indeed, very soon after the o1 model announcement, many AI labs began doing RL on LLMs and soon matched OpenAI’s results, most famously DeepSeek.
The main difference between the Tasklet announcement and OpenAI’s o1 announcement is, of course, that Tasklet is much lower-profile and didn’t make a huge splash, but I’m sure that all the relevant players have taken note.
Let’s build open-source Tasklet-like agents where users fully own their context
Tasklet’s product design is good for AI diffusion, and hence net positive from the perspective of the risk of economic disempowerment. But Tasklet is still a classical SaaS that owns and locks in its users’ data, which leads to AI context fragmentation for those users.
Building an open-source Tasklet alternative on top of a personal AI data platform (I’ve called it Pocketdata) would keep the user in full ownership and control of all of their context, along with the other benefits that I outlined in this post:
strict “pay only for the inference that you have actually used” billing with the option to self-host,
the freedom to swap LLM models and other service providers (including local, fully private inference), and
the freedom to mix and match with other pieces of their personal data plane, such as chats or deep research.
If you are interested in building (or already building) an open Tasklet alternative with the above characteristics, let’s talk! You can reach me at leventov.ru@gmail.com. I’m personally busy building Pocketdata (the infrastructure), and there is enough complexity and groundwork there to require my full attention for many more months. On the other hand, if you are focusing on agent engineering, you may want to delegate the infrastructure work to someone else.
Why Pocketdata is the right platform for personal Tasklet-like agents
In the rest of this post, I’ll make an argument for why Pocketdata is the “right” platform for Tasklet-like agents if the aim is to make these agents fully private, user-owned, secure, and self-hostable.
There has already been significant convergence between Pocketdata and Tasklet’s agent design, unbeknownst to me at the time. The “AGENTS.md equivalent for personal data” that I’ve proposed is suspiciously similar both to a “skill” and to the Tasklet sub-agent’s instructions for querying past data in its memories and past results.
Also, since publishing the first technical blueprint for Pocketdata, I made two significant changes to the platform design, both of which are conducive to building an open-source Tasklet alternative on top of Pocketdata.
From Pocketbase to Postgres
First, I’ve ditched the idea of using vanilla Pocketbase. I’ve replaced it with Postgres and plan to later rebuild a “Postgres-flavoured Pocketbase” on top of it, along the lines of Zhenruyan’s “Postgrebase”.
The major reason for this change is that choosing Pocketbase as the primary storage in Pocketdata would force AI apps being onboarded onto Pocketdata to be modified at the source-code level, and perhaps quite significantly so if they don’t use ORMs or other suitable abstractions. For example, Mail Zero supports only Postgres as its storage backend. Realistically, this would be too big an ask for open-source AI app developers to invest resources in supporting Pocketdata unless/until it gains a huge user base.
On the other hand, with the Postgres-first approach, onboarding AI apps onto the platform requires only some config changes (perhaps Dockerfile and init-script changes) plus upstreamable bug fixes and improvements. This is much more sustainable for Pocketdata to maintain on our own for a few key apps (such as Open WebUI, Mail Zero, and the proposed open Tasklet reimplementation), without asking permission from the maintainers of the upstream projects.
Initially, all apps will have their own schemas and users in Postgres, and these Postgres users will have read and write rights only in their respective schemas, ensuring isolation between the apps.
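A minimal sketch of how this per-app isolation could be provisioned is below. The role/schema names, the `provisionApp` entry point, and the assumption that identifiers come from trusted platform config (not user input) are all illustrative, not Pocketdata’s actual code.

```typescript
import { Client } from "pg";

// Provision an isolated Postgres schema and role for one app.
async function provisionApp(superuser: Client, app: string, password: string) {
  // A dedicated role that can log in but has no privileges outside its schema.
  await superuser.query(`CREATE ROLE ${app} LOGIN PASSWORD '${password}'`);
  // A dedicated schema owned by that role: the app can create and query
  // tables here, and only here.
  await superuser.query(`CREATE SCHEMA ${app} AUTHORIZATION ${app}`);
  // Make sure the app cannot create objects in the shared "public" schema.
  await superuser.query(`REVOKE ALL ON SCHEMA public FROM ${app}`);
  // Pin the app's default search_path to its own schema.
  await superuser.query(`ALTER ROLE ${app} SET search_path = ${app}`);
}
```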
Once Postgres in Pocketdata regains the Pocketbase/Postgrebase-derived Go sidecar and the “common schema” collections: chats, notes/docs, emails, etc. (see the discussion in the previous post), the AI apps could start exposing and integrating their data with the rest of the “personal data plane” gradually, by dual-writing to their own schema and to the “Pocketbase-owned” collections (which live in their own, protected schema that only the Postgres superuser can write to). Alternatively, for key apps such as Open WebUI, these integrations could be shipped by Pocketdata itself in the same permissionless way, by bundling the required hooks for Open WebUI’s (Mail Zero’s, etc.) data schema in Pocketdata’s container image, as sketched below.
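Here is a heavily hedged sketch of such a dual-writing hook: mirroring a new Open WebUI chat row into a platform-owned “chats” collection. All table and column names (openwebui.chat, pocketbase.chats, source_app, source_id) are assumptions for illustration only.

```typescript
import { Client } from "pg";

// Mirror one Open WebUI chat into the platform-owned "chats" collection.
async function mirrorChat(db: Client, chatId: string) {
  // Read from the app's own schema...
  const { rows } = await db.query(
    `SELECT id, title, created_at FROM openwebui.chat WHERE id = $1`,
    [chatId],
  );
  if (rows.length === 0) return;
  const chat = rows[0];
  // ...and dual-write into the common collection, which lives in a protected
  // schema. Assumes a unique constraint on (source_app, source_id).
  await db.query(
    `INSERT INTO pocketbase.chats (source_app, source_id, title, created_at)
       VALUES ('openwebui', $1, $2, $3)
       ON CONFLICT (source_app, source_id)
       DO UPDATE SET title = EXCLUDED.title`,
    [chat.id, chat.title, chat.created_at],
  );
}
```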
Another, slightly unexpected advantage of using Postgres with the “separate Postgres schema and user per app” design is that it permits raw SQL queries in JS hooks while preserving app isolation, via a trick: when apps register their hooks, all raw SQL queries in those hooks are wrapped as functions with a SECURITY DEFINER clause (I’ve mentioned the idea of apps owning their hooks here).
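A sketch of what that wrapping could look like follows. The registration function, the hook_runner role, and the convention of returning rows as jsonb are my assumptions; the raw SQL is assumed to be a single SELECT without a trailing semicolon.

```typescript
import { Client } from "pg";

// Wrap an app's raw hook query in a SECURITY DEFINER function at registration
// time, so the query runs with the privileges of the app's own role (and only
// in the app's schema), regardless of which user the hook runtime connects as.
async function registerHookQuery(
  superuser: Client,
  app: string,      // also the app's schema and role name, e.g. "openwebui"
  hookName: string, // e.g. "latest_chats"
  rawSql: string,   // the raw SELECT the app's hook wants to run
) {
  await superuser.query(`
    CREATE OR REPLACE FUNCTION ${app}.${hookName}()
      RETURNS SETOF jsonb
      LANGUAGE sql
      SECURITY DEFINER
      SET search_path = ${app}
    AS $hook$
      SELECT to_jsonb(t) FROM (${rawSql}) t
    $hook$;
  `);
  // Hand ownership to the app's role and let only the hook runtime call it.
  await superuser.query(`ALTER FUNCTION ${app}.${hookName}() OWNER TO ${app}`);
  await superuser.query(`REVOKE ALL ON FUNCTION ${app}.${hookName}() FROM PUBLIC`);
  await superuser.query(`GRANT EXECUTE ON FUNCTION ${app}.${hookName}() TO hook_runner`);
}
```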
Tasklet-like agents should each have their own separate Postgres schema and user for strong isolation, i.e., each should be a separate “app” on the Pocketdata platform.
From LiteLLM to Bifrost and Agentgateway
In the previous post, I wrote that “LiteLLM doesn’t have serious alternatives at the moment” as an LLM gateway. However, after another series of gripes with LiteLLM’s performance and code quality, I searched for alternatives once again, and I’m happy to report that there is now a serious alternative to LiteLLM as a standalone LLM gateway server: Bifrost.
Bifrost is written in Go and uses the fasthttp library, which makes me confident in its performance and low memory footprint for streaming LLM requests. This makes it feasible to host Bifrost on the same Fly machine as Postgres and its sidecars (pgBackRest, pgBouncer, and the Pocketbase-like Go process), albeit in a different container.
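From an app’s point of view, swapping LiteLLM for a co-located Bifrost instance could look like the sketch below, assuming Bifrost’s OpenAI-compatible endpoint is exposed locally. The port, the virtual-key environment variable, and the model name are illustrative assumptions.

```typescript
import OpenAI from "openai";

// Point the standard OpenAI client at the local gateway instead of any
// provider directly; routing, provider keys, and fallbacks stay server-side.
const llm = new OpenAI({
  baseURL: "http://localhost:8080/v1",         // co-located Bifrost instance
  apiKey: process.env.POCKETDATA_GATEWAY_KEY,  // virtual key issued by the gateway
});

async function main() {
  const reply = await llm.chat.completions.create({
    model: "anthropic/claude-sonnet-4-5", // routed by the gateway to the provider
    messages: [{ role: "user", content: "Summarize yesterday's customer emails." }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```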
Bifrost is neither an MCP gateway nor a generic OpenAPI gateway, so a separate gateway is needed for those purposes; Agentgateway covers this. Agentgateway is written in Rust, so it could also live on the same Fly machine. However, Agentgateway doesn’t yet support storing config and auth keys in Postgres (cf. this PR), so there is some work to do, but it should be relatively straightforward.
Note that LiteLLM is an LLM and MCP gateway, but not a full-fledged HTTP API gateway. Therefore, introducing a separate gateway system would have been inevitable even if Pocketdata had kept using LiteLLM.
As Andrew Lee described in the interview, Tasklet agents are more successful at accessing services via HTTP APIs than via MCPs, provided that the agents have access to “skills” teaching them how to use those service APIs.
