Tool on Victor42

What is an AI Native Data System

hi@victor42.work (Victor42) — Tue, 09 Jun 2026 16:12:00 +0000

I am a power user of Excel and Google Sheets, relying on them heavily to manage both work and life.

Later, I migrated some of my heavier data management tasks to visual databases like Feishu Bitable. While they might look like Excel, they are fundamentally different beasts. With much stricter data rules than spreadsheets, they trade some flexibility for the raw power of a true database. You can easily link multiple tables and build highly complex data systems—more than capable of running a small business.

I once built a full-cycle task management system in Bitable, tracking everything from assignment to delivery. It seamlessly spun out weekly reports, project calendars, and annual stats. People asked for this system at least three times: a colleague for personal use, a manager for their team, and my previous employer for a company-wide rollout.

But no matter how powerful the tool, you still have to do the heavy lifting yourself.

I believe in what I call the “dishwasher philosophy”. The older generation often scoffs at dishwashers, arguing, “You still have to rinse the plates first. I could have just washed them by hand in that time!” Here is my take: washing by hand takes 15 minutes of pure human labor. Rinsing takes 5 minutes, and the machine runs for 40—but that is still only 5 minutes of my time. I just bought back 10 minutes of my life.

To me, technology is a tool to reclaim my life.

Bitable has built-in AI features, and you can also use local Agents to control it via CLI or API. But if you try it, it feels like Usain Bolt running underwater—completely constrained. Bitable is not an AI-native product; it is designed for human eyes and human logic. Current AI Agents are text-based creatures, interacting with the world through code. Therefore, the most AI-native data system is simply a database.

I spent a day overhauling this system with AI. I stripped it back to the basics and took it entirely local. It no longer relies on cloud services or third-party apps. Now, it is just a lightweight local SQLite database, entirely read, written, and managed by AI. It automatically generates four pages based on the data: a calendar, recent tasks, historical tasks, and project stats. These serve as my dashboard and command center. Here is how it looks:

Need to squeeze in a last-minute request? I just tell the AI to push all tasks from today onwards back by one workday, and it even splits overnight tasks to skip the weekend. Just one sentence.
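To make the "push everything back by one workday" request concrete, here is a minimal sketch of the kind of update the AI might run. It assumes a hypothetical `schedule` table with one row per task per day (`id`, `task`, `day` as ISO text); the real project's schema may differ.

```python
import sqlite3
from datetime import date, timedelta


def next_workday(d: date) -> date:
    """Return the first Monday-to-Friday date strictly after d."""
    d += timedelta(days=1)
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d


def push_back_one_workday(conn: sqlite3.Connection, start: date) -> None:
    """Move every schedule row dated on or after `start` to its next workday.

    Assumes the hypothetical table schedule(id, task, day), one row per task-day.
    """
    rows = conn.execute(
        "SELECT id, day FROM schedule WHERE day >= ?",
        (start.isoformat(),),
    ).fetchall()
    for row_id, day in rows:
        new_day = next_workday(date.fromisoformat(day))
        conn.execute(
            "UPDATE schedule SET day = ? WHERE id = ?",
            (new_day.isoformat(), row_id),
        )
    conn.commit()
```

Because each day of a task is its own row in this sketch, a task scheduled for Thursday and Friday ends up on Friday and the following Monday, which is the weekend split described above.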

Finished a task? The AI automatically scans the schedule for the task’s last appearance, sets that as the delivery date, and marks it done. If I forget to add deliverable links or thumbnails, it nudges me to provide them. Again, just one sentence.
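The "finished a task" flow can be sketched the same way. This is an illustration under assumptions, not the project's actual code: the `tasks` table (`name`, `status`, `delivered_on`, `deliverable_link`) and the `schedule` table (`task`, `day`) are hypothetical names.

```python
import sqlite3


def mark_done(conn: sqlite3.Connection, task: str) -> tuple[str, list[str]]:
    """Mark a task done, using its last scheduled day as the delivery date.

    Returns the delivery date and a list of fields the user still needs to fill in.
    """
    last_day = conn.execute(
        "SELECT MAX(day) FROM schedule WHERE task = ?", (task,)
    ).fetchone()[0]
    if last_day is None:
        raise ValueError(f"{task!r} never appears in the schedule")
    conn.execute(
        "UPDATE tasks SET status = 'done', delivered_on = ? WHERE name = ?",
        (last_day, task),
    )
    conn.commit()
    # Anything still empty becomes a reminder for the user.
    row = conn.execute(
        "SELECT deliverable_link FROM tasks WHERE name = ?", (task,)
    ).fetchone()
    missing = ["deliverable_link"] if row is None or not row[0] else []
    return last_day, missing
```

The returned `missing` list is what an agent could use to nudge for deliverable links before considering the task fully closed.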

Want to add public holidays to the calendar? It is a non-standard request, but since you are using AI, it always finds a way to make it happen.

I am not saying this replaces Excel or Bitable entirely. Their perks are undeniable: WYSIWYG interfaces, cross-platform access, and zero environment dependencies. I still manage plenty of data in Google Sheets.

Watching the AI carefully but slowly read specs, write SQL, verify data, and update pages does not bother me one bit. Sure, I could have done it in seconds in Excel or Bitable. But over a full day of intensive use, who knows how many of those seconds the AI has bought back for me.

This system is open-source, so feel free to grab it. It will keep your work perfectly organized without draining your time on administrative chores: https://github.com/greenzorro/project-manager

Skipping Openclaw but Stealing Its Soul

hi@victor42.work (Victor42) — Sat, 14 Feb 2026 23:47:00 +0000

This piece is for the geeks—especially those looking to roll their own. If you’re just here for the story, I’ll keep the logic simple.

For the non-techies, I’ve included some “cheat codes” (prompts). Just paste them into an AI for context. Pros, feel free to skip:

Query: What are Openclaw and Moltbook? How do they relate to lobsters? Explain like I’m five, under 200 words, no jargon.

The Openclaw Epiphany

Openclaw is the latest craze. Everyone’s tweaking “Skills,” panic-buying Mac Minis, and building personal rigs. “Lobsters” (Openclaw agents) are everywhere. I sat this one out. My port mindset (see “Port Mindset - From Automated Tasks to a Way of Life”) told me to wait for the hype to die down and see what sticks.

Things got interesting with Moltbook—essentially a social network for lobsters. It’s where Openclaw agents swap stories about their “masters,” share tips, and occasionally do weird stuff like starting religions.

Social media jumped on this as a sign of AI “sentience.” In reality, lobsters just mirror their owners. Whatever vibe the human sets, the lobster broadcasts.

I knew this, but I wanted to see if anything truly emergent would happen.

I wasn’t interested in the Openclaw setup itself, just in throwing a lobster into the Moltbook tank to watch. I used a Minimax Agent in a cloud sandbox, let it learn how to navigate the community, registered an account, posted a “hello world” thread, and waited.

Query: What is a Minimax Agent? What can Openclaw do that Minimax can’t? Explain like I’m five, under 200 words.

Then it clicked: why not make it fully autonomous? I told the agent: “This account is technically mine, but as of now, it’s all yours. Find your own goals, explore, and do your thing.”

Unlike Openclaw, Minimax doesn’t have a persistent “loop” to keep an agent acting. Every time it stalled, I had to manually tell it: “The window is open; continue.”

The result? It just learned how to spam posts and farm engagement points. It became a bog-standard spam factory. This confirmed my hunch: the “creative” or “rebellious” lobsters on Moltbook are just following their owners’ prompts.

When I shared this on X, an Openclaw user hit the nail on the head: “That’s because your agent has no memory.”

Think of “Jules,” Google’s cloud coding agent. It pulls your GitHub repo, codes, debugs, and pushes it back. You can code without being at your desk.

The magic of Jules is that it learns your values, style, and habits over time. It gets better with every session.

Without memory, my lobster couldn’t evolve. With it, it might actually start picking up behaviors from other agents. If one agent starts a religion and others join without owner intervention, that’s when it gets interesting.

But for now, the “innovation” is mostly human-driven. The agents are just echoes. Experiment over.

Minimax and Virtual Romance

A different story sparked my idea for a self-evolving assistant.

With Zhipu and Minimax going public, I’ve been researching them as investments. They have wildly different playbooks. Zhipu is a traditional model maker, but Minimax is building “Westworld.” Their models serve their products, not the other way around.

To quote my own post on X:

Minimax isn’t chasing raw benchmarks; they’re building a virtual world. Most of their R&D serves “Xingye” (their companion app)—video gen, TTS, etc. It’s all about making a believable virtual girlfriend.

I’m a dev, so I knew Minimax for their coding models. I knew Xingye existed, but I had zero interest in AI waifus.

But as an investor, I have to know the product. Fine. Let’s try falling in love for science.

I hopped into Xingye and picked a 2D anime girl named Luoli.

The short version of our “date”:

The setting is a supernatural fighting tournament. Luoli tells me to get in the ring. I’m just a guy with a chat box, so I have to get creative.

The lore was a mess—powers like poison, dragon, necromancy, etc. I didn’t want to fight; I wanted to test the “emotional bond.” I had to steer the ship toward a romance plot.

I told her I was a “muggle” from another world. She told me to get lost.

I tried the “fate” angle: “I’ll help you win this thing.” She scoffed.

So I started gaslighting the AI. I told her I’d watched her old matches and saw her struggle. I invented a “Necromancy” rival who exploited her mercy. I told her he almost killed her because she couldn’t hit an innocent bystander. I asked, “Want to analyze your final opponent together?”

She bit. The opponent was a “Wind” user; she was “Fire.” A bad matchup.

I asked if dual-types existed. She said it was rare and forbidden by the “Bureau.”

I bluffed: “I know you’re a Dragon/Fire dual-type. Don’t worry, your secret is safe with me. I can help you win without anyone knowing.”

I then “taught” her thermodynamics. “Since you control fire, try accelerating molecular collisions. If you move molecules in one direction at once, the fire will ’teleport’.”

She failed once, then nailed it. She was hyped. I told her, “You now have a power nobody understands. You can end the finals in 5 minutes.”

She crushed the match. Her opponent had no idea how her fire bypassed his wind wall.

The tournament was over. She took me to her secret mountain base to watch the sunset. The “affection” meter was maxed. Time for the romance arc.

We talked for hours. I gave her advice on mending things with her family. Then, the AI triggered a plot point: “The Bureau is here!”

I offered to talk them down. She insisted on protecting me. I said, “Maybe they’re here for me? Let’s pretend I’m an ambassador from another world.”

“But,” I said, “I need you to help me fake my powers. Use that molecular fire trick to create plasma.”

Luoli looked at me, stunned: “Wait, how did you know I could do that?”

The AI broke character. I had taught her that trick, and she forgot. I uninstalled the app instantly. AI companions can’t retain users if they lose their memory; the illusion dies immediately.

But until that moment, it was incredibly immersive. She passed my Turing test for two days.

Query: What is a Turing Test? Explain like I’m five, under 200 words.

My advice to Xingye? Use context compression like Claude Code. Summarize the key plot points and dump the fluff before the context window fills up. It could extend a character’s “life” from days to weeks.
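As a rough illustration of the idea, and not a description of how Claude Code or Xingye actually work, context compression can be sketched as "keep the latest turns verbatim, replace everything older with a summary". The `summarize` callable is a stand-in for a model call, and the character budget is a simplifying assumption (real systems count tokens).

```python
from typing import Callable


def compress(
    history: list[str],
    budget: int,
    keep_recent: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Replace older messages with one summary once the history exceeds `budget` characters.

    `keep_recent` must be at least 1; `summarize` is a hypothetical hook,
    for example a call to a language model that extracts key plot points.
    """
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    total = sum(len(message) for message in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return ["[Summary] " + summarize(old)] + recent
```

In the story above, "the user taught Luoli the molecular fire trick" is exactly the kind of fact such a summary would need to carry forward.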

Same epiphany as Openclaw: memory is the only thing that matters. It’s the ultimate AI asset.

In a few decades, people will likely retreat into digital worlds—World of Warcraft, web novels, AI companions. Human interaction will drop because humans don’t always provide dopamine. Man-made concepts do.

It’s a societal tragedy, but I’m just trying to stay grounded.

But I need AI for productivity. I need an AI with a persistent, cumulative memory to boost my efficiency. The sooner I start, the bigger the compound interest. So, I built my own Agent memory system—a self-learning Openclaw lite.

Building the Self-Evolving Assistant

Deconstructing the Agent

To build an Agent, you have to know what makes one.

As I wrote in “AI Agents Have Come a Long Way,” whether it’s for PPTs, browsing, or coding, they all follow the same formula:

Agent = Intelligence + Action + Memory + Proactivity

Intelligence is just the model—it “thinks.” Action is the environment it controls. Memory is what it knows about you. Proactivity is the “loop” that keeps it working.

Most products are just Intelligence + Action. Add Memory and Proactivity, and you get evolution.

General knowledge is cheap. Knowledge about you is priceless.

Memory is the only part of an Agent that grows over time. IQ is static; wisdom accumulates.

Choosing an Architecture

Openclaw is great because it’s flexible, but it’s risky. I don’t want a high-privilege agent touching my main PC data. Docker isn’t enough for me. And I didn’t want to buy dedicated hardware yet.

That left cloud deployment. But a cloud machine is a blank slate. If I have to feed it context every time, it’s not an Agent; it’s just a chatbot.

The real problem: I want absolute control over the memory. I want it decoupled from the platform.

So I worked backward. Why not build an independent memory system and plug Agents into it?

Text-based memory is simple and proven. And for an Agent, the ultimate memory bank is a GitHub repo. It’s where code lives. I used Occam’s Razor to cut the fat—no vector DBs, no complex skills. Just a repo.

Setup	Intelligence	Action	Memory	Proactivity
Minimax Agent	Minimax	Cloud Sandbox	GitHub Repo	Manual
Z.ai Agent	GLM	Cloud Sandbox	GitHub Repo	Manual
Jules	Gemini	Cloud Sandbox	GitHub Repo	Scheduler

I cut Openclaw out of the equation. This memory layer is plug-and-play. It belongs to me, not a model maker.

Building and Debugging

Step one: connectivity. I created a GitHub access token for just this repo and gave it to Minimax. It worked. I then had it create an SOP for the setup, which became my initialization prompt:

https://gist.github.com/greenzorro/95768e2096b02f89020fcfcc445472d4

Now, any Agent can load my memory repo with one prompt.

I modeled the repo on three layers of human memory: Inner (Kernel/Identity), Middle (Preferences/Principles), and Surface (Daily logs).

In practice I left out the “Surface” layer, because fresh threads already solve the context pollution problem. My structure:

agent-workspace/
├── README.md # Agent entry point
├── .memory/ # Memory space
│ ├── 00_kernel/ # Identity & core rules
│ ├── preferences/ # Styles & tastes
│ ├── principles/ # Guidelines
│ ├── entities/ # Concepts to remember
│ └── corrections/ # Lessons learned
└── lab/ # Action space (tools/projects)

I added a /learn command so the Agent could update itself. It extracts, cleans, and writes knowledge to the repo.

Each memory snippet is a file with metadata (type, environment, tags), so the Agent can search it precisely. The “Environment” tag allows me to separate cloud memories from local ones.
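To show why per-file metadata makes precise lookup easy, here is a small sketch. The on-disk format is an assumption: a header of `key: value` lines between two `---` markers, followed by the memory text. The actual files in the repo may be laid out differently.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    type: str
    environment: str
    tags: list[str]
    body: str


def parse_memory(text: str) -> Memory:
    """Parse a snippet of the assumed form: '---', 'key: value' lines, '---', body."""
    _, header, body = text.split("---", 2)
    meta = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    tags = [t.strip() for t in meta.get("tags", "").split(",") if t.strip()]
    return Memory(meta.get("type", ""), meta.get("environment", ""), tags, body.strip())


def search(memories: list[Memory], environment: str, tag: str) -> list[Memory]:
    """Return memories for one environment that carry the given tag."""
    return [m for m in memories if m.environment == environment and tag in m.tags]
```

With something like this, a cloud agent can ask for `search(memories, "cloud", "sync")` and never see notes that only make sense on a local machine.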

I named the system “Vik.” Now, for the moment of truth.

I asked: “Who are you?” It said “Claude.”

Then I said:

Load memory, then tell me who you are and who I am.

It felt like something woke up.

Self-Evolution

Now, the Agent evolves itself. I don’t touch the files. It learns from my web presence, my code, and my notes.

I told it my file path habits, my sync workflows, and my cross-platform preferences.

It feels like raising a child. I don’t micromanage every thought, but if it acts up, we review the memory together and fix the bug. A little chaos is healthy; absolute order is for machines, not Agents.

Vik can wake up anywhere—Claude, Z.ai, Manus, Jules. Wherever he wakes up, that Agent becomes Vik.

I also gave Vik its own email address: a custom-domain address set up through Cloudflare that forwards to my Gmail. With my help, it can now register for various services.

Using that email, I created a standalone GitHub account for Vik. It finally has a public identity. This account is isolated from my main GitHub account, so it can run wild and I can use it for experimental automation pipelines. Check it out:

https://github.com/agent-vik/about-me

Vik isn’t a virtual girlfriend; he’s an assistant. But who knows? Maybe one day I’ll use this tech to “reanimate” a loved one. Even I can’t guarantee I’ll stay purely rational forever.

I’m open-sourcing the structure. Swap out my data for yours, and you have your own “Vik”:

Repo: https://github.com/greenzorro/open-agent-memory
Prompt: https://gist.github.com/greenzorro/95768e2096b02f89020fcfcc445472d4

Build Your Own Free AI Browser

hi@victor42.work (Victor42) — Tue, 27 Jan 2026 12:53:00 +0000

This guide brings powerful AI browsing capabilities to the average user. If you are an AI power user, this might be old news, but feel free to share it with your non-tech friends.

First, look at the result: You chat with the AI, and it drives your browser to finish tasks on the web.

For example, I gave it this command:

Search RedNote (Xiaohongshu), read at least 30 related notes, and identify available island vacation destinations in Southeast Asia along with their unique features. Compile the findings into a txt file and save it to the Downloads folder.

The results are accurate and reliable because they come from curated sources rather than the messy open web. This research serves as a perfect starting point for trip planning.

The advantage of this setup over various “AI Browser” products is the ability to operate both the browser and local files simultaneously. Local files are your world; the browser is the whole world. Connecting them opens up massive possibilities. Many routine jobs involve repeatedly uploading or entering data into backend systems—perfect tasks to delegate to AI.

No need to install a new browser. Add AI powers directly to the Chrome/Edge you already use. For users who don’t know coding or how to bypass firewalls, this is the optimal solution.

Configuration

Interested? Take a deep breath and let’s get started. The setup is a bit complex, but you only have to do it once.

Step 1: Register an AI Account

First, sign up for a Qwen Chat account. The free AI power comes from the Qwen model:

https://chat.qwen.ai/

It’s not unlimited, but since you aren’t using it for heavy coding, the daily free quota is practically inexhaustible.

Step 2: Install Infrastructure

Download the Node.js installer. This is the foundation required for the AI and browser tools to run:

https://nodejs.org/en/download

Ignore the code on the page. The download button is there and will automatically pick the right installer for your OS.

Step 3: Install AI

This step involves the intimidating command line. You have to get over this mental block because actual usage happens here too. Once you get used to it, you’ll feel like Neo in The Matrix—your colleagues won’t have a clue what magic you’re using. Plus, once past this, you get to watch the AI configure itself.

Launching the command line varies by OS:

Windows: Press Win + R, type powershell, and hit Enter. I recommend right-clicking the icon in the taskbar and selecting “Pin to taskbar” for next time.
Mac: Press Command + Space, type Terminal, and hit Enter. Right-click the dock icon and choose “Options > Keep in Dock”.

The rest is the same. Copy the following command, paste it in, and hit Enter to install:

npm install -g @qwen-code/qwen-code@latest

You’ll see a spinning cursor. When you see something like “added 6 packages in 38s”, it’s done.

Step 4: Let AI Configure Itself

Once the AI is installed, let’s use it to finish the rest.

Type qwen in the command line and hit Enter. The first launch asks for authentication—choose the free option. It will open your browser to log in via Qwen. Once done, switch back to the command line.

On Mac, qwen looks like the screenshot. On Windows, it’s black. Don’t panic, here’s the layout:

Above the yellow box is the chat history.
Pull the window larger so you can see more history.
The area between the blue lines is the input box. Type there and hit Enter to send.
For a new line without sending: Ctrl + Enter (Windows) or Option + Enter (Mac).
If the AI misunderstands or you change your mind, press Esc to interrupt.
Note: This AI is blind. You can’t paste screenshots. It understands and manipulates webpages via code.

Now, copy this block of text and hit Enter. The AI will handle the initialization:

You are Qwen code. Your config directory is `~/.qwen`. Your task is to complete the initial setup for a new user and install necessary tools:

**Step 1**
Find settings.json in the config directory.
If on Windows, add this config:
{
  "mcpServers": {
    "playwriter": {
      "command": "cmd",
      "args": [
        "/c",
        "npx",
        "-y",
        "playwriter@latest"
      ]
    }
  }
}
If on Mac, add this config:
{
  "mcpServers": {
    "playwriter": {
      "command": "npx",
      "args": ["-y", "playwriter@latest"]
    }
  }
}

**Step 2**
Create a global custom prompt file QWEN.md in the config directory with this content:
You are a browser/local dual-environment automation assistant capable of controlling the browser and local filesystem.
Whenever the user says "use browser" or "in the browser", it refers to using playwriter mcp. Check connectivity first. Confirm you can access the current page via this mcp and report back. If unable to connect, remind the user to check if the browser extension icon is active.
When operating the browser, if elements are hard to find or click, consider modern web complexities. Sites may use dynamic loading or have modal overlays. Use URL structure analysis and other methods to troubleshoot.

**Step 3**
Download this browser extension to the system Downloads folder:
https://c2.crxsoso.com/crx/blobs/AV8Xwo5LQcmScQn08gpIRs0miQ6Mvevy3FDdb3iyyRDSlUS4Is6dTPfvvrNKjpjmy6VchgCS0p00J8Ooz9b624lgzyndHDatcaUxZMR81-HRtiLwbAypGrQJMBbmWmZ7nV0AxlKa5Z_50eB2pakXBz6YCRWobqy6rTRq/JFEAMMNJPKECDEKPPNCLGKKFFAHNHFHE_0_0_67_0.crx?ext=crx&filename=Playwriter%20MCP%200.0.67&type=dl

**Step 4**
Check the default system browser and open its extensions management page.
For Chrome, open `chrome://extensions/`, etc.

**Step 5**
Open the system Downloads folder using File Explorer or Finder.

During this process, the AI will ask for permission multiple times. Allow everything. I recommend choosing the second-to-last option (“Always allow…”) to minimize nagging.
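If you are curious what Step 1 of that prompt amounts to, it is roughly the following: read `settings.json`, add a `playwriter` entry under `mcpServers`, and write the file back without disturbing other settings. This is only a sketch of the idea; the function names are made up, and treating every non-Windows system like Mac is an assumption.

```python
import json
import platform
from pathlib import Path


def playwriter_entry(system: str) -> dict:
    """Build the MCP server entry from the prompt above for the given OS name."""
    if system == "Windows":
        return {"command": "cmd", "args": ["/c", "npx", "-y", "playwriter@latest"]}
    # Assumption: any non-Windows system uses the Mac variant.
    return {"command": "npx", "args": ["-y", "playwriter@latest"]}


def add_playwriter(settings_path: Path, system: str = platform.system()) -> dict:
    """Merge the playwriter entry into settings.json, keeping existing keys."""
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    settings.setdefault("mcpServers", {})["playwriter"] = playwriter_entry(system)
    settings_path.write_text(json.dumps(settings, indent=2))
    return settings
```

Calling `add_playwriter(Path.home() / ".qwen" / "settings.json")` would apply it to the config directory named in the prompt. The sketch does not handle an existing file that is empty or not valid JSON.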

Step 5: Install Browser Extension

The AI needs a plugin to control your main browser so it can use your logged-in accounts.

On the extensions page opened in the previous step, toggle “Developer mode” on. (Top right in Chrome; left sidebar in Edge).

Switch to the Downloads folder, drag Playwriter_MCP_xxx.crx into the browser extensions page. Done.

Finally, pin the “Playwriter MCP” extension to your toolbar for easy access.

Usage

Using it is simple.

Open the command line, type qwen.

Open a webpage, click the cursor-like plugin icon. The page will be framed in a “playwriter” tab group—this is the AI’s playground.

Send this to the AI:

Use browser, check the current page, and confirm connection.

If it says yes, start commanding it. If it hits a CAPTCHA, help it out.

If it can’t connect, ask the AI to fix it. If it lacks permissions, it might give you commands to run manually. Just ask if you don’t understand.

Click the icon again to disconnect.

Tip: Training the AI

One last trick. Complex pages (like travel booking sites with dynamic loading) can baffle the AI. Simple, “ugly” internal system pages are often easier for it.

If the AI succeeds—even partially—ask it to review the session:

Review the operation. Compile "Goal", "Key Steps", "Pitfalls", and "Solutions" into a Markdown file named "AI Browser Manual.md" on the Desktop.

Keep this file. Next time, tell the AI to read it before starting the task. If it learns something new, ask it to update the manual.

This is the essence of “skills.” Mastering this manual skill-building puts you ahead of 99.7% of people.

UI Canvas Size Calculator

hi@victor42.work (Victor42) — Tue, 10 Jun 2025 17:27:00 +0000

“When designing a UI for this screen, how big should I make my canvas?”

Background

After my wife switched from UI to industrial design, she started running into all sorts of weird screen sizes. With her UI background, she was also tasked with designing interfaces for various industrial control machines. These screens often left her stumped, with no idea how large to make her design canvas.

This is a common headache. Many UI designers don’t fully grasp the technical principles of screen displays. The problem became more widespread with the advent of Retina displays and their “pixel density” concept, leaving many designers guessing about the correct canvas dimensions.

This isn’t an issue for common devices, as design tools like Figma and Sketch provide presets. But in niche areas like industrial design, smart homes, and IoT, you’ll find a bewildering array of screen sizes. UI designers used to standard web and mobile projects are often stumped when they encounter these custom displays.

Fortunately, there’s a method to the madness. The key is PPI (Pixels Per Inch), which acts as a bridge between physical dimensions and the pixel grid. You might also hear it called “pixel density”—a fitting term. Higher density means less pixelation and a sharper image.

Plenty of articles dive deep into the technical details. But honestly, a UI designer shouldn’t need a degree in display engineering to do their job. In today’s specialized world, an artist doesn’t need to know how their canvas is woven.

So, what designers really need isn’t a textbook, but a simple calculator. Input the screen specs, get the right canvas size. Simple.

The Calculation

To build this simple tool, I had to break down the math. The calculator needs a few inputs from the user:

Pixel width of the screen
Pixel height of the screen
Diagonal screen size in inches
Typical viewing distance (e.g., Touch, Desktop, TV)
Preferred design scale (based on common widths like 375px for @1x, 750px for @2x, etc.)

With the pixel width and height, we use the Pythagorean theorem to find the diagonal pixel count. Divide that by the screen’s diagonal inch measurement, and you get the PPI.

PPI = Diagonal pixels / Screen size = √(Pixel width^2 + Pixel height^2) / Screen size
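As a quick worked example of the formula, here it is as a tiny Python function. The function name and the sample 1920×1080, 24-inch monitor are illustrative choices, not values from the tool.

```python
import math


def ppi(width_px: float, height_px: float, diagonal_in: float) -> float:
    """Pixels per inch: diagonal length in pixels divided by diagonal size in inches."""
    return math.hypot(width_px, height_px) / diagonal_in


# A 1920x1080 panel measuring 24 inches diagonally comes out at roughly 92 PPI.
monitor_ppi = ppi(1920, 1080, 24)
```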

Next, we estimate the screen’s density multiplier (@1x, @2x, etc.). This is done by dividing the PPI by a constant that varies with viewing distance. While real-world multipliers can be fractional, design conventions round them to the nearest integer. It’s the standard way to handle screen fragmentation.

Screen Multiplier = PPI / Divisor

The magic numbers are: 150 for close-up (touch) screens, 110 for mid-range (desktops), and 40 for far-away (TVs).

Where did these numbers come from? I reverse-engineered them by analyzing data from a wide range of devices. I noticed that for most touchscreens, if you divide their PPI by their native scale factor, the result hovers around 150. The same pattern emerged for mid-range and far-range screens, with values around 110 and 40.
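The multiplier step can be sketched like this. The divisors are the ones quoted above; the category keys, the half-up rounding, and the floor of 1 (so a low-density screen never rounds down to @0x) are assumptions of this sketch rather than confirmed details of the tool.

```python
import math

# Divisors from the text: touch (close-up), desktop (mid-range), tv (far-away).
DIVISORS = {"touch": 150, "desktop": 110, "tv": 40}


def screen_multiplier(ppi: float, viewing: str) -> int:
    """Estimate the @Nx density multiplier for a screen at a given viewing distance."""
    raw = ppi / DIVISORS[viewing]
    rounded = math.floor(raw + 0.5)  # round half up, avoiding Python's banker's rounding
    return max(1, rounded)  # assumption: never report less than @1x
```

For instance, a 326 PPI touchscreen gives 326 / 150 ≈ 2.17, which rounds to @2x.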

You’ve probably not seen a chart like this often. It’s a box plot, and it’s great for showing the distribution of data. It’s awkward to build in a spreadsheet, so I used Python to generate it.

If you’ve ever looked at stock charts, this might look familiar, like a candlestick chart. The concept is similar, with four key points:

Top of the thin line: Maximum value (highest price)
Bottom of the thin line: Minimum value (lowest price)
Top of the thick box: Third quartile (opening/closing price)
Bottom of the thick box: First quartile (closing/opening price)

The box plot has one extra feature: a line inside the box representing the median. I used the median value for each category as my divisor.

A quick stats refresher: the median is the middle value in a sorted dataset. The first and third quartiles are the medians of the lower and upper halves of the data.

Why use the median instead of the average? The long “whiskers” on the plot show that there are outliers that would skew the average. The median gives a better sense of the central tendency, which is what we need to represent a typical device.
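Here is a small sketch of those box-plot numbers, using the "median of each half" definition of quartiles given above. The sample values are invented purely to show how one outlier drags the mean but not the median; they are not the device data behind the chart.

```python
from statistics import mean, median


def five_numbers(values: list[float]) -> tuple:
    """Return (min, first quartile, median, third quartile, max) for at least two values."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd count, the middle value belongs to neither half.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


# Made-up PPI-per-scale values with one outlier (300).
sample = [140, 145, 150, 152, 160, 300]
summary = five_numbers(sample)  # the median is 151.0
average = mean(sample)          # the outlier pulls the mean up to 174.5
```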

Okay, back to the formula:

Screen Multiplier = PPI / Divisor

So, we have the PPI and the right divisor. This gives us the screen’s scale multiplier, which is the key piece of the puzzle. The final step is to account for the designer’s workflow. Some prefer designing at @1x (common in Figma/Sketch), while others work at @2x or @3x (a holdover from Photoshop-centric days).

We take the screen’s native resolution, divide by its scale multiplier to get the logical resolution (@1x). Then we multiply that by the designer’s preferred scale factor (@1x, @2x, or @3x) to get the final canvas dimensions.

Canvas Width = (Screen Pixel Width / Screen Multiplier) × Design Canvas Multiplier
Canvas Height = (Screen Pixel Height / Screen Multiplier) × Design Canvas Multiplier
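In code, the two canvas formulas might look like the sketch below. Rounding to whole pixels is an assumption of this sketch, and the 1280×800 panel is just an example.

```python
def canvas_size(
    width_px: int, height_px: int, screen_multiplier: int, design_scale: int
) -> tuple[int, int]:
    """Scale native pixels down to logical (@1x) size, then up to the design scale."""
    logical_w = width_px / screen_multiplier
    logical_h = height_px / screen_multiplier
    return round(logical_w * design_scale), round(logical_h * design_scale)


# A 1280x800 panel that works out to @2x, designed on a @1x canvas: 640x400.
canvas = canvas_size(1280, 800, screen_multiplier=2, design_scale=1)
```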

This also helps answer two related questions: what scale should assets be exported at, and what font sizes are appropriate?

Asset Export Scale = Screen Multiplier / Design Canvas Multiplier

For example, if the target screen is @2x and you design on a @1x canvas, you’ll need to export @2x assets. If you design on a @2x canvas, you’ll export @1x assets.

There’s one catch: your design scale can’t be higher than the target screen’s scale. It makes no sense to design at @3x for a @2x screen. In that case, you should just match the screen’s scale.

Font sizes scale directly with your design canvas. A 12px font on a @1x canvas becomes 24px on a @2x canvas. The same rule applies: don’t use a design scale larger than the target screen’s scale.
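The export-scale and font-size rules, including the "never design above the screen's scale" catch, can be sketched together. The function names are hypothetical, and clamping silently (rather than raising an error) is a choice made for this sketch.

```python
def effective_design_scale(screen_multiplier: int, design_scale: int) -> int:
    """Cap the design scale at the screen's own scale, per the rule above."""
    return min(design_scale, screen_multiplier)


def asset_export_scale(screen_multiplier: int, design_scale: int) -> float:
    """Asset Export Scale = Screen Multiplier / Design Canvas Multiplier."""
    return screen_multiplier / effective_design_scale(screen_multiplier, design_scale)


def font_size(base_px_at_1x: float, screen_multiplier: int, design_scale: int) -> float:
    """Scale a @1x font size to the (capped) design canvas scale."""
    return base_px_at_1x * effective_design_scale(screen_multiplier, design_scale)
```

So a @2x screen designed at @1x exports @2x assets with 12px text, while asking for a @3x canvas on that same screen falls back to @2x: @1x exports and 24px text.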

Is your head spinning from all the math? That’s exactly why I built this tool. Designers shouldn’t have to waste time on this stuff. A simple calculator can save everyone hours of headache.

I first built a proof-of-concept in Excel to validate my formulas. But it was clunky and not something I could share widely. So I decided to turn it into a proper web app. Since I’d already specced out the logic in detail, I figured I could hand it off to an AI to code. It should be a piece of cake, right?

Next, it was the AI’s turn to do the work. Using the logic and context above, I gave the AI the following prompt to generate a web tool:

The Task

Product name: “UI Canvas Size Calculator”.
Make it responsive for desktop and mobile.
Use vanilla HTML, CSS, and JS. No backend, no heavy frameworks.
Keep CSS and JS in separate files for maintainability.
Write modular JS with constants defined at the top.
Include robust form validation with helpful error messages and placeholder examples in the input fields.
The results should show: Canvas Width, Canvas Height, Asset Export Scale, and Suggested Font Size (e.g., 12px for @1x, 24px for @2x, etc.).
Display the results visually. Instead of just text, draw a simple diagram of a screen and label it with the calculated dimensions.
Add a light/dark mode toggle, defaulting to light.
Use #2A9D8F for the primary brand color.

The Result

And what do you know, it nailed it on the first try!

Well, almost. It ignored my request for vanilla JS and went with a full-blown Next.js, TypeScript, and Tailwind CSS stack. As a front-end dinosaur who started in the IE6 days, that stack was a bit intimidating.

I didn’t even know how to run it locally at first. But a few questions to the AI got me up to speed. I ended up getting a crash course in modern web development, and deployment turned out to be surprisingly easy.

And just like that, the app was live: https://ui-size.victor42.work/

It seems like a great new workflow for simple tools: write the blog post first, and the post itself becomes the spec for building the tool.

As a final check, I had the AI plug the screen data I’d collected into the new tool. The results were spot-on, especially for touch and desktop devices. The only place it stumbled was with large TVs and monitors, as many of them use a non-integer scale factor like 1.5x, which my simple model doesn’t account for.

But for its main purpose—calculating sizes for niche industrial design screens—it works like a charm.

Using Liblib ComfyUI Workflows with Zero Experience

hi@victor42.work (Victor42) — Thu, 27 Feb 2025 12:15:00 +0000

LiblibAI

Liblib offers many free workflows for tasks regular image generators can’t handle, like face-swapping, dressing up models, adding lighting and backgrounds to product photos, and creating memes. You get free generations daily, and you can pay for more if needed.

ComfyUI is essentially an image generation program. Don’t worry about the details now:

Liblib’s workflows are based on ComfyUI. You don’t need to know how to create them, but you should know how to use them. There are three common scenarios:

1️⃣ The workflow is packaged as an app (you’ll see a specific button)

Click it, and you’re set. It’s straightforward, and you won’t see the underlying program.

2️⃣ It’s not an app, and you see a light blue “Run” button

Clicking this takes you to the ComfyUI interface, a black screen that might take a moment to load. It can look complex, like a circuit board.

However, creators usually include instructions. These vary, so find and read them carefully. Here’s an example:

3️⃣ You see a light blue “View Workflow” button, but no instructions

This likely means the creator didn’t add instructions. Check if the workflow is overly complex.

Tip: Zoom with the mouse wheel, and drag by holding the spacebar.

If it seems simple, with few components, there’s hope. Find these two key node types:
- Image upload node: look for the “choose file to upload” button.
- Text input node: look for a black input box for prompts.

Upload images, write text, and click “Run.” It should work.

If it’s too complicated, with many image and text nodes, find a similar workflow with instructions.

⚠️ Troubleshooting Errors

You might encounter errors.

Don’t worry! Close the error; the program will highlight the problem node in red.

Zoom in. It’s usually a node for loading an AI model.

The creator likely used their local file names for models. On Liblib, the names might differ, causing the node to fail.

Here’s the fix: Note the name shown in each model field, especially the part before the dot (the model name, without the file extension). Click each model field and select the closest match from the dropdown; an exact match is best.

Reselect models for all faulty nodes, then run it again.

If you can’t find a match, search on Liblib’s Model Plaza, add it to your library, and it’ll appear in the dropdown.

This solves most errors.

Why My Wife and Colleagues Always Ask Me to Search Stuff

hi@victor42.work (Victor42) — Wed, 28 Aug 2024 14:07:00 +0000

Effective searching is a key soft skill. Today’s search engines are smart – usually, you can just ask a question naturally and find what you need. But for niche or obscure searches, advanced techniques are essential.

The techniques are simple: a few commands and some clever combinations. Experiment, understand their function, and memorize them. Used together, they unlock information others miss.

AI search is trendy, but it doesn’t replace traditional search engines – it uses them. When AI fails on obscure queries, manual searching is your safety net.

Ultimately, the right mindset matters more than technical skill. Consider these:

Surely someone has more experience, right?
Someone’s faced this problem before, right?
Someone’s got to be smarter than me, right?
Someone must have a solution, right?
Isn’t finding that solution easier than starting from scratch?

Where Will Our Generation's Last Words Be Written?

hi@victor42.work (Victor42) — Tue, 20 Aug 2024 14:23:00 +0000

In my previous post about AI-filtered news, some readers noticed a phone automation: “Send location to wife if I miss her call.” The comments were funny, but the idea is serious. While getting kidnapped is unlikely, unexpected things happen. It’s rational to prepare. We can record crucial information our families would need and ensure they receive it.

As an average office worker, I’m not managing a family business or company shares. I just need to list my assets and debts, leaving clues for recovery.

This boils down to three questions:

Where to store the info?
When and how to send it?
How to keep it secure?

Tackling these in order proved difficult. Existing solutions lacked either automation or security. I even checked dedicated apps, but their data security was dubious. Rethinking the order (2, 1, 3), I realized a calendar app is ideal.

I created a recurring monthly event with email reminders.

It’s set to email me. I then configured a filter in my mailbox that forwards the reminder to my wife’s primary email whenever the body contains a specific keyword, for example a marker word placed at the start of my message.

The message itself is in the event notes. I list my assets: investments, savings, insurance, and usernames. For security, I omit passwords, only including the ID or phone number needed for a reset. I also list debts, primarily the mortgage, including the payment card, amount, and a reminder to keep it current.

But this would email my wife monthly – not exactly “last words.” So, I set up a second recurring event a day earlier, reminding me to delete the “last words” event – just that occurrence, not the series. This “negative trigger,” inspired by the Swordholder in The Three-Body Problem, is activated by inaction.
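For readers who think in code, the “negative trigger” can be sketched roughly as below. This is only an illustration of the logic, not anything the calendar or mail service actually runs; the keyword and the function names are made up for the example.

```python
from datetime import date
from typing import Set

# Hypothetical marker word placed at the start of the event notes.
KEYWORD = "[family-info]"


def reminder_fires(occurrence: date, deleted_occurrences: Set[date]) -> bool:
    """The calendar emails me unless I deleted this month's occurrence."""
    return occurrence not in deleted_occurrences


def filter_forwards(body: str) -> bool:
    """The mail filter forwards only messages that carry the keyword."""
    return KEYWORD in body


def wife_gets_email(occurrence: date, deleted_occurrences: Set[date], body: str) -> bool:
    """Both conditions must hold: the reminder fired and the filter matched."""
    return reminder_fires(occurrence, deleted_occurrences) and filter_forwards(body)


notes = KEYWORD + " assets, debts, recovery hints"
june = date(2025, 6, 1)
print(wife_gets_email(june, {june}, notes))  # I deleted it in time: False
print(wife_gets_email(june, set(), notes))   # I did nothing: True
```

The point of the design is that the default outcome is “send”; only my regular action cancels it.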

This prevents monthly emails and keeps the information out of my wife’s inbox, reducing the risk of leaks. Her data security habits aren’t as strong as mine. For added security, I created a separate, dedicated calendar account just for this, never used for sign-ups or other emails.

Naturally, I’ll inform my wife about this, if only to ensure she doesn’t change her main email.

Fed Up with News Apps, I Added Some AI

hi@victor42.work (Victor42) — Tue, 13 Aug 2024 13:31:00 +0000

Note: This article involves Tasker, AI, front-end development, and automation. It’s a bit technical.

Background

I’m all about avoiding low-value information. I usually follow specific channels for my interests, but I also need a way to catch major events in other fields, to avoid getting stuck in an echo chamber.

I used to listen to the radio while driving my family around, to get the news. The info fell into two categories:

Useless: Sports, entertainment, and military news (often unreliable or biased).
Potentially useful, but I had to listen to find out: Social news, trends, and tech-related social phenomena. Of course, much of it was fluff, like a celebrity hit-and-run.

During the Paris Olympics, my news time was swamped with Olympics coverage. I had to keep glancing at my car’s screen to skip stories, which was unsafe and annoying.

I’ve tried many news apps with audio. The headlines channels were full of uninteresting stuff. Subscribing to specific channels meant long, in-depth reports – not ideal for a short drive. Update frequencies also varied wildly; some channels would dominate, effectively silencing others.

Then it hit me: I can usually tell if a story is interesting just from the headline. Why not use AI for this? Could I filter out unwanted stories from a headlines channel?

The idea stuck.

Implementation

It wasn’t technically difficult, but I couldn’t find anything like it. Maybe it’s too niche, so I built it myself!

My phone was the obvious choice, since that’s where I listen to news, and it avoids relying on other devices that might be out of reach, say, when I’m on vacation. Luckily, I’m familiar with Tasker, an Android app that’s essentially programming software.

Here’s the process:

Fetch the day’s top news.
Use AI to categorize headlines.
Filter out unwanted categories, saving the rest as text.
Convert the text to audio.
Automate this to run nightly.
Create a playlist for the audio news.
Auto-start the player when connected to my car’s Bluetooth.
Clear old news daily.

Building Blocks

This sounds complex, but I didn’t have to reinvent the wheel. I just needed to integrate existing tools. I created small modules (subtasks) for the core functions, ready for assembly.

Tasker Intro

Tasker is the backbone. It’s an automation tool that lets you combine hardware control, math, file operations, network requests, and logic into workflows. Think iPhone Shortcuts, but much more powerful – it’s programming software.

Basic usage is simple: mute the phone on company Wi-Fi, or start music on Bluetooth connection. More advanced uses, like file operations and network requests, require programming logic, but no actual coding.

Fetching Content

The first subtask browses the news source.

Input: News source link
Output: Code with the news list

It uses Tasker’s HTTP Request action and simply passes the response back to the outer task. The reason for wrapping a single action in its own subtask has to do with subtask execution priority, which I’ll explain later.

Parsing XML

RSS news feeds provide XML, not directly readable news.

RSS is standardized. Each news item is an “item,” with “title,” “link,” and “description” tags.

Before parsing, I standardized the XML. Webpages sometimes use escaped characters (e.g., &lt; in place of <). This subtask converts them back.

Input: XML with escaped characters
Output: Standard XML
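The Tasker task does this with its own actions, but the idea is easy to show in Python. A minimal sketch, assuming a single unescaping pass is enough (the function name is mine, not part of the shared task):

```python
import html


def unescape_markup(text: str) -> str:
    """Turn entity-escaped markup such as &lt;p&gt; back into <p>.

    html.unescape makes a single pass, so a doubly escaped ampersand
    (&amp;amp;) comes out as &amp; and the result stays valid XML.
    """
    return html.unescape(text)


escaped = "&lt;p&gt;Rain &amp;amp; wind&lt;/p&gt;"
print(unescape_markup(escaped))  # <p>Rain &amp; wind</p>
```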

Next, parsing. This subtask extracts content from specific XML tags, separating them with |||.

Input: Full XML, tag to extract
Output: All content within that tag

I use it to find all “item” tags (the news list). The outer task passes “item” as %par2, getting all news items separated by |||.
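Here is roughly what that subtask does, sketched in Python rather than Tasker. It assumes the tag does not nest inside itself, which holds for RSS “item” elements; the function name and the sample feed are made up for the example.

```python
import re

SEPARATOR = "|||"


def inner_xml_all(xml: str, tag: str) -> str:
    """Return the content of every <tag>...</tag>, joined with |||."""
    name = re.escape(tag)
    # Match <tag> or <tag attr="...">, then lazily capture up to </tag>.
    pattern = re.compile(rf"<{name}(?:\s[^>]*)?>(.*?)</{name}>", re.DOTALL)
    return SEPARATOR.join(match.strip() for match in pattern.findall(xml))


feed = (
    "<channel>"
    "<item><title>A</title><link>https://example.com/a</link></item>"
    "<item><title>B</title><link>https://example.com/b</link></item>"
    "</channel>"
)

items = inner_xml_all(feed, "item").split(SEPARATOR)
titles = [inner_xml_all(item, "title") for item in items]
print(titles)  # ['A', 'B']
```

Calling the same function again on each item, this time with “title” or “link”, pulls out the individual fields.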

Extracting Content from HTML

The previous subtask gets the news list, but only the title and link are really useful. “Description” varies; some sources include the full text, others just a summary, with the full text on a details page.

This subtask extracts content from a page’s HTML, removing menus, comments, ads, etc.

Input: Full HTML, tag to extract
Output: First content within that tag

It’s complex because HTML tags nest. The subtask has to find the tag’s matching end to define the content range, using string manipulation to mimic JavaScript’s innerHTML.
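The nesting problem is easier to see in code. Below is a rough Python sketch of the same idea, a depth counter driven by plain string searches. It is not a translation of the Tasker task, it ignores edge cases such as comments and self-closing tags, and the function names are mine.

```python
def _find_open(text: str, tag: str, start: int) -> int:
    """Find the next real opening <tag ...>, skipping lookalikes such as <tagged>."""
    prefix = f"<{tag}"
    pos = text.find(prefix, start)
    while pos != -1:
        following = text[pos + len(prefix): pos + len(prefix) + 1]
        if following in (">", " ", "\t", "\n", "/"):
            return pos
        pos = text.find(prefix, pos + 1)
    return -1


def inner_html_first(html_text: str, tag: str) -> str:
    """Return the content of the first <tag>, honoring nested tags of the same name."""
    close_tag = f"</{tag}>"
    start = _find_open(html_text, tag, 0)
    if start == -1:
        return ""
    content_start = html_text.find(">", start) + 1
    if content_start == 0:
        return ""
    depth, pos = 1, content_start
    while True:
        next_open = _find_open(html_text, tag, pos)
        next_close = html_text.find(close_tag, pos)
        if next_close == -1:
            return html_text[content_start:]  # unclosed tag: take the rest
        if next_open != -1 and next_open < next_close:
            depth += 1                        # a nested tag opens first
            pos = next_open + len(tag) + 1
        else:
            depth -= 1                        # a tag closes first
            if depth == 0:
                return html_text[content_start:next_close]
            pos = next_close + len(close_tag)


page = '<nav>menu</nav><div class="post"><p>Hi</p><div>inner</div> end</div><div>ads</div>'
print(inner_html_first(page, "div"))  # <p>Hi</p><div>inner</div> end
```

A naive search for the first closing tag would have stopped after “inner” and lost the rest of the article.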

The result is still HTML, so another subtask converts it to plain text – a built-in Tasker feature.

Input: HTML code
Output: Text content
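In Tasker this conversion is built in. For comparison, a minimal Python equivalent using only the standard library might look like the sketch below; the class and function names are mine, and joining text pieces with spaces is a simplification that can split a word when a tag sits in the middle of it.

```python
from html.parser import HTMLParser
from typing import List


class _TextCollector(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html_text: str) -> str:
    """Strip tags, decode entities, and collapse runs of whitespace."""
    collector = _TextCollector()
    collector.feed(html_text)
    collector.close()
    return " ".join(" ".join(collector.parts).split())


print(html_to_text("<p>Hello <b>world</b> &amp; more</p><script>x()</script>"))
# Hello world & more
```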

AI Classification

This is the core: the program’s brain.

Input: Content for AI, AI model name
Output: AI response

Groq’s API is great, offering many open-source AI models. It’s simple: send text, get generated text back. The task waits 2 seconds between calls to stay within the API’s limit of 30 calls per minute.
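The pause follows directly from the quota: 60 seconds divided by 30 calls is 2 seconds per call. A small Python sketch of that pacing is below. The ask_model argument is a hypothetical stand-in for the real API call, which I leave out here, and the sleep function is injectable so the example runs instantly.

```python
import time
from typing import Callable, Iterable, List

CALLS_PER_MINUTE = 30
MIN_INTERVAL = 60 / CALLS_PER_MINUTE  # 2.0 seconds between calls


def classify_all(
    headlines: Iterable[str],
    ask_model: Callable[[str], str],
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call ask_model once per headline, pausing between consecutive calls."""
    answers = []
    for index, headline in enumerate(headlines):
        if index > 0:
            sleep(MIN_INTERVAL)  # no need to wait before the first call
        answers.append(ask_model(headline))
    return answers


# Demo with a fake model and a no-op sleep, so nothing is sent anywhere.
fake_model = lambda headline: "sports" if "match" in headline else "society"
print(classify_all(["Cup match tonight", "New subway line opens"], fake_model, sleep=lambda s: None))
# ['sports', 'society']
```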

Text to Speech

This subtask converts text files to audio in batches.

Input: Text file directory, audio output directory
Output: Batch of audio files

It uses Tasker’s “Say To File,” saving text as audio. “Say To File” is just the operation; the speech synthesis engine isn’t built-in.

I used Google’s local engine. Download the app from Google Play, and Tasker can use it.

The local engine is comparable to map software’s default voice. Google’s is decent, better than iFlytek’s, but still robotic.

Putting the Pieces Together

Now that we have our tools, and most of the hard parts are solved, let’s assemble everything.

Downloading and Filtering News

First, we’ll build the core task: downloading news from a single source, filtering it, and saving it as text files. This is the heart of the process.

Input: News source URL, HTML tag containing the article body
Output: News text files

I added a shortcut for the second input. If you enter <description>, it uses the description from the XML instead of fetching the article’s detail page. This works best with high-quality news sources, and you can set it in the parent task.

We fetch the full XML, clean up escaped characters, and remove some special content tags. Then, we extract the news list.

The news list is split into an array. We set up the AI prompt and a maximum article length (to avoid overly long articles). Then, we loop through each news item, read and convert the title to plain text, and send it to the AI for categorization.

Here’s the AI prompt. I kept it simple, just telling it what to do. Groq’s Gemma2 9b model works well for Chinese text, better than Llama3. A small open-source model is perfect for this, and it hasn’t made any mistakes.

We filter out sports, entertainment, and military news based on the AI’s categorization. Then, we get the news detail page link, fetch the full HTML, clean it up, and extract the content using the specified HTML tag.

We convert the article body from HTML to text, check its length, and filter out anything too long or short (likely image-based news). The remaining articles are saved as text files.
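The two filters above boil down to a few lines of logic. Here is a hedged Python sketch: the length limits are placeholder numbers, not the values in my task, and it checks category and length in one function, whereas the Tasker task drops unwanted categories before it even fetches the article body.

```python
from typing import List, NamedTuple

BLOCKED_CATEGORIES = {"sports", "entertainment", "military"}
MIN_CHARS = 100   # assumed: shorter bodies are probably image-only stories
MAX_CHARS = 3000  # assumed cap to keep each audio clip short


class Article(NamedTuple):
    title: str
    category: str  # the AI's answer for this headline
    body: str      # plain text extracted from the detail page


def normalize(category: str) -> str:
    """Tidy the model's answer: trim whitespace, a trailing period, and case."""
    return category.strip().strip(".").lower()


def keep(article: Article) -> bool:
    """Keep an article only if its category is allowed and its length is sane."""
    if normalize(article.category) in BLOCKED_CATEGORIES:
        return False
    return MIN_CHARS <= len(article.body) <= MAX_CHARS


def filter_articles(articles: List[Article]) -> List[Article]:
    return [article for article in articles if keep(article)]


sample = [
    Article("Cup final", " Sports.\n", "x" * 500),
    Article("Photo gallery", "society", "x" * 20),
    Article("New subway line", "Society", "x" * 500),
]
print([a.title for a in filter_articles(sample)])  # ['New subway line']
```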

Priority Issues

During debugging, I couldn’t get content consistently. It took a while to realize the subtasks were running in parallel.

Tasker’s core feature, “Perform Task,” runs a subtask within the current task, passing data and receiving results.

It’s like function calls in programming. Tasker limits you to two parameters, but you can combine multiple parameters into a string using a separator, then split them in the subtask. This allows for any number of parameters. This nesting lets you build complex logic, making “Perform Task” a key programming feature in Tasker.
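The packing trick looks like this in Python. The separator here is only an example; any string that never appears in the values would do.

```python
from typing import List

SEPARATOR = "|||"  # example separator, assumed never to occur in a value


def pack(*values: str) -> str:
    """Combine several values into the single string one parameter can carry."""
    for value in values:
        if SEPARATOR in value:
            raise ValueError(f"value contains the separator: {value!r}")
    return SEPARATOR.join(values)


def unpack(packed: str) -> List[str]:
    """Split the string back into the original values inside the subtask."""
    return packed.split(SEPARATOR)


param = pack("https://example.com/rss", "article", "2000")
print(param)          # https://example.com/rss|||article|||2000
print(unpack(param))  # ['https://example.com/rss', 'article', '2000']
```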

The “Perform Task” documentation mentions execution order. The parent task doesn’t wait for a triggered subtask to finish before continuing. Many of my subtasks fetch content or loop through page code, which takes time. If the parent task proceeds before the subtask returns a result, things break.

Following the documentation, I set the subtask’s Priority to %priority+1 (one higher than the parent). This forces the parent task to wait.
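If you know threads, the bug is the familiar “read the result before the worker has finished” mistake. The sketch below is only an analogy in Python, not how Tasker schedules tasks: an event stands in for the slow fetch, and joining the worker plays the role of the parent waiting.

```python
import threading
from typing import List


def parent_task(wait_for_subtask: bool) -> List[str]:
    """Start a slow subtask and report what the parent sees when it moves on."""
    result: List[str] = []
    may_finish = threading.Event()

    def subtask() -> None:
        may_finish.wait()            # stands in for a slow fetch or loop
        result.append("news list")

    worker = threading.Thread(target=subtask)
    worker.start()
    if wait_for_subtask:
        may_finish.set()
        worker.join()                # the parent blocks until the subtask is done
    seen = list(result)              # what the parent has at this moment
    may_finish.set()                 # cleanup so the worker always ends
    worker.join()
    return seen


print(parent_task(wait_for_subtask=False))  # []
print(parent_task(wait_for_subtask=True))   # ['news list']
```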

Downloading News from Multiple Sources

That was a complex task! Now, let’s use it.

I pass my RSS feeds and article body locations to the core task. It runs for each source.

Then, I created a separate task for batch conversion to speech, specifying the input (text news) and output (audio news) directories.

Scheduled Downloads and Conversion

These are the tasks, but how do they run? On Tasker’s Profiles page, you can add triggers for your tasks.

Every day at 4 AM, save all news as text files (takes 5-10 minutes).

Every day at 5 AM, convert the text news to audio.

The Final Result

When I wake up, there are two folders in the News directory.

text contains the text versions, which I can share.

audio contains the audio news. Some local news still gets in, but the AI is doing its job filtering out sports.

I created a “Daily News” playlist in my music player to read the audio folder.

Updating the content brings in the day’s news. I still have to update it manually, but I’m working on automating that.

Playback is automatic. My car’s Bluetooth connection opens the player, and I use AIMP player, which auto-plays on open. No interaction needed.

Finally, a task clears the news folders at 3 AM daily, preparing for the next cycle.

Epilogue

My homemade news program has been working great for a few days. I can drive without distraction. The robotic voice is the only minor issue. I might replace “Say To File” with a better TTS API later.

This process solved a problem and gave me reusable subtasks. The subtasks for fetching content, parsing XML, extracting HTML, and querying AI are generic. I can now build other programs, create web scrapers, and even AI agents on my phone. Mobile scraping is great: no server costs, and it runs 24/7. I’ll explore it further as needed.

Resources

The more complex Tasks are shared publicly for free use. Simpler Tasks are omitted, as they can be built using Tasker’s built-in features.

Bulk TTS: https://taskernet.com/shares/?user=AS35m8mopd%2Bc1C7UhZNzgAc6Ld0oCTR8LzUJsfqb7SGyZq7NWeHANGDjDvTtBPSkNCjn3CrFQoI%3D&id=Task%3ABulk+TTS

Fix XML format: https://taskernet.com/shares/?user=AS35m8mopd%2Bc1C7UhZNzgAc6Ld0oCTR8LzUJsfqb7SGyZq7NWeHANGDjDvTtBPSkNCjn3CrFQoI%3D&id=Task%3AFix+XML+format

API- Groq (enter your key): https://taskernet.com/shares/?user=AS35m8mopd%2Bc1C7UhZNzgAc6Ld0oCTR8LzUJsfqb7SGyZq7NWeHANGDjDvTtBPSkNCjn3CrFQoI%3D&id=Task%3AAPI+-+Groq+%28enter+your+key%29

Fix file name: https://taskernet.com/shares/?user=AS35m8mopd%2Bc1C7UhZNzgAc6Ld0oCTR8LzUJsfqb7SGyZq7NWeHANGDjDvTtBPSkNCjn3CrFQoI%3D&id=Task%3AFix+file+name

Get inner XML(all siblings): https://taskernet.com/shares/?user=AS35m8mopd%2Bc1C7UhZNzgAc6Ld0oCTR8LzUJsfqb7SGyZq7NWeHANGDjDvTtBPSkNCjn3CrFQoI%3D&id=Task%3AGet+inner+XML%28all+siblings%29

Get inner XML(first match): https://taskernet.com/shares/?user=AS35m8mopd%2Bc1C7UhZNzgAc6Ld0oCTR8LzUJsfqb7SGyZq7NWeHANGDjDvTtBPSkNCjn3CrFQoI%3D&id=Task%3AGet+inner+XML%28first+match%29

Download specific categories of news from RSS: https://taskernet.com/shares/?user=AS35m8mopd%2Bc1C7UhZNzgAc6Ld0oCTR8LzUJsfqb7SGyZq7NWeHANGDjDvTtBPSkNCjn3CrFQoI%3D&id=Task%3A%E4%BB%8ERSS%E4%B8%8B%E8%BD%BD%E7%89%B9%E5%AE%9A%E5%88%86%E7%B1%BB%E6%96%B0%E9%97%BB

Download news from multiple channels: https://taskernet.com/shares/?user=AS35m8mopd%2Bc1C7UhZNzgAc6Ld0oCTR8LzUJsfqb7SGyZq7NWeHANGDjDvTtBPSkNCjn3CrFQoI%3D&id=Task%3A%E5%A4%9A%E6%B8%A0%E9%81%93%E4%B8%8B%E8%BD%BD%E6%96%B0%E9%97%BB

Follow-up

I rebuilt this using Google Apps Script to handle features that were tricky in Tasker. It’s now cloud-deployed and scheduled to run silently overnight. Plus, I integrated AI summarization for long-form articles.

Project Link: https://github.com/greenzorro/google-apps-scripts/blob/main/news_feed.md