The goal of development was to create a library that passed the html5lib-tests. Initially, Stenström used GitHub Copilot in Agent Mode, developing from scratch until the library passed 30% of the tests, at which point he hit a wall. He then switched to Claude Sonnet 3.7 and made gradual improvements until the library passed 100% of the tests, but the code was very slow. To speed it up, he had the tokenizer ported to Rust, which improved performance, although Stenström himself did not understand the Rust code.
He then looked at the html5ever project, written in Rust, and found it to be highly efficient. He ported the structure of html5ever and developed the project until it passed all the tests. Finally, he used Gemini 3 Pro to optimize the Python code directly, which made it significantly faster. Next, he ran the tests to identify and remove unused code, and ran a fuzzer that generated 3 million sample files to check for potential crashes. At that point, the library was ready for use.
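To make the fuzzing step concrete, here is a minimal sketch of the idea, not Stenström's actual fuzzer. It builds malformed HTML from random fragments, feeds each document to a parser, and records every input that raises an exception. The names FRAGMENTS, random_document, fuzz, and stdlib_parse are made up for this illustration, and Python's built-in html.parser stands in for the library under test.

```python
import random
from html.parser import HTMLParser

# Hypothetical fragment pool: unclosed tags, stray end tags, odd characters.
FRAGMENTS = [
    "<div>", "</div>", "<p", ">", "<!--", "-->", "<table>", "<td>", "</b>",
    "&amp", "&#x0;", '"', "'", "=", "<script>", "</script>", "text", " ",
    "\x00", "<![CDATA[",
]


def random_document(rng, max_parts=20):
    """Build one (usually malformed) HTML string from random fragments."""
    count = rng.randint(1, max_parts)
    return "".join(rng.choice(FRAGMENTS) for _ in range(count))


def fuzz(parse, iterations, seed=0):
    """Feed random documents to `parse`; return the inputs that raised."""
    rng = random.Random(seed)  # seeded so that any crash is reproducible
    crashes = []
    for _ in range(iterations):
        doc = random_document(rng)
        try:
            parse(doc)
        except Exception as exc:
            crashes.append((doc, repr(exc)))
    return crashes


def stdlib_parse(doc):
    """Stand-in target (an assumption): the standard library's HTML parser."""
    parser = HTMLParser()
    parser.feed(doc)
    parser.close()


if __name__ == "__main__":
    found = fuzz(stdlib_parse, iterations=1000)
    print(f"{len(found)} crashing inputs out of 1000")
```

Seeding the random generator means a crashing input can be regenerated and turned into a regression test. A real run would swap stdlib_parse for the parser being developed and use far more iterations.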
Stenström summarized five lessons learned from developing the JustHTML library and coding with AI:
1. Give the AI measurable goals: Don't just tell it to improve the code; specify the tests you want it to run.
2. Always review the code the AI writes, and learn from it.
3. Encourage the AI to rethink. Sometimes, if the code looks bad, you can simply tell it you don't like the code it wrote to encourage it to suggest new approaches.
4. Always store code in version control for rollback.
5. Allow the AI to make mistakes so it can learn from them. Don't fix every single error.
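As an illustration of the first lesson, a "measurable goal" can be as simple as a pass rate over a fixed set of test cases. The sketch below is not taken from JustHTML or html5lib-tests: pass_rate, TagCollector, start_tags, and the sample cases are invented for this example, and the parser under test is a toy built on Python's standard html.parser.

```python
from html.parser import HTMLParser


def pass_rate(parse, cases):
    """Return the fraction of (input, expected) cases that `parse` gets right."""
    if not cases:
        return 0.0
    passed = 0
    for source, expected in cases:
        try:
            if parse(source) == expected:
                passed += 1
        except Exception:
            pass  # a crash counts as a failed case
    return passed / len(cases)


class TagCollector(HTMLParser):
    """Toy parser (hypothetical): records start-tag names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(source):
    collector = TagCollector()
    collector.feed(source)
    collector.close()
    return collector.tags


# Invented sample cases; the last one is deliberately wrong to show a failure.
CASES = [
    ("<p>hi</p>", ["p"]),
    ("<div><b>x</b></div>", ["div", "b"]),
    ("<P>", ["p"]),
    ("<br>", ["img"]),
]

if __name__ == "__main__":
    print(f"{pass_rate(start_tags, CASES):.0%} of cases pass")
```

A single number like this gives the agent an unambiguous target, such as "raise the pass rate without breaking cases that already pass", instead of a vague instruction to improve the code.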
The JustHTML project took almost a year to develop, judging by the different AI models used (Gemini 3 Pro was released only recently). The code totals 3,000 lines. Stenström confirms that this speed of development would have been impossible without the AI agent, but that a significant amount of code review and decision-making was still required.
