Tuesday, December 16, 2025

Developer builds an HTML5 parsing library from scratch using LLMs, passing 100% of the tests


Emil Stenström, Head of AI Product at Odevo, shared his experience building JustHTML, an HTML5 parser written entirely in Python with no external dependencies. LLMs were his main programming aid throughout.

The goal was a library that passes html5lib-tests. Stenström started from scratch with GitHub Copilot in Agent Mode and reached a 30% pass rate before hitting a wall. He then switched to Claude Sonnet 3.7 and improved the code gradually until it passed 100% of the tests, but the result was very slow. To speed it up, he ported the tokenizer to Rust, which improved performance, although Stenström did not understand the Rust code himself.

He then looked at html5ever, a project written in Rust, and found it highly efficient. He ported the structure of html5ever and continued development until the project passed all the tests. Finally, he used Gemini 3 Pro to optimize the Python code directly, which made it significantly faster. After that, he ran the tests to find and remove unused code, and ran a fuzzer that generated 3 million sample documents to check for potential crashes. At that point, the library was ready for use.
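The fuzzing step can be illustrated with a minimal sketch. It is not Stenström's fuzzer: the fragment list, the function names, and the use of Python's standard-library `html.parser` as a stand-in for the parser under test are all assumptions made for this example. The idea is simply to feed many pseudo-random, mostly malformed documents to a parser and record any input that raises an exception.

```python
import random
from html.parser import HTMLParser

# Hypothetical building blocks; deliberately includes broken markup.
FRAGMENTS = [
    "<div>", "</div>", "<p", ">", "<!--", "-->", "&amp", "&#x;",
    "<script>", "</table>", "'", '"', "=", " ", "text", "\x00",
]


def random_document(rng, max_parts=40):
    """Build one pseudo-random, probably malformed, HTML string."""
    count = rng.randint(1, max_parts)
    return "".join(rng.choice(FRAGMENTS) for _ in range(count))


def parse(html):
    """Stand-in parser from the standard library, NOT JustHTML.

    Replace the body with a call to the parser you want to fuzz.
    """
    parser = HTMLParser()
    parser.feed(html)
    parser.close()


def fuzz(iterations, seed=0):
    """Return (input, error) pairs for every input that made parse() raise."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    crashes = []
    for _ in range(iterations):
        doc = random_document(rng)
        try:
            parse(doc)
        except Exception as exc:
            crashes.append((doc, repr(exc)))
    return crashes


if __name__ == "__main__":
    found = fuzz(10_000)
    print(f"{len(found)} crashing inputs out of 10000")
```

Seeding the generator means any crashing input can be regenerated and turned into a regression test, which fits the article's emphasis on measurable, test-driven goals.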

Stenström summarized five lessons learned from developing the JustHTML library and coding with AI:

1. Give the AI measurable goals: instead of just asking it to improve the code, specify the tests it has to pass.
2. Always review the code the AI writes, and learn from it.
3. Encourage the AI to rethink. If the code looks bad, simply telling it you don't like what it wrote can prompt it to suggest a new approach.
4. Always keep the code in version control so you can roll back.
5. Allow the AI to make mistakes so it can learn from them; don't fix every single error yourself.
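The first lesson, a measurable goal, can be made concrete as a single pass-rate number that the AI is asked to raise. The sketch below is only an illustration: the three test cases are invented for this example and are not taken from html5lib-tests, and the function under test uses the standard-library `html.parser` rather than JustHTML.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the names of start tags in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(html):
    """Function under test: list the start tags found in an HTML string."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


# Hypothetical (input, expected) pairs standing in for a real test suite.
CASES = [
    ("<p>hi</p>", ["p"]),
    ("<DIV><b>x</b></DIV>", ["div", "b"]),
    ("plain text", []),
]


def pass_rate(func, cases):
    """Fraction of cases where func(input) equals the expected output."""
    passed = 0
    for source, expected in cases:
        try:
            if func(source) == expected:
                passed += 1
        except Exception:
            pass  # a crash counts as a failed case
    return passed / len(cases)


if __name__ == "__main__":
    print(f"pass rate: {pass_rate(start_tags, CASES):.0%}")
```

A prompt such as "raise the pass rate on this suite" gives the model a target it can check itself, unlike "improve the code".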

JustHTML took almost a year to develop, a span that covers the different AI models used (Gemini 3 Pro was released only recently). The code totals 3,000 lines. Stenström says this pace of development would have been impossible without the AI agent, but that a significant amount of code review and decision-making was still required.
