Hypertext Markup Language (HTML) is the core web technology that turned the pre-WWW internet of 1991 into the World Wide Web. It is essentially a document markup language, in which you can store text together with its formatting and layout.
To achieve the richest formatting, HTML works together with a separate standard for styles: CSS. An HTML file may contain CSS styles itself, or it may refer to other files for its styling.
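The two ways of attaching styles can be told apart mechanically. The following is a minimal sketch, using only Python's standard library, that counts embedded styles and lists external stylesheets in an HTML text. The names StyleInventory and inventory_styles are hypothetical, chosen for illustration only.

```python
from html.parser import HTMLParser


class StyleInventory(HTMLParser):
    """Records how an HTML document attaches its CSS (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.embedded_blocks = 0    # number of <style> elements
        self.inline_attributes = 0  # number of style="..." attributes
        self.external_sheets = []   # href values of <link rel="stylesheet">

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "style":
            self.embedded_blocks += 1
        if "style" in attributes:
            self.inline_attributes += 1
        # rel may hold several space-separated keywords, in any letter case.
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel and attributes.get("href"):
            self.external_sheets.append(attributes["href"])


def inventory_styles(html_text):
    """Parse html_text and return the filled-in StyleInventory."""
    inventory = StyleInventory()
    inventory.feed(html_text)
    inventory.close()
    return inventory
```

A file whose external_sheets list is empty carries all of its styling inside itself; a non-empty list means the styling depends on other files.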
The World Wide Web is one of the most dynamic corners of IT. Standards are constantly evolving, and new patterns of organizing web content replace the best practices of yesterday.
Most HTML in the world has not been typed by humans, but has been generated by software.
HTML in the archive
One scenario in which HTML files may enter the archive is when a website gets archived. In that case, it is not the individual HTML files themselves that should be judged for their long-term preservability, but rather the website as an integral system. In this scenario, it is preferable to archive the source code of the website as well, not only the end result.
Another scenario is when HTML files with substantial content have been captured from a legacy system or from other sources. Where possible, scan each file for references to external files, rescue those files as well, and store them in a directory structure that matches the way they are referenced.
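Scanning for references can be automated. The sketch below uses only Python's standard library and rests on some assumptions: it looks at just the attributes href, src, poster and data, which is not an exhaustive list, and it ignores references made from inside CSS or scripts. The names find_references, is_local and storage_path are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse

# Attributes that commonly point at other files (assumed, not exhaustive).
REFERENCE_ATTRIBUTES = {"href", "src", "poster", "data"}


class ReferenceCollector(HTMLParser):
    """Collects (tag, attribute, value) for attributes that reference files."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in REFERENCE_ATTRIBUTES and value:
                self.references.append((tag, name, value))


def find_references(html_text):
    """Return all references found in html_text, in document order."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    collector.close()
    return collector.references


def is_local(reference):
    """True for a relative path; False for full URLs and bare #fragments."""
    parts = urlparse(reference)
    return not parts.scheme and not parts.netloc and bool(parts.path)


def storage_path(html_file, reference):
    """Where a referenced file must sit for the relative link to resolve."""
    relative = unquote(urlparse(reference).path)
    return PurePosixPath(html_file).parent / relative
```

For an HTML file stored at archive/site/index.html that refers to css/site.css, storage_path yields archive/site/css/site.css, so the stylesheet ends up where the HTML file expects it. References for which is_local is false point outside the captured material and have to be rescued by other means, if at all.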
HTML is a preferred format for the file type Markup Language.