- TODO List
- = KEY ====================
- # Flagship
- - Regular
- ? Maybe I'll Do It
- ==========================
- If no interest is expressed for a feature that may require a considerable
- amount of effort to implement, it may get endlessly delayed. Do not be
- afraid to cast your vote for the next feature to be implemented!
- Things to do as soon as possible:
- - http://htmlpurifier.org/phorum/read.php?3,5560,6307#msg-6307
- - Think about allowing explicit order of operations hooks for transforms
- - Fix "<.<" bug (trailing < is removed if not EOD)
- - Build in better internal state dumps and debugging tools for remote
- debugging
- - Allowed/Allowed* have strange interactions when both set
- ? Transform lone embeds into object tags
- - Deprecated config options that emit warnings when you set them (with
- a way of muting the warning if you really want to)
- - Make HTML.Trusted work with Output.FlashCompat
- - HTML.Trusted and HTML.SafeObject have funny interaction; general
- problem is what to do when a module "supersedes" another
- (see also tables and basic tables.) This is a little dicier
- because HTML.SafeObject has some extra functionality that
- trusted might find useful. See http://htmlpurifier.org/phorum/read.php?3,5762,6100
- FUTURE VERSIONS
- ---------------
- 4.6 release [OMG CONFIG PONIES]
- ! Fix Printer. It's from the old days when we didn't have decent XML classes
- ! Factor demo.php into a set of Printer classes, and then create a stub
- file for users here (inside the actual HTML Purifier library)
- - Fix error handling with form construction
- - Do encoding validation in Printers, or at least, where user data comes in
- - Config: Add examples to everything (make built-in which also automatically
- gives output)
- - Add "register" field to config schemas to eliminate dependence on
- naming conventions (try to remember why we ultimately decided on this)
- 5.0 release [HTML 5]
- # Swap out code to use html5lib tokenizer and tree-builder
- ! Allow turning off of FixNesting and required attribute insertion
- 5.1 release [It's All About Trust] (floating)
- # Implement untrusted, dangerous elements/attributes
- # Implement IDREF support (harder than it seems, since you cannot have
- IDREFs to non-existent IDs)
- - Implement <area> (client and server side image maps are blocking
- on IDREF support)
- # Frameset XHTML 1.0 and HTML 4.01 doctypes
- - Figure out how to simultaneously set %CSS.Trusted and %HTML.Trusted (?)
- 5.2 release [Error'ed]
- # Error logging for filtering/cleanup procedures
- # Additional support for poorly written HTML
- - Microsoft Word HTML cleaning (e.g. MsoNormal, but research essential!)
- - Friendly strict handling of <address> (block -> <br>)
- - XSS-attempt detection--certain errors are flagged XSS-like
- - Append something to duplicate IDs so they're still usable (impl. note: the
- dupe detector would also need to detect the suffix as well)
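The duplicate-ID item above can be sketched roughly as follows. This is written in Python purely for illustration (HTML Purifier itself is PHP); the function name `dedupe_ids` and the `-N` suffix format are assumptions, not anything the library defines. The point it shows is the implementation note: a generated name such as `a-2` must itself be checked against IDs already seen.

```python
def dedupe_ids(ids, sep="-"):
    """Return the IDs in order, with duplicates made unique by a numeric suffix.

    Hypothetical helper: the name and the suffix format are assumptions.
    """
    seen = set()
    result = []
    for ident in ids:
        candidate = ident
        n = 1
        # Keep bumping the suffix until the candidate collides with nothing,
        # including IDs that already look like a generated suffix ("a-2").
        while candidate in seen:
            n += 1
            candidate = f"{ident}{sep}{n}"
        seen.add(candidate)
        result.append(candidate)
    return result


if __name__ == "__main__":
    print(dedupe_ids(["a", "a", "a-2", "b"]))
```

In this sketch a literal `a-2` that appears after a generated `a-2` is the one that gets renamed; a two-pass variant could reserve all literal IDs first if preserving them matters more.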
- 6.0 release [Beyond HTML]
- # Legit token based CSS parsing (will require revamping almost every
- AttrDef class). Probably will use CSSTidy
- # More control over allowed CSS properties using a modularization
- # IRI support (this includes IDN)
- - Standardize token armor for all areas of processing
- 7.0 release [To XML and Beyond]
- - Extended HTML capabilities based on namespacing and tag transforms (COMPLEX)
- - Hooks for adding custom processors to custom namespaced tags and
- attributes, offer default implementation
- - Lots of documentation and samples
- Ongoing
- - More refactoring to take advantage of PHP5's facilities
- - Refactor unit tests into lots of test methods
- - Plugins for major CMSes (COMPLEX)
- - phpBB
- - Also, a FAQ for extension writers with HTML Purifier
- AutoFormat
- - Smileys
- - Syntax highlighting (with GeSHi) with <pre> and possibly <?php
- - Look at http://drupal.org/project/Modules/category/63 for ideas
- Neat feature related
- ! Support exporting configuration, so users can easily tweak settings
- in the demo, and then copy-paste into their own setup
- - Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
- - Allow scoped="scoped" attribute in <style> tags; may be troublesome
- because regular CSS has no way of uniquely identifying nodes, so we'd
- have to generate IDs
- - Explain how to use HTML Purifier in non-PHP languages / create
- a simple command line stub (or complicated?)
- - Fixes for Firefox's inability to handle COL alignment props (Bug 915)
- - Automatically add non-breaking spaces to empty table cells when
- empty-cells:show is applied to have compatibility with Internet Explorer
- - Table of Contents generation (XHTML Compiler might be reusable). May also
- be out-of-band information.
- - Full set of color keywords. Also, a way to add onto them without
- finalizing the configuration object.
- - Write a var_export and memcached DefinitionCache - Denis
- - Built-in support for target="_blank" on all external links
- - Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
- Also, enable disabling of directionality
- ? Externalize inline CSS to promote clean HTML, proposed by Sander Tekelenburg
- ? Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes:
- 1. Analyze which tags are redundant duplicates and can be removed
- 2. Ensure attributes are merged into the parent tag
- 3. Extend the tag exclusion system to specify whether the contents
- should be dropped (currently, there's code that could do
- something like this if it didn't drop the inner text too.)
- ? Make AutoParagraph also support paragraph-izing double <br> tags, and not
- just double newlines. This is kind of tough to do in the current framework,
- though, and might be reasonably approximated by search replacing double <br>s
- with newlines before running it through HTML Purifier.
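The search-and-replace approximation mentioned in the AutoParagraph item might look like the sketch below. It is in Python for illustration only (the real preprocessing would live in PHP next to the HTML Purifier call), and the names `DOUBLE_BR` and `brs_to_newlines` are made up here. It is a plain regex pass, so it knows nothing about context such as `<pre>` blocks and does not match `<br>` tags that carry attributes.

```python
import re

# Two or more consecutive <br> tags (<br>, <br/>, <br />, any letter case),
# optionally separated or followed by whitespace.
DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def brs_to_newlines(html):
    """Replace each run of two or more <br> tags with a blank line.

    A single <br> is left untouched so ordinary line breaks survive.
    """
    return DOUBLE_BR.sub("\n\n", html)


if __name__ == "__main__":
    print(repr(brs_to_newlines("first<br><br />second<br>same paragraph")))
```

The output of such a pass would then be handed to the purifier with auto-paragraphing enabled, so that the double newlines are what trigger the paragraph split.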
- Maintenance related (slightly boring)
- # CHMOD install script for PEAR installs
- ! Factor out command line parser into its own class, and unit test it
- - Reduce size of internal data-structures (esp. HTMLDefinition)
- - Allow merging configurations. Thus,
- a -> b -> default
- c -> d -> default
- becomes
- a -> b -> c -> d -> default
- Maybe allow more fine-grained tuning of this behavior. Alternatively,
- encourage people to use short plist depths before building them up.
- - Time PHPT tests
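The configuration-merging item above describes splicing two parent chains into one. A minimal sketch of that lookup order, in Python for illustration and using plain dictionaries rather than real configuration objects, is given below; `merge_chains` is a hypothetical name, and the requirement that both chains end in the same default object is an assumption of this sketch.

```python
from collections import ChainMap


def merge_chains(first, second):
    """Splice two chains that end in the same default layer.

    [a, b, default] and [c, d, default] become [a, b, c, d, default].
    """
    *first_layers, first_default = first
    *second_layers, second_default = second
    if first_default is not second_default:
        raise ValueError("both chains must end in the same default layer")
    return first_layers + second_layers + [first_default]


if __name__ == "__main__":
    default = {"w": 0, "v": 9}
    a, b = {"x": 1}, {"y": 2}
    c, d = {"x": 3, "z": 4}, {"y": 5, "w": 6}

    merged = merge_chains([a, b, default], [c, d, default])
    config = ChainMap(*merged)  # a lookup returns the first layer that has the key

    # x comes from a, y from b, z from c, w from d, v from default
    print(config["x"], config["y"], config["z"], config["w"], config["v"])
```

Because earlier layers always win, the order in which the two chains are spliced is exactly the "fine-grained tuning" question the item raises.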
- ChildDef related (very boring)
- - Abstract ChildDef_BlockQuote to work with all elements that only
- allow blocks in them, required or optional
- - Implement lenient <ruby> child validation
- Wontfix
- - Non-lossy smart alternate character encoding transformations (unless
- patch provided)
- - Pretty-printing HTML: users can run Tidy on the output or on the entire page
- - Native content compression, whitespace stripping: use gzip if this is
- really important
- vim: et sw=4 sts=4