Saturday, December 29, 2007 At 2:34AM
I’ve just uploaded the latest version of AntiXSS for Java (version 0.02) to the GDS Tools page. What is AntiXSS for Java? It’s a port to Java of the Microsoft Anti-Cross Site Scripting (AntiXSS) v1.5 library for .NET applications.
For those not familiar with the Microsoft AntiXSS library, it is an output encoding library for avoiding Cross Site Scripting vulnerabilities. Specifically, it is intended to safely encode information written to the user’s browser within a specific context (e.g. if writing a string into the HTML of a page, you need to use the correct function, HtmlEncode). Unlike some other solutions, the library implements a whitelisting approach and encodes everything except characters known to be harmless. For example, the string <script> will be HTML encoded as &#60;script&#62;.
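The whitelisting approach described above can be illustrated with a short sketch. This is not the AntiXSS for Java source: the class and method names are hypothetical, and the handling of null input and of characters outside the Basic Multilingual Plane are assumptions made for the example. The safe character set is the one this post lists for HtmlEncode.

```java
// Minimal sketch of whitelist-style HTML encoding.
// Hypothetical names; NOT the actual AntiXSS for Java implementation.
final class WhitelistHtmlSketch {

    // Characters left unencoded: a-z, A-Z, 0-9, full stop, comma, dash, underscore.
    static boolean isSafe(int cp) {
        return (cp >= 'a' && cp <= 'z')
            || (cp >= 'A' && cp <= 'Z')
            || (cp >= '0' && cp <= '9')
            || cp == '.' || cp == ',' || cp == '-' || cp == '_';
    }

    // Everything not on the whitelist becomes a decimal HTML entity.
    static String htmlEncode(String input) {
        if (input == null) {
            return ""; // assumption: treat null as empty output
        }
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i); // assumption: encode by code point
            if (isSafe(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(htmlEncode("<script>")); // prints &#60;script&#62;
    }
}
```

The point of the design is that the encoder never has to know which characters are dangerous. Anything it does not explicitly recognise as harmless is encoded.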
AntiXSS for Java was largely written as an educational exercise on my part, and as such the library should be considered “beta quality”. However, it should be fairly usable for most applications. The library requires Java 1.4 or higher, but has no other prerequisites.
Usage:
- AntiXSS for Java comes as a source package, or alternatively you can just download the compiled Jar file. An Ant buildfile and JUnit tests are included with the source code.
- Put AntiXSS.jar somewhere in your CLASSPATH
- In your code, import com.gdssecurity.utils.AntiXSS
- All of the output filtering methods are implemented statically, so just wrap your calls to output functions in a call to one of the filtering methods (identical to the methods in the Microsoft library):
  - HtmlEncode() – a string to be used in HTML. This method will return characters a-z, A-Z, 0-9, full stop, comma, dash and underscore unencoded, and encode all other characters in decimal HTML entity format (e.g. < is encoded as &#60;).
  - UrlEncode() – a string to be used in a URL. This method will return characters a-z, A-Z, 0-9, full stop, dash, and underscore unencoded, and encode all other characters in short hexadecimal URL notation for non-Unicode characters (e.g. < is encoded as %3c), and in Unicode hexadecimal notation for Unicode characters (e.g. %u0177).
  - HtmlAttributeEncode() – a string to be used in an HTML attribute. This method will return characters a-z, A-Z, 0-9, full stop, comma, dash and underscore unencoded, and encode all other characters in decimal HTML entity format (e.g. < is encoded as &#60;).
  - JavaScriptEncode() – a string safe to use directly in JavaScript. This method will return characters a-z, A-Z, space, 0-9, full stop, comma, dash, and underscore unencoded, and encode all other characters in a 2 digit hexadecimal escaped format for non-Unicode characters (e.g. \x17), and in a 4 digit Unicode format for Unicode characters (e.g. \u0177).
  - VisualBasicScriptEncodeString() – a string to use directly in VBScript. This method will return characters a-z, A-Z, space, 0-9, full stop, comma, dash, and underscore unencoded (each substring enclosed in double quotes), and encode all other characters in concatenated calls to chrw(). For example, foo' will be encoded as "foo"&chrw(39).
  - XmlEncode() – a string to be used in XML. This method will return characters a-z, A-Z, 0-9, full stop, comma, dash and underscore unencoded, and encode all other characters in decimal entity format (e.g. < is encoded as &#60;).
  - XmlAttributeEncode() – a string to be used in an XML attribute. This method will return characters a-z, A-Z, 0-9, full stop, comma, dash and underscore unencoded, and encode all other characters in decimal entity format (e.g. < is encoded as &#60;).
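To make the differences between contexts concrete, here is a rough sketch of the URL, JavaScript and VBScript rules listed above. It is written from the descriptions in this post, not taken from the library source: the class and method names are hypothetical, and the cut-off between the short and the 4 digit forms (UTF-16 code units below 0x100 get the short form) is an assumption.

```java
// Illustrative sketch only: hypothetical names, not the AntiXSS for Java source.
final class ContextEncodingSketch {

    // a-z, A-Z, 0-9, full stop, dash, underscore: left unencoded by the URL rule.
    static boolean isUrlSafe(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '.' || c == '-' || c == '_';
    }

    // The JavaScript and VBScript rules additionally leave space and comma unencoded.
    static boolean isScriptSafe(char c) {
        return isUrlSafe(c) || c == ' ' || c == ',';
    }

    // Assumption: code units below 0x100 use %xx, all others use %uxxxx.
    static String urlEncode(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isUrlSafe(c)) {
                out.append(c);
            } else if (c < 0x100) {
                out.append(String.format("%%%02x", (int) c));
            } else {
                out.append(String.format("%%u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Same structure as urlEncode, but with backslash escapes (same 0x100 assumption).
    static String javaScriptEncode(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isScriptSafe(c)) {
                out.append(c);
            } else if (c < 0x100) {
                out.append(String.format("\\x%02x", (int) c));
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Runs of safe characters are quoted; every other character becomes chrw(n),
    // and the pieces are joined with the VBScript concatenation operator (&).
    static String vbScriptEncode(String input) {
        StringBuilder out = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isScriptSafe(c)) {
                if (!inQuotes) {
                    if (out.length() > 0) out.append('&');
                    out.append('"');
                    inQuotes = true;
                }
                out.append(c);
            } else {
                if (inQuotes) {
                    out.append('"');
                    inQuotes = false;
                }
                if (out.length() > 0) out.append('&');
                out.append("chrw(").append((int) c).append(')');
            }
        }
        if (inQuotes) out.append('"');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(urlEncode("<"));         // prints %3c
        System.out.println(javaScriptEncode("'"));  // prints \x27
        System.out.println(vbScriptEncode("foo'")); // prints "foo"&chrw(39)
    }
}
```

With the real library on the CLASSPATH, you would call the corresponding static method instead, for example AntiXSS.JavaScriptEncode(value), passing the untrusted string and writing the returned string to the page.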
For those of you familiar with output encoding, this library is functionally the same as the OWASP Reform library by Michael Eddington, which is not too surprising as I believe Michael was involved in developing the Microsoft AntiXSS library.
Any feedback, and especially bug reports, welcome.
Author: Justin Clarke
©Aon plc 2023