Applications rarely test for Unicode exploits, which gives the attacker a route of attack.
The point to remember here is that the application must remain safe when a Unicode representation or other malformed representation is input.
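As a minimal sketch of handling Unicode and malformed representations, the example below decodes input strictly and normalises it before applying an allow-list check. The function names, the allow-list pattern, and the choice of NFKC normalisation are assumptions made for illustration, not something the text prescribes.

```python
import re
import unicodedata

# Hypothetical allow-list: ASCII letters, digits and underscore, 1 to 32 characters.
_ALLOWED = re.compile(r"[A-Za-z0-9_]{1,32}")


def canonicalize(raw: bytes) -> str:
    """Decode strictly as UTF-8, then normalise to one canonical form.

    Strict decoding raises UnicodeDecodeError for malformed sequences
    (for example the overlong encoding b"\xc0\xaf" of "/").
    NFKC normalisation folds compatibility characters such as fullwidth
    "<" (U+FF1C) into their plain equivalents, so the validation that
    follows sees the same characters the rest of the application will.
    """
    text = raw.decode("utf-8", errors="strict")
    return unicodedata.normalize("NFKC", text)


def is_valid_username(raw: bytes) -> bool:
    """Return True only if the canonical form passes the allow-list."""
    try:
        text = canonicalize(raw)
    except UnicodeDecodeError:
        # Malformed input is rejected outright, never "repaired".
        return False
    return _ALLOWED.fullmatch(text) is not None
```

The design choice here is to validate only after canonicalisation, so an encoded variant of a dangerous character cannot slip past a check written for its plain form.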
HTML-encode and URL-encode user input when writing it back to the client.

It is also a good place to look for information leakage issues:

    errors.required= is required.
    errors.maxlength= cannot be greater than characters.

Many web applications use operating system features and external programs to perform their functions. When a web application passes information from an HTTP request through as part of an external request, that information must be carefully validated for content and for minimum and maximum length.

If a logging mechanism is employed to log all data used in a particular transaction, we need to ensure that the payload received is not so big that it affects the logging mechanism. If the log file is sent a very large payload, the logging mechanism may crash; if it is sent a very large payload repeatedly, the hard disk of the application server may fill, causing a denial of service.

Input can be encoded in a format that the application still interprets correctly but that is not an obvious avenue of attack. Encoding ASCII as Unicode is another method of bypassing input validation.

Rejected data must not be persisted to the data store unless it is sanitised. Logging erroneous data is a common mistake, and it may be exactly what the attacker wishes your application to do. This type of attack can be used to recycle the log file, hence removing the audit trail.

If string parsing is performed on the payload received by the application, and an extremely large string is sent repeatedly, the CPU cycles used to parse the payload may cause service degradation or even denial of service.

Server-side code should perform its own validation.
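The points above on length limits, logging of rejected data, and output encoding can be sketched as follows. This is an illustrative example under stated assumptions: the limits `MAX_FIELD_LEN` and `MAX_LOG_LEN` and all function names are hypothetical, and real limits depend on the application.

```python
import html
from urllib.parse import quote

MAX_FIELD_LEN = 256  # hypothetical upper bound for one input field
MAX_LOG_LEN = 80     # hypothetical cap on how much rejected data is logged


def check_length(value: str, min_len: int = 1, max_len: int = MAX_FIELD_LEN) -> bool:
    """Cheap min/max length check, run before any parsing or logging."""
    return min_len <= len(value) <= max_len


def sanitise_for_log(value: str, limit: int = MAX_LOG_LEN) -> str:
    """Clip the value and escape non-printable characters before logging.

    Clipping keeps oversized payloads from filling the log; escaping
    CR/LF and other control characters (as a hex code point) keeps an
    attacker from forging extra log lines.
    """
    clipped = value[:limit]
    cleaned = "".join(
        ch if ch.isprintable() else f"\\x{ord(ch):02x}" for ch in clipped
    )
    suffix = "...[truncated]" if len(value) > limit else ""
    return cleaned + suffix


def encode_for_html(value: str) -> str:
    """HTML-encode a value before writing it into a page."""
    return html.escape(value, quote=True)


def encode_for_url(value: str) -> str:
    """URL-encode a value before placing it in a URL component."""
    return quote(value, safe="")
```

Running the length check first means an oversized string is rejected before the application spends CPU cycles parsing it, and only the sanitised, clipped form is ever written to the log.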