Data Validation and Sanitization with WordPress

What is Data Validation and Sanitization with WordPress?

1. Data Validation:
Validation ensures that data is correct and useful. Untrusted data comes from many sources (users, third-party sites, your own database, etc.), and all of it needs to be validated both on input and on output. Handling it properly is critical to keeping your site, theme, or plugins secure.

Validation simply means the checks that are run to ensure that the data you have is what it should be. For example, an email address always contains an @ sign. If the entered email address has no @ sign, it is invalid, so the form should require a valid address in the email field.

Another example: while creating an account on a site, we are asked to enter the password twice. Both entries are validated by checking that they are the same.

A web application that does not practice data validation may be vulnerable.

Client-side validation is for user experience. You can inform users that their input is invalid without making a round trip to the server. However, client-side validation can be bypassed, so you should also validate on the server side to ensure data integrity.
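As a minimal sketch of the server-side checks described above, the plain PHP below validates the two examples from the text: the @ sign in the email address and the repeated password. The function name validate_signup() and its error messages are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: validates sign-up input on the server.
// Returns a list of error messages; an empty list means the input passed.
function validate_signup(string $email, string $password, string $password_confirm): array
{
    $errors = [];

    // The simple rule from the text: an email address must contain an @ sign.
    if (strpos($email, '@') === false) {
        $errors[] = 'Email address must contain an @ sign.';
    }

    // Both password entries must be identical.
    if ($password !== $password_confirm) {
        $errors[] = 'Passwords do not match.';
    }

    return $errors;
}

// Usage: in a real form handler these values would come from $_POST.
$errors = validate_signup('user.example.com', 'secret', 'secrte');
print_r($errors);
```

Because this runs on the server, it still applies when a visitor disables or bypasses the checks in the browser.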

Significance:

For example:

Imagine a webshop database that would allow you to enter a new customer without an address. You would be unable to ship goods to such a customer.

Imagine that the same webshop database stores the country of residence of its customers. If the database doesn’t enforce a certain input pattern on this data, you will end up with different values for the same country, such as US and United States. This makes it impossible, or at least much harder, to extract information such as how many customers from the United States have used your webshop, and how many from Nepal.
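One way to enforce a single input pattern for the country field is to map every accepted spelling to one canonical value before storing it. The sketch below assumes a small hypothetical alias table (COUNTRY_ALIASES) and a hypothetical helper normalize_country(); a real shop would use a complete country list.

```php
<?php
// Hypothetical alias table: maps accepted spellings to one canonical code.
const COUNTRY_ALIASES = [
    'us'            => 'US',
    'usa'           => 'US',
    'united states' => 'US',
    'np'            => 'NP',
    'nepal'         => 'NP',
];

// Returns the canonical country code, or null when the input is not recognised.
function normalize_country(string $input): ?string
{
    $key = strtolower(trim($input));
    return COUNTRY_ALIASES[$key] ?? null;
}

// Usage: different spellings end up as the same stored value.
var_dump(normalize_country(' United States '));
var_dump(normalize_country('usa'));
var_dump(normalize_country('Atlantis'));
```

With every row stored under one code, counting customers per country becomes a simple grouping query.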

Validation can be approached in different ways. Some of them are as follows:

Whitelist:
Accept data only from a finite list of known and trusted values.

Blacklist:
Reject data from a finite list of known untrusted values. This is very rarely a good idea.

Format Detection:
Test to see if the data is of the correct format. Only accept it if it is.

Format Correction:
Accept almost any data, but remove or alter the dangerous pieces.
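The four approaches above can be sketched in plain PHP as follows. All function names and the sample lists (sort orders, reserved usernames, a five-digit code) are hypothetical and only illustrate each technique.

```php
<?php
// Whitelist: accept only values from a known, trusted list.
function is_allowed_sort_order(string $value): bool
{
    return in_array($value, ['asc', 'desc'], true);
}

// Blacklist: reject values from a known bad list.
// Anything the list forgets slips through, which is why this is rarely enough.
function is_reserved_username(string $value): bool
{
    return in_array(strtolower($value), ['admin', 'root'], true);
}

// Format detection: accept the value only if it is exactly five digits.
function is_five_digit_code(string $value): bool
{
    return preg_match('/^[0-9]{5}\z/', $value) === 1;
}

// Format correction: keep the digits and drop everything else.
function digits_only(string $value): string
{
    return preg_replace('/[^0-9]/', '', $value);
}

// Usage
var_dump(is_allowed_sort_order('desc'));
var_dump(is_reserved_username('Admin'));
var_dump(is_five_digit_code('4460a'));
echo digits_only('(555) 010-9999'), "\n";
```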

Examples:

WordPress provides functions to validate only a few types of data, so developers usually define their own validation functions.

WordPress provides the is_email() function to check whether an email address is valid.

Code Example:

if ( is_email( "user@example.com" ) ) {
    echo "Valid email";
} else {
    echo "Invalid email";
}

2. Data Sanitization:

Sanitization ensures data safety and prevents code injection. Sanitization means cleaning user input. It is a somewhat more liberal approach to accepting user data: rather than rejecting the input, it removes text, characters, or code that is not allowed. For example, widget titles cannot contain HTML tags. If you enter HTML tags, they are automatically removed before the title is saved.

Significance:

When data is included in some context, that data could be misinterpreted as code for that environment. If the data contains malicious code, then using the data without sanitizing it means that the code will be executed. The code doesn’t even have to be malicious to cause undesired effects.
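To illustrate how data can be misread as code, the sketch below takes a comment containing a script tag and prepares it for an HTML page. It uses PHP's built-in htmlspecialchars() and strip_tags() as stand-ins so that it runs outside WordPress; inside WordPress you would normally reach for its own helpers such as esc_html(). The sample strings are made up for illustration.

```php
<?php
// Untrusted input: printed as-is into a page, the browser would run the script.
$comment = 'Nice post! <script>alert("hi")</script>';

// Neutralise it: special characters become entities, so the browser
// displays the text instead of interpreting it as markup.
$safe_comment = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');
echo $safe_comment, "\n";

// Remove it: drop the tags entirely, as in the widget title example.
$title = strip_tags('<b>My Widget</b>');
echo $title, "\n";
```

Note that the bold tag in the title is not malicious, yet it would still alter the page layout if it were stored and printed unchanged.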

Examples:

There are various functions provided by WordPress to sanitize different data into different forms.

Code Example:

sanitize_email()
echo sanitize_email( "test user@example.com" );
// Output is "testuser@example.com"

Another example is using sanitize_file_name()
echo sanitize_file_name( "_profile pic- -1.png" );
// Output is "profile-pic-1.png"

Some of the other functions used to sanitize data are:

sanitize_file_name()
sanitize_html_class()
sanitize_key()
sanitize_option() etc.

Concluding,

Validation of data should be done as soon as it’s received and before it’s written to the database. The idea is that ‘invalid’ data should either be auto-corrected or be flagged as invalid, and only valid data should be given to the database.

That said, you may also want to perform validation when data is displayed. In fact, sometimes ‘validation’ will also ensure the data is safe.
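The idea of auto-correcting or flagging data before it reaches the database could look roughly like the sketch below. The helper check_quantity() is hypothetical, and a plain array stands in for the database table.

```php
<?php
// Hypothetical helper: auto-corrects what it safely can and flags the rest.
function check_quantity(string $raw): array
{
    // Auto-correct: stray whitespace around the number is harmless.
    $value = trim($raw);

    // Flag: anything that is not a whole number is rejected.
    if (!ctype_digit($value)) {
        return ['valid' => false, 'value' => null, 'error' => 'Quantity must be a whole number.'];
    }

    return ['valid' => true, 'value' => (int) $value, 'error' => null];
}

$rows = []; // stands in for the database table in this sketch

foreach ([' 3 ', 'three'] as $input) {
    $result = check_quantity($input);
    if ($result['valid']) {
        $rows[] = $result['value']; // only valid data is stored
    } else {
        echo $result['error'], "\n"; // invalid data is reported, not stored
    }
}
```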