mozilla

Revision 329021 of Sending and retrieving form data

  • Revision slug: HTML/Forms/Sending_and_retrieving_form_data
  • Revision title: Sending and retrieving form data
  • Revision id: 329021
  • Created:
  • Creator: Jeremie
  • Is current revision? No
  • Comment

Revision Content

In many cases, the purpose of an HTML Form is to send data to a server. The server processes the data and then sends a response to the user. This seems simple, but it's important to keep a few things in mind to ensure that the data will not damage your server or cause trouble for your users.

Where the data goes

A quick reminder about the Client/Server architecture

The web is based on a very basic client/server architecture that can be summarized as follows: a client (usually a web browser) sends a request to a server (usually web server software). This request is sent using the HTTP protocol. The server answers the request using the same protocol.

[ADD SCHEMA]

On the client side, an HTML Form is nothing more than a convenient way to configure an HTTP request in order to send data to a server.

On the client side: define how to send data

To define how the data will be sent, we just have to deal with the {{HTMLElement("form")}} element. All of its attributes are designed to configure the request that is sent when a user hits a submit button. The two most important attributes are {{htmlattrxref("action","form")}} and {{htmlattrxref("method","form")}}.

The {{htmlattrxref("action","form")}} attribute

This attribute defines where the data is sent. Its value must be a valid URL. If it's not set, the data will be sent to the URL of the page that contains the form.

<!-- The data will be sent to http://foo.com -->
<form action="http://foo.com">

<!-- The data will be sent to the same server as the one used by the page
     that hosts the form, but to another URL. -->
<form action="/somewhere_else">

<!-- The data will be sent to the page that hosts the form -->
<form>

<!-- This is another way to send the data to the page that hosts the form.
     This notation is quite common in legacy HTML content because
     the action attribute was required with HTML4 and XHTML1.
     With HTML5, this attribute is no longer mandatory. -->
<form action="#">

Note: It's possible to specify a URL that uses HTTPS. In that case, the data is encrypted along with the rest of the request, even if the form itself is hosted on a page accessed through HTTP. On the other hand, if the form is hosted on a page accessed through HTTPS and the action attribute is set to an HTTP URL, browsers display a security warning to the user each time they try to send data.

The {{htmlattrxref("method","form")}} attribute

This attribute defines how the data is sent. The HTTP protocol provides several ways to perform a request; HTML form data can be sent through at least two of them: the GET method and the POST method.

To understand the difference between those two methods, let's step back and examine how HTTP works. Each time you want to reach a resource on the Web, the browser sends a request to a URL. An HTTP request is made of two parts: a header that contains a set of global metadata about the browser's capabilities, and a body that can contain information the server needs to process that specific request.

The GET method is the method used by the browser to ask the server for a given resource: "Hey server, I want to get this resource." In this case, the browser sends an empty body. Because the body is empty, if a form is sent using this method, the data is appended to the URL.

<form action="http://foo.com" method="get">
  <input name="say" value="Hi">
  <input name="to" value="Mom">
  <button>Send my greetings</button>
</form>

With the GET method, the HTTP request will look like:

GET /?say=Hi&to=Mom HTTP/1.1
Host: foo.com

The POST method is a little different. It's the method used by the browser to ask the server for a given resource after the server has updated that resource with the data available in the body of the HTTP request: "Hey server, update the resource with this data, then send me back the result." If a form is sent using this method, the data is placed in the body of the HTTP request.

<form action="http://foo.com" method="post">
  <input name="say" value="Hi">
  <input name="to" value="Mom">
  <button>Send my greetings</button>
</form>

In that case, the HTTP request will look like:

POST / HTTP/1.1
Host: foo.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 13

say=Hi&to=Mom
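
As a rough illustration of what the browser does with the two forms above, the following Python sketch builds the same URL-encoded string with the standard urllib.parse module. The variable names are purely illustrative, and a real browser performs this encoding internally.

```python
from urllib.parse import urlencode

# The two form fields from the examples above, as name/value pairs.
fields = [("say", "Hi"), ("to", "Mom")]

# Encode the fields the way application/x-www-form-urlencoded requires.
encoded = urlencode(fields)

# With GET, the encoded string becomes the query string of the URL.
get_target = "/?" + encoded

# With POST, the same string is the request body;
# Content-Length is the size of that body in bytes.
post_body = encoded.encode("ascii")
content_length = len(post_body)

print(get_target)       # /?say=Hi&to=Mom
print(content_length)   # 13
```

The same encoded string is used in both cases; only its location in the request changes.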

Of course, HTTP requests are never displayed to users (to see them, you need tools such as the Net panel of Firebug or the Chrome Developer Tools). The only thing displayed to the user is the requested URL. So with a GET request, the user will see the data in the URL bar, but with a POST request, they won't. This is very important for two reasons:

  1. If you have to send a password (or any other sensitive piece of data), never use the GET method, or you risk displaying it in the URL bar.
  2. If you have to send a large amount of data, prefer the POST method, because some browsers limit the size of URLs and many servers are configured to limit the size of the URLs they accept.

On the server side: retrieving the data

Whichever HTTP method is used, the server receives a string that is parsed to get the data as a list of key/value pairs. How you access this list depends on the technology you use and on any framework you use with it. The technology also determines how duplicate keys are handled (often, the last value overrides the previous ones).
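
To make the parsing step and the duplicate-key question more concrete, here is a small sketch using Python's standard urllib.parse module. The input string is made up for the illustration, and the "last value wins" policy is only one possible choice, not what every server technology does.

```python
from urllib.parse import parse_qs

# A hypothetical submission in which the "to" key appears twice.
raw = "say=Hi&to=Mom&to=Dad"

# parse_qs keeps every value, grouped in a list under its key.
all_values = parse_qs(raw)

# A "last value wins" policy keeps only the final value of each key.
last_wins = {key: values[-1] for key, values in all_values.items()}

print(all_values)   # {'say': ['Hi'], 'to': ['Mom', 'Dad']}
print(last_wins)    # {'say': 'Hi', 'to': 'Dad'}
```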

Let's look at some examples.

Raw PHP example

PHP offers some global objects to access the data. Assuming we used the POST method, the following example just takes the data and displays it to the user. Of course, what you do with the data is up to you (display it, store it in a database, send it to someone by e-mail, etc.).

<?php
  // The global $_POST variable gives access to the data sent with the POST method
  // To access the data sent with the GET method, you can use $_GET
  $say = htmlspecialchars($_POST['say']);
  $to  = htmlspecialchars($_POST['to']);

  echo  $say, ' ', $to;

This example displays a page with the data we sent. In our example:

Hi Mom

Raw Python example

Let's see the same example using the Python language this time. It uses the CGI Python package to access the form data.

#!/usr/bin/env python
# Note: the cgi and cgitb modules were deprecated in Python 3.11 and removed in 3.13
import html
import cgi
import cgitb; cgitb.enable()     # for troubleshooting

print("Content-Type: text/html") # HTTP header to say HTML is following
print()                          # blank line, end of headers

form = cgi.FieldStorage()
say  = html.escape(form["say"].value)
to   = html.escape(form["to"].value)

print(say, to)

The result is the same as with PHP:

Hi Mom

Other languages and frameworks

There are many other server-side technologies you can use to handle forms (Perl, Java, .NET, Ruby, etc.); just pick the one you like best. That said, it's worth noting that it's very uncommon to use those technologies directly, because doing so can be tricky.

Fortunately, there are many clever developers out there who have built nice frameworks to make our lives easier when dealing with web development, including HTML forms:

  • Symfony for PHP
  • Django for Python
  • Ruby On Rails for Ruby
  • Grails for Java
  • etc.

OK, let's be honest: using those frameworks is not easy, but if you take the time to learn at least one of them, you'll save a huge amount of time in the end (or at worst, you'll be able to talk with the friends or colleagues who will do the job).

A special case: sending files

Files are a special case with HTML forms. They are binary data, or considered as such, whereas all other data is text. Because HTTP is a text protocol, there are special requirements for handling binary data.

The {{htmlattrxref("enctype","form")}} attribute

This attribute lets you specify the value of the Content-Type HTTP header. This header is very important because it tells the server what kind of data is being sent. By default, its value is application/x-www-form-urlencoded. In human terms: "This is form data that has been URL-encoded."

But if you want to send files, you have to do two things:

  • Set the {{htmlattrxref("method","form")}} attribute to POST, because file content can't be put inside a URL parameter using a form.
  • Set the value of {{htmlattrxref("enctype","form")}} to multipart/form-data, because the data will be split into multiple parts: one for each file, plus one for the text data that may be sent along with the files.

<form method="post" enctype="multipart/form-data">
    <input type="file" name="myFile">
    <button>Send the file</button>
</form>
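
To give an idea of what "split into multiple parts" means, the following Python sketch assembles a simplified multipart/form-data body by hand. The build_multipart helper, the field names, and the boundary string are all hypothetical; a real browser picks its own random boundary and sets more precise content types.

```python
def build_multipart(fields, files, boundary):
    """Return a simplified multipart/form-data body as bytes.

    fields:   dict mapping text field names to string values
    files:    dict mapping field names to (filename, content_bytes) tuples
    boundary: separator string placed between the parts
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    # One part per text field.
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"))
        lines.append(b"")
        lines.append(value.encode("utf-8"))
    # One part per file, carrying the raw bytes of the file.
    for name, (filename, content) in files.items():
        lines.append(delimiter)
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode("utf-8")
        )
        lines.append(b"Content-Type: application/octet-stream")
        lines.append(b"")
        lines.append(content)
    # The closing delimiter ends the body.
    lines.append(delimiter + b"--")
    lines.append(b"")
    return b"\r\n".join(lines)


body = build_multipart(
    {"note": "Hi Mom"},
    {"myFile": ("hello.bin", b"\x00\x01\x02")},
    "example-boundary",
)
print(body.decode("latin-1"))
```

Each part carries its own small header block, which is how the server can tell the file apart from the ordinary text fields.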

Note: Some browsers support the {{htmlattrxref("multiple","input")}} attribute on the {{HTMLElement("input")}} element, which allows sending more than one file with a single input element. How those files are handled on the server side depends on the technology used on the server. As mentioned previously, using a framework will make your life a lot easier.

Warning: Many servers are configured with a size limit for files and HTTP requests in order to prevent abuse. It's important to check what this limit is before sending a file.

Security concerns

Each time you send data to a server, you need to consider security. HTML forms are one of the first attack vectors against servers. The problems never come from the HTML forms themselves; they come from how the server handles the data.

Common security flaws

Depending on what you're doing, there are some very well-known security issues:

XSS and CSRF

Cross Site Scripting (XSS) and Cross Site Request Forgery (CSRF) are common types of attacks that can occur when you display data sent by a user back to that user or to another user.

XSS allows attackers to inject client-side script into Web pages viewed by other users. A cross-site scripting vulnerability may be used by attackers to bypass access controls such as the same origin policy. The effects of these attacks may range from a petty nuisance to a significant security risk.

CSRF attacks are similar to XSS attacks in that they often start the same way, by injecting client-side script into Web pages, but their target is different. CSRF attackers try to escalate privileges by tricking a privileged user (such as a site administrator) into performing an action they shouldn't (for example, sending data to an untrusted user).

XSS attacks exploit the trust a user has in a web site, while CSRF attacks exploit the trust a web site has in its users.

To prevent these attacks, you should always check the data a user sends to your server and, if you have to display it, try not to display HTML content provided by the user. Almost all frameworks on the market today implement a minimal filter that removes the HTML {{HTMLElement("script")}}, {{HTMLElement("iframe")}} and {{HTMLElement("object")}} elements from data sent by any user.
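
As a minimal sketch of "not displaying HTML content provided by the user", the following Python snippet escapes a hostile value with the standard html.escape function, the same call used in the raw Python example earlier. The input string is invented for the illustration.

```python
import html

# A hypothetical value submitted through a form field.
user_input = '<script>alert("XSS")</script>'

# html.escape replaces &, <, > and quotes with HTML entities,
# so the browser shows the text instead of running it.
safe_output = html.escape(user_input)

print(safe_output)   # &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
```

Escaping on output is only one layer of defense; it does not replace checking the data when it arrives.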

SQL Injection

SQL injection is a type of attack that tries to perform actions on a database used by the target web site. Typically, the attacker sends an SQL request and hopes the server will execute it (often when the application server tries to store the data). This is actually one of the main attack vectors against web sites.

The consequences can be terrible, ranging from data loss to access to a whole infrastructure through privilege escalation. This is a very serious threat, and you should never store data sent by a user without performing some sanitization (for example, by using mysql_real_escape_string on a PHP/MySQL infrastructure).
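
Escaping is the approach named above; a commonly used alternative is a parameterized query, where the database driver keeps the SQL text and the user data separate. The sketch below shows that alternative with Python's built-in sqlite3 module as a stand-in for a MySQL setup. The table and column names are invented for the example.

```python
import sqlite3

# An in-memory database, used here only so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE greetings (say TEXT, recipient TEXT)")

# Hostile input that could alter a query built by string concatenation.
say = "Hi"
to = "Mom'); DROP TABLE greetings; --"

# The ? placeholders make the driver treat the values as data, never as SQL.
conn.execute("INSERT INTO greetings (say, recipient) VALUES (?, ?)", (say, to))
conn.commit()

# The table still exists and holds the hostile string as plain text.
rows = conn.execute("SELECT say, recipient FROM greetings").fetchall()
print(rows)
```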

HTTP Header Injection and e-mail injection

This kind of attack can occur when your application builds HTTP headers or e-mails based on user input. These attacks will not directly damage your server or affect your users, but they open the door to deeper problems such as session hijacking or phishing attacks.

These attacks are mostly silent and can turn your server into a zombie.
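
One simple defensive measure against this kind of injection is to refuse any user-supplied value that contains a line break before it is placed in a header. The following Python sketch assumes a hypothetical safe_header_value helper; it is not part of any library, and real applications usually rely on their framework for this check.

```python
def safe_header_value(value):
    """Return value unchanged, or raise if it contains CR or LF.

    A line break inside a header value would let an attacker
    start a new header line of their own.
    """
    if "\r" in value or "\n" in value:
        raise ValueError("line breaks are not allowed in header values")
    return value


# A harmless value passes through untouched.
subject = safe_header_value("Greetings from the form")

# A value that tries to smuggle in an extra e-mail header is rejected.
try:
    safe_header_value("Hi\r\nBcc: victim@example.com")
    rejected = False
except ValueError:
    rejected = True

print(subject, rejected)
```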

Be paranoid: Never trust your users

So, how do you fight these threats? This topic is far beyond the scope of this guide, but there are a few rules worth keeping in mind. The most important rule is: never, ever trust your users, including yourself (I'm dead serious); even a trusted user could have been hijacked.

All data that comes to your server must be checked and sanitized. Always. No exceptions.

  • Escape dangerous characters (all server side languages have functions to do that)
  • Limit the incoming amount of data to what is necessary only
  • Sandbox uploaded files (store them on a different server and allow access to the file only through a different subdomain or even better through a fully different domain name)

If you follow these three rules of thumb, you should avoid many problems, but nothing replaces a good security review performed by a third party. As I said, don't even trust yourself ;)

Conclusion

As you can see, sending form data is easy, but securing an application can be tricky. Just remember that a front-end developer is not the one who should define the security model of the data. Yes, as we'll see, it's possible to perform client-side data validation, but the server can't trust this validation because it has no way to truly know what really happened on the client side.

See also

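If you want to learn more about securing a web application, you can dig into these resources:

  • The Open Web Application Security Project (OWASP)
  • Chris Shiflett's blog about PHP Security
  • Learning material from Google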

Revision Source

<p>In many cases, the purpose of an <a href="/en-US/docs/HTML/Forms" title="/en-US/docs/HTML/Forms">HTML Form</a> is to send data to a server. The server processes the data and then sends a response to the user. This seems simple, but it's important to keep a few things in mind to ensure that the data will not damage your server or cause trouble for your users.</p>
<h2>Where the data goes</h2>
<h3>A quick reminder about the Client/Server architecture</h3>
<p>The web is based on a very basic client/server architecture that can be summarized as follows: a client (usually a web browser) sends a request to a server (usually web server software). This request is sent using the HTTP protocol. The server answers the request using the same protocol.</p>
<p>[ADD SCHEMA]</p>
<p>On the client side, an HTML Form is nothing more than a convenient way to configure an HTTP request in order to send data to a server.</p>
<h3>On the client side: define how to send data</h3>
<p>To define how the data will be sent, we just have to deal with the {{HTMLElement("form")}} element. All of its attributes are designed to configure the request that is sent when a user hits a submit button. The two most important attributes are {{htmlattrxref("action","form")}} and {{htmlattrxref("method","form")}}.</p>
<h4>The {{htmlattrxref("action","form")}} attribute</h4>
<p>This attribute defines where the data is sent. Its value must be a valid URL. If it's not set, the data will be sent to the URL of the page that contains the form.</p>
<pre class="brush: html">
&lt;!-- The data will be sent to http://foo.com --&gt;
&lt;form action="http://foo.com"&gt;

&lt;!-- The data will be sent to the same server as the one used by the page
     that hosts the form, but to another URL. --&gt;
&lt;form action="/somewhere_else"&gt;

&lt;!-- The data will be sent to the page that hosts the form --&gt;
&lt;form&gt;

&lt;!-- This is another way to send the data to the page that hosts the form.
     This notation is quite common in legacy HTML content because
     the action attribute was required with HTML4 and XHTML1.
     With HTML5, this attribute is no longer mandatory. --&gt;
&lt;form action="#"&gt;</pre>
<div class="note">
  <p><strong>Note:</strong> It's possible to specify a URL that uses HTTPS. In that case, the data is encrypted along with the rest of the request, even if the form itself is hosted on a page accessed through HTTP. On the other hand, if the form is hosted on a page accessed through HTTPS and the action attribute is set to an HTTP URL, browsers display a security warning to the user each time they try to send data.</p>
</div>
<h4>The {{htmlattrxref("method","form")}} attribute</h4>
<p>This attribute defines how the data is sent. The <a href="/en-US/docs/HTTP" title="/en-US/docs/HTTP">HTTP protocol</a> provides several ways to perform a request; HTML form data can be sent through at least two of them: the GET method and the POST method.</p>
<p>To understand the difference between those two methods, let's step back and examine how HTTP works. Each time you want to reach a resource on the Web, the browser sends a request to a URL. An HTTP request is made of two parts: a header that contains a set of global metadata about the browser's capabilities, and a body that can contain information the server needs to process that specific request.</p>
<p>The GET method is the method used by the browser to ask the server for a given resource: "Hey server, I want to get this resource." In this case, the browser sends an empty body. Because the body is empty, if a form is sent using this method, the data is appended to the URL.</p>
<pre class="brush: html">
&lt;form action="http://foo.com" method="get"&gt;
  &lt;input name="say" value="Hi"&gt;
  &lt;input name="to" value="Mom"&gt;
  &lt;button&gt;Send my greetings&lt;/button&gt;
&lt;/form&gt;</pre>
<p>With the GET method, the HTTP request will look like:</p>
<pre>
GET /?say=Hi&amp;to=Mom HTTP/1.1
Host: foo.com</pre>
<p>The POST method is a little different. It's the method used by the browser to ask the server for a given resource after the server has updated that resource with the data available in the body of the HTTP request: "Hey server, update the resource with this data, then send me back the result." If a form is sent using this method, the data is placed in the body of the HTTP request.</p>
<pre class="brush: html">
&lt;form action="http://foo.com" method="post"&gt;
  &lt;input name="say" value="Hi"&gt;
  &lt;input name="to" value="Mom"&gt;
  &lt;button&gt;Send my greetings&lt;/button&gt;
&lt;/form&gt;</pre>
<p>In that case, the HTTP request will look like:</p>
<pre>
POST / HTTP/1.1
Host: foo.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 13

say=Hi&amp;to=Mom</pre>
<p>Of course, HTTP requests are never displayed to users (to see them, you need tools such as the Net panel of Firebug or the Chrome Developer Tools). The only thing displayed to the user is the requested URL. So with a GET request, the user will see the data in the URL bar, but with a POST request, they won't. This is very important for two reasons:</p>
<ol>
  <li>If you have to send a password (or any other sensitive piece of data), never use the GET method, or you risk displaying it in the URL bar.</li>
  <li>If you have to send a large amount of data, prefer the POST method, because some browsers limit the size of URLs and many servers are configured to limit the size of the URLs they accept.</li>
</ol>
<h3>On the server side: retrieving the data</h3>
<p>Whichever HTTP method is used, the server receives a string that is parsed to get the data as a list of key/value pairs. How you access this list depends on the technology you use and on any framework you use with it. The technology also determines how duplicate keys are handled (often, the last value overrides the previous ones).</p>
<p>Let's look at some examples.</p>
<h4>Raw PHP example</h4>
<p>PHP offers some global objects to access the data. Assuming we used the POST method, the following example just takes the data and displays it to the user. Of course, what you do with the data is up to you (display it, store it in a database, send it to someone by e-mail, etc.).</p>
<pre class="brush: php">
&lt;?php
  // The global $_POST variable gives access to the data sent with the POST method
  // To access the data sent with the GET method, you can use $_GET
  $say = htmlspecialchars($_POST['say']);
  $to  = htmlspecialchars($_POST['to']);

  echo  $say, ' ', $to;</pre>
<p>This example displays a page with the data we sent. In our example:</p>
<pre>
Hi Mom</pre>
<h4>Raw Python example</h4>
<p>Let's see the same example using the Python language this time. It uses the <a href="http://docs.python.org/3/library/cgi.html" rel="external" title="http://docs.python.org/3/library/cgi.html">CGI Python package</a> to access the form data.</p>
<pre class="brush: python">
#!/usr/bin/env python
# Note: the cgi and cgitb modules were deprecated in Python 3.11 and removed in 3.13
import html
import cgi
import cgitb; cgitb.enable()     # for troubleshooting

print("Content-Type: text/html") # HTTP header to say HTML is following
print()                          # blank line, end of headers

form = cgi.FieldStorage()
say  = html.escape(form["say"].value)
to   = html.escape(form["to"].value)

print(say, to)</pre>
<p>The result is the same as with PHP:</p>
<pre>
Hi Mom</pre>
<h4>Other languages and frameworks</h4>
<p>There are many other server-side technologies you can use to handle forms (Perl, Java, .NET, Ruby, etc.); just pick the one you like best. That said, it's worth noting that it's very uncommon to use those technologies directly, because doing so can be tricky.</p>
<p>Fortunately, there are many clever developers out there who have built nice frameworks to make our lives easier when dealing with web development, including HTML forms:</p>
<ul>
  <li><a href="http://symfony.com/" rel="external" title="http://symfony.com/">Symfony</a> for PHP</li>
  <li><a href="https://www.djangoproject.com/" rel="external" title="https://www.djangoproject.com/">Django</a> for Python</li>
  <li><a href="http://rubyonrails.org/" rel="external" title="http://rubyonrails.org/">Ruby On Rails</a> for Ruby</li>
  <li><a href="http://grails.org/" rel="external" title="http://grails.org/">Grails</a> for Java</li>
  <li>etc.</li>
</ul>
<p><br />
  OK, let's be honest: using those frameworks is not easy, but if you take the time to learn at least one of them, you'll save a huge amount of time in the end (or at worst, you'll be able to talk with the friends or colleagues who will do the job).</p>
<h2>A special case: sending files</h2>
<p>Files are a special case with HTML forms. They are binary data, or considered as such, whereas all other data is text. Because HTTP is a text protocol, there are special requirements for handling binary data.</p>
<h3>The {{htmlattrxref("enctype","form")}} attribute</h3>
<p>This attribute lets you specify the value of the Content-Type HTTP header. This header is very important because it tells the server what kind of data is being sent. By default, its value is <code>application/x-www-form-urlencoded</code>. In human terms: "This is form data that has been URL-encoded."</p>
<p>But if you want to send files, you have to do two things:</p>
<ul>
  <li>Set the {{htmlattrxref("method","form")}} attribute to <code>POST</code>, because file content can't be put inside a URL parameter using a form.</li>
  <li>Set the value of {{htmlattrxref("enctype","form")}} to <code>multipart/form-data</code>, because the data will be split into multiple parts: one for each file, plus one for the text data that may be sent along with the files.</li>
</ul>
<pre class="brush: html">
&lt;form method="post" enctype="multipart/form-data"&gt;
    &lt;input type="file" name="myFile"&gt;
    &lt;button&gt;Send the file&lt;/button&gt;
&lt;/form&gt;</pre>
<div class="note">
  <p><strong>Note:</strong> Some browsers support the {{htmlattrxref("multiple","input")}} attribute on the {{HTMLElement("input")}} element, which allows sending more than one file with a single input element. How those files are handled on the server side depends on the technology used on the server. As mentioned previously, using a framework will make your life a lot easier.</p>
</div>
<div class="warning">
  <p><strong>Warning:</strong> Many servers are configured with a size limit for files and HTTP requests in order to prevent abuse. It's important to check what this limit is before sending a file.</p>
</div>
<h2>Security concerns</h2>
<p>Each time you send data to a server, you need to consider security. HTML forms are one of the first attack vectors against servers. The problems never come from the HTML forms themselves; they come from how the server handles the data.</p>
<h3>Common security flaws</h3>
<p>Depending on what you're doing, there are some very well-known security issues:</p>
<h4>XSS and CSRF</h4>
<p>Cross Site Scripting (XSS) and Cross Site Request Forgery (CSRF) are common types of attacks that can occur when you display data sent by a user back to that user or to another user.</p>
<p>XSS allows attackers to inject client-side script into Web pages viewed by other users. A cross-site scripting vulnerability may be used by attackers to bypass access controls such as the same origin policy. The effects of these attacks may range from a petty nuisance to a significant security risk.</p>
<p>CSRF attacks are similar to XSS attacks in that they often start the same way, by injecting client-side script into Web pages, but their target is different. CSRF attackers try to escalate privileges by tricking a privileged user (such as a site administrator) into performing an action they shouldn't (for example, sending data to an untrusted user).</p>
<p>XSS attacks exploit the trust a user has in a web site, while CSRF attacks exploit the trust a web site has in its users.</p>
<p>To prevent these attacks, you should always check the data a user sends to your server and, if you have to display it, try not to display HTML content provided by the user. Almost all frameworks on the market today implement a minimal filter that removes the HTML {{HTMLElement("script")}}, {{HTMLElement("iframe")}} and {{HTMLElement("object")}} elements from data sent by any user.</p>
<h4>SQL Injection</h4>
<p>SQL injection is a type of attack that tries to perform actions on a database used by the target web site. Typically, the attacker sends an SQL request and hopes the server will execute it (often when the application server tries to store the data). This is actually <a href="https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project" rel="external" title="https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project">one of the main attack vectors against web sites</a>.</p>
<p>The consequences can be terrible, ranging from data loss to access to a whole infrastructure through privilege escalation. This is a very serious threat, and you should never store data sent by a user without performing some sanitization (for example, by using <a href="http://www.php.net/manual/en/function.mysql-real-escape-string.php" rel="external" title="http://www.php.net/manual/en/function.mysql-real-escape-string.php">mysql_real_escape_string</a> on a PHP/MySQL infrastructure).</p>
<h4>HTTP Header Injection and e-mail injection</h4>
<p>This kind of attack can occur when your application builds HTTP headers or e-mails based on user input. These attacks will not directly damage your server or affect your users, but they open the door to deeper problems such as session hijacking or phishing attacks.</p>
<p>These attacks are mostly silent and can turn your server into a <a href="http://en.wikipedia.org/wiki/Zombie_(computer_science)" rel="external" title="http://en.wikipedia.org/wiki/Zombie_(computer_science)">zombie</a>.</p>
<h3>Be paranoid: Never trust your users</h3>
<p>So, how do you fight these threats? This topic is far beyond the scope of this guide, but there are a few rules worth keeping in mind. The most important rule is: never, ever trust your users, including yourself (I'm dead serious); even a trusted user could have been hijacked.</p>
<p>All data that comes to your server must be checked and sanitized. Always. No exceptions.</p>
<ul>
  <li>Escape dangerous characters (all server side languages have functions to do that)</li>
  <li>Limit the incoming amount of data to what is necessary only</li>
  <li>Sandbox uploaded files (store them on a different server and allow access to the file only through a different subdomain or even better through a fully different domain name)</li>
</ul>
<p>If you follow these three rules of thumb, you should avoid many problems, but nothing replaces a good security review performed by a third party. As I said, don't even trust yourself ;)</p>
<h2>Conclusion</h2>
<p>As you can see, sending form data is easy, but securing an application can be tricky. Just remember that a front-end developer is not the one who should define the security model of the data. Yes, as we'll see, it's possible to perform client-side data validation, but the server can't trust this validation because it has no way to truly know what really happened on the client side.</p>
<h2>See also</h2>
<p>If you want to learn more about securing a web application, you can dig into these resources:</p>
<ul>
  <li><a href="https://www.owasp.org/index.php/Main_Page" rel="external" title="https://www.owasp.org/index.php/Main_Page">The Open Web Application Security Project (OWASP)</a></li>
  <li><a href="http://shiflett.org/" rel="external" title="http://shiflett.org/">Chris Shiflett's blog about PHP Security</a></li>
  <li><a href="https://code.google.com/intl/en/edu/security/index.html" rel="external" title="https://code.google.com/intl/en/edu/security/index.html">Learning material from Google</a></li>
</ul>