<pre>
Categorising usage of HTTP methods - a matter of trust.
=======================================================
Introduction
------------
There have been a number of threads on the REST mailing list
discussing the usage of HTTP methods, in particular whether
using GET/PUT/DELETE/POST(a) is more RESTful than GET/POST(p).
Note: POST(a) refers to using POST as an append operation,
POST(p) refers to using POST as a 'process this' operation.
The semantic difference between these two flavours of POST is purely
at the application level; neither REST nor HTTP makes this distinction.
Here is some of the previous discussion:
http://groups.yahoo.com/group/rest-discuss/message/4723
http://asynchronous.org/blog/archives/2004/08/index.html
http://groups.yahoo.com/group/rest-discuss/message/4725
http://groups.yahoo.com/group/rest-discuss/message/4728
The hypothesis I put forward here is that:
Whether it is appropriate to use GET/PUT/DELETE/POST(a)
rather than GET/POST(p) is a matter of the level of trust
between the user agent and the origin server.
Put another way, you can only expose the resources on an
origin server through GET/PUT/DELETE/POST(a) if the origin
server has complete trust in the user agent. In any other
circumstances you must limit the communication to
GET/POST(p).
Analysis of the methods
-----------------------
Methods for UNTRUSTED user agents
---------------------------------
GET = data retrieval
POST(p) = data processing
GET is a read-only HTTP method without side effects, so it
presents no threat when used by an untrusted user agent.
POST is a request to process data contained in the entity
body. Its impact can be controlled by the process handling
the POST request. The request entity can be validated and
appropriately constrained to ensure that the integrity of
the server and the resources for which it is responsible
are not compromised.
Methods for TRUSTED user agents
-------------------------------
GET = data retrieval
PUT = write resource
DELETE = remove resource
POST(a) = create subordinate resource
GET (see above).
PUT is analogous to a write of the entity body to the
resource URI. A high level of trust is required that the
user agent will not unintentionally or maliciously
corrupt the resource that is being replaced. The resource
cannot be amended or extended by a PUT; it must be replaced.
This places a greater burden on the user agent to understand
the internal state of the resource and to maintain its
consistency within itself and in relation to the resources
that surround it.
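To illustrate the burden this places on the user agent, here is
a minimal sketch using a hypothetical in-memory store. The names
store, get and put are illustrative assumptions, not part of any
real server. A user agent that sends only a partial
representation silently loses the fields it left out:

```python
import copy

# Hypothetical in-memory stand-in for the origin server's resources.
store = {"/customers/7": {"name": "Ann", "email": "ann@example.org"}}


def get(uri):
    """Return a copy of the stored representation."""
    return copy.deepcopy(store[uri])


def put(uri, entity):
    """Replace the resource wholesale; nothing of the old state survives."""
    store[uri] = copy.deepcopy(entity)


# Careless user agent: sends only the field it wants to change.
put("/customers/7", {"email": "ann@example.com"})
lost_name = "name" not in store["/customers/7"]

# Restore the original state for the second half of the illustration.
put("/customers/7", {"name": "Ann", "email": "ann@example.org"})

# Careful user agent: GET the full state, change it, PUT it all back.
customer = get("/customers/7")
customer["email"] = "ann@example.com"
put("/customers/7", customer)
```

In the careless case the name field is gone after the PUT, which
is why the origin server has to trust the user agent to send a
complete and consistent representation.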
DELETE has constraints and an impact similar to those of PUT.
It requires a similar level of trust in the user agent.
POST(a) can be used to create a new subordinate resource.
This can be thought of as a two-step process, even if it
is not implemented this way:
1. Calculate a new URI for the child resource.
2. PUT the enclosed entity to that URI.
Therefore, POST(a) has the same impact and requires the same
level of trust as PUT and DELETE.
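The two steps can be sketched as follows. This is only an
illustration under stated assumptions: the in-memory store, the
counter-based URI scheme and the function names are all
hypothetical, and a real origin server is free to allocate URIs
however it likes.

```python
import itertools

store = {}                   # URI -> entity; stands in for the origin server
_ids = itertools.count(1)    # hypothetical scheme for allocating child URIs


def put(uri, entity):
    """Write the entity to the URI, replacing whatever was there."""
    created = uri not in store
    store[uri] = entity
    return 201 if created else 200


def post_append(parent_uri, entity):
    """POST(a): create a subordinate resource of parent_uri."""
    # Step 1: calculate a new URI for the child resource.
    child_uri = "%s/%d" % (parent_uri.rstrip("/"), next(_ids))
    # Step 2: PUT the enclosed entity to that URI.
    status = put(child_uri, entity)
    return status, child_uri
```

Because post_append ends in an ordinary put, the entity is stored
exactly as the user agent supplied it, which is the source of the
trust requirement.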
Observations on levels of trust
-------------------------------
These observations only concern those methods that request
modifications to resources. The GET method's behaviour is
identical for trusted and untrusted user agents.
References to code mean the business application code,
not the code for the user agent (e.g. a browser) or origin
server (e.g. Apache).
Observations on UNTRUSTED access
................................
+---------+          +------------+
|         |   GET    |            |
|  User   |<---------|   Origin   |
|  Agent  |          |   Server   |
|         |<-------->|            |
+---------+  POST(p) +------------+
1. Validation of the request is performed on the origin server
before the modifications are made and saved.
2. Modifications or amendments to existing resources are
possible.
3. Modifications to many resources are possible from the one
request.
4. The user agent only requires knowledge of the interface to
the resource. This may be as simple as a set of name/value
pairs.
5. The main processing logic occurs on the origin server.
6. The user agent may treat the responses from the origin
server in a fairly generic manner. A browser is an example of
this as it displays the responses according to the rules
for displaying HTML without any knowledge of the
application.
7. User agent code may be written by a different organisation
from the one that wrote the code on the origin server.
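Observations 1, 3 and 4 can be illustrated with a minimal sketch.
The account resources, the transfer processor and its name/value
fields (from, to, amount) are hypothetical and chosen only for
the example:

```python
# Hypothetical resource state on the origin server: URI -> balance.
accounts = {"/accounts/a": 50, "/accounts/b": 10}


def post_transfer(form):
    """POST(p) to a hypothetical transfer processor; return (status, body)."""
    source, target = form.get("from"), form.get("to")
    if source not in accounts or target not in accounts or source == target:
        return 400, "unknown or identical accounts"
    try:
        amount = int(form.get("amount", ""))
    except ValueError:
        return 400, "amount must be an integer"
    if not 0 < amount <= accounts[source]:
        return 409, "amount out of range"
    # Every check has passed: amend both resources from the one request.
    accounts[source] -= amount
    accounts[target] += amount
    return 200, "transferred %d" % amount
```

The user agent only needs to know the three name/value pairs. It
never sees or supplies the balances themselves, so a rejected
request leaves both resources untouched.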
Observations on TRUSTED access
..............................
+---------+          +------------+
|         |   GET    |            |
|  User   |<---------|   Origin   |
|  Agent  |          |   Server   |
|         |--------->|            |
+---------+   PUT    +------------+
             POST(a)
             DELETE
1. The user agent must maintain relatively complete
representations of the internal state of the resources
it is manipulating.
2. Existing resource state on the origin server is
clobbered.
3. The user agent can only save the state of one resource
at a time.
4. The entire resource must be replaced as partial updates
are not possible.
5. The origin server may enforce constraints on the resources:
- referential integrity between resources
- structural, type or format constraints
- data validation.
6. The user agent requires intimate knowledge of the
representations of resources and the relationships
between resources.
7. The main processing logic occurs on the user agent. The
origin server is a repository for permanent or long
term application state.
8. The user agent code is probably written by the same
organisation that wrote the code on the origin
server or in close collaboration with them.
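Observation 5 can be sketched as follows. The store layout, the
order format and the status codes chosen are assumptions made for
the illustration; the point is only that the origin server can
accept or reject a PUT or DELETE, not amend it:

```python
# Hypothetical resources: an order refers to a customer by URI.
store = {
    "/customers/7": {"name": "Ann"},
    "/orders/1": {"customer": "/customers/7", "item": "widget"},
}


def put_order(uri, entity):
    """PUT an order; reject it if it breaks a constraint."""
    if set(entity) != {"customer", "item"}:      # structural constraint
        return 400
    if entity["customer"] not in store:          # referential integrity
        return 409
    created = uri not in store
    store[uri] = dict(entity)
    return 201 if created else 200


def delete(uri):
    """DELETE a resource unless another resource still refers to it."""
    if uri not in store:
        return 404
    if any(r.get("customer") == uri for r in store.values()):
        return 409
    del store[uri]
    return 204
```

Here the user agent has to know that orders must be deleted
before the customer they refer to, which is the kind of intimate
knowledge described in observation 6.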
Three-tier architecture
-----------------------
It is interesting to note that in the untrusted model the main
processing is done on the origin server, while in the trusted
model the main processing is done on the user agent. When
these two models are overlapped, we end up with the traditional
three-tier architecture.
Presentation            Business                  Data
   Layer                  Logic                 Services
          {Untrusted}               {Trusted}
          { access  }               { access}
+---------+          +-------------+         +------------+
|         |   GET    |             |   GET   |            |
|  User   |<---------|Origin   User|<--------|   Origin   |
|  Agent  |          |Server  Agent|         |   Server   |
|         |<-------->|             |-------->|            |
+---------+  POST(p) +-------------+   PUT   +------------+
                                     POST(a)
                                     DELETE
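The dual role of the middle tier can be sketched as follows. The
class names, the stock resource and the order form fields are
hypothetical, and in-process method calls stand in for HTTP
requests:

```python
class DataService:
    """Data tier: a plain resource store exposing GET and PUT."""

    def __init__(self):
        self.resources = {"/stock/widget": {"quantity": 5}}

    def get(self, uri):
        entity = self.resources.get(uri)
        return None if entity is None else dict(entity)

    def put(self, uri, entity):
        self.resources[uri] = dict(entity)


class BusinessLogic:
    """Middle tier: origin server to the browser, user agent to the data tier."""

    def __init__(self, data):
        self.data = data

    def post_order(self, form):
        """POST(p) from an untrusted user agent; return (status, body)."""
        uri = "/stock/%s" % form.get("item", "")
        stock = self.data.get(uri)              # trusted GET
        if stock is None:
            return 404, "unknown item"
        try:
            wanted = int(form.get("quantity", ""))
        except ValueError:
            return 400, "quantity must be an integer"
        if not 0 < wanted <= stock["quantity"]:
            return 409, "cannot supply that quantity"
        stock["quantity"] -= wanted
        self.data.put(uri, stock)               # trusted PUT of the whole resource
        return 200, "ordered %d" % wanted
```

The data tier performs no validation of its own in this sketch;
it relies entirely on the business logic, which is the trust
relationship the diagram describes.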
The next logical step is to allow the business logic server to
talk to another business logic server, and we have RESTful Web
Services.
Presentation            Business                  Data
   Layer                  Logic                 Services
          {Untrusted}               {Trusted}
          { access  }               { access}
+---------+          +-------------+         +------------+
|         |   GET    |             |   GET   |            |
|  User   |<---------|Origin   User|<--------|   Origin   |
|  Agent  |          |Server  Agent|         |   Server   |
|         |<-------->|             |-------->|            |
+---------+  POST(p) +-------------+   PUT   +------------+
                         |     ^     POST(a)
                         |     |     DELETE
                         |     |
                     GET |     | POST(p)
                         V     |
+---------+          +-------------+
|         |   GET    |         User|
|  User   |<---------|        Agent|
|  Agent  |          |Origin       |
|         |<-------->|Server       |
+---------+  POST(p) +-------------+
Conclusion
----------
This article looks at categorising the use of HTTP methods
in relation to the level of trust between the user agent and
the origin server.
It suggests that GET/PUT/DELETE/POST(a) methods should only be
used where there is a high level of trust between the user
agent and the origin server. Where the level of trust is not
sufficient, only the GET/POST(p) methods should be used.
It further suggests that these access models can be mapped
onto a three-tier architecture. Untrusted access methods are
appropriate to use between the Presentation Layer and Business
Logic. Trusted access methods are appropriate to use between
the Business Logic and Data Services. Untrusted access methods
are also appropriate to use between independent units of Business
Logic.
Finally, I contend that trusted access to web resources using
GET/PUT/DELETE/POST(a) and untrusted access using
GET/POST(p) are two sides of the same coin and that one
is certainly NOT better or more RESTful than the other.
Donald Strong.
</pre>