> I don't think I've seen a strong case that this optimization is
> important. Would Wiki be noticeably better if it used PUT instead of
> POST?
Oopsie with that last message. I'll continue my thought here.
There's *currently* a small efficiency advantage in having PUT as its
own method: caching. But safety is also an issue. Permitting the
PUT method to go behind a firewall is a much safer thing to do than
permitting POST, for example, because PUT behaviour is much more
narrowly defined.
I would expect that there are other good reasons too, but that's all
that I can think of right now.
MB
--
Mark Baker, Chief Science Officer, Planetfred, Inc.
Ottawa, Ontario, CANADA. mbaker@... http://www.markbaker.ca http://www.planetfred.com
> > This is a really small advantage in a write-few/read-many environment.
> > I wouldn't worry about it.
>
> That's my point! If PUTs have a small advantage over POST then why
> should we REST-ies spend our valuable moral capital fighting with FORMs
> peoples etc. over PUT.
There's *currently* a small advantage; caching. There may be a larger
advantag
> Well, I'm mostly playing the devil's advocate
> because of course fighting for the Right Thing is its own reward. ;)
> Seriously though, I'd rather have strong arguments in favour of PUT than
> this one which seems quite weak to me.
I'm not rabid about the need for PUT in XForms. It wouldn't be a
replacement for having user agents supporting it as a "File->Save" type
operation. I'd see it mostly used for creating new resources, where
the content provider could specify the URI for the new resource.
Developers shouldn't commonly use PUT with XForms to try and replace
this missing functionality from user agents. But hopefully any user
agent supporting XForms and PUT would go the next step and support
PUT more generally, as Amaya does.
MB
Mark Baker wrote:
>
>...
>
> REST doesn't limit the number of methods. You might need 10 to
> coordinate something complicated. The point here being that there's
> a whole whack load of things that can't be *efficiently* coordinated
> with GET/POST. Just look at Wiki. It's doing PUT with POST, and so
> misses out on an important optimization (caching PUTs).
I don't think I've seen a strong case that this optimization is
important. Would Wiki be noticeably better if it used PUT instead of
POST?
Paul Prescod
Mark Baker wrote:
>
>...
>
> It only takes one person at MSN to do a GET on that same resource, and
> for MSN to cache that response, to have the same effect.
Right. And of course the same goes for the originating network, AOL.
So why worry about the caching of PUTs? The cache can just
wait for the next GET.
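
To make the two options concrete, here is a toy sketch of an intermediary cache. Everything in it is hypothetical (the `ToyCache` class, the `remember` flag, the dict standing in for the origin server); it is not how any real proxy is implemented. It only shows that a cache can either keep the body of a PUT or drop its stale entry and wait for the next GET, and that a second cache which never saw the PUT stays out of date either way.

```python
class ToyCache:
    """Toy intermediary cache keyed by URI (illustrative only)."""

    def __init__(self, origin):
        self.origin = origin  # a dict standing in for the origin server
        self.store = {}       # cached representations, keyed by URI

    def get(self, uri):
        # Serve from the cache when possible; otherwise ask the origin.
        if uri not in self.store:
            self.store[uri] = self.origin[uri]
        return self.store[uri]

    def put(self, uri, body, remember=False):
        # Always forward the write to the origin.
        self.origin[uri] = body
        if remember:
            # Option A: keep the PUT body as the new cached representation.
            self.store[uri] = body
        else:
            # Option B: drop the stale entry and wait for the next GET.
            self.store.pop(uri, None)


origin = {"/wiki/FrontPage": "v1"}
aol = ToyCache(origin)
msn = ToyCache(origin)

aol.get("/wiki/FrontPage")        # both caches now hold "v1"
msn.get("/wiki/FrontPage")
aol.put("/wiki/FrontPage", "v2")  # the write passes through AOL only
```

After the PUT, a GET through `aol` sees the new representation under either option, while `msn` keeps serving its old copy until that entry expires or is refreshed.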
> This is a really small advantage in a write-few/read-many environment.
> I wouldn't worry about it.
That's my point! If PUTs have only a small advantage over POST, then why
should we REST-ies spend our valuable moral capital fighting with FORMs
peoples etc. over PUT? Well, I'm mostly playing the devil's advocate
because of course fighting for the Right Thing is its own reward. ;)
Seriously though, I'd rather have strong arguments in favour of PUT than
this one which seems quite weak to me.
Paul Prescod
> > Of course! How else can you explicitly set state and have
> > intermediaries know about it (so they can cache it)? You can't do that
> > with POST.
>
> Is this intermediary argument realistic? Let's say I'm going through a
> big honking cache at AOL. I do a PUT. Now all AOL users in the world
> have an updated view of the world because our cache recorded the PUT.
> But everyone else in the universe is out of date. Why should AOL users
> get a privileged view of the world?
It only takes one person at MSN to do a GET on that same resource, and
for MSN to cache that response, to have the same effect.
This is a really small advantage in a write-few/read-many environment.
I wouldn't worry about it.
MB
> Mark N. believes (and without looking it up, I would guess he's right)
> that the semantics of PUT are that an immediately following GET should
> return the same representation that was just PUT.
Actually, that's not quite right. There's content negotiation to be
considered.
> Whereas POST has no
> such semantic. So when they say "data" they mean a representation that
> is appropriate for GETing.
>
> > I also don't know what John means by "The appropriate pair for REST
> > is GET/POST". Why "pair"? Why restrict one's self to two methods
> > for coordination when three is needed?
>
> You claim that three is needed (why not four) but the web more or less
> gets by with two today!
REST doesn't limit the number of methods. You might need 10 to
coordinate something complicated. The point here being that there's
a whole whack load of things that can't be *efficiently* coordinated
with GET/POST. Just look at Wiki. It's doing PUT with POST, and so
misses out on an important optimization (caching PUTs).
MB
Mark Baker wrote:
>
>...
>
> Of course! How else can you explicitly set state and have
> intermediaries know about it (so they can cache it)? You can't do that
> with POST.
Is this intermediary argument realistic? Let's say I'm going through a
big honking cache at AOL. I do a PUT. Now all AOL users in the world
have an updated view of the world because our cache recorded the PUT.
But everyone else in the universe is out of date. Why should AOL users
get a privileged view of the world?
Paul Prescod
Mark Baker wrote:
>
>...
>
> I don't understand what you or John are referring to when you refer
> to a difference between "data" and the "representation". REST says
> nothing of "data", nor should it because it's hidden (if I understand
> what is meant by "data" - basically, the state in some raw form).
Mark N. believes (and without looking it up, I would guess he's right)
that the semantics of PUT are that an immediately following GET should
return the same representation that was just PUT. Whereas POST has no
such semantic. So when they say "data" they mean a representation that
is appropriate for GETing.
> I also don't know what John means by "The appropriate pair for REST
> is GET/POST". Why "pair"? Why restrict one's self to two methods
> for coordination when three is needed?
You claim that three is needed (why not four) but the web more or less
gets by with two today!
Paul Prescod
> John Barton has replied [1] to my e-mail, saying that his
> interpretation of REST is that POST is more appropriate than PUT.
>
> Still being a REST novice (and still, dammit, not having read the
> entirety of Roy's dissertation), what say you, rest-discuss? Does PUT
> have a place in REST?
Of course! How else can you explicitly set state and have
intermediaries know about it (so they can cache it)? You can't do that
with POST.
> My instinct is that there is; while POST has the benefit of
> separating the data and representations, there are times when it's
> beneficial to say "here's a representation; when people request this
> resource in the future, give it to them." Does having such a direct
> relationship violate REST?
I don't understand what you or John are referring to when you refer
to a difference between "data" and the "representation". REST says
nothing of "data", nor should it because it's hidden (if I understand
what is meant by "data" - basically, the state in some raw form).
I also don't know what John means by "The appropriate pair for REST
is GET/POST". Why "pair"? Why restrict one's self to two methods
for coordination when three is needed?
MB
John Barton has replied [1] to my e-mail, saying that his
interpretation of REST is that POST is more appropriate than PUT.
Still being a REST novice (and still, dammit, not having read the
entirety of Roy's dissertation), what say you, rest-discuss? Does PUT
have a place in REST?
My instinct is that there is; while POST has the benefit of
separating the data and representations, there are times when it's
beneficial to say "here's a representation; when people request this
resource in the future, give it to them." Does having such a direct
relationship violate REST?
[1] http://lists.w3.org/Archives/Public/www-forms/2002Jan/0078.html
On Tue, Jan 15, 2002 at 01:33:41PM -0800, Mark Nottingham wrote:
>
> I've raised an issue [1] with the XForms WG regarding their lack of
> support for PUTing xml instance data. Supporting PUT would seem to be
> very helpful to RESTful applications...
>
> [1] http://lists.w3.org/Archives/Public/www-forms/2002Jan/thread.html#47
>
>
> --
> Mark Nottingham
> http://www.mnot.net/
--
Mark Nottingham
http://www.mnot.net/
--- In rest-discuss@y..., Mark Nottingham <mnot@m...> wrote:
>
> On Mon, Jan 07, 2002 at 08:42:42PM -0500, Mark Baker wrote:
> > Content negotiation could be used to negotiate a representation
> > capable of expressing a form.
>
> I was hoping to allow requests to the record resource
> http://www.example.com/addresses/thePerson
> to negotiate the representation returned, so that the default would
> be to return an HTML page, but if linked from an <img> element, the
> image/jpeg in the jpegPhoto attribute would be returned.
>
Even if you tunnel the Accept header in the URL (which means you need to
control the server or have a generic intermediary that un-tunnels
it), you'll want to strongly consider using sub-resources for images
and the like.
If the URI space you design can address LDAP attributes, then maybe
you'd have the problem solved?
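
A rough sketch of what such a URI space might look like, assuming each LDAP attribute is simply exposed as a path segment under the record's URI. The `attribute_uri` helper and this layout are hypothetical, not something anyone in the thread specified.

```python
from urllib.parse import quote


def attribute_uri(record_uri, attribute):
    """Return a sub-resource URI for one attribute of a record.

    Assumes (hypothetically) that attributes live one path segment
    below the record resource.
    """
    return record_uri.rstrip("/") + "/" + quote(attribute, safe="")


# The record URI is the example from the quoted message.
photo = attribute_uri("http://www.example.com/addresses/thePerson", "jpegPhoto")
```

An <img> element could then point at the photo's own URI, leaving content negotiation on the record resource for the cases that need it.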
Please contribute comments on this article I am working on for xml.com.
==============
Second Generation Web Services
In the early days of the Internet, it was common for enlightened
businesses to connect to the Internet merely by using SMTP, NNTP and FTP
clients and servers to deliver messages, text files, executables and
source code. The Internet became a more fundamental tool when businesses
started to integrate their corporate information (both public and
private) into the emerging Web framework. The Internet became popular
when it shifted from a focus on transactional protocols to a focus on
data objects and the links between them.
The technologies that characterize the early Web framework were
HTML/GIF/JPEG, HTTP and URLs. This combination of standardized formats,
a single application protocol and a single universal namespace was
incredibly powerful. Using these technologies, corporations integrated
their diverse online publishing systems into something much more
compelling than any one of them could have built.
Once organizations converged on common formats, the HTTP protocol and a
single addressing scheme, the Web became more than a set of Web sites.
It became the world's most diverse and powerful information system.
Organizations built links between their own information and other
people's. Amazing third party applications also wove the information
together. Examples include Google, Yahoo, Babelfish and Robin Cover's
XML citations.
First generation Web Services are like first generation Internet
connections. They are not integrated with each other and are not
designed so that third parties can easily integrate them in a uniform
way. I posit that the next generation will be more like the integrated
Web that arose for online publishing and human/computer interactions. In
fact, I believe that second generation web services will actually build
much more heavily on the architecture that made the Web work. Look for
the holy trinity: standardized formats (XML vocabularies), a
standardized application protocol and a single URI namespace.
This next generation of Web Services will likely bear the name "REST"
Web Services. REST is the underlying architectural model of the current
Web. It stands for REpresentational State Transfer. Roy Fielding of
eBuilt invented the name in his PhD dissertation
(http://www.ebuilt.com/fielding/pubs/dissertation/top.htm). Recently, Mark
Baker of PlanetFred has been a leading advocate of this architecture.
REST details why the Web has URIs, HTTP, HTML, JavaScript and many other
features. It has many aspects and I would not claim to understand it in
detail. I'm going to focus on the aspects that are most interesting to
XML users and developers.
The Current Generation
SOAP was originally intended to be a cross-Internet form of DCOM or
CORBA. The name of an early SOAP-like technology was "WebBroker" -
Web-based object broker. It made perfect sense to model an
inter-application protocol on DCOM, CORBA, RMI etc. because they were
the current models for solving inter-application interoperability
problems.
These RPC protocols achieved only limited success before they were
ported to the Web. Some believe that the problem was merely that
Microsoft and the OMG supporters could not get along. I disagree. There
is a deeper issue. RPC models are great for closed-world problems. A
closed world problem is one where you know all of the users, you can
share a data model with them, and you can all communicate directly as to
your needs. Evolution is comparatively easy in such an environment: you
just tell everybody that the RPC API is going to change on such and such
a date and perhaps you have some changeover period to avoid downtime.
When you want to integrate a new system you do so by building a
point-to-point integration.
On the other hand, when your user base is too large to communicate
coherently you need a different strategy. You need a pre-arranged
framework that allows for evolution on both the client and server sides.
You need to depend less on a shared, global understanding of the rights
and responsibilities of a participant. You need to put in hooks where
your users can innovate without contacting you. You need to leave in
explicit mechanisms for interoperating with systems that do not have the
same API. RPC protocols are traditionally poor at this kind of
evolution. Changing interfaces tends to be extremely difficult. I
believe that this is why no enterprise has ever successfully unified all
of their systems with an RPC protocol such as DCOM, CORBA or RMI.
Now we come to the crux of the problem: SOAP RPC is DCOM for the
Internet.
There are many problems that can be solved with an RPC methodology. But
I believe that the biggest, hairiest problems will require a model that
allows for independent evolution of clients, servers and intermediaries.
It is therefore important for us to study the only distributed
applications in history to ever scale to the size of the Internet.
The Archetypical Scalable Application
The two most massively scalable, radically interoperable, distributed
applications in the world today are the Web and email. What
makes these two so scalable and interoperable? For starters,
they both depend on standardized, extensible message formats (HTML and
MIME). They both depend on standardized, extensible application
protocols (HTTP and SMTP). But I believe that the most important thing
is that each has a global addressing scheme.
In the real estate world there is a joke that there are three things
that make a property valuable: location, location and location. The same
is true in the world of XML web services. Properly implemented, XML web
services allow you to assign addresses to data objects so that they may be
located for sharing or modification.
In particular, the web's central concept is a single unifying namespace
of URIs. URIs allow the dense web of links that make the Web worth
using. URIs identify resources. Resources are conceptual objects.
Representations of them are delivered across the web in HTTP messages.
These ideas are so simple and yet they are profoundly powerful and
demonstrably successful. URIs are extremely "loosely coupled". You can
pass a URI from one "system" to another using a piece of paper and OCR!
URIs are "late bound". They do not declare what can or should be done
with the information they reference. It is because they are so radically
"loose" and "late" that they scale to the level of the Web.
Unfortunately, most of us do not think of our web services in these
terms. Rather we think of them in terms of remote procedure calls
between endpoints that represent software components. This is CORBA/DCOM
thinking. Web thinking is organized around URIs for resources.
Claim: The next generation of web services will use individual data
objects as endpoints. Software component boundaries will be invisible
and irrelevant.
An Illustrative Example
UDDI is an example of a Web Service that could be made much, much more
robust as a second generation Web Service. I'm not discussing the
philosophical issues of UDDI's role in the web services world but the
very concrete issue of how to get information into and out of it. These
arguments will apply to most of the Web Services in existence, including
stock quote services, airplane reservations systems and so forth.
UDDI has a concept of a businessEntity representing a corporation.
Businesses are identified by UUIDs. The Web-centric way to do this would
have been to identify them by URIs. The simplest way to do this would be
to make a businessEntity an XML document addressable at a URI
like "http://www.uddi.org/businessEntity/ibm.com" or perhaps
"http://www.uddi.org/getbusinessEntity?ibm.com". The difference between
these two is subtle and does not have many technical implications so
let's not worry about it.
You can think of "http://www.uddi.org/businessEntity" as a directory
with files in it or a web service pulling data from a database. A
wonderful feature of the Web is that there is no way to tell which is
true just from looking at the URI. That is "loose coupling" in action!
Let's consider the implications of using HTTP-based URIs instead of
UUIDs for business entities:
* Anybody wanting to inspect that business entity would merely point
their (XML-aware!) browser at that URI and look at the businessEntity
record.
* Anybody wanting to reference the businessEntity (in another web
service or a document) could just use the URL.
* Anybody wanting to incorporate the referenced information into another
XML document could use an XLink, XPointer or XInclude.
* Anybody wanting a permanent copy of the record could use a command
line tool like "wget" or do a "Save As" from the browser.
* Any XSLT stylesheet could fetch the resource dynamically to combine it
with others in a transformation.
* Access to the businessEntity could be controlled using standard HTTP
authentication and access control mechanisms.
* Metadata could be associated with the businessEntity using RDF.
* Any client-side application (whether browser-based or not) could fetch
the data without special SOAP libraries.
* Two business entities could represent their merger by using a standard
HTTP redirect from one businessEntity to another.
* Editing and analysis tools like Excel, XMetaL, Word and Emacs could
import XML from the URL directly using HTTP. They could write back to it
using WebDAV.
* UUIDs or other forms of location-independent addresses could still be
assigned as an extra level of abstraction as demonstrated at purl.org.
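
As a minimal sketch of the "no special SOAP libraries" point, the following builds a plain HTTP GET for a businessEntity using only Python's standard library. The URI is the illustrative one used above, not a live service, and the `business_request` helper and the `Accept` value are assumptions made for the example. The request is only constructed here, not sent.

```python
from urllib.request import Request

# Illustrative URI from the text above; it does not name a real endpoint.
BUSINESS_URI = "http://www.uddi.org/businessEntity/ibm.com"


def business_request(uri):
    """Build (but do not send) a GET asking for an XML representation."""
    return Request(uri, headers={"Accept": "application/xml"}, method="GET")


req = business_request(BUSINESS_URI)
```

Passing `req` to `urllib.request.urlopen` would perform the fetch against a server that actually exposed such a resource.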
The current UDDI "API" has a method called get_businessDetail. Under an
address-centric model, that method would become entirely redundant and
could thus be removed from the API. UDDI has several get_ methods that
operate on data objects such as tModels and business services. These
data objects could all be represented by logical XML documents and the
methods could be removed. Note how we have substantially simplified the
user's access to UDDI information.
Business entities are not the only things in UDDI that should be
identified by URI-addressable resources rather than SOAP APIs. In fact
all of the data in a UDDI database could be represented this way.
Summary: Resources (data objects) are like children. They need to have
names if they are to participate in society.
Extensibility
Now let's consider the extensibility characteristics of the REST model
versus the original SOAP RPC model. Let's say that your company has a
private UDDI registry and mine does also. You and I are business
partners. We agree to share our customer databases. The customer
databases have pointers into our UDDI registries for referring to
businessEntities.
If our registries have little or no overlap then it makes sense for you
to maintain yours and for me to maintain mine. Rather than replicating
between them (which has serious security and maintainability
implications) I would like to just add you to the access control lists
for some records and allow you to refer to them from your customer
database and I'll do the opposite from mine.
If the customer databases use UUIDs then they have no way of knowing
whether a particular UUID should be looked up in the local database, the
partner's database or even the public UDDI In The Sky. URIs are not just
globally unique but also typically embed enough information to allow
them to be de-referenced without further context. Using URIs instead of
UUIDs, new repositories can be integrated whenever we want. In fact, if
we use URIs, the customer database could refer just as easily to
businessEntity records sitting on somebody's hard disk as in a formal
UDDI registry. The database maintainer could choose whether to allow
that or not.
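
A small sketch of why a URI can be de-referenced without further context: the registry that holds the record is part of the identifier itself. The partner host name below is made up for illustration, as is the `registry_host` helper.

```python
from urllib.parse import urlsplit


def registry_host(business_uri):
    """Return the host that should be asked for this businessEntity."""
    return urlsplit(business_uri).netloc


references = [
    "http://www.uddi.org/businessEntity/ibm.com",
    # Hypothetical partner registry, for illustration only.
    "http://registry.partner.example/businessEntity/acme",
]
hosts = [registry_host(uri) for uri in references]
```

A bare UUID carries no such hint, so the customer database would need out-of-band knowledge to decide where to look it up.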
Because the businessEntity documents are XML, it is relatively easy to
add elements, attributes or other namespaces. This makes the document
format extensible. It is also easy to extend the protocol by adding
specialized HTTP headers or even new HTTP methods.
Performance
Performance of web services will be an important issue. Any resource
representation retrieved from a GET-based URI can be cached. It can be
cached in a cache server in front of the server, in an intermediary
provided by an ISP, at a corporate firewall or on the client computer.
Caching is built into HTTP. SOAP get_businessDetail messages are not
cached by any existing technology.
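
One concrete piece of HTTP's built-in caching is the `Cache-Control: max-age` response directive, which tells any cache along the way how long a representation may be reused. The sketch below is deliberately simplified: it looks only at `max-age` and ignores every other directive and freshness rule, and the function names are my own.

```python
def max_age(cache_control):
    """Return max-age in seconds from a Cache-Control value, or None."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(age_seconds, cache_control):
    """True if a cached copy of this age may still be served (max-age only)."""
    limit = max_age(cache_control)
    return limit is not None and age_seconds < limit
```

For example, `is_fresh(120, "public, max-age=3600")` is true, so a cache holding a two-minute-old businessEntity could answer without contacting the origin server.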
As an optimization, the URI "http://www.uddi.org/businessEntity/ibm.com"
might be represented as a raw text file on a hard disk of an operating
system optimized towards serving files over HTTP. There is not and will
likely never be any server that can invoke SOAP methods as quickly as a
fast HTTP server can serve files from disk.
Other Methods
UDDI has other methods for working with businessEntities. One is
delete_business. HTTP already has a DELETE method. Therefore this method
would be redundant in the REST model. Instead of doing a UDDI
SOAP-RPC-specific delete you could do an HTTP delete. This would have
the benefit of being compatible with tools that know how to do HTTP
deletes like the Windows 2000 Explorer and Mac OS X Finder. In theory,
businesses could delete portions of their own records (perhaps obsolete
branch plant addresses) by merely hitting the "delete" key.
Obviously authentication and access control are key. Microsoft should not
be able to delete their competitors (or at least should be forced to
delete them in the old fashioned way, by competing with them). HTTP
already has the authentication, authorization and encryption features
that UDDI's SOAP RPC protocol lacks. It already works.
UDDI has a save_business method. This is for uploading new businesses.
The HTTP equivalent is PUT or POST. A pleasant side effect of using HTTP
methods instead of a SOAP method is that you can do a POST from an HTML
form. So the web service can be used either from other programs or (with
a browser) by a human editor.
UDDI has a find_business method. This is no different in principle than
the search features built into every website in the world and search
engine sites in particular. That would be a form of GET. On the URL
line, the service would take a series of search parameters and return an
XML document representing the matching businessEntities (either by
reference, as URLs, or by value, as XML elements).
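
The correspondences described in this section can be summarized in a short sketch. The base URI is the illustrative one used earlier, the `rest_equivalent` helper and the `name` search parameter are hypothetical, and PUT is chosen for save_business only because the caller is assumed to know the target URI (POST would be the alternative).

```python
from urllib.parse import urlencode

# Illustrative URI space from the text; not a real service.
BASE = "http://www.uddi.org/businessEntity"


def rest_equivalent(uddi_method, key=None, **search):
    """Return the (HTTP method, URI) pair standing in for a UDDI call."""
    if uddi_method == "get_businessDetail":
        return ("GET", f"{BASE}/{key}")
    if uddi_method == "delete_business":
        return ("DELETE", f"{BASE}/{key}")
    if uddi_method == "save_business":
        # Assumes the client names the resource; otherwise POST to BASE.
        return ("PUT", f"{BASE}/{key}")
    if uddi_method == "find_business":
        # Search parameters travel on the URL line as a query string.
        return ("GET", f"{BASE}?{urlencode(search)}")
    raise ValueError(f"no mapping sketched for {uddi_method}")
```

So `rest_equivalent("find_business", name="IBM")` yields a GET on the collection URI with `?name=IBM` appended.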
The Role of HTTP
You may notice a recurring theme. Everything that we want to do in this
Web Service is already supported in HTTP. The only things that we need
to innovate on are our URI structure and our XML schemas. Bingo! That
was the whole point of XML: to focus on data interchange instead of
software components!
Everything in UDDI can be represented in terms of HTTP operations on
resources. So HTTP isn't accidentally paired with URIs as one of the
central technologies of the Web. It is designed specifically as a major
part of the location-centric REST architecture.
Here's the radical idea: no matter what your problem, you can and should
think about it as a data resource manipulation problem rather than as an
API design problem. Think of your web server as this big information
repository: like a database. You are doing data manipulation operations
on it.
In UDDI I've chosen a web service that is ripe for an easy conversion to
REST philosophy but we can apply these principles to anything. What
about something like a purchase order submission? That seems more
transactional. Well purchase orders want to be named also! If you POST
or PUT a purchase order to a new URI then internal systems all over your
company can instantly refer to it no matter where they are. Using HTTP,
an arbitrary XSLT stylesheet or Perl script sitting on an employee's
desktop in the Beijing office can massage data from a purchase order
sitting on the accounting mainframe in Los Angeles. Accessing
HTTP-addressable resources is no more difficult than accessing files off
of the local file system, but it requires much less coordination than
standard file system sharing technologies.
What about a request for quote? RFQs want to be named! Once you give
them a name you can pass around the URL to your partners rather than the
text. Then your partners can build references to them using hyperlinks
from their documents and databases. Use access controls to keep out your
competitors. You can think about any business problem in this way.
Even web services with complicated work flows can be organized in a
URI-centric manner. Consider a system that creates airline reservations.
In a traditional HTML system there are a variety of pages representing
the different stages in the logical transaction. First you look up
appropriate flights. You get back a URI representing the set of
appropriate flights. Then you choose a flight. You get back a URI
representing your choice. Then you decide to commit. You get back a web
page that returns a reservation number. Ideally the URL for that page will
persist for a reasonable amount of time so that you can bookmark it.
An XML based web service could go through the exact same steps. Rather
than returning HTML forms at each step, the service would return XML
documents conforming to a standard airline industry vocabulary. Those
same XML documents could be used on a completely different airline
reservation site to drive exactly the same process.
Summary: Any business problem can be thought of as a data resource
manipulation problem and HTTP is a data resource manipulation protocol.
Metcalfe's Law Revisited
Metcalfe's law is that the value of a network is proportional to the
square of the number of people on the network, because each pair of
people can make a connection between them. One telephone is useless. One
billion phones cause a major telecommunications revolution - if they can
all access each other through a single global naming system.
Metcalfe's law also applies to data objects. Elements in UDDI can only
(with a few exceptions) refer to each other. They cannot refer to
objects elsewhere on the Web (for instance in other UDDI repositories).
Similarly, objects on the Web (for instance web pages) cannot refer to
the XML elements in the UDDI repository. A URL-centric solution would
unify these data domains as the phone number system unifies telephones.
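
The arithmetic behind the law is simple enough to sketch: with n addressable things, the number of distinct pairs that could link to each other is n(n-1)/2, which grows with the square of n. The function name is my own.

```python
def possible_links(n):
    """Number of distinct pairs among n mutually addressable objects."""
    return n * (n - 1) // 2
```

One object gives zero possible links, two give one, and a thousand give 499500, which is why a shared naming system matters more as the number of objects grows.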
Security
Making your data universally addressable is not equivalent to making it
universally available! It is easy to hide objects by merely never
publishing their URIs. It is also easy to apply security policies to
objects. In fact, REST simplifies security greatly.
Under the SOAP RPC model, the objects that you work with are implicit
and their names are hidden in method parameters. Therefore you need to
invent a new security strategy for each and every web service. UDDI is
completely unlike .NET My Services, which will likely be completely unlike
Liberty and so forth. Under REST, you can apply the four basic
permissions to each data object: GET permission, PUT permission, DELETE
permission and POST permission. You might also want to allow or disallow
GET/PUT/DELETE and POST on sub-resources. This model is exactly like the
one used for today's file systems! It is proven and it works. I know of
no security model that works in a similarly generic manner for remote
procedure call models.
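
A minimal sketch of such a per-resource permission table, assuming a simple in-memory mapping from resource path to principals to allowed methods. The paths, principal names and the `allowed` helper are all hypothetical; a real server would enforce this in its own access control layer.

```python
# Hypothetical access control list: resource path -> principal -> methods.
ACL = {
    "/businessEntity/ibm.com": {
        "anyone": {"GET"},                              # world-readable
        "ibm-admin": {"GET", "PUT", "DELETE", "POST"},  # full control
    },
}


def allowed(principal, method, path):
    """True if the principal (or 'anyone') may apply method to path."""
    rules = ACL.get(path, {})
    return (method in rules.get(principal, set())
            or method in rules.get("anyone", set()))
```

Under this table anyone may GET the record, but only `ibm-admin` may PUT, POST or DELETE it, and a path with no entry permits nothing.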
Maintenance
In fact, security is just one form of maintainability that is simplified
by REST. Any network administrator will tell you that every level of
networking causes its own headaches. Some days IP works but DNS doesn't
(DNS server down or DNS settings misconfigured). Some days IP/DNS works
but HTTP doesn't (firewall or proxy misconfigured). If you run a web
service protocol on top of HTTP it will add its own layer of
configuration and software headaches on top of the existing ones. It
cannot be more reliable than its foundational HTTP layer. It can only
add one more layer of unreliability.
Once you have your service working, it is possible to "test" REST web
services just by looking at them in a browser. It is possible to make
simple HTML forms to test POSTs. QA departments can easily pretend to be
multiple users by changing their HTTP credentials. Standard web tools
can monitor availability. In essence, testing REST services is often
easy if you already know how to test web sites. On the other hand, every
SOAP RPC service will have its own security model, its own addressing
model, an implicit data model and its own set of methods. Of these four
things, only the security model is even currently a candidate for
standardization. Testing such a system is much more challenging.
The Rest of the Story
This brief introduction can only whet your appetite for the theory and
practice of REST-based web services. In an upcoming article, I will:
* describe in more detail how any web service can be transformed into a
URI-centric one.
* show how the REST philosophy and the XML philosophy are highly
compatible.
* show an example of a successful, public, widely used web service that
uses this model today.
* discuss the role of SOAP in these sorts of web services.
* discuss reliability, coordination, transactions, encryption, firewalls
etc.
If you would like to discuss these issues in the meantime, please
consider contributing to the rest-wiki
(http://internet.conveyor.com/RESTwiki/moin.cgi/FrontPage) and the REST
mailing list (http://groups.yahoo.com/group/rest-discuss/).
=============
> I don't follow you. A form is a user interface. Today forms are often
> used as user interfaces to GET and to POST (i.e. filling in an
> information request as opposed to submitting data). XForms is the
> replacement for HTML forms. Over time browsers will move to this more
> functional user interface and yet lose out on the ability to do
> form-based GET. I see that as a problem.
I agree that it would be a problem if this were the case. But in my
time on the HTML WG, including while work on XForms was being done
there, it was never presented as a replacement.
It wouldn't hurt to double check though. Why don't you raise this on
www-forms?
MB
--
Mark Baker, Chief Science Officer, Planetfred, Inc.
Ottawa, Ontario, CANADA. mbaker@... http://www.markbaker.ca http://www.planetfred.com
Mark Baker wrote:
>
>...
>
> I'm not worried about this at all. GET forms and POST forms are
> completely different beasts. The latter is for submitting resource
> representations, while the former is an assertion that the specified
> resource is a logical container for other resources, and is indexed by
> some set of names encoded into a query term.
I don't follow you. A form is a user interface. Today forms are often
used as user interfaces to GET and to POST (i.e. filling in an
information request as opposed to submitting data). XForms is the
replacement for HTML forms. Over time browsers will move to this more
functional user interface and yet lose out on the ability to do
form-based GET. I see that as a problem.
Paul Prescod
> I don't know much about XForms but this little bit worried me:
>
> "My personal take is that HTTP GET is broken beyond repair for I18N-safe
> form-data submission. This is part of the reason why it's deprecated in
> XForms. Sending around bits of XML is a much better way to go."
>
> And yes indeed:
>
> http://www.w3.org/TR/xforms/slice11.html#rpm-send
>
> "The HTTP "get" protocol is deprecated for use in form submission. Form
> authors should use "post" for greater compatibility."
>
> Insofar as XForms will most often be used to submit data to be stored,
> this isn't a crisis, but when XForms replace HTML forms in general, this
> is going to be a big problem.
>
> Paul Prescod
I'm not worried about this at all. GET forms and POST forms are
completely different beasts. The latter is for submitting resource
representations, while the former is an assertion that the specified
resource is a logical container for other resources, and is indexed by
some set of names encoded into a query term.
MB
--
Mark Baker, Chief Science Officer, Planetfred, Inc.
Ottawa, Ontario, CANADA. mbaker@... http://www.markbaker.ca http://www.planetfred.com
I don't know much about XForms but this little bit worried me:
"My personal take is that HTTP GET is broken beyond repair for I18N-safe
form-data submission. This is part of the reason why it's deprecated in
XForms. Sending around bits of XML is a much better way to go."
And yes indeed:
http://www.w3.org/TR/xforms/slice11.html#rpm-send
"The HTTP "get" protocol is deprecated for use in form submission. Form
authors should use "post" for greater compatibility."
Insofar as XForms will most often be used to submit data to be stored,
this isn't a crisis, but when XForms replace HTML forms in general, this
is going to be a big problem.
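One reading of the I18N concern in the quoted remark, sketched in Python: a form-based GET puts the field values into the query string, and the same non-ASCII value produces different URIs depending on which character encoding the user agent happens to use. The field name and value below are invented for the example.

```python
from urllib.parse import urlencode

BASE = "http://www.example.org/addresses"
fields = {"name": "José"}

# The same form data, serialized under two different character encodings.
as_utf8 = BASE + "?" + urlencode(fields, encoding="utf-8")
as_latin1 = BASE + "?" + urlencode(fields, encoding="latin-1")

print(as_utf8)
print(as_latin1)
```

Nothing in the resulting URI says which encoding was used, so a server has to guess.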
Paul Prescod
On Thu, Jan 10, 2002 at 01:14:07PM -0500, Mark Baker wrote:
> > Why do so many websites use home-spun HTML/cookie authentication
> > (login/password) instead of HTTP authentication? I'm guessing it
> > is all about user interface issues -- being able to put the login
> > box where-ever you want it.
>
> Exactly right. This is a major issue, as it prevents many tasks
> from being automated.
I think the issue is more that publishers don't have much control
over the authentication state on the browser; things like remembering
the username between sessions, logging out, etc. weren't addressable
until IE and later Mozilla introduced password management interfaces.
They're still less capable than cookie handling, unfortunately.
Also, it was drilled into everyone's heads that Basic authentication
isn't secure. Some people thought that magically using cookies would
solve this, whilst the more savvy used encrypted or hashed values in
cookies. There is Digest authentication, but it was plagued with
specification and implementation problems, IIRC.
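To make the security point concrete, here is a small Python sketch. It shows that a Basic Authorization header is only base64-encoded, so anyone who sees it can recover the password, and contrasts that with a cookie value carrying an HMAC signature, which is one possible form of "hashed values in cookies". The secret key, user name, and cookie layout are assumptions for illustration, not what any particular site used.

```python
import base64
import hashlib
import hmac

# Basic authentication: the credentials are encoded, not encrypted.
header_value = "Basic " + base64.b64encode(b"bob:opensesame").decode("ascii")
recovered = base64.b64decode(header_value.split(" ", 1)[1]).decode("utf-8")
print(recovered)  # user name and password are trivially recoverable

# Hypothetical server-side secret; it is never sent to the browser.
SERVER_SECRET = b"change-me"


def make_cookie(user):
    """Return 'user|signature' so the server can detect tampering."""
    sig = hmac.new(SERVER_SECRET, user.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{user}|{sig}"


def check_cookie(cookie):
    """Return the user name if the signature matches, else None."""
    user, _, sig = cookie.partition("|")
    expected = hmac.new(SERVER_SECRET, user.encode("utf-8"), hashlib.sha256).hexdigest()
    return user if hmac.compare_digest(sig, expected) else None
```

A signed cookie like this still needs a protected transport, since anyone who captures it can replay it; it only avoids sending the password itself on every request.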
> - conventions for cookie values. would also be difficult to rollout
> as HTTP libs that support cookies would all need fixing.
What kind of conventions? It strikes me that defining conventions for
cookies is about as friendly as defining conventions for URIs like
well-known locations...
> - recognizing forms with two fields where one is a password input type,
> and somehow kludging that knowledge into the auth system. easier to
> rollout, but error prone and not sure how the kludge would work
I believe this is what Mozilla and IE do now. Of course, the auth is
still sent as a cookie.
Cheers,
--
Mark Nottingham
http://www.mnot.net/
> Why do so many websites use home-spun HTML/cookie authentication
> (login/password) instead of HTTP authentication? I'm guessing it is all
> about user interface issues -- being able to put the login box
> where-ever you want it.
Exactly right. This is a major issue, as it prevents many tasks from
being automated.
> What needs to be done to web infrastructure so that this bit of context
> moves from the HTML/cookie domain down into HTTP where it is supposed to
> live?
I don't know that there's a quick fix. One thing I was thinking of was
an HTML/XHTML extension that would allow more flexibility in the user
interface of the authentication system. But it would take forever to
roll that out.
Other ideas;
- conventions for cookie values. would also be difficult to rollout
as HTTP libs that support cookies would all need fixing.
- recognizing forms with two fields where one is a password input type,
and somehow kludging that knowledge into the auth system. easier to
rollout, but error prone and not sure how the kludge would work
MB
--
Mark Baker, Chief Science Officer, Planetfred, Inc.
Ottawa, Ontario, CANADA. mbaker@... http://www.markbaker.ca http://www.planetfred.com
Why do so many websites use home-spun HTML/cookie authentication
(login/password) instead of HTTP authentication? I'm guessing it is all
about user interface issues -- being able to put the login box
where-ever you want it.
What needs to be done to web infrastructure so that this bit of context
moves from the HTML/cookie domain down into HTTP where it is supposed to
live?
Paul Prescod
On Mon, Jan 07, 2002 at 08:42:42PM -0500, Mark Baker wrote:
> Content negotiation could be used to negotiate a representation
> capable of expressing a form.
I was hoping to allow requests to the record resource
http://www.example.com/addresses/thePerson
to negotiate the representation returned, so that the default would
be to return an HTML page, but if linked from an <img> element, the
image/jpeg in the jpegPhoto attribute would be returned.
Sadly, it's not to be. Conneg support in browsers is horrible.
Mozilla sends a static Accept header, no matter what the context,
which enumerates all of the types it supports. I've logged a bug[1]
for this (please vote for it if so inclined).
IE seems to generate the Accept header based on the filename
extension of the resource (EW!), and by default sends a header that
includes a lot of image/ types, a */* (which is a CUAP[2]), but no
HTML.
Of course, conneg is still useful to expose, but lack of support in
browsers means that another means of doing this -- probably with
query strings (or maybe path parameters?) -- will have to be hacked.
*sigh*
[1] http://bugzilla.mozilla.org/show_bug.cgi?bug_id=118696
[2] http://www.w3.org/TR/cuap
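For reference, a simplified Python sketch of the server side of this negotiation: parse the Accept header, then return the first available representation the client accepts. It honours q-values but ignores specificity rules and media-type parameters, and the list of available types is an assumption based on the HTML page and jpegPhoto described above.

```python
def parse_accept(header):
    """Parse an Accept header into (media_range, q) pairs, best first."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((fields[0].lower(), q))
    # sorted() is stable, so equal q-values keep the header's order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def matches(media_range, media_type):
    """True if a range such as image/* or */* covers the concrete type."""
    if media_range == "*/*":
        return True
    r_main, _, r_sub = media_range.partition("/")
    t_main, _, t_sub = media_type.partition("/")
    return r_main == t_main and r_sub in ("*", t_sub)


def choose(accept_header, available, default):
    """Pick the first available type acceptable to the client, else the default."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type in available:
            if matches(media_range, media_type):
                return media_type
    return default


# Representations the hypothetical /addresses/thePerson resource can produce.
AVAILABLE = ["text/html", "image/jpeg"]
```

With a header that lists image types and */* but no HTML, as described for IE, this logic picks image/jpeg even for a page request, which is the kind of behaviour that makes the hack necessary.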
Hmm... Bugzilla would be an excellent candidate for RESTification...
--
Mark Nottingham
http://www.mnot.net/
On Mon, Jan 07, 2002 at 08:42:42PM -0500, Mark Baker wrote:
> > http://www.example.org/addresses
> > Exposes the DB as a whole
> > GET: representation is the main interface (queries, etc.)
> > POST: add a new entry, returns a 303 to the created resource
>
> Hmm, if you're creating a new resource it should be returned with a
> 201. You won't get an auto redirect, but the client will know the
> URI of the new resource. The body can also include a link if a
> browser is your client.
Hmm. Seems good; will have to play.
> > http://www.example.org/addresses?repr=add
> > GET: representation is an add form
>
> /addresses could serve that purpose, no need for the new URI.
> Content negotiation could be used to negotiate a representation
> capable of expressing a form.
[...]
> > http://www.example.org/addresses/thePerson?repr=edit
> > GET: representation is an edit form
>
> Again, could content negotiation not be used to retrieve an editable
> format?
What's the media type for an HTML form which is the editable
representation of a resource again? ;)
Also, how would I tell a browser to request it?
> BTW, my other message I promised is turning into something that should
> probably be sent to uri@..., so stay tuned.
Cool!
--
Mark Nottingham
http://www.mnot.net/
> > So there's no problem with POSTing to /search. That would be one way
> > for Google to allow people to register new unindexed resources, for
> > example. But as mentioned below, the name is confusing.
>
> Well, there's no problem POSTing an update to /search (or whatever),
> but doing a query through POST doesn't seem very RESTy at all.
Right. I guess I wasn't clear, my apologies.
> So, to review (and make sure I've got a cohesive interface),
>
> http://www.example.org/addresses
> Exposes the DB as a whole
> GET: representation is the main interface (queries, etc.)
> POST: add a new entry, returns a 303 to the created resource
Hmm, if you're creating a new resource it should be returned with a
201. You won't get an auto redirect, but the client will know the
URI of the new resource. The body can also include a link if a
browser is your client.
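A minimal Python sketch of such a response, assuming the new entry's URI is the collection URI plus the percent-encoded entry name (that layout is an illustration, not something fixed by the thread): status 201, a Location header naming the new resource, and an HTML body with a link for browser clients.

```python
from html import escape
from urllib.parse import quote


def created_response(collection_uri, entry_name):
    """Return (status, headers, body) for a newly created entry.

    The URI layout (collection URI plus the percent-encoded name) is an
    assumption made for this sketch.
    """
    new_uri = collection_uri.rstrip("/") + "/" + quote(entry_name, safe="")
    body = (
        "<html><body><p>Created "
        f'<a href="{escape(new_uri, quote=True)}">{escape(entry_name)}</a>'
        "</p></body></html>"
    )
    headers = {
        "Location": new_uri,
        "Content-Type": "text/html; charset=utf-8",
    }
    return 201, headers, body


status, headers, body = created_response(
    "http://www.example.org/addresses", "Bob Smith"
)
print(status, headers["Location"])
```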
> http://www.example.org/addresses?repr=add
> GET: representation is an add form
/addresses could serve that purpose, no need for the new URI.
Content negotiation could be used to negotiate a representation
capable of expressing a form.
> http://www.example.org/addresses?query_to_the_db
> GET: representation is a listing of the query
Looks good.
> http://www.example.org/addresses/thePerson
> exposes a particular record
> GET: representation is a person's record
> POST: update the resource/record, returns a 303 to the resource
> PUT: create the resource/record (multiple content-types
> supported?), returns a status page (or a 303?)
> DELETE: delete the resource/record, returns a status page
Good.
> http://www.example.org/addresses/thePerson?repr=edit
> GET: representation is an edit form
Again, could content negotiation not be used to retrieve an editable
format?
BTW, my other message I promised is turning into something that should
probably be sent to uri@..., so stay tuned.
MB
--
Mark Baker, Chief Science Officer, Planetfred, Inc.
Ottawa, Ontario, CANADA. mbaker@... http://www.markbaker.ca http://www.planetfred.com
On Mon, Jan 07, 2002 at 01:45:52PM -0500, Mark Baker wrote:
> > > Why not POST to http://www.example.org/search? The good thing about
> > > doing that is that an intermediary would know that /search has changed
> > > state due to the POST, whereas if it went to /add, the state change
> > > would be transparent and the intermediary wouldn't know.
> >
> > hmm. Individual searches are separate resources; i.e., they have
> > different query strings, so this wouldn't necessarily work, no? (It
> > wouldn't make sense to POST a query, obviously).
>
> Why not? Don't think of the resource as being a search, but think of
> it as the result of a search.
>
> "http://www.google.com/search?q=foo" is the set of all resources known
> by Google to contain the word "foo". That it requires a search to
> resolve this URI at google.com is immaterial really.
>
> So there's no problem with POSTing to /search. That would be one way
> for Google to allow people to register new unindexed resources, for
> example. But as mentioned below, the name is confusing.
Well, there's no problem POSTing an update to /search (or whatever),
but doing a query through POST doesn't seem very RESTy at all.
So, to review (and make sure I've got a cohesive interface),
http://www.example.org/addresses
Exposes the DB as a whole
GET: representation is the main interface (queries, etc.)
POST: add a new entry, returns a 303 to the created resource
http://www.example.org/addresses?repr=add
GET: representation is an add form
http://www.example.org/addresses?query_to_the_db
GET: representation is a listing of the query
http://www.example.org/addresses/thePerson
exposes a particular record
GET: representation is a person's record
POST: update the resource/record, returns a 303 to the resource
PUT: create the resource/record (multiple content-types
supported?), returns a status page (or a 303?)
DELETE: delete the resource/record, returns a status page
http://www.example.org/addresses/thePerson?repr=edit
GET: representation is an edit form
These duplicate functionality above:
http://www.example.org/addresses/thePerson?method=PUT
POST: tunnels to PUT, returns a status page
http://www.example.org/addresses/thePerson?method=DELETE
POST: tunnels to DELETE, returns a status page
I'm a *little* uncomfortable with the ?repr=add, but I can't say
exactly why.
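One way to write the interface reviewed above down as data, sketched in Python: a table of resource kinds and the methods each accepts, plus a check that answers 405 with an Allow header for anything else. The function names and the path classification are assumptions for the sketch; the ?repr= and ?method= variants are treated as the same resource, since the query string is dropped before classifying.

```python
# Allowed methods per resource kind, transcribed from the listing above.
INTERFACE = {
    "collection": {"GET", "POST"},                 # /addresses
    "record": {"GET", "POST", "PUT", "DELETE"},    # /addresses/thePerson
}


def classify(path):
    """Map a request path to a resource kind, or None if it is unknown."""
    segments = [s for s in path.split("?", 1)[0].split("/") if s]
    if segments == ["addresses"]:
        return "collection"
    if len(segments) == 2 and segments[0] == "addresses":
        return "record"
    return None


def check(method, path):
    """Return (status, headers) saying whether the method is allowed."""
    kind = classify(path)
    if kind is None:
        return 404, {}
    allowed = INTERFACE[kind]
    if method not in allowed:
        return 405, {"Allow": ", ".join(sorted(allowed))}
    return 200, {}
```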
> Exactly right. An alternative is to tunnel PUT and DELETE through
> POST with a hidden form field called "method". At least the URI
> wouldn't change. I guess there's a number of ways to kludge it.
> That would make for a good RESTwiki page.
That would be cool. I like the method= kludge...
> http://internet.conveyor.com/RESTwiki/moin.cgi/HowWikiComparesToRest
Ah, I'd seen that, but never investigated. Thanks.
--
Mark Nottingham
http://www.mnot.net/
> > Why not POST to http://www.example.org/search? The good thing about
> > doing that is that an intermediary would know that /search has changed
> > state due to the POST, whereas if it went to /add, the state change
> > would be transparent and the intermediary wouldn't know.
>
> hmm. Individual searches are separate resources; i.e., they have
> different query strings, so this wouldn't necessarily work, no? (It
> wouldn't make sense to POST a query, obviously).
Why not? Don't think of the resource as being a search, but think of
it as the result of a search.
"http://www.google.com/search?q=foo" is the set of all resources known
by Google to contain the word "foo". That it requires a search to
resolve this URI at google.com is immaterial really.
So there's no problem with POSTing to /search. That would be one way
for Google to allow people to register new unindexed resources, for
example. But as mentioned below, the name is confusing.
> > Also, /search may not be the best name if you're posting to it.
> > What about this?
> >
> > http://www.example.org/database
> >
> > or, if the database is a list of employees;
> >
> > http://www.example.org/employees
>
> Yes. I knew that that was a problem, just didn't have a good
> mechanism; perhaps it would be good to have a DN->resource mapping,
> so that
>
> http://www.example.org/addresses
>
> maps to 'dc=example,dc=org' (if that's the DN being used for
> addresses). Then, individual entries could be mapped to
>
> http://www.example.org/addresses/Bob%20Smith
>
> or somesuch.
>
> This might be a horrible abuse of LDAP, but hey, I'd
> rather abuse it than the Web ;)
Heh. But do you understand what I was saying here, after what I just
wrote above?
> > http://www.example.org/employees?id=234343?pap=true
> >
> > ("pap" => "post-as-put")
>
> Sure. Perhaps start with pap and then backport. I'm doing this with
> python's BaseHTTPServer, so I can do anything fairly easily ;)
Anything except implement a catchall do_* method (I'm writing a
proxy). 8-)
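For what it's worth, with the http.server module in current Python 3 (the successor to BaseHTTPServer), a catch-all can be approximated through __getattr__, because the request handler looks up a method named "do_" plus the request method by attribute access. The sketch below is an assumption-laden illustration, not a finished proxy: the class name and the echo behaviour are made up, and a real proxy would forward the request instead.

```python
from http.server import BaseHTTPRequestHandler


class CatchAllHandler(BaseHTTPRequestHandler):
    """Answer every HTTP method with the same handler."""

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, so an explicitly
        # defined do_GET (for example) would still take precedence.
        if name.startswith("do_"):
            return self.handle_any
        raise AttributeError(name)

    def handle_any(self):
        # Placeholder behaviour: echo the request line back to the client.
        body = f"{self.command} {self.path}\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

It can be served with http.server.HTTPServer(("127.0.0.1", 8000), CatchAllHandler).serve_forever(); the address and port are arbitrary.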
> > You can use them with URI refs;
>
> Yes. What I'm getting at is that the WIKI-like mechanisms such as
>
> http://www.example.com/theThing
> http://www.example.com/theThing?pap=true
>
> are a complete kludge; from a resource standpoint, these might as
> well be
>
> http://www.example.com/theThing
> http://www.example.com/someOtherThing
>
> because of opacity; the only person who knows that ?pap=true does
> something special is the publisher.
Exactly right. An alternative is to tunnel PUT and DELETE through
POST with a hidden form field called "method". At least the URI
wouldn't change. I guess there's a number of ways to kludge it.
That would make for a good RESTwiki page.
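A minimal Python sketch of that kludge, assuming an ordinary application/x-www-form-urlencoded POST body: the server looks for a form field called "method" and, if it names PUT or DELETE, treats the request as that method. The function name and the restriction to those two methods are choices made for the sketch.

```python
from urllib.parse import parse_qs

# Methods that may be tunnelled through POST in this sketch.
TUNNELLED = {"PUT", "DELETE"}


def effective_method(http_method, form_body):
    """Return the method the server should act on.

    Only POST requests are inspected; the hidden "method" field is ignored
    unless it names one of the tunnelled methods.
    """
    if http_method != "POST":
        return http_method
    fields = parse_qs(form_body)
    override = fields.get("method", [""])[0].upper()
    return override if override in TUNNELLED else "POST"
```

On the HTML side this corresponds to a form with method="post" and a hidden input named "method" whose value is PUT or DELETE, so the URI stays the same for every operation.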
> P.S. it would be interesting to see an analysis of the RESTyness of
> WIKIs.
This one has been up on the RESTwiki for a while. It's pretty
good, I think.
http://internet.conveyor.com/RESTwiki/moin.cgi/HowWikiComparesToRest
MB
--
Mark Baker, Chief Science Officer, Planetfred, Inc.
Ottawa, Ontario, CANADA. mbaker@... http://www.markbaker.ca http://www.planetfred.com
On Mon, Jan 07, 2002 at 10:03:17AM -0500, Mark Baker wrote:
> > One can add to the database by doing a get (for the form) and then a
> > POST to
> > http://www.example.org/add
>
> Why not POST to http://www.example.org/search? The good thing about
> doing that is that an intermediary would know that /search has changed
> state due to the POST, whereas if it went to /add, the state change
> would be transparent and the intermediary wouldn't know.
hmm. Individual searches are separate resources; i.e., they have
different query strings, so this wouldn't necessarily work, no? (It
wouldn't make sense to POST a query, obviously).
> Also, /search may not be the best name if you're posting to it.
> What about this?
>
> http://www.example.org/database
>
> or, if the database is a list of employees;
>
> http://www.example.org/employees
Yes. I knew that that was a problem, just didn't have a good
mechanism; perhaps it would be good to have a DN->resource mapping,
so that
http://www.example.org/addresses
maps to 'dc=example,dc=org' (if that's the DN being used for
addresses). Then, individual entries could be mapped to
http://www.example.org/addresses/Bob%20Smith
or somesuch.
This might be a horrible abuse of LDAP, but hey, I'd
rather abuse it than the Web ;)
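A rough Python sketch of such a DN->resource mapping, under several assumptions: the "addresses" collection maps to the base DN given above, entries are named by a cn attribute (the thread does not say which attribute is used), and only a handful of DN special characters are escaped, so leading or trailing spaces and a leading "#" are not handled.

```python
from urllib.parse import unquote, urlsplit

# Assumed mapping from the first path segment to an LDAP base DN.
BASE_DNS = {"addresses": "dc=example,dc=org"}


def escape_dn_value(value):
    """Backslash-escape a few DN special characters (deliberately incomplete)."""
    return "".join("\\" + ch if ch in '\\,+"<>;' else ch for ch in value)


def uri_to_dn(uri):
    """Map a collection or entry URI to a DN, or None if it does not fit."""
    # Split on "/" before decoding so an encoded slash stays inside its segment.
    segments = [unquote(s) for s in urlsplit(uri).path.split("/") if s]
    if not segments or segments[0] not in BASE_DNS:
        return None
    base = BASE_DNS[segments[0]]
    if len(segments) == 1:
        return base
    if len(segments) == 2:
        # Assumption: entries are named by their cn attribute.
        return f"cn={escape_dn_value(segments[1])},{base}"
    return None


print(uri_to_dn("http://www.example.org/addresses/Bob%20Smith"))
```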
> An "edit" should be done with a PUT interface, no? I know that this
> is problematic given that browsers/HTML only support GET/POST, but if
> you can afford it (development time, extra code to maintain, performance
> hit with another layer), I'd solve that problem after exposing the PUT
> interface, perhaps by having a separate Wiki-looking URI that accepts
> POSTs and treats them as PUTs.
>
> e.g.
>
> http://www.example.org/employees?id=234343?pap=true
>
> ("pap" => "post-as-put")
Sure. Perhaps start with pap and then backport. I'm doing this with
python's BaseHTTPServer, so I can do anything fairly easily ;)
> You can use them with URI refs;
Yes. What I'm getting at is that the WIKI-like mechanisms such as
http://www.example.com/theThing
http://www.example.com/theThing?pap=true
are a complete kludge; from a resource standpoint, these might as
well be
http://www.example.com/theThing
http://www.example.com/someOtherThing
because of opacity; the only person who knows that ?pap=true does
something special is the publisher.
P.S. it would be interesting to see an analysis of the RESTyness of
WIKIs.
--
Mark Nottingham
http://www.mnot.net/
> I'm trying to RESTify an existing Web application, and have a
> stylistic question.
>
> Imagine that a Web application fronts for a database, where all of the
> records are available as resources.
>
> For example, you could search the database at
> http://www.example.org/search
> and the results would point at records, like
> http://www.example.org/records/theThing
I have a thought about that that I'll bring up in another thread
shortly.
> One can add to the database by doing a get (for the form) and then a
> POST to
> http://www.example.org/add
Why not POST to http://www.example.org/search? The good thing about
doing that is that an intermediary would know that /search has changed
state due to the POST, whereas if it went to /add, the state change
would be transparent and the intermediary wouldn't know. Also,
/search may not be the best name if you're posting to it. What about
this?
http://www.example.org/database
or, if the database is a list of employees;
http://www.example.org/employees
> Now, how should editing and deletion be done? One approach is to
> expose an edit resource, similar to the add one;
> http://www.example.org/edit
> To edit a specific resource, one could use a query string;
> http://www.example.org/edit?resource=theThing
> and use GET and POST to get the form and POST the edit.
>
> Another approach would be to use the resource's native URI, like
> http://www.example.org/records/theThing?mode=edit
> in a similar manner. This seems to be the approach taken by most
> WIKIs, interestingly.
>
> Is either approach preferable/horrible, or are they pretty much the
> same? The specific application is a gateway to an LDAP database. I
> haven't read Roy's full dissertation (still!), so apologies if I've
> asked a FAQ.
An "edit" should be done with a PUT interface, no? I know that this
is problematic given that browsers/HTML only support GET/POST, but if
you can afford it (development time, extra code to maintain, performance
hit with another layer), I'd solve that problem after exposing the PUT
interface, perhaps by having a separate Wiki-looking URI that accepts
POSTs and treats them as PUTs.
e.g.
http://www.example.org/employees?id=234343?pap=true
("pap" => "post-as-put")
> Deletion is another interesting case; if each LDAP entry really is a
> resource, DELETE would be most appropriate, no? Unfortunately, methods
> other than GET and POST aren't available from HTML...
Yah. I'd do much the same as for PUT; build the DELETE interface, then
kludge a POST interface on top.
> P.S.; Do queries identify a new resource?
Yes. Each unique URI identifies a new resource, modulo generic &
per-scheme equivalence rules.
> They're defined as a mechanism
> to pass data to the server. Common use (as outlined above) seems to
> indicate that people don't consider them separate, but in the URI
> world, they're not lumped into URI-References with fragments...
You can use them with URI refs;
URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
absoluteURI = scheme ":" ( hier_part | opaque_part )
relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
hier_part = ( net_path | abs_path ) [ "?" query ]
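The same split can be seen with Python's standard URI parser, shown here as a small sketch: the query is part of the URI proper, while the fragment is a separate trailing component of the URI reference. The "#notes" fragment is invented for the example.

```python
from urllib.parse import urldefrag, urlsplit

ref = "http://www.example.org/records/theThing?mode=edit#notes"

parts = urlsplit(ref)
print(parts.path)      # /records/theThing
print(parts.query)     # mode=edit
print(parts.fragment)  # notes

# Stripping the fragment leaves the URI itself, query included.
uri, fragment = urldefrag(ref)
print(uri)
```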
More on this in that other email I promised.
MB
--
Mark Baker, Chief Science Officer, Planetfred, Inc.
Ottawa, Ontario, CANADA. mbaker@... http://www.markbaker.ca http://www.planetfred.com
I'm trying to RESTify an existing Web application, and have a
stylistic question.
Imagine that a Web application fronts for a database, where all of the
records are available as resources.
For example, you could search the database at
http://www.example.org/search
and the results would point at records, like
http://www.example.org/records/theThing
One can add to the database by doing a get (for the form) and then a
POST to
http://www.example.org/add
Now, how should editing and deletion be done? One approach is to
expose an edit resource, similar to the add one;
http://www.example.org/edit
To edit a specific resource, one could use a query string;
http://www.example.org/edit?resource=theThing
and use GET and POST to get the form and POST the edit.
Another approach would be to use the resource's native URI, like
http://www.example.org/records/theThing?mode=edit
in a similar manner. This seems to be the approach taken by most
WIKIs, interestingly.
Is either approach preferable/horrible, or are they pretty much the
same? The specific application is a gateway to an LDAP database. I
haven't read Roy's full dissertation (still!), so apologies if I've
asked a FAQ.
Deletion is another interesting case; if each LDAP entry really is a
resource, DELETE would be most appropriate, no? Unfortunately, methods
other than GET and POST aren't available from HTML...
Cheers,
P.S.; Do queries identify a new resource? They're defined as a mechanism
to pass data to the server. Common use (as outlined above) seems to
indicate that people don't consider them separate, but in the URI
world, they're not lumped into URI-References with fragments...