Chris Dollin wrote:
>Roger.Evans@itri.brighton.ac.uk wrote:
>
>>Chris Dollin wrote:
>>
>>>x wrote:
>>>
>>>>I would use marked up text, either xml or xhtml. Providing a filter to
>>>>remove the tags is straight-forward. Styling via a set of standard
>>>>cascading style sheets (CSS) would then give you a specific look and
>>>>feel.
>>>>
>>>I think this is a dreadful idea; those languages are *not* convenient
>>>to write. And XML isn't "marked up text", even though that's its
>>>historical origin - the pointy-bracket structure is fundamental to
>>>an XML document, rather than being mere decoration.
>>>
>>Sorry Chris, I can't let this go. Clearly the view of someone not
>>working in the text processing industry,
>>
>That's true,
>
>>to whom using XML purely for data markup is just an amusing curiosity.
>>
>But I don't understand this comment.
>
It was a mildly facetious comment, not worth spending much time on. As
you originally noted, markup started out as something for processing
documents. The idea that a 'document' could be used to represent just a
data structure was just some quirky idea. These days this quirky idea
has taken on a life of its own, but not for any particular reason. I.e.
there's no reason why the de facto standard data representation language
should have evolved from the de facto standard document markup language
- it's probably just an accident encouraged by the (highly laudable) Unix
tradition of trying to represent data in plain ASCII wherever possible.
The game gets a little more interesting on the boundaries between
document and data, for example representing dictionaries both as
human-accessible reference books and computer-accessible lexicons...
>
>I've written HTML by hand (well, by Ved), and using Quanta. I presume there
>are significantly more helpful tools around. As soon as the markup gets
>heavy (typically at the first hyperlink ...) I can't see the text any
>more.
>
>I don't deny that XML has become the de-facto implementation of S-expressions,
>nor that one could mark up ordinary text using it and do useful things.
>I just don't think that, given the choice, one should be forced to
>*create and maintain* text using that form. So long as one can generate
>XML [or RDF ...] at will, we can arrange portability and cheap legacy.
>
>[This assumes that it's expensive, in some resource or other, to create
>a suitable editor from scratch, so that one is using WYSIWYN [... You Need.]
>EG for a typical Wiki, your editor is whatever <textarea> gives you ...]
>
>>The text/data markup issue is blurred by using 'standoff' markup, which
>>might be worth at least thinking about in this context. The idea of
>>standoff markup is that you represent text-style markup as a separate
>>parallel data-style XML object, which contains pointers into the text
>>document. It's nice because it's not invasive, and because you can have
>>multiple parallel markups of a text that are not tree-structured with
>>respect to one another (type 'standoff markup' into Google for more info
>>on it). So Steve could have one file which is plain text, and a parallel
>>file that marks up hyperlinks, text attributes etc in it, for systems
>>that want to exploit that. The downside is that your 'document' is
>>potentially distributed across multiple files that can get out of sync...
>>
>I think that kills the idea stone dead.
>
No, it's no different from maintaining both a plain text version and an
HTML version of a document. Actually it is a bit different, but only in a
positive way. If you've got a plain text master and a derived HTML
version, there's a risk someone will change the HTML version instead of
the master. With standoff markup that's not really possible, because the
standoff markup references the plain text content directly, rather than
containing a copy of it.
Roger