<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <title>A new design for vstream handling</title>
  <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body>
<h1>A new design for vstream handling</h1>
<br>
<address>Benja Fallenstein<br>
</address>
[your name here if you do substantial changes]<br>
<h2>Motivation</h2>
<p>Currently we're re-thinking <tt>gzz.impl</tt>. We do it because the current
implementation is too complex; we do it <i>now</i> because it is too slow.
Most of the slowness seems to be caused by our current handling of vstreams,
so it seems obvious that we re-think that, too.</p>
<p>The current implementation handles vstreams as a weird mix of virtual
and real cells-- that's something we want to do away with. We want a scheme
that does not mean having to optimize the handling of real cells and connections
in any way. Now, the question becomes whether to implement a cell's content
(vstreams) as some structure of real cells, or in a non-cellular structure--
viewable and editable as virtual cells, but not stored in the structure.</p>
<p>There are things to say for the first approach: Even internal things are
usually better stored in the structure, thus giving full control to the user
and making things like undo and merge completely orthogonal. However, Tuomas
has made a design decision for the second approach, because so far vstream
handling has <i>always</i> been a bottleneck since we started it and he therefore
wants to be sure to take the most efficient approach we can think of.</p>
<p>This document attempts to detail the design.</p>
<h2>Desiderata</h2>
<h3>Transclusion identity</h3>
Give span transclusions identity so that rearrangements on the cell level
work the way you expect them to work.<br>
<h3>Text properties: The <i>Emacs</i> way</h3>
Logically, text properties should be assigned on the character level-- i.e.,
formats are applied to individual chars, not to ranges. This means that when
a range of formatted characters is disconnected, we need not worry about
moving markup cells or anything like that: as markup is per-character, the
two resulting pieces will have the correct markup attached. Similarly, when
connecting two ranges with identical markup, we don't need to do anything
special (a good implementation will internally reorganize the markup to be
efficient, of course).<br>
<br>
This dramatically simplifies rearrangements on the cell level and is therefore
The Right Thing. (Of course we need a good way to <i>edit</i> formatting
information on the cell level, but that is simple if we do <i>not</i> give
the virtual cells involved permanent identity.)<br>
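<p>A minimal sketch of the per-character idea, assuming a naive one-object-per-character
representation. The class names (<tt>StyledChar</tt>, <tt>StyledText</tt>) and the
string-valued style are made up for illustration and are not part of gzz. Disconnecting
and connecting never touch the markup; the run-length view only hints at the kind of
internal reorganisation a real implementation might do.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** One character plus the style attached to it (hypothetical name). */
final class StyledChar {
    final char ch;
    final String style; // e.g. "bold"; "" stands for plain text

    StyledChar(char ch, String style) {
        this.ch = ch;
        this.style = style;
    }
}

/** A run of text whose formatting lives on each character (hypothetical name). */
final class StyledText {
    private final List<StyledChar> chars = new ArrayList<>();

    StyledText append(String text, String style) {
        for (char c : text.toCharArray()) {
            chars.add(new StyledChar(c, style));
        }
        return this;
    }

    /** Disconnect: cuts off everything from index on; each char keeps its style. */
    StyledText splitAt(int index) {
        StyledText tail = new StyledText();
        List<StyledChar> rest = chars.subList(index, chars.size());
        tail.chars.addAll(rest);
        rest.clear(); // clearing the sub-list view removes the tail from this text
        return tail;
    }

    /** Connect: plain concatenation, no markup fix-up required. */
    void connect(StyledText other) {
        chars.addAll(other.chars);
    }

    /** Groups adjacent characters with equal style into runs like [bold:cd]. */
    String runs() {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < chars.size()) {
            String style = chars.get(i).style;
            sb.append('[').append(style).append(':');
            while (i < chars.size() && chars.get(i).style.equals(style)) {
                sb.append(chars.get(i).ch);
                i++;
            }
            sb.append(']');
        }
        return sb.toString();
    }
}
```

<p>Splitting "ab" (plain) followed by "cdef" (bold) at index 4 leaves bold "cd" on one
side and bold "ef" on the other; connecting them again yields a single bold run.</p>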
<h3>Cursors in vstreams</h3>
We need a way to accurse positions in a vstream, if only to move around inside
it. The important point here is that when a range containing such a cursor
is removed from the vstream, the cursor must be moved so that it ends up
where the range used to be-- it must not move outside the vstream it was
on.<br>
<br>
(Yes, this can be done efficiently.)<br>
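<p>The rule can be sketched as follows, assuming a flat string and a plain list of
cursors. <tt>Cursor</tt> and <tt>CursorStream</tt> are hypothetical names, and the linear
scan over all cursors is the naive version, not the efficient one claimed above. A cursor
stores the number of characters before it, so 0 means "before the first character".</p>

```java
import java.util.ArrayList;
import java.util.List;

/** A position inside a stream (hypothetical name). */
final class Cursor {
    int pos; // number of characters before the cursor; 0 = start of the stream

    Cursor(int pos) {
        this.pos = pos;
    }
}

/** A text stream that keeps its cursors valid across removals (hypothetical name). */
final class CursorStream {
    private final StringBuilder text;
    private final List<Cursor> cursors = new ArrayList<>();

    CursorStream(String s) {
        text = new StringBuilder(s);
    }

    Cursor cursorAt(int pos) {
        if (pos < 0 || pos > text.length()) {
            throw new IndexOutOfBoundsException("pos " + pos);
        }
        Cursor c = new Cursor(pos);
        cursors.add(c);
        return c;
    }

    /** Removes the characters in [start, end) and returns them. */
    String remove(int start, int end) {
        String removed = text.substring(start, end);
        text.delete(start, end);
        for (Cursor c : cursors) {
            if (c.pos > end) {
                c.pos -= end - start; // behind the range: shift left
            } else if (c.pos > start) {
                c.pos = start;        // on a removed char: stay where the range was
            }
        }
        return removed;
    }

    String contents() {
        return text.toString();
    }
}
```

<p>The cursors never follow the removed characters; they collapse onto the gap, so they
stay on the vstream they started on.</p>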
<h3>Virtual text handling</h3>
Handle text in spaceparts mostly like text in normal cells (iterable and
selectable-- but not necessarily editable-- like normal text).<br>
<h2>The proposal: Enfilades and external vstreams</h2>
<h3>Virtual structure</h3>
First off, now that we get a chance to redesign this anyway, let's do away
with the insane approach of the cell containing the vstream being the headcell.
(The worst problem with this is that after disconnecting a vstream in the
middle, you can't tell that the second piece is just a piece of vstream not
in any cell-- you have to interpret the headcell of that piece of vstream
as the cell containing the rest of the vstream.)<br>
<br>
Instead, behind each real cell, let's have one virtual cell (connected on
'd.vstream-containment'). This virtual cell will be the headcell of the vstream
inside that real cell. The real cell can then also be safely put on a different
vstream (somewhat similar to the containment mechanism of the Perl prototype).<br>
<br>
Having the virtual headcell allows us to use it as the position <i>before</i>
 the stream when accursing positions inside the stream. (Accursing character
x then means accursing the position <i>after</i> x.)<br>
<h3>File format</h3>
Tuomas suggests that every time a vstream is changed, it is completely put
into the diff. This has several disadvantages, though:<br>
<ul>
  <li>Not storing diffs of the content may be too costly.<br>
  </li>
  <li>When loading, we need to read in all the vstreams a cell has ever contained--
this may be time-consuming.</li>
  <li>We either cannot do backward diffing easily, or we have to store <i>
both</i> the old and the new vstream (for long vstreams, expensive).</li>
</ul>
<p>The first is easily overcome by switching to diffs once we find we need
them. To overcome the second and the third, I propose storing cell vstreams
outside the space diff, in special vstream blocks. (This means creating three
instead of two blocks each time we save: One for the space diff, one for
the text scroll, and one with the changed vstreams.) In the file format,
we would only say, "Change the vstream in cell C from the one in block B1,
offset O1, to the one in block B2, offset O2."</p>
<p>Then, when reading the file in, we only need to actually read and parse
the newest version of each vstream.</p>
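<p>A rough sketch of what such space diff entries could look like in memory. The names
<tt>VStreamRef</tt>, <tt>VStreamChange</tt> and <tt>SpaceDiffReader</tt>, the string block
ids, and the use of <tt>null</tt> for "no previous vstream" are all assumptions for
illustration. Because each entry carries both references, it can be inverted for backward
diffing without storing either vstream in the diff itself.</p>

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Points at one stored vstream: a block id plus an offset (hypothetical name). */
final class VStreamRef {
    final String block;
    final int offset;

    VStreamRef(String block, int offset) {
        this.block = block;
        this.offset = offset;
    }

    @Override
    public String toString() {
        return block + "@" + offset;
    }
}

/** One diff entry: the vstream in a cell changes from one ref to another. */
final class VStreamChange {
    final String cell;
    final VStreamRef from; // null if the cell had no vstream before (assumption)
    final VStreamRef to;

    VStreamChange(String cell, VStreamRef from, VStreamRef to) {
        this.cell = cell;
        this.from = from;
        this.to = to;
    }

    /** Backward diff: the same entry with the two references swapped. */
    VStreamChange inverse() {
        return new VStreamChange(cell, to, from);
    }
}

final class SpaceDiffReader {
    /**
     * Replays the entries oldest-first and keeps only the last reference per
     * cell, so a loader would need to read and parse just those vstreams.
     */
    static Map<String, VStreamRef> newestRefs(List<VStreamChange> diff) {
        Map<String, VStreamRef> newest = new LinkedHashMap<>();
        for (VStreamChange change : diff) {
            newest.put(change.cell, change.to);
        }
        return newest;
    }
}
```

<p>Replaying entries never touches the vstream blocks themselves; only the references
that survive in the returned map have to be resolved.</p>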
<h3>In-memory data structure</h3>
For good performance, I suggest that we finally switch to custom enfilades
as the in-memory structure.<br>
<br>
Cursor positions can be stored as WIDative, and formatting information as
DSPative properties.<br>
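<p>To make the two kinds of property concrete, here is a toy, immutable binary tree,
reading "WIDative" as summarised upward from the children and "DSPative" as imposed
downward from the ancestors. This is an illustrative assumption about how the properties
could be laid out, not the proposed gzz enfilade; all names (<tt>EnfNode</tt>,
<tt>EnfLeaf</tt>, <tt>EnfBranch</tt>, <tt>Enfilade</tt>) are hypothetical, and
rebalancing and editing are left out.</p>

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

abstract class EnfNode {
    final int width;                              // WIDative: characters below this node
    final Set<String> cursors = new HashSet<>();  // WIDative: cursors below this node
    final String style;                           // DSPative: applies to everything below;
                                                  // null means "inherit from ancestors"
    EnfNode(int width, String style) {
        this.width = width;
        this.style = style;
    }
}

final class EnfLeaf extends EnfNode {
    final String text;
    final Map<String, Integer> cursorOffsets = new HashMap<>(); // offset inside this leaf

    EnfLeaf(String text, String style) {
        super(text.length(), style);
        this.text = text;
    }

    /** Must be called before the leaf is placed under a branch (toy limitation). */
    EnfLeaf withCursor(String name, int offset) {
        cursorOffsets.put(name, offset);
        cursors.add(name);
        return this;
    }
}

final class EnfBranch extends EnfNode {
    final EnfNode left, right;

    EnfBranch(EnfNode left, EnfNode right, String style) {
        super(left.width + right.width, style); // widths sum upward
        this.left = left;
        this.right = right;
        cursors.addAll(left.cursors);           // cursor sets merge upward
        cursors.addAll(right.cursors);
    }
}

final class Enfilade {
    /** Walks down by width; the style set nearest to the leaf wins. */
    static String styleAt(EnfNode n, int index) {
        if (index < 0 || index >= n.width) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        String style = "plain";
        while (true) {
            if (n.style != null) style = n.style;
            if (n instanceof EnfLeaf) return style;
            EnfBranch b = (EnfBranch) n;
            if (index < b.left.width) {
                n = b.left;
            } else {
                index -= b.left.width;
                n = b.right;
            }
        }
    }

    /** Finds a cursor by following the cursor sets down; returns -1 if absent. */
    static int positionOf(EnfNode n, String cursor) {
        if (!n.cursors.contains(cursor)) return -1;
        int before = 0;
        while (n instanceof EnfBranch) {
            EnfBranch b = (EnfBranch) n;
            if (b.left.cursors.contains(cursor)) {
                n = b.left;
            } else {
                before += b.left.width; // skip the whole left subtree
                n = b.right;
            }
        }
        return before + ((EnfLeaf) n).cursorOffsets.get(cursor);
    }
}
```

<p>Both lookups visit one node per tree level, which is where the hoped-for performance
would come from once the tree is kept balanced.</p>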
<h3>API</h3>
I think we should have a <tt>VStream</tt> interface in the <tt>gzz</tt> package.
This interface would allow the following things:<br>
<ul>
  <li>Iterating through the text.</li>
  <li>Insertion, deletion, excising, rearrangement, applying and removing
styles (if the vstream is editable).</li>
  <li>Get the identity of the vstream (usually the cell containing the vstream).</li>
  <li>Create a position inside the vstream whose position is maintained while
text is edited.</li>
</ul>
Spaceparts would then have to provide text for their cells through this interface.
The vstream spacepart would then have a very clean interface: it would simply
put a cellular representation of the <tt>VStream</tt> objects in the space
and call the appropriate functions on cellular rearrangements.<br>
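<p>One possible shape for this interface, as a sketch only: the method names and
signatures below are guesses for illustration, not an agreed gzz API, and excising and
rearrangement are omitted to keep it short. <tt>ReadOnlyVStream</tt> is a hypothetical
example of what a spacepart with non-editable text might return.</p>

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Sketch of the proposed interface; every signature here is an assumption. */
interface VStream extends Iterable<Character> {
    Object getIdentity();   // usually the cell containing the vstream
    boolean isEditable();

    // Editing operations; expected to throw if the vstream is not editable.
    void insert(int offset, String text);
    void delete(int start, int end);
    void applyStyle(int start, int end, String style);
    void removeStyle(int start, int end, String style);

    /** Creates a position that is maintained while the text is edited. */
    Position createPosition(int offset);

    interface Position {
        int getOffset();
    }
}

/** Fixed text provided by a spacepart: iterable, but not editable. */
final class ReadOnlyVStream implements VStream {
    private final Object identity;
    private final String text;

    ReadOnlyVStream(Object identity, String text) {
        this.identity = identity;
        this.text = text;
    }

    public Object getIdentity() { return identity; }
    public boolean isEditable() { return false; }

    public void insert(int offset, String s) { throw readOnly(); }
    public void delete(int start, int end) { throw readOnly(); }
    public void applyStyle(int start, int end, String style) { throw readOnly(); }
    public void removeStyle(int start, int end, String style) { throw readOnly(); }

    public Position createPosition(final int offset) {
        if (offset < 0 || offset > text.length()) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        // The text never changes, so the position never has to move.
        return new Position() {
            public int getOffset() { return offset; }
        };
    }

    public Iterator<Character> iterator() {
        return new Iterator<Character>() {
            private int i = 0;
            public boolean hasNext() { return i < text.length(); }
            public Character next() {
                if (!hasNext()) throw new NoSuchElementException();
                return text.charAt(i++);
            }
        };
    }

    private static UnsupportedOperationException readOnly() {
        return new UnsupportedOperationException("vstream is not editable");
    }
}
```

<p>Callers would check <tt>isEditable()</tt> before offering editing commands, and could
otherwise treat spacepart text and ordinary cell text the same way.</p>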
<h2>Future direction: On-demand loading</h2>
All this provides a good basis for implementing on-demand loading of long
vstreams (<tt>&gt; 100</tt> chars, or something like that). It is not <i>
enough</i> to implement that, but with it, the extensions should become relatively
simple.<br>
<p></p>
</body>
</html>



