Rob, your expertise is desired on the matter of trees

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Rob, your expertise is desired on the matter of trees

Jesse Wilson
Hey Rob!

I'm not sure if James has already emailed you, but I thought
I might as well ping you myself...

For a long while we've been under the impression that JXTreeTable
was sufficient, and that Glazed Lists was not a good solution for
doing tree-tables. But lately, noticing the discussion of problems in
the Swinglabs JXTreeTable, James and I have been wondering about
what a tree-table built on Glazed Lists would look like....

This has always been an area you have been pushing for, and
I'm confident you've got a lot more experience in the area. I'd
like to get your feedback on how, what works, what doesn't, what
scales, the whole works.

First I'll describe what we're thinking thus far. The premise of our
idea hinges on a different type of tree model that's very simple:
No nodes! The entire tree is flattened, stored in a List, and structure
is inferred by asking rows what their parent row is. For example,
suppose the tree we wanna represent is this:
A
--B
----C
----D
E
--F
--G
--H
Then the List representation of the tree is simply a preorder traversal:
 { A, B, C, D, E, F, G, H }
Then there's a single extra method provided by an interface, something
like this:
   public interface TreeFormat {
        /** @return another tree element, or null if element is root */
       public Object getParent(Object element);
   }
For example, getParent("D") returns "B", and getParent("B") returns "A".
Using this, we can infer the structure of the overall tree, and do so quite
efficiently.

 - To do expanding/collapsing of nodes, we do something similar to filtering to
hide the approriate children.
 - To do indenting of nodes, count the number of getParent() calls to the root
to get a node's depth

Note that display is mostly irrelevant here, I'm mostly interested in the
model. The display could be done through a JXTreeTable using a wacky
adapter, or using a JTable with a fancy TableCellRenderer for the structure
column, or using a plain old JTree if there's only one column.

Deriving the model from a flattened list has some significant advantages:
 - no need to worry about lazy-loading for filtering
 - filtering and sorting are really easy and behave naturally

Of course, there are also disadvantages to this approach, namely the
entire dataset must always be in main memory, regardless of how much
of it is visible.

So what do you think? Genius? Tragic?

Again, know that you're right as usual, and that trees were something
we'd eventually need to conquer! I regret that it's taken me a year or
so to reach this conclusion.

Take care,
Jesse

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

GlazedTrees

Rob Eden-3
Hey Jesse James -

I didn't see an email from James. Sorry if I overlooked it somewhere.  
Spam's been killing me lately, so I may have been a little over  
zealous with the delete button. :-P

Yeah, at a basic level this is the approach we took in our  
application. If you start with data that is initially a list and then  
build the tree from there, it obviously fits nicely into GlazedLists.  
There are a variety of problems though. Off the top of my head:
    1) It's assumed that you can infer the structure of the tree from
       the nodes themselves.
    2) Allowing nodes to appear multiple places in a tree is difficult.
    3) Dealing with structure changes is either hard or inefficient.
       Take your pick.

The implementation I described on the SwingX is basically the same  
concept (build the tree from a list), but is quite different in  
implementation. Basically, I get (usually drastically) new data every  
5 - 60 seconds and rebuild the entire tree each time. Then I "merge"  
the new tree with the existing one to avoid a complete structure  
change. As inefficient as it sounds, it actually work surprisingly  
well. My pipeline looks like this:

        BasicEventList -> FilterList -> BatchList

A listener is registered to the end of the chain to listen for  
updates. It then triggers a background thread to rebuild the tree.  
The rebuilding also handles sorting. (We wanted to keep this out of  
the main path to avoid lag in the EventDispatchThread.) After the  
tree is rebuilt, it's passed off to TreeMerger (which runs in the  
EDT) to be merged in. A merge going from 200k leaves to 300k leaves  
(with a bunch added and bunch replaced) takes about 50ms.

So, to compare and contrast the two approaches:

   - We both assume #1 from above, obviously.
   - The interface you specified isn't going to allow nodes
     to be in multiple places. Also, your back-end design
     (having the nodes in order in the list) might not prove
     too conducive to this. By rebuilding the entire tree, my
     approach allows it, but it still has some rough edges.
     Maintaining selection (for example) is difficult when
     switching between tree building strategies.
   - #3's going to be rough. I avoid the problem by dictating
     that the structure can't change except via the tree merger.
     A change on the EventList chain kicks off a complete tree
     rebuild, so in a way I handle changes, but in a way I cheat
     and don't really.
     With yours we can handle things at a couple of places:
       - If you set a new TreeFormat, obviously the whole thing
         is redone.
       - You could also add a listener mechanism on the TreeFormat
         interface to indicate that the TreeFormat now wants to
         do things differently.
       - The hard part is handling changes on the nodes themselves
         that would cause the TreeFormat to classify them differently.
         The only real way to do this would be to either say they
         can't change or to register listeners to all of them,
         which wouldn't really be feasible. I guess the other
         possibility would be to add a method someplace (on TreeFormat?)
         like:
                elementChanged(Object element)
         Then you'd require the developer call this whenever a node
         changes is a way that might cause it to be reclassified.

Crazy stuff.

I agree that you don't need to worry about display. It's just the  
model we care about.

One performance note: TreeTable likes to call  
"TreeModel.getIndexOfChild(Object parent, Object child)" frequently  
(because JXTreeTable likes to fall back to using JTree's  
VariableHeightLayoutCache). Normally this method is implemented  
pretty inefficiently, but you'll want to make it as efficient as  
possible.

I agree that the model will need to be in memory. Yeah, it's a  
drawback, but not a show-stopper IMHO. Using "live" filters on a tree  
table rocks. So, I think the benefits are worth it.

Anyway... don't know if I answered all your questions or not. Lemme  
know what you think of the above. This'll be a good discussion and  
I'm sure you two will figure out a clever solution.

Rob

On Jul 6, 2006, at 1:37 AM, Jesse Wilson wrote:

> Hey Rob!
>
> I'm not sure if James has already emailed you, but I thought
> I might as well ping you myself...
>
> For a long while we've been under the impression that JXTreeTable
> was sufficient, and that Glazed Lists was not a good solution for
> doing tree-tables. But lately, noticing the discussion of problems in
> the Swinglabs JXTreeTable, James and I have been wondering about
> what a tree-table built on Glazed Lists would look like....
>
> This has always been an area you have been pushing for, and
> I'm confident you've got a lot more experience in the area. I'd
> like to get your feedback on how, what works, what doesn't, what
> scales, the whole works.
>
> First I'll describe what we're thinking thus far. The premise of our
> idea hinges on a different type of tree model that's very simple:
> No nodes! The entire tree is flattened, stored in a List, and  
> structure
> is inferred by asking rows what their parent row is. For example,
> suppose the tree we wanna represent is this:
> A
> --B
> ----C
> ----D
> E
> --F
> --G
> --H
> Then the List representation of the tree is simply a preorder  
> traversal:
> { A, B, C, D, E, F, G, H }
> Then there's a single extra method provided by an interface, something
> like this:
>   public interface TreeFormat {
>        /** @return another tree element, or null if element is root */
>       public Object getParent(Object element);
>   }
> For example, getParent("D") returns "B", and getParent("B") returns  
> "A".
> Using this, we can infer the structure of the overall tree, and do  
> so quite
> efficiently.
>
> - To do expanding/collapsing of nodes, we do something similar to  
> filtering to
> hide the approriate children.
> - To do indenting of nodes, count the number of getParent() calls  
> to the root
> to get a node's depth
>
> Note that display is mostly irrelevant here, I'm mostly interested  
> in the
> model. The display could be done through a JXTreeTable using a wacky
> adapter, or using a JTable with a fancy TableCellRenderer for the  
> structure
> column, or using a plain old JTree if there's only one column.
>
> Deriving the model from a flattened list has some significant  
> advantages:
> - no need to worry about lazy-loading for filtering
> - filtering and sorting are really easy and behave naturally
>
> Of course, there are also disadvantages to this approach, namely the
> entire dataset must always be in main memory, regardless of how much
> of it is visible.
>
> So what do you think? Genius? Tragic?
>
> Again, know that you're right as usual, and that trees were something
> we'd eventually need to conquer! I regret that it's taken me a year or
> so to reach this conclusion.
>
> Take care,
> Jesse
>


--
Rob Eden                             [hidden email]
                                      (217) 352-4000 x102
Principal Software Architect         Cell: (217) 898-8700
NEXVU Technologies                   www.nexvu.com

"Let your heart soar as high as it will.
  Refuse to be average."
                            - A. W. Tozer

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: GlazedTrees

Jesse Wilson
This is fantastic Rob. The merger is a really interesting idea,
although probably one only necessary for massive data sets!

On 7/6/06, Rob Eden <[hidden email]> wrote:
>     1) It's assumed that you can infer the structure of the tree from
>        the nodes themselves.

Yup, but I don't think this is a real limitation. The only other
mechanism would be for an external object to know the relationships,
in which case the TreeFormat could have a reference to that.
The trick here is that it's kinda like a relational database, where
even though the logical relationship is the parent points to the child,
not vice versa, it turns out to be more efficient the other way.

>     2) Allowing nodes to appear multiple places in a tree is difficult.
Definitely. Is this reasonable? We were thinking that identity
would be sufficient, and to require every element is distinct.
Is this the case for your tree?

>     3) Dealing with structure changes is either hard or inefficient.
>        Take your pick.
Yup. I think we'll probably go down the 'hard' road. I was thinking
the user could just do a set(i, get(i)) on the source BasicEventList to force
the element to be updated. It's crude, but it works.

Thanks again for your feedback Rob, it's going to be
very valuable down the road!

Cheers,
Jesse

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: GlazedTrees

Rob Eden
> Yup, but I don't think this is a real limitation. The only other
> mechanism would be for an external object to know the relationships,
> in which case the TreeFormat could have a reference to that.
> The trick here is that it's kinda like a relational database, where
> even though the logical relationship is the parent points to the  
> child,
> not vice versa, it turns out to be more efficient the other way.

Yeah, my best advice would be to start simple and keep things as  
flexible as possible. I think your original plan should be good for  
most cases.


> Definitely. Is this reasonable? We were thinking that identity
> would be sufficient, and to require every element is distinct.
> Is this the case for your tree?

Our tree actually has one view where things are in multiple places.  
For example, we look at flows (conversations) on a network. We have a  
view that rolls up by Server and one by Client. Then we have one that  
views by Host, so it's under both the client and the server.

The basic design I have allows it, but I haven't done much with it  
because it gives my selection mechanism fits.


> Yup. I think we'll probably go down the 'hard' road. I was thinking
> the user could just do a set(i, get(i)) on the source  
> BasicEventList to force
> the element to be updated. It's crude, but it works.

Yeah, that'd be my recommendation too. This is one area where I think  
being too magical is probably bad. Make the developer do the work  
where the solution isn't obvious.

Rob

--

"Are the gods not just?"
"Oh no, child. What would become of us if they were?"
                            - Till We Have Faces
                              C. S. Lewis


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]