Net2WG/Notes/20070525 Meeting Notes
Agenda:
* Deluge
* CTP
* Other projects (Zigbee, DYMO, 6lowpan)

Participants:
* Om, USC
* Rodrigo, Berkeley
* Phil, Stanford
* Mike, JHU
* Razvan, JHU
Razvan: This week I committed the whole code, and today I committed an update from Mike that removes the 100-byte limit on data packets. We didn't have time to make more progress on MicaZ. Right now it compiles, but we haven't had time to run and debug it.
Om: Where are the applications?
Om: I don't see it.
Phil: update -d
Om: I'm just browsing.
Phil: apps/tests/deluge. Sometimes the interface is up to date, sometimes it is not. I wouldn't worry about it.
Phil: I think on the MicaZ side what matters there is the 2.0.2 release. If it's done in 2-3 weeks that's fine.
Mike: This is in progress right now so we should have some good news for next week.
Phil: Ok. It's just it doesn't have to be done tomorrow.
Mike: Do you know when 2.0.2 will be released?
Phil: Early July.
Razvan: One thing I didn't update is the Makefile in tos/tools/tinyos/misc, so the Python tools will not be installed yet.
Phil: I don't understand. In tools/releases, you mean?
Razvan: In tools/tinyos/misc, where I put tos-deluge and tos-build-deluge-image, I didn't modify the Makefile to actually do something with these files. I will do it.
Phil: Ok. Sounds good.
Om: I'll test on a few motes.
Phil: There is a student here who has researched a bunch of improvements to Deluge that improve performance by, sometimes, 40%. I should put him in contact with you guys.
Om: Sounds more like improvements in dissemination.
Phil: No, it mostly has to do with the Deluge part and not the Trickle-based dissemination. It has to do with whom you select to request your pages from, how you do timeouts... a couple of things like that.
Om: So it's the dissemination of large objects which is what Deluge is.
Mike: I'm just curious: is his name Jung?
Phil: Jung Il.
Mike: I talked with him at TTX. He mentioned the problems with Deluge's timers and fair waiting protocol.
Phil: That's a little different.
Mike: Ok. I'll send him an email.
Phil: I'll talk with him today.
Razvan: Quick question. I sent an email about the CRC in the serial communication. Why is that little-endian?
Phil: Because historically it was little-endian. All the Java side is little-endian. So... what's the advantage of making it big-endian?
Razvan: Well... it's the last field that is not big endian.
Phil: Why would you want to change it?
Razvan: In our Python tools we have two sets of encodings because of this, and the only reason is that the CRC is little-endian.
Phil: So it's just historical. If you want to change it, the Java tools and all the tools anybody has ever written for serial packets will have to change. So there is a cost in changing it and there is no real benefit.
Om: The benefit might be in everything being big-endian.
Phil: That's a negligible benefit.
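Note: the "two sets of encodings" Razvan mentions can be illustrated with a minimal Python sketch. The header layout (a 16-bit destination and an 8-bit group) and the function names are hypothetical and are not taken from the TinyOS serial stack; the CRC routine is a generic CRC-16 with polynomial 0x1021 used only as a stand-in checksum. The point is that a single `struct` format string has one byte order, so a big-endian header with a little-endian CRC trailer needs two format strings.

```python
import struct


def crc16_ccitt(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-16 with polynomial 0x1021 (stand-in checksum for this sketch)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def encode_frame(dest: int, group: int, payload: bytes) -> bytes:
    # Hypothetical header fields, packed big-endian (">").
    body = struct.pack(">HB", dest, group) + payload
    # The CRC trailer is little-endian ("<"), so it needs its own format string.
    return body + struct.pack("<H", crc16_ccitt(body))


def decode_frame(frame: bytes):
    body, trailer = frame[:-2], frame[-2:]
    (crc,) = struct.unpack("<H", trailer)
    if crc != crc16_ccitt(body):
        raise ValueError("CRC mismatch")
    dest, group = struct.unpack(">HB", body[:3])
    return dest, group, body[3:]
```

If the CRC were big-endian as well, the sketch could use ">" throughout, which is the small simplification Razvan and Om are pointing at; Phil's objection is that every existing tool would have to change with it.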
Om: The next item is CTP. Phil did some experiments.
Phil: Yes. To fill in the new folks (Rodrigo just got in): when we submitted the paper to SenSys we noticed that the protocol did not pick the best routes. MultihopLQI, for example, picked better routes.
Om: MultihopLQI also has thresholds, right?
Phil: The MultihopLQI threshold is amazingly tiny. Because the LQI curve is so steep, what essentially happens is that MultihopLQI settles on a set of links that are perfect. In the case of CTP the routes were pretty stable, but they had higher costs than they should, so I went and looked into the routing, where we have these parameters. I ran some simulations and my conclusion was that one parameter, the threshold at which you change to a different route, was too high. The way it works is that you only change to a different route if its route ETX is 1.5 better than the current one. So if your route is 3, it will only switch to a route of 1.5. If you look at the results of CTP vs MultihopLQI you can see that CTP has a cost of 3 and MultihopLQI has a cost of 2. So for short routes this kills you. For longer routes 9.5 vs 8 is not a big deal, but 2 vs 3.5 is. So I played around in TOSSIM, using noise simulation with bursty losses, to explore what might be better parameters, and the ones that I sent out in the email seemed to be pretty good. At least a great improvement.
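Note: a minimal sketch of the arithmetic behind the old threshold, assuming the rule is a plain difference test on route ETX. The function names are hypothetical and are not taken from the CTP code.

```python
def should_switch(current_etx: float, candidate_etx: float, threshold: float) -> bool:
    """Switch routes only if the candidate is better by at least `threshold` (assumed rule)."""
    return current_etx - candidate_etx >= threshold


def relative_overhead(actual_cost: float, best_cost: float) -> float:
    """Extra cost paid for staying on a worse route, as a fraction of the best cost."""
    return (actual_cost - best_cost) / best_cost


# With a threshold of 1.5, a route of cost 3 is only abandoned for a route of 1.5 or less.
print(should_switch(3.0, 1.5, 1.5))   # True
print(should_switch(3.0, 2.0, 1.5))   # False: a full hop better, yet no switch

# The same absolute gap of 1.5 matters far more on short routes.
print(relative_overhead(3.5, 2.0))    # 0.75
print(relative_overhead(9.5, 8.0))    # 0.1875
```

Under these assumptions a node can sit on a route that costs 75% more than the best one on a short path, while the same gap is under 19% on a long path, which is the asymmetry Phil describes.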
Om: Switch threshold = 0.2, alpha = 0.5 and a data window of 5.
Phil: Yes. If you look at what these parameters do, they say that a single packet loss will not cause you to switch links. Let's say we have 2 links and both seem to be perfect. I have a link that is 1 and another one that is 1, and both have a route value of 3. If I lose a single packet on my link, these values will not cause me to change. [...] In contrast, if you lose a single packet in each of 3 consecutive data windows, you will switch. Or if you lose 2 packets in a single data window, you will switch. So this makes you resistant to single packet losses, or a few packet losses sprinkled around, but if there are bursts of packet losses you'll switch over. This follows the observation of the bimodal behaviour.
Om: So given at least two packet losses, the likelihood of entering the bad mode is very high, right?
Phil: Yes. Two consecutive losses.
Om: In a single window.
Phil: Yes. Well... it is possible to have the first loss at the end of one data window and the second one at the beginning of the next one; then you'll also lose the next 4 packets before you switch. It's an edge condition, but I didn't observe it to be a big deal. So my suggestion to all is to run CTP with these new settings and see how it performs. And Rodrigo, it might be good to do this on Mirage.
Rodrigo: I can do that.
Phil: So far we only have simulation. We don't know what will happen in the real world; it might just... blow up.
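Note: the behaviour Phil describes for the proposed parameters (switch threshold 0.2, alpha 0.5, data window 5) can be checked with a small sketch. It rests on assumptions that are not stated in the meeting: the ETX sample for a window is taken as transmissions divided by acknowledged packets, the estimate is an exponentially weighted moving average, and a node switches once its link ETX exceeds the alternative by at least the threshold (the two routes being otherwise equal, as in Phil's example). All names are hypothetical. With alpha = 0.5 it does not matter whether alpha weights the history or the new sample.

```python
def window_etx(sent: int, acked: int) -> float:
    """ETX sample for one data window (assumed: transmissions / acknowledged)."""
    if acked == 0:
        raise ValueError("this sketch does not model a window with no acks")
    return sent / acked


def update_estimate(old: float, sample: float, alpha: float) -> float:
    """EWMA update; alpha is the weight given to the old estimate."""
    return alpha * old + (1 - alpha) * sample


def windows_until_switch(losses_per_window, window=5, alpha=0.5,
                         threshold=0.2, start_etx=1.0, alt_etx=1.0):
    """Return the 1-based window after which the node would switch, or None."""
    etx = start_etx
    for i, lost in enumerate(losses_per_window, start=1):
        etx = update_estimate(etx, window_etx(window, window - lost), alpha)
        if etx - alt_etx >= threshold:
            return i
    return None


print(windows_until_switch([1]))        # None: one loss moves the estimate to 1.125
print(windows_until_switch([2]))        # 1: two losses in one window give about 1.33
print(windows_until_switch([1, 1, 1]))  # 3: 1.125, 1.1875, then 1.21875
```

Under these assumptions the numbers match the description: a single loss is absorbed, two losses in one window or one loss in each of three consecutive windows trigger a switch.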
Razvan: I actually tried to run CTP (not the latest version, though) on our testbed, which is very clustered at one end, and I observed a lot of packets all the time, even when I don't send anything.
Om: Are there debug messages on the serial port?
Razvan: What I did was just include the collection and ask them to join. I had one root and another mote that was listening to the traffic. And the traffic went to something like 40 packets/s.
Phil: This must be the control traffic.
Razvan: Yes, exactly. I didn't attempt to send anything.
Om: Maybe the fact that the basestation has constant beacons works badly when you have really dense networks.
Phil: No. The problem is that, in a dense network, the control traffic is non-suppressive, so when a single node advertises, for example, a poll, all of its neighbors will start sending control traffic. A single node that polls can cause an explosion.
Om: It could be that many polls? That frequently?
Phil: It doesn't matter. It could be.
Om: It could be but isn't that bad?
Rodrigo: It would be nice to take a look at a trace of those packets, if that's possible. Do you log them?
Razvan: I can repeat the test.
Phil: Yes, it would be good. I think in general there is a problem with CTP and control traffic being non-suppressive.
Om: ... or it can be even self-stimulating.
Phil: That is very possible.
Om: Razvan, if you do that all the motes can also send debug messages on the serial port. Is it possible to log that?
Razvan: I haven't run with debug messages. I can also try to do that.
Rodrigo: Can you log all the serial messages from all the motes?
Rodrigo: I think you'll have to enable that.
Razvan: I'll do that.
Om: I think it is in the Makefile.
Razvan: I'll take a look. I was using a small program that I wrote, not any of the examples.
Om: Related to this, I sent a small script that I used to play around with the parameters. My conclusion was that giving a lot of weight to the history is like making the window large. Not very large, though. I was trying windows of up to 10. There is something going on. Even if you make the window very small, the only difference it seems to make is the value at which the estimate stabilizes, assuming that the loss rate is constant...
Phil: The problem is that the loss rate is not constant. Losses are bursty. That will really change your results. The reason we are doing the window, as far as I remember, is to get intermediate periodic values so that when you do the hybrid estimate using the beacons you have meaningful values. Unless you want to apply a different alpha for data packets and beacon packets.
Phil: And the real trick with large windows is that you'll have a larger latency, right? If you have a window of 10 then you'll possibly have 10 consecutive lost packets before you detect that the link is going bad.
Razvan: Just a quick question: will this take longer if you don't send any packets? If you let it run without any real traffic?
Om: That's the problem with the data estimates. Once you set the window you have to let it reach the end of the window.
[The discussion revolved around the frequency of reporting the losses to the link estimator and why not delaying is unnecessary.]
Phil: One thing that I'm concerned about is that, let's say you shrink the window to one, this could make it really brutal in response to two packet losses. Don't forget that once you've switched links, the only way a link is going to be updated is through the beacon packets. So what happens is, if you have two links and one is slightly better than the other, I see a situation where I do a 1, 2, 3, my link ETX goes too high, I switch links, and then I'm never going to go back to the first one for a long time.
Om: This can happen if one is significantly better.
Phil: Yes. Let's say with 1.2 instead of 1.5 then it would take a lot of time before you go back to it.
Phil: We can explore this more, but we could start with the values that I sent out to see if they improve things. Make sense?
Om: Yes. When you say improve things, are we looking for fewer parent changes? How do we know if it improves?
Phil: Cost. Let's look at cost first. Right now I'm less concerned about parent changes and more about cost and control traffic.
Rodrigo: So this is the question of which node you select to send vs whether you actually advertise that.
Phil: Yes. Let's focus first on the cost. This should bring the cost to reasonable levels for short paths, and then we can look at the control traffic being over-zealous.
Om: There is also the control traffic from the root. I think we also need to address that.
Phil: Yes, of course.
[The discussion revolved around the timers for the root in CTP and the triggering of route updates.]
Om: Updates on some other projects. Zigbee: Andre is working on it. He is away this week; he'll be back next week and he'll look at TinyOS 2.
Rodrigo: So is he going to be able to call in this time? It's probably too late for him?
Om: He says it's ok. He actually called us once. I emailed (?!) Alexei Martin (?!) about Zigbee and he said Crossbow will probably not be interested in 802.15.4 or just the clustered mode but if we are going to do the full network API then he can help out. That was his response.
Phil: So, since we've gone through this push at least 5 times and they say "there is nothing there" and we say "no, there is" and they say they want to do the full network thing...
Rodrigo: Can you explain one thing to me? So if we are interested in a specific thing they are not but if you are interested in the full thing they are?
Om: Yes. Because they are really interested in the network API implementation...
Rodrigo: But not some specific components such as mesh?
Om: Yes. Clustered tree, what is the name?
Phil: Skip (?!) tree.
Om: They are interested in the whole API.
Phil: They want the entire layer 3. That is what it comes down to. They don't want the skip (!?) tree because that will be deprecated in the next version. They want a complete layer 3. What they are basically saying is that if we are committed to a complete layer 3 then they'll help.
Om: Ok. And that will also include the AODV probably?
Phil: Yes. But there are AODV implementations out there.
Om: Anyway... that's the status. I think we should probably wait until next week to see. Basically, whatever Andre has done so far is not interesting from their perspective. This is my take on it. Is this also your take on this, Phil?
Phil: Yes, I think so.
Om: But an 802.15.4 MAC is necessary. It is part of the effort.
Phil: I think they have an 802.15.4 MAC. Basically they are interested in somebody working on the code they don't have.
Om: Right. So that's the status on Zigbee. DYMO: I talked to Romain Thouvenin and he said the protocol pretty much works, but he has only tested it in simulation.
Phil (?!): Oh...
Om: But it works! Works well.
Phil: Which simulations?
Om: He didn't say. Probably TOSSIM, because he asked a lot of questions about TOSSIM.
Om: It's in TinyOS 2. He will be able to work on the code for a year. He's a master's student in his final year. He hasn't decided what to do next, apply to a PhD program... So to me this sounds promising. I don't know what you guys think?
Phil: Yes. It's excellent. It works in simulation. Clearly it would be an additional problem to solve to make it work in practice, but to have it fully working in TOSSIM is still pretty good. The more the merrier...
Om: Exactly. From a practical perspective, even if it only works in TOSSIM that is still a big step forward. Let's say that we don't ever implement it on motes you can still do a lot of stuff in simulator.
Phil: I wouldn't go that far.
Om: I know but people will appreciate it.
Phil: Well... but I think we should have a prerequisite that it works on real motes. It is amazingly important not to fall into the pit that wireless MANET stuff fell into in the 90s.
Om: True, but that is a requirement we never discussed. It sounds reasonable, though. So now we need to find someone who might pair with him. He said he doesn't have real nodes.
Phil: There are public facilities. Motelab is public, Mirage is public.
Om: Maybe something closer to him. He is from Switzerland.
Phil: Is it ETH or EPFL?
Om: I don't know. If there are public facilities then it doesn't really matter.
Rodrigo: If he has a decent connection he can work on Mirage perfectly.
Om: This looks promising. 6lowpan is the next thing. I haven't got any response. I sent an email to Johnathan. Maybe we should just wait.
Phil: So there is a guy (I'll forward it to the mailing list) who says he has a complete implementation working in TinyOS 2.
Phil: Yes. His name is Matus Harvan from Jacobs University from Denmark. I'll forward the whole thread. I told him to get in touch with you.
Om: Good. There is one more thing... Kaisen wrote to me and he said he is willing to support the code for one year. This is the dissemination for lots of objects. Hundreds. Is that right Phil?
Phil: Yes. Kaisen has been working on the problem that Trickle doesn't work for disseminating many, many objects. You cannot put them all in one packet. What Dissemination does is define a Trickle timer for every single data item, so the amount of traffic scales linearly with the number of data items. What he's doing has a constant overhead regardless of the number of data items, and has a mechanism to discover which items are different. In the base case you are sending a hash of the metadata. When the hash is different you start to figure out which item is different. It becomes complicated because of packet losses and other problems. Basically you have a hash tree and you walk it, figuring out which things are different. [...]
Om: I think we should push all these projects. Zigbee sounds difficult. We still don't know what will happen.
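Note: the hash-tree idea Phil outlines can be illustrated with a generic sketch. This is not Kaisen's protocol: it ignores packet loss, assumes both nodes hold the same number of items in the same order, and uses SHA-1 over a version number as an arbitrary stand-in for the metadata hash. All names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def build_tree(versions):
    """Return the levels of a binary hash tree: levels[0] are leaves, levels[-1] is the root."""
    size = 1
    while size < len(versions):
        size *= 2
    # Pad to a power of two with a fixed filler hash so every node has two children.
    leaves = [_h(repr(v).encode()) for v in versions] + [_h(b"")] * (size - len(versions))
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def differing_items(mine, theirs):
    """Walk both trees from the root, descending only where the hashes differ."""
    if len(mine) != len(theirs):
        raise ValueError("trees must cover the same number of items")
    pending = [(len(mine) - 1, 0)]  # (level, index) pairs still to compare
    different, compared = [], 0
    while pending:
        level, idx = pending.pop()
        compared += 1
        if mine[level][idx] == theirs[level][idx]:
            continue  # the whole subtree is identical
        if level == 0:
            different.append(idx)
        else:
            pending.append((level - 1, 2 * idx))
            pending.append((level - 1, 2 * idx + 1))
    return sorted(different), compared


local = [1] * 8
remote = [1, 1, 1, 1, 1, 2, 1, 1]       # item 5 has a newer version
print(differing_items(build_tree(local), build_tree(local)))   # ([], 1)
print(differing_items(build_tree(local), build_tree(remote)))  # ([5], 7)
```

When nothing differs, a single root comparison is enough regardless of the number of items, which corresponds to the constant overhead in the base case; locating one changed item among 8 takes 7 hash comparisons here, growing with the depth of the tree rather than with the number of items.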