Net2WG/Notes/20070615 Meeting Notes
* CTP * Deluge
* Om, USC * Phil, Stanford * Rodrigo, Berkeley * Mike, JHU * Razvan, JHU
Phil: Looking at this data, the cost is 3.15, there are 2247 packets received and 1080 wait events. This basically means that of the 7 thousand packets transmitted, 6 thousand were first transmissions and 1 thousand, approximately 14%, were retransmissions. This means that CTP is picking good links but it's not picking good routes. Even without retransmissions the total cost would have been just under 3: 6 thousand divided by 2247 = 2.8. In the MultihopLQI data it is 2.1. So it looks like it just doesn't pick good routes. Does that make sense?
[Rodrigo joined the phonecall]
Phil: We were looking at the data that Om sent out, and the high-level conclusion is that adjusting the link estimator and the route switch-over threshold did not improve the routing cost in any significant way. It's not a question of the route switch-over threshold. What the data suggests is that the problem might be that the link estimation table does not have links to the best routes. The evidence for this in experiment 1 is that even if you remove all the retransmissions, the average cost is still 2.8.
Rodrigo: How did you get that number?
Phil: I get that number from the total number of transmissions, 7082, minus the number of retransmissions, which is 1080; that gives 6002. That is what would have been sent without retransmissions, if all the links had been perfect. Then divide by the number of delivered packets: 6002 / 2247 gives about 2.8. So what this means is not that CTP is picking bad links or sticking with bad links, but rather that it is picking bad routes.
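The arithmetic Phil describes can be redone as a short sketch. The variable and function names are illustrative only; the three counts are the ones quoted on the call. With the counts exactly as quoted, the second quotient comes out at roughly 2.67, a little under the 2.8 mentioned, and still well above the 2.1 reported for MultihopLQI.

```python
# Counts quoted on the call for experiment 1.
total_tx = 7082      # all transmissions, including retransmissions
retx = 1080          # retransmissions
delivered = 2247     # packets received


def cost_per_delivered(transmissions: int, received: int) -> float:
    """Average number of transmissions spent per delivered packet."""
    return transmissions / received


first_attempts = total_tx - retx                                  # 6002
cost_with_retx = cost_per_delivered(total_tx, delivered)          # measured cost
cost_without_retx = cost_per_delivered(first_attempts, delivered) # "perfect links" cost
retx_share = retx / total_tx                                      # fraction that were retries

print(f"cost with retransmissions:    {cost_with_retx:.2f}")
print(f"cost without retransmissions: {cost_without_retx:.2f}")
print(f"retransmission share:         {retx_share:.1%}")
```

The gap between the two costs is what retransmissions add; the remainder is attributable to route length, which is the part the discussion focuses on.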
Rodrigo: How do we know whether this number is bad or good?
Phil: The reason is that MultihopLQI gets much lower cost routes. And we observed that changing the parameters does not significantly affect the performance, only slightly. So the hypothesis that the route switch-over threshold is the problem seems to be incorrect. Is that right, Om?
Rodrigo: The replacement policy in the neighbor table could also be the problem, not the link estimation table.
Phil: I think it is a question of the link estimation table not having the right links. It's the traditional problem of the link estimator only tracking links, while routing is interested in the overall quality of certain links.
Om: Can we talk a little about the replacement strategy that is currently there?
Rodrigo: Link estimator or routing?
Om: Because routing is trusting the estimator table basically.
Rodrigo: For the link estimator, when the table is completely full, let's say with neighbors that are kind of OK, you are not going to evict them, except if they get really bad and go below the threshold, for example, or they time out.
Om: Timeout is very rare.
Rodrigo: So there are good links and there are very good links, but the good links occupy the space. That's the hypothesis.
Om: One thing that we discussed but never actually implemented in BVR is that if you have 12 spots in your table, you keep 10 for effective neighbors and 2 to try out new neighbors. So your last 2 will always be eligible for replacement.
Rodrigo: I think this is the standard approach in many other fields that try to solve the same problem.
Rodrigo: I think conceptually that is equivalent to keeping two tables, one for constantly discovering new neighbors and one for actual use. Then you move the good entries from one table to the other.
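A minimal sketch of the "10 effective slots plus 2 trial slots" idea, written as the two-table variant Rodrigo describes. Everything here is hypothetical: the class and method names, the higher-is-better quality value, and the promotion rule (a trial neighbor replaces the worst stable one only if it is strictly better) are assumptions made for illustration, not something agreed on the call.

```python
from dataclasses import dataclass, field


@dataclass
class NeighborTable:
    """Neighbor table with a few slots that are always open to replacement."""
    stable_slots: int = 10
    trial_slots: int = 2
    stable: dict = field(default_factory=dict)  # neighbor id -> quality estimate
    trial: dict = field(default_factory=dict)   # insertion-ordered, oldest first

    def hear(self, neighbor: int, quality: float) -> None:
        """Record an estimate for a neighbor (higher quality is better)."""
        if neighbor in self.stable:
            self.stable[neighbor] = quality
            return
        if neighbor not in self.trial and len(self.trial) >= self.trial_slots:
            # Trial slots are always evictable: drop the oldest candidate.
            oldest = next(iter(self.trial))
            del self.trial[oldest]
        self.trial[neighbor] = quality

    def promote(self, neighbor: int) -> bool:
        """Move a trial neighbor into the stable set if there is room or it
        beats the worst stable entry. Returns True if it was promoted."""
        if neighbor not in self.trial:
            return False
        quality = self.trial[neighbor]
        if len(self.stable) < self.stable_slots:
            self.stable[neighbor] = self.trial.pop(neighbor)
            return True
        worst = min(self.stable, key=self.stable.get)
        if quality > self.stable[worst]:
            del self.stable[worst]
            self.stable[neighbor] = self.trial.pop(neighbor)
            return True
        return False
```

The point of the split is that a full table of merely adequate neighbors can no longer block discovery: there is always somewhere to put a newly heard neighbor while it is being evaluated.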
Phil: Would the one-bit interface we talked about be useful? Some background: I sent a proposal for an interface to get physical layer information up to higher layers. The interface was: I give you a packet and you give me a bit. The bit says whether this packet was high quality, or whether it was close to the SNR threshold, had bit errors, etc., so that I can tell whether the packet is in the white region or at the edge of the gray region.
Rodrigo: I have a quick question about that. Given the concept of white, gray, and sometimes black, is this a single bit only because you never actually receive the black packets?
Phil: Yes. Basically the bit says whether the packet is white or not, not whether it's gray or not. If it's in the black, it's not a big deal.
Phil: Here is another thought. When the link estimator gets a packet from the white region, it asks the routing engine whether to consider the route. So, the idea is that it may be a very good link, but further down, it might not be of any routing use.
Rodrigo: The routing engine would compare the cumulative quality of the packet to other routes. So, we can do this with one packet, right?
Phil: Suppose the link estimator gets a packet from the gray region: it starts estimating how good the link is. When it thinks the link is actually good, it sets the link maturity flag in the table and asks the routing layer whether it is reasonable. If it receives a packet from the white region, it can ask immediately, without the link estimation process.
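The flow Phil outlines (white packet: ask routing at once; gray packet: estimate first, ask once the link is mature) could look roughly like this. The constants, the `LinkEntry` fields, the reception-ratio estimate, and the `ask_routing` callback are all placeholders invented for this sketch; the notes do not fix any of them.

```python
from typing import Callable, Optional

MATURITY_SAMPLES = 8   # hypothetical: packets expected before a gray link can mature
GOOD_RATIO = 0.75      # hypothetical: reception ratio that counts as a good link


class LinkEntry:
    """Per-neighbor state kept by the link estimator (illustrative fields)."""

    def __init__(self) -> None:
        self.received = 0
        self.expected = 0
        self.mature = False


def on_packet(entry: LinkEntry, neighbor: int, white: bool, missed: int,
              ask_routing: Callable[[int], bool]) -> Optional[bool]:
    """Process one received packet.

    `white` is the one-bit physical-layer hint; `missed` is the number of
    packets lost since the previous one (e.g. from sequence numbers).
    Returns the routing layer's answer if it was consulted, else None.
    """
    entry.received += 1
    entry.expected += 1 + missed
    if entry.mature:
        return None                      # routing was already asked about this link
    if white:
        entry.mature = True              # white packet: skip the estimation phase
        return ask_routing(neighbor)
    enough = entry.expected >= MATURITY_SAMPLES
    if enough and entry.received / entry.expected >= GOOD_RATIO:
        entry.mature = True              # gray link proved itself over several packets
        return ask_routing(neighbor)
    return None                          # keep estimating
```

In this sketch the routing layer is consulted exactly once per link, at the moment the link becomes mature, whichever path got it there.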
Om: One question about this interface. What if the routing layer decides not to act on it? What is the default answer from the routing layer?
Rodrigo: There could be an I-don’t-care answer.
Phil: That signals the link layer that it can make the decision itself. Basically, the routing layer implements a comparison function, and it returns A, B, or I-don’t-care.
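A small sketch of the three-valued comparison function described here. The names (`Preference`, `compare_neighbors`, `choose_keep`), the use of a lower-is-better route cost, and the fallback to link quality on I-don't-care are assumptions for illustration.

```python
from enum import Enum


class Preference(Enum):
    A = "A"
    B = "B"
    DONT_CARE = "dont_care"


def compare_neighbors(route_cost: dict, a: int, b: int) -> Preference:
    """Routing-layer side: prefer the neighbor with the lower known route cost.

    Returns DONT_CARE when either cost is unknown or the two are equal.
    """
    cost_a, cost_b = route_cost.get(a), route_cost.get(b)
    if cost_a is None or cost_b is None or cost_a == cost_b:
        return Preference.DONT_CARE
    return Preference.A if cost_a < cost_b else Preference.B


def choose_keep(pref: Preference, a: int, b: int, link_quality: dict) -> int:
    """Link-layer side: follow the routing layer, or decide locally on DONT_CARE."""
    if pref is Preference.A:
        return a
    if pref is Preference.B:
        return b
    # Routing has no opinion: fall back to the link estimator's own view.
    return a if link_quality[a] >= link_quality[b] else b
```

A routing layer that does not want to take part can simply return `DONT_CARE` for every query, which is one possible answer to Om's question about the default.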
Om: If we have unlimited space, why can’t we maintain separate estimates just based on this one bit, and then combine somehow…
Phil: The consensus from the Core is that if we think this high-level bit is useful, then we should go ahead. One concern is what the default value should be if the link layer cannot give any physical layer information. I think it is an open question, and we should wait until we understand the mechanism better.
Rodrigo: It depends on how we use it. It could be that we treat white as higher priority than gray; then the default would be gray.
Om: I sent out another e-mail on MultiHopLQI link metric. It has the total count of how often a link metric appears in the network.
Phil: Basically, most of them are in the range of 300 or less.
Om: The value of 512 corresponds to LQI 98. I didn’t get a chance to plot these results.
Phil: In terms of mapping these values to LQI, 125 is basically the highest LQI. The estimate you get from LQI of 125 is 110, and the estimate for LQI of 50 is 8000. The plot is a third-degree polynomial. So, for 8000, it means that there was one packet received at LQI 50. Then, for 7410, there were two packets received at LQI 52. The highest value observed is 108, which is 155. So, what does this data tell us?
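The notes do not give the polynomial itself. As an illustration only, the integer cubic below is an assumed form, chosen because it reproduces the three LQI-to-value pairs stated outright in this discussion (LQI 50 gives 8000, LQI 52 gives 7410, LQI 98 gives 512). The function name and the shift-by-3 scaling are assumptions, and the sketch makes no claim about the other values mentioned.

```python
def lqi_to_link_cost(lqi: int) -> int:
    """Map an LQI reading to a link cost with an integer cubic (assumed form).

    Lower LQI (worse link) gives a much larger cost; the two right-shifts
    by 3 keep the intermediate products small in integer arithmetic.
    """
    x = 80 - (lqi - 50)
    return (((x * x) >> 3) * x) >> 3


for lqi in (50, 52, 98):
    print(lqi, lqi_to_link_cost(lqi))
```

Because the curve is cubic, a handful of low-LQI links dominate any sum of these values, which is consistent with most observed values sitting at 300 or less while a few reach into the thousands.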
Om: I think it basically tells us how to use the bit and how to set the threshold.
Phil: These values tell us about the distribution of the links in the network.
Om: If I also plot the LQI actually used, then we can put the bit to be used in MultihopLQI.
Rodrigo: Do we have to relate this number to PRR?
Phil: Yah, so based on RSSI, LQI, and metrics you have, do you want to set the bit or not?
Om: I think if we look at this curve and the curve on the LQI actually used, it can guide us.
Phil: Why would the distribution of the data you sent out matter in terms of setting the bit?
Om: Maybe not in a very direct way. We don’t want a threshold that excludes 90% of the links.
Phil: The link layer can dynamically change the threshold. But what the bit says about the quality of a link should have nothing to do with the distribution of link quality in the network. I can give you a network with no good links, where every single packet will give you 0. The bit should be independent of the distribution, and it should tell you whether it is a solid link based on a single packet.
Om: How should this bit behave if you have two testbeds, one where it is very close to 0 and one where it is very close to 1?
Rodrigo: I think it goes back to something mentioned before. One way to act on the bit is: if the link is white, no further estimation is required. If the link is gray, you go to a multiple-packet estimate. In the hypothetical case, the behavior of the algorithm would not depend on the distribution, only on the absolute value.
Phil: Back to Om’s question, say you have 100 nodes and all links are great, so every packet is going to tell you 1. Now, say you have 1000 nodes and none of the links are white, so every packet is always going to tell you zero. Basically, I can’t tell you from a single packet whether it is a good link, you gotta do something to figure it out.
Rodrigo: I agree that the algorithm should work in either network. So what should we do then?
Phil: The code is just gonna add the values (on the right of Om’s results) along a path. My suggestion going forward would be to implement this bit. I can talk to Kannan to get an LQI curve, and from that come up with a legitimate threshold for the bit. Then, whenever a link becomes mature, the link estimator asks the routing engine whether or not it should keep it. And receiving a white packet makes a link immediately mature. Should we do the comparator or just the should-I-keep-it?
Rodrigo: I would say start with the simpler one, just keep it or not.
Phil: If the answer is keep, which link should the link estimator evict? I argue that it should evict one at random. If it evicts the worst link, that could be the best route. Then, the next time that neighbor comes around, the routing layer is going to say keep. Eventually, the links left in the table are going to be the ones the routing layer thinks are the best. If you do it randomly, the table will converge to the optimal table.
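A toy sketch of random eviction driven by a keep/don't-keep answer. The keep rule used here (keep a candidate only if its route cost is lower than the best cost currently in the table) is an assumption, as are the names and the made-up costs; the call leaves the routing layer's rule open. Under this particular rule, the sketch only shows that the best next hop ends up in the table after every neighbor has been offered once, and that nothing is evicted afterwards. It does not demonstrate convergence of the whole table.

```python
import random


def run_pass(table: set, costs: dict, capacity: int, rng: random.Random) -> int:
    """Offer every neighbor to the table once; return the number of evictions.

    `costs` maps neighbor id -> route cost through that neighbor (lower is better).
    """
    evictions = 0
    for neighbor, cost in costs.items():
        if neighbor in table:
            continue
        if len(table) < capacity:
            table.add(neighbor)                 # free slot, nothing to evict
            continue
        best_in_table = min(costs[n] for n in table)
        if cost < best_in_table:                # routing layer answers "keep"
            victim = rng.choice(sorted(table))  # evict one current entry at random
            table.remove(victim)
            table.add(neighbor)
            evictions += 1
    return evictions


# Made-up route costs for seven neighbors and a three-entry table.
costs = {1: 5.0, 2: 4.0, 3: 6.0, 4: 2.5, 5: 3.0, 6: 2.1, 7: 4.5}
table = set()
rng = random.Random(0)
first = run_pass(table, costs, 3, rng)
second = run_pass(table, costs, 3, rng)
print(sorted(table), first, second)
```

If a random eviction happens to remove a good neighbor, the next pass simply offers it again, which is the self-correcting behavior the argument relies on.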
Om: Why just evict the first one at random? Why not ask the routing layer multiple times whether a link should be evicted?
Phil: If it turns out this is an insufficient mechanism, we may need to look at a more complicated routing layer mechanism that can tell the link layer which one to evict.
Phil: I will look at the bit code. Om has the link estimator code. Rodrigo has the routing code.
Razvan: Haven't done much. Committed bug fix someone found. Other than that, nothing new.
Om: I can compile! But I do have a question. Would it be nice if I had a way to measure the time it takes to reprogram the network?
Phil: Why not send a packet to the UART?
Om: We have significant backchannel latencies here, up to a few seconds.
Mike: Well, it takes a minute or so to reprogram.
Phil: So your error from the latency won't be great.