Some of us working on collection met on July 31st to try to flush out as much of the collection protocol(s) as possible. These are the notes from that meeting.
Present: Om, Phil, Kyle, Rodrigo, Sukun, Alec Woo
The goal of the meeting was to try to come as close to a specification for protocols as possible such that two different implementations could interoperate (given the same radio and the same MAC layer, at least).
Photos of the whiteboads at http://www.cs.berkeley.edu/~binetude/download/collection_workshop_whiteboard.zip
Notes by Om:
Collection assumes that link layer has following properties:
1. Source and destination addresses in the packet header. 2. Unicast packets are acknowledged synchronously. 3. Protocol dispatch mechanism. 4. Broadcast is supported.
We do not assume the following functionalities:
* Snooping * Timestamping
Justification for synchronous acknowledgement assumption:
* bound the state used to hold packets after transmission * common * does not seem to preclude scenarios/protocols
Collection must satisfy the following requirements:
In the forwarding functionality:
1. Reliable beyond best effort. There is an overwhelming demand for high reliability collection of packets. Although collection will not guarantee 100% delivery, it must not preclude and make it easy to implement higher layer functionality to ensure 100% delivery. Packet drops should be rare and perhaps happen only in situations when:
* A queue is full * Maximum number of link or network level transmissions exceeded * Duplicate detected
2. Do not preclude high throughput operation. This requirement, for instance, makes it necessary to take great care while using end to end mechanisms in collection.
3. Encourage node fairness at a long time scale. Node fairness becomes an issue when enough packets are generated by the sources to keep the channel busy. Collection can use end-to-end and hop-by-hop mechanism to encourage node fairness.
In the routing functionality:
4. Select routes that enable energy efficient forwarding of packets.
5. Anycast semantics for collection packets - destination can be any root and can not be specified.
6. Collection must support multiple radios, potentially of different capabilities.
In network discovery:
7. Ability to trigger a tree update from the root at any time.
8. Flexible to many models/topologies, including static topologies.
Notes by Rodrigo:
As implemented in the T2 beta 2 release, we have two types of packets: data and control. Link Estimation "wire" formats are not discussed here because they are appended and prepended to the control packets independently of the format of these, and don't affect the tree formation protocol.
It is important that the final specified protocol allows implementations of the specification to:
1. Interoperate 2. Improve
The balance we seek is best exemplified by that achieved by TCP: many different implementations of the TCP specification can interoperate, while many improvements to the algorithms involved were made over the years.
Metric is the minimization criterion used by the tee building algorithm in selecting routes. We discussed what to specify of the metric, and the consensus was that metric should be non-decreasing from hop to hop. For the first specification it will be a 16 bit value representing ETX x 10^2.
Loop Detection / Recovery / Queueing
Causes of dropping packets:
1. origin queue 2. in-network queue 3. re-xmit count exceeded 4. node failure 5. loop detection (with ttl or thl) 6. duplicate detection (not a drop)
We can't do anything for 1., 4. 6 is not a drop. That leaves 2, 3, 5.
We agreeded that a packet is dropped when it ceases to exist in the network without reaching the destination. There were two positions: Alec was defending that we drop the packet, since 100% reliability has to be built with higher layer (end-to-end) mechanisms anyway. It also makes things simpler. Phil prefers that we don't drop the packet except for node failure.
We also agreeded that we need back pressure if we are not to drop packets at the head of queues in the network. The queues would just overflow and we would be doing tail dropping instead. Custody transfer is the limit in this policy of not dropping: packets are only deleted at a forwarding node after an acknowledgment that the packet will be forwarded by a next hop is received. (This is different from a link layer acknowledgment).
How do we do back pressure then? We looked at two alternatives:
1. ECN 2. Acks
Pros: * simple (1 bit) * resilient to loss of ECN: bit can keep being sent Cons: * merges control and data planes (control packets send information about the forwarding engine state, creating a coupling between them) * requires snooping or acks with payload (1 bit at least)
Network level acks:
Pros: * per packet, more reliable than ECN: will only send if there is space * data path only Cons: * more costly * not the same as link layer ack
A proposal was that ECN be tried first:
* ECN when queue is getting full (whatever that is chosen to mean: it could also mean when the queue is growing) * Set ECN bit in the control/beacon packets
Timing: when do we send a beacon?
Triggered versus Periodic beacons. Triggered beacons means you send a beacon in response to an external event (e.g. receiving a beacon), as opposed to an internal event like a timer. Triggered is the style employed by Drain. It implies that beacons have origin generated sequence numbers so that waves of beacons can be distinguished.
The two can be merged with the use of a force bit in the control packet: a root initiated update would have a force bit that would cause all nodes to broadcast a control packet. Maybe there's no need for this, as a new sequence number would imply a forced update. The other variation we discussed was the use of a P pull bit, for implementing a fast join: a new node send a void beacon with the P bit set, inviting nodes around it to send their current information.
* only matters when the network is overloaded. Is this handled at the upper layer? What state do we have to maintain on a tree to implement fairness? Per child (as in Ee's scheme), per source? Just distinguising between local and forwarding traffic seems insufficient.
* There are arguments in favor and against two ways of building trees: MintRoute style and DSDV style.
* The first implementation of the Net2 Collection followed MintRoute in how it build trees. It is simple, based on periodic traffic and on minimizing a cumulative path quality metric (ETX) to the root. We explicitly use many roots and enforce an AnyCast semantic. Nodes send packets towards the roots of a forest, and know not which root the packet is going to get to. * Loops: this tree building mechanism suffers from transient loops and count to infinity problems. Basically, a node can choose as a parent one of its descendents. The forwarding mechanism that goes together with MintRoute style has to cope with loops by detecting and recovering. * Packet formats Origin
* This protocol aims at creating trees that are loop-free by construction. A node will not choose one of its descendants as a parent, because there is a sequence number that is incremented only at each root, and serves as a version information for the tree building. Also, disconnections, instead of potentially creating count-to-infinity loops, will make all in the disconnected subtree have a metric of 0. * There are subtle edge cases that have to be looked at (when to update given a sequence number, etc) * This also allows root triggered tree formation/refresh in the style of Drain, by means of the root's introducing a new sequence number. * Packet formats Origin
Summary and Next Steps
* Write down a TEP describing the protocol(s), in addition to the TEP on the interface for collection that is already in the repository. * Implement both alternatives and evaluate which one performs better in practice.