Revisiting the LL Abstraction
In the Aug 3rd, 2006 meeting we decided to examine the requirements that several protocols place on the Link Layer.
Defining the LL
* For the send path, the LL should provide
  * a way to inform of multiple transmissions (message futures)
  * a priority mechanism (urgent)
  * a reliability mechanism (with feedback)
  * rate limiting
  * cancellation
  * late stamping
  * introspection
    * timing information
    * queueing information
    * medium congestion
  * snooping
* It should also provide
  * message queueing/pooling
  * a neighbor table
    * link quality information
    * node schedule information
    * mechanisms for controlling membership (e.g. pin/unpin, evict, insert)
    * ease of reflection (private sync'd tables)
  * acks with payload
* Message pool/futures (Boomerang)
* Neighbor table (schedule information / link estimation information) (Net2 Collection, Boomerang)
* Reliability mechanism / feedback (Boomerang, Anycast)
* Acks with payload (Anycast)
* Timestamping / late stamping of other information (events? callbacks?) (Fusion, Boomerang, Tenet, Flush)
* Priority mechanism (Fusion, Boomerang)
* Cancellation (Dissemination, opportunistic forwarding)
* Introspection
  * timing information (Flush)
  * queue information (Flush)
  * medium congestion (Fusion)
* Rate limiting (Flush, Fusion)
* Snooping (Flush, Fusion)
* Naming
  * raw addressing exported (Boomerang)
  * handle (Berkeley's SP)
  * UART not an address, but an interface (Tenet)
* Timing (see DSN and Trickle for precise timing requirements)
Boomerang Lessons - Joe
Thanks for the email. Let me try to summarize what we've found useful, what is probably not necessary, and what we'd like to see in the future.
1) Message Pool. The message pool has been invaluable for storing messages until a later point in time when they can be sent. It decouples the power management policy from the data transmission policy, which in turn means you can submit a message whenever you want and it will go out on the radio.
2) Message Futures. Due to the decoupling above, message futures are necessary if you generate more than one message per power cycle. Keeping a queue (we use cqueue by Cory Sharp) in the networking protocols allows each protocol to decide how deep the queue is and how to prioritize it. This can also be accomplished with a shared buffer pool from which network protocols extract buffers when needed.
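A protocol-side future queue could be sketched roughly as follows, a minimal fixed-size circular queue in C. This is illustrative only; the names and depth are hypothetical, and Boomerang's actual cqueue API may differ:

```c
#include <stddef.h>

/* Sketch of a cqueue-style circular message queue; illustrative only.
 * Each protocol sizes its own queue and decides how to order it. */
#define QUEUE_DEPTH 4

typedef struct {
    void  *slots[QUEUE_DEPTH]; /* pending message pointers */
    size_t head, count;
} msg_queue_t;

/* Enqueue a pending message; returns -1 if the queue is full. */
static int queue_push(msg_queue_t *q, void *msg) {
    if (q->count == QUEUE_DEPTH) return -1;
    q->slots[(q->head + q->count) % QUEUE_DEPTH] = msg;
    q->count++;
    return 0;
}

/* Dequeue the oldest message, or NULL if the queue is empty. */
static void *queue_pop(msg_queue_t *q) {
    if (q->count == 0) return NULL;
    void *msg = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return msg;
}
```

The point of keeping this in the protocol rather than the link layer is exactly the one above: each protocol picks its own depth and priority policy.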
3) Neighbor Table. This should be clear from your discussions, but I wanted to chime in on how we use the neighbor table. We actually do not use it for link estimation, but rather for power scheduling. The neighbor table makes it easy either for the link to insert its schedule or for a network protocol (such as NetSync and NetWake in Boomerang) to add other schedules.
One shortcoming: we will be changing the interface to comply more with TEP102 for timers. Currently the neighbor table takes a 32-bit absolute time for turning the radio on and off. Because TinyOS is not real-time, delays can cause an event to be missed and lead to 36-hour delays (32-bit rollover)! The solution is the TEP102 proposal for Alarms: use t0 and dt to specify the next wakeup interval. This then couples nicely with the Timer/Alarm system and makes the overall implementation easier.
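The rollover problem and the (t0, dt) fix can be illustrated with a small C sketch (the function name is hypothetical, not TEP102's actual interface):

```c
#include <stdint.h>

/* Sketch of a TEP102-style (t0, dt) alarm check. Instead of comparing
 * against a 32-bit absolute deadline (which, if missed because TinyOS
 * is not real-time, does not recur until the counter rolls over many
 * hours later), the alarm fires as soon as at least dt ticks have
 * elapsed since t0. The unsigned subtraction makes the comparison
 * correct even when the counter wraps between t0 and now. */
static int alarm_expired(uint32_t now, uint32_t t0, uint32_t dt) {
    return (uint32_t)(now - t0) >= dt;
}
```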
4) Link Estimation turns out to be a bit of a bust, because you can't do it very well in an application-independent manner. Jack Stankovic pointed out at one of our meetings that different applications have different opinions on what a high-quality link is. Cost-based routing is one thing, but geographic or sensor-based routing has different requirements. Link estimation in the base case doesn't have to be fancy, nor does it have to give me "real units" like dBm or PER; relative link estimates are just fine for all the protocols we've encountered.
5) Reliability. I don't care how you implement reliable transport at the link, but you need to give protocols the option to ask for reliability and get feedback as to whether the reliability was achieved. This point should be obvious, but it is absolutely critical for us to have bi-directional reliability data to make informed decisions about what to do with the next message.
6) Timestamping. I've never figured out why timestamping is not a core piece of metadata available with each packet. It is trivial to provide either on the SFD or, when SFD isn't available, on packet dispatch. Sure, it won't be 100% accurate, but with a 30.5us clock it doesn't really matter. The #1 best addition to SP in Boomerang was timestamping, and it has served us very well.
7) Naming. This is an overrated topic. We initially thought we were going to implement Arsalan's SP handle idea. Then we started looking at the realities of the situation and decided that, in practice, it was never going to be needed and could easily be pulled in later if we needed it. There was no point confusing users with a handle when they're still trying to figure out what "address" means.
8) Congestion. This isn't critically important from a practical perspective. From a research/academic perspective it was quite fascinating to see that we could build a congestion control scheme with basic working properties in a platform/link-independent manner. I am still fascinated by that, but practically speaking, our motes are off most of the time and our goal is to do as little data transfer as possible, so 99.9999% of the time there is no congestion.
9) Phase/Delay feedback. Not necessary with the message pool decoupling transmission from the network protocols. David Culler and I initially thought that this was a requirement for the decoupling, but experience with SP has shown that the message pool can do an effective job of the decoupling without complicating the other protocols.
10) Urgent. I think this is somewhat useful and doesn't need to carry more than 1 bit of information. There are a number of papers out there, including one from Stankovic, that show how 1 bit of information can provide effective prioritization of network traffic. Arbitrary priority levels, in my opinion, are complex and aren't required by any protocol. The system is supposed to work together--if you are worried about resource-constrained systems fighting internally with themselves, you're probably developing your application incorrectly.
11) Abstract data types. The initial version of SP required you to specify the fields in the message structure itself. Phil Levis was right in complaining that it led to a lot of programmer error. In SP in Boomerang, we moved to a system where we explicitly prohibited users from editing any of the fields of an sp_message_t structure. Instead, you set fields using the SPSend.send() and SPSend.sendAdv() commands, and accessed fields through the SPMessage interface of SPC. This is a much safer implementation and has reduced the number of headaches that we encountered in the initial implementation of SP for the Sensys paper.
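The accessor-based approach could be sketched roughly as follows in C. The struct layout and function names here are hypothetical, not Boomerang's actual SPSend/SPMessage API; the point is that protocols only ever go through the functions, never the fields:

```c
#include <stdint.h>

/* Illustrative sketch in the spirit of SP's SPSend/SPMessage
 * interfaces; layout and names are hypothetical. Protocols set fields
 * only through the init/setter functions and read them only through
 * accessors, so the message layout can evolve without breaking (or
 * being corrupted by) callers. */
typedef struct {
    uint16_t addr;     /* link-layer destination */
    uint8_t  length;   /* payload length */
    uint8_t  quantity; /* messages promised (futures) */
} sp_message_t;

static void sp_msg_init(sp_message_t *m, uint16_t addr,
                        uint8_t length, uint8_t quantity) {
    m->addr = addr;
    m->length = length;
    m->quantity = quantity;
}

static uint16_t sp_msg_addr(const sp_message_t *m)     { return m->addr; }
static uint8_t  sp_msg_quantity(const sp_message_t *m) { return m->quantity; }
```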
12) Receive. We completely threw out the TinyOS method of Receive, where you must return a message pointer, and introduced a new interface, SPReceive, that can fan out to as many protocols as you need. This has been a time- and bug-saving development. The notion is the following: if you need data in the received message, you must copy it or do something with it before returning from the SPReceive.receive() function. The buffer-passing mechanism of ReceiveMsg, although novel in principle, just wasn't used in practice.
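A minimal C sketch of the fan-out idea (all names here are illustrative, not the actual SPReceive interface): the link layer signals every registered handler with a read-only view of the packet, and a handler that needs the data copies it before returning, so no buffer ownership ever changes hands:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of an SPReceive-style fan-out; names are illustrative.
 * Handlers get a read-only view and must copy before returning.
 * No buffer swapping as in TinyOS's ReceiveMsg. */
#define MAX_HANDLERS 4

typedef void (*receive_handler_t)(const uint8_t *payload, uint8_t len);

static receive_handler_t handlers[MAX_HANDLERS];
static int num_handlers;

static void sp_receive_register(receive_handler_t h) {
    if (num_handlers < MAX_HANDLERS) handlers[num_handlers++] = h;
}

static void sp_receive_signal(const uint8_t *payload, uint8_t len) {
    for (int i = 0; i < num_handlers; i++)
        handlers[i](payload, len);
}

/* Example handler: copies what it needs before returning. */
static uint8_t copy_buf[32];
static uint8_t copy_len;
static void demo_handler(const uint8_t *payload, uint8_t len) {
    if (len > sizeof copy_buf) len = sizeof copy_buf;
    memcpy(copy_buf, payload, len);
    copy_len = len;
}
```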
Critical to support: message pool, message futures, neighbor table schedule information, reliability control and feedback, timestamping, abstract data types with non-direct access to structure fields, non-buffer-passing receive interfaces
Don't get too caught up with: Link estimation, Urgent/Priority
May not be completely useful: Congestion, Naming
Haven't found a need for yet: Phase/Delay feedback
Dissemination - Phil
Here are important considerations -- the wish list, if you will -- that trickle-style dissemination has for a data-link abstraction.
1) Cancellation. If there is any latency between submitting a packet to send and its transmission, it is important that you be able to cancel a packet after you have submitted it. Cancellation does not have to be perfect, but the finer the timing, the better. This means that rapid piggybacking might be problematic (I heard someone transmit, so I'll transmit too).
2) Timing. A dissemination component can know far in advance when it might want to transmit. E.g., if your timer is 1 minute, it might know that it wants to transmit a packet in 45 seconds. If you have cancellation, then you could submit it very early. The *exact* time of transmission is not as important as establishing that the packet does not go out in a certain period (the listen-only period). So maybe what you do is, after 30 seconds, submit the packet to be transmitted in 15 seconds. Note, though, that while transmitting it in 10 or 20 seconds is OK, transmitting it in 35 is not OK (30s + 35s is larger than the interval of 1 minute).
3) Rare and few packets. Generally, a signaling protocol sends a packet very rarely. Depending on the items being disseminated, there might be other requirements, but I think that's protocol-specific. E.g., Deluge's dissemination might need to send lots of packets quickly, but its dissemination signaling protocol doesn't.
4) Trickle operates in terms of "broadcast time." In CSMA this is just time, while in broadcast-enabled TDMA it means logical time over the broadcast slots. It needs to have a reasonable degree of randomization within this time, and discretization can be an issue. For example, if a link layer delays packets in a fashion that bunches traffic, rather than having a uniform distribution of transmission times, Trickle will have correlated transmission times. This can cause problems with its scalability, as higher channel utilization will lead to hidden terminals, increasing loss rates and therefore the base of the logarithm in its log(n) scalability.
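The timing rule in point 2 is essentially Trickle's: stay silent for the first half of the interval, then pick a transmission time uniformly in the second half. A minimal C sketch, assuming tick-based intervals (the function name is hypothetical):

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch of Trickle's timer rule for the scenario in point 2 above:
 * with an interval of tau ticks, the node listens for the first tau/2
 * (the listen-only period) and picks a transmission time t uniformly
 * in [tau/2, tau), so the packet never goes out during the listen-only
 * period and never after the interval ends. */
static uint32_t trickle_pick_t(uint32_t tau) {
    return tau / 2 + (uint32_t)rand() % (tau / 2);
}
```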
(See mailing list for discussion with Joe) --Phil
Anycast Requirements - Henri
Brief description: In anypath routing (aka opportunistic routing), each packet can be relayed by any one of a set of candidate next hops. These candidate next hops at each node are established by the routing protocol. Anycast forwarding is the link-layer mechanism that supports the semantics of "send to any node in a set of neighbors". In certain anycast forwarding schemes, there may be multiple receivers of a packet, and an arbitration mechanism is needed to resolve which among those receivers becomes the effective next-hop relay. The implementation of such arbitration mechanisms motivates points 2 and 3 below.
Here are some link-layer requirements for anycast forwarding. I am explicitly *not* making any proposals related to link-layer interfaces; maybe all of this can be done above an abstracted link layer and has no effect on it (though I would be surprised ;)).
1) Multiple candidate relays: It must be possible for the routing protocol to indicate a list of candidate next hops for a packet (*). This is not the same as having backups in case a packet doesn't go through.
2) ACKs with payloads. It should be possible to fit 1-2 fields in an ACK frame, maybe something like 32 bits total. This does not mean that *all* ACKs must carry this overhead. So an open question is how the contents of this field are communicated down to the link layer from above, on a per-packet basis.
On the receive side, this means that an ack is not just a 1 bit metadata field; the receiver must have some way of obtaining the ACK payloads, including for ACKs corresponding to a packet sent by another node.
3) ACK timing and slotted ACKs. It must be possible to specify some timing on ACKs, i.e., to say "I want the ACK to go out in X ms"; alternatively, to have logical timeslots and say "I want the ACK to go out in the Xth slot".
(*) This is not required by every conceivable anypath/opportunistic routing protocol. E.g. in some instances, one can simply send to broadcast address; every receiver with cost lower than the sender is a candidate relay (packet of course then has a cost field).
Flush (Fast bulk data transfer) - Rodrigo
* Brief description. Flush is the next generation of Straw, a protocol that aims to maximize the throughput of bulk data transfer, one flow at a time in the network, enforcing 100% reliability. There are two parts to Flush: one that pipelines packets over multiple hops, ideally as fast as possible, while trying to minimize self-interference in the path (among adjacent packets), and one that provides transport-layer functionality by running an end-to-end protocol enforcing that 100% of the data is correctly received. The end-to-end mechanisms often involve multiple rounds of transfers to get the missing data. We assume that the Link Layer provides queueing of some sort (which could also be a pool; there is no strict requirement on the ordering). Flush imposes some interesting requirements on the one-hop layer.
* Timing information feedback: Flush requires that the LL provide timing feedback on all packets: how long did the packet take after it was dequeued for sending, including all of the LL retransmissions and the MAC-layer delays?
* Late stamping of this timing information in the packet: after the packet leaves the queue, and before it is sent directly to the radio, Flush stamps it with the average of the delays of the previous packets. This average is computed from the numbers collected above. This is also not the same as post-arbitration time, but would probably benefit from similar mechanisms.
* Rate limiting: a very important feature for the pipelining to work in a stable fashion is the ability to tell the link layer not to send faster than a given rate (or, equivalently, not to send with an inter-packet interval smaller than some value). This is done in Flush at each hop, not only at the source. Although an optimization, it provides for much less bursty rates. What happens in the common case when there's no rate limiting is that nodes drain the queues too fast, making the overall throughput suffer.
* Snooping of traffic: Flush needs rate information to flow along the reverse path in order to obtain good throughput. This is done by embedding, in the packets that go forward, information destined for the previous hop(s). These nodes snoop the traffic and adjust their rates accordingly. If snooping is not provided, at least broadcast is required. Since all nodes are part of a single path here, it is reasonable to say that they are all awake.
These are the most important features.
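The delay-averaging and rate-limiting hooks above could be sketched as follows in C. The EWMA weight (1/4) and tick units are assumptions for illustration, not Flush's actual constants:

```c
#include <stdint.h>

/* Sketch of two hooks Flush wants from the one-hop layer.
 *  - record_send_delay() tracks the average per-packet send delay
 *    (dequeue to send-done, including retransmissions); this average
 *    is the number late-stamped into outgoing packets.
 *  - ok_to_send() gates transmissions so that packets leave no closer
 *    together than a minimum inter-packet interval. */
static uint32_t avg_delay;      /* EWMA of send delays, in ticks */
static uint32_t last_send_time; /* time of the previous send, in ticks */
static uint32_t min_interval;   /* rate limit: minimum inter-packet gap */

static void record_send_delay(uint32_t delay) {
    avg_delay = (3 * avg_delay + delay) / 4; /* alpha = 1/4 */
}

static int ok_to_send(uint32_t now) {
    return (uint32_t)(now - last_send_time) >= min_interval;
}
```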
Fusion/Congestion - Kyle
* Brief description. In the Fusion paper, we examined three techniques that span different layers of the network protocol stack: hop-by-hop flow control, rate limiting source traffic when transit traffic is present, and a prioritized medium access control (MAC) protocol. Each of these ideas has been proposed before at SenSys or MobiCom, and even before that in different contexts, so our contribution was to fuse them together and evaluate them in a real sensor network, something previous work in our field hadn't done at the time.
* To do CODA-style congestion control, the MAC needs to be able to sample the CCA pin from the radio and perform a time-averaged computation on that binary value. This information would then be exposed by the link layer.
* Like Flush, we relied on late-stamping information sent out in each packet -- this is desirable because we want congestion information piggybacked on each packet sent to be as fresh as possible.
* We want congested downstream nodes to take precedence in a multiple-access contention (if it happens). So Fusion requires a priority mechanism from the link layer. With regard to Joe's comments about priority, we agree; Fusion doesn't need more than one bit of priority info.
* Because most of what we obliquely call "congestion" is really wireless loss (we make this point plainly in the paper), rate-limiting is key. Whether it should live in a link-layer abstraction is debatable, but note that rate-limiting will at least require information from the link-layer abstraction, such as number of siblings, number of children, an estimate of number of nodes in the network, or some combination of these.
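The CCA-based utilization estimate mentioned above could be sketched as follows in C. The fixed-point scale, EWMA weight (1/8), and congestion threshold are assumptions for illustration, not CODA's actual constants:

```c
/* Sketch of CODA-style channel sensing: periodically sample the
 * radio's binary CCA pin (1 = channel busy) and keep an exponentially
 * weighted moving average as a fixed-point fraction, which the link
 * layer would then expose as a congestion indicator. */
#define CCA_SCALE 256u /* utilization expressed in [0, 256) */

static unsigned cca_util; /* time-averaged channel utilization */

static void cca_sample(int busy) {
    /* new = (7/8) * old + (1/8) * sample */
    cca_util = (7u * cca_util + (busy ? CCA_SCALE : 0u)) / 8u;
}

static int channel_congested(void) {
    return cca_util > CCA_SCALE / 2u; /* assumed threshold */
}
```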
Tenet - Om
Tenet uses a collection protocol and a dissemination protocol (a mechanism similar to Trickle) on the motes, so there is nothing new to say there.
Here are some other requirements:
* Time synchronization. We currently use FTSP, which requires low-level timestamping. FTSP requires a callback on startSymbol on Receive as well as Send so that the packet can be time-stamped accurately. If the one-hop layer provides timestamping (on the outgoing as well as the incoming path), similar to what Joe mentioned in his email, that would be nice.
* Tenet requires that the two tiers of the network be able to address each other. It would be good to think about naming a bit. Right now, all the nodes (Stargates and motes) derive their names from the same 16-bit address space, which makes it necessary to add an IP prefix when resolving the Stargate names.
* On the gateway node, an interface that allows sending a message on a given link to a given destination. For example, sending a message over the UART to Stargate 5 vs. sending a message over radio1 to mote 10.
I think most of the time we are thinking of a radio link, but should we consider other types of devices? Acoustic modems for underwater networks, the CC2420, the UART.
DSN (Declarative Sensor Networking) - Arsalan
DSN is a little different than the rest of the protocols in that it is not itself a protocol, but a way of programming the sensor network. In essence, DSN creates a runtime query processor on top of this link layer and treats everything as predicates and queries. Theoretically, if it is successful, it should be able to implement many, if not all, of the above protocols, such as Flush, Fusion, and Trickle. Consequently, any requirements that any of these protocols have are naturally carried over to DSN. However, there are some requirements that DSN itself places on the system:
- Snooping or Overhearing: Part of the method that DSN uses to efficiently disseminate and calculate queries is the overhearing and snooping of packets, in order to find new data for calculating a query.
- Precise Timing: DSN often uses a fine granularity for indicating the period when a query must be sent out, because it often uses epochs to send out the queries, and then gather results at a base station.
- Piggybacking: Ability to arbitrarily piggyback messages together for message savings.