Basic Wireless Communication for Microcontrollers
Chapter 4 - Design Project 3: 900MHz Automatic Error-Correcting Data Link
Communication Protocols and Networks
     If you didn't know anything about data communications and you were handed a microcontroller and an RF module, the first thing that might occur to you would be to just send raw serial data over the RF link. In many cases, this would work fairly well, and there are some circumstances where this is a logical thing to do. However, there are several reasons why this is usually not a good idea. For one, the data stream is susceptible to errors, and adding some additional data (such as a frame check sequence, or FCS) to the message allows us to detect errors so that they can be corrected. Secondly, when the transmitter is not sending, it is very difficult to ensure that the receiver will not occasionally output random data, due to noise and interference. There would be no way to distinguish this data from the intended information if we just sent the raw data. Finally, if there are several transmitters and receivers on the same channel, there would be no way to select which data was intended for which receivers.
     For these reasons, most wireless links organize data into packets. Each packet usually contains bit sequences to indicate the beginning and end of the packet (to distinguish it from the random, noisy data which may be present while no real data is being sent), some data intended to help determine if the information in the packet has been corrupted by noise or interference, and possibly a length specification, and source and destination addresses (if there are multiple stations on the channel). The system for organizing the data into packets and using the information in the packets to coordinate the link efficiently is called a protocol.
Packet Synchronization
     To indicate the beginning and end of each packet and to distinguish it from noise, a particular byte (called a flag or flag character) is added at both the beginning and end of the packet. This byte is prevented from occurring anywhere else within the packet. If it would otherwise occur, some measure is taken to indicate this without actually sending the byte, such as sending a two-byte sequence (this is called escaping because the first byte indicates that the following byte has been altered). Whenever the receiver sees the flag, it clears its receive buffer and begins assembling a new packet. This allows the receiver to synchronize immediately with the beginning of a new packet. Even if noise contained flag characters, the receiver would be sure to find the beginning of the new packet because it starts over at every flag byte which is received.
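     The flag-and-escape idea can be sketched in a few lines. This is only an illustration: the names frame and Deframer are made up for this example, and the byte values (0x7E for the flag, 0x7D for the escape, XOR with 0x20 for the altered byte) are assumptions borrowed from HDLC-style byte stuffing. Any values would do, as long as both ends agree.

```python
FLAG = 0x7E  # assumed flag byte marking both ends of a packet
ESC = 0x7D   # assumed escape byte
XOR = 0x20   # assumed mask applied to a byte that follows ESC


def frame(payload: bytes) -> bytes:
    """Wrap payload in flags, escaping any FLAG or ESC bytes inside it."""
    out = bytearray([FLAG])
    for b in payload:
        if b in (FLAG, ESC):
            out += bytes([ESC, b ^ XOR])  # two-byte sequence replaces the byte
        else:
            out.append(b)
    out.append(FLAG)
    return bytes(out)


class Deframer:
    """Takes received bytes one at a time and hands back completed packets."""

    def __init__(self):
        self.buf = bytearray()
        self.escaped = False

    def feed(self, b: int):
        """Return the buffered bytes when a flag arrives, otherwise None."""
        if b == FLAG:
            # Every flag closes whatever was buffered and starts over.
            packet = bytes(self.buf) if self.buf else None
            self.buf.clear()
            self.escaped = False
            return packet
        if b == ESC:
            self.escaped = True
            return None
        if self.escaped:
            b ^= XOR  # undo the alteration made by the sender
            self.escaped = False
        self.buf.append(b)
        return None
```

     Note that noise received before a flag is also handed back as a "packet" here. Telling it apart from real data is left to the error check applied to each packet.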
Error Detection and Correction
     One of the advantages to packetization, as mentioned before, is that errors can be detected and corrected. Most protocols handle this using a frame check sequence (FCS) to detect errors and retransmission to correct them. Protocols which allow error correction by retransmission need to provide a means for the receiver to ask for a repeat and for the transmitter to send one, all without causing a disruption to the operation of the data link (such as causing duplicate data to be received). The process of stating that a packet was received correctly is called acknowledgement (often abbreviated as ACKing or ACK), and when a packet is received with errors, a negative acknowledgement (or NACK) is sent.
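     As a rough illustration of an FCS, the sketch below appends a 16-bit CRC to each packet and checks it on reception. The text does not say which check to use, so the CRC-16 variant (polynomial 0x1021, initial value 0xFFFF) and the function names are assumptions made for this example.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with polynomial 0x1021 (assumed choice of FCS)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def add_fcs(payload: bytes) -> bytes:
    """Sender side: append the two FCS bytes to the payload."""
    return payload + crc16_ccitt(payload).to_bytes(2, "big")


def check_fcs(packet: bytes):
    """Receiver side: return the payload if the FCS matches, else None."""
    if len(packet) < 2:
        return None
    payload, fcs = packet[:-2], packet[-2:]
    if crc16_ccitt(payload).to_bytes(2, "big") != fcs:
        return None  # corrupted: this is where a NACK would be sent
    return payload
```

     A receiver would send an ACK when check_fcs returns a payload and a NACK when it returns None.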
Protocol Layers and the ISO Model
     The protocol we have been describing so far is a general one for transferring any type of data over a moderately noisy channel. In many applications, several protocols are used together, one inside the other (in other words, the packet for protocol 1 may travel as the message in the packets of protocol 2). This is known as protocol layering. In situations such as this, the term protocol itself takes on a larger meaning, as an established means of accomplishing a particular task in the process of communicating.
     For example, if you were controlling a robot you might send commands like "C90" meaning turn clockwise 90 degrees. Rather than send these directly over the RF link as raw data, you might encapsulate the commands in the packets of a general purpose RF communication protocol. We could then say that we are dealing with two protocols, a command protocol and a transfer protocol. Each one operates independently of the other, in the sense that they don't need to know anything about the internal operation of the other. This is helpful because if you wanted to change the communication method (say to a wired link), you could alter or replace the lower level protocol (the transfer protocol) without necessarily changing the command protocol, provided the two were still compatible. In other words, protocol layering allows you to treat each part of your communication system as a modular block which has specifications but is abstracted from its internal details, similar to objects in object oriented programming.
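     The two-layer robot example might look something like the sketch below. The packet layout of the transfer layer (one length byte, the message, one additive checksum byte) and all of the function names are invented for illustration. The point is only that the command functions never look inside the transfer packet, and the transfer functions never look inside the command.

```python
# Command protocol: knows about robot commands, nothing about packets.
def encode_command(direction: str, degrees: int) -> bytes:
    return f"{direction}{degrees}".encode("ascii")  # e.g. b"C90"


def decode_command(message: bytes):
    text = message.decode("ascii")
    return text[0], int(text[1:])


# Transfer protocol (hypothetical layout): knows about packets, not commands.
def transfer_wrap(message: bytes) -> bytes:
    checksum = sum(message) & 0xFF
    return bytes([len(message)]) + message + bytes([checksum])


def transfer_unwrap(packet: bytes):
    """Return the carried message, or None if the packet is malformed."""
    if len(packet) < 2 or len(packet) != packet[0] + 2:
        return None
    message, checksum = packet[1:-1], packet[-1]
    if sum(message) & 0xFF != checksum:
        return None
    return message
```

     Replacing transfer_wrap and transfer_unwrap with functions for a wired link would not require touching the command functions at all.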
The International Organization for Standardization (ISO) has adopted a system for classifying protocols in layered protocol systems. The system is called the Open Systems Interconnection Reference Model (OSI-RM). The seven standard layers (with quick descriptions) are as follows, in order from lowest (the one which contains data from all the others) to highest:
Physical Layer: The hardware such as RF modules, wired links, etc.
Data Link Layer: The error checking/correcting, address checking protocol layer
Network Layer: In charge of routing packets from source to destination where multiple stations are involved in between
Transport Layer: Makes sure that the data arrives in the correct order. When data has to take various different routes, some older data may arrive sooner and the sequence may need to be re-arranged.
Session Layer: Provides synchronization between the sender and receiver, to maintain a constant data flow rate if needed.
Presentation Layer: Does data translation when different types of data encoding are used by each end.
Application Layer: Interfaces between the software which ultimately uses the information and the other six layers.
     The OSI-RM is not something that you would try to follow exactly, since it is often convenient to split up the layers differently or not even have some of them at all. In addition, several of the top layers are not relevant to microcontroller communications. Still, the model is mentioned often enough that it is useful to become familiar with it.
Multiple Access
     Multiple access refers to techniques used to allow several stations to communicate over the same medium. Simply placing the stations on different frequencies is called frequency-division multiple access (or FDMA). Spread spectrum transmission (see section ...) is often called code-division multiple access (CDMA). In cases where several stations must share one channel, time-division multiple access (TDMA) is used. This is just the simple concept that each transmitter takes a turn sending. Since we are dealing with simple wireless links for microcontroller-based devices, we will focus on TDMA.
     There are two necessary elements to TDMA: an addressing scheme and an anti-collision scheme. The first is needed to allow individual receivers to determine which data is intended for them versus that intended for other receivers, and the second is to prevent multiple transmitters from sending at the same time and interfering with each other. Addressing schemes usually assign a number to each receiver and when a transmitter needs to communicate with a particular receiver, it will place that number in a particular part (or field) of the transmitted packets. Each receiver then scans this "address field" to see if it contains its particular ID number or address. If the receiver's address matches that in the address field of the packet, the receiver accepts the data. If not, it discards it.
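     A minimal sketch of address filtering follows. The field layout (destination byte first, then source byte, then data) and the broadcast address 0xFF are assumptions made for this example, not part of any particular protocol.

```python
BROADCAST = 0xFF  # hypothetical address meaning "all stations"


def make_packet(dest: int, src: int, data: bytes) -> bytes:
    """Place the destination and source addresses in front of the data."""
    return bytes([dest, src]) + data


def accept(packet: bytes, my_address: int):
    """Return (source, data) if the packet is for this station, else None."""
    if len(packet) < 2:
        return None
    dest, src = packet[0], packet[1]
    if dest not in (my_address, BROADCAST):
        return None  # addressed to some other receiver: discard
    return src, packet[2:]
```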
     Anti-collision schemes are more varied than addressing schemes. The simplest scheme is called aloha (because it was developed at the University of Hawaii). In aloha, there is no mechanism for preventing collisions; each transmitter just transmits whenever it wants. If there is a collision, the receiver never gets the data. The transmitters expect an acknowledgement (or ACK) in response to what they sent, and when they do not get one, they just resend. As long as each transmitter transmits at random times (so that the transmitters do not become synchronized and repeatedly try to send at the same time), this system can actually work fairly well. Aloha is used especially in circumstances where there are many small stations communicating with a central station (such as in a cellular network). In this case, the small stations may not be able to hear each other to determine if any others are transmitting, so they cannot avoid colliding. A variant of aloha, called slotted aloha, splits the time up into slots, and a transmitter may only begin sending at the start of a slot. Some central "master" station usually sends a signal to mark the slot timing. Because two packets can then only collide if they start in the same slot, the chances of a collision are reduced. Slots can also be assigned to groups of transmitters (the master says "ok, transmitters in group A can now send" and then later "those in group B can now send", etc.), which further reduces the chances that more than one will try to send at the same time.
     Other popular anti-collision schemes are CSMA/CA and CSMA/CD. CSMA stands for carrier sense multiple access, which is really a variant of TDMA where all the stations on the channel have the ability to hear all the other stations and determine, by listening, if the channel is free (hence the term carrier sense). CA stands for collision avoidance and CD stands for collision detection. CA is when the stations listen before transmitting and do not transmit if the channel is not clear. Usually, the following algorithm is used. When you have data to send, listen first. If the channel is clear, send. If it is not clear, pick a random number and wait that amount of time before checking again to see if it is clear. The random number is to prevent stations from becoming synchronized and trying to send at the same time.
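     The listen-then-back-off algorithm just described can be sketched as follows. The three callbacks (channel_is_clear, transmit, wait) stand in for whatever your RF module and timer provide, and the limits on backoff length and number of attempts are arbitrary assumptions.

```python
import random


def send_with_csma_ca(channel_is_clear, transmit, wait,
                      max_backoff_slots=8, max_attempts=10, rng=random):
    """Listen before sending; back off a random time while the channel is busy.

    Returns True if the data was sent, False if the channel never cleared.
    """
    for _ in range(max_attempts):
        if channel_is_clear():
            transmit()
            return True
        # Busy: wait a random number of time slots so that stations waiting
        # for the same gap do not all retry at the same instant.
        wait(rng.randint(1, max_backoff_slots))
    return False
```

     Giving up after a fixed number of attempts is a design choice added here so that the caller can report a jammed channel instead of waiting forever.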
     CD takes a different strategy. In CD, stations transmit whenever they have data. Some means, such as error detection, is used to determine if a collision happened. If it did happen, the two transmitters each pick random numbers, wait that amount of time, and then send again. In some cases, the transmitters actually listen at the same time as transmitting (this usually only works on wired links, such as Ethernet, not wireless ones) and if a collision is detected during transmission, the transmitter will stop sending data and actually transmit a jamming signal to jam the other transmitter. The point of jamming is not to seek revenge, but rather to ensure that the other transmitter detects that a collision has occurred (the jamming signal makes it more certain that the other transmitter's received data will not match that being sent). CD is used in situations where channel usage is heavy and there is a significant propagation delay between stations (such as an Ethernet network with computers attached to a 1000-foot piece of coax). In these cases, CA would not work well because even if you sense that the channel is clear, there may in fact be another station which has started sending, but the bits have not reached your location yet.
Connected vs. Connectionless Protocols
     Because a protocol needs to handle certain conditions differently than others (such as sending repeated data when an error occurs versus just sending data normally), the software which handles the protocol has certain states that it can enter. Some examples of such states in a very simple system would be idle (no data to transfer) and waiting for an ACK. More complex protocols often have many more states which may include the connected and disconnected state. The connected state means that a particular station is engaged in communication with a particular other station. Even if no data is being transferred, the link is maintained, usually by periodically sending packets just to ensure that the link still exists. The disconnected state is where there is currently no steady link with any station, although the reception and transmission of "broadcast" packets (sent to multiple stations at once) may be allowed. Protocols which use the connected and disconnected states are called connected protocols, and those that don't are called connectionless protocols.
     Connected protocols have the advantage that the link is constantly being tested and maintained so each end always knows the status of the link and the degree of readiness to transfer information. Also, on channels with multiple stations, the connected state signifies that all data which is sent to the software handling the protocol should go to a given station, rather than the higher-level software having to specify a destination for each piece of data. Connectionless protocols have the obvious advantage of simplicity.
     Connected protocols and others which have many different possible states are often documented using a state table. State tables have the states going down one side and various events going along the top. To see what the protocol would do when a certain event occurs (such as receiving a NACK), you find the event along the top and then go down to the horizontal row which corresponds to the current state. The box at these coordinates will describe the actions taken (e.g., perhaps the last packet is resent) as well as what state to go to next. Very simple protocols could also be documented this way, but there would be so few states and events that it is not necessary or very helpful.
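     A state table translates almost directly into a lookup table in software. The sketch below uses the two simple states mentioned earlier (idle and waiting for an ACK); the event names, action names, and the rule that unlisted combinations are ignored are all assumptions made for illustration.

```python
# Each box of the table: (current state, event) -> (action, next state)
STATE_TABLE = {
    ("IDLE", "DATA_READY"): ("send_packet", "WAIT_ACK"),
    ("WAIT_ACK", "ACK"): ("discard_sent_packet", "IDLE"),
    ("WAIT_ACK", "NACK"): ("resend_packet", "WAIT_ACK"),
    ("WAIT_ACK", "TIMEOUT"): ("resend_packet", "WAIT_ACK"),
}


def step(state: str, event: str):
    """Look up the box for this state and event.

    Combinations that are not in the table are ignored and leave the
    state unchanged.
    """
    return STATE_TABLE.get((state, event), ("ignore", state))
```

     For example, step("WAIT_ACK", "NACK") gives the action "resend_packet" and keeps the protocol in the WAIT_ACK state.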
Overhead, Throughput, BER, Latency, and Complexity
     As with any engineering decision, it is important to make sure that you can summarize in a simple fashion the advantages and disadvantages of each protocol or type of protocol so it is easy to select one for a particular application. While there are many things to consider, there are five pieces of information which can be used to evaluate any protocol for wireless communications: overhead, throughput, bit error rate (BER), latency, and complexity.
     Overhead is how much data the protocol adds to the actual message data. Your RF module may be sending at 4800 bps, but if the protocol adds 20 bytes of overhead to every 20 bytes of data you want to send, then even under the best conditions you will only be able to send 2400 bits per second. Overhead is closely related to complexity, since more sophisticated protocols (with connected and disconnected states, etc.) need to add more information to each packet in order to handle the administrative functions. Also, any additional "supervisory" packets which the stations may need to transmit must be considered (e.g., if every packet must be ACKed, and we are on a single channel (half-duplex link) so that the destination station cannot transmit at the same time as the sender, then the number of bytes in the ACK packet should be considered in the overhead as well).
     Throughput is the average rate at which your raw data can be transferred across the link. Obviously, overhead is a major factor in this, since we can never do better than overhead allows (in the above example, the absolute maximum throughput would be 2400 bps, or 240 bytes per second when you consider the start and stop bits of each byte, if a UART encoding is used). There are always other limiting factors as well, though. If retries happen due to collisions, noise, or interference, this will slow down the net rate of transfer. Transmitters and receivers need a short time between being activated and when they can actually handle data, so there will be delays when the circuits switch from transmit to receive. So, for a particular protocol operating over a given link (with its associated noise and number of other stations) with a given RF module, there will be a certain throughput. While it does depend on all these factors, the way in which the protocol handles the restrictions imposed by the channel and module makes a large difference in the throughput, so it can be considered a tool for evaluating protocols.
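     The arithmetic in the example above (4800 bps, 20 bytes of overhead for every 20 bytes of data) can be written out as a small calculation. The function names are made up for this sketch, and the figure of 10 bits per byte assumes one start bit, eight data bits, and one stop bit on the UART.

```python
def max_throughput_bps(link_bps: float, data_bytes: int, overhead_bytes: int) -> float:
    """Best-case data rate once the protocol overhead is taken out."""
    return link_bps * data_bytes / (data_bytes + overhead_bytes)


def bytes_per_second(bps: float, bits_per_byte: int = 10) -> float:
    """Convert a bit rate to bytes per second for UART framing.

    The default of 10 bits per byte assumes 1 start bit, 8 data bits,
    and 1 stop bit.
    """
    return bps / bits_per_byte


best = max_throughput_bps(4800, data_bytes=20, overhead_bytes=20)
print(best, bytes_per_second(best))  # 2400.0 bps and 240.0 bytes per second
```

     This is only the ceiling set by overhead. Retries and transmit/receive switching delays would lower the real figure further.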
     Bit error rate (BER) is the fraction of bits, on average, which have been corrupted as they travel through a system. This can be a way of evaluating an RF link (for example, with certain antennas, RF modules, and interference conditions, you can expect a certain BER for the data being sent through the modules). Because protocols can correct errors by retransmission or other means, they can reduce the BER. Because of this, for a given protocol and input error rate, you can expect a certain output BER. There is often a tradeoff between BER and overhead and throughput. Better error detection and correction requires more overhead and possibly more time spent in retries, so a lower throughput. Also, reducing the rate at which bits are actually sent by the RF module can allow for reduced bandwidth, which in turn allows a higher signal to noise ratio by reducing noise while maintaining the same signal strength.
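     To get a feel for how the BER of the raw link turns into retries, the sketch below estimates the chance that a packet contains at least one bad bit. It rests on simplifying assumptions that are not from the text: bit errors are independent of each other, every damaged packet is detected and resent, and ACKs are never lost. The function names are hypothetical.

```python
def packet_error_probability(ber: float, packet_bytes: int, bits_per_byte: int = 8) -> float:
    """Chance that at least one bit in the packet is corrupted.

    Assumes independent bit errors, so the packet is good only if
    every one of its bits is good.
    """
    n_bits = packet_bytes * bits_per_byte
    return 1.0 - (1.0 - ber) ** n_bits


def expected_transmissions(ber: float, packet_bytes: int) -> float:
    """Average number of sends per packet if every bad packet is retried."""
    p_bad = packet_error_probability(ber, packet_bytes)
    return 1.0 / (1.0 - p_bad)
```

     Longer packets spread the overhead over more data, but they are also more likely to be hit by an error and resent, which is one form of the tradeoff described above.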
     Latency is a measure of how old the data you are receiving is. This delay is measured from the time that the data source transfers a byte (or whatever minimum unit of information you want) to the protocol software until the destination software makes it available to the user of the data. Latency is caused by many factors including the time it takes for the software to process the packets, the time it takes for the packets to be transferred internally in the source and destination units, and for protocols which need to receive the whole packet before making any of it available to the destination (for example, to check the FCS to see if it was received correctly) it includes the time it takes to transfer the data over the RF link. For general purpose protocols, latency varies widely depending on retries, computational delays, transmitters having to wait for a clear channel, etc. Special purpose protocols can be designed, though, which either minimize the latency or force it to be a constant delay. Again, this is a factor which heavily involves the specifics of the RF modules and link characteristics. However, the design of the protocol can help or hurt in this regard, too.
     Finally, you should consider complexity. Can your microcontroller run the required number of instructions per second to handle everything that the protocol needs to do? Does the protocol keep so many variables and data in memory that the micro doesn't have enough RAM? Your sanity and the ultimate reliability of your code matter, too, so you should remember KISS (Keep It Simple, Stupid) and not use something which is overkill for your application.
Some Examples of Protocols
     The XMODEM family of protocols is probably the most popular and its basic "philosophy" nicely fits the bill as a guide for general purpose wireless communication. The idea behind XMODEM is just to send data in chunks and after each chunk, wait for either an ACK or a NACK before proceeding. If an ACK is received, continue with the next chunk. If a NACK is received, repeat the last one. If nothing is received, wait some time before either resending the last packet or sending some kind of request for ACK. A protocol built on the same basic idea of acknowledgement and retransmission, X.25, is very common in RF communication, both in amateur radio (in its adapted form, AX.25, the "packet radio" protocol) and in commercial data links. Unlike XMODEM, X.25 is also a connected protocol. XMODEM type protocols are simple, have a fairly small overhead, and provide a very robust link where it is very certain that the data will either arrive safely or you will know immediately that it has not. For more information on XMODEM and X.25, see the bibliography.
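     The send-a-chunk-then-wait idea can be sketched as below. This is not the actual XMODEM packet format, only the control flow described above. The callbacks send and wait_for_reply, the reply values, and the retry limit are assumptions, and on a timeout this sketch simply resends rather than sending a request for ACK.

```python
def send_chunks(chunks, send, wait_for_reply, max_retries=5):
    """Send each chunk and wait for an ACK before moving to the next one.

    wait_for_reply() is assumed to return "ACK", "NACK", or None on timeout.
    Returns True if every chunk was ACKed, False if one ran out of retries.
    """
    for chunk in chunks:
        for _ in range(max_retries):
            send(chunk)
            if wait_for_reply() == "ACK":
                break  # this chunk arrived safely; go on to the next
            # NACK or timeout: loop around and resend the same chunk
        else:
            return False  # retries exhausted: report the link as failed
    return True
```

     A real protocol would also number the chunks, so that if an ACK is lost and the chunk is resent, the receiver can recognize the duplicate and discard it.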
     While it is a nice common-sense approach, XMODEM has poor throughput and latency performance. This is due to the extra packet that has to be sent to ACK each data packet. Not only does this mean extra bits going back and forth, but it also requires that the sender and receiver wait long enough for the radios to switch between transmit and receive twice for each packet. For links where there is a high signal to noise ratio and very few stations, and especially for links which are full-duplex (can send and receive at the same time, such as a phone line), streaming protocols (such as ZMODEM) are a tremendous improvement over XMODEM.
     Streaming protocols still split the data up into chunks and place it in packets, but the transmitter sends continuously or in long bursts containing many packets. In many cases, the packets themselves are made longer than they could be for XMODEM, further reducing overhead. No acknowledgements are expected; the receiving end remains totally silent unless an error occurs, in which case it sends a NACK. On half-duplex links, the transmitter must stop occasionally to see if the receiver has received any errors. On full-duplex links, the transmitter can just send all the time and listen on its receive channel.
     When there is little interference and low noise, ZMODEM works much better than XMODEM, and the throughput can approach the full speed of the link. However, on half-duplex links in noisy situations with even a few errors, ZMODEM can bog down very quickly. The reason for this is that so much data is being sent at a time that it is likely that much more data will have to be resent than if smaller packets were used with ACKs in between. Also, the link status is not known until some form of response comes from the receiver, and there can be long periods where the receiving end sends nothing. For these reasons, ZMODEM type protocols are restricted to use on known reliable links. For more information on ZMODEM and other protocols, see the bibliography.
Standard Protocols vs. Rolling Your Own
     One choice you will have to make is whether you should select a standard protocol and implement it completely, or come up with your own protocol, which may be similar to a standard one, but doesn't conform exactly or even closely. On the one hand, it is nice to use standard protocols because all of the contingencies have already been thought out and you know that as long as you implement it correctly, there won't be any surprises such as unrecoverable states or unexplained data loss.
     The disadvantage to this is that standard protocols are often too complex for the typical microcontroller application. Much of X.25, for example, is devoted to handling the various states of connection. While the basic framework is an excellent protocol for simple RF links, the extra baggage adds overhead and code complication. In some circumstances, there may be particular features of your microcontroller or other hardware that you can exploit, too, if you don't use a standard protocol (let's say you just happen to have a 10 bit UART and could fit some of the overhead in the upper bits of each UART word).
     In general, the best strategy is to read about various protocols, see what ideas others have used, and then either use a slightly modified standard or a totally new protocol based on the ideas and concepts you learned in studying what is already out there.