In the following article the software author reflects on routing and describes, clearly and vividly, what matters for a well-working node and which peculiarities of the digipeater parameters every sysop should know about. Jimy, DL1GJI, also takes the opportunity to explain the special features of his XNet software with regard to improved routing.

Contents:

  1. L3-Routing
    1. Cost function
    2. Cost determination
    3. Forgetting routes (route timeout)
    4. Cost limit (Count to Infinity)
    5. Split horizon
    6. Poison reverse
    7. Triggered updates
    8. Router panic (broadcast storms)
  2. Improving TNN routing with the above mechanisms
    1. Cost limit (Count to Infinity)
    2. Poison reverse
    3. Triggered updates
    4. Plausibility checks (DL1GJI)
    5. Implementation details
      1. Quality calculation
      2. Shadow table
      3. Multitasking
  3. Round trip estimation
    1. Adaptive L3 lifetime
  4. L4 transport layer
    1. L4 round trip
    2. Dynamic L4 timeout
    3. Adaptive L4 TACK
  5. Literature


Definitions

Hop, link: a reliable point-to-point connection at layer 2. Loops: routing loops that arise from inconsistencies in the routing table.

1. L3-Routing

By routing we mean the search for a path from node A to node B within a network. If there are several paths, we are interested in the best one. But what is the best?

1.1 Cost function

In amateur radio there are two essential criteria by which two paths can be compared:

1.) Response time:

How long does a reply from the remote station take to reach me?

2.) Throughput:

How many bits per second can I transfer over this path?

FlexNet and NetROM routing do not distinguish between these two figures, i.e. it is assumed that good throughput also means a good response time. FlexNet rates the different paths with theoretical transit times (from 0 to 500s). If there are several paths to a node, the path with the shortest total transit time is selected. NetROM routers rate the paths with a so-called quality (from 0 to 255). The total quality is likewise calculated from the qualities of the individual hops, although the calculation is somewhat more complicated. The path with the highest total quality is chosen. The FlexNet transit times are also meaningful to users, whereas the NetROM qualities are a rather abstract figure. The router, however, uses these numbers only internally to compare paths. Suppose there are 3 possible paths to a node and the router picks the one with the minimum transit time or the maximum quality: is the NetROM quality alone sufficient to select the best path? Where a FlexNet route is judged to be best, that route is the one chosen.
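The two selection rules can be put side by side in a short sketch. The article does not spell out the NetROM calculation; the rule used here (multiply the hop qualities and divide by 256) is the conventional NET/ROM one and is an assumption, as are all function names and sample values.

```python
# Hypothetical sketch: pick the best path the FlexNet way (smallest total
# transit time) and the NetROM way (highest total quality).

def flexnet_total_time(hop_times):
    """Total transit time of a path: the sum of the per-hop times."""
    return sum(hop_times)


def netrom_total_quality(hop_qualities):
    """Total quality of a path. Assumed rule: q1 * q2 / 256 per extra hop."""
    quality = hop_qualities[0]
    for q in hop_qualities[1:]:
        quality = quality * q // 256
    return quality


# Invented per-hop values for three candidate paths to the same node.
flexnet_paths = {"A": [30, 40], "B": [120], "C": [10, 10, 10]}
netrom_paths = {"A": [200, 200], "B": [220], "C": [240, 240, 240]}

best_flexnet = min(flexnet_paths, key=lambda p: flexnet_total_time(flexnet_paths[p]))
best_netrom = max(netrom_paths, key=lambda p: netrom_total_quality(netrom_paths[p]))
```

In both cases the router only needs an ordering of the candidates; the absolute numbers matter to it only for that comparison.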

1.2 Cost determination

Routing is such an interesting topic in the packet network because RF is subject to constant fluctuation. Links that are used to capacity run well at one moment and badly at the next, or fail completely. Sometimes nodes are switched off, new nodes appear, and so on. Because nobody has the time or the inclination to reconfigure the network again and again, routing must take place automatically. The data on which routing is based must also be determined automatically, i.e. the transit times or the qualities of the hops. FlexNet and TheNetNode (TNN) each use specially defined round-trip-time (RTT) packets to measure the transit time or the quality of a hop continuously. With both methods this creates the basis for choosing the optimal path among the available routes.

1.3 Forgetting routes (route timeout)

With NetROM, a path to a node must be confirmed again and again (nodes broadcasts). Otherwise NetROM forgets the path after an hour. Only the fixed paths to the direct neighbours are entered permanently and remain. In TheNetNode this forgetting of a path after an hour is still today the only way to get a node out of the network again. However, through "echo effects" it can take up to ten hours until a failed node vanishes completely from the network. A sensible cost limit can mitigate this blemish.
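A minimal sketch of such a route timeout, assuming a simple table that stores the time of the last confirming broadcast per node. The timestamp approach and all names are illustrative simplifications, not the actual NetROM or TNN mechanism.

```python
ROUTE_TIMEOUT_S = 3600  # one hour, as described above


def confirm(routes, node, now):
    """Record that a nodes broadcast confirmed this node at time `now`."""
    routes[node] = now


def purge_expired(routes, now):
    """Forget every node not confirmed within the timeout; return their names."""
    expired = [n for n, seen in routes.items() if now - seen >= ROUTE_TIMEOUT_S]
    for n in expired:
        del routes[n]
    return expired


routes = {}
confirm(routes, "DB0GOE", now=0)
confirm(routes, "DB0GSH", now=0)
confirm(routes, "DB0GOE", now=3000)          # DB0GOE is heard again
forgotten = purge_expired(routes, now=3600)  # DB0GSH was silent for an hour
```

The echo effect arises because a neighbour that still holds the stale entry may re-confirm it in its own broadcast, which restarts the timer elsewhere in the network.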

1.4 Cost limit (Count to Infinity)

Which nodes should routing take into account at all? Always all nodes worldwide, as they appear in the destination or nodes list? If the connect alone already takes 10 minutes, the existence of that node is no longer of interest. Therefore there is a cost limit up to which node information is relayed in the network. With FlexNet, this limit is a total transit time of 500 milliseconds. With TNN, it is the minimum quality (MinQuality) parameter. This cost limit has another important function, however: if a node disappears from the network, it should be purged from the node list immediately... in theory, yes. In practice it is not so: one observes that the node remains in the network while merely its quality sinks or its transit time rises. Only when the node crosses the cost limit does it finally become invisible. Put differently: without a cost limit, large numbers of zombie nodes with absurd transit times or tiny qualities would populate the routing tables of the network.

1.5 Split horizon

The principle of split horizon means that node information received from one part of the network is not returned into that part, but is relayed only to the other parts. Routing information is thus always transported "forward". In tree-like network topologies this principle is enough to keep the routing information free of loops. The amateur radio network, however, has become heavily meshed, and loop freedom cannot be reached with this principle alone. Poison reverse is a method that solves the problem better.

1.6 Poison reverse

The principle of poison reverse starts from similar considerations as split horizon. Whereas with split horizon the routing information only goes "forward", poison reverse also sends the information back, but negated. In NetROM terms: in the forward direction the positive qualities are reported; in the backward direction the quality is sent as 0.
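The difference between the two principles shows up when a broadcast for one particular neighbour is assembled. The following sketch assumes a routing table that maps each destination to its best neighbour and quality; the function name and the table layout are hypothetical.

```python
def build_broadcast(table, neighbour, poison_reverse=False):
    """Build the destination -> quality list to send to `neighbour`.

    table maps destination -> (best_neighbour, quality).
    Split horizon: routes learned via `neighbour` are left out.
    Poison reverse: those routes are sent back with quality 0 instead.
    """
    broadcast = {}
    for dest, (via, quality) in table.items():
        if via != neighbour:
            broadcast[dest] = quality   # "forward" direction
        elif poison_reverse:
            broadcast[dest] = 0         # "backward" direction, negated
    return broadcast


table = {"DB0GOE": ("HB9AK", 199), "DB0GSH": ("DB0LBG-7", 200)}
split = build_broadcast(table, "HB9AK")
poisoned = build_broadcast(table, "HB9AK", poison_reverse=True)
```

With split horizon the neighbour simply hears nothing about the route it supplied; with poison reverse it is told explicitly that the route is unusable in this direction.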

1.7 Triggered updates

The idea is clear: the faster routing information is relayed, the more quickly the network responds. Changes in the routing table should be passed on as quickly as possible. However, the next problem arises immediately if this method is used too often: broadcast storms.

1.8 Router panic (broadcast storms)

If routers transmit updated data too often and too quickly, "broadcast storms" occur. If an important link in the network fails, all nodes spend seconds doing nothing but passing on updated routes. These updates must therefore be limited. One possibility is to allow updates at most once per fixed interval, on the order of seconds to a minute.
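One way to limit triggered updates is to collect changes and release them at most once per minimum interval. The sketch below is an assumption about how such a limiter could look; the class, its methods and the 60 second interval are invented for illustration.

```python
class TriggeredUpdater:
    """Collects route changes and releases them at most once per interval."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self.pending = {}      # node -> latest quality still to be announced
        self.last_sent = None  # time of the last triggered broadcast

    def route_changed(self, node, quality):
        # Later changes to the same node overwrite earlier ones.
        self.pending[node] = quality

    def poll(self, now):
        """Return the batch to broadcast now, or None if nothing may be sent."""
        if not self.pending:
            return None
        if self.last_sent is not None and now - self.last_sent < self.min_interval_s:
            return None
        batch, self.pending = self.pending, {}
        self.last_sent = now
        return batch


updater = TriggeredUpdater(min_interval_s=60)
updater.route_changed("DB0GSH", 0)
first = updater.poll(now=0)         # sent immediately
updater.route_changed("DB0GOE", 150)
updater.route_changed("DB0GOE", 120)
too_early = updater.poll(now=10)    # held back, interval not yet over
second = updater.poll(now=60)       # only the latest value goes out
```

Batching has the side effect that a quality that flaps several times within the interval produces a single update instead of several.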

2. Improving TNN routing with the above mechanisms

The following describes how the general concepts above are implemented in the XNet node software. Compatibility with the existing NetROM protocol remains 100%.

2.1 Cost limit (Count to Infinity)

XNet guarantees that the quality of a node sinks by at least 5 points per hop. The normal quality calculation, which lowers the quality further, is carried out as well. If, for example, MinQual is set to 100 everywhere, one can guarantee that a node is not propagated further than 31 hops: 5 * 31 = 155. If a node fails, this drains the energy from the "echo effect" and the node drops below MinQuality more quickly.
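The arithmetic behind the 31 hops can be checked in a few lines, assuming the quality starts at the maximum of 255 and loses exactly the guaranteed minimum of 5 points per hop. The function name is hypothetical.

```python
MAX_QUALITY = 255      # upper end of the NetROM quality range
MIN_DROP_PER_HOP = 5   # guaranteed minimum decrease per hop in XNet


def max_hops(min_quality, start_quality=MAX_QUALITY, drop=MIN_DROP_PER_HOP):
    """Upper bound on the hops a node can be propagated before its
    quality has fallen from start_quality down to min_quality."""
    return (start_quality - min_quality) // drop


hops = max_hops(100)   # (255 - 100) // 5
```

Since the normal quality calculation lowers the value further, real propagation distances will usually be shorter than this bound.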

2.2 Poison reverse

This principle can be realised ideally with the existing NetROM protocol. However, this is done not at the transmitting station as described above, but at the receiving station.

Example: the XNet digi DB0SIG receives the following NetROM routing broadcast from its neighbour HB9AK:

= >monitor -u +4 
4:fm HB9AK to DB0SIGS via DB0BAX * ctl UI ^ pid CF 
BC SARTG :HB9AK signature: FF [238] 
GOE :DB0GOE via DB0EAM 199 
GS :DB0GSH via DB0SIG 178 

DB0SIG recognises that HB9AK routes to the node DB0GSH via DB0SIG. Conversely, DB0SIG now sets the quality of DB0GSH via HB9AK to 0:

= >r d 
DB0GOE HB9AK 199 
DB0GSH DB0LBG-7 200 
DB0GSH HB9AK 0 
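The receiver-side rule from this example can be sketched as follows. The table layout and function name are assumptions, and the usual adjustment of received qualities is left out so that the numbers match the listing above.

```python
MYCALL = "DB0SIG"


def apply_broadcast(routes, neighbour, entries, mycall=MYCALL):
    """Store a neighbour's broadcast in `routes`.

    entries: list of (destination, best neighbour of the sender, quality).
    If the sender reaches a destination via us, the route back through the
    sender is stored with quality 0 (receiver-side poison reverse).
    """
    for dest, via, quality in entries:
        if via == mycall:
            quality = 0
        routes[(dest, neighbour)] = quality
    return routes


# State of DB0SIG before and after the broadcast shown above.
routes = {("DB0GSH", "DB0LBG-7"): 200}
apply_broadcast(routes, "HB9AK", [
    ("DB0GOE", "DB0EAM", 199),
    ("DB0GSH", "DB0SIG", 178),
])
```

Because the decision is taken by the receiver, the sender needs no change to its broadcast format, which is what keeps the scheme compatible with plain NetROM.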

2.3 Triggered updates

Every 10 minutes XNet broadcasts the entire routing table. In between, changes in qualities that are above the configured MinQual are passed on to the network, as are nodes that can no longer be reached or have suddenly become bad. The event-controlled broadcast is in principle distributed to all nodes in the same form as the NetROM broadcasts.

2.4 Plausibility checks (DL1GJI)

A form of smart routing is also incorporated. If node A broadcasts that it reaches node B with a higher quality than XNet itself measures to node B over the direct path, XNet reacts by sending out the correct quality of node B with the help of the event-controlled update, thereby correcting the excessive quality. At present TNN simply discards this information, although it is precisely with this plausibility check that loops can be recognised within the NetROM protocol. XNet reports this case to the sysop in its online log as follows:

= >log 
Log-Messages 
Router: Quality of DB0EAM (220) from HB9AK via me (219) too high 
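A hedged sketch of such a check, based on one reading of the log line: the first number is taken to be the quality advertised by the neighbour for a route via this node, the second the quality this node itself holds. The function name and signature are hypothetical.

```python
def plausibility_check(node, neighbour, advertised_quality, own_quality):
    """Return a log message if a neighbour claims a better quality to `node`
    via us than we have ourselves, otherwise None."""
    if advertised_quality > own_quality:
        return ("Router: Quality of %s (%d) from %s via me (%d) too high"
                % (node, advertised_quality, neighbour, own_quality))
    return None


message = plausibility_check("DB0EAM", "HB9AK", 220, 219)
no_message = plausibility_check("DB0EAM", "HB9AK", 200, 219)
```

A route that passes through this node cannot plausibly be better than the node's own route, so a violation hints at stale or looping information.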

2.5 Implementation details

XNet works with a routing table sorted by callsign, which essentially contains the callsign of each node and a reference to the current best link. To forward an L3 packet, only a binary search for the destination callsign with a cost of log N is needed to determine on which link the packet continues its journey. TNN, by contrast, walks through the complete nodes list each time, n/2 entries on average.
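The lookup can be illustrated with Python's standard bisect module. The two parallel lists are an assumed layout for the sketch, not XNet's actual data structure.

```python
from bisect import bisect_left


def find_link(calls, links, dest):
    """Binary search in the callsign-sorted table: O(log N) comparisons.

    calls: sorted list of node callsigns
    links: best link for the callsign at the same index
    """
    i = bisect_left(calls, dest)
    if i < len(calls) and calls[i] == dest:
        return links[i]
    return None  # destination unknown


calls = ["DB0EAM", "DB0GOE", "DB0GSH", "HB9AK"]     # must stay sorted
links = ["HB9AK", "HB9AK", "DB0LBG-7", "HB9AK"]
link = find_link(calls, links, "DB0GSH")
```

For a table of 1000 nodes this needs about 10 comparisons per packet, against roughly 500 for a linear scan.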

2.5.1 Quality calculation

At present TNN works like this: the link quality is measured by the L3RTT, and the result is written into the routing table. Problem: if the link now deteriorates, we have wrong information for 5 to 10 minutes. If the link becomes so bad that even the broadcasts no longer get through, this wrong information persists for up to an hour (route timeout).

With XNet: the link quality is measured by the L3RTT. The received broadcasts, however, are stored and updated in their original form. Every 20 seconds a new routing table is calculated in the background from the link qualities and the stored broadcasts. If a link is lost, the qualities of all nodes reached over that link drop to 0 immediately, and routing to nodes over that link stops. If these nodes can be reached over alternative routes, the best of those is chosen; if no route is available, they drop out of the routing table. When the link is restored, the same happens in the other direction: the qualities of the nodes rise again and routing is as before. It is not necessary to wait for a new broadcast, because the old broadcast may still be in memory, for up to 1 h.
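The recalculation from stored broadcasts can be sketched like this. The combination rule (link quality times advertised quality, divided by 256) is the conventional NET/ROM one and is assumed here; names and sample values are invented.

```python
def rebuild_table(link_quality, broadcasts, min_quality=0):
    """Recompute the routing table from scratch.

    link_quality: neighbour -> measured link quality (0 means link lost)
    broadcasts:   neighbour -> {destination: advertised quality}, kept as received
    Returns destination -> (best neighbour, resulting quality).
    """
    table = {}
    for neighbour, entries in broadcasts.items():
        lq = link_quality.get(neighbour, 0)
        for dest, advertised in entries.items():
            q = lq * advertised // 256      # assumed NET/ROM-style rule
            if q > min_quality and q > table.get(dest, (None, 0))[1]:
                table[dest] = (neighbour, q)
    return table


broadcasts = {
    "HB9AK": {"DB0GOE": 199, "DB0GSH": 150},
    "DB0LBG-7": {"DB0GSH": 200},
}
normal = rebuild_table({"HB9AK": 240, "DB0LBG-7": 200}, broadcasts)
link_lost = rebuild_table({"HB9AK": 240, "DB0LBG-7": 0}, broadcasts)
restored = rebuild_table({"HB9AK": 240, "DB0LBG-7": 200}, broadcasts)
```

Because the broadcasts are kept unmodified, a link that comes back restores the old routes at the next recalculation without waiting for fresh broadcasts.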

2.5.2 Shadow table

The routing table is produced by a background process so that the digi is not halted while the table is being updated. XNet works with two tables: the active routing table and a shadow table for the new calculation. After each new calculation the tables are exchanged by swapping 2 pointers. Normal processing and the updating of the routing table consequently run in parallel.

By comparing the node qualities in the old and the new table, one can determine the differences and thus obtain the list of nodes for the event-controlled triggered updates. This happens without halting other processes within the digi, which makes the operation transparent.
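A small sketch of the double-table idea, with Python references standing in for the two pointers. Class and method names are hypothetical, and the sketch is single-threaded; a real implementation would have to protect the swap against concurrent readers.

```python
class Router:
    """Keeps an active routing table and a shadow table for recalculation."""

    def __init__(self):
        self.active = {}   # node -> quality, used for forwarding
        self.shadow = {}   # scratch table for the next calculation

    def recalculate(self, new_entries):
        """Fill the shadow table, diff it against the active one, then swap.

        Returns node -> quality for every change; vanished nodes get 0.
        """
        self.shadow.clear()
        self.shadow.update(new_entries)   # only the shadow is touched here
        changed = {n: q for n, q in self.shadow.items()
                   if self.active.get(n) != q}
        changed.update({n: 0 for n in self.active if n not in self.shadow})
        self.active, self.shadow = self.shadow, self.active   # swap references
        return changed


router = Router()
initial = router.recalculate({"DB0GOE": 186, "DB0GSH": 156})
worse = router.recalculate({"DB0GOE": 186, "DB0GSH": 140})
gone = router.recalculate({"DB0GOE": 186})
```

The returned differences are exactly the entries a triggered update would need to announce.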

2.5.3 Multitasking

Each layer is assigned one or several processes. The processes are synchronised through semaphores and coordination counters. High-priority processes, for example the HDLC process or the timer control, run continuously without being stopped; all other processes are activated ("triggered") on request and stopped again when finished. The "process switch" figure in the XNet statistics indicates how often per second the HDLC process is run through. This figure is comparable to the "Rounds per Second" in TNN.

Value               | Now   | Min   | Max   |

process switch [Hz] | 6033  | 2675  | 6235  | - |

The "timer accuracy" figure

timer accuracy      | 60000 | 60000 | 60015 | - |

shows how accurately the internal software timers work. In the above case the maximum deviation amounted to 15 ms in 7 days of operation. The DOS versions of XNet often show considerably higher values, which are caused by hard disk accesses.

3. Round trip estimation

The round-trip measurement from the KA9Q kernel is implemented in most packet programs today. Newer implementations such as XNet consider not only the mean value but also the scatter of the values around the mean. The estimate for strongly fluctuating RTTs is considerably higher than the estimate for nearly constant RTTs. In practice this procedure adapts much better to the real conditions on the frequency.
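An estimator of this kind can be sketched as a smoothed mean plus a multiple of the smoothed deviation. The gains and the factor 4 follow the scheme commonly used for TCP and are assumptions here; the article does not state XNet's constants.

```python
class RttEstimator:
    """Smoothed round-trip estimate that also tracks the scatter."""

    def __init__(self, first_sample, gain=0.125, dev_gain=0.25, k=4.0):
        self.mean = float(first_sample)   # smoothed mean RTT
        self.dev = 0.0                    # smoothed mean deviation
        self.gain = gain                  # assumed smoothing constants
        self.dev_gain = dev_gain
        self.k = k

    def update(self, sample):
        err = sample - self.mean
        self.mean += self.gain * err
        self.dev += self.dev_gain * (abs(err) - self.dev)

    def estimate(self):
        """Mean plus k times the deviation: grows with fluctuating RTTs."""
        return self.mean + self.k * self.dev


steady = RttEstimator(3000)
jittery = RttEstimator(3000)
for i in range(20):
    steady.update(3000)                          # constant RTT
    jittery.update(1000 if i % 2 == 0 else 5000) # same average, high scatter
```

Both links have the same average RTT, yet the jittery one ends up with a clearly higher estimate, which is the behaviour described above.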

3.1 Adaptive L3 lifetime

Precisely when the routing is momentarily wrong and loops arise, high values of the L3 lifetime cause an enormous network load, because the packets circulate through several nodes at maximum speed. The default value of 30 hops is already very high. XNet calculates the lifetime from the simple formula

lifetime = min(l3lifetimeparameter, (280 - node quality)/5)

so that most connects are made with a lifetime < 30.
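One reading of the lifetime formula, assuming its two arguments are the configured L3 lifetime parameter and (280 - node quality)/5, and assuming integer division. The function name is hypothetical; the default of 30 is the standard value mentioned above.

```python
def l3_lifetime(node_quality, l3_lifetime_parameter=30):
    """Lifetime (hop budget) for a packet to a node of the given quality.

    Good nodes are close, so they get a small lifetime; the configured
    parameter only acts as an upper limit.
    """
    return min(l3_lifetime_parameter, (280 - node_quality) // 5)


near = l3_lifetime(255)   # best possible quality
mid = l3_lifetime(200)
far = l3_lifetime(100)    # capped by the parameter
```

Under this reading a packet caught in a loop on the way to a nearby node is discarded after a handful of hops instead of 30.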

4. L4 transport layer

4.1 L4 round trip

Exactly as in L2, the round-trip time (RTT) can be measured on L4. This time does not have to be determined anew for each connect; it is stored as an attribute of the respective destination node and updated with each connect to that node. Once the digi has been running for some time, it gradually learns the times to the various destinations.

= >r n 
3NETNF DL1GWX-7S 6 1616 
CHARLY DL1GWX-11 6 3612 
Lille F8KOT 3 50000 <= default RTT 
ZHGATE HB9AB-10 5 3560 
SARTG HB9AK 5 6096 
TITLIS HB9AK-14 5 3255 

Nodes for which no L4 RTT exists yet can be recognised by the default value. With this knowledge the L4 timeout can be determined dynamically:

4.2 Dynamic L4 timeout

The L4 timeout is similar to the FRACK timer in L2. With TNN, the L4 timeout is fixed by the sysop for all L4 connections. This timeout then applies regardless of whether an L4 connection goes just to the next neighbour or whether 20 hops lie in between. With the help of the L4 RTT, the L4 timeout can be determined dynamically.
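A rough sketch of a timeout derived from the stored per-destination RTT. The article does not give XNet's formula; the multiplier of 2 and the function name are pure assumptions. The RTT values are taken from the listing above, in the units used there.

```python
DEFAULT_RTT = 50000   # default RTT shown in the listing above


def l4_timeout(l4_rtt, dest, factor=2):
    """Timeout for a destination: a multiple of its stored L4 RTT.

    Destinations without a measurement fall back to the default RTT.
    `factor` is a hypothetical safety margin.
    """
    return factor * l4_rtt.get(dest, DEFAULT_RTT)


l4_rtt = {"HB9AK": 6096, "HB9AK-14": 3255}   # learned from earlier connects
known = l4_timeout(l4_rtt, "HB9AK")
unknown = l4_timeout(l4_rtt, "F8KOT")        # no RTT measured yet
```

A nearby destination thus gets a short timeout, while a distant or unmeasured one is given considerably more time before a retry.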

4.3 Adaptive L4 TACK

TACK is the L4 counterpart of the T2 timer in L2. It essentially determines the throughput of a transfer. With a transport send window of 10 (the usual value) and a TACK value of 4 seconds, the maximum attainable L4 throughput is given by the formula:

BITRATE = (TWINDOW * PACLEN * 8) / TACK
        = (10 * 236 * 8) / 4
        = 4720 bit/s

This bit rate can be increased by reducing TACK, for example to 2 s. Exactly as on L2, however, TACK should be chosen so that an L4ACK PDU is not generated for every single received L4 PDU (PDU = Protocol Data Unit = L4 "packet"), since that would burden the network. At least for 1200 baud a value of 2 seconds is already too small, because the transmission time of a maximum-size L4 PDU over a 1200 baud L2 link already amounts to 2.280 s.
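The throughput formula is easy to try out for different TACK values. The function below simply restates it; the parameter names mirror the formula, and the defaults are the values used in the text.

```python
def l4_max_bitrate(twindow=10, paclen=236, tack_s=4):
    """Upper bound on L4 throughput in bit/s: one full window per TACK."""
    return twindow * paclen * 8 / tack_s


with_tack_4 = l4_max_bitrate(tack_s=4)
with_tack_3 = l4_max_bitrate(tack_s=3)
with_tack_2 = l4_max_bitrate(tack_s=2)
```

Halving TACK doubles the bound, which is why the choice of TACK matters so much on fast links, while on 1200 baud the link itself remains the limit.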

For TNN, ON5ZS has proposed these improvements and has uploaded the source code to the mailboxes.

XNet uses 3 s as the default TACK time. In order not to restrict the maximum possible throughput, XNet sends an L4ACK immediately once at least 5 L4 PDUs have been received and processed but not yet acknowledged. "Processed" means that the receiving process (for example the host-mode program) has actually taken over the data. If the receiver works slowly, no early acknowledgement occurs either.
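The acknowledgement rule can be sketched as a small decision function. The values 3 s and 5 PDUs come from the text; the function name, its parameters and the exact timer handling are assumptions.

```python
def should_ack_now(unacked_processed, seconds_since_first_unacked,
                   tack_s=3.0, ack_threshold=5):
    """Decide whether an L4ACK should be sent at this moment.

    unacked_processed: PDUs already taken over by the receiving process
                       but not yet acknowledged
    An ACK goes out early once the threshold is reached, otherwise only
    when the TACK time has elapsed.
    """
    if unacked_processed == 0:
        return False
    return (unacked_processed >= ack_threshold
            or seconds_since_first_unacked >= tack_s)


early = should_ack_now(5, 0.2)    # threshold reached, ACK at once
waiting = should_ack_now(2, 1.0)  # neither condition met yet
timed = should_ack_now(2, 3.0)    # TACK expired
```

Counting only processed PDUs means a slow receiver automatically delays the early acknowledgement and thereby slows the sender down.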

5. Literature

Virtually all the concepts discussed are taken from the following two books. Exception: 2.4 Plausibility checks, which are my own (DL1GJI).
Internetworking with TCP/IP, Volume I; Internetworking with TCP/IP, Volume II. Authors: Douglas E. Comer and David L. Stevens. Publisher: Prentice Hall.

JIMY, DL1GJI, 


Suggestions and comments will be accepted via e-mail to:

Jimy Scherer, DL1GJI:

JScherer@t-online.de

Peter Stirnimann, HB9PAE:

mailto:hb9pae@swissonline.ch

  HB9PAE 06.04.99
English Translations performed by Brian N1URO and Jacques ON4KJP