0528 Software checksumming in the IMP and network reliability. J.M.McQuillan. June 1973. (Format: TXT=23152 bytes) (Status: UNKNOWN)

Network Working Group                                      J.  McQuillan
Request for Comments: 528                                        BBN-NET
NIC: 17164                                                  20 June 1973

        SOFTWARE CHECKSUMMING IN THE IMP AND NETWORK RELIABILITY

   As the ARPA Network has developed over the last few years, and our
   experience with operating the IMP subnetwork has grown, the issue of
   reliability has assumed greater importance and greater complexity.
   This note describes some modifications that have recently been made
   to the IMP and TIP programs in this regard.  These changes are
   mechanically minor and do not affect Host operation at all, but they
   are logically noteworthy, and for this reason we have explained the
   workings of the new IMP and TIP programs in some detail.  Host
   personnel are advised to note particularly the modifications
   described in sections 4 and 5, as they may wish to change their own
   programs or operating procedures.

1. A Changing View of Network Reliability

   Our idea of the Network has evolved as the Network itself has grown.
   Initially, it was thought that the only components in the network
   design that were prone to errors were the communications circuits,
   and the modem interfaces in the IMPs are equipped with a CRC checksum
   to detect "almost all" such errors.  The rest of the system,
   including Host interfaces, IMP processors, memories, and interfaces,
   was considered to be error-free.  We have had to re-evaluate
   this position in the light of our experience.  In operating the
   network we are faced with the problem of having to perform remote
   diagnosis on failures which cannot easily be classified or
   understood.  Some examples of such problems include reports from Host
   personnel of lost RFNMs and lost Host-Host protocol allocate
   messages, inexplicable behavior in the IMP of a transient nature,
   and, finally, the problem of crashes -- the total failure of an IMP,
   perhaps affecting adjacent IMPs.  These circumstances are infrequent
   and are therefore difficult to correlate with other failures or with
   particular attempted remedies.  Indeed, it is often impossible to
   distinguish a software failure from a hardware failure.

   In post-mortem analysis of crashes, we have sometimes found that
   IMP program instructions were incorrect, sometimes with just one or two
   bits picked or dropped.  Clearly, memory errors can account for
   almost any failure, not only program crashes but also data errors
   which can lead to many other syndromes.  For instance, if the address
   of a message is changed in transit, then one Host thinks the message
   was lost, and another Host may receive an extra message.  Errors of
   this kind fall into two general classes: errors in Host messages,
   whether in the control information or the data, and errors in inter-
   IMP messages, primarily routing update messages.  In the course of
   the last few years, it has become increasingly clear that such errors
   were occurring, though it was difficult to speculate as to where,
   why, and how often.

   One of the earliest problems of this kind was discovered in 1971.
   The Harvard IMP was sometimes crashing in an unknown manner so that
   all the other IMPs were affected.  It was finally determined that its
   memory was faulty and sometimes the routing messages read out from
   memory by the modem output interfaces were all zeroes.  The adjacent
   IMPs interpreted such an erroneous message as stating that the
   Harvard IMP had zero delay to all destinations -- that it was the
   best route to everywhere! Once this information propagated to the
   other IMPs, the whole network was in a shambles.  The solution to
   this problem was to generate a software checksum for each routing
   message before it was sent from one IMP, and to check it after it was
   received at the other IMP.  This software checksum, in addition to
   the hardware checksum of the circuit, checks the modem interfaces and
   memories at each IMP, and protects the IMPs from erroneous routing
   information.  The overhead in computing these checksums is not great
   since the messages are only exchanged every 2/3 of a second.
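
   The effect of the all-zero routing message can be seen in a small
   model.  The Python sketch below is not the IMP routing code; it
   assumes a simplified rule in which an IMP picks, for each
   destination, the neighbor whose advertised delay plus line delay is
   smallest.  The function name, the neighbor names, and the delay
   values are invented for illustration.

```python
def best_next_hops(reports, link_delay):
    """Pick, per destination, the neighbor with the lowest total delay.

    reports:    {neighbor: [advertised delay to destination 0, 1, ...]}
    link_delay: {neighbor: delay of the line to that neighbor}
    """
    n_dest = len(next(iter(reports.values())))
    table = []
    for dest in range(n_dest):
        table.append(
            min(reports, key=lambda nb: reports[nb][dest] + link_delay[nb])
        )
    return table


link_delay = {"A": 1, "B": 1, "HARV": 1}
good = {"A": [2, 5, 9, 4], "B": [6, 3, 2, 7], "HARV": [4, 4, 5, 3]}
print(best_next_hops(good, link_delay))  # ['A', 'B', 'B', 'HARV']

# Faulty memory hands the modem interface a routing message of all zeroes.
bad = dict(good, HARV=[0, 0, 0, 0])
print(best_next_hops(bad, link_delay))   # ['HARV', 'HARV', 'HARV', 'HARV']
```

   In this model a single bad message is enough to make the faulty IMP
   the preferred route to every destination, which is why the routing
   messages were given their own software checksum.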

   In the first few months of 1973, we began to have a great deal of
   trouble with the reliability of some IMPs, especially those in the
   Washington area.  The normal procedures of calling in and working
   with Honeywell field engineers had not cleared up several of these
   persistent failures, and it was felt that an escalation of BBN
   involvement was needed to identify the exact causes of the problems.
   Therefore, during much of February and March there were one or more
   members of the staff at various sites in the network where hardware
   problems were suspected.  The first thing we found out was that the
   operational IMP program did not give enough diagnostic information
   about failures when they occurred, and that the available test
   programs did not detect errors frequently enough to justify their
   use.  That is, the errors were appearing at rather low frequency,
   from once every few hours to once every few days, compared to message
   rates of once a second or faster.  Therefore, we decided to try to
   make the operational IMP program run when it could, and report more
   information about detected hardware errors, rather than keep the
   failing IMPs off the network for days at a time.

   Modifications to the IMP program had two independent goals: we wanted
   to make the software less vulnerable to hardware failures, and we
   wanted the software to isolate the failures and report them to the
   NCC.  The technique we chose to use was generating a software
   checksum on all packets as they are sent out over a line.  We
   suspected that the hardware failures in the Washington area were
   happening between IMPs, that is, the packets were correct before they
   were sent.  Thus, a memory-to-memory software checksum, similar to
   the technique installed two years before for routing messages only,
   should be able to detect these errors.  On March 13, a new version of
   the IMP program was released with software checksum code.  In this
   program, when a packet is found to have an incorrect checksum it is
   discarded, and a copy of the data is sent to the NCC.  The previous
   IMP retransmits the packet, since an acknowledgment is not returned.
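
   The discard-and-retransmit behavior described above can be sketched
   as follows.  This is an illustrative Python model, not the IMP code:
   the plain 16-bit additive checksum, the function names, the retry
   limit, and the simulated line that drops one bit on its first
   transfer are all assumptions.

```python
MASK = 0xFFFF  # assumed 16-bit words


def checksum(words):
    # Stand-in additive checksum; an assumption made for this sketch.
    return sum(words) & MASK


def receive(words, check, ncc_log):
    """Receiving IMP: accept and acknowledge, or discard and report."""
    if checksum(words) != check:
        ncc_log.append(list(words))  # a copy of the bad data goes to the NCC
        return False                 # packet discarded, no acknowledgment
    return True                      # acknowledgment returned


def send(words, line, ncc_log, max_tries=5):
    """Sending IMP: keep its copy and retransmit until acknowledged."""
    check = checksum(words)
    for attempt in range(1, max_tries + 1):
        if receive(line(list(words)), check, ncc_log):
            return attempt
    raise RuntimeError("no acknowledgment received")


def flaky_line():
    """Hypothetical modem interface that drops a bit on the first transfer."""
    state = {"transfers": 0}

    def line(words):
        state["transfers"] += 1
        if state["transfers"] == 1:
            words[2] &= 0xFFEF       # drop one bit of the third word
        return words

    return line


ncc_log = []
packet = [0x1234, 0x00FF, 0x0F10, 0x8001]
print(send(packet, flaky_line(), ncc_log))  # 2
print(len(ncc_log))                         # 1
```

   The second transmission succeeds, and the NCC log holds one copy of
   the damaged packet for diagnosis.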

   A partial list of the hardware problems that were uncovered by
   software checksums, and subsequently fixed, includes:

      *  One modem interface at the Aberdeen IMP dropped several bits
         from several successive words in transferring data into memory.

      *  One modem interface at the Belvoir IMP picked one or two bits
         in a single word in transferring data into memory.

      *  One modem interface at the ETAC TIP dropped the first word in
         transferring data out of memory.

      *  A region in memory at the Utah IMP changed the low order two
         bits in some words on an irregular basis.

   Each of these problems resulted in two or three detected errors per
   day.  There were other problems that were not detected by the
   software checksum, such as dropped interrupts.  This set of problems
   may be explained by the electronics of the high-speed DMC on 316
   IMPs.  The first three machines cited above are 316 IMPs with 3 modem
   interfaces, and they are the only such machines in the network.  The
   third interface is in a separate drawer and the total bus length
   seems to be too long for the driving electronics in the original
   design.  We are presently investigating various ways to fix these
   problems, and have had some success already.

2. An End-to-End Software Checksum on Packets

   This last experience, and the earlier checksum on routing messages,
   proved the value of a software checksum on all inter-IMP
   transmissions.  We have decided to extend the checksum to detect
   intra-IMP failures as well, and make software checksums on all
   network transmissions a permanent feature of the IMP system.  We can
   obtain an end-to-end software checksum on packets, without any time
   gaps, as follows:

          +--------+        +--------+        +---------+
          |  IMP  2|--------|3 IMP  4|--------|5  IMP   |
          |   1    |        |        |        |    6    |
          +---|----+        +--------+        +----|----+
              |                                    |
          +---|----+                          +----|----+
          |        |                          |         |
          |  Host  |                          |  Host   |
          +--------+                          +---------+

      *  A checksum is computed at the source IMP for each packet as it
         is received from the source Host. (interface 1)

      *  The checksum is verified at each intermediate IMP as it is
         received over the circuit from the previous IMP. (interfaces 3
         and 5)

      *  If the checksum is in error, the packet is discarded, and the
         previous IMP retransmits the packet when it does not receive an
         acknowledgment. (interfaces 2 and 4)

      *  The previous IMP does not verify the checksum before the
         original transmission, to cut the number of checks in half.
         But when it must retransmit a packet it does verify the
         checksum.  If it finds an error, it has detected an intra-IMP
         failure, and the packet is lost.  If not, then the first
         transmission was lost due to an inter-IMP failure, a circuit
         error, or was simply refused by the adjacent IMP.  The previous
         IMP holds a good copy of the packet, which it then retransmits.
         (interfaces 2 and 4)

      *  After the packet has successfully traversed several
         intermediate IMPs, it arrives at the destination IMP.  The
         checksum is verified just before the packet is sent to the
         Host. (interface 6)

   This technique provides a checksum from the source IMP to the
   destination IMP on each packet, with no gaps in time when the packet
   is unchecked.  Any errors are reported to the NCC in full, with a
   copy of the packet in question.  This method answers both
   requirements stated above: it makes the IMPs more reliable and
   fault-tolerant, and it provides a maximum of diagnostic information
   for use in fault isolation.  This expanded checksum logic was
   installed in the network on June 19.
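
   The check made before a retransmission is what separates intra-IMP
   failures from inter-IMP failures.  A minimal Python sketch of that
   decision follows; the checksum function, the names, and the returned
   strings are assumptions for illustration, not the IMP program.

```python
MASK = 0xFFFF  # assumed 16-bit words


def checksum(words):
    # Sum of the words plus the word count, truncated to 16 bits (assumed).
    return (sum(words) + len(words)) & MASK


def before_retransmit(stored_words, stored_check):
    """Decide what an unacknowledged packet tells the sending IMP."""
    if checksum(stored_words) != stored_check:
        # The stored copy itself is damaged: the fault is inside this IMP
        # and the packet is lost.
        return "intra-IMP failure"
    # The stored copy is good: the first transmission was lost on the way
    # or refused, so the good copy is sent again.
    return "retransmit"


words = [0x0102, 0x0304, 0x0506]
check = checksum(words)           # computed when the packet entered the IMP
print(before_retransmit(words, check))    # retransmit

damaged = list(words)
damaged[1] |= 0x4000              # a bit picked while the packet sat in memory
print(before_retransmit(damaged, check))  # intra-IMP failure
```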

   One of the major questions about such approaches is their efficiency.
   We have been able to include the software checksum on all packets
   without greatly increasing the processing overhead in the IMP.  The
   method described above involves one checksum calculation at each IMP
   through which a packet travels.  We developed a very fast checksum
   technique, which takes only 2 microseconds per word.  The program
   computes the number of words in a packet and then jumps to the
   appropriate entry in a chain of add instructions.  This produces a
   simple sum of
   the words in the packet, to which the number of words in the packet
   is added to detect missing or extra words of zero.  With the
   inclusion of this code, the effective processor bandwidth of a 516
   IMP is reduced by one-eighth for full-length store-and-forward
   packets, from a megabit per second to 875 kilobits per second.  That
   is, the IMP now has the processing capability to connect to 17 full
   duplex 50 kilobit per second lines, as compared to 20 such lines
   without the checksum program.  We are aware that this add checksum is
   not a very good one in terms of its error-detecting capabilities, but
   it is as much as the IMP can afford to do in software.  Furthermore,
   we emphasize that the primary goal of this modification is to assist
   in the remote diagnosis of intermittent hardware failures.
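
   The checksum and the bandwidth figures above can be reproduced with
   a short Python sketch.  It assumes 16-bit words and that overflow is
   simply truncated; the note does not say how the IMP handles carries,
   and the function name is invented for illustration.

```python
MASK = 0xFFFF  # assumed 16-bit words, overflow truncated


def add_checksum(words):
    """Simple sum of the words, plus the number of words."""
    return (sum(words) + len(words)) & MASK


packet = [0x0000, 0x1234, 0xABCD]

# A plain sum cannot see an extra (or missing) word of zero ...
print((sum(packet) & MASK) == (sum(packet + [0]) & MASK))       # True
# ... but adding the word count makes the two packets differ.
print(add_checksum(packet) == add_checksum(packet + [0]))       # False

# Weakness of an additive checksum: word order is not protected.
print(add_checksum([0x1234, 0xABCD]) == add_checksum([0xABCD, 0x1234]))  # True

# Processor bandwidth arithmetic from the text.
base_kbps = 1000                       # one megabit per second
with_checksum_kbps = base_kbps * 7 // 8
print(with_checksum_kbps)              # 875
print(base_kbps // 50)                 # 20 lines of 50 kilobits per second
print(with_checksum_kbps // 50)        # 17 lines of 50 kilobits per second
```

   The word count is what lets the checksum catch the all-zero case;
   the third comparison shows one class of error it still misses.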

3. Checksumming to Improve the Reliability of Routing

   We mentioned earlier the catastrophic effects that follow for the
   Network as a whole when a single IMP begins to propagate incorrect
   routing information.  The experience described above involved a
   specific memory failure which has not recurred in the last two years,
   but the problem is easily understood to be of a general nature.  In
   fact, we recently had another network-wide failure that was traced to
   a hardware error that resulted in erroneous routing messages, after
   we had installed a software checksum on all inter-IMP transmissions.
   The problem we had was due to a single broken instruction in the
   part of the IMP program that builds the routing message.  As a
   result, the routing messages from that IMP were random data, and the
   neighboring IMPs interpreted these messages as routing update
   information.  When this happened, traffic flow through the Network
   was completely disrupted and no useful work could be done until the
   failed IMP was halted.

   This kind of problem, the introduction of incorrect routing
   information into the Network, can happen in three ways:

      *  The routing message is changed in transmission.  The inter-IMP
         checksum should catch this.  The bad routing messages we saw in
         the Network had good checksums.

      *  The routing message is changed as it is constructed, say by a
         memory or processor failure, or before it is transmitted.  This
         is what we termed above an intra-IMP failure.

      *  The routing program is incorrect for hardware or software
         reasons.

   We have attempted to solve the last two kinds of problems by
   extending the concept of software checksums.  The routing program has
   been modified to build a software checksum for the routing message as
   it builds the message, just as if it came from a Host.  It is
   important that this checksum refer to the intended contents of the
   routing message, not the actual contents.  That is, the program which
   generates the routing message builds its own software checksum as it
   proceeds, not by reading what has been stored in the routing message
   area, but by adding up the intended contents for each entry as it
   computes them.  The process which sends out routing messages then
   always verifies the checksum before transmitting them.  This scheme
   should detect all intra-IMP failures.
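
   The distinction between intended and actual contents can be made
   concrete.  In the Python sketch below, which is an illustration and
   not the IMP routing program, the checksum is accumulated from the
   values the program means to write, while the message area is filled
   through a store function that may misbehave.  All names and the
   simulated memory fault are assumptions.

```python
MASK = 0xFFFF  # assumed 16-bit words


def build_routing_message(delays, store):
    """Fill the message area and checksum the intended contents."""
    area = [0] * len(delays)
    check = len(delays)
    for i, delay in enumerate(delays):
        check = (check + delay) & MASK  # add the value we intend to write
        store(area, i, delay)           # the write itself may go wrong
    return area, check


def ok_to_send(area, check):
    """Sending process: verify the stored message before transmitting."""
    return ((sum(area) + len(area)) & MASK) == check


def good_store(area, i, value):
    area[i] = value


def faulty_store(area, i, value):
    # Hypothetical memory fault: entry 2 is always stored as zero.
    area[i] = 0 if i == 2 else value


delays = [3, 7, 2, 9]
print(ok_to_send(*build_routing_message(delays, good_store)))    # True
print(ok_to_send(*build_routing_message(delays, faulty_store)))  # False
```

   Had the checksum been computed by reading the message area back, the
   faulty entry would have been folded into the checksum and the bad
   message would have verified correctly.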

   Finally, the routing program itself can be checksummed to detect any
   changes in the code.  The programs which copy in received routing
   messages, compute new routing tables, and send out routing messages
   each calculate the checksum of the code before executing it.  If the
   program finds a discrepancy in the checksum of the program it is
   about to run, it immediately requests a program reload from an
   adjacent IMP.  These checksums include the checksum computation
   itself, the routing program and any constants referenced.  This
   modification should prevent a hardware failure at one IMP from
   affecting the Network at large by stopping the IMP before it does any
   damage in terms of spreading bad routing.  A version of the IMP
   program with this added protection for routing was released on May
   22.
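
   The idea of checksumming code before executing it can be sketched in
   Python as follows.  The word values, the function names, and the
   reload callback are hypothetical; the sketch only shows the order of
   operations described above.

```python
MASK = 0xFFFF  # assumed 16-bit words


def code_checksum(words):
    return (sum(words) + len(words)) & MASK


def guarded_run(code_words, expected, routine, request_reload):
    """Run a routine only if its code still matches the stored checksum."""
    if code_checksum(code_words) != expected:
        request_reload()   # ask an adjacent IMP for a fresh copy
        return None        # and do not run the damaged code
    return routine()


code = [0o102030, 0o004567, 0o120001]   # stand-in for the routing code
expected = code_checksum(code)          # recorded when the code was loaded
reloads = []


def reload():
    reloads.append("reload requested")


print(guarded_run(code, expected, lambda: "routing updated", reload))
# routing updated

code[1] ^= 0o000400                     # a single bit changes in memory
print(guarded_run(code, expected, lambda: "routing updated", reload))
# None
print(reloads)                          # ['reload requested']
```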

   In the first few months of 1973, there have been several other
   efforts aimed at improving the reliability of the Network, in
   addition to software checksumming in the IMPs.  At the same time that
   we were discovering inter-IMP failures with the software checksum
   packets, we began to notice a different kind of problem with intra-
   IMP failures.  In these cases we were primarily faced with memory
   problems, and they often affected the IMP program itself, rather than
   the packets flowing through the IMP.  Our first attack on this
   problem was to build a PDP-1 program to verify the running IMP and
   TIP programs at a site against the correct core images held at the
   PDP-1.  The program interrogates the IMP with DDT messages, and
   prints out a list of discrepancies.  Using this program, we have
   already found memory failures at one site.
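
   The comparison performed by such a verification program can be
   sketched as below.  This is a Python illustration, not the PDP-1
   program: read_word stands in for interrogating the running IMP one
   word at a time, and the word values are invented.

```python
def find_discrepancies(reference, read_word):
    """Compare a running image, word by word, against the correct image."""
    found = []
    for addr, want in enumerate(reference):
        got = read_word(addr)
        if got != want:
            found.append((addr, want, got))
    return found


reference = [0o010000, 0o020001, 0o030002, 0o040003]
running = list(reference)
running[2] ^= 0o000004          # one bit picked in the running copy

for addr, want, got in find_discrepancies(reference, running.__getitem__):
    print(f"{addr:06o}: expected {want:06o}, found {got:06o}")
# 000002: expected 030002, found 030006
```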

4. TIP Modifications

   The hardware difficulties which we began to experience during the
   first few months of 1973 had two effects on Host-to-Host
   communication.  First, the intermittent modem interface failures, of
   the type seen at Belvoir, Aberdeen, and ETAC, meant that messages
   were occasionally lost by the network.  This loss is reported to the
   transmitting Host by the "Incomplete Transmission" message generated
   by the source IMP; the Host must then decide whether to retransmit or
   to take some other action.  Second, the higher than normal incidence
   of machine failures meant that the network sometimes "partitioned" so
   that there was no path between the two communicating Hosts. (It
   should be noted that, contrary to the original design, two sites are
   currently connected to the network by only a single path; other
   similar connections are planned.  For any such sites, any failure
   along the single path will be seen as a partition.) Since a TIP acts
   as a Host for its users, its resilience when these types of failures
   occur has a major effect on user satisfaction.

   Prior to this time the TIP program "aborted" the user's connection if
   it received an Incomplete Transmission indication from the IMP
   program.  In March the TIP program (and the programs of several other
   Hosts) was changed to retransmit messages for which the Incomplete
   Transmission indication was returned; some Hosts (e.g. Multics) have
   done this from the start.  This modification has turned out to be
   relatively simple, and we urge other Hosts to consider implementing
   some sort of error recovery software.  On the other hand, it has not
   seemed reasonable to continue attempting to transmit when the program
   receives a "Destination Unreachable" indication, since this could
   arise either from a network partition or from a failure at the
   destination site.  The interactive user is, of course, free to try
   again manually.
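
   A Host considering this kind of error recovery might structure it
   along the following lines.  The Python sketch is hypothetical: the
   status strings, the retry limit, and the transmit callback are
   assumptions, and only the policy (retransmit after Incomplete
   Transmission, stop after Destination Unreachable) comes from the
   text.

```python
def send_with_recovery(message, transmit, max_retries=3):
    """Retransmit after Incomplete Transmission; stop if unreachable."""
    for _ in range(1 + max_retries):
        status = transmit(message)
        if status == "RFNM":
            return "delivered"
        if status == "DESTINATION_UNREACHABLE":
            return "abandoned"          # left to the user to retry manually
        # Otherwise: Incomplete Transmission, so try again.
    return "retries exhausted"


def scripted(statuses):
    """Hypothetical network that answers with a fixed list of statuses."""
    replies = iter(statuses)
    calls = []

    def transmit(message):
        calls.append(message)
        return next(replies)

    return transmit, calls


transmit, calls = scripted(["INCOMPLETE_TRANSMISSION", "RFNM"])
print(send_with_recovery("msg", transmit), len(calls))  # delivered 2

transmit, calls = scripted(["DESTINATION_UNREACHABLE", "RFNM"])
print(send_with_recovery("msg", transmit), len(calls))  # abandoned 1
```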

   A different situation pertains to tape transfers involving TIPs with
   the magnetic tape option.  In these cases, the user would like to
   start the process and then ignore it until the transfer is finished.
   Network partitions, even if infrequent, may occur when tape transfers
   many hours in length are in progress.  Therefore, we made a
   significant modification to the TIP magnetic tape option to include a
   sequencing mechanism in the tape transfer protocol which permits
   automatic recovery and transmission continuation after most kinds of
   network transients.  With this mechanism in effect, and assuming a
   tape is mounted at the "other end", the complete transfer of a tape
   is possible with a single command given at either end.  If the
   connection goes dead in mid-transfer, the TIP magnetic tape software
   will attempt to reopen the connection until successful and then
   continue the transfer from where it was left off.  In addition to
   modifying the TIP magnetic tape option as specified above, we also
   modified the TENEX program which is able to communicate with the TIP
   magnetic tape option so that it remained compatible.  These changes
   were installed in April.
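
   The note does not specify the sequencing mechanism itself.  As one
   possible reading, the Python sketch below assumes numbered tape
   records and a receiver that reports the next record it expects, so
   that a reopened connection can resume where the transfer left off.
   Every name in it is hypothetical.

```python
class Receiver:
    """Receiving end: accepts records strictly in sequence."""

    def __init__(self):
        self.records = []

    def next_expected(self):
        return len(self.records)

    def accept(self, seq, record):
        if seq == self.next_expected():
            self.records.append(record)
        # Anything else is a duplicate or out of order and is ignored.


def transfer(tape, receiver, connection_up):
    """Send the whole tape, reopening the connection when it goes dead."""
    reopens = 0
    seq = receiver.next_expected()
    while seq < len(tape):
        if not connection_up():
            reopens += 1
            seq = receiver.next_expected()  # resume from the receiver's view
            continue
        receiver.accept(seq, tape[seq])
        seq += 1
    return reopens


tape = ["r0", "r1", "r2", "r3", "r4"]
# Simulated network: the connection goes dead twice during the transfer.
states = iter([True, True, False, True, False, True])
receiver = Receiver()
print(transfer(tape, receiver, lambda: next(states, True)))  # 2
print(receiver.records == tape)                              # True
```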

5. Future Plans

   We have been considering some of the issues of network reliability
   discussed above in connection with the development of the new High
   Speed Modular IMP.  This design effort and the experiences with the
   current IMP system are, of course, linked together, and we have
   already decided on several approaches to be taken in the new line of
   IMPs:

      *  The IMP will have a hardware CRC checksum generator which
         returns the checksum on a specified range of memory.

      *  The IMP will use this facility to generate and check an end-
         to-end checksum on messages.  This checksum will therefore be
         more comprehensive and better for error detection than the
         current software checksum.  It will ensure a high degree of
         reliability for Host transmissions.

      *  In addition, the IMP will perform a verification of a packet
         checksum at each hop to provide diagnostic information.  This
         check will be on an optional basis, whenever the system has
         available resources for the check.

      *  The code for the new IMP system will be read-only (this is
         impractical for the present 516 and 316 IMPs), and the program
         will periodically checksum itself using the hardware CRC
         generator.  We hope to design the program so that it can be
         reloaded in segments in the event of a detected error in the
         code, with no service interruption.

      *  Finally, we are looking into the structure of an optional IMP-
         Host/Host-IMP checksum to complete a Host/Host end-to-end
         checksum.  Under such an arrangement, the IMP and Host could
         agree to verify the checksums on the messages transferred over
         the interface between them, and the appropriate signalling
         mechanisms would be provided to handle errors.  With this
         technique in effect, two Hosts could be certain that their
         messages were delivered error-free or else they would be
         notified of an error, and could then retransmit their message
         if desired.

         More details on any such modifications to the IMP and to the
         IMP-Host interface will be published when appropriate.
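
   One reason a CRC is better for error detection than the additive
   checksum can be shown in a few lines of Python.  The sketch uses the
   16-bit CRC-CCITT provided by the standard binascii module as a
   stand-in; the note does not say which CRC polynomial the new
   hardware will use, and the word values are invented.

```python
import binascii

MASK = 0xFFFF  # assumed 16-bit words


def add_checksum(words):
    """Additive software checksum: sum of the words plus the word count."""
    return (sum(words) + len(words)) & MASK


def crc16(words):
    """CRC-CCITT over the words, most significant byte first."""
    data = b"".join(w.to_bytes(2, "big") for w in words)
    return binascii.crc_hqx(data, 0)


original = [0x1234, 0xABCD, 0x0042]
swapped = [0xABCD, 0x1234, 0x0042]   # first two words exchanged

print(add_checksum(original) == add_checksum(swapped))  # True
print(crc16(original) == crc16(swapped))                # False
```

   The additive checksum is blind to the exchange of two words, while
   the CRC detects it.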

             [This RFC was put into machine readable form for entry]
               [into the online RFC archives by Via Genie 12/1999]
