From eec4fadcf0de65c13356cde369e2ae0eeda92e97 Mon Sep 17 00:00:00 2001
From: Wes McKinney
Date: Mon, 15 May 2017 18:04:06 -0400
Subject: [PATCH] Clarify that the IPC file footer contains an additional copy of the schema

Change-Id: I8ade726555c2d84ef33e7c9063100422909374f0
---
 format/IPC.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/format/IPC.md b/format/IPC.md
index f0a67e29218..bf2aaa74b3b 100644
--- a/format/IPC.md
+++ b/format/IPC.md
@@ -83,9 +83,11 @@ as an `int32` or simply closing the stream interface.
 We define a "file format" supporting random access in a very similar format to
 the streaming format. The file starts and ends with a magic string `ARROW1`
 (plus padding). What follows in the file is identical to the stream format. At
-the end of the file, we write a *footer* including offsets and sizes for each
-of the data blocks in the file, so that random access is possible. See
-[format/File.fbs][1] for the precise details of the file footer.
+the end of the file, we write a *footer* containing a redundant copy of the
+schema (which is a part of the streaming format) plus memory offsets and sizes
+for each of the data blocks in the file. This enables random access to any
+record batch in the file. See [format/File.fbs][1] for the precise details of
+the file footer.
 
 Schematically we have:
 