ARC Release 6: The two missing files

[First posted on January 12, 2018, at rgr-cyt.org.]

NARA’s announcement of JFK ARC release 6 stated that 3,539 files were posted on NARA’s website. As I noted in an earlier post (here), I was able to find only 3,537, leaving two unaccounted for. This is the kind of thing that drives me crazy, so I’ve been looking for them for almost a month now.

Although I cannot settle the matter definitively, I now have a suggestion about the discrepancy. In reviewing my figures for my post on NARA’s file replacement (here), I noticed that there are two duplicate listings in release 6. Using the latest NARA spreadsheet of JFK ARC releases as a reference, row 2356 lists document #124-10204-10000, posted as docid-32585200.pdf. Row 3424 then lists the same document, #124-10204-10000, posted as the same file, docid-32585200.pdf. This happens again at rows 3849 and 3944, which both list document #157-10002-10002, posted as docid-32281842.pdf. Perhaps this is why NARA thought that it had posted 3,539 files when it actually posted only 3,537: two of the 3,539 listings point to files already counted.
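The duplicate check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout (row number, record number, file name) and the function name find_duplicate_listings are my own inventions, not the structure of the NARA spreadsheet, and the last sample row is made up for contrast. The arithmetic at the end also assumes that the announced figure of 3,539 was a count of listings rather than of distinct files.

```python
from collections import defaultdict


def find_duplicate_listings(rows):
    """Return {file_name: [row numbers]} for file names listed more than once.

    `rows` is an iterable of (row_number, record_number, file_name) tuples.
    This layout is an assumption, not the actual NARA spreadsheet schema.
    """
    seen = defaultdict(list)
    for row_number, _record_number, file_name in rows:
        seen[file_name].append(row_number)
    return {name: nums for name, nums in seen.items() if len(nums) > 1}


# The four rows cited in the post, plus one hypothetical row for contrast.
rows = [
    (2356, "124-10204-10000", "docid-32585200.pdf"),
    (3424, "124-10204-10000", "docid-32585200.pdf"),
    (3849, "157-10002-10002", "docid-32281842.pdf"),
    (3944, "157-10002-10002", "docid-32281842.pdf"),
    (9999, "000-00000-00000", "docid-00000000.pdf"),  # hypothetical, not a real listing
]

duplicates = find_duplicate_listings(rows)

# Each duplicated file name accounts for (number of listings - 1) extra listings.
extra_listings = sum(len(nums) - 1 for nums in duplicates.values())

# Assumption: the announced 3,539 counted listings, not distinct files.
distinct_files = 3539 - extra_listings

print(duplicates)
print(distinct_files)
```

Keying on the file name rather than the record number is a deliberate choice here: it is the file name that determines whether a second upload would land on top of the first.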

Another possibility is that, in both these cases, NARA actually posted two different files with the same name. The effect would of course be that the second file overwrote the first. This is what happened with the ‘replacement files’ I discussed earlier. In those cases, I was able to see that there were originally two different files because I had downloaded the earlier version of the file before it was overwritten (replaced) by the later version. In the case of 124-10204-10000 and 157-10002-10002, however, the first version of the file would have been overwritten within seconds or minutes, when the second version was uploaded on top of it. I doubt that anyone could have been lucky enough to download the first version in such a short time, so the only ones who would have seen these hypothetical different versions of the two documents are the folks at NARA.

Going back to an earlier issue, I also noted a while ago that there is an earlier set of 9 files with the same problem: these files are listed in the spreadsheet for release 1, then again, with the same record number and the same file name, in the spreadsheet for release 3. I did not discuss these in my post on replacement files because I do not have copies of the first (July) release versions of all of them; I downloaded only part of the release at that time, and therefore missed that opportunity. The release 1 and release 3 duplicate listings admit the same two explanations as the duplicate listings in the release 6 spreadsheet: 1) the second listing could just be a typo; 2) there could have been two versions of these files as well, with the earlier version overwritten by the later one, as happened to the replacement files.

Although I can’t say which explanation applies to any of these “duplicate listings”, I’ve put up a list of them here, and I’ll try writing to NARA after I’ve gone through the remaining issues in release 6.

Postscript

I reviewed the zip files I downloaded immediately after release 1 in July, and found 5 of the 9 PDF files that were posted again in release 3 in November. In all five cases, the release 1 version and the release 3 version are byte-for-byte identical. For these five files at least, this rules out any suspicion that NARA ‘replaced’ an earlier July version with a later, different version in November.
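For anyone who wants to repeat this kind of check, here is a minimal Python sketch of a byte-for-byte comparison. It is not necessarily how I made the comparison above; the function name is_byte_identical and the file names and contents in the demonstration are hypothetical stand-ins, written to a temporary directory so the example runs on its own.

```python
import filecmp
import os
import tempfile


def is_byte_identical(path_a, path_b):
    """Return True only if the two files have exactly the same contents."""
    # shallow=False makes filecmp compare the file contents instead of
    # trusting matching os.stat() signatures (type, size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)


# Demonstration with stand-in files; these are not real NARA downloads.
with tempfile.TemporaryDirectory() as tmp:
    july = os.path.join(tmp, "release1_copy.pdf")       # hypothetical name
    november = os.path.join(tmp, "release3_copy.pdf")   # hypothetical name
    altered = os.path.join(tmp, "altered_copy.pdf")     # hypothetical name

    with open(july, "wb") as handle:
        handle.write(b"%PDF-1.4 sample contents")
    with open(november, "wb") as handle:
        handle.write(b"%PDF-1.4 sample contents")
    with open(altered, "wb") as handle:
        handle.write(b"%PDF-1.4 sample contentz")

    identical = is_byte_identical(july, november)
    differs = is_byte_identical(july, altered)

print(identical)
print(differs)
```

In practice the two paths would point at the copy extracted from the July zip file and the copy downloaded in November.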