NARA 21, part 1: A new version of the JFK database

[Updated 7/15/2021]

As I noted in my last post, there is a new version of the JFK Assassination Records Collection database now available at NARA (here). Six Excel files were posted online, apparently in mid-June. These files represent a new version of the ARC database with important information on the current state of the JFK Collection. I will refer to this new version as NARA 21, and the six files as the Excel version of NARA 21.

Today’s note looks at the scope of NARA 21 and gives some general figures for withheld and redacted documents in the ARC, as described in the new table. Look for at least two more notes in the days ahead.

I look at document totals today because the number of withheld and redacted documents has been the subject of much discussion and rumor, both on-line and in print. Later notes will look at a more detailed picture of the reasons listed for withholding, and will compare redactions and withholding data in NARA 21 with data from earlier sources.

Overall, NARA 21 removes much of the confusion over this issue, and the staff of NARA deserve hearty thanks for their diligent work in accounting for the JFK Collection’s status in thoroughgoing detail.

The JFK database, AKA ACRS

For those new to the blog, the JFK database was established by NARA all the way back in 1992, pursuant to the mandate of the JFK Act, the law that created the JFK assassination collection. It contains standard metadata for each record in the JFK ARC. This database is usually available on-line in a public-use version called the Assassination Collection Reference System (ACRS), but since October 2020 it has been down for maintenance. In its place NARA first substituted an Excel file made from a very old backup of the JFK database. [Note 1: I have a couple of notes about all this which I am too lazy to link to today.] The Excel version we are looking at today replaces that.

Scope of NARA 21

In contrast with the old backup file, NARA 21 is a complete copy of the ACRS as we originally knew it online. The old backup file had only 150,000 rows, fewer than half of the ACRS listing. The six Excel files now on-line consist of 319,106 rows of data. This is the same as the version of the ACRS available at the Mary Ferrell website (I call this MF15 to keep track of the different versions). [Note 2: See here for a discussion of where MF15 came from and the correct total of rows in the ACRS.] We are therefore dealing with basically the same record set as in MF15.

I should probably confess at this point that I also scraped my own version of the ACRS about two years ago. With the exception of some technical glitches, I believe the data I got was identical to the data in MF15. When I talk about “my copy of ACRS” below, this is what I’m referring to.

As I have discussed elsewhere, 319,106 is not the actual number of records in the ARC. There are at least 373 rows for non-existent records, as a result of miscellaneous counting, inputting, and revision errors. Much more important, in my opinion, there are thousands of records that do NOT appear in the public-use ACRS. [Note 3: I will cite my numerous notes on this later.] This was mostly due to problems with a few of the thousands of floppy disks that dozens of government agencies used to provide the complex metadata for their assassination-related records. None of the “missing” records on these faulty disks have been added to this version of the JFK database.

Does this affect our conclusions on the state of withholding and redaction in the ARC? Yes, it does. As an example, I give the NSA records in the ARC. These are not in the JFK database. We know from the ARRB’s report and their notices in the Federal Register back in 1995-1998 that there are 346 NSA records in the Collection. We also know from the 2017-2018 releases that many of these are still redacted. Because these have not been added to the JFK database, however, we do not know the current status of these records.

Overall changes in NARA 21

I will not go through detailed comparisons of all the data fields in NARA 21 with my copy of ACRS. In general, there has not been a major revision of basic data, such as title, doc date, doc pages, etc. There remains work to be done here; the metadata for the HSCA records, for example, are still littered with errors and inconsistencies.

The major change has been in four key database fields: classification, restrictions, current status, and date of last review. These four fields incorporate the work of the last four years, since NARA began preparing the collection for general release in 2017 as mandated by the JFK Act.

The new “current status”

The key field for the questions we are looking at today is “current status”. The following is a summary of “current status” for my copy of the ACRS, circa 2019:

status                     record total
OPEN                       273289
RELEASED WITH DELETIONS    34927
POSTPONED IN FULL          9718
[empty value]              799

Compare this to “current status” in NARA 21:

status                     record total
Release                    301981
Redact                     16283
Withhold                   602
Pending                    227
Declassify                 13

For those not yet used to the terms used here, “postponed in full” usually refers to documents that are withheld completely. Only limited metadata is available for these, such as who originated the document, how many pages it has, etc. This is equivalent to “withhold” in NARA 21. “Released with deletions” usually means that some text was withheld from the public, anywhere from two letters to multiple pages. This is equivalent to “redact” in NARA 21. “Open” means open in full, nothing withheld. This is equivalent to “release” in NARA 21.
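For readers who want to check the arithmetic, here is a minimal Python sketch that lines up the two tallies using the equivalences just described. The figures are copied from the two tables above; the variable names are mine and purely illustrative. Since the 2019 counts were already out of date, the differences are indicative only.

```python
# Status tallies copied from the two tables above.
acrs_2019 = {
    "OPEN": 273289,
    "RELEASED WITH DELETIONS": 34927,
    "POSTPONED IN FULL": 9718,
    "[empty value]": 799,
}
nara_21 = {
    "Release": 301981,
    "Redact": 16283,
    "Withhold": 602,
    "Pending": 227,
    "Declassify": 13,
}

# Old term -> NARA 21 term, following the equivalences given in the text.
# "[empty value]", "Pending" and "Declassify" have no counterpart and are left out.
equivalent = {
    "OPEN": "Release",
    "RELEASED WITH DELETIONS": "Redact",
    "POSTPONED IN FULL": "Withhold",
}

# Change in each category (NARA 21 count minus ACRS copy count).
changes = {old: nara_21[new] - acrs_2019[old] for old, new in equivalent.items()}

# Row totals for each version of the table.
total_acrs = sum(acrs_2019.values())
total_nara = sum(nara_21.values())

for old, new in equivalent.items():
    print(f"{old} -> {new}: {changes[old]:+d}")
print("ACRS copy total:", total_acrs)
print("NARA 21 total:", total_nara)
```

Run as written, this shows 28,692 more records open in full, 18,644 fewer redacted, and 9,116 fewer withheld. The NARA 21 rows sum to 319,106, matching the row count given earlier, while my ACRS copy sums to 318,733. The gap of 373 happens to equal the number of rows for non-existent records mentioned above, though the sketch cannot show whether that is the explanation.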

Note, however, that some documents were created with material withheld. There are lots of these, especially in the HSCA records, and these are counted as “open”. I will post a note on these documents later this summer.

“Empty value” means that the “current status” field was blank for this many records. There are no records with “current status” blank in NARA 21. “Pending” and “declassify” are new terms in NARA 21 that I will explain in my next note.
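A tally like the ones above can be produced in a few lines of Python. This is only a sketch of the general approach, not a description of how NARA or I actually generated the figures: the function name, the column name "current status" as a dictionary key, and the sample rows are all hypothetical stand-ins for rows exported from the spreadsheets.

```python
from collections import Counter

def tally_status(rows, field="current status"):
    """Count rows per status value, grouping blank values under '[empty value]'."""
    counts = Counter()
    for row in rows:
        # Treat a missing key, None, or whitespace-only text as blank.
        value = (row.get(field) or "").strip()
        counts[value if value else "[empty value]"] += 1
    return counts

# Hypothetical sample rows; real rows would come from the exported spreadsheets.
sample = [
    {"record number": "hypothetical-1", "current status": "OPEN"},
    {"record number": "hypothetical-2", "current status": "OPEN"},
    {"record number": "hypothetical-3", "current status": "POSTPONED IN FULL"},
    {"record number": "hypothetical-4", "current status": ""},
]

counts = tally_status(sample)
for status, total in counts.most_common():
    print(status, total)
```

Blank values are counted under their own label rather than dropped, so the category totals always add up to the number of rows read.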

Obviously, my ACRS copy did not reflect any of the releases from 2017-2018. Instead it reflected the same status as MF15, which was scraped in 2015. I believe that these numbers were already quite inaccurate back then. [Note 4: I looked at some of these numbers earlier and found many, many cases where documents with a current status of “postponed in full” were actually available at Mary Ferrell or elsewhere online.]

Despite the inaccurate base counts, however, it is clear that in 2017-2018 NARA released the large majority of withheld material. There is no way that we will see anything like that volume of releases ever again. The withheld cupboard is almost bare.