How to organize archival materials

delores · August 4, 2018

I've just started a month-long trip to the archives, my very first one! After my first day there, I have a camera filled with pictures of archival documents, and now I'm wondering how best to organize and keep track of them.

What I did was download the pictures to my computer, name them all after the document name (Caja XXX, Vol XX, Ex. XX #1, #2, #3 etc) and put them in separate folders, but I'm already sensing that this is not a great system--for one thing, it took me forever to individually name and order all the pictures; also, the document name, while necessary information, is not very descriptive and doesn't let me see easily what's what.

Does anyone have a better way of dealing with this? I'd love your advice! I want to get a good system in place while I'm still at the beginning stages of research.

Thanks for your help!

TsarandProphet · August 4, 2018

Welcome to the archives! I organize them in folders (Fond XXX, Opis XXX, Folder XXX), and each picture's filename is just the number of the photo within its folder. I use an Excel file to keep track of descriptions.
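A minimal sketch of that numbering-plus-spreadsheet approach, assuming the camera's filenames sort in shooting order. The function names, the four-digit padding, the CSV columns, and the folder label are all made up for illustration; this only plans the renames and builds the index text, it does not touch any files.

```python
import csv
import io
from pathlib import PurePath


def plan_renames(filenames, width=4):
    """Pair each camera filename with a sequential number, keeping its extension."""
    plan = []
    for number, name in enumerate(sorted(filenames), start=1):
        suffix = PurePath(name).suffix.lower()
        plan.append((name, f"{number:0{width}d}{suffix}"))
    return plan


def index_csv(plan, folder_label):
    """Return CSV text with one row per photo and an empty description column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["folder", "photo", "camera_file", "description"])
    for old_name, new_name in plan:
        writer.writerow([folder_label, new_name, old_name, ""])
    return buffer.getvalue()


# Hypothetical camera files and folder label
plan = plan_renames(["IMG_0412.JPG", "IMG_0410.JPG", "IMG_0411.JPG"])
print(index_csv(plan, "Fond 12 / Opis 3 / Folder 45"))
```

The CSV opens directly in Excel, so the description column can be filled in later while the numbering stays fixed.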

Sigaba · August 4, 2018

I recommend that during your trip, you save the images to folders by location and/or date and/or archive.

In each folder, have a readme.txt file in which you put basic information about the images in the folder. This information should include guidance on how to cite the materials in your writing.
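One possible shape for such a readme.txt, sketched as a small Python helper. The field names, the archive name, and the citation pattern are placeholders rather than any archive's required format.

```python
from datetime import date


def readme_text(archive, collection, visited, cite_as, notes=""):
    """Build the text of a per-folder readme; the field names are illustrative."""
    lines = [
        f"Archive: {archive}",
        f"Collection: {collection}",
        f"Visited: {visited.isoformat()}",
        f"Cite as: {cite_as}",
        f"Notes: {notes}",
    ]
    return "\n".join(lines) + "\n"


text = readme_text(
    archive="Example National Archive",  # hypothetical name
    collection="Caja 12, Vol 3",  # placeholder numbers
    visited=date(2018, 8, 4),
    cite_as="Example National Archive, Caja 12, Vol 3, Ex. [number]",
    notes="Photos 0001-0042; mostly correspondence.",
)
print(text)
```

Saving the result is one line with pathlib, e.g. `Path(folder, "readme.txt").write_text(text, encoding="utf-8")`.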

Later, use a program like Adobe Acrobat to combine the individual photos into PDF files. So if a letter you photographed had two pages and you took two images, you'd have a two-page PDF.

Use the OCR function to turn the PDFs into searchable documents, and use File > Properties > Description to add metadata to each file.

Download and install a desktop search application like Copernic Desktop Search to index your files.

IF the OCR function does its job and/or you're diligent in adding metadata, the PDFs can be stored in folders with fewer and fewer subfolders.

The upfront costs of this recommendation will be the price of Acrobat (which may be discounted as you're a student), the price of the desktop search application (unless you're satisfied with a free application), the time to run the OCR, and the time to tag the files.

For me, the benefit is this: on my laptop I have 58.5 k indexed items. It takes me less than 5 seconds to find the 74 files I need with the right search words. (The time of this search is without tagging PDFs with metadata.)

gsc · August 4, 2018

How you organize your archive is really a personal thing- one person's standby is another person's headache - and there's a lot of trial and error before you come up with something that you like. It's really a topic that should be taught more in grad school, especially as technology makes it possible to acquire ever more documents and images at ever more archives and repositories, physical and digital.

1) IME, you want to do as much of the front-end work in the archives as you can. The more "work" you have to do with your photographs after you take them, the less likely it is that you'll use them. If you have 500 individual JPEGs, each one will need a label; at a certain point you'll want to combine them into individual PDFs, label the PDFs, fill in the metadata, and so on. You need to do this work while it's all at least a little fresh in your head, but after you've spent 7 hours in the archives taking the pictures, spending another 7 hours managing the pictures is the literal last thing you want to do.

I solve this problem in two ways. First, I take my photos using ScannerPro (works on tablets and phones, although a tablet is better), which allows me to combine images into PDF files within the app itself. Instead of having 50 separate images from 1 folder, or even 1 big PDF document with 50 images inside it, I can make 10 PDFs with 3 or 4 images each. Each PDF represents a document: a two-page letter, a 10-page memo, etc. Second, I label each PDF as I make it; I use a wireless keyboard to type the labels faster. On my first pass through, these labels are kind of general, e.g., "1953.09.14 Correspondence btwn Creelman and Tennant." When I'm working with the documents later, I update that into something a little more useful: "1953.09.14 Creelman invites Tennant to nursing committee, Tennant reply pos" which tells me a little bit more. I label all documents with the same year.month.day date prefix so that it's easy to organize them chronologically later. So instead of having 150 images at the end of the day, which are useless to me without more work, I have 40 documents that I can play with as soon as I get them off my iPad and onto Dropbox. For more useful suggestions on archive photography: 1, 2

If you aren't doing this already, make sure that every image you take has the citation information in the image. Some archives give you handy citation slips for this; for the others, just take a sheet of paper and label it with the box and folder number, and use it like a running header.
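The reason date-first labels like "1953.09.14 ..." are handy is that a plain alphabetical sort is also a chronological sort, as long as year, month, and day are zero-padded in that order. A quick sketch; `make_label` is a made-up helper, and the second and third labels are invented examples.

```python
from datetime import date


def make_label(doc_date, description):
    """Prefix a description with a zero-padded year.month.day date."""
    return f"{doc_date:%Y.%m.%d} {description}"


labels = [
    make_label(date(1953, 9, 14), "Correspondence btwn Creelman and Tennant"),
    make_label(date(1952, 11, 3), "Committee memo"),  # invented example
    make_label(date(1953, 2, 27), "Draft agenda"),  # invented example
]

# Sorting the strings puts the documents in date order.
for label in sorted(labels):
    print(label)
```

A day-first format (14.09.1953) would not sort this way, which is the practical argument for putting the year first.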

2) I think that life gets a lot easier if you can work with all of your documents within the same application, or, if you are feeling extra organized, a database. I find that putting my documents in folders straight on my hard drive, as suggested above, makes it difficult for me to see everything at once, or to make connections between documents that didn't come from the same archive. This article explains that predicament in more detail. So, software. I put every one of my PDFs into Devonthink Pro Office, which is very powerful database software for Macs only. It comes with OCR already installed, so everything gets converted into searchable PDFs and eventually tagged and annotated. See here for some evangelizing on DTPO, what it can do for you, and strategies for organizing documents within the database: 1, 2, 3, 4.

But there are many kinds of software you can use- off the top of my head, there's Evernote, Zotero, Tropy, DTPO, Filemaker Pro, etc. Tropy was designed by historians and it's also free and open source, if you don't want to plunk down money for DTPO or Filemaker. The point is that as you go on more and more research trips, you create an archive about your topic. Software not only helps you organize this archive but also can help you think with and through it.

Hope this helps. Happy archiving!

Edited August 4, 2018 by gsc

AfricanusCrowther · August 5, 2018

How do those who work on handwritten sources or obscure foreign languages (or both, for me) work with the lack of OCR?

Edited August 5, 2018 by AfricanusCrowther

historygeek · August 5, 2018

For hard copies, I have this huge binder for my thesis notes. I divided it using tab dividers: secondary sources, newspapers, newsletters, pictures, oral histories, etc. Then I organize them chronologically. I do have a sheet in the front pocket of my binder to tell me what I have and where.

For digital copies, I have a folder, and then have a similar system. Different folders for each type, then organize chronologically. I also have an Excel sheet as a directory.

delores · August 5, 2018

Thanks everyone for this advice! Can't wait to get back to the archive on Monday...

dr. t · August 6, 2018

17 hours ago, AfricanusCrowther said:

How do those who work on handwritten sources or obscure foreign languages (or both, for me) work with the lack of OCR?

By reading what's written on the page and transcribing the important pieces of it.

In this digital age, we come to the idea that because it's easy to create vast amounts of digital information, particularly images, when visiting an archive, we should. Really, images should either be 1) a last resort because you've got two days left or 2) capturing a document of greater than average utility for deeper later study. But in either case, you should have a sense of what is in the document before you decide to photograph it.

By reading your documents, you gain a better grasp on what you need to transcribe, and to be honest the first 3-4 days of your archival dig will be more establishing this baseline than necessarily collecting all the information you want. You need to balance your desire to transcribe less with the fact that you don't actually know if what you're omitting might be useful in the future.

So, for example, my archive notes look like this:

Arch. de la Haute-Marne 5 H 8, unlabeled folder, piece 5

- Hugo comes mettensis confirms whatever La Crête has apud medium vicum and concedes any rights he might have there, testes Cono de Malbere, Robertus de Wirrise, Coruynus de uualemen, Symon de Lonwit, Arardus de risne, Garnerius iuuvens de sampinei, Ricardus de parnei, N.D.

Note how I'm switching between the Latin whenever I want to preserve particular terms, and back to English for summary. The result is an accurate summary of the material, but not an accurate transcription (I have a notation to mark literal transcriptions).

pudewen · August 6, 2018

In some ways, one of the greatest gifts of my main archive's policy (it lets you receive digital facsimiles of only 20 documents per research trip, defined as one calendar year, and forbids all camera use) was that it forced me to read everything I collected in order to transcribe it. I'm not claiming that I especially enjoyed spending 8 hours a day for much of a year typing out transcriptions of Chinese and Manchu documents. But it meant that when I was done, I had at least read everything that I had collected. It meant that I was more selective (so a much higher percentage of my documents were useful to my project, even though I had fewer documents). And it meant that, compared to colleagues who also do work on sources that don't lend themselves to OCR (but who were allowed to photograph whatever they wanted), I had a much easier time finding stuff in my docs.

I imagine I'll be less happy about this policy if I get to the point of having an academic job and a family that make it much, much harder to spend a year living at archives in Beijing (and will wish I could just show up for five days, photograph a bajillion things and leave). But as a grad student, it was really a blessing in disguise. So thanks PRC government for your terrible policies of making archives difficult to use.

gsc · August 6, 2018

I agree. Being able to photograph at will has a lot of positives- you can get through material very quickly if you are pushed for time, and with OCR, you essentially come out with a digitized, searchable archive for your own use. But more information doesn't lead to more insight or more analysis. It's just more raw data that you, the researcher, will need to manage and work with.

I definitely take more photos than I need to. In part to feel reassured- that if I need to go back and look at this particular set of letters, I can. I try to head off the inevitable avalanche of data by taking careful notes (what I actually see myself using, and what I am photographing for background information and to head off my own anxiety), labeling according to a system, highlighting the most useful documents, creating an internal index so I know what each collection of images contains, etc. The more information you gather, the more robust a system of information management you need.

Edited August 6, 2018 by gsc

TMP · August 6, 2018

Agreed with all the above. However, I've found ScannerPro's OCR not too great for my typewritten and cable notes from the mid-20th century. So I do end up having to read and highlight when I upload the documents to Dropbox and read off Acrobat.

One thing I do want to tell my younger self: "Don't write 'summary of X'... read the damn document and make more detailed notes so that I know why I'm going back to the archives 3 years later when that archive never bothered to make a copy for me." (Said "summary" actually turned out to be quite useful so I'm glad I went back and typed it all out.)

I generally stick to the finding aid's classification system for labeling PDF files and creating sub-folders. I have Excel split up into the following columns, with one tab per collection:

Status [Scanned? Partial scan? Uploaded? Useless?] -- Digital ID (if applicable) -- Box Number -- Folder Number -- Folder Name -- Date -- Notes (usually I put in a few words of what's in there, interesting documents)
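For anyone who would rather start that sheet from a script, here is a rough sketch that writes the same columns as CSV, one file per collection standing in for one tab. The status check, the function name, and the sample row are assumptions added for illustration.

```python
import csv
import io

COLUMNS = [
    "Status", "Digital ID", "Box Number", "Folder Number",
    "Folder Name", "Date", "Notes",
]
STATUSES = {"Scanned", "Partial scan", "Uploaded", "Useless"}


def tracking_sheet(rows):
    """Return CSV text for one collection; missing columns are left blank."""
    for row in rows:
        if row.get("Status") not in STATUSES:
            raise ValueError(f"unknown status: {row.get('Status')!r}")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample row
sheet = tracking_sheet([
    {"Status": "Partial scan", "Box Number": "14", "Folder Number": "3",
     "Folder Name": "Correspondence", "Date": "1953", "Notes": "two useful memos"},
])
print(sheet)
```

Rejecting unknown status values up front keeps the column filterable later, which is most of the point of having it.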

Sigaba · August 6, 2018

On 8/4/2018 at 2:38 PM, gsc said:

also free and open source, if you don't want to plunk down money for DTPO or Filemaker.

A question and a comment. If a freeware/open source db goes sideways, will you have access to the same level of support that comes with an application for which you pay?

If you're digitizing your materials, please consider the utility of having multiple backups (a flash drive, a cloud drive, an external drive) and the advantages of having hard copies of your most essential documents.
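A small sketch of the multiple-backups idea: copy a file to several destinations and confirm each copy by checksum. The folder names stand in for a flash drive and an external drive, and the demo runs inside a temporary directory so it touches nothing real; the function names are made up.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large PDFs do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, destinations):
    """Copy source into each destination folder and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    verified = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        copy = folder / source.name
        shutil.copy2(source, copy)
        if sha256_of(copy) != expected:
            raise OSError(f"checksum mismatch for {copy}")
        verified.append(copy)
    return verified


# Demo with stand-in folders inside a throwaway directory
with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    original = root / "1953.09.14 Correspondence.pdf"
    original.write_bytes(b"placeholder contents")
    copies = back_up(original, [root / "flash_drive", root / "external_drive"])
    print([copy.parent.name for copy in copies])
```

This covers local drives only; a cloud drive would normally be handled by its own sync client.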
