
This Silicon Valley Startup is Bringing Photographic History Online

Gado Images, nestled among the tech giants of the San Francisco Bay Area, is a small company with a big dream: to digitise and share the world’s visual history. In just six years, the company has digitised hundreds of thousands of images and correctly identified a plethora of mislabeled historical photographs. Many are now available online, some for free, but Gado isn’t stopping there.

It’s likely that you’ve seen the fruits of their labour, as Gado’s distribution network is huge. At this moment it includes Getty Images, Sheet Music Plus, and the Internet Archive. However, it was partly because of an obscure but culturally crucial publication that the company came into existence. In 2010, the team observed a dire need for archival work at the Afro American Newspapers (Afro) in Baltimore, Maryland.

Originally launched in the 1890s by a former slave, the Afro is the longest continuously operating African-American newspaper in the world. In over 120 years, the paper had built an archive of more than 1.5 million photographs covering everything from civil rights, sports, and entertainment to everyday life.

Due to the high cost of digitisation and annotation, the paper had scanned only about 5,000 of these photos as of 2010. This meant that the vast majority of this astounding collection of American history was not accessible to scholars, journalists and other creatives. Determined not to let these precious pieces of history be lost, the team launched Gado Images (originally named Project Gado) to digitise and preserve the Afro’s collection, and the company has since gone on to much greater things.

In addition to their initial project at the Afro, where they scanned more than 120,000 images, they have worked on several other black history collections. Through a recent partnership with Johns Hopkins University, Gado digitised and annotated more than 1,000 early African-American portrait photographs.

Unlike many archival projects, Gado has focused heavily on tracking down and negotiating with individual collectors and photographers, as a huge amount of historical material remains in private hands. Regular partner Stuart Lutz Historic Documents, for example, holds one of the best private Vietnam War archives in the world. Gado also works hard to recover historical gems from the oft-neglected area of vernacular photography: everything from Polaroids and snapshots to film stills, postcards, advertisements and other ephemera.


Several pieces of digitization equipment in Gado Images’ lab in the San Francisco Bay Area /Gado

“Once we’ve identified target collections for an organisation, we design and implement a digitization plan,” Gado Images Co-Founder Amy Smith explains. “This might involve advising on equipment to purchase, developing training materials for staff members, or even placing our own staff members in the partner organization’s archives.” On the Afro project, for example, Gado placed a staff member in the paper’s archives for a year and a half.

The Gado team uses a wide range of equipment for archiving, including a number of familiar camera brands. “When we work directly in an archive or digitise our own wholly-owned materials, we use the Epson Perfection line of flatbed scanners for delicate materials, the Kodak PS50 system for prints, a Fuji Scansnap overhead scanner or a mounted Canon DSLR for oversize items, a Leica camera for objects, and a Powerslide 5000 for slides,” Smith says. “We also have custom scanning equipment for 8mm and 16mm films.”

Though what Gado Images does is extremely beneficial, it is, at the end of the day, a business. The company generates revenue for itself and the image providers by providing their digitised visual content to creative professionals worldwide and (where appropriate) monetising images through licensing. For instance, more than 11,000 of the Afro’s photos are now online through Gado’s partnership with Getty Images and are helping to generate additional revenue for the paper.

In order for their catalogues to avoid becoming just as jumbled as the hardcopy vaults the team sorts through, Gado depends heavily on the Cognitive Metadata Platform (CMP). Based on IBM’s Watson and utilizing Google Vision, this AI software platform uses tools like facial recognition, optical character recognition, object recognition, neural networks and natural language processing to automatically tag and caption the images the Gado team scans.


Once materials are scanned, they are loaded into the CMP. The system has a web interface which staff can then use to enter descriptions. The CMP then uses this info to automatically perform keywording, date estimation, and named entity tagging for the images. The CMP is able to perform captioning and keywording fully autonomously in some cases, but this depends on the nature of the collection and what data is available. Entirely unprocessed materials require more human input, while materials which have extensive existing metadata can be fully automated.

This system works because the CMP is able to automatically find “entities” depicted in each image: significant people, locations, events and actions. For example, it uses facial and object recognition to find significant people and things by searching through over 60,000 of Gado’s own and its partners’ archive images. The CMP also uses optical character recognition to read the text in any handwritten or printed notations and tag events (like ‘World War 2’ or ‘a protest’). By combining these elements, the CMP makes logical assumptions and acts on them.

“For instance, an image of American baseball legend Jackie Robinson was submitted to the CMP,” explains Smith. “The system already knew who Jackie was, and it knew that his birthday was in 1919. Based on his facial characteristics, it estimated his age at 31 in the photo, and combined this with his birthday to get an estimated date of 1950 for the photo.” It then combined all this information into the automatic caption ‘American former baseball player Jackie Robinson, 1950’, along with relevant keywords such as ‘athlete’ and ‘hat’.
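
The arithmetic behind that estimate is simple enough to sketch in a few lines of Python. This is only an illustration of the reasoning Smith describes, not Gado's actual code: the `KnownPerson` record, the function names and the caption layout are all hypothetical, and the age is passed in directly rather than produced by a real facial-analysis model.

```python
from dataclasses import dataclass


@dataclass
class KnownPerson:
    """Hypothetical record for a person the system already recognises."""
    name: str
    description: str
    birth_year: int


def estimate_photo_year(person: KnownPerson, estimated_age: int) -> int:
    # Birth year plus apparent age gives a rough year for the photo.
    return person.birth_year + estimated_age


def build_caption(person: KnownPerson, estimated_age: int) -> str:
    # Assumed caption layout: "<description> <name>, <year>".
    year = estimate_photo_year(person, estimated_age)
    return f"{person.description} {person.name}, {year}"


# Figures taken from the example in the article; the age would normally
# come from a facial-analysis step, which is not modelled here.
robinson = KnownPerson(
    name="Jackie Robinson",
    description="American former baseball player",
    birth_year=1919,
)
caption = build_caption(robinson, estimated_age=31)
print(caption)
```

Because the age is itself an estimate, a date produced this way is best read as approximate rather than exact.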

“One of our best photos of Martin Luther King Jr was not filed in the Afro’s MLK archives, but rather in a generic folder labeled ‘Black History’,” says Smith. “Without automated tools, these kinds of historically valuable but mislabeled photos would be nearly impossible to find.”

The incredible functionality of these complex A.I. systems unearths previously lost or forgotten items and saves countless man-hours in the process. With a staggering number of photos from the 20th century in danger of disappearing for good, archival teams like Gado’s are providing a service that will only become more valuable as time goes on.

(Cover Picture: Civil Rights leader the Reverend Dr Martin Luther King Jr, Paul Moore, former pastor of the New Bethel Baptist Church Walter Fauntroy and Ralph Abernathy leading citizens on a thank you march to the White House, Washington DC, August 14, 1965 /Afro Newspaper /Gado)