With more and more content existing online, governments have jumped on the opportunity to use their websites and social media pages as tools for branding and communication with the public.
While websites and social media offer the benefits of extended reach, flexibility, and freedom as a means of communication, governments can find themselves at serious risk if they fail to treat this content like other official communications, equally subject to regulatory laws. These laws make the collection and preservation of authentic digital records a requirement.
Digital Archiving Is The Law
As government websites and social media are heavily relied on by citizens and businesses for a wealth of information, these communications are considered official government publications to which the Freedom of Information Act (FOIA) and Open Records Laws apply. Compliance with these laws ensures government transparency and fair public access to records of interest.
Complying with these laws can cost governments thousands of dollars a year, at an average of $678 per information request. A recent report from the Washington State Auditor’s Office also revealed that state and local governments spent $60 million to fill more than 285,000 public-records requests during a recent 12-month period!
Federal Government Agencies
Federal agencies must comply with the Freedom of Information Act (FOIA), which gives citizens the right to access certain government information, from written documents to photographs and more. In addition, the Federal Records Act sets the expectations for the management of federal records. The Federal Records Act has changed over time to keep up with innovations in communications, and it has expanded the definition of “federal records” to include electronic documents. For the latest, see the Presidential and Federal Records Act Amendments of 2014.
The act requires federal agencies to work with the National Archives and Records Administration (NARA). NARA works as the nation’s record keeper to ensure federal agencies preserve their records for increased public access. It sets rules and regulations on how (and in what formats) agencies must collect, store, and archive files, for example under 44 U.S. Code Chapter 31 - Records Management by Federal Agencies. The chapter sets expectations for records management in federal agencies, from general duties, to the transfer of materials, to what is considered unlawful destruction of materials.
State & Local Government Organizations
State and local government agencies must also comply with their state’s equivalent of FOIA, commonly known as Open Records Laws or Sunshine Laws. These laws determine the kinds of government information subject to disclosure and outline the required disclosure procedures and formats for electronic methods.
In summary, the key archiving requirements for government agencies (dependent on state) are:
- Collect both web pages/social media content and metadata;
- Retain digital data in its original file format (HTML or WARC, not screenshots or PDFs);
- Retain online records for seven years.
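As a rough illustration of the retention requirement, the sketch below computes when a captured record leaves a seven-year retention window. It is a minimal Python sketch, not a compliance tool: the function names `retention_expiry` and `must_retain` are hypothetical, and the seven-year figure is taken from the list above and will differ by state and record type.

```python
from datetime import date

# Seven years comes from the summary list above; actual periods depend on the state.
RETENTION_YEARS = 7


def retention_expiry(captured_on, years=RETENTION_YEARS):
    """Return the first date on which the record is past its retention window."""
    try:
        return captured_on.replace(year=captured_on.year + years)
    except ValueError:
        # Captured on Feb 29 and the target year is not a leap year: use Feb 28.
        return captured_on.replace(year=captured_on.year + years, day=28)


def must_retain(captured_on, today):
    """True while the record is still inside its retention window."""
    return today < retention_expiry(captured_on)


if __name__ == "__main__":
    captured = date(2020, 1, 1)
    print(retention_expiry(captured))
    print(must_retain(captured, date.today()))
```

A check like this would normally run against the capture date stored with each archived record, not against file modification times, which can change when files are copied.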
Governments are also subject to the Federal Rules of Civil Procedure (FRCP Rule 34(b)(1)(C)) and the Federal Rules of Evidence (FRE Rule 901(a)), which indicate that parties must provide proof of data integrity and data authenticity for evidence.
The FRE states that to satisfy the requirement of authenticating or identifying an item of evidence in a case, the proponent must produce evidence sufficient to support a finding that the item is what the proponent claims it is. This makes data authenticity and data integrity central to acceptance by the courts.
For digital evidence like web pages, this can be achieved by using 256-bit digital signatures and timestamps on all archived web pages as outlined in the eSign Act.
In the case of a lawsuit, several courts have not accepted simple screenshots of web pages or social media digital data as legal evidence if parties were not able to prove data integrity and data authenticity of these digital files.
To satisfy these laws and produce evidence that is admissible under eDiscovery requirements, the following advice should be followed:
- Obtain an image of the web page/post as a visual reference;
- Obtain the source code of the web page in HTML or WARC format for digital forensics analysis;
- Gather the web server metadata (HTTP headers) for digital forensics analysis. This provides details on the web server collection date, time, IP address, web browser used, etc.;
- Place a 256-bit digital signature & timestamp using an official digital certificate on all collected files to authenticate the collected evidence;
- Ensure that data can be exported in the appropriate format to eDiscovery workflow tools.
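The collection steps above can be sketched in code. The following is a minimal Python sketch under stated assumptions, not a court-ready implementation: it assumes the page source and HTTP headers have already been fetched, and it uses an HMAC-SHA256 tag from the standard library as a stand-in for a certificate-based digital signature and trusted timestamp, which in practice would come from a signing and timestamping service. The names `build_evidence_record` and `verify_evidence_record`, the record layout, and the demo key are all hypothetical.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def _canonical_bytes(payload):
    # Sorted keys and fixed separators so the same payload always yields the same bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_evidence_record(url, html, headers, key, collected_at=None):
    """Bundle the URL, HTTP headers, capture time, and a SHA-256 digest of the source."""
    collected_at = collected_at or datetime.now(timezone.utc)
    payload = {
        "url": url,
        "collected_at": collected_at.isoformat(),
        "http_headers": dict(headers),
        "html_sha256": hashlib.sha256(html.encode("utf-8")).hexdigest(),
    }
    # Stand-in for a real digital signature: a keyed HMAC-SHA256 tag over the payload.
    tag = hmac.new(key, _canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_evidence_record(record, html, key):
    """Check that the source and the recorded metadata are both unchanged."""
    payload = record["payload"]
    digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
    if digest != payload["html_sha256"]:
        return False
    expected = hmac.new(key, _canonical_bytes(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])


if __name__ == "__main__":
    demo_key = b"demo-key-not-for-production"  # hypothetical key for illustration
    page = "<html><body>Public notice</body></html>"
    record = build_evidence_record(
        "https://example.gov/notice", page, {"Content-Type": "text/html"}, demo_key
    )
    print(verify_evidence_record(record, page, demo_key))
    print(verify_evidence_record(record, page + "<!-- edited -->", demo_key))
```

The design point is that the digest covers the page source while the tag covers the digest together with the headers and capture time, so changing either the content or its metadata after collection causes verification to fail.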
Backups Are Not Archives
There are a few ways to collect and preserve your online records. Often, organizations rely on information stored in the Content Management System (CMS), such as WordPress, that runs the back end of their websites. Platforms like Facebook and Twitter also automatically collect historical information for all accounts, making this information readily accessible (or requestable) when needed. While you might plan on using these methods to cover your bases, there are a number of shortcomings with the quality of these backups.
The invention of the CMS simplified the process of creating, maintaining, and updating websites, and it has been the technology behind millions of modern websites for quite some time now. But a CMS was never designed to be an archive system. Many come with version control, and some with plugins that allow you to back up your files and data, but it’s key to understand that this is not the same as a proper archive.
The difference between backups and archives is that a backup takes periodic snapshots of data to help you recover records that get lost. Most backups are saved for a few days or weeks until they are overwritten to make room for new backup data. This works well in the short term for emergency recovery of data. An archive, on the other hand, is a record: a verifiable, authenticatable account of a particular time. It is different from a backup in that it can give you ongoing access to business information as it once existed, for long periods of time.
The issue with relying on regular backups is that their contents are easy to alter. You could look at your revision history in a CMS, find the version that was online at that time, and republish it somewhere. It’s also all too easy to take a screenshot, import it into Photoshop, and make the change you want. It’s for that reason that the legal system does not accept these backup versions as evidence.
With the wealth of information existing online, what’s published on a website or social media account now has the same legal status as a paper record. Therefore, it needs to be protected the same way: in a manner that is unalterable and trustworthy. In regulatory terms, that means adhering to the FRE (Federal Rules of Evidence) and the FRCP (Federal Rules of Civil Procedure).
Why Capturing Metadata Is a Must
Not only is it the form of the website or social media page that must be saved to meet these laws, but all the associated data has to be saved and matched as well.
In simple terms, metadata is “a set of data that describes and gives information about other data”; and in the world of digital evidence, there are four primary types:
- Client Metadata (who collected it), e.g. browser, operating system, IP address, user
- Web Server/API Endpoint Metadata (where and when it was collected), e.g. URL, HTTP headers, type, date & time of request and response
- Account Metadata (who is the owner), e.g. account owner, bio, description, location
- Message Metadata (what was said when), e.g. author, message type, post date & time, versions, links (un-shortened), location, privacy settings, likes, comments, friends
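One way to picture these four metadata types is as a single structured record attached to each capture. The sketch below is a minimal Python illustration, not a standard schema: the class names, the field selection (a subset of the examples listed above), and the `missing_fields` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ClientMetadata:
    """Who collected it."""
    browser: str
    operating_system: str
    ip_address: str
    user: str


@dataclass
class ServerMetadata:
    """Where and when it was collected."""
    url: str
    http_headers: dict
    requested_at: str
    responded_at: str


@dataclass
class AccountMetadata:
    """Who the owner is."""
    owner: str
    bio: str = ""
    location: str = ""


@dataclass
class MessageMetadata:
    """What was said, and when."""
    author: str
    message_type: str
    posted_at: str
    links: list = field(default_factory=list)
    privacy: str = "public"


@dataclass
class CapturedRecord:
    client: ClientMetadata
    server: ServerMetadata
    account: AccountMetadata
    message: MessageMetadata

    def missing_fields(self):
        """Return dotted names of fields that were left empty at capture time."""
        empty = ("", None, [], {})
        return [
            f"{group}.{name}"
            for group, values in asdict(self).items()
            for name, value in values.items()
            if value in empty
        ]
```

A completeness check such as `missing_fields` can be run at capture time, since metadata that was not recorded when the page or post was collected generally cannot be reconstructed later.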
Metadata can give tremendous insight into who you are, where you live, and where else you spend your time online. It has numerous applications, from allowing marketers to retarget you with tailored content to aiding in cases ranging from insurance fraud to IP infringement and divorce & family matters. Metadata can provide essential contextual information about the when and where of actions related to a legal case, and it is key to proving data authenticity and integrity in court.
For instance, a web page reproduced through CMS revision history will not be admissible unless the reproduced page bears a digital timestamp and signature that authenticate it and its metadata. Needless to say, because they lack this metadata, CMS backups and social media backups are not a viable option for organizations interested in a strong records retention policy.
Archiving is the solution to the problem of maintaining perfect historical web records. An enduring web archive is created by capturing a digital snapshot of the content, independent of specific databases or technologies. That means the website can always be viewed in its original form and deliver the same user experience, meeting regulation requirements for authentic copies.
The content of this blog post was developed and produced exclusively by one of NACo’s Corporate Partners, PageFreezer.