When Governments destroy data…

The Internet Archive is an invaluable historical resource, especially once you become aware that the Trump administration has been ‘disappearing’ so many web pages, on topics ranging from civil rights and climate change to LGBT rights. Mind you, if you start looking around, you may note that many other governments do the same.

Long before the 2016 presidential election cycle, librarians understood this often-overlooked fact: vast amounts of government data and digital information are at risk of vanishing when a presidential term ends and administrations change. For example, 83% of .gov PDFs disappeared between 2008 and 2012.

That is why the Internet Archive, along with partners from the Library of Congress, University of North Texas, George Washington University, Stanford University, California Digital Library, and other public and private libraries, is hard at work on the End of Term Web Archive, a wide-ranging effort to preserve the entirety of the federal government web presence, especially the .gov and .mil domains, along with federal websites on other domains and official government social media accounts.

While it is not the only project the Internet Archive is undertaking to preserve government websites, FTP sites, and databases at this time, the End of Term Web Archive is a far-reaching one.

The Internet Archive is collecting webpages from over 6,000 government domains, over 200,000 hosts, and feeds from around 10,000 official federal social media accounts. The effort is likely to preserve hundreds of millions of individual government webpages and data and could end up totaling well over 100 terabytes of archived material. Over its full history of web archiving, the Internet Archive has preserved over 3.5 billion URLs from the .gov domain, including over 45 million PDFs.

This end-of-term collection builds on similar initiatives in 2008 and 2012 by original partners Internet Archive, Library of Congress, University of North Texas, and California Digital Library to document the “gov web,” which has no mandated, domain-wide single custodian. For instance, the National Institute for Literacy (NIFL) website was archived in 2008; the domain went offline in 2011. Similarly, the Sustainable Development Indicators (SDI) site was later taken down. Other websites, such as invasivespecies.gov, were later folded into larger agency domains. Every web page archived is accessible through the Wayback Machine, and past and current End of Term-specific collections are full-text searchable through the main End of Term portal. We have also worked with additional partners to provide access to the full data for use in data-mining research and projects.

The project has received considerable press attention this year, with related stories in The New York Times, Politico, The Washington Post, Library Journal, Motherboard, and others.

“No single government entity is responsible for archiving the entire federal government’s web presence,” explained Jefferson Bailey, the Internet Archive’s Director of Web Archiving. “Web data is already highly ephemeral and websites without a mandated custodian are even more imperiled. These sites include significant amounts of publicly-funded federal research, data, projects, and reporting that may only exist or be published on the web. This is tremendously important historical information. It also creates an amazing opportunity for libraries and archives to join forces and resources and collaborate to archive and provide permanent access to this material.”

This year has also seen a significant increase in citizen- and librarian-driven “hackathons” and “nomination-a-thons,” where subject experts and concerned information professionals crowdsource lists of high-value or endangered websites for the End of Term archiving partners to crawl. Librarian groups in New York City are holding nomination events to make sure important sites are preserved. And universities such as the University of Toronto are holding events for “guerrilla archiving” focused specifically on preserving climate-related data.

We need your help too! You can use the End of Term Nomination Tool to nominate any .gov or other government website or social media account, and it will be archived by the project team. If you have other ideas, please comment here or send ideas to info@archive.org. You can also help by donating to the Internet Archive to support our continued mission to provide “Universal Access to All Knowledge.”

See: https://blog.archive.org/2016/12/15/preserving-u-s-government-websites-and-data-as-the-obama-term-ends/

Big data’s power is terrifying.

Online information is already being used to manipulate us. We must act now to own the new political technologies before they own us.

Has a digital coup begun? Is big data being used, in the US and the UK, to create personalised political advertising, to bypass our rational minds and alter the way we vote? The short answer is probably not. Or not yet.

A series of terrifying articles suggests that a company called Cambridge Analytica helped to swing both the US election and the EU referendum by mining data from Facebook and using it to predict people’s personalities, then tailoring advertising to their psychological profiles. These reports, originating with the Swiss publication Das Magazin (published in translation by Vice), were clearly written in good faith, but apparently with insufficient diligence. They relied heavily on claims made by Cambridge Analytica that now appear to have been exaggerated. I found the story convincing, until I read the deconstructions by Martin Robbins on Little Atoms, Kendall Taggart on Buzzfeed and Leonid Bershidsky on Bloomberg.

Either we own political technologies, or they will own us. The great potential of big data, big analysis and online forums will be used by us or against us. We must move fast to beat the billionaires.

Twitter: @GeorgeMonbiot. A fully linked version of this column will be published at monbiot.com

How America Gave Up on Change

Thinking about it, you could equally say: How the West Gave Up on Change.

Is everybody retreating to some mythical comfort zone promised by the drumbeats of nationalism and anti-immigrant sentiment?

In his last book, economist Tyler Cowen wrote about how machine intelligence could change the world. In his new book, The Complacent Class, he writes about the forces that prevent change from happening. In particular, he argues that America has become more averse to change in recent decades, and that this has transformed our work, our leisure, and our neighborhood.

“Tyler Cowen’s blog, Marginal Revolution, is the first thing I read every morning. And his brilliant new book, The Complacent Class, has been on my nightstand after I devoured it in one sitting. I am at round-the-clock Cowen saturation right now.”

Malcolm Gladwell

Since Alexis de Tocqueville, restlessness has been accepted as a signature American trait. Our willingness to move, take risks, and adapt to change has produced a dynamic economy and a tradition of innovation from Ben Franklin to Steve Jobs.

The problem, according to legendary blogger, economist and best-selling author Tyler Cowen, is that Americans today have broken from this tradition: we’re working harder than ever to avoid change. We’re moving residences less, marrying people more like ourselves, and choosing our music and our mates based on algorithms that wall us off from anything that might be too new or too different. Match.com matches us in love. Spotify and Pandora match us in music. Facebook matches us to just about everything else.

Of course, this “matching culture” brings tremendous positives: music we like, partners who make us happy, neighbors who want the same things. We’re more comfortable. But, according to Cowen, there are significant collateral downsides attending this comfort, among them heightened inequality and segregation and decreased incentives to innovate and create.

The Complacent Class argues that this cannot go on forever. We are postponing change because of our near-sightedness and extreme desire for comfort, but ultimately this will make change, when it comes, harder. The forces unleashed by the Great Stagnation will eventually lead to a major fiscal and budgetary crisis: impossibly expensive rentals for our most attractive cities, worsening of residential segregation, and a decline in our work ethic. The only way to avoid this difficult future is for Americans to force themselves out of their comfortable slumber and embrace their restless tradition again.