Double-Zipping Files and Folders

From ADA Public Wiki
Revision as of 02:12, 16 January 2020 by Dahaddican (talk | contribs)

Files in CSV, Stata, SPSS, SAS and Excel formats must be double-zipped before upload. This ensures that they remain compatible with Dataverse and prevents the Two Ravens 'Explorer' functionality from being added to files in these formats.

Dataverse Version 4.6.1

It is a known issue that Dataverse Version 4.6.1 cannot directly ingest certain file types (SAS, SPSS, Stata, Excel and CSV) without removing some of the files' original formatting. Dataverse Version 4.6.1 also adds the Two Ravens 'Explorer' function button to directly ingested files. This button enables functionality that the ADA do not currently want and are not resourced to manage. To prevent both of these outcomes, and because Dataverse automatically removes one layer of zipping during the ingest process, all files in any of these formats must be double-zipped prior to ingest.

Double-Zipping

Because Dataverse removes a single layer of zipping from these file types during the upload process, the files must be double-zipped to maintain the integrity of the files and their data. After upload, the one remaining layer of zipping is enough to prevent Dataverse from changing any of the files' formatting or from adding the Two Ravens 'Explorer' function. The ingest process to Dataverse uses HTTPS, which protects your data during upload. Once the data is stored in the dataset, the Role Permissions applied to the dataset prevent unauthorised users from accessing it, so it remains secure and protected at all times.
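The layering described above can be illustrated with a short Python sketch. This is not the ADA's official procedure (see the instructions linked under How to Double-Zip); the function name double_zip and the file names are hypothetical, and the sketch assumes only the Python standard library. It does not cover encryption, which the standard zipfile module cannot apply when creating an archive.

```python
import zipfile
from pathlib import Path


def double_zip(source: Path, out_dir: Path) -> Path:
    """Wrap `source` in two nested ZIP layers and return the outer archive.

    Hypothetical helper for illustration only; not an ADA or Dataverse tool.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    inner = out_dir / f"{source.stem}_inner.zip"  # temporary first layer
    outer = out_dir / f"{source.stem}.zip"        # archive to upload

    # First layer: the data file itself goes into the inner archive.
    with zipfile.ZipFile(inner, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(source, arcname=source.name)

    # Second layer: the inner archive is stored inside the outer archive.
    # This is the layer that Dataverse is expected to remove on ingest.
    with zipfile.ZipFile(outer, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.write(inner, arcname=f"{source.stem}.zip")

    inner.unlink()  # the temporary inner archive is no longer needed
    return outer
```

For example, double_zip(Path("survey.csv"), Path("upload")) would produce upload/survey.zip, which contains survey.zip, which in turn contains survey.csv. If Dataverse removes the outer layer as described, the inner survey.zip is what remains in the dataset.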

No Zipping

If a file in one of these formats is uploaded to your dataset with no zipping, or with only a single layer of zipping, Dataverse will remove any zip layer present and will add the Two Ravens 'Explorer' button, which enables functionality that the ADA do not want available at the present time. If data files are uploaded without double-zipping, your deposit will not pass ADA Quality Assurance checks. The dataset will be returned to you for rectification along with a Processing Report detailing the changes required, which will delay its publication.

Files versus Folders

If a large number of files require double-zipping prior to upload, it is possible to ingest multiple files as a single folder that is then downloadable in Dataverse. However, files uploaded this way cannot be given File Tags or Description Notes, which makes them harder to search for and discover and potentially reduces their chances of reuse. Before uploading multiple files in a folder, you should contact the ADA to discuss your options. If the ADA approve the folder upload and the folder contains any CSV, Stata, SPSS, SAS or Excel files, the folder must be uploaded as a double-zipped folder for the same reasons as above.
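As a rough illustration of what a double-zipped folder looks like, the following Python sketch zips a folder and then zips the resulting archive a second time. It is an assumption-laden example rather than ADA guidance: the function name double_zip_folder and the folder names are hypothetical, only the standard library is used, and no encryption is applied.

```python
import zipfile
from pathlib import Path


def double_zip_folder(folder: Path, out_dir: Path) -> Path:
    """Wrap `folder` in two nested ZIP layers and return the outer archive.

    Hypothetical helper for illustration only. `out_dir` should sit
    outside `folder` so the archives are not swept into themselves.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    inner = out_dir / f"{folder.name}_inner.zip"  # temporary first layer
    outer = out_dir / f"{folder.name}.zip"        # archive to upload

    # First layer: every file in the folder, stored with paths relative
    # to the folder's parent so the folder name itself is preserved.
    with zipfile.ZipFile(inner, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                zf.write(path, arcname=path.relative_to(folder.parent).as_posix())

    # Second layer: the inner archive is stored inside the outer archive.
    with zipfile.ZipFile(outer, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.write(inner, arcname=f"{folder.name}.zip")

    inner.unlink()  # the temporary inner archive is no longer needed
    return outer
```

For example, double_zip_folder(Path("wave1"), Path("upload")) would produce upload/wave1.zip, which contains a single wave1.zip holding the folder's files. Empty subfolders are skipped in this sketch because only files are written to the archive.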

How to Double-Zip

For instructions on how to Double-Zip files and folders refer to the information contained at Instructions on how to Zip and Encrypt a file or folder.