Double-Zipping Files and Folders: Difference between revisions

From ADA Public Wiki

Revision as of 23:55, 17 September 2019

Files must be zipped and password encrypted to ensure that they remain compatible with Dataverse and that all files are handled consistently.

Dataverse Version 4.6.1

It is a known issue that Dataverse Version 4.6.1 cannot directly ingest certain file types (SAS, SPSS, Stata, Excel and CSV) without removing some of the files' original formatting. Dataverse Version 4.6.1 also adds an 'Explore' button to directly ingested files, which enables functionality that the ADA do not currently want. To prevent both of these outcomes, and to allow for the fact that Dataverse strips away the first layer of zipping during ingest, all data files, and any supporting documentation files in the formats listed above, must be treated prior to ingest in one of two ways: zipping with password-protected encryption, or double-zipping.

Password Protected Encryption and Zipping

Data files (and supporting documentation files in the formats listed above) must be zipped once, with a password-protected layer of encryption, prior to ingest to Dataverse. The password prevents Dataverse from removing the layer of zipping, so the files' integrity is retained and the files remain in their original format without the 'Explore' button being added. Importantly, the password also protects the data from intrusion.

Double-Zipping

If a password-protected layer of encrypted zipping is not added to the file before it is uploaded to Dataverse, a single layer of zipping will be removed from the file during the upload process. To counter this and to maintain the integrity of the file and its data, the file must be double-zipped. Although one layer of zipping will be removed during upload, the remaining layer will prevent Dataverse from changing any of the file's formatting or adding the 'Explore' button. Because this method does not protect your data from possible intrusion, the ADA recommends that you password protect the file during the zipping process wherever possible.
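
As an illustration of what double-zipping produces, the following is a minimal Python sketch rather than the ADA's documented procedure, which uses 7-Zip. The function name `double_zip` and the inner archive naming are assumptions made for this example, and the sketch handles a single file only, not a folder.

```python
import io
import zipfile
from pathlib import Path


def double_zip(source: Path, destination: Path) -> Path:
    """Wrap `source` in two layers of zip and write the result to `destination`."""
    # Inner layer: built in memory, holds the original file unchanged.
    inner_buffer = io.BytesIO()
    with zipfile.ZipFile(inner_buffer, "w", zipfile.ZIP_DEFLATED) as inner:
        inner.write(source, arcname=source.name)

    # Outer layer: the layer expected to be removed during upload.
    inner_name = source.name + ".zip"  # hypothetical naming choice
    with zipfile.ZipFile(destination, "w", zipfile.ZIP_STORED) as outer:
        outer.writestr(inner_name, inner_buffer.getvalue())
    return destination
```

If the outer layer is stripped on upload, the inner archive is what remains, so the original file is still wrapped in one layer of zipping. Note that nothing in this sketch encrypts the data.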

No Zipping

If a data file is uploaded directly with no zipping, or with only a single layer of zipping and no password, Dataverse will add the 'Explore' button, which enables functionality that the ADA do not want available at the present time. If data files are uploaded without either a password-protected single layer of zipping or double-zipping, your deposit will not pass ADA Quality Assurance checks and will be returned to you for rectification, delaying its publication.
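
A depositor could check a file before upload with something like the sketch below. It is not the ADA's Quality Assurance check: the function name `describe_zip_layers` and its four return labels are hypothetical, and it only inspects zip archives using Python's standard library. It treats an archive as password protected when every member has the zip encryption flag set, and as double-zipped when every member is itself a `.zip` archive.

```python
import io
import zipfile
from pathlib import Path


def describe_zip_layers(path: Path) -> str:
    """Return 'unzipped', 'single-zip', 'password-protected' or 'double-zip'."""
    if not zipfile.is_zipfile(path):
        return "unzipped"
    with zipfile.ZipFile(path) as outer:
        members = [m for m in outer.infolist() if not m.is_dir()]
        if not members:
            return "single-zip"
        # Bit 0 of the general-purpose flags marks an encrypted member.
        encrypted = [bool(m.flag_bits & 0x1) for m in members]
        if all(encrypted):
            return "password-protected"
        if any(encrypted):
            # Only partly encrypted: treat as an unprotected single layer.
            return "single-zip"
        # Double-zipped: every outer member is itself a .zip archive.
        if all(
            m.filename.lower().endswith(".zip")
            and zipfile.is_zipfile(io.BytesIO(outer.read(m)))
            for m in members
        ):
            return "double-zip"
    return "single-zip"
```

Under the rules described above, only 'password-protected' and 'double-zip' would be acceptable results. One caveat: some formats, such as `.xlsx`, are zip containers internally, so an unzipped `.xlsx` file is reported as 'single-zip' rather than 'unzipped'; both labels indicate a file that needs further zipping.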

Files versus Folders

If an excessive number of data files or supporting documentation files require zipping and password-protected encryption prior to upload, multiple files can be ingested as a single folder that is then downloadable in Dataverse. However, the individual files cannot then be given File Tags or Description Notes, which makes them harder to search for and discover and potentially reduces their chances of reuse. Before uploading multiple files in a folder, you should contact the ADA to discuss the options. Once approved by the ADA, the folder will still need to be uploaded either as a zipped folder with password-protected encryption or as a double-zipped folder, for the same reasons as above.

7-Zip Software

The ADA recommends that all data files and certain supporting documents be encrypted using the free, open source 7-Zip software, which is also used by ADA staff. Encryption ensures that the files and folders are protected from unauthorised disclosure during the Dataverse upload process. The software creates a container called an 'archive' that holds the files requiring protection; the archive can then be encrypted and password protected. Copies of the software can be obtained via the links at https://www.7-zip.org/. The 7-Zip software is not available for Apple Mac users, who should instead use the Apple compression and encryption tools; archives created with these can be decrypted by other users with free software such as TheUnarchiver.
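
For depositors who prefer to script the 7-Zip step rather than use its graphical interface, the sketch below builds a 7-Zip command line from Python. It assumes the command-line executable is installed and reachable as `7z`, which is not guaranteed on every system; the function names are hypothetical. The switches used are `a` (add to archive), `-tzip` (zip format), `-mem=AES256` (AES-256 encryption for zip entries) and `-p` with no value, which is expected to make 7-Zip prompt for the password instead of exposing it on the command line.

```python
import shutil
import subprocess
from pathlib import Path


def build_7zip_command(source: Path, archive: Path, executable: str = "7z") -> list[str]:
    """Build a 7-Zip command that creates an AES-256 encrypted zip archive."""
    return [
        executable,
        "a",            # add files to an archive
        "-tzip",        # use the zip archive format
        "-mem=AES256",  # encrypt zip entries with AES-256
        "-p",           # no value given, so 7-Zip should prompt for the password
        str(archive),
        str(source),
    ]


def encrypt_with_7zip(source: Path, archive: Path) -> None:
    """Run the command above if a 7-Zip executable is on the PATH."""
    if shutil.which("7z") is None:
        raise FileNotFoundError("7z executable not found on PATH")
    subprocess.run(build_7zip_command(source, archive), check=True)
```

For example, `encrypt_with_7zip(Path("survey.csv"), Path("survey.zip"))` would create `survey.zip` after prompting for a password. The choice of the zip format over the 7z format here is an assumption; check with the ADA which archive format they expect.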

How to Double-Zip

For instructions on how to double-zip files and folders using the 7-Zip software, refer to the page "Instructions on how to Double-Zip a file or folder".