Quick Deposit Guide: Difference between revisions

From ADA Public Wiki
 
(27 intermediate revisions by 3 users not shown)
Line 6: Line 6:
  '''- Study Year:'''
  '''- Study Year:'''
  '''- Department or Research group: '''
  '''- Department or Research group: '''
  '''- [http://purl.org/au-research/vocabulary/anzsrc-for/2008/16 Field of Research (FoR) code(s)]: '''
  '''- [https://vocabs.ardc.edu.au/viewById/316 Field of Research (FoR) code(s)]: '''
  '''- List of Files and File Descriptions:'''
  '''- List of Files and File Descriptions:'''


Line 20: Line 20:
*[https://docs.ada.edu.au/index.php/De-Identification De-Identification]
*[https://docs.ada.edu.au/index.php/De-Identification De-Identification]
*[https://docs.ada.edu.au/index.php/The_Privacy_Act_1988 The Privacy Act 1988]
*[https://docs.ada.edu.au/index.php/The_Privacy_Act_1988 The Privacy Act 1988]
== Levels of Curation ==
ADA will discuss with depositors levels of curation that may be offered:
* A. Content distributed as deposited 
* B. Basic curation – e.g., brief checking, addition of basic metadata or documentation 
* C. Enhanced curation – e.g., conversion to new formats, enhancement of documentation 
* D. Data-level curation – as in C above, but with additional editing of deposited data for accuracy


== Reusability of data ==
== Reusability of data ==
Data deposited at the ADA should be understandable and reusable by a secondary user.  
Data deposited at the ADA should be understandable and reusable by a secondary user.  
''' Please ensure that: '''
''' Please ensure that:'''
'''- Study variable names use a consistent naming convention and can be readily matched to corresponding questions and sections on the study questionnaire (if applicable).'''
;Study variable names
  '''- All variables have appropriate labels. Labels should enable secondary users to understand the meaning of each variable, and ADA recommends that variable label lengths should be 80 characters or less.'''
: use a consistent naming convention and can be readily matched to corresponding questions and sections on the study questionnaire (if applicable).
'''- All values of categorical variables have clear labels, preferably under 25 characters '''
;Variable name lengths
'''- There are no comma or separators in value labels as this can cause problems in CSV files. Ensure also that any leading or trailing spaces are removed.  '''
: Stata always supports 32 characters.
: SPSS supports 64 characters. An 8- or 12-character restriction applies to versions below v12.
: SAS supports 32 characters. An 8- or 12-character restriction applies to versions below v6.
;All variables have appropriate labels
: Labels should enable secondary users to understand the meaning of each variable. ADA recommends that variable labels be 80 characters or fewer.
;All values of categorical variables have clear labels, preferably under 25 characters
: Value labels should contain no commas or other separators, as these can cause problems in CSV files. Ensure also that any leading or trailing spaces are removed.


For more information see:
For more information see:
* [[Qualitative Data Processing]]
* [https://docs.ada.edu.au/index.php/Quality_Assurance Quality Assurance]
* [https://docs.ada.edu.au/index.php/Quality_Assurance Quality Assurance]
* [https://docs.ada.edu.au/index.php/Workflows Workflows]
* [https://docs.ada.edu.au/index.php/Workflows Workflows]


== Data formats ==
== Data formats ==
The ADA prefers SPSS files for tabular survey data because that file format contains a metadata dictionary. Other compatible formats are also generally accepted, provided variable and values labels can be associated with their data. We will export all tabular data to SPSS, STATA and SAS formats, as well as CSV for distribution.  
ADA predominantly archives and disseminates quantitative data. We have a small collection of qualitative data.


;Tabular data files
: ADA prefers submission in SPSS format (.sav files) for tabular survey data because that file format contains a metadata dictionary. This format captures variable-level metadata (variable and value labels, data formats etc.) and SPSS Statistics is adept at exporting multiple alternative file formats.
: The ADA also accepts Stata (.dta), SAS (.sas; .sas7bdat), and R (.rdata), as well as text formats (.csv; .tab), provided that both value labels and codes can be supplied. Other data formats will be considered on a case-by-case basis.
Other compatible formats are also generally accepted, provided variable and value labels can be associated with their data. We will export all tabular data to SPSS, Stata and SAS formats, as well as CSV, for distribution.
;Qualitative data
: Data formats vary significantly and submitted formats will also be considered on a case-by-case basis.  See ADA's [https://docs.ada.edu.au/index.php/Qualitative_Data_Processing Qualitative Data Processing] guidelines.
;Other formats
If you are using an unusual format (e.g. a database set up), please let us know in advance and we can discuss options.   
If you are using an unusual format (e.g. a database set up), please let us know in advance and we can discuss options.   


Line 51: Line 76:


Note that related publications with an existing digital identifier (DOI) or webpage (URL) can be referenced in the project metadata on Dataverse.
Note that related publications with an existing digital identifier (DOI) or webpage (URL) can be referenced in the project metadata on Dataverse.
==Qualitative Data Processing==
Support materials are available for qualitative researchers on this page.
Materials: [https://docs.ada.edu.au/index.php/Qualitative_Data_Processing Qualitative Data Processing]


= Step 3 - Upload files =
= Step 3 - Upload files =
An ADA archivist will send you a link to a website where you can securely upload your files. To access this website you will first be directed to create a user account. Please inform the ADA once you have done that step, so we can give you editing rights to your deposit page. Please upload your data files and supporting documentation files to this site. Other secure file sharing solutions are possible, however, this should be discussed with the ADA first. For security reason do not send data files by email.
 
ADA File Naming Conventions for data depositors to follow when uploading documentation and data files.
 
Deposited data files should follow the below file naming conventions:
File Codes
* 0_ for ADA Data License (to be completed in consultation with ADA archivists as part of the deposit phase)
* 1_ for documentation files 
* 2_ for GENERAL Release data files (as defined in the ADA Data License based on data sensitivity)
* 3_ for RESTRICTED release files (as defined in the ADA Data License based on data sensitivity)
* 99_ for ARCHIVE ONLY
* the ADA ID should be part of the filename
 
 
<pre> [File code]_[Proj Name]_[ADAID]    Example: 2_ANUPOLL_53_100500.sav </pre>
where
 
<blockquote>
[File code]:  a code for the type of file: '''0''' data license, '''1''' documentation, '''2''' general data, '''3''' restricted data, and '''99''' archive only
 
[Proj Name]:  usually an abbreviation of the dataset/research project name. For longitudinal data, include the wave number, e.g. '''ANUPOLL_53''', '''GEN_W1''', '''FWFC_w3'''
 
[ADAID]:      the ADAID that was assigned to this dataset, e.g. '''100500'''
 
</blockquote>
An ADA archivist will send you a link to the ADA Deposit Dataverse where you can securely upload your files. To access this website you will first be directed to create a user account. Please inform the ADA once you have done that step, so we can give you editing rights to your deposit page.
 
Please upload your data files and supporting documentation files to this site. Alternative file sharing services can be used by prior arrangement with ADA. For security reasons, please do not send data files by email.


= Step 4 - Provide metadata =
= Step 4 - Provide metadata =


Good metadata is essential for findability and reusability of data. Please fill in as many metadata fields as you can in the provided deposit shell (where you uploaded your data). If you are unsure about what information to provide in a given field, please contact the ADA for guidance.  
Good metadata is essential for findability and reusability of data. Please fill in as many metadata fields as you can in the provided deposit shell (where you uploaded your data). Navigate to the Metadata tab, and click ‘Add + Edit Metadata’ to edit the fields. 
 
See our [[Metadata guidelines for ADA Dataverse]] and the [https://rdf-vocabulary.ddialliance.org/cv DDI Controlled Vocabularies - Overview] for recommended 'vocabulary TERMS'.


The ADA uses controlled vocabulary for keywords, see [http://vocabularyserver.com/apais/ APAIS] and for the topic classification, we use [http://purl.org/au-research/vocabulary/anzsrc-for/2008/16 ANZSRC FoR code].
If you are still unsure about what information to provide in a given field, please refer to the [https://zenodo.org/records/5576412 Dataverse North Metadata Best Practices Guide] for more extensive documentation.


= Step 5 - License and Access conditions =
= Step 5 - License and Access conditions =

Latest revision as of 00:35, 22 October 2025

Step 1 - Contact the ADA

If you want to deposit research data with the ADA, please send an email to ada@ada.edu.au with the following information:

- Study Title:
- Study Year:
- Department or Research group: 
- Field of Research (FoR) code(s): 
- List of Files and File Descriptions:


See Deposit Appraisal & Collection Policy for details on what data the ADA accepts. Licenses and access conditions will be discussed once the deposit is accepted for archiving with the ADA.

Step 2 - Prepare data files and documentation

De-identification of sensitive data

Ensure that your data does not contain identifying information about your research participants. This includes, for instance, personal information about the respondents in a survey that could lead to the identification of an individual respondent.

For more information on the topic see:

  • De-Identification
  • The Privacy Act 1988

Levels of Curation

ADA will discuss with depositors levels of curation that may be offered:

  • A. Content distributed as deposited
  • B. Basic curation – e.g., brief checking, addition of basic metadata or documentation
  • C. Enhanced curation – e.g., conversion to new formats, enhancement of documentation
  • D. Data-level curation – as in C above, but with additional editing of deposited data for accuracy

Reusability of data

Data deposited at the ADA should be understandable and reusable by a secondary user. Please ensure that:

Study variable names
use a consistent naming convention and can be readily matched to corresponding questions and sections on the study questionnaire (if applicable).
Variable name lengths
  • Stata always supports 32 characters.
  • SPSS supports 64 characters. An 8- or 12-character restriction applies to versions below v12.
  • SAS supports 32 characters. An 8- or 12-character restriction applies to versions below v6.
All variables have appropriate labels
Labels should enable secondary users to understand the meaning of each variable. ADA recommends that variable labels be 80 characters or fewer.
All values of categorical variables have clear labels, preferably under 25 characters
Value labels should contain no commas or other separators, as these can cause problems in CSV files. Ensure also that any leading or trailing spaces are removed.
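
The label and naming checks above can be sketched as a small script. This is only an illustration, not an ADA tool: the function name `check_variable` is hypothetical, the length thresholds are taken from the guidance above, and using 32 characters as the name limit (the smallest limit listed for current software) is an assumption.

```python
# Thresholds taken from the guidance above; 32 is assumed as the common name limit.
MAX_NAME_LEN = 32
MAX_VAR_LABEL_LEN = 80
MAX_VALUE_LABEL_LEN = 25


def check_variable(name, label, value_labels=None):
    """Return a list of human-readable problems found for one variable."""
    problems = []
    if len(name) > MAX_NAME_LEN:
        problems.append(f"{name}: name longer than {MAX_NAME_LEN} characters")
    if not label.strip():
        problems.append(f"{name}: missing variable label")
    elif len(label) > MAX_VAR_LABEL_LEN:
        problems.append(f"{name}: variable label longer than {MAX_VAR_LABEL_LEN} characters")
    # value_labels maps each category code to its label text.
    for code, text in (value_labels or {}).items():
        if text != text.strip():
            problems.append(f"{name}={code}: value label has leading or trailing spaces")
        if "," in text:
            problems.append(f"{name}={code}: value label contains a comma")
        if len(text) > MAX_VALUE_LABEL_LEN:
            problems.append(f"{name}={code}: value label longer than {MAX_VALUE_LABEL_LEN} characters")
    return problems


if __name__ == "__main__":
    issues = check_variable(
        "q1_sex",
        "Respondent sex",
        {1: "Male", 2: "Female ", 9: "Refused, or not stated"},
    )
    for issue in issues:
        print(issue)
```

Running the script on the sample variable reports the trailing space in one value label and the comma in another; a clean variable returns an empty list.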

For more information see:

  • Qualitative Data Processing
  • Quality Assurance
  • Workflows

Data formats

ADA predominantly archives and disseminates quantitative data. We have a small collection of qualitative data.

Tabular data files
ADA prefers submission in SPSS format (.sav files) for tabular survey data because that file format contains a metadata dictionary. This format captures variable-level metadata (variable and value labels, data formats etc.) and SPSS Statistics is adept at exporting multiple alternative file formats.
The ADA also accepts Stata (.dta), SAS (.sas; .sas7bdat), and R (.rdata), as well as text formats (.csv; .tab), provided that both value labels and codes can be supplied. Other data formats will be considered on a case-by-case basis.

Other compatible formats are also generally accepted, provided variable and value labels can be associated with their data. We will export all tabular data to SPSS, Stata and SAS formats, as well as CSV, for distribution.
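
For text formats, one way to keep codes and value labels together is to supply the coded data alongside a separate lookup table of labels. The sketch below shows that idea using only the Python standard library. The two-file layout, the column names (`variable`, `code`, `label`) and the function names are assumptions made for illustration, not a format prescribed by the ADA; check with an archivist before relying on it.

```python
import csv
import io


def data_csv(rows, fieldnames):
    """Serialise coded data rows (a list of dicts) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def value_labels_csv(value_labels):
    """Serialise {variable: {code: label}} to a long-format CSV lookup table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["variable", "code", "label"])
    for variable, labels in value_labels.items():
        for code, label in labels.items():
            writer.writerow([variable, code, label])
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [{"id": 1, "q1_sex": 1}, {"id": 2, "q1_sex": 2}]
    labels = {"q1_sex": {1: "Male", 2: "Female"}}
    print(data_csv(rows, ["id", "q1_sex"]))
    print(value_labels_csv(labels))
```

The data file holds only codes, and the lookup table lets a secondary user (or an archivist converting to SPSS, Stata or SAS) re-attach the labels to each variable.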

Qualitative data
Data formats vary significantly and submitted formats will also be considered on a case-by-case basis. See ADA's Qualitative Data Processing guidelines.
Other formats

If you are using an unusual format (e.g. a database set up), please let us know in advance and we can discuss options.

For more information see:

Documentation files

Please provide appropriate documentation for your data. As a minimum, you should supply the study questionnaire (or equivalent research materials that correspond to your data). Other appropriate supporting documentation can include:

  • technical report
  • instructions to the data collector

Note that related publications with an existing digital identifier (DOI) or webpage (URL) can be referenced in the project metadata on Dataverse.

Qualitative Data Processing

Support materials are available for qualitative researchers on this page.

Materials: Qualitative Data Processing

Step 3 - Upload files

Data depositors should follow the ADA file naming conventions below when uploading documentation and data files. Each file name starts with one of the following file codes:

  • 0_ for ADA Data License (to be completed in consultation with ADA archivists as part of the deposit phase)
  • 1_ for documentation files
  • 2_ for GENERAL Release data files (as defined in the ADA Data License based on data sensitivity)
  • 3_ for RESTRICTED release files (as defined in the ADA Data License based on data sensitivity)
  • 99_ for ARCHIVE ONLY

The ADA ID should also be part of the file name.


 [File code]_[Proj Name]_[ADAID]    Example: 2_ANUPOLL_53_100500.sav 

where

[File code]: a code for the type of file: 0 data license, 1 documentation, 2 general data, 3 restricted data, and 99 archive only

[Proj Name]: usually an abbreviation of the dataset/research project name. For longitudinal data, include the wave number, e.g. ANUPOLL_53, GEN_W1, FWFC_w3

[ADAID]: the ADAID that was assigned to this dataset, e.g. 100500
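
The naming pattern above can be sketched as a pair of helper functions for building and checking file names. This is a minimal illustration rather than an official validator: the function names are hypothetical, and the regular expression assumes that the ADAID is purely numeric (as in the example 100500) and that project names contain only letters, digits and underscores.

```python
import re

# File codes listed in the naming convention above.
FILE_CODES = {
    "0": "ADA Data License",
    "1": "documentation",
    "2": "general release data",
    "3": "restricted release data",
    "99": "archive only",
}

# [File code]_[Proj Name]_[ADAID].[extension]; a numeric ADAID is an assumption.
NAME_PATTERN = re.compile(
    r"^(?P<code>0|1|2|3|99)_(?P<project>[A-Za-z0-9_]+)_(?P<ada_id>\d+)\.(?P<ext>[A-Za-z0-9]+)$"
)


def build_filename(code, project, ada_id, ext):
    """Assemble a file name from its parts, rejecting unknown file codes."""
    if str(code) not in FILE_CODES:
        raise ValueError(f"unknown file code: {code}")
    return f"{code}_{project}_{ada_id}.{ext}"


def parse_filename(filename):
    """Return the parts of a conforming file name as a dict, or None if it does not conform."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(build_filename(2, "ANUPOLL_53", 100500, "sav"))
    print(parse_filename("2_ANUPOLL_53_100500.sav"))
    print(parse_filename("data_final.sav"))
```

Because the project name may itself contain underscores and digits (ANUPOLL_53), the pattern treats the last all-digit segment before the extension as the ADAID.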

An ADA archivist will send you a link to the ADA Deposit Dataverse where you can securely upload your files. To access this website you will first be directed to create a user account. Please inform the ADA once you have done that step, so we can give you editing rights to your deposit page.

Please upload your data files and supporting documentation files to this site. Alternative file sharing services can be used by prior arrangement with ADA. For security reasons, please do not send data files by email.

Step 4 - Provide metadata

Good metadata is essential for findability and reusability of data. Please fill in as many metadata fields as you can in the provided deposit shell (where you uploaded your data). Navigate to the Metadata tab, and click ‘Add + Edit Metadata’ to edit the fields.

See our Metadata guidelines for ADA Dataverse and the DDI Controlled Vocabularies - Overview for recommended 'vocabulary TERMS'.

If you are still unsure about what information to provide in a given field, please refer to the Dataverse North Metadata Best Practices Guide for more extensive documentation.

Step 5 - License and Access conditions

Before data can be published on the ADA Dataverse, the data rights holder must sign a license agreement with the ADA. This license specifies the conditions under which the ADA can disseminate the data. See the Rights Management section on the ADA wiki for more information.

The agreement allows you to specify the terms and conditions of access for data users, the details users must provide, and the process by which data requests will be assessed and approved; see Access Conditions for details. The ADA can advise on which access conditions would suit your data.