Technical Infrastructure
Repository Software
The ADA implements the OAIS Reference Model [42] using separate Dataverse [49] installations for the SIP (deposit.ada.edu.au), AIP (dataverse-test.ada.edu.au) and DIP (dataverse.ada.edu.au).
The Dataverse Project [49] is open source and community-supported code. The Dataverse team has described how the Dataverse software meets CoreTrustSeal Technical Infrastructure requirements [50].
ADA also implements web-based tools, including those developed and maintained in-house, to support its archiving process:
Open Source:
- Metabase for reporting analytics [82]
- osTicket task management and tracking application [51]
In-House:
- Curation and Risk Assessment Tool (CARAT) [73]
- ADA Deposit and Preservation Tool (ADAPT) [6]
- Ingest Reporting Tool (accepts an uploaded SPSS file and generates a data dictionary, quality check and confidentiality check) [74]
- Coordinated Access to Data, Research and Environments (CADRE) [97]
Data access to the production DIP instances (dataverse.ada.edu.au and anu-dataverse.ada.edu.au) is managed through the osTicket [51] ticketing software. Requests are granted or rejected in Dataverse itself, by adding file permissions for approved users. Both Dataverse and the osTicket service must therefore be available for access requests to be managed.
The ADA has implemented CADRE (Coordinated Access to Data, Research and Environments) [97], a stand-alone web application built around the 5 Safes [96]. CADRE integrates with ADA's Production Dataverse [10] for specific datasets, making access requests and their approval or rejection more efficient. Requests for Production Dataverse [10] datasets managed through CADRE [97] are approved and revoked within CADRE [97], and the corresponding user access in ADA's Production Dataverse [10] is granted or revoked programmatically.
ADA will gradually manage all its dataset access requests through CADRE [97].
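As an illustration of what programmatic granting and revoking can look like, the sketch below builds (but does not send) requests against the Dataverse Data Access API's file-level grant and revoke endpoints. It is not CADRE's actual implementation. The function name, file id, user identifier and token are hypothetical, and the endpoint paths should be checked against the API guide for the installed Dataverse version.

```python
from urllib.request import Request

# Production DIP host named in the text; everything else here is illustrative.
BASE_URL = "https://dataverse.ada.edu.au"


def build_access_request(file_id: int, user: str, api_token: str, grant: bool) -> Request:
    """Build, without sending, a request that grants or revokes access to one file.

    `user` is a Dataverse user identifier such as "@jsmith" (hypothetical).
    """
    if grant:
        action, method = "grantAccess", "PUT"
    else:
        action, method = "revokeAccess", "DELETE"
    url = f"{BASE_URL}/api/access/datafile/{file_id}/{action}/{user}"
    # Dataverse accepts the API token in the X-Dataverse-key header.
    return Request(url, method=method, headers={"X-Dataverse-key": api_token})


# Example: prepare a grant for a hypothetical file and user.
request = build_access_request(42, "@jsmith", "example-token", grant=True)
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the logic easy to check without touching a live installation.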
Version Control
The Harvard Dataverse Project team uses GitHub [54] as its version control system. In-house tools and software produced by ADA are also managed through GitHub.
IT Service Management
The ADA Technical Manager has initiated adoption of the YASM Service Management framework, chosen for its simplicity compared with the ITIL Service Management framework. As part of this framework, a Service Portfolio has been created to record and track ADA’s various internal and external services.
At the end of every calendar month, the ADA DevOps role performs updates and maintenance tasks on all of ADA’s services to keep code and applications up to date. A simple Change Management approach is used to inform the internal ADA team and external users of changes that will arrive with upcoming Dataverse upgrades. ADA maintains Dataverse installations whose sole purpose is to test changes in new Dataverse releases before the update is pushed to ADA’s three primary publicly consumed installations. ADA plans its release schedule for new Dataverse versions in step with the Dataverse Project’s releases. Not all available functionality is wanted by ADA, so new functionality is tested and evaluated before deciding whether and when to enable it.
NCI [7] performs an automated weekly set of security tests on ADA’s NCI-hosted services. A report with any identified issues is emailed to the ADA technical team and to the ADA Director. The ADA technical team carries out maintenance to address the report-identified issues. ADA also receives email alerts from the Australian Signals Directorate (Australia’s national cyber security and intelligence agency) [98] and/or the ANU Information Security Office (ISO) [99] bringing attention to any discovered problems with one or more ADA Services, with a request to address them. The ADA technical team works to address those problems and reports back to the alerting organisation.
Infrastructure Standards
ANU and NCI [7] have security standards in place to prevent ANU and NCI technical infrastructure from being adversely affected. NCI monitors the primary Dataverse installations with its f5 WAF [55] service.
Availability, Bandwidth & Connectivity
ADA's web services, including its Dataverse installations, are available 24/7. Data access requests are prioritised and managed as the capacity of the ADA Access Management team allows. Notices are posted to the ADA website and production Dataverse instances ahead of planned ADA shutdowns, so that users can submit data access requests in good time.
NCI [7] manages network availability and bandwidth for ADA’s NCI-hosted services:
- Network details (17/10/2025):
- AVAILABILITY: Network is at ~99.95%
- BANDWIDTH: Over 10 Gbps available
- CONNECTIVITY: Redundant network connectivity to ADA services
- DISASTER RECOVERY: Active - Failover automatic network recovery
- ADA's VM Datastore(s): ADA's VMs are all hosted on high availability compute clusters that will restart any VMs on surviving compute hosts within their hosted data centre cluster. ADA's VMs can also be manually restarted in the alternate data centre if there is a whole compute cluster failure or whole data centre outage.
- NAS File Server Volume(s): ADA's NAS file server is hosted on a high availability storage cluster that will fail over any file server to surviving storage nodes within its hosted data centre cluster. ADA's file server can also be manually restarted in the alternate data centre if there is a whole storage cluster failure or whole data centre outage.
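To put the availability figure above in concrete terms, the following short calculation converts an availability percentage into the downtime it permits. It is plain arithmetic on the quoted ~99.95% value, not an NCI service-level commitment, and the function name is illustrative.

```python
def allowed_downtime_hours(availability: float, period_days: float = 365.0) -> float:
    """Hours of downtime permitted over `period_days` at the given availability.

    `availability` is a fraction, e.g. 0.9995 for 99.95%.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_days * 24.0


yearly = allowed_downtime_hours(0.9995)        # roughly 4.4 hours per year
monthly = allowed_downtime_hours(0.9995, 30)   # roughly 0.36 hours, about 22 minutes
```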
NCI also manages its f5 WAF that provides a level of protection for ADA’s Dataverse installations.
NCI manages SSL certificates for ADA’s web-based services and informs the ADA technical team when certificates are about to expire, reissuing them for installation on ADA’s virtual machines (VMs).
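A certificate expiry check of the kind described can be sketched with the Python standard library. This is an assumption-laden illustration, not the process NCI uses: the function names and the 30-day warning window are hypothetical.

```python
import socket
import ssl

SECONDS_PER_DAY = 86400.0


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the served certificate's notAfter field, e.g. 'Jan 31 00:00:00 2026 GMT'."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now_epoch: float) -> float:
    """Days between `now_epoch` (seconds since the epoch) and the expiry time."""
    return (ssl.cert_time_to_seconds(not_after) - now_epoch) / SECONDS_PER_DAY


def needs_renewal(not_after: str, now_epoch: float, warn_days: float = 30.0) -> bool:
    """True when the certificate expires within the (hypothetical) warning window."""
    return days_until_expiry(not_after, now_epoch) <= warn_days
```

In use, `needs_renewal(fetch_not_after("dataverse.ada.edu.au"), time.time())` would flag a certificate that is close to expiry; the network call is kept in its own function so the date logic can be checked offline.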
NCI alerts ADA about planned NCI infrastructure outages. The ADA team posts messaging on the Dataverse installations, and the ADA website, to alert users that the systems will be offline on the specified date(s) and time(s).
The ADA technical team has deployed monitoring software that detects when ADA’s VMs go offline and sends an email to the ADA Director, Technical Manager and DevOps roles. The ADA DevOps (primarily) and the Technical Manager (if necessary) work to get the systems back online, consulting with NCI if necessary.
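The source does not name the monitoring software, so the following is only a minimal sketch of the underlying idea: probe each host and report those that do not respond. The function names, port and timeout are assumptions, and the email step is left to the caller.

```python
import socket
from typing import Callable, Iterable


def tcp_probe(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_offline(hosts: Iterable[str], probe: Callable[[str], bool] = tcp_probe) -> list[str]:
    """Return the hosts for which the probe fails, preserving input order."""
    return [host for host in hosts if not probe(host)]
```

A scheduled job could call `find_offline` with the list of service hostnames and send an alert whenever the result is non-empty. Accepting the probe as a parameter lets the check be exercised without network access.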
Disaster Recovery
NCI [7] manages snapshots of ADA’s data as follows:
ADA's VM Datastore(s):
- Snapshot every 3 hours and retained for 24 hours.
- Snapshot every 24 hours and retained for 14 days.
- Mirrored to alternate data centre daily and retained for 24 hours.
NAS File Server Volume(s):
- Snapshot every hour and retained for 24 hours.
- Snapshot every day and retained for 7 days.
- Snapshot every week and retained for 1 month.
- Mirrored to alternate data centre daily and retained for 24 hours.
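The snapshot schedules above imply how many restore points exist at any time. The sketch below works that out from the stated intervals and retention periods; the class and tier names are illustrative, and "1 month" is assumed to mean 30 days.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotTier:
    interval_hours: float   # time between snapshots
    retention_hours: float  # how long each snapshot is kept

    def copies_retained(self) -> int:
        """Number of snapshots held at steady state."""
        return int(self.retention_hours // self.interval_hours)


# Values taken from the schedules listed above.
VM_TIERS = {
    "3-hourly": SnapshotTier(3, 24),
    "daily": SnapshotTier(24, 14 * 24),
}
NAS_TIERS = {
    "hourly": SnapshotTier(1, 24),
    "daily": SnapshotTier(24, 7 * 24),
    "weekly": SnapshotTier(7 * 24, 30 * 24),  # "1 month" assumed to be 30 days
}

vm_counts = {name: tier.copies_retained() for name, tier in VM_TIERS.items()}
nas_counts = {name: tier.copies_retained() for name, tier in NAS_TIERS.items()}
```

The shortest interval in each group also bounds the data that could be lost when restoring from a snapshot: up to 3 hours for the VM datastores and up to 1 hour for the NAS volumes.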
Every 3 months ADA’s NCI mdss (mass data storage system) data is automatically tarballed and copied to the NCI mass storage / tape silo service.
The Dataverse databases are automatically backed up daily:
- Locally on their respective VMs, retained for 1 month.
- To NCI external storage, retained for 6 months.
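Retention rules like these are usually enforced by pruning backups older than a cutoff. The sketch below shows that selection logic only; it is not ADA's backup script, the function name is hypothetical, and 1 month and 6 months are assumed to be 30 and 180 days.

```python
from datetime import date, timedelta
from typing import Iterable

LOCAL_RETENTION_DAYS = 30      # "1 month", assumed to be 30 days
EXTERNAL_RETENTION_DAYS = 180  # "6 months", assumed to be 180 days


def expired_backups(backup_dates: Iterable[date], today: date, retention_days: int) -> list[date]:
    """Return, oldest first, the backup dates that fall outside the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in backup_dates if d < cutoff)
```

A backup taken exactly `retention_days` ago is kept; only strictly older ones are selected for removal.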
SSL Certificates and Domains
NCI is consulted on any technical issues related to SSL certificates and may advise the ADA team to consult ANU ITS. ANU ITS is also consulted on technical issues relating to ADA’s web service domains.
Technical Change
The Dataverse GitHub repo [53] is monitored for new releases. The ADA Director and Technical Manager are also members of the Dataverse User Community [19] and are made aware of new releases via that group. The ADA Technical Manager documents any Dataverse bugs affecting ADA as issues on the Dataverse GitHub repo [53].
Feature requests for functionality that ADA requires but Dataverse lacks are likewise raised as issues on the Dataverse GitHub repo [53] for consideration.
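Monitoring the repository for new releases can be automated. The sketch below queries the GitHub REST API's "latest release" endpoint for the IQSS/dataverse repository and compares the tag with an installed version. It is an illustration rather than ADA's procedure, and it assumes release tags are dotted numbers with an optional leading "v"; the function names are hypothetical.

```python
import json
from urllib.request import urlopen

LATEST_URL = "https://api.github.com/repos/IQSS/dataverse/releases/latest"


def parse_version(tag: str) -> tuple[int, ...]:
    """Turn a tag such as 'v6.4' or '5.13.1' into a comparable tuple of ints."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def is_newer(latest_tag: str, installed_tag: str) -> bool:
    """True when the latest release is numerically newer than the installed one."""
    return parse_version(latest_tag) > parse_version(installed_tag)


def fetch_latest_tag(url: str = LATEST_URL) -> str:
    """Fetch the tag name of the latest published release (performs a network call)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)["tag_name"]
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating "v6.10" as older than "v6.9".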
References
[42] Open Archival Information System (OAIS) Reference Model – (https://public.ccsds.org/pubs/650x0m2.pdf)
[49] The Dataverse Project – (https://dataverse.org)
[82] Metabase – (https://www.metabase.com/)
[51] osTicket – (https://github.com/osTicket/osTicket)
[73] ADA CARAT tool – (https://github.com/ADA-ANU/ADA_Research_Data_Tools/tree/main/ADA_DRAT_v2)
[6] ADAPT – (https://docs.ada.edu.au/index.php/ADAPT)
[74] ADA Ingest Reporting Tool – (https://github.com/ADA-ANU/ADA_Research_Data_Tools/tree/main/ADA_reports)
[97] CADRE – (https://cadre.ada.edu.au)
[96] 5 Safes – (https://fivesafes.org/)
[10] ADA Production Dataverse – (https://dataverse.ada.edu.au/)
[7] National Computational Infrastructure – (https://nci.org.au/)
[53] Dataverse GitHub – (https://github.com/IQSS/dataverse/issues)
[19] Dataverse User Community – (https://groups.google.com/g/dataverse-community?pli=1)