Technical Infrastructure
Repository Software
The ADA implements the OAIS Reference Model [42] using deployed Dataverse [49] installations for the SIP (deposit.ada.edu.au), AIP (dataverse-test.ada.edu.au) and DIP (dataverse.ada.edu.au). The Dataverse Project [49] is open-source, community-supported software. The Dataverse team has described how the Dataverse software meets the CoreTrustSeal Technical Infrastructure requirements [50].
ADA also implements web-based tools, developed and maintained in-house, to support its archiving process:
- Curation and Risk Assessment Tool (CARAT) [73]
- ADA Deposit and Preservation Tool (ADAPT) [6]
- Ingest Reporting Tool [74], which takes an uploaded SPSS file and generates a data dictionary, a quality check and a confidentiality check
Data access requests from both production DIP instances, dataverse.ada.edu.au and anu-dataverse.ada.edu.au, are managed through osTicket [51], an open-source ticketing system. Access is granted or rejected in Dataverse itself by adding file permissions for approved users. Because access requests are managed through it, osTicket must be available to the same extent as Dataverse.
Version Control
The Harvard Dataverse Project team uses GitHub [54] for version control. Any tools or software that ADA produces internally are also managed through GitHub.
IT Service Management
The ADA Technical Manager has initiated use of the YASM Service Management framework, chosen for its simplicity compared with the ITIL Service Management framework. As part of this framework, a Service Portfolio has been created to record and track ADA’s various internal and external services.
At the end of every calendar month, the ADA DevOps role performs updates and maintenance tasks on all of ADA’s services to keep code and applications up to date. A simple Change Management approach is taken to inform the internal ADA team and external Dataverse users of changes that will be implemented with upcoming Dataverse upgrades. ADA maintains Dataverse installations whose sole purpose is to test changes in new Dataverse releases before the update is pushed to ADA’s three primary public installations. ADA plans its release schedule for new Dataverse versions in step with the Dataverse Project’s releases. Not all available functionality is wanted by ADA, so new functionality is tested and evaluated to determine when to enable it.
NCI runs an automated set of security tests on ADA’s NCI-hosted services every week. A report of any identified issues is emailed to the ADA technical team and to the ADA Director, and the ADA technical team carries out maintenance to address the issues identified in the report. ADA services are also monitored by the ANU Chief Information Office (CIO) [52], which emails the ADA team about any problems it discovers, with a request to address them.
Infrastructure Standards
ANU and NCI [7] have security standards in place to prevent ANU and NCI technical infrastructure from being adversely affected. NCI monitors the primary Dataverse installations with its f5 WAF [55] service.
Availability, Bandwidth & Connectivity
The ADA Dataverse installations are available 24/7. Data access requests are prioritised and managed as the capacity of the ADA Access Management team allows. Notices are posted on the ADA website and the production Dataverse instances to alert users to upcoming ADA shutdowns, so that users can submit data access requests in good time.
NCI manages network availability and bandwidth for ADA’s NCI-hosted services. NCI also manages its f5 WAF that provides a level of protection for the Dataverse installations.
ANU’s central ITS services [52] manage the domains for each Dataverse installation, as well as DNS updates for any services not behind the f5 WAF. ITS is responsible for monitoring the domain registrations to ensure they are renewed before they expire and services become unavailable.
NCI manages SSL certificates for ADA’s web-based services and informs the ADA technical team when SSL certificates are about to expire, reissuing them for installation on ADA’s virtual machines (VMs).
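As an illustration only, a certificate expiry check that could complement these notifications might look like the following Python sketch. The 30-day warning threshold and the function names are assumptions made for this example and do not describe NCI’s or ADA’s actual tooling.

```python
import socket
import ssl
from datetime import datetime, timezone

WARN_DAYS = 30  # assumed warning threshold, not taken from ADA or NCI policy


def fetch_not_after(host, port=443, timeout=10.0):
    """Return the served certificate's 'notAfter' string, e.g. 'Dec 31 00:00:00 2029 GMT'."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now):
    """Whole days between 'now' (timezone-aware) and the certificate expiry."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def needs_renewal(not_after, now, warn_days=WARN_DAYS):
    """True when the certificate expires within warn_days or has already expired."""
    return days_until_expiry(not_after, now) <= warn_days
```

A scheduled job could call `needs_renewal(fetch_not_after("dataverse.ada.edu.au"), datetime.now(timezone.utc))` for each public host name and raise a ticket when it returns `True`.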
The ADA technical team has implemented monitors that detect when ADA’s VMs go offline and send an email to the ADA Director, Technical Manager and DevOps roles. The ADA DevOps role and Technical Manager work to get the systems back online, consulting with NCI if necessary.
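The kind of monitor described above can be sketched in Python as follows. This is a minimal illustration, not ADA’s actual monitoring code: the TCP reachability probe, the function names and the example addresses are all assumptions.

```python
import socket
from email.message import EmailMessage


def is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False


def check_hosts(hosts, probe=is_reachable):
    """Return the (host, port) pairs for which the probe fails."""
    return [(host, port) for host, port in hosts if not probe(host, port)]


def build_alert(offline, sender, recipients):
    """Build an email message listing the unreachable services."""
    msg = EmailMessage()
    msg["Subject"] = f"ADA monitor: {len(offline)} service(s) unreachable"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    lines = [f"{host}:{port}" for host, port in offline]
    msg.set_content("The following services did not respond:\n" + "\n".join(lines))
    return msg
```

When `check_hosts` returns a non-empty list, the message from `build_alert` could be sent with `smtplib.SMTP(...).send_message(msg)` to the Director, Technical Manager and DevOps addresses. Passing the probe as a parameter keeps the check testable without network access.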
ADA is alerted to planned NCI infrastructure outages. The ADA team posts messaging on the Dataverse installations, and the ADA website, that the systems will be offline on the specified date(s) and time(s).
Disaster Recovery
Hourly snapshots of ADA’s NCI storage are taken, as well as snapshots/backups of ADA’s Dataverse installations, including each Dataverse installation’s local file storage. Backups of the Dataverse databases are created on their specific VM and stored for 3 months. NCI can restore the ADA project storage from regular backups. If any Dataverse installation has to be re-deployed from a snapshot, the ADA DevOps role works with NCI to bring it back online from the most recent snapshot.
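The 3-month retention of database backups could be enforced with a selection rule like the one in this Python sketch. It is illustrative only: treating "3 months" as 90 days, the file names and the function name are assumptions, not ADA’s actual backup tooling.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=90)  # assumption: "3 months" approximated as 90 days


def expired_backups(backups, now, retention=RETENTION):
    """Return names of backups created before the retention cutoff, oldest first.

    'backups' maps a backup file name to its creation time.
    """
    cutoff = now - retention
    old = [(created, name) for name, created in backups.items() if created < cutoff]
    return [name for created, name in sorted(old)]
```

A cleanup job would delete only the files this function returns, leaving every backup from the last 90 days available for restoring a database.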
NCI is consulted on any issues with SSL certificates and may inform the ADA team to consult ANU ITS. ANU ITS is also consulted on issues relating to the domain.
Technical Change
The Dataverse GitHub repo is monitored for new releases. The ADA staff are also members of the Dataverse User Community [19] and are made aware of new releases via that group as well. Any Dataverse bugs or new features needed by ADA are documented by the ADA Technical Manager on the Dataverse GitHub repo.
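Release monitoring of this kind can be automated against the GitHub REST API, which exposes the latest release of a repository. The Python sketch below is a hedged illustration rather than ADA’s actual process. It assumes release tags of the form `v6.2` (an optional `v` followed by dot-separated integers), and the function names are hypothetical.

```python
import json
import urllib.request

# Latest-release endpoint of the GitHub REST API for the Dataverse repository.
RELEASES_URL = "https://api.github.com/repos/IQSS/dataverse/releases/latest"


def fetch_latest_tag(url=RELEASES_URL, timeout=10.0):
    """Return the tag name of the latest published release."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["tag_name"]


def parse_version(tag):
    """Turn a tag such as 'v6.2' into a comparable tuple such as (6, 2).

    Assumes purely numeric components; other tag formats raise ValueError.
    """
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def is_newer(latest, installed):
    """True when the 'latest' tag is a higher version than 'installed'."""
    return parse_version(latest) > parse_version(installed)
```

Comparing parsed tuples rather than raw strings matters because, as strings, `"v5.9"` sorts after `"v5.13"`. A scheduled job could call `is_newer(fetch_latest_tag(), installed_version)` and notify the technical team when it returns `True`.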
Any technical changes relating to Preservation and/or Reuse identified by the Archivist Team are brought up with the ADA Technical Team on an as-needed case-by-case basis. The technical change is discussed and evaluated as to whether it is required and possible to implement. If the ADA technical team can implement the needed change, the team manages it in consultation with the Archivist team. The ADA technical team consults with identified external sources to implement changes where required.
Any Dataverse feature requests for functionality deemed missing according to ADA requirements are created as an issue in the Dataverse GitHub [53] for consideration.
References
[38] Technical Infrastructure – (https://docs.ada.edu.au/index.php/Technical_Infrastructure)
[50] Dataverse support for CTS – (https://dataverse.org/book/technical-infrastructure)
[51] osTicket – (https://github.com/osTicket/osTicket)
[52] ANU ITS – (https://services.anu.edu.au/business-units/information-technology-services)
[53] Dataverse GitHub – (https://github.com/IQSS/dataverse/issues)
[54] GitHub – (https://github.com)
[55] F5 – (https://www.f5.com/)
[42] Open Archival Information System (OAIS) Reference Model – (https://public.ccsds.org/pubs/650x0m2.pdf)
[49] The Dataverse Project – (https://dataverse.org)
[6] ADAPT – (https://docs.ada.edu.au/index.php/ADAPT)
[73] ADA CARAT tool – (https://github.com/ADA-ANU/ADA_Research_Data_Tools/tree/main/ADA_DRAT_v2)
[74] ADA Ingest Reporting Tool – (https://github.com/ADA-ANU/ADA_Research_Data_Tools/tree/main/ADA_reports)
[7] National Computational Infrastructure – (https://nci.org.au/)
[19] Dataverse User Community – (https://groups.google.com/g/dataverse-community?pli=1)