Technical Infrastructure

From ADA Public Wiki
Revision as of 05:52, 6 September 2024 by JMcDougall

Repository Software

ADA implements the OAIS Reference Model [42] with deployed Dataverse [49] installations for the SIP (deposit.ada.edu.au), AIP (dataverse-test.ada.edu.au) and DIP (dataverse.ada.edu.au). The Dataverse Project [49] is open-source, community-supported software. The Dataverse team has described how the Dataverse software meets the Core Trust Seal Technical Infrastructure requirements [50].

ADA also implements web-based tools, developed and maintained in-house, to support its archiving process:

  • Curation and Risk Assessment Tool (CARAT)
  • ADA Deposit and Preservation Tool (ADAPT) [6]
  • Ingest Reporting Tool, a Shiny application (accepts an uploaded SPSS file, generates a data dictionary, and runs a quality check and a confidentiality check)

Access requests for data in both production DIP instances (dataverse.ada.edu.au and anu-dataverse.ada.edu.au) are managed through the open-source ticketing software osTicket [51], with each request then being granted or rejected in Dataverse itself. Because access requests to datasets cannot be managed without it, osTicket must be available to the same extent as Dataverse.

Version Control for Repository Software

The Harvard Dataverse Project team uses GitHub [54] for its version control system. Any tools or software that ADA produces internally are managed through GitHub.

IT Service Management

The ADA Technical Manager has initiated the YASM Service Management framework, chosen because it is simpler than the ITIL Service Management framework. As part of this framework, a Service Portfolio has been created to record and track ADA’s various internal and external services.

Additionally, at the end of every calendar month, the ADA DevOps role performs updates and maintenance tasks on all of ADA’s services to keep the code and applications up to date. A simple Change Management approach is taken to inform the internal ADA team and external Dataverse users of changes that will be implemented with upcoming Dataverse upgrades:
  • ADA maintains Dataverse installations whose sole purpose is to test changes in new Dataverse releases before pushing the update to ADA’s three primary publicly consumed installations.
  • ADA plans the release schedule of new Dataverse versions in step with the Dataverse Project’s releases. ADA does not want all available functionality, so any new functionality is tested and evaluated to determine when to enable it.
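
The test-before-production approach above can be illustrated with a minimal sketch. It assumes that each installation's version is available as a plain string such as "6.3" (Dataverse reports one through its `/api/info/version` endpoint). The function names and the wording of the returned steps are hypothetical and are not part of ADA's tooling.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '6.3' or 'v6.3' into a 3-part tuple of ints."""
    parts = tuple(int(part) for part in text.strip().lstrip("vV").split("."))
    # Pad to three components so that '6' and '6.0.0' compare as equal.
    return parts + (0,) * (3 - len(parts))


def next_step(latest_release: str, test_version: str, production_version: str) -> str:
    """Suggest the next upgrade step for a test-then-production rollout.

    The returned strings are illustrative labels, not an official procedure.
    """
    latest = parse_version(latest_release)
    test = parse_version(test_version)
    production = parse_version(production_version)
    if test < latest:
        # A new release exists that has not yet reached the test installation.
        return "upgrade test installation"
    if production < test:
        # The test installation is ahead; production follows after evaluation.
        return "evaluate on test, then upgrade production"
    return "up to date"
```

For example, `next_step("6.3", "6.3", "6.2")` reports that the release is on the test installation but has not yet been promoted to production.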

NCI performs an automated weekly set of security tests on ADA’s NCI-hosted services. A report of any identified issues is emailed to the ADA technical team and to the ADA Director. The ADA technical team carries out maintenance to address the issues identified in the report. ADA’s services are also monitored by the ANU Chief Information Office, and ADA receives emails that alert the ADA team to any discovered problems, with a request to address them.

International, Community or Technical Infrastructure Standards

ANU and NCI [7] have security standards in place to prevent ANU and NCI technical infrastructure from being adversely affected. NCI monitors the primary Dataverse installations with its f5 WAF service. The ANU CIO [52] monitors ANU infrastructure to ensure compliance with ANU operational standards. ADA receives reports and emails, usually with suggested fixes, when something problematic is detected. The ADA technical team implements the suggested fixes where possible.

Measures to Ensure Availability, Bandwidth & Connectivity

ADA’s Dataverse installations are available 24/7. Access requests are prioritised and managed within the time capabilities of the ADA Access Management team. Messaging is posted to the ADA website and the production Dataverse instances to alert users to future ADA shutdowns, so that users can submit data access requests in a timely manner. NCI [7] manages the network availability and bandwidth for ADA’s NCI-hosted services. NCI also manages its f5 WAF [55], which provides a level of protection for the Dataverse installations.

ANU’s central ITS services [52] manage the domains for each Dataverse installation as well as DNS updates for any services not behind the f5 WAF. ITS is responsible for monitoring the domain registrations, to ensure they are renewed before they expire and services become unavailable. NCI manages SSL certificates for ADA’s web-based services and informs the ADA technical team when SSL certificates are about to expire, reissuing them for installation on ADA’s virtual machines (VMs). The ADA technical team has implemented monitors that detect when ADA’s VMs go offline and send an email to the ADA Director, Technical Manager and DevOps roles:

  • The ADA DevOps and Technical Manager roles work to get the systems back online, consulting with the NCI team if necessary.
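
The offline monitoring described above could take many forms, and the source does not say how ADA's monitors are built. The following is a minimal sketch, assuming that "offline" means a TCP connection to the host cannot be opened, and that the alert is an ordinary email. The function names, the default port and all addresses are hypothetical.

```python
import socket
from email.message import EmailMessage
from typing import Callable, Iterable, List, Optional


def is_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def find_offline(hosts: Iterable[str],
                 check: Callable[[str], bool] = is_reachable) -> List[str]:
    """Return the hosts for which the reachability check fails."""
    return [host for host in hosts if not check(host)]


def build_alert(offline: List[str], sender: str,
                recipients: List[str]) -> Optional[EmailMessage]:
    """Build an alert email listing the offline hosts, or None if all are up."""
    if not offline:
        return None
    message = EmailMessage()
    message["Subject"] = f"ADA monitor: {len(offline)} host(s) unreachable"
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message.set_content("Unreachable hosts:\n" + "\n".join(offline))
    return message
```

A scheduled job could call `find_offline` on the list of VM hostnames and pass any resulting message to `smtplib.SMTP(...).send_message(message)`. Passing the check as a parameter keeps the logic testable without network access.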

ADA is alerted to planned NCI infrastructure outages and informs the ADA internal team. The ADA team posts messaging on the Dataverse installations, and the ADA website, that the systems will be offline on the specified date(s) within specified hours.

Disaster recovery:

  • Hourly snapshots of ADA’s NCI storage are taken, as well as hourly snapshots/backups of ADA’s Dataverse installations, including each installation’s local file storage.
  • Backups of the Dataverse databases are created on their respective VMs and stored for 3 months.
  • NCI can restore the NCI ADA project storage.
  • If any Dataverse installation has to be re-deployed from a snapshot, the ADA DevOps role works in conjunction with NCI to get it back up and running with the most recent snapshot.
  • If anything goes awry due to the domain, ANU ITS is consulted.
  • If anything goes awry with SSL certificates, NCI is consulted. NCI may inform the ADA team to consult ANU ITS about the issue.
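
The three-month retention of database backups listed above implies some form of pruning. The sketch below shows one way to select backups that have passed the retention period. It assumes that "3 months" can be approximated as 90 days and that each backup has a known creation time. The function name and the file names in the example are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Assumption: the "3 months" retention period is approximated as 90 days.
RETENTION = timedelta(days=90)


def expired_backups(backups: Dict[str, datetime], now: datetime,
                    retention: timedelta = RETENTION) -> List[str]:
    """Return the names of backups created before the retention cutoff, sorted."""
    cutoff = now - retention
    return sorted(name for name, created in backups.items() if created < cutoff)
```

The function only reports which backups have expired. Deleting them is left to the caller, so that the selection can be reviewed or logged first.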

Processes to Monitor & Manage Technical Change

The Dataverse GitHub repo is monitored for new releases. ADA staff are also members of the Dataverse User Community [19] and are made aware of new releases via that group as well. Any Dataverse bugs or new features needed by ADA are documented by the ADA Technical Manager on the Dataverse GitHub repo. Any technical changes relating to Preservation and/or Reuse are discovered by the Archivist Team and raised with the ADA Technical Team case by case, as needed:

  • The technical change is discussed and evaluated to determine whether it is required and whether it is possible to implement.
  • If the ADA technical team can implement the needed change, the team manages it in consultation with the Archivist team.
  • If the ADA technical team can’t implement the needed change, identified external sources are consulted to determine whether they can implement it.
  • Feature requests for functionality that ADA requires but Dataverse lacks are created as issues in the Dataverse GitHub [53]:
    • The Dataverse team decides whether and when to implement the feature request.
  • If the change is not deemed viable for whatever reason, it is documented and shelved, and revisited at a later date if necessary.
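
The monitoring of the Dataverse GitHub repo for new releases, mentioned at the start of this section, could be automated. The sketch below is one possibility and not a description of how ADA does it. It assumes the repository is `IQSS/dataverse` and uses GitHub's REST endpoint for the latest release, whose JSON response includes a `tag_name` field. The function names and the idea of storing a "last seen" tag are hypothetical.

```python
import json
import urllib.request
from typing import Optional

# Assumption: the Dataverse repository path is IQSS/dataverse.
LATEST_RELEASE_URL = "https://api.github.com/repos/IQSS/dataverse/releases/latest"


def extract_tag(payload: dict) -> str:
    """Return the release tag (for example 'v6.3') from a GitHub release payload."""
    return payload["tag_name"]


def fetch_latest_tag(url: str = LATEST_RELEASE_URL, timeout: float = 10.0) -> str:
    """Fetch the latest release from GitHub and return its tag (needs network)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_tag(json.load(response))


def is_new_release(latest_tag: str, last_seen_tag: Optional[str]) -> bool:
    """Return True if the latest tag differs from the one recorded previously."""
    return latest_tag != last_seen_tag
```

A periodic job could call `fetch_latest_tag()`, compare the result with the tag recorded on the previous run, and notify the technical team when `is_new_release` returns True.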