Australian Data Archive
Archive, Canberra, Australia
Research output, citation impact, and the most-cited recent papers from Australian Data Archive. Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Australian Data Archive
This poster is presented in tandem with the Sensitive Data IG session at RDA18 (which shares the same name). In previous meetings, the Sensitive Data IG has scoped the interest and needs of the RDA community. At this meeting, we focus on specific challenges and opportunities associated with working with Sensitive Data. This poster explores one of these: how do we understand sensitive data definitions in different regions and disciplines, and how might we develop a shared language around sensitive data? One example is developing an understanding of different community-agreed vocabularies and how these relate to each other. This poster will consist of three sections. First, case studies of different types of sensitive data will be presented. These will include, for example, case studies from the humanities, medicine, and military cases. Second, different classification systems for the level of sensitive data will be shown. This will explore, for example, what constitutes a high level of sensitive data in one classification system compared to another. Third, examples of how sensitive data is managed across different regions will be given. For example, what protocols govern the use of sensitive data in the Australian context compared to the European context? By presenting these examples, this poster aims to promote wider discussion on how sensitive data is currently defined. Further, it aims to draw out how context can impact whether data is considered sensitive. In doing so, it takes steps towards developing a common language for discussing sensitive data classifications and levels across contexts/disciplines/localities. This poster complements the RDA Sensitive Data IG session. It is developed by the Sensitive Data IG co-chairs and is an opportunity for us to convey our ideas and seek input from the community in a different forum. More information and contact details for the Sensitive Data IG are available at https://www.rd-alliance.org/groups/sensitive-data-interest-group
The expansion of data holdings to incorporate qualitative content has been a major emphasis of the Australian Data Archive since 2007, focussed on the establishment of ADA Qualitative (formerly AQuA). While there have been significant challenges during this time in encouraging qualitative researchers to deposit content with the archive, the deposit of these new data forms has also created new challenges for the archive in ingestion, processing and dissemination. These challenges have been threefold: (1) methodological: what changes do researchers need to make in their methods to support archival practice; (2) technical: how does ADA adapt its existing metadata schema and data management software (DDI2 and Nesstar) to support qualitative content; and (3) practical: how are processing procedures for archivists changed when documenting qualitative content. This paper explores each of these challenges in turn, focussing particularly on the adoption of the QuDEx schema developed by the UK Data Archive to support qualitative data archiving. The paper will discuss ADA's experience with the use of the QuDEx schema to address these three challenges, and provide suggestions for future developments of the schema and qualitative archiving more generally.
DDI Lifecycle and DDI-CDI provide significant capabilities for the integration and harmonisation of content across datasets. As part of the recently completed WorldFAIR project led by CODATA, a team from the Australian Data Archive (ADA) and Sikt led a work package to examine ways to improve FAIR practices in the management of harmonised content in cross-national social surveys. This work was completed in three stages – a review of comparative survey data management practices at Sikt and ADA; development of a human and machine-actionable workflow for harmonisation of social surveys (the Cross-Cultural Survey Harmonisation workflow – CCSH) that leverages DDI and other standards; and a proof-of-concept test of the CCSH workflows leveraging services available at ADA and Sikt through their respective Colectica registries. Overall, the pilot demonstrated that the CCSH workflow forms a viable foundation for standardising and progressively automating the process of survey data harmonisation. However, the pilot also showed that a significant degree of manual human input is still required, and thus that more work is needed before the workflow is truly FAIR. We therefore provide recommendations for data managers and the Alliance as to how more integration and automation might be achieved in future.
The Social Surveys Work Package (WP06) of the WorldFAIR project is focussed on the improvement of FAIR practices in the management of harmonised content in cross-national social surveys. The first report from the Work Package (Deliverable 6.1) provided an overview of the practices of comparative (cross-national) social surveys, through case studies of: (1) the European Social Survey (ESS) and (2) a satellite study, the Australian Social Survey International – European Social Survey (AUSSI-ESS). The focus of this Deliverable 6.2 is oriented towards progress on three recommendations (Rs) from that first report - the use of the DDI Lifecycle and variable cascade (R6.1 and R6.2), and requirements for formal registries of variables and reusable content (R6.5). To achieve this, this paper explores the development of recommended practices for the management and processing of cross-national survey data for the establishment of harmonised social science datasets. In this deliverable, we outline a proposed workflow for the processing of data harmonisation of social surveys, that takes account of the practical steps required to bring diverse content together in a machine-actionable way, and that could best take advantage of external registered, persistent content. This workflow considers the core steps involved in the harmonisation process, key issues that occur in the processing of data during this process, and potential resolutions of these issues. These resolutions are all oriented towards improving FAIR practices in the harmonisation process - through the use of reusable, accessible metadata structures that can both improve processing consistency for current projects, and be applied to future harmonisation projects. The key conclusions are two-fold. Firstly, there is a key need for the application of standardised workflows to enable consistent interaction with registry content held across multiple data repositories. 
The proposed workflow detailed in this report is a first effort at such a workflow model. Secondly, there is a need for consistent pre-processing of data and metadata within repositories to reduce error handling in the harmonisation process. The final section of the report provides an initial set of processing rules that could be used in such circumstances. The third and final phase of this Work Package will then focus on testing this workflow and these rules on new waves of data coming from the AUSSI and ESS projects. Visit WorldFAIR online at http://worldfair-project.eu. WorldFAIR is funded by the EC HORIZON-WIDERA-2021-ERA-01-41 Coordination and Support Action under Grant Agreement No. 101058393.
Presentations from the workshop on 'Referencing Data in Publications: Principles, Policy and Practice' held on 28 October 2015 at the Australian Academy of Science, Canberra. This workshop was convened by ANDS and CODATA as part of an international series of workshops organised by the CODATA Task Group on Data Citation.
The importance of data citation for understanding the impact of social surveys has become increasingly recognized as a priority concern among research infrastructure providers and funders (ANDS, 2012; NSF, 2012; Ball and Duke, 2012). For data archives, data citation provides a mechanism to understand the dissemination activities of the archive, particularly in enabling access to data for secondary use. While social science data archives have long recommended or required the use of citations as a condition of access to datasets, compliance with this condition is minimal (Piwowar, 2011). For this reason, many data archives and repositories have implemented or are currently exploring new mechanisms for enabling data citation, such as DOIs. One such pilot study is being conducted by the Australian Data Archive (ADA). This project involves three elements: (1) a review of the current literature on data citation practices in Australian and international social science; (2) a survey of current practice among users of 5 major Australian social science data sets; and (3) a pilot study of the use of DOIs with ADA datasets. The paper will present the current results of this project, recommendations for the ADA regarding data citation, and implications for data archives and repositories more generally.
As sensitive data are increasingly used for research purposes, reducing the risk of data misuse has become particularly crucial. At the same time, as demonstrated during the COVID-19 crisis, sharing high-quality data is a sine qua non to assess and compare research results and to leverage data to their fullest capacity. The Sensitive Data Interest Group aims to promote the FAIR principles and reproducible research, while drawing attention to the unique risks associated with sensitive data and exploring mitigation strategies for these risks. Synthesis research that aggregates data at large scales often uses several kinds of sensitive data, but the ethical and legal issues are often not fully addressed, especially when harmonising differing ethical and legal considerations across regions. Further complicating matters, “sensitive data” are often not even defined in the same way. As a result, reproducing research in different regions or contexts is often difficult, and sensitive data sharing processes are not well sustained. In this poster, our group proposes the following working definition of sensitive data, adapted from David et al., 2020, “Templates for FAIRness evaluation criteria - RDA-SHARC IG” https://zenodo.org/record/3922069#.YCJU7ehKg2w : Information that is regulated by law due to possible risk for plants, animals, individuals and/or communities and for public and private organisations. Sensitive personal data include information related to racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership and data concerning the health or sex life of an individual. These are data that could be identifiable and could potentially cause harm through their disclosure. 
For local and government authorities, sensitive data is related to security (political, diplomatic, military data, biohazard concerns, etc.), environmental risks (nuclear or other sensitive installations, for example) or environmental preservation (habitats, protected fauna or flora, in particular). The sensitive data of a private body concerns in particular strategic elements or elements likely to jeopardise its competitiveness. Our Sensitive Data IG will workshop this definition and present a summary of the aims of the group and our charter under the RDA validation process. Through the Sensitive Data IG, we aim to provide a forum for a range of communities to share their requirements and jointly develop strategies, support, recommendations and guidelines relevant to sensitive data. We propose defining common goals around how to address the risk associated with different types of sensitive data (e.g. ecological data, indigenous data, human health data, etc.), as well as how to responsibly disseminate, aggregate, and use preexisting heterogeneous sensitive data at a global scale. This group will partner with other IGs and WGs to produce recommendations and guidelines around sensitive data (for example, “Sensitive Data Toolkit for Researchers Part 2: Human Participant Research Data Risk Matrix”, https://zenodo.org/record/4088954#.YFZCUq_7Q2w). We welcome participation and contributions from the entire RDA community and more broadly.