DataONE

Archive · Santa Barbara, United States

Research output, citation impact, and the most-cited recent papers from DataONE (United States). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works

Citations

1.9K

h-index

i10-index

Also known as

Data Observation Network for Earth, DataONE

Top-cited papers from DataONE

Public Participation in Scientific Research: a Framework for Deliberate Design

Jennifer Shirk, Heidi L. Ballard, Candie C. Wilderman, Tina Phillips +4 more

2012 · Ecology and Society · 1.4K citations · doi:10.5751/es-04705-170229

Shirk, J. L., H. L. Ballard, C. C. Wilderman, T. Phillips, A. Wiggins, R. Jordan, E. McCallie, M. Minarchek, B. V. Lewenstein, M. E. Krasny, and R. Bonney. 2012. Public participation in scientific research: a framework for deliberate design. Ecology and Society 17(2): 29. https://doi.org/10.5751/ES-04705-170229

Bridging Theory with Practice: An Exploratory Study of Visualization Use and Design for Climate Model Comparison

Aritra Dasgupta, Jorge Poco, Yaxing Wei, Robert B. Cook +2 more

2015 · IEEE Transactions on Visualization and Computer Graphics · 59 citations · doi:10.1109/tvcg.2015.2413774

Evaluation methodologies in visualization have mostly focused on how well the tools and techniques cater to the analytical needs of the user. While this is important in determining the effectiveness of the tools and advancing the state-of-the-art in visualization research, a key area that has mostly been overlooked is how well established visualization theories and principles are instantiated in practice. This is especially relevant when domain experts, and not visualization researchers, design visualizations for analysis of their data or for broader dissemination of scientific knowledge. There is very little research on exploring the synergistic capabilities of cross-domain collaboration between domain experts and visualization researchers. To fill this gap, in this paper we describe the results of an exploratory study of climate data visualizations conducted in tight collaboration with a pool of climate scientists. The study analyzes a large set of static climate data visualizations for identifying their shortcomings in terms of visualization design. The outcome of the study is a classification scheme that categorizes the design problems in the form of a descriptive taxonomy. The taxonomy is a first attempt for systematically categorizing the types, causes, and consequences of design problems in visualizations created by domain experts. We demonstrate the use of the taxonomy for a number of purposes, such as, improving the existing climate data visualizations, reflecting on the impact of the problems for enabling domain experts in designing better visualizations, and also learning about the gaps and opportunities for future visualization research. We demonstrate the applicability of our taxonomy through a number of examples and discuss the lessons learnt and implications of our findings.

Visual Reconciliation of Alternative Similarity Spaces in Climate Modeling

Jorge Poco, Aritra Dasgupta, Yaxing Wei, William W. Hargrove +4 more

2014 · IEEE Transactions on Visualization and Computer Graphics · 26 citations · doi:10.1109/tvcg.2014.2346755

Visual data analysis often requires grouping of data objects based on their similarity. In many application domains researchers use algorithms and techniques like clustering and multidimensional scaling to extract groupings from data. While extracting these groups using a single similarity criteria is relatively straightforward, comparing alternative criteria poses additional challenges. In this paper we define visual reconciliation as the problem of reconciling multiple alternative similarity spaces through visualization and interaction. We derive this problem from our work on model comparison in climate science where climate modelers are faced with the challenge of making sense of alternative ways to describe their models: one through the output they generate, another through the large set of properties that describe them. Ideally, they want to understand whether groups of models with similar spatio-temporal behaviors share similar sets of criteria or, conversely, whether similar criteria lead to similar behaviors. We propose a visual analytics solution based on linked views, that addresses this problem by allowing the user to dynamically create, modify and observe the interaction among groupings, thereby making the potential explanations apparent. We present case studies that demonstrate the usefulness of our technique in the area of climate science.

Hybrid Genetic Algorithms and Simulated Annealing for Multi-trip Vehicle Routing Problem with Time Windows

Amalia Kartika Ariyani, Wayan Firdaus Mahmudy, Yusuf Priyo Anggodo

2018 · International Journal of Electrical and Computer Engineering (IJECE) · 21 citations · doi:10.11591/ijece.v8i6.pp4713-4723

The vehicle routing problem with time windows (VRPTW) is an NP-hard problem. The multi-trip approach solves the VRPTW by searching for the trip schedule that gives the best result. Although various algorithms exist for the problem, there is an opportunity to improve on them in order to obtain better results. In this research, a genetic algorithm is hybridized with a simulated annealing algorithm to solve the problem. The genetic algorithm is employed to explore the global search area, and simulated annealing is employed to exploit the local search area. Four combination types of genetic algorithm and simulated annealing (GA-SA) are tested to find the best solution. The computational experiment shows that GA-SA1 and GA-SA4 produce the best average fitness values, 1.0888 and 1.0887 respectively. However, GA-SA4 finds the best-fitness chromosome faster than GA-SA1.

Jatropha Curcas Disease Identification With Extreme Learning Machine

Triando Hamonangan Saragih, Diny Melsye Nurul Fajri, Wayan Firdaus Mahmudy, Abdul Latief Abadi +1 more

2018 · Indonesian Journal of Electrical Engineering and Computer Science · 12 citations · doi:10.11591/ijeecs.v12.i2.pp883-888

Jatropha is a plant that has many uses, but it can be attacked by various diseases. Expert systems can be applied to help both farmers and extension workers identify the disease. One method that can be used is the Extreme Learning Machine, a neural network learning method that uses a single iteration in each process. This study obtained a maximum accuracy of 66.67% and an average accuracy of 60.61%. This shows that identification using the Extreme Learning Machine is better than the comparison method used previously.

Using Peer Review to Support Development of Community Resources for Research Data Management

Heather Soyka, Amber E Budden, Vivian B. Hutchison, US Geological Survey +4 more

2017 · Journal of eScience Librarianship · 6 citations · doi:10.7191/jeslib.2017.1114

Objective: To ensure that resources designed to teach skills and best practices for scientific research data sharing and management are useful, the maintainers of those materials need to evaluate and update them to ensure their accuracy, currency, and quality. This paper advances the use and process of outside peer review for community resources in addressing ongoing accuracy, quality, and currency issues. It further describes the next step of moving the updated materials to an online collaborative community platform for future iterative review in order to build upon mechanisms for open science, ongoing iteration, participation, and transparent community engagement.

Setting: Research data management resources were developed in support of the DataONE (Data Observation Network for Earth) project, which has deployed a sustainable, long-term network to ensure the preservation and access to multi-scale, multi-discipline, and multi-national environmental and biological science data (Michener et al. 2012). Created by members of the Community Engagement and Education (CEE) Working Group in 2011-2012, the freely available Educational Modules included three complementary components (slides, handouts, and exercises) that were designed to be adaptable for use in classrooms as well as for research data management training.

Methods: Because the modules were initially created and launched in 2011-2012, the current members of the (renamed) Community Engagement and Outreach (CEO) Working Group were concerned that the materials could be or could quickly become outdated and should be reviewed for accuracy, currency, and quality. In November 2015, the Working Group developed an evaluation rubric for use by outside reviewers. Review criteria were developed based on surveys and usage scenarios from previous DataONE projects. Peer reviewers were selected from the DataONE community network for their expertise in the areas covered by one of the 11 educational modules. Reviewers were contacted in March 2016, and were asked to volunteer to complete their evaluations online within one month of the request, by using a customized Google form.

Results: For the 11 modules, 22 completed reviews were received by April 2016 from outside experts. Comments on all three components of each module (slides, handouts, and exercises) were compiled and evaluated by the postdoctoral fellow attached to the CEO Working Group. These reviews contributed to the full evaluation and revision by members of the Working Group of all educational modules in September 2016. This review process, as well as the potential lack of funding for ongoing maintenance by Working Group members or paid staff, provoked the group to transform the modules to a more stable, non-proprietary format, and move them to an online open repository hosting platform, GitHub. These decisions were made to foster sustainability, community engagement, version control, and transparency.

Conclusion: Outside peer review of the modules by experts in the field was beneficial for highlighting areas of weakness or overlap in the education modules. The modules were initially created in 2011-2012 by an earlier iteration of the Working Group, and updates were needed due to the constantly evolving practices in the field. Because the review process was lengthy (approximately one year) relative to the rate of innovation in data management practices, the Working Group discussed other options that would allow community members to make updates available more quickly. The intent of migrating the modules to an online collaborative platform (GitHub) is to allow for iterative updates and ongoing outside review, and to provide further transparency about accuracy, currency, and quality in the spirit of open science and collaboration. Documentation about this project may be useful for others trying to develop and maintain educational resources for engagement and outreach, particularly in communities and spaces where information changes quickly, and open platforms are already in common use.

A Discussion of Value Metrics for Data Repositories in Earth and Environmental Sciences

Cynthia Parr, Corinna Gries, Margaret O’Brien, Robert R. Downs +4 more

2019 · Data Science Journal · 6 citations · doi:10.5334/dsj-2019-058

Despite growing recognition of the importance of public data to the modern economy and to scientific progress, long-term investment in the repositories that manage and disseminate scientific data in easily accessible ways remains elusive. Repositories are asked to demonstrate that there is a net value of their data and services to justify continued funding or attract new funding sources. Here, representatives from a number of environmental and Earth science repositories evaluate approaches for assessing the costs and benefits of publishing scientific data in their repositories, identifying various metrics that repositories typically use to report on the impact and value of their data products and services, plus additional metrics that would be useful but are not typically measured. We rated each metric by (a) the difficulty of implementation by our specific repositories and (b) its importance for value determination. As managers of environmental data repositories, we find that some of the most easily obtainable data-use metrics (such as data downloads and page views) may be less indicative of value than metrics that relate to discoverability and broader use. Other intangible but equally important metrics (e.g., laws or regulations impacted, lives saved, new proposals generated) will require considerable additional research to describe and develop, plus resources to implement at scale. As value can only be determined from the point of view of a stakeholder, it is likely that multiple sets of metrics will be needed, tailored to specific stakeholder needs. Moreover, economically based analyses or the use of specialists in the field are expensive and can happen only as resources permit.

Data Citation: Let's Choose Adoption Over Perfection

Daniella Lowenberg, Rachael Lammey, Matthew B. Jones, John Chodacki +1 more

2021 · Zenodo (CERN European Organization for Nuclear Research) · 6 citations · doi:10.5281/zenodo.4701079

This perspective piece on the perceived barriers and ways forward to advance data citation practices was written by members of the Make Data Count team which is funded by the Alfred P. Sloan Foundation. For more information on our initiative, visit https://makedatacount.org.

Optimization of Dempster-Shafer’s Believe Value Using Genetic Algorithm for Identification of Plant Diseases Jatropha Curcas

Triando Hamonangan Saragih, Wayan Firdaus Mahmudy, Yusuf Priyo Anggodo

2018 · Indonesian Journal of Electrical Engineering and Computer Science · 5 citations · doi:10.11591/ijeecs.v12.i1.pp61-68

Jatropha curcas is a plant that can be used as a substitute for diesel fuel. Farmers' lack of knowledge and the limited number of experts and extension agents make it difficult to deal with Jatropha curcas diseases, which results in lower-quality Jatropha curcas. The Dempster-Shafer method can be a solution for decision making, based on previous research. Differences in the beliefs of individual experts about Jatropha diseases matter because Dempster-Shafer alone cannot resolve them. Optimization using genetic algorithms can solve this problem: optimizing the belief values with a genetic algorithm can improve the accuracy of a system that uses Dempster-Shafer. In the results, optimization of belief values using genetic algorithms gives the highest system accuracy, a more significant result than using Dempster-Shafer alone.

Exploiting The Digital Revolution: Developing Capacity And Integrating Data Across The Disciplines Of Science

CODATA, Participants Of The First ICSU-CODATA Workshop On Data Standards: Developing A Roadmap For Data Integration, Geoffrey Boulton, Simon Hodson +4 more

2018 · Zenodo (CERN European Organization for Nuclear Research) · 2 citations · doi:10.5281/zenodo.1193642

In June 2017, the International Council for Science (ICSU) and its Committee on Data for Science and Technology (CODATA) brought together international scientific unions and associations of ICSU and the International Social Science Council (ISSC) that have made major strides in this area of work, as well as other organisations that curate standards and vocabularies for particular disciplines. The objective of the meeting was to develop an action plan to realise the full potential of the data science, technologies, and infrastructures currently being created by specific disciplinary groups and expand those efforts on an inter- and trans-disciplinary basis. The meeting identified key opportunities of the digital revolution and how they can be achieved. Priorities for action include: the need for examples of the benefits that have already been realised by specific disciplinary groups and inter- and trans-disciplinary projects; the need to extend activities to disciplinary fields that have not yet developed strategies, for developing interoperable vocabularies, standards and models, and for the creation of effective “information communities”; there must be a major effort to achieve interoperability within and between disciplines, without this, the national and regional initiatives to create cloud or platform technologies designed to provide services to support data priorities will fall far short of their potential; international scientific unions and associations, and the international councils of which they are members, are uniquely qualified for this task, and their engagement is essential if its promise is to be realised; there is a need to develop a flagship programme on one or more major global challenge themes to develop, demonstrate and apply the methods of linking and integrating data from across the disciplines in the production and use of actionable knowledge. Such a programme will entail a long-term, decadal commitment. 
It will convene and support the scientific members of ICSU and ISSC, serve as a mechanism for their engagement with relevant international research initiatives, significantly strengthen their data capacities and relate to the priorities of research funding bodies such as the Belmont Forum. The immediate next step was a major ICSU-CODATA workshop in November 2017 to bring together the full range of scientific international unions and associations with organisations working on complex global problems to sharpen the design of the flagship project and create the international, multi-disciplinary data community needed to convert these opportunities into solutions.

Permafrost Discovery Gateway: A web platform to enable discovery and knowledge-generation of permafrost Big Imagery products

Anna Liljedahl, Benjamin Jones, Michael Brubaker, Amber E Budden +4 more

2019 · Helmholtz-Zentrum für Polar- und Meeresforschung (Alfred-Wegener-Institut) · 2 citations

Permafrost thaw has been observed at several locations across the pan-Arctic in recent decades, yet the pan-Arctic extent and potential spatial-temporal variations in thaw are poorly constrained. Thawing of ice-rich permafrost can be inferred and quantified with satellite imagery due to the subsequent differential ground subsidence and erosion that also affects land surface cover, storage and flow of water, sediment, and nutrients. However, a lack of supporting cyberinfrastructure necessary to harness information from the existing and rapidly growing collection of high-resolution satellite imagery (Big Imagery) has limited our advances in understanding the nature of pan-Arctic permafrost degradation. In the coming four years, we will empower the broader Arctic community with a cyberinfrastructure platform, the Permafrost Discovery Gateway (PDG), aimed at making Big Imagery permafrost information accessible and discoverable through novel visualization and analysis tools designed with input from users of the PDG, e.g. the diverse peoples living, working, and/or studying in the Arctic. From the start of the project, we will engage the user-community through in-person and online meetings to ensure effective development of permafrost Big Imagery products for archiving, processing, analyzing, and visualizing. The framework will utilize existing resources, such as (1) the NSF-supported data management resources Arctic Data Center and Clowder, (2) web application visualization tools (Fluid Earth Viewer, Google Earth, and Gapminder Foundation), (3) high performance computing resources (XSEDE, Google Earth Engine etc.), and (4) satellite imagery (Polar Geospatial Center, Landsat, Sentinel, and Planet). The PDG will include the management of ingesting remote sensing big data into machine and deep learning models. 
We welcome collaborations with national and international Native, industry, and academic organizations and individuals to ensure broad community engagement and dissemination. The PDG will enable diverse peoples to contribute to and have access to pan-Arctic permafrost knowledge, which can immediately inform the economy, security, and resilience of the Nation, the Arctic region, and the globe with respect to pan-Arctic change.

Evaluating the Effectiveness of Data Management Training: DataONE’s Survey Instrument

Chung‐Yi Hou, Heather Soyka, Vivian B. Hutchison, Isis Sema +2 more

2017 · International Journal of Digital Curation · 1 citation · doi:10.2218/ijdc.v12i2.508

Effective management is a key component for preparing data to be retained for future long term access, use, and reuse by a broader community. Developing the skills to plan and perform data management tasks is important for individuals and institutions. Teaching data literacy skills may also help to mitigate the impact of data deluge and other effects of being overexposed to and overwhelmed by data. The process of learning how to manage data effectively for the entire research data lifecycle can be complex. There are often multiple stages involved within a lifecycle for managing data, and each stage may require specific knowledge, expertise, and resources. Additionally, although a range of organizations offers data management education and training resources, it can often be difficult to assess how effective the resources are for educating users to meet their data management requirements. In the case of Data Observation Network for Earth (DataONE), DataONE’s extensive collaboration with individuals and organizations has informed the development of multiple educational resources. Through these interactions, DataONE understands that the process of creating and maintaining educational materials that remain responsive to community needs is reliant on careful evaluations. Therefore, the impetus for a comprehensive, customizable Education EVAluation instrument (EEVA) is grounded in the need for tools to assess and improve current and future training and educational resources for research data management. In this paper, the authors outline and provide context for the background and motivations that led to creating EEVA for evaluating the effectiveness of data management educational resources. The paper details the process and results of the current version of EEVA. Finally, the paper highlights the key features, potential uses, and the next steps in order to improve future extensions and revisions of EEVA.

Machine Comprehension-Incorporated Relevance Matching

Chen Zhang, Hao Wang, Liang Zhou, Yijun Wang +1 more

2019 · 1 citation · doi:10.1109/icdm.2019.00096

In current web search engines, the relevance between a query and web pages (i.e. documents) is measured by Text Matching (TM) models. The documents retrieved mainly focus on matching text queries themselves but fail to find target information towards the user intent (e.g. the direct answer to question-style queries). Thus the document containing target information that users want may not be ranked at the top 1 in the search result or even not be recalled. Besides, as voice search and voice-powered assistants are entering our life, queries tend to be long tail ones, which needs a search engine evolved into a higher level of semantic relevance matching. Therefore, it is necessary to build an intent-target relevance matching model in modern search scenarios. This paper proposes a unified model of Machine Comprehension-incorporated Relevance Matching (MCRM). Totally, MCRM models how web users choose the relevant documents to read by observing the titles or summaries, and further look for target information from them. To accomplish that, we first formulate two tasks as Text Matching and Target Extracting. For learning each task, a Context-augmented Matching network (ContMatch) and a Matching-fused machine Comprehension network (MatComprehend) are proposed. Then, they are integrated into an end-to-end framework that can not only measure the semantic relevance but also extract the intent-related target, by deeply comprehending the semantics hidden in queries and exploiting intent-target relations between queries and documents. In MCRM, the two tasks are jointly learned by a multi-task learning approach, where the semantic relevance measured by ContMatch, and the intent-target relevance captured by MatComprehend, are combined and enhanced mutually for boosting the final performance. We conduct extensive experiments on real-world data. The experimental results demonstrate the superiority of MCRM against the state-of-the-art relevance models.

Data Integration Initiative: Planning Document

Isc Codata Data Integration Initiative, Geoffrey Boulton, Simon Hodson, Heide Hackmann +4 more

2018 · Zenodo (CERN European Organization for Nuclear Research) · 1 citation · doi:10.5281/zenodo.1319525

During 2017, CODATA initiated and led a discussion with data science groups and international scientific unions and associations about the timeliness of a major initiative on interdisciplinary data integration. Meetings at the ICSU HQ in Paris in June 2017 and at the Royal Society of London in November 2017 produced a report and communiqué supporting a long-term initiative and outlining some of the essential issues to be addressed. The key priorities for this initiative are to address data integration in support of major global challenges and to develop relevant data capacities across all the disciplines of science. An ad hoc steering group was created to plan how these should be carried forward, comprising: CODATA: Geoffrey Boulton – President; Simon Hodson – Executive Director. ICSU: Heide Hackmann – Executive Director. Application Domain leaders: Laura Merson - Infectious Disease Outbreaks; Virginia Murray – Disaster Risk Reduction; Stephen Passmore – Resilient Cities. Data Scientists: Simon Cox – CSIRO; Lesley Wyborn – ANU; Bob Hanisch – NIST; Phil Archer – Consultant. Supporting the steering group in making contributions to the initiative are: Gisbert Glaser - ICSU; Katsia Paulavets - ICSU; Bill Michener – DataONE; Kevin Blanchard - PHE; John Broome – CODATA. The formal governance of the initiative is yet to be determined. This planning paper is an outcome of a meeting of the steering and supporting group on 19 January 2018. It is designed as a first scoping of the purpose, structure and roadmap for the initiative, and will shortly be made available to the community of practice represented by the attendees and invitees of the 2017 meetings. Its primary use is as a live document for planning purposes. It is not an early draft of a bid for support or funding, though it is likely to be a source text for such.

Making Data Count (DataONE Users Group Meeting, 13 July 2015)

Matt Jones, John Kratz, Amber E Budden

2015 · Zenodo (CERN European Organization for Nuclear Research) · doi:10.5281/zenodo.29900

The Data-Level Metrics presentation was given at the DataONE Users Group meeting on 13 July 2015. It reported on the progress of the NSF-funded Making Data Count project to design and develop metrics that track and measure data use, “data-level metrics” (DLM). DLM are a multidimensional suite of indicators, measuring the broad range of activity surrounding the reach and use of data as a research output. All project information can be found here: http://mdc.lagotto.io.

Sales automation: concepts, justification, planning, and implementation

Todd C. Scofield, Donald R. Shaw

1992

Computers have infiltrated and streamlined just about every business function except sales. Yet the potential of new technologies and software to stimulate sales from existing customers and future prospects has never been greater. Sales Automation provides managers with a rationale and a platform to bring automation into their organizations. Readers will learn how to: differentiate between the many systems and applications available; keep track of customers throughout the entire buying cycle and enhance customer relationships; use telecommunications and database management techniques; budget, plan and administer a low-risk programme; and introduce automation and train personnel to feel comfortable using it. Sales information is a powerful marketing asset. And yet, as the authors say, “There remains a very large, basically untapped, almost unexplored frontier for computing in the realm of sales.” Using computer power to its greatest competitive advantage is what Sales Automation is all about.
