    COAR Statement on AI and Open Science

    Artificial intelligence (AI) is proving highly disruptive to open science as it becomes both widely used in the analysis and production of scholarly literature and deeply embedded in the public information commons. On the one hand, researchers across all domains are harnessing the power of AI and machine learning to do things previously unimaginable – such as rapidly processing massive datasets or synthesising large corpora in many different languages – greatly accelerating scientific progress and leading to new discoveries. On the other hand, AI is seriously challenging some of the fundamental assumptions on which open science rests, putting the open science ecosystem at risk in ways that demand urgent attention.

    Open science and AI are not merely complementary; they are structurally interdependent. Open resources are the raw materials for training AI models and for their application, while well-functioning AI tools are increasingly critical for conducting groundbreaking research. Yet AI, as a major consumer of open science outputs, also brings with it attribution problems, the potential for information contamination, and aggressive automated traffic that strains the very infrastructure on which it depends. Left unaddressed, these pressures threaten to reverse much of what the open science movement has achieved.

    Large language models (LLMs), which are now starting to dominate most discovery systems, introduce an intrinsic attribution problem that breaks the chain of scholarly credit and verification. When you ask an LLM a question, it generates an answer by synthesizing patterns from the millions of documents it was trained on. But an LLM does not track which specific sources contributed to which parts of its response. Because of this, authors and institutions are increasingly pushing back on their work being used without consent or attribution for AI training, and this tension seems to be growing.

    LLMs do not retrieve facts; they generate statistically plausible text. This means they can produce convincing claims, references, and summaries that have no basis in reality but that can easily be mistaken for legitimate scholarship. This, along with a steady rise in paper mills augmented by AI-generated fake papers, has the potential to pollute the scholarly information space. And because LLMs are retrained continuously on new content, errors and fabrications can be absorbed, laundered, and amplified with each successive training cycle, making the problem self-reinforcing over time.

    Beyond the integrity concerns, repositories and other data providers are being inundated with aggressive bot traffic, placing serious strain on open science infrastructures. COAR’s community resources for “Dealing with Bots” represent a constructive first step in helping repositories navigate this challenge. However, there are currently no ideal solutions, and the problem is already producing a troubling response: a growing number of repositories are erecting barriers to machine access in order to protect their systems – blocking all machine access so that friendly indexing and AI systems can no longer access those resources.
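
    As an illustration of a measure that stops short of blocking all machine access, the following is a minimal sketch of per-client throttling with a token bucket. It is not taken from the COAR statement or from the “Dealing with Bots” resources; the class names, the `client_id` key, and the `rate` and `capacity` values are all assumptions chosen for the example.

```python
import time


class TokenBucket:
    """Budget for one client: up to `capacity` requests in a burst,
    refilled at `rate` tokens per second. (Hypothetical helper.)"""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity   # start with a full budget
        self.updated = None      # time of the previous request, if any

    def allow(self, now):
        # Refill according to the time elapsed since the last request.
        if self.updated is not None:
            elapsed = max(0.0, now - self.updated)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        # Spend one token if one is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class Throttle:
    """Keeps one bucket per client, so an aggressive crawler exhausts
    only its own budget and other clients are unaffected."""

    def __init__(self, rate=1.0, capacity=5.0):
        self.rate = rate          # assumed values; tune to real capacity
        self.capacity = capacity
        self.buckets = {}

    def allow(self, client_id, now=None):
        if now is None:
            now = time.monotonic()
        bucket = self.buckets.setdefault(
            client_id, TokenBucket(self.rate, self.capacity)
        )
        return bucket.allow(now)


if __name__ == "__main__":
    throttle = Throttle(rate=1.0, capacity=2.0)
    for i in range(4):
        print("crawler-a", i, throttle.allow("crawler-a", now=0.0))
    print("indexer-b", throttle.allow("indexer-b", now=0.0))
```

    In a sketch like this, a denied request could be answered with HTTP 429 (Too Many Requests) and a Retry-After header, so that a well-behaved client can slow down and return later instead of being shut out.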

    Open science was designed to make the scholarly record more trustworthy, accessible, and impactful. However, the pressures described above threaten to invert that mission, turning openness into a vulnerability rather than a strength. We must work together to develop solutions that maintain the integrity of the scholarly commons, while also remaining as open as possible to ensure that research continues to drive scientific advancements. The open science ecosystem was hard-won and will not sustain itself without effort. As such, COAR is committed to working with our community and other stakeholders on the steps needed to advance the vision of open science. As a first step, we recommend the repository community take the following concrete actions:

    • Remain open. Do not block access to well-behaved machines. Researchers (and others) need access to content in repositories and rely on machines to access that content. Adopt only the measures needed to maintain full operations.

    • Improve trust markers. Undertake metadata curation, adopt PIDs, link records with related content elsewhere, and encourage your communities to participate in open peer review initiatives that link open peer reviews to repository resources (e.g. publish, review, curate).

    • Keep humans in the loop. Validate the items being deposited into the repository to ensure they are legitimate contributions.

    • Engage with research communities. Inform the research community about the negative impacts of not sharing their work.

    • Contribute content to open source AI models. Participate in the development of AI systems that are transparent, scholar-led, and that provide clear attribution to original source materials.

    • Develop community norms. Work with partners (scholarly-led or industry partners) to develop the appropriate governance, norms, and practices that ensure the integrity of the scholarly record.
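
    The recommendations on trust markers and human validation could be supported by a simple pre-screen at deposit time. The sketch below is only an illustration under stated assumptions: the field names (`pid`, `title`, `creators`, `related_links`) and the record layout are hypothetical and do not come from COAR or from any particular repository platform. It flags incomplete records for a curator; it does not accept or reject anything on its own.

```python
# Hypothetical set of trust markers a repository might expect on a deposit.
EXPECTED_MARKERS = ("pid", "title", "creators", "related_links")


def is_present(value):
    """Treat None, empty collections, and blank strings as missing."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) > 0
    return True


def missing_trust_markers(record, expected=EXPECTED_MARKERS):
    """Return the names of expected markers that the record lacks."""
    return [name for name in expected if not is_present(record.get(name))]


def review_note(record):
    """Build a short note for the human curator who validates the deposit."""
    missing = missing_trust_markers(record)
    if missing:
        return "Needs curation, missing: " + ", ".join(missing)
    return "Markers complete, ready for human validation"


if __name__ == "__main__":
    deposit = {
        "title": "Example preprint",   # illustrative data only
        "creators": ["A. Researcher"],
        "pid": "",
    }
    print(review_note(deposit))
```

    Either outcome still ends with a person looking at the item, which keeps the check in a supporting role rather than replacing curatorial judgement.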


    Source: Navigating the Uneasy Interdependence of AI and Open Science

    Read the full original article here: https://coar-repositories.org/news-updates/navigating-the-uneasy-interdependence-of-ai-and-open-science/


    EIVX (Enlighten Innovation and Vision with X)

    EIVX is an international academic open access publisher. We are dedicated to building a high-quality platform for sharing research outcomes, promoting knowledge dissemination and innovation, and supporting the integration of industry, academia, and research to empower the transformation of scientific achievements.
