FLOSS Research

Open Source AI Definition – Weekly update Mar 18

Open Source Initiative - Mon, 2024-03-18 12:58
Comments on draft 0.0.6 from the forum
  • A participant pointed out that training data is listed both as optional and as a precondition. This may cause confusion, as it is unclear whether we should have the right to access the training data or only the right to know what training data was used for the model.
  • To contribute, read the new draft here 
Moving on to next steps!
Town hall this Friday
Comments on definitions under “What is open source AI?”
Still strong debate about access to training data
  • There is a fear that this will harm the ecosystem in the long run, as the original work behind the model can never be “forked” to improve the model itself.
Categories: FLOSS Research

ClearlyDefined at the ORT Community Days

Open Source Initiative - Wed, 2024-03-13 10:45

Once again, Bosch’s campus in Berlin hosted ORT Community Days, the annual event organized by the OSS Review Toolkit (ORT) community. ORT is an Open Source suite of tools to automate software compliance checks.

During this two-day event, members from startups like Double Open and NexB, as well as large corporations like Mercedes-Benz, Volkswagen, CARIAD, Porsche, Here Technologies, EPAM, Deloitte, Sony, Zeiss, Fraunhofer, and Roche, came together to discuss best practices around software supply chain compliance.

The ClearlyDefined community had an important presence at the event, represented by E. Lynette Rayle and Lukas Spieß from GitHub and Qing Tomlinson from SAP. I had the pleasure to represent the Open Source Initiative as the community manager for ClearlyDefined. The mission of ClearlyDefined is to crowdsource a global database of licensing metadata for every software component ever published. We see the ORT community as an important partner towards achieving this mission.

Relevant talks

There were several interesting talks at ORT Community Days. These are the ones I found most relevant to ClearlyDefined:

Philippe Ombredanne presented ScanCode, a project of great importance to ClearlyDefined, as we use this tool to detect licenses, copyrights, and dependencies. Philippe gave an overview of the project and its challenges. For ClearlyDefined, we would like to see better accuracy and performance improvements. 

Sebastian Schuberth presented the Double Open Server (DOS) companion for ORT. DOS is a server application that scans the source code of open source components, stores the scan results for use in license compliance pipelines, and provides a graphical interface for manually curating the license findings. I believe there’s an opportunity to integrate DOS with ClearlyDefined by providing access to our APIs to fetch licensing metadata and allowing the sharing of curations.

Marcel Kurzmann and Martin Nonnenmacher presented Eclipse Apoapsis, another ORT server that makes use of its integration APIs for dependency analysis, license scanning, vulnerability databases, rule engine, and report generation. Again, I feel we could also integrate Eclipse Apoapsis with ClearlyDefined the same way as with DOS.

Till Jaeger gave an excellent talk about curation of ORT output from the perspective of FOSS license compliance. He highlighted the Cyber Resilience Act (CRA), which brings legal provisions for SBOMs and will likely increase the need for tools like ORT. Till shared the many challenges in the curation process, particularly the compatibility issues arising from dual licensing, and went on to showcase the OSADL compatibility matrix.

Presenting ClearlyDefined

I had the privilege of presenting ClearlyDefined together with E. Lynette Rayle from GitHub and we got some really good feedback and questions from the audience.

With the move towards SBOMs everywhere for compliance and security reasons, organizations will face great challenges in generating these at scale for each stage of the supply chain, for every build or release. Additionally, multiple organizations will have to curate the same missing or wrongly identified licensing metadata over and over again.

ClearlyDefined is well suited to solve these problems by serving a cached copy of licensing metadata for each component through a simple API. Organizations will also be able to contribute back with any missing or wrongly identified licensing metadata, helping to create a database that is accurate for the benefit of all.

GitHub is well aware of these challenges and is interested in helping its users in this regard. They recently added 17.5 million package licenses sourced from ClearlyDefined to their database, expanding the license coverage for packages that appear in dependency graph, dependency insights, dependency review, and a repository’s software bill of materials (SBOM).

To make use of ClearlyDefined’s data, a user can simply make a call to its API service. For example, to fetch licensing metadata from the lodash library on NPM at version 4.17.21, one would call:

curl -X GET "https://api.clearlydefined.io/definitions/npm/npmjs/-/lodash/4.17.21" -H "accept: */*"
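
The same request can be sketched in Python. This is a minimal illustration rather than an official client: the helper names `definition_url` and `fetch_definition` are hypothetical, and the five-part coordinate layout (type, provider, namespace, name, revision) is simply read off the URL above, with `-` standing in for an empty namespace.

```python
import json
import urllib.request

API_ROOT = "https://api.clearlydefined.io"  # base URL taken from the curl example


def definition_url(type_, provider, namespace, name, revision):
    """Build the definitions URL for one component (hypothetical helper)."""
    # "-" is used in the path when a component has no namespace, as in the lodash example.
    parts = [type_, provider, namespace or "-", name, revision]
    return f"{API_ROOT}/definitions/" + "/".join(parts)


def fetch_definition(url, timeout=30):
    """Perform the GET request and decode the JSON body (needs network access)."""
    request = urllib.request.Request(url, headers={"accept": "*/*"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


url = definition_url("npm", "npmjs", None, "lodash", "4.17.21")
print(url)
```

Calling `fetch_definition(url)` would then return the definition as a Python dictionary; the call is left out of the sketch so that it runs without network access.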

This API call would be processed by the service for ClearlyDefined, as illustrated in the diagram below. If there’s a match in the definition store, then that definition would be sent back to the user. Otherwise, this request would trigger the crawler for ClearlyDefined (part of the harvesting process), which would download the lodash library from NPM, scan the library, and write the results to the raw results store. The service for ClearlyDefined would then read the raw results, summarize them, and create a definition to be written to the definition store. Finally, the definition would be served to the user.
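
The lookup-or-harvest flow described above can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: the two stores are plain dictionaries, and `fake_crawl`, `fake_summarize`, and the shape of their data are stand-ins invented for the example, not the actual ClearlyDefined implementation.

```python
def get_definition(coordinates, definition_store, raw_results_store, crawl, summarize):
    """Return a definition, harvesting the component first when none is stored."""
    definition = definition_store.get(coordinates)
    if definition is not None:
        return definition  # match in the definition store

    # No match: the crawler downloads and scans the component,
    # and the scan output is written to the raw results store.
    raw_results_store[coordinates] = crawl(coordinates)

    # The service reads the raw results, summarizes them, and stores the definition.
    definition = summarize(raw_results_store[coordinates])
    definition_store[coordinates] = definition
    return definition


# Stand-in crawler and summarizer with made-up data, for illustration only.
calls = []


def fake_crawl(coordinates):
    calls.append(coordinates)
    return {"scan": {"license": "MIT"}}


def fake_summarize(raw):
    return {"licensed": {"declared": raw["scan"]["license"]}}


definitions, raw_results = {}, {}
coords = "npm/npmjs/-/lodash/4.17.21"
first = get_definition(coords, definitions, raw_results, fake_crawl, fake_summarize)
second = get_definition(coords, definitions, raw_results, fake_crawl, fake_summarize)
print(first, len(calls))
```

The second call is answered from the definition store, so the crawler runs only once for a given component.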

The curation process is done through another API call via PATCHes. For example, the below PATCH updates a declared license to Apache-2.0:

"contributionInfo": {
      "summary": "[Test] Update declared license",
      "details": "The declared license should be Apache as per the LICENSE file.",
      "resolution": "Updated declared license to Apache-2.0.",
      "type":"incorrect",
      "removeDefinitions":false
  },
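
The fragment above shows only the `contributionInfo` part of the request body. A complete body also has to say which component and revision are being curated and what the new value is. The sketch below assumes this is carried in a `patches` list with `coordinates` and `revisions` entries. That layout is an assumption about the curation API, not something shown in this post, and the component `example-package` at revision `1.0.0` is hypothetical. Check the ClearlyDefined API documentation before relying on it.

```python
import json


def build_curation_patch(coordinates, revision, declared_license, contribution_info):
    """Assemble a curation body; the "patches" layout is an assumption, not from the post."""
    return {
        "contributionInfo": contribution_info,
        "patches": [
            {
                "coordinates": coordinates,
                "revisions": {revision: {"licensed": {"declared": declared_license}}},
            }
        ],
    }


# Values copied from the fragment above.
contribution_info = {
    "summary": "[Test] Update declared license",
    "details": "The declared license should be Apache as per the LICENSE file.",
    "resolution": "Updated declared license to Apache-2.0.",
    "type": "incorrect",
    "removeDefinitions": False,
}
# Hypothetical component, used only to make the example concrete.
coordinates = {"type": "npm", "provider": "npmjs", "namespace": None, "name": "example-package"}

body = build_curation_patch(coordinates, "1.0.0", "Apache-2.0", contribution_info)
print(json.dumps(body, indent=2))
```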

This curation is handled by the service for ClearlyDefined, as illustrated in the diagram below. The curation would trigger the creation of a PR in ClearlyDefined’s curated-data repository, which would be reviewed and signed off by two curators. The PR would then be merged and written to the curated-data store.

GitHub has deployed its own local Harvester for ClearlyDefined, as illustrated in the diagram below. GitHub’s OSPO Policy Service posts requests to GitHub’s Harvester for ClearlyDefined, which downloads any components and dependencies from various package managers, scans these components, and writes the results directly to ClearlyDefined’s raw results store. GitHub’s OSPO Policy Service fetches definitions from the service for ClearlyDefined as well as licenses and attributions from GitHub’s Package License Gateway. GitHub maintains a local cache store which is synced with any updates from ClearlyDefined’s changes-notifications blob storage.

ClearlyDefined’s development has seen increased participation from various organizations this past year, including GitHub, SAP, Microsoft, Bloomberg, and CodeThink.

Currently, maintainers of ClearlyDefined are focused on ongoing maintenance. Key goals for ClearlyDefined in 2024 include:

  • Publishing periodic releases and switching to semantic versioning
  • Bringing dependencies up to date (in particular using the latest scancode)
  • Improving the NOASSERTION/OTHER issue
  • Advancing usability and the curation process through the UI 
  • Enhancing the documentation and process for creating a local harvest

Our slides are available here.

Relevant breakout sessions

ORT Community Days provided several breakout sessions to allow participants to discuss pain points and solutions.

A special discussion around curations was led by Sebastian Schuberth and E. Lynette Rayle. The ORT Package Curation Data can be broken down into two categories: metadata interpretations and legal curations. The group discussed their thoughts about the curation process and its challenges, including handling false positives and the sharing of curations.

Nowadays, no conference would be complete without at least one talk or discussion about Artificial Intelligence. A group gathered to discuss the potential use of AI to improve user experience as well as for OSS compliance. The majority of attendees believed that ORT’s documentation could be improved through the use of AI and that an assistant would be helpful for answering the most common questions. As for the use of AI for OSS compliance, there’s a lot of potential here, and one idea would be to use ClearlyDefined’s curation dataset to fine-tune an LLM.

Conclusion

The second edition of ORT Community Days represented a unique opportunity for the ClearlyDefined community to better engage with the ORT community. We were able to meet the maintainers and members of ORT and learn from them about the current and future challenges. We were also able to explore how our communities can further collaborate. 

On behalf of the ClearlyDefined community, I would like to thank the organizers of this wonderful event: Marcel Kurzmann, Nikola Babadzhanov, Surya Santhi, and Thomas Steenbergen. I would also like to thank E. Lynette Rayle, Lukas Spieß, and Qing Tomlinson from the ClearlyDefined community, who accepted my invitation to participate in this conference.

If you are interested in Open Source supply chain compliance and security, I invite you to learn a bit more about the ClearlyDefined and the ORT communities. You might also be interested in my report from FOSS Backstage.

Categories: FLOSS Research

Three perspectives from FOSS Backstage

Open Source Initiative - Wed, 2024-03-13 10:45

As a community manager, I find FOSS Backstage to be one of my favorite conferences content-wise and community-wise. This is a conference that happens every year in Berlin, usually in early March. It’s a great opportunity to meet community leaders from Europe and across the world with the goal of fostering discussions around three complementary perspectives: a) community health and growth, b) project governance and sustainability, and c) supply chain compliance and security.

Community health and growth

While there were several interesting talks, one of the highlights of the “Community health and growth” track was Tom “spot” Callaway’s talk, “Embracing your weird: community building through fun & play.” Tom shared some really interesting ideas to help members bond together: a badge program, a candy swap activity, a coin giveaway, a scavenger hunt, and a karaoke session.

FOSS Backstage this year was special because I finally got to meet three members of the ClearlyDefined community who have given new life to this project: E. Lynette Rayle and Lukas Spieß from GitHub and Qing Tomlinson from SAP. While we did not go on a scavenger hunt or hold a karaoke session (that would have been fun), we spent most of our time during the week having lunch and dinner together, watching talk sessions together, networking with old and new acquaintances, and even doing some sightseeing in Berlin. This allowed us not only to share ideas about the future of ClearlyDefined but, most importantly, to have fun together and create a strong bond between us.

Please find below a list of interesting talks from this track:

Project governance and sustainability

At last year’s FOSS Backstage, I had the opportunity to meet Thomas Steenbergen for the first time. He’s a co-founder of the ClearlyDefined and OSS Review Toolkit (ORT) communities. Project governance and sustainability is something Thomas deeply cares about, and I was honored to be invited to give a talk together with him at this year’s conference.

Our talk was about aligning wishes of multiple organizations into an Open Source project. This is a challenge that many projects face: oftentimes they struggle to align wishes and get commitment from multiple organizations towards a shared roadmap. There’s also the challenge of the “free rider” problem, where the overuse of a common resource without giving back often leads to the tragedy of the commons. Thomas shared the idea of a collaboration marketplace and a contributor commitment agreement where organizations come together to identify, commit, and implement a common enhancement proposal. This is a strategy that we are applying to ORT and ClearlyDefined.

Our slides are available here.

Please find below a list of interesting talks from this track:

Supply chain compliance and security

Under the “supply chain compliance and security” track, I was happy to watch a wonderful talk from my friend Ana Jimenez Santamaria entitled “Looking at Open Source security from a community angle.” She has been leading the TODO Group at the Linux Foundation for quite a few years now, and it was interesting to learn how they are helping OSPOs (Open Source Program Offices) to create a trusted software supply chain. Ana highlighted three takeaways:

  • OSPOs integrate Open Source in an organization’s IT infrastructure.
  • Collaboration between employees, Open Source staff, and security teams with the Open Source ecosystem offers a complete security coverage across the whole supply chain.
  • OSPOs have the important mission of achieving digitalization, innovation and security in a healthy and continuous way.

Please find below a list of interesting talks from this track:

Bonus: Open Source AI

Nowadays, no conference would be complete without at least one talk about Artificial Intelligence, so Frank Karlitschek’s keynote, “What the AI revolution means for Open Source and our society,” was very welcome! Frank demonstrated that Open Source AI can indeed compete with proprietary solutions from the big players. He presented Nextcloud Assistant, which runs locally and can be studied and modified. This assistant offers several exciting features: face recognition in photos, text translation, text summarization, text generation, image generation, speech transcription, and document classification, all while preserving privacy.

It’s worth pointing out that the Open Source Initiative is driving a multi-stakeholder process to define an “Open Source AI” and everyone is welcome to be part of the conversation.

Conclusion

I had a wonderful time at FOSS Backstage and I invite everyone interested in community, governance, and supply chain to join this amazing event next year. I would like to thank the organizers who work “backstage” to put together this conference. Thank you Paul Berschick, Sven Spiller, Alexander Brateanu, Isabel Drost-Fromm, Anne Sophie Riege, and Stefan Rudnitzki. A special thanks also to the volunteers, speakers, sponsors, and last but not least to all attendees who made this event special.

If you are interested in Open Source supply chain compliance and security, I invite you to learn a bit more about the ClearlyDefined and the ORT communities. Be sure to check out my report from the ORT Community Days.

Categories: FLOSS Research

Open Source AI Definition – weekly update Mar 11

Open Source Initiative - Mon, 2024-03-11 02:00

Big week, marked by the release of draft 0.0.6! The document is available for live comments and more general comments on the forum.

Changes in the section “What is open source AI”

  • added, “Precondition to exercise these freedoms is to have access to the preferred form to make modifications to the system.” With an example of what this looks like for a machine learning system
  • checklist to evaluate legal documents: the component details are presented. They reflect the results of the working groups
  • Change in wording from “license” to “legal document”

The preamble is left untouched; more discussion seems to be needed.

We held our fifth town hall meeting this past Friday, the 8th of March.

Click here to access the recording

Why were these 4 systems picked and not others? Will more AI systems be analyzed?
  • (Participant question) Out of the 4 working groups, 2 refer to models under proprietary licenses, and 3 refer to LLMs. The groups do not reflect what will (likely) be in the definition; therefore, it is a waste of time for OSI to consider them when crafting a definition.
  • (OSI response) It is important to have a diversified analysis, as at this stage, we are considering how the models operate rather than their license. The objective of the working groups was to identify the required components to exercise the 4 freedoms, and we found them to be quite similar. 

Next steps: Analyze a combination of each of the systems, which systems have these components, and find and review their accompanying legal documents. Follow this thread on the forum if you want to help.

How will the Open Source AI Definition and the “classic” OSD interact?
  • An interesting question was raised for which there should be an answer once the Open Source AI Definition gets closer to being feature complete.
Categories: FLOSS Research

A candid conversation on The Changelog Podcast about defining Open Source AI, and what is really at stake

Open Source Initiative - Tue, 2024-03-05 01:00

I was recently invited to join hosts Adam Stacoviak and Jerod Santo on The Changelog podcast. The Changelog features deep technical reviews and conversations about the most recent news in the world of software, and this was the first time anyone from the OSI has appeared on the show. 

After introducing the Open Source Initiative, we discussed the challenges of not only defending the Definition itself, but the idea that we need a Definition at all. And I was able to explain the complicated nature of being a global nonprofit organization defending the Open Source Definition for over 25 years.

I outlined the three programs that comprise the work of the OSI (legal and licensing, policy and standards, and advocacy and outreach), at which point we dove right into the project that falls under the last of these programs: the Open Source AI Definition.

Open Source AI is not the same as Open Source software. This reality led to the Deep Dive: AI project, now in year 3, in which OSI is collaborating with some of the largest corporations, researchers, creators, foundations and others. 

The Changelog hosts asked a lot of great questions and we had a candid and productive conversation. I hope you’ll follow the link to listen to the full episode: Changelog Interviews: What exactly is Open Source AI?

As I shared with Adam and Jerod, I’m hosting bi-weekly discussions on the status of the project and we’ve put together a forum for public input, so if you are interested in learning more about this or contributing, you are welcome to join us at discuss.opensource.org.

Categories: FLOSS Research

Open Source AI Definition – weekly update Mar 4

Open Source Initiative - Mon, 2024-03-04 05:00

A weekly summary of interesting threads on the forum.

The results from the working groups are in

The groups that analyzed OpenCV and Bloom have completed their work and the results of the votes have been published.

We now have a full overview of the result of the four (Llama2, Pythia, Bloom, OpenCV) working groups and the recommendations that they have produced.

Access our spreadsheet to see the complete overview of the compiled votes. This is a major milestone of the co-design process.

Discussion on access to training data continues

This conversation continues with a new question: What does openness look like when original datasets are not accessible for privacy-preservation reasons?

Is the definition of “AI system” by the OECD too broad?

Central question: How can an “AI system” be precisely defined to avoid loopholes and ensure comprehensive coverage under open-source criteria?

“AI system” might create loopholes in open-source licensing, potentially allowing publishers to avoid certain criteria. 

Still, defining “AI system” is useful for clarifying what constitutes an open-source AI. It is needed to outline the necessary components, such as sharing training code and model parameters, while acknowledging the need for further work on aspects such as model architecture.

In case you missed it, our fourth town hall meeting was held on February 23, 2024. Access the recording here and the slides here.

A new town hall meeting is scheduled for this week.

Categories: FLOSS Research

NTIA engages civil society on questions of open foundation models for AI, hears benefits of openness in the public interest

Open Source Initiative - Wed, 2024-02-28 04:52

The recent US Executive Order on AI directs action for numerous federal agencies. This includes directing the National Telecommunications and Information Administration (NTIA*) to discuss benefits, risks and policy choices associated with dual-use foundation models, which are powerful models that can be fine-tuned and used for multiple purposes, with widely available model weights.

The NTIA process is centered on a Request for Comment soliciting public feedback about how making model weights and other model components widely available creates benefits or risks to the broader economy, communities, individuals, and to national security.

NTIA also initiated a series of listening sessions last December. Owing to OSI’s critical effort in the Defining Open Source AI project, we are grateful to have been included in their most recent listening session, organized by the Center for Democracy & Technology (CDT) for civil society organizations. We joined other non-profits working in the public interest to share comments, concerns and encouragement in a generous two-hour session with NTIA staff.

The core of the discussions was centered around open versus closed models. Several organizations brought historical perspectives going back to battles over Open Source in the 90s. A short list of key takeaways from organizations weighing in during the session:

  • Open models represent marginal risk. More research is needed to understand where unacceptable risks lie beyond generating negative scenarios – for both open and closed models.
  • Encouragement to not regulate the emerging technology itself, rather focus on addressing bad actors and bad behavior.
  • Understand the benefits to research in open models, and in particular to provide transparency and accountability to privacy, security and bias concerns.
  • Consider equitable access to economic benefits by keeping models open as well as an established factor in innovation.
  • Completion of the OSI’s Defining Open Source AI and clarifying terms would greatly assist policy discussions.

NTIA staff expressed an interest in understanding what lessons we might draw from the Open Source software community’s experience with the federal government over the years. (OSI expects to speak to this in their formal response to NTIA’s Request for Comment).

OSI ED Stefano Maffulli provided OSI’s perspective in his comments at the meeting:

The Open Source Initiative is a 501(c)(3) nonprofit organization that is driving a global, multistakeholder discussion to find an unequivocal definition of Open Source AI. We’ve been maintaining the Definition of Open Source software for over 25 years, providing a stable north star for all participants in the Open Source ecosystem, including US federal agencies. 

The Department of Defense, Department of Commerce, Office of Management and Budget, Center for Medicaid/Medicare Services and others are examples of agencies which have relied on the standard Open Source Definition maintained by OSI in crafting their IT policies. 

The Open Source Definition has demonstrated that massive social benefits accrue when you remove the barriers to learning, using, sharing and improving software systems. There is ample evidence that giving users agency, control and self-sovereignty of their technical choices produces an ecosystem based on permissionless innovation. Recent research estimates that if Open Source software didn’t exist, firms would have to spend the equivalent of 8.8 trillion dollars to replace it. This is all based on the clear definition of Open Source software and the list of approved licenses that the Open Source Initiative maintains.

The same kind of unambiguous definition of terms is also needed and deserved in the domain of AI. We’re aware of various uses of the term ‘Open Source’ referring to AI systems and machine learning models whose terms of service have a wide range of obligations and restrictions. 

We found AI systems available publicly with full implementation details, code and data distributed without any obligations as well as other systems only available with limited implementation details, no data, very limited amount of description of the data used to train the model… all generally referred to as “Open Source.”

It’s worth noting that Open Source licenses are a way to flip the intellectual property system: the approved licenses grant rights to users instead of removing them. When thinking about the terms of distribution for model weights, which are basically facts, we should aim to remove the intellectual property regime to begin with.

We’re very concerned about the “economic upside capture” licensing terms we’ve seen in popular models like Llama2, for example. These terms of use are designed to create a network that favors only one economic actor (like the original distributor).

Uncertainties break the innovation cycles. This lack of clarity of terms doesn’t help consumers, scientists, developers or regulators. We’re on target to deliver a usable definition of Open Source AI by the end of October 2024. The definition work is focusing on identifying the preferred form to make modifications to an AI system: the equivalent of “source code” for software programs. This preferred form will be the basis to grant users the same level of self-sovereignty over the AI technologies.

* The NTIA, located within the US Department of Commerce, is the Executive Branch agency that is principally responsible by law for advising the President on telecommunications and information policy issues.

Coming up next: What might we draw from Open Source software’s experience with the federal government?

Categories: FLOSS Research

New risk assessment framework offers clarity for open AI models

Open Source Initiative - Tue, 2024-02-27 12:45

There is a debate within the AI community around the risks of widely releasing foundation models with their weights and the societal impact of that decision. Some are arguing that the wide availability of Llama2 or Stable Diffusion XL is a net negative for society. A position paper released today shows that there is insufficient evidence to effectively characterize the marginal risk of these models relative to other technologies.

The paper was authored by Sayash Kapoor of Princeton University, Rishi Bommasani of Stanford University, myself, and others. It is directed at AI developers, researchers investigating the risks of AI, competition regulators, and policymakers who are challenged with how to govern open foundation models.

This paper introduces a risk assessment framework to be used with open models. This resource helps explain why the marginal risk is low in some cases where we already have evidence from past waves of digital technology. It reveals that past work has focused on different subsets of the framework with different assumptions, serving to clarify disagreements about misuse risks. By outlining the necessary components of a complete analysis of the misuse risk of open foundation models, it lays out a path to a more constructive debate moving forward.

I hope this work will support a constructive debate where risks of AI are grounded in science and today’s reality, rather than hypothetical, future scenarios. This paper offers a position that balances the case against open foundation models with substantiated analysis and a useful framework on which to build. Please read the paper and leave your comments on Mastodon or LinkedIn.

Categories: FLOSS Research

Modernized, streamlined, and fediverse-friendly: OpenSource.org is fully migrated and ready to connect!

Open Source Initiative - Tue, 2024-02-27 03:00

Two years ago, we started migrating our website from Drupal to WordPress. We knew it wasn’t going to be a quick weekend project, but more of a journey. Today, we celebrate the final leg of this journey – merging our blog back into the main site, creating a unified online experience for our community.

Let’s rewind to 2022. Our Drupal site, while trusty, was starting to show its age. It lacked modern features and was self-hosted, which was taking a huge toll on our team. We knew a change was necessary, but a complete overhaul would have taken too long. So, we decided to move in steps: blog first, main site later.

We first migrated our blog content to a brand new, WordPress-powered platform in early 2023. This gave us a taste of the agility and flexibility WordPress offered. We loved the intuitive interface, the vast plugin ecosystem, and the worry-free managed WordPress provided by DreamHost.

Emboldened by this success, we set our sights on the bigger challenge: migrating the entire website. This wasn’t just about moving content; it was about restructuring, modernizing, and enhancing. We meticulously migrated web pages, ensuring that as few URLs as possible broke during the transition.

But migration wasn’t just about moving pixels and text. We took this opportunity to modernize our licenses pages. We added missing metadata and made them easily accessible to our users with a dedicated search engine. We also created a Custom Post Type for directors and forms to improve how we handle the nominations for the board elections.

Closing the loop with the blog

Now, here we are, at the final stage of our migration journey: merging the blog back into the main site. This completes the circle, creating a unified online experience where our blog seamlessly integrates with the rest of our content – licenses, events, elections, blog and more.

But the most exciting part? We’ve embraced the power of the fediverse! Comments on our blog posts can now be posted and shared across different platforms, fostering a lively and open discussion space. This integration with ActivityPub opens up our content to a wider audience and encourages a more vibrant online community.

Looking back, our Drupal-to-WordPress migration was an odyssey filled with technical hurdles, strategic decisions, and moments of pure satisfaction. We learned, we created a single-sign-on mechanism for OSI members that works on other sites (OpenSource.net and the forum, to start) and ultimately, we emerged with a website that is modern, functional, and better serves our mission. 

Next steps for opensource.org

Our next project will be a content cleanup and expansion. We will soon start combing through years of content, removing outdated information and streamlining what remains. This decluttering will make space for new content that makes the website more useful, letting visitors learn what Open Source is and how it can help them. We’ll also add more features for OSI members based on the new forum. Explore the new blog, engage with our content, and join the conversation on the fediverse! And if you’re considering a website migration yourself, take heart from our story. With careful planning, the right tools, and the wonderful help of Automattic and the Pressable team, even the most complex migration can be a successful and rewarding journey.

Categories: FLOSS Research

Open Source AI Definition – weekly update Feb 23

Open Source Initiative - Fri, 2024-02-23 05:00

A weekly summary of interesting threads on the forum.

Is the definition of “AI system” by the OECD too broad?

Central question: Do we need to define what AI systems are?

Training data access

Central question: for a model to be open source, do we need “open” access to its training data?

Recognising Open Source “Components” of an AI System

Central question: Should the definition of Open Source AI take a gradient approach (as is the case with the RAIL licence), judging the openness of the components of a model rather than the whole of it? How do we keep the definition from becoming too restrictive?

It is worth highlighting that it is the intention of OSI to have a definition which is:

Also worth noting
  • Results from Pythia and Llama2 working groups are out!
  • Watch the recordings of the fourth town hall meeting on Defining Open Source AI and the accompanying slides.
Categories: FLOSS Research

A comparative view of AI definitions as we move toward standardization

Open Source Initiative - Fri, 2024-02-09 05:54

Discussions of Artificial Intelligence (AI) regulation will be heating up in 2024 with a provisional agreement for the EU AI Act having been reached in December 2023. The evolution of the EU AI Act is progressing toward a technology-neutral definition for AI to be applied to future AI systems. In the coming months, multiple states will agree on precise legal definitions, which reflect moral considerations of the role that AI will and will not be allowed to play in Europe for the very first time. And formally defining AI is an ongoing debate. 

Precise definitions within a rapidly expanding field are perhaps not the first things that come to mind when asked about pressing issues concerning AI. However, as its influence grows, arriving at one seems essential when considering how to regulate it. Agreeing on what AI is–and what it is not–on a transnational level, is proving to be increasingly important. Online spaces rarely respect sovereignty, and the role of AI in public life is expected to increase rapidly. 

Different countries and organizations have different definitions, though the AI Act is expected to provide some standardization, not only within the EU but also outside of it due to its influence. Beyond providing a framework for businesses to operate within, it also shows what legislators anticipate about how and where AI will act and what it will develop into. Let’s consider how different organizations and states currently are defining AI systems.

OECD

So far, the AI Act’s definition of AI systems is expected to follow the OECD’s current definition. This currently seems to be the most influential definition and it reads as follows:

An AI system is a machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments. Different AI systems vary in their levels of autonomy and adaptiveness after deployment.

Notably, the OECD’s definition has undergone changes from its first draft to the current one above. The removal of “human-based inputs” and the addition of “decisions” when referring to outputs reflects a potential for vastly limiting human-centred decisions and actions. While acknowledging that different systems vary in their autonomy, this change opens up the potential for full autonomy. This can be controversial, to say the least, and can be expected to feed into the growing concerns of AI alignment. As we await the EU AI Act, if they indeed adopt the same or even a similar definition, it will be interesting to see their definition of personhood, considering the removal of “human-based” under inputs.

ISO

The International Organization for Standardization has defined AI systems as follows:

AI

<engineered system> set of methods or automated entities that together build, optimize and apply a model (3.1.26) so that the system can, for a given set of predefined tasks (3.1.37), compute predictions (3.2.12), recommendations, or decisions

Note 1 to entry: AI systems are designed to operate with varying levels of automation (3.1.7).

Note 2 to entry: Predictions (3.2.12) can refer to various kinds of data analysis or production (including translating text, creating synthetic images or diagnosing a previous power failure). It does not imply anteriority.

<discipline> study of theories, mechanisms, developments and applications related to artificial intelligence <engineered system> (3.1.2)

AI System:

engineered system featuring AI <engineered system> (3.1.2)

Note 1 to entry: AI systems can be designed to generate outputs such as predictions (3.2.12), recommendations and classifications for a given set of human-defined objectives.

Note 2 to entry: AI systems can be designed to operate with varying levels of automation.

Here, there is a consideration of what kind of system is considered, notably an engineered one. This is interesting as previous definitions have been somewhat ambiguous about what technologies, in fact, will fall under such legislation. There is also a focus on the cooperation of different entities, not specified as human or otherwise. Notably, they do not mention the origin and what kind of input is being processed, though through “varying levels of automation” it can be inferred that it covers the balance between human or non-human inputs, thus offering varying levels of autonomy.

South Korea

South Korea also adopted their definition of AI system in their 2023 AI Act, and it reads as follows:

Article 2 (Definitions) As used in this Act, the following terms have the following meanings.

  1. “Artificial intelligence” refers to the electronic implementation of human intellectual abilities such as learning, reasoning, perception, judgment, and language comprehension.

  2. “Artificial intelligence technology” means hardware technology required to implement artificial intelligence, software technology that systematically supports it, or technology for utilizing it.

While the Act does not mention AI systems, it attributes human characteristics, like perception, to an electronic entity. It does not mention “decisions” either, but attributing human characteristics perhaps makes that point redundant, as the entity can be interpreted as an actor, acting on a similar level as humans. Further, the Act is expansive on what technology is considered AI, as even a cable providing power can, under the current definition, be classified as a piece of AI technology.

US Executive Order

In the last part of 2023, the Biden administration issued an executive order in which it defined an AI system:

“a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. Artificial intelligence systems use machine- and human-based inputs to perceive real and virtual environments; abstract such perceptions into models through analysis in an automated manner; and use model inference to formulate options for information or action.”

Here, the Biden administration merges human and machine-based inputs, highlighting the cooperation between the two actors. And while not legally binding, it shows intent. It shows more caution and perhaps skepticism regarding AI acting autonomously, as compared to any other of the major actors. Interestingly, the distinction between virtual and “real” (assuming this means physical, though the wording of it remains problematic) environments shows a similar skepticism to the scope and spheres that the Biden administration is interested in AI occupying. This limits the controversial issue of potential autonomy present in previous definitions, though it limits communication between systems independently of human inputs, which can prove problematic in practice.

Answers we are excited to see

As we enter into an important legislative year for AI, we are looking forward to getting answers to the following questions regarding the legal definitions of AI systems:

  • What definition of personhood will accompany the AI systems definition in the AI Act? And what does this mean for the intellectual protection of something entirely made by an AI, considering that it allows for large amounts of autonomy? That is, if it indeed follows the same definition as the OECD. 
  • What kind of technology will be considered to be AI? Will it range from Excel spreadsheets to LLMs? Are we considering “machine-based systems,” an “engineered system” or something else?
  • Will legislation be strong enough, or perhaps broad enough, to encompass the massive changes AI is currently undergoing? And what predictions can we infer that the EU is making on behalf of the future advancements of AI?

The post A comparative view of AI definitions as we move toward standardization appeared first on Voices of Open Source.

Categories: FLOSS Research

Open Source AI Definition: Where it stands and what’s ahead

Open Source Initiative - Wed, 2024-02-07 03:00

2023 was a big year of progress toward our goal of establishing a Definition for Open Source AI, but we still have a long way to go. Strong momentum of global collaboration toward this end will continue in 2024, and we need your help.

I detailed what we have accomplished and what lies ahead in a talk at FOSDEM and in the online public town hall meetings. There are more town hall meetings scheduled already, every two weeks at alternating times.

We began the drafting process by looking at the Free Software Definition during in-person and online sessions last year. The current draft v.0.0.5 of the Open Source AI Definition is as follows:

What is Open Source AI

To be Open Source, an AI system needs to be available under legal terms that grant the freedoms to:

  • Use the system for any purpose and without having to ask for permission.
  • Study how the system works and inspect its components.
  • Modify the system to change its recommendations, predictions or decisions to adapt to your needs.
  • Share the system with or without modifications, for any purpose.
From Open Source AI Definition draft 0.0.5

However, in order to get to the complete draft, we need to answer the following question: What is the preferred form to make modifications to an AI system?

The specifications to consider are outlined in this diagram:

I also presented the 2024 timeline:

TL;DR is we are working toward an Open Source AI Definition release candidate 1 (RC1) by early summer and a version 1 (v. 1.0) in October. We have established working groups to analyze all the components of popular AI systems (like Llama2, Pythia, BLOOM, OpenCV), and new drafts of the Definition will be released monthly with town hall meetings every two weeks and constant public review. 

RC1 must be supported by at least 2 representatives for each of the 6 stakeholder groups.

V. 1.0 must be endorsed by at least 5 representatives for each of the 6 stakeholder groups.
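
As a rough illustration only, the two thresholds above can be expressed as a small check. The sketch below is in Python; the function name, the group labels and the tallies are hypothetical placeholders and not part of the OSI process.

```python
from typing import Mapping

# Minimum number of representatives required per stakeholder group,
# taken from the two rules stated above.
THRESHOLDS = {"RC1": 2, "v1.0": 5}
EXPECTED_GROUPS = 6


def meets_threshold(stage: str, supporters: Mapping[str, int]) -> bool:
    """Return True if each of the 6 groups has enough representatives."""
    required = THRESHOLDS[stage]
    if len(supporters) != EXPECTED_GROUPS:
        return False
    return all(count >= required for count in supporters.values())


# Hypothetical tallies; the group labels are placeholders.
tally = {f"group_{i}": 3 for i in range(1, 7)}
rc1_ready = meets_threshold("RC1", tally)
v1_ready = meets_threshold("v1.0", tally)
```

With three representatives in every group, this made-up tally would be enough for RC1 but not yet for v. 1.0.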

As this Definition is the first one maintained by OSI to have a version number, we will have rules for maintenance and review as the technical and legal landscape of AI continues to change. The Board started working on this task.

Following are the 6 stakeholder groups we have identified: Invite them to join the forum or tell them to email/contact me.

Next steps
  • Bi-weekly town halls to make the process more public.
  • Outreach to get more stakeholders involved.
  • Raise more funds to support this work in 2024.
  • OSI is updating the project landing page and engaging the board in preparation for review and approval of v. 1.0 later this year.

The public draft and comments are available from the redesigned landing page at https://opensource.org/deepdive/. We invite you to join the conversation, and don’t forget to become an OSI member!

The post Open Source AI Definition: Where it stands and what’s ahead appeared first on Voices of Open Source.

Categories: FLOSS Research

The European regulators listened to the Open Source communities!

Open Source Initiative - Fri, 2024-02-02 04:10

During 2023, OSI and many others across the Open Source communities spent a great deal of time and energy engaging with the various co-legislators of the European Union (EU) concerning the Cyber Resilience Act (CRA). Together with a revision to Europe’s Product Liability Directive (PLD), the CRA will bring the responsibilities of product liability to software for the first time.

In the light of the EU’s own research showing the huge impact of Open Source on Europe’s economy, the authors of these legislative instruments sought to ensure that the lifecycle of Open Source software was impacted as little as possible. Indeed, at FOSDEM 2023 the authors of the CRA and PLD said as much in their first-of-a-kind main track appearance. But when we all looked at the details, community members found that was not as true as we hoped. As a range of organizations explained, the CRA was likely to be an existential threat to Open Source development, because instead of placing all the compliance requirements of the CRA on companies deploying Open Source software for profit, the obligations as written potentially fell on developers and Open Source foundations.

Reactions To The Final Text

Many OSI Affiliates engaged with the European Commission, European Parliament and European Council during 2023. With the welcome coordination of Open Forum Europe, a group met regularly to keep track of progress explaining the issues. Many of us also committed time and travel to meet in-person. As a result of all this effort from so many people, the final text of the CRA mitigated pretty much all the risks we had identified to individual developers and to Open Source foundations. As the Python Software Foundation said in their update:

…the final text demonstrates a crisper understanding of how open source software works and the value it provides to the overall ecosystem of software development.

And the Eclipse Foundation wrote:

The revised legislation has vastly improved its exclusion of open source projects, communities, foundations, and their development and package distribution platforms. It also creates a new form of economic actor, the “open source steward,” which acknowledges the role played by foundations and platforms in the open source ecosystem.

As the Apache Software Foundation said:

So, all in all, this is mostly good news for volunteers who run and innovate with open source software. Or, more accurately, much better than most of us could have imagined at the end of last summer.

This time last year OSI recommended that the CRA:

…exclude all activities prior to commercial deployment of the software and … clearly ensure that responsibility for CE marks does not rest with any actor who is not a direct commercial beneficiary of deployment.

That recommendation has been accepted and implemented, and the OSI is very grateful to the various experts who took the time to listen.

OSI Observations

While it’s all much better, and while the burden placed on individuals and charities is minimal, there are still challenges ahead. For example, the concerns that the Debian project articulated give cause for thought. With Open Source projects exempted from the requirement to place a CE certification mark on their software, downstream users will need to pay careful attention to their responsibilities under the CRA as well as to their liabilities to consumers under the PLD.

In particular, “digital artisans” using Open Source software at small scale – the main concern of Debian – will need guidance from the European Commission. While the experts we have met have all said that using an Open Source software distribution as part of a commercial activity is unlikely to require CE marking of the distribution itself, the interpretation of the key phrase “making available on the market” will need careful clarification. OSI encourages the Commission to seek expert advice from the Open Source communities as they did last year, and not to rely on outsourced consultants alone in preparing this advice.

FOSDEM 2024

There is also the question of how future engagement by legislators should proceed. The effort made by developers and Open Source foundations in 2023 is not sustainable, and the Commission needs to accommodate the Fourth Sector in future deliberations. To get this started, a group of us who have engaged during 2023 got together to organize a unique set of workshops at FOSDEM 2024 on Sunday February 4. If you want your voice heard, come along to one of the workshops!

The post The European regulators listened to the Open Source communities! appeared first on Voices of Open Source.

Categories: FLOSS Research

Announcing: The 2024 State of Open Source Report

Open Source Initiative - Thu, 2024-02-01 09:30

Brussels, February 1, 2024 – Today, the results of the annual Open Source survey conducted by OpenLogic by Perforce in collaboration with the OSI and the Eclipse Foundation were shared in the 2024 State of Open Source Report.

The 2024 State of Open Source Report sheds light on the factors driving Open Source Software (OSS) adoption, the most in-demand Open Source technologies, and the difficulties that teams using OSS most frequently encounter. Also covered in the report is support and planning for end-of-life (EOL) or soon-to-be EOL software.

More than 2,000 open source users working across numerous industries all over the world answered more than two dozen questions about the use and support of OSS by their organizations, from large enterprises to early-stage startups.

Open Source practitioners and IT leadership alike should find the report enlightening. Three things in particular caught my interest:

  • OpenTofu already has 30% as many users as Terraform
  • OpenSearch has 50% of the users of ElasticSearch
  • OSI is the third organization by donor after the Linux Foundation and the Apache Software Foundation! We made a lot of progress, folks!

Also of interest is rapid growth in the AI/ML/DL space, both in the data itself and the concurrent investment in Open Source data technologies. OSI has been on a mission to establish a Definition of Open Source AI, so this trend is something we’re watching closely. 

Thank you to Perforce and the Eclipse Foundation for the production of this valuable resource. Please share the 2024 State of Open Source Report far and wide! 

I will be participating in a webinar along with Perforce Open Source Evangelist Javier Perez and Eclipse Foundation Director of Product Marketing Clark Roundy, on February 22nd. Register here and continue to support Open Source and be a part of the conversation!

The post Announcing: The 2024 State of Open Source Report appeared first on Voices of Open Source.

Categories: FLOSS Research

The OSI board expands, adding two new seats; focus on AI and international policies

Open Source Initiative - Thu, 2024-02-01 04:34

At the August board meeting, the OSI board voted to add two new appointed seats, and at the December board meeting named Professor Sayeed Choudhury and Gaël Blondelle as new board members.

The expansion of the board was voted upon to give greater operational stability and continuity to the organization. The rationale for this decision is explained in further detail in the August 2023 board meeting minutes. The new composition of the board is:

  • four directors elected among individual members (seated for two years), 
  • four directors elected among representatives of Affiliate organizations (seated for three years), and
  • four directors (previously two) appointed by the board (seated for three years).

All four board-appointed seats are carefully selected by the board based on the strategic priorities of the OSI. Professor Sayeed Choudhury and Gaël Blondelle were chosen to fill the two new board seats because of their expertise in areas that will be most relevant to the OSI in coming years: AI and international policies.

The skills and contacts Sayeed and Gaël bring to the board will serve the OSI’s mission and goals moving forward. The two new board members will also be instrumental in the fundraising efforts of the organization with their deep networks of corporate donors and grant givers.

Sayeed Choudhury

Sayeed Choudhury is the associate dean for digital infrastructure and director of the Open Source Programs Office (OSPO) at Carnegie Mellon University. He started the first OSPO based at a US university while at Johns Hopkins University. He is the director of an Alfred P. Sloan Foundation grant for coordination of University OSPOs and a co-investigator for the Black Beyond Data Project. He is the software task force leader and member of the steering committee for the Research Data Alliance (RDA) – US. Choudhury was a President Obama appointee to the National Museum and Library Services board. He has testified for the Research Subcommittee of the Congressional Committee on Science, Space and Technology.

“The Open Source Initiative plays an important role in the Open Source ecosystem from a community, legal and policy perspective,” said Choudhury. “Carnegie Mellon University has recently launched two initiatives that focus on impact from Open Source software — Ecosystem for Next Generation Infrastructure (ENGIN) and Open Forum for AI (OFAI). I look forward to partnering with the OSI board and working with the OSI membership on these initiatives and other programs being advanced by the OSI.”

Gaël Blondelle 

Gaël Blondelle joined the Eclipse Foundation in 2013 and now serves as chief membership officer. He has been involved in the Open Source arena for more than 18 years in a number of key roles. Blondelle co-founded an Open Source start-up and worked as its chief technology officer. He then worked in business development for an Open Source systems integration company and managed a strategic research project aiming to create an Open Source ecosystem with major industrial players. Blondelle joined the Eclipse Foundation to pursue his goal of helping more companies work in Open Source, and to grow open, innovative and collaborative ecosystems.

“I am honored to join the OSI board, and look forward to helping the OSI onboard more sponsors and affiliates globally,” said Blondelle. “The work being done on the Open Source AI Definition is fantastic and we need an organization like the OSI to stand for Open Source AI in an elaborated and well articulated way. At the same time, we also need to stand for the Open Source Definition (OSD) that is regularly under attack from different sides. The OSD has enabled the development of Open Source technologies over the last 25 years, and we need to make sure this continues.”

Please join us in welcoming these two new board members to the OSI!

The post The OSI board expands, adding two new seats; focus on AI and international policies appeared first on Voices of Open Source.

Categories: FLOSS Research

Fixing a gap in the SEP regulation

Open Source Initiative - Wed, 2024-01-31 04:09

In OSI’s feedback to the European Commission’s proposed Standard Essential Patent (SEP) Regulation (SEP-R), OSI recommended that the legislation add a waiting period for patent claims registered as standard-essential after the standard has been ratified. The recommendation was based on the social purpose of tolerating the presence of royalty-due patents in standards at all.

SEPs in context

Royalty-due SEPs are an artifact of a requirements-led standardization process. Not all standards are affected by SEPs, and not all SEPs require licensing on royalty-due terms. While some standards are encumbered by patents registered by contributors to the standards process, patents are not an essential or inherent aspect of standardization.

Patents are mechanisms that exist for a societal reason — in order to create a benefit to society by encouraging inventors to openly share their techniques — not because there is any inherent “property” to recognize. So it is incumbent on government administrations to regulate their use so that a societal benefit is preserved.

As I explained for Open Forum Europe, some standards are developed in a sequence of activities that starts from a statement of requirements aiming to create a new market (“requirements-led”) while others are developed as a harmonization of existing industry implementations in an existing market (“implementation-led”).

  • The implementation-led approach (harmonizing existing markets) frequently arises in circumstances where recovery of R&D costs is already in hand and patent monetization is not a proportionate compromise. As a result, projects developed under an implementation-led approach (such as at OASIS and W3C) frequently opt for restriction-free (RF) terms that result in a negotiation-free usage since royalties are waived and do not need to be negotiated.
  • The requirements-led approach (specifying the interoperability for a future market) leads some standards development organizations (SDOs) to tolerate restricted licensing of included patented technologies due to the long lead-times in research and development investment by standards contributors. While royalty-due and negotiation-required licensing of SEPs is desirable for the commercial entities benefiting from the tradition, the bilateral negotiation with NDA-enforced privacy that results gives the incumbents market power that could be easily interpreted as anti-competitive.

Despite the practice of accommodating royalty-due patents in standards leading to barriers to entry in the resulting markets, tolerating SEP monetization appears a compromise that its advocates assert can be a proportionate remedy to the delayed monetization opportunity for participants. As a result SDOs put in place safeguards during the standardization process to avoid triggering anti-trust regulations, such as ensuring equal terms of participation for all in the process, requiring disclosure by participants of patents that could prove standard-essential, and especially in requiring negotiated terms to be “Fair, Reasonable And Nondiscriminatory” (FRAND) — although not backing this up practically.

Bugs in the process

But these SDO safeguards only prevent the SDO itself from being regarded as anti-competitive, and do nothing to protect the markets that go on to be created by requirements-led standards.

  1. What needs licensing is unclear. While the patents of those involved in the standardization process will have been declared, the resulting standard may not embody their claims, and others outside the SDO may make claims. Published standards are thus not accompanied by a list of patents that need to be licensed for implementation. The task of identifying exactly which patents need to be licensed for exactly which parts of the standard is therefore significant. That burden is only placed on smaller innovators and market entrants; the incumbents are likely to have cross-licensing agreements in place, making their market participation simpler and cheaper.
  2. Power is with the incumbents. While the term “FRAND” (Fair, Reasonable And Nondiscriminatory patent licensing terms) is much used, the reality is that the negotiations for patent licenses are 1:1 and conducted in commercial secrecy under NDA. There is no way any party can know if the terms they are offered are like those offered to others, and the power is imbalanced heavily in favor of the patent owner who will use early legal proceedings to force a conclusion. Since the patent owners are frequently the dominant market players, small companies and new market entrants are at a significant disadvantage.
  3. The cost of licensing is unknown. Since each patent is likely to need separate negotiation with large corporations, it’s hard to know what the cost of licensing a given standard will be, even after the list has been painstakingly built.
  4. Patent pools can demand unwarranted licensing. Patent pools are held up as a partial remedy for this. They sometimes list all the patents they are licensing but don’t explain why they are essential. As a result, the lists they produce can be inaccurate, especially when the pool is not connected with the standardization process.
Better markets with SEP-R

The proposed Standard Essential Patent Regulation addresses many of these issues as part of its proposals, and that’s the reason OSI broadly welcomed the proposal. Where royalty-due patents in standards are present, they should at least function to create a fair market for both patent owners and licensees.

OSI’s concern relates to a potential loophole in the new arrangements. Knowing that some patent owners prefer not to participate in standardization activity, and that some owners prefer to be as non-specific about essentiality as possible, OSI was concerned that the otherwise excellent public registration system might be ignored by some patent owners in order to bias the market towards adoption before the full costs are known, and that those owners might deem themselves free to disregard the regulation’s collective pricing measures. OSI considers this a gap in the regulation.

The late registration gap

Because of the improvements in SEP-R, implementers will be able to know which entities will require negotiation and assess whether to use the standard based on the registrations made by patent owners as well as on the collectively-agreed total royalty. But there is a risk the improvements will be avoided intentionally by some patent owners.

  • Late-registered patents are likely to be those not arising from the standards process. They are unlikely to be owned by participants in the collectively-agreed total royalty.
  • Since implementers could not take these patents and their burden into account, their late registration is likely to require at best revised costings, probably new engineering, and at worst market withdrawal by some implementers.
  • This represents market harm and needs to be discouraged and those in the market protected.
  • But SEP-R does not do so, leaving predatory late disclosure as a viable control point for incumbents and NPEs (trolls) whose advantage has been impacted by SEP-R.
  • The only major consequence of late-stage registration is the loss of royalties before the registration is valid; however, for a widely adopted standard this is likely to be of small consequence to the SEP owner over the long term. The market will already have formed and such a delay will significantly impact companies with products already in the market. Products with Open Source elements will be more significantly affected as they will likely need to remove affected capabilities.
Possible remedies

Recognizing that the existence of patents is for the enablement of a social good from an effective market, and recognizing that late registration of patents as essential to a standard after it has been promulgated harms those trusting the registry, it seems reasonable to apply a remedy both to ameliorate and discourage late registration. The best remedy to late registration would be to simply prevent any patents registered as essential to a standard from being able to claim any royalties in association with the implementation of the standard.

Realistically, this option would face huge opposition from SEP-dependent corporations and would be better considered a long term goal. 

Instead, OSI proposed that registering a patent as essential after the market has adopted a standard affected by it should result in a waiting period before royalties could be claimed. This would allow time for the adjustment of the allocation of the total estimated cost of licensing to accommodate the new patent, as well as allow the market to adjust to the new reality.

Given the pace at which these changes will be made, it seems reasonable to have a waiting period of at least two years from registration before patent royalties can become due.
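
To make the proposed waiting period concrete, here is a minimal Python sketch of the date arithmetic, assuming the two-year minimum suggested above. The function names are hypothetical, and the handling of a 29 February registration (rolling forward to 1 March) is an assumption of this sketch, not something the proposal specifies.

```python
from datetime import date

WAITING_YEARS = 2  # "at least two years", per the proposal above


def earliest_royalty_date(registered: date, years: int = WAITING_YEARS) -> date:
    """Earliest date royalties could become due after a late registration."""
    try:
        return registered.replace(year=registered.year + years)
    except ValueError:
        # Registered on 29 February and the target year is not a leap year:
        # roll forward to 1 March so the wait is never shorter than intended.
        return date(registered.year + years, 3, 1)


def royalties_due(registered: date, on: date) -> bool:
    """True once the waiting period for a late-registered patent has elapsed."""
    return on >= earliest_royalty_date(registered)
```

Under these assumptions, a patent registered as essential on 31 January 2024 could not attract royalties before 31 January 2026.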

Notes, Tags & Mentions
  • OSI’s interest in this topic arises from the well-documented reluctance of Open Source developers to entertain patent-encumbered standards. Their presence can sometimes be accommodated but reduces the stochastic confidence level that leads to Open Source being an effective trigger for innovation.
  • To read a similar discussion but from an Open Source perspective, see the OSI blog and my earlier article exploring the topic.
  • OSI made an earlier submission to the consultation and also published a corresponding article.

The post “Fixing a gap in the SEP regulation” appeared first on Voices of Open Source.

Categories: FLOSS Research

A public forum to discuss the Open Source AI Definition

Open Source Initiative - Fri, 2024-01-26 06:32

OSI announced a public forum to widen the conversations that will lead to version 1.0 of the Open Source AI Definition. The forum is part of our commitment to inclusiveness and transparency, matching the public town hall meetings that started two weeks ago.

The public forum’s goal is to welcome the broader community to engage in the conversations surrounding AI. There is only one category at the moment, but we plan to expand the forum’s scope over time.

Access to the forum is restricted to OSI members. If you’re not already a member, you can register now for free, or you can use this as an opportunity to support OSI’s work and become a full member: donating $50 or more will give you the option to vote in the board’s upcoming election and support our programs. OSI’s membership also allows you to submit a story to OpenSource.net, the community-based magazine.

The video recordings and slides from the first two town halls are on the forum.

By making the discussions accessible to the public, we believe that we can encourage greater participation, diversity of perspectives, and ultimately, a more comprehensive understanding of the challenges and opportunities in Open Source AI.

We have an aggressive timeline: we know we must get to version 1.0 quickly, but we also want to get there with the right amount of support from a wide set of stakeholders. Join the forum today and help us speed up the process.

The post “A public forum to discuss the Open Source AI Definition” appeared first on Voices of Open Source.

Categories: FLOSS Research

How OSI will renew its board of directors in 2024

Open Source Initiative - Tue, 2024-01-16 08:19

In the next few weeks, the OSI board of directors will renew three of its seats with an open election process among its full individual members and affiliates. There will be two elections in March, running in parallel:

  • The affiliate organizations will elect one director
  • Individual members will elect two directors

The results of elections for both Individual and Affiliate member board seats are advisory, with the OSI Board making the formal appointments to open seats based on the community’s votes.

Sign up now to become a full individual member (Supporting or Professional) to qualify as a candidate when the application opens on Feb 5th.

2024 elections timeline

The role of the board of directors

The board of directors is the ultimate authority responsible for the Open Source Initiative as a California public benefit corporation, with 501(c)(3) tax-exempt status. The board’s responsibilities include oversight of the organization, approving the budget and supporting the executive director and staff to fulfill its mission. The OSI isn’t a volunteer-run organization anymore and the role of the directors has changed accordingly.

Each director is expected to be a counsel and a guide for staff rather than an active contributor. Directors should guide discussions, support the vision and mission of the organization, and advocate for the OSI. They’re also asked to support the fundraising efforts however they feel comfortable doing.

The board is governed by the bylaws. Each board member is expected to sign the board member agreement. Depending on expertise and availability, directors are expected to serve on the active committees: the license, fundraising, standards and financial committees.

Candidates will be asked to share their ideas on how they’ll contribute to the vision and mission, and the 2024 strategic objectives.

The rules for how OSI runs the elections are published on our website. We’ll communicate more details in the coming weeks: stay tuned for announcements on our social media channels (Fediverse, LinkedIn, Twitter).

Affiliate organizations will receive instructions via email.

The post “How OSI will renew its board of directors in 2024” appeared first on Voices of Open Source.

Categories: FLOSS Research

A historic view of the practice to delay releasing Open Source software: OSI’s report

Open Source Initiative - Wed, 2024-01-10 10:00

The Open Source Initiative today published a new report that looks at the history of the business practice of delaying the release of code under freedom-respecting licenses. Since the early days of the Open Source movement, companies have experimented with finding a balance between granting their users the basic freedoms guaranteed by Open Source licenses and capitalizing on their investments in software development. One common approach, albeit with many different flavors, is what this report calls “Delayed Open Source Publication” (DOSP): “the practice of distributing or publicly deploying software under a proprietary license at first, then subsequently and in a planned fashion publishing that software’s source code under an Open Source license.”

The new report, titled “Delayed Open Source Publication: A Survey of Historical and Current Practices”, was authored by the team of Open Tech Strategies (Seth Schoen, James Vasile and Karl Fogel) based on crowdsourced interviews. Their research was made possible through a donation by Sentry and the financial contributions of OSI individual members.

Like the authors, I found that the historical survey revealed numerous surprises, and what I found even more intriguing are the new questions raised (see Section 7) that beg for more dedicated research. 

I encourage you to give it a read and share it with others. We encourage feedback from the community: I hold office hours for OSI members and you can discuss this on Mastodon or LinkedIn.

Download the report.

The post “A historic view of the practice to delay releasing Open Source software: OSI’s report” appeared first on Voices of Open Source.

Categories: FLOSS Research

ClearlyDefined: recapping a year of progress and sharing a vision for 2024

Open Source Initiative - Mon, 2024-01-08 12:41

At the beginning of 2023, I started as a community manager for ClearlyDefined, with the goals of creating an open governance model for the project and helping the OSI to establish a neutral infrastructure to foster collaboration among multiple stakeholders. Thanks to the amazing work from our community members, a lot of progress has been made in 2023, but there’s still a lot of work ahead of us. In this post, we would like to highlight some milestones achieved this past year and acknowledge some individuals who have contributed to the project. We would also like to share a vision for 2024 and invite all organizations who care about the Open Source supply chain to become involved.

ClearlyDefined is an Open Source project and service that aims to serve as a global database of licensing metadata for every software component ever published. It was originally developed and used by Microsoft and it’s now in use at companies like GitHub, SAP, and Bloomberg, as well as Open Source projects like the Linux Foundation’s GUAC and ORT (OSS Review Toolkit). At the beginning of 2023, the Open Source Initiative took over as community steward of the project.

In the first quarter, outstanding work was done by Manny Martinez (Microsoft) in collaboration with Qing Tomlinson (SAP) to optimize ClearlyDefined’s back-end, particularly the database. This work has resulted in a 10-fold decrease in database size and costs.

In the second quarter, GitHub added 17.5 million package licenses sourced from ClearlyDefined to their database, expanding the license coverage for packages that appear in dependency graph, dependency insights, dependency review, and a repository’s software bill of materials (SBOM).

In the third quarter, we saw greater collaboration between GitHub and SAP spearheaded by E. Lynette Rayle and Qing Tomlinson. They are making improvements to the documentation and process of running a local ClearlyDefined harvest and sharing the licensing metadata with other harvesters.

In the fourth quarter, we saw various members currently using ClearlyDefined and new members alike coming together to create a unified vision for the project. Thomas Steenbergen, co-founder of ClearlyDefined and ORT, has come forward to help lead this effort. Key goals for ClearlyDefined in 2024 include:

  • Publishing periodic releases and switching to semantic versioning
  • Bringing dependencies up to date (in particular using the latest scancode)
  • Improving the NOASSERTION/OTHER issue (please check this analysis by Aleksandrs Volodjkins to learn more)
  • Advancing usability and the curation process through the UI 
  • Enhancing the documentation and process for creating a local harvest

ClearlyDefined’s mission is to help organizations collaboratively achieve accurate licensing metadata (oftentimes part of SBOMs) at scale, for each stage of the supply chain, for every build or release. If your organization is interested in achieving better compliance and security of the Open Source supply chain, please consider joining ClearlyDefined. We are still working to consolidate a roadmap for 2024, and this is a great time to join the project and learn more about how ClearlyDefined can help your organization.

The post “ClearlyDefined: recapping a year of progress and sharing a vision for 2024” appeared first on Voices of Open Source.

Categories: FLOSS Research
