OSS 2009 Doctoral Consortium Notes

OSS 2009 DC 5/3/2009

Proceedings at http://www.ua.ac.be/main.aspx?c=kris.ven&n=71197

Morning session: Firm involvement in OSS, innovation and economic issues


1. Juho Lindman - OSS Changes and Software Production Models

RQ: How does openness change software production models? IV - openness; DV - software production models (SPM)

SPMs - "mythical" OSS, inner source, shared source, collaboration w/ OSS communities

Looking beyond definition of OSS - bilateral audience of 1) hackers (identity) and 2) organizations (viability) - different business models and revenue logics

Setting - grad school of electronic business and software industry. To answer RQ, want to show how organizations behaved

5 papers: grounded theory, interviews, case studies, descriptive stats, narrative analysis
What is OSS?
How can OSS be implemented inside a company?
How are OSS licenses determined?
Why do organizations publish OSS code?
What are the (organizational) consequences of OSS?

Data: total of 42 interviews, usage data, email list data
Issues - how to combine 5 papers into a coherent dissertation?
"Mythical" OSS = idealized; it's not mythical, it's heterogeneous


2. Mario Schaarschmidt - The Use of External Knowledge in Firm-Driven OSS Development

Comparison of closed and open innovation - hidden potential of integrating external sources in core development processes.

1. OSS is done by hobbyists and students; running a business on solely voluntary work is impossible. Reality: 45% are experienced IT professionals, 20% sysadmins, 30% of professionals were paid by day jobs to develop OSS
2. IP
3. By making your software open source you'll get thousands of developers working on it for free
Reality: most projects only have a dozen or so at the core; some have peripheral members supporting the work; others have no volunteers at all. Plus, initiating firms have to maintain resources.

Commercialization approaches: 2x2 based on firm/community initiation, firm participation high/low - interested in firm initiated with high firm participation
Dahlander and Wallin 2006 - SNA and email data that include firms (check this for JAIS sub)

Conflict of control and authority desires for firm and open participation
RQ: How are firms balancing the tension between the wish for control and the desire for voluntary external participants?

Eclipse as case study - IBM did most of the work, opened the development because they couldn't competitively keep it closed.
Eclipse web site used to have committer list with their affiliations - 912 committers, 110 projects, 104 firms involved as of November 2008

Hypotheses: quite a few - relationships between number of firms involved, number of paid/voluntary committers, number of voluntary/paid leaders

Regression analysis - more paid leaders seems to relate to fewer volunteer leaders?
Numerous additional questions... Control variables? Project size as a mediator?

Findings: most firm-controlled projects don't integrate volunteers; firms try to retain control through paid sources

-How is leadership defined? - as shown on Eclipse listing for project.

-Eclipse foundation versus project - Kris - Can you get the data to do this kind of analysis from other foundations/forges? Comparison across hosts would be very interesting.
-Other control variables/mediators: time from founding to first release, number of releases, project lifespan, number of firms initially involved versus currently involved (e.g. did it start as a partnership or was it closed source that was opened?)
-Control - Yates 1989, Control Through Communication - formalized processes and communication flows indicate managerial control


3. Mohammad AlMarzouq - The FLOSS Marketplace

Drivers of success for FLOSS development (based on Crowston & Howison 2003); not all developers contribute indefinitely - how to manage resources?

RQ: What factors increase contributors to FLOSS communities?

Perspective: communities and developers as rational value-maximizing actors in a competitive marketplace, using transaction cost analysis (TCA) from Williamson. Viewing participation as a transaction, communities that reduce transaction costs can increase the perception of participation as a rational choice - therefore projects are more competitive if they can convince people to contribute by reducing the effort required.

Participation transaction - 6 stages, first 3 are learning, second 3 are coordination, which is where the costs arise.

DV: sustainability - how effectively are users converted to developers/contributors.
IVs: Knowledge codifiability, knowledge completeness, knowledge diversity: amount and variety.
Controls: age, language, license, number of contributors, popularity
Sampling: frame - top 1000 from ohloh.net; Python, C, C++ projects; single repository listed
Quarterly data for 2008; R packages - ohcount, igraph
Operationalizations: Knowledge completeness - proxy is modularity, using SNA measure for package dependencies
Knowledge diversity - SLoC at beginning of analysis period, number of modules
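Sketch of what a modularity score over package dependencies could look like. The notes don't record which SNA measure the presenter used, so this assumes Newman's modularity Q on an undirected graph with a given module assignment; the function name, the toy graph and the module labels are all made up for illustration.

```python
from collections import defaultdict


def modularity(edges, module_of):
    """Newman's modularity Q for an undirected graph and a given partition.

    edges: iterable of (a, b) pairs, e.g. "package a depends on package b".
           Direction is ignored here, which is a simplifying assumption.
    module_of: dict mapping every node to its module label.
    """
    edges = [tuple(e) for e in edges]
    m = len(edges)
    if m == 0:
        return 0.0

    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both ends in the same module
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
        if module_of[a] == module_of[b]:
            internal[module_of[a]] += 1

    degree_sum = defaultdict(int)  # total degree per module
    for node, d in degree.items():
        degree_sum[module_of[node]] += d

    # Q = sum over modules of (share of internal edges - expected share)
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy dependency graph: two tight modules joined by one link.
toy_edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
             ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
             ("a3", "b1")]
toy_modules = {"a1": "A", "a2": "A", "a3": "A",
               "b1": "B", "b2": "B", "b3": "B"}
q = modularity(toy_edges, toy_modules)
```

Higher Q means dependencies stay mostly inside modules; putting every package in one module gives Q = 0. Whether that tracks "knowledge completeness" is a separate question.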


Theoretical construct operationalization - issues of quantity versus quality - in general the operationalizations don't seem to really capture the definition of the constructs.
Diversity - amount, maybe also look at volume of communication in some period of time, because that is also important to getting up to speed in participation - # of messages in a given time period
Modularity - knowledge completeness - measure is great but not sure how well it captures the theoretical construct
Knowledge amount - sketchy measure - verbosity, quantity versus quality
Identifying new contributors - might be able to do w/ Mechanical Turk?
Sustainability - # of new developers joining revision control - how about duration of membership of starting developer core?
Knowledge codifiability - include documentation as well?
Knowledge completeness - are there established standards for contribution? Existence of contribution guidelines should reduce required effort for knowledge completeness.
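Sketch for the two sustainability ideas above: new developers showing up in revision control per quarter, and duration of membership. Assumes the commit log has already been reduced to (author, date) pairs with identities resolved; the function names and toy commits are hypothetical, not the presenter's data or method.

```python
from collections import Counter
from datetime import date


def quarter(d):
    """Label a date with its calendar quarter, e.g. 2008-Q1."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def first_and_last(commits):
    """Return (first, last) dicts mapping author -> earliest/latest commit date."""
    first, last = {}, {}
    for author, day in commits:
        first[author] = min(day, first.get(author, day))
        last[author] = max(day, last.get(author, day))
    return first, last


def new_contributors_per_quarter(commits):
    """Count authors whose first commit falls in each quarter."""
    first, _ = first_and_last(commits)
    return Counter(quarter(d) for d in first.values())


def tenure_days(commits):
    """Days between each author's first and last commit."""
    first, last = first_and_last(commits)
    return {a: (last[a] - first[a]).days for a in first}


# Hypothetical toy commit log.
toy_commits = [
    ("ann", date(2008, 1, 15)), ("ann", date(2008, 11, 2)),
    ("bob", date(2008, 2, 3)),
    ("cy", date(2008, 7, 9)), ("cy", date(2008, 8, 9)),
]
joined = new_contributors_per_quarter(toy_commits)
tenure = tenure_days(toy_commits)
```

Restricting tenure_days to the authors active in the first quarter would give the "starting developer core" variant.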

Modularity measure - would like to try this, maybe there's a collaboration opportunity?


4. Celina Raffl - Innovation through Cooperation, ICT Assessment and Design for an Inclusive Society

Value-laden design of technologies and effects on inclusiveness - ethical issues like freedom, openness and inclusion.
RQs - early stage, too many - goal is conveying policy and recommendations for inclusive FLOSS design.
Theoretically motivated issues - problems of timing, identification and integration for assessment, effective control - unpredictability of outcomes from corrective action (e.g. revenge effects)

Drawing on research on community-based FLOSS, social software, knowledge commons, participatory design, user/open innovation, collective intelligence

Questions lay out an interesting research agenda.
Does inclusivity relate to participation in the design, production, or use? Who can make changes to inclusivity in each of these aspects of technology?
Why FLOSS as an example of social inclusion? Transparency & ideology.
What problem are they trying to solve with the technology and who has a stake in that?
Try a policy lens - secondary analysis of policy related to social inclusion.
Big picture holistic perspective is difficult but really important. Doing the work in stages can build up that big picture without biting off too much to chew.

Not a survey - secondary data. Try longitudinal case study?


5. Eirini Kalliamvakou - Measuring Developer Contribution from Software Repository Data

SQO-OSS project - evaluating projects based on quality; understanding contribution quality requires defining quality and measuring it. Literature uses LOC for everything; however, that's not the full range of contribution - contribution to product (code) and contribution to process. Need to move beyond code and discuss coordination. Identity resolution is an issue with operationalization.


Identity resolution - maybe sample smaller projects where this might be less of an issue to work out?

IRC data - where are you going to get it? How about ethical issues with using content that is not expected to be public?
How about participation in the blogosphere?

"Amount of work" - what does this actually relate to? Just contribution counts - no idea how much effort went into a contribution. Novice developers put more work into the same volume of code.

What about the time dimension? The amount of time a person has been involved is going to have influence, longer time period of participation means more opportunity to contribute. Should control for duration of involvement in some fashion.
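One simple way to control for duration, as a sketch: divide contribution counts by time since first contribution. The 30-day unit, the function name and the toy events are assumptions for illustration, and this still says nothing about effort per contribution.

```python
from datetime import date, timedelta


def contribution_rate(events, as_of):
    """Contributions per 30 days of involvement.

    events: iterable of (person, date) contribution events.
    as_of: date at which involvement is measured; involvement runs from
           each person's first event to this date.
    """
    counts, first = {}, {}
    for person, day in events:
        counts[person] = counts.get(person, 0) + 1
        first[person] = min(day, first.get(person, day))

    rates = {}
    for person, n in counts.items():
        days = max((as_of - first[person]).days, 1)  # avoid dividing by zero
        rates[person] = n / (days / 30)
    return rates


# Hypothetical toy data: a long-time member and a recent joiner.
as_of = date(2009, 1, 1)
toy_events = (
    [("veteran", as_of - timedelta(days=360 - 10 * i)) for i in range(6)]
    + [("newcomer", as_of - timedelta(days=d)) for d in (60, 30, 5)]
)
rates = contribution_rate(toy_events, as_of)
```

Here the veteran has twice the raw count (6 vs 3) but a lower rate once duration is accounted for.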

Opportunity to identify distinct informal roles based on activities people engage in. This would be really useful!


Afternoon session - OSS communities and development


6. Giacomo Poderi - Legitimate Peripheral Participation in FLOSS CoPs

Has changed his focus since the proceedings were submitted - now looking at relationships of power between peripheral and central members.

Focus on mutual dependence of practice and community evolution. Background rationale - conflict, membership roles, distribution of power.

RQ: How do relationships of power mediate the evolution of community practices during integration processes?
Goal: provide empirically grounded framework to relate causes of conflict, strategies for resolution

Power relationships - relational, has three main characteristics - origination, basic nature, manifestation (what, why, how). Influences of power on members and conflict situation. Example of GNOME Bugsquad membership/joining.

Multiple case study, qual methods - participant observation, interviews, documents/archives analysis. Analysis is hermeneutic content analysis (inductive? deductive?)

Sociological framework for power relationship in FLOSS, links development of community members and of community work practices. Highlights potential barriers to integration and indicates strategies and factors for lowering barriers to participation.


How many cases? (one)

Joe - this screams activity theory. Can work with the existing framework.
Yeliz - consider content analysis instead of participant observation - all seem to agree that participant observation could be problematic for studying power.
Walt - importance of separating out types of power, e.g. authority and coercion.
Issue of roles - core/periphery and definition. These may vary by case.


7. (me)

-2006 MacCormack, Management Science v52 n7 - modularity in Netscape to Firefox
-influence of business context on modularity
-number of modules compared to number of people

structural complexity - coupling times lack of cohesion
separation of concerns - how a single issue/requirement spreads among components
number of issues per component
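Sketch of the last two measures, assuming each issue can be mapped to the components its fix touched (e.g. from commit messages); the mapping, names and toy data are hypothetical. Structural complexity is left out because the notes don't define how coupling or lack of cohesion would be measured.

```python
from collections import Counter


def issue_spread(issue_components):
    """Map each issue to the number of distinct components it touches."""
    return {issue: len(set(comps)) for issue, comps in issue_components.items()}


def issues_per_component(issue_components):
    """Count how many issues touch each component (once per issue)."""
    counts = Counter()
    for comps in issue_components.values():
        counts.update(set(comps))
    return counts


# Hypothetical toy mapping of issues to the components they touch.
toy_issues = {
    "issue-1": ["ui", "net"],
    "issue-2": ["ui"],
    "issue-3": ["ui", "net", "storage", "ui"],
}
spread = issue_spread(toy_issues)
load = issues_per_component(toy_issues)
```

A high spread suggests a concern that is poorly separated; a high load suggests a component that many concerns pass through.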


8. Yeliz Eseryel - Leadership in Apache FLOSS Teams

Apache - norms, IT infrastructure, F2F interactions - similarity to business context
How are leadership behaviors manifested? How are they perceived?

Recent memories for interviewees, also looking at key events (releases).

Leaders send more email in listservs - but may not be the only place that leadership is demonstrated. Overall activity levels may be a good indicator too. Leaders don't just talk more, they do more; there are also contributions from average users and quite a bit from non-leaders.

Boundary spanning is not evident in listservs, came more from interviews. Substantive task contribution focuses more on decision-making, includes more communication than other roles. Dynamics of leadership - how much of the contribution comes from which roles?


Task-oriented versus relationship-oriented - maybe due to event-based sampling? Perhaps at other times it would be more evenly split in terms of effort.

Chicken-and-egg with leadership and doing work - leader because they do the work; work because they are a leader.

Joe - more relationship management coming from leaders
Traits/skills causal to behavior? Lag effect of leadership, don't get credit right away. Also dynamics of leadership - shifts over time. Self-awareness of leadership in most cases.


9. Paula Bach - Supporting UX in CodePlex

Studying FLOSS within the walls of Microsoft - interest in studying designers. Need for UXD in order to advance FLOSS for non-technical users. Big projects like Firefox and OpenOffice.org and KDE have paid usability designers - what about projects on hosting sites?

Barriers to UXD in FLOSS: developer perceptions - of technical versus non-technical users, of usability problems, of simplicity and complexity; community integration - OSS culture, resources for UX work (ASCII art in emails!), no time for up-front design; process constraints - mental models from industry stifle innovation, feature bloat.

CodePlex - cultural value on UX and participants comfortable with being researched.

Theoretical framework - activity awareness. Activity theory didn't work effectively. Based on complex teamwork at the project level, not task level. Based on common ground, communities of practice, social capital, human development (how people help each other and learn).

Methodology - descriptive science and design science. Ethnomethodologically informed ethnography - Randall et al 2007.

Articulation work - mechanisms, the work you do to get work done (e.g., coordination).

Looking at design features for supporting UXD activities and contributors.


Joe - mockups? Doesn't understand what the design solution is. You've observed the theoretical ends and have understandings about interface design - what do the mockups look like that bring the pieces together? (UX workspace, to-do list, user research items, personas and scenarios, design space that can be uploaded-commented-iterated)

Set of tools that recreate functionality of code artifacts, but for design artifacts. Makes process transparent to everyone involved? Visibility and transparency are not the same. To what extent, if you institutionalize the design artifacts in this system, will that make the design/decision-making process more understandable to programmers? (That is the issue - need to go in and see. Design challenge from Mozilla Labs, but separate from development.)

Kris - research methodology - qualitative ethnography with design science - how do you designate the methodology? Is it mixed methods? Mixed methods is often used with both qualitative and quantitative research, but didn't see quant research here. Where do you position the work?

Walt - one aspect of UX is that it's mediated by the current interface; people learn to adapt to the interface. How do you deal with it when you go from descriptive to prescriptive? You create a new interface, which creates new experiences, which create new usability concerns - do you only do it once and it's done? (in the projects, or in the designs that were created?) Based on the descriptive work, translation to prescriptive - new experiences upon implementation, need more research to improve the interface, etc. - iterative development of the UX support interface itself.

Kris - in the research design, you were going to study UX for CodePlex specifically, not FLOSS in general. So would the work generalize? (issue of ambiguity in types of projects) Would formulate the research question more broadly, for open source as a whole - if you did the same for SourceForge, would SF be in the research question? (inclusion of host site dev team in the model - does the way CodePlex development works affect the observations)

Joe - Participatory design that leverages crowdsourcing wasn't mentioned. (participatory design - getting the users to help with development) Why crowdsourcing and not participatory design? (crowdsourcing is a larger scale phenomenon - tied to innovation; difference from usual concept of participatory design) If crowdsourcing is of interest, you can potentially apply this more broadly than FLOSS - whether users articulate design possibilities, make collective decisions by rating designs, etc. If you're in design science, it's all about utility, so the mechanism of how to use the crowd is important - if you could implement that in a toolkit, it would be very useful.


Lightning round talks

Radhu - Pattern-based methods for requirements discovery and classification in FLOSS development. FLOSS has requirements, but they aren't formalized - have to read between the lines. Trying to discover and classify those requirements, and the distribution of requirements through the life of a project. Using NLP to identify the requirements, using 5 levels (token, POS, RE-Logic, RE-model, domain). Using feature requests in SF based on the idea that there will be high requirements density in feature requests.

Kelly - Historical grounding of coordination, organization and authority in knowledge production, comparing making and presentation of OED and Grimm's Deutsches Wörterbuch - different production modes. Relevance to FLOSS - heterogeneity of contributors, distributed work, basis of success and failure, maintaining quality control w/o alienating contributors (!), indications of reasons for proliferation. Comparison to Wikipedia and FLOSS - so fundamentally about heterogeneous distributed collaboration.

Yixin - Open innovation and open source - how are they related wrt firm-community relationships? Issues of ownership with open innovation, is it an original product or a mashup, and what's the source of the production, individual versus community. How does meaning of innovation change based on source of production and ownership?

Steffen - Customization of FLOSS in companies - adaptations without feedback to the project, little research on FLOSS maintenance in companies. Studied a CMS project in a startup for several iterations - action research (participatory, qualitative) - found specific situations during FLOSS customization, proposes a process model for FLOSS customization. Also analyzed 3 popular frameworks' support for FLOSS customization.

Oh, and the Swedes use Ikea too.