AutoGradeLinuxSNA

A cross-disciplinary research project at the intersection of Software Engineering, Network Science, Strategic Management, and the exchange of Knowledge and Information.

About

AutoGradeLinuxSNA is a research project mining the Automotive Grade Linux (AGL) software ecosystem with Social Network Analysis (SNA).

AGL is a collaborative open source project that is bringing together automakers, suppliers and technology companies to accelerate the development and adoption of a fully open software stack for the connected car. With Linux at its core, AGL is developing an open platform from the ground up that can serve as the de facto industry standard to enable rapid development of new features and technologies. If you drive high-end cars from Japanese and German vendors, it is quite likely that you are using AGL without knowing it.

Image of a neural network

Motivation

  • To what extent are cars powered by open-source software?
  • Why and how are big car-makers like Toyota, Daimler and BMW open-sourcing the production of advanced and complex technologies that was previously done behind closed doors?
  • In turbulent times for the automotive industry, why do key players cooperate with competitors in an open-source way (i.e., engaging in open-coopetition)?

Key references

Key theoretical references
On open-coopetition
  • Teixeira, J. A. (2023). Towards understanding open-coopetition -- Lessons from the automotive industry. In Proceedings of the 44th International Conference on Information Systems (ICIS 2023) AIS. Open-access right here.
  • Teixeira, J. A., Ahmed, S. S., Laine-Kronberg, L., Mezei, J., & Smailhodzic, E. (2025). Towards understanding open and coopetitive platform ecosystems: The case of TensorFlow. In Proceedings of the 33rd European Conference on Information Systems (ECIS 2025) AIS (accepted 24 April 2025). Open-access right here.
On commercial involvement in OSS projects
  • Li, X., Zhang, Y., Osborne, C., Zhou, M., Jin, Z., & Liu, H. (2025). Systematic literature review of commercial participation in open source software. ACM Transactions on Software Engineering and Methodology, 34(2), 1-31. Available via ACM at https://doi.org/10.1145/3690632.
  • Qin, M., Zhang, Y., Zhou, M., Wang, Z., Li, H., & Liu, H. (2025). Developers’ Views on Commercial Involvement in OSS-A Survey from Three Projects. IEEE Transactions on Software Engineering. Available via IEEE dx.doi.org/10.1109/TSE.2025.3568056.
  • Wissel, J., Zaggl, M., & Lindberg, A. (2020). Control vs freedom: how companies manage knowledge sharing with open source software communities. Proceedings of the 53rd Hawaii International Conference on System Sciences, IEEE. Open-access at http://hdl.handle.net/10125/64343
On ecosystems thinking
On IP in coopetitive relationships in the auto industry
  • Holgersson, M. (2018). Technology-based coopetition and intellectual property management. Routledge Companion to Coopetition Strategies, 1993. Available at https://www.ip-research.org/
Key methodological references
Mining of Software Repositories with SNA
  • Teixeira, J., Robles, G., & González-Barahona, J. M. (2015). Lessons learned from applying social network analysis on an industrial Free/Libre/Open Source Software ecosystem. Journal of Internet Services and Applications, 6, 1-27. Open-access at https://link.springer.com/article/10.1186/s13174-015-0028-2.
  • Teixeira, J., Mian, S., & Hytti, U. (2016). Cooperation Among Competitors in the Open-Source Arena: The Case of Openstack. In Proceedings of the 37th International Conference on Information Systems (ICIS 2016) AIS. Open-access at https://arxiv.org/abs/1612.09462 .
  • Herbold, S., Amirfallah, A., Trautsch, F., & Grabowski, J. (2021). A systematic mapping study of developer social network research. Journal of Systems and Software, 171, 110802. Closed-access at https://www.sciencedirect.com/science/article/pii/S0164121220302077 .
  • Osborne, C., Daneshyan, F., He, R., Ye, H., Zhang, Y., & Zhou, M. (2025). Characterising Open Source Co-opetition in Company-hosted Open Source Software Projects: The Cases of PyTorch, TensorFlow, and Transformers. Proceedings of the ACM on Human-Computer Interaction, 9(2), 1-30. Open-access at https://doi.org/10.1145/3710944.
  • Zhu, J., & Wei, J. (2019, May). An empirical study of multiple names and email addresses in oss version control repositories. In 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) (pp. 409-420). IEEE. Closed-access at https://ieeexplore.ieee.org/abstract/document/8816766.
  • Schreiber, R. R., & Zylka, M. P. (2020). Social network analysis in software development projects: A systematic literature review. International Journal of Software Engineering and Knowledge Engineering, 30(03), 321-362. Closed-access at https://doi.org/10.1142/S021819402050014X.
    More and more companies participate in open source software to gain competitive advantages. This leads to interesting research fields in the collaboration of these competing companies and their software developer in these projects. -- Schreiber, R. R., & Zylka, M. P. (2020) p.24
SNA evolution theory
  • Contractor, N. S., Wasserman, S., & Faust, K. (2006). Testing multitheoretical, multilevel hypotheses about organizational networks: An analytic framework and empirical example. Academy of Management Review, 31(3), 681-703. Available at JSTOR https://www.jstor.org/stable/pdf/20159236.pdf.
Longitudinal SNA
  • Block, P., Stadtfeld, C., & Snijders, T. A. (2019). Forms of dependence: Comparing SAOMs and ERGMs from basic principles. Sociological Methods & Research, 48(1), 202-239. Closed-access on ResearchGate at https://www.researchgate.net/profile/Per-Block.
Case study research
Visualization of collaborative activities
  • Isenberg, P., Elmqvist, N., Scholtz, J., Cernea, D., Ma, K. L., & Hagen, H. (2011). Collaborative visualization: Definition, challenges, and research agenda. Information Visualization, 10(4), 310-326.
  • Matusiak, K. K., Osinska, V., Organisciak, P., & Thomas Pitts, R. (2025). Research methods and the use of visual representation in library and information science research. Journal of the Association for Information Science and Technology, 76(3), 527-544.

Aim, objectives and research questions

Overall aim:
  • Advance our understanding of open-source, open-coopetition and the automotive industry by looking at the case of Automotive Grade Linux;
Objectives:
Primary objectives:
  • Contribute to open-source motivations literature;
  • Contribute to open-coopetition literature;
  • Contribute to the literature on information and knowledge management in large, open and coopetitive inter-organisational networks/ecosystems;
  • Document the history and the evolution of Automotive Grade Linux;
  • Advance the mining of software repositories with Social Network Analysis;
Secondary objectives:
  • Contribute to platform ecosystems literature;
  • Contribute to coopetition literature;
  • Contribute to open-innovation literature;
  • Contribute to triple helix innovation literature;
  • Improve existing open-source tools that mine software repositories;
Research questions:
Regarding tech giants
  • RQ1) Why do car giants like Toyota and BMW co-produce advanced and complex technological platforms in an open-source way?
  • RQ2) Why are different organizations cooperating with competitors in the co-production of open-source automotive platforms such as AGL?
Regarding non-commercial organizations
  • RQ3) How did non-commercial entities contribute to the development of Automotive Grade Linux?
  • RQ4) How do the different freedoms of open and coopetitive platforms align with the different interests of commercial, non-commercial and governmental organizations?
Regarding Information and Knowledge Management
  • RQ5) How do AGL participants balance knowledge protection with knowledge sharing?
  • RQ6) How does the openness culture of open-source communities clash with the hierarchical nature of the automotive industry?

Research team

By chronological order as contributors worked on the project:



Methodological overview

Mixed-methods approach

We combined virtual ethnography (VE) with Social Network Analysis (SNA) of publicly available, naturally occurring open-source data, which allowed us to reconstruct and visualize the evolution of collaboration and information sharing in AGL as a sequence of networks. Knowledge from the VE informed the SNA, and vice versa, as we attempted not only to retrieve collaborative networks but also to interpret and explain them. We will also engage with active developers and a community manager to validate our preliminary results and findings.

Virtual Ethnography

We started by screening, using virtual-ethnographic methods, publicly available data such as company announcements, financial reports and the specialized press, which allowed us to gain insights into the industrial context. With that context, we could better design the mining of software repositories with SNA.

Social Network Analysis

After attaining a better understanding of the competitive dynamics of the automotive industry in general and AGL in particular, we started extracting and analysing the social network of the AGL community leveraging SNA (Scott, 2012; Wasserman and Faust, 1994), a method widely established across the social sciences (Borgatti and Foster, 2003; Uzzi, 1996; Wasserman and Faust, 1994; Watts, 2004).

For understanding the evolution of the code-based collaboration, we connect developers who work on the same file, constructing a network of collaboration activities among developers. With the visualization of the network over time, we gain insights on collaboration and rivalry within the software project.

How we modeled the network
Modelling collaboration from the source-code repositories change-log

The collaborative network during a certain time slice t can be formally defined as G_t = (V, A_v, E), where:
  • V = a set of nodes representing the developers contributing to the AGL open-source software project;
  • E = a set of edges, each connecting two developers who have worked on the same software source-code file;
  • A_v = a set of node attributes capturing each developer's company affiliation. This information is extracted from the email address of each developer.
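The definition above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the function names, the sample e-mail addresses and the choice of using the full e-mail domain as the affiliation label are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def affiliation(email):
    """Assumption: use the full e-mail domain as the company label."""
    return email.rsplit("@", 1)[-1].lower()


def build_network(changes):
    """Build (V, A_v, E) from (developer_email, file_path) pairs of one time slice."""
    developers_by_file = defaultdict(set)
    for email, path in changes:
        developers_by_file[path].add(email)

    nodes = {email for email, _ in changes}                       # V
    attributes = {email: affiliation(email) for email in nodes}   # A_v
    edges = set()                                                 # E
    for developers in developers_by_file.values():
        # Connect every pair of developers who touched the same file.
        for a, b in combinations(sorted(developers), 2):
            edges.add((a, b))
    return nodes, attributes, edges


# Hypothetical sample data: two developers share one file, a third works alone.
changes = [
    ("alice@toyota.example", "recipes/core.bb"),
    ("bob@denso.example", "recipes/core.bb"),
    ("carol@toyota.example", "README.md"),
]
V, A_v, E = build_network(changes)
print(sorted(E))
```

Here carol remains an isolated node because nobody else touched README.md in this slice. In real data, one company may use several e-mail domains, so a curated domain-to-company mapping would likely be needed.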

Semi-Structured interviews

Targeting software developers (i.e., code contributors to AGL core) and program managers at the top 10 firms.

Guiding questionnaire for max 40 min interview:

  1. I assume from the data I have collected so far that you are a contributor to AGL. Is that correct?
  2. What kind of contributions have you added to AGL? Can you give examples?
  3. What motivated you to contribute?
  4. Are you the only one contributing to AGL at your organization, or do you have a team that contributes regularly to AGL?
  5. Will your contributions contribute to your career advancement?
  6. Would it be possible to file a patent related to your contribution?
  7. Are you worried that others might make money with your contribution?
  8. Do you perceive some inter-individual or interorganizational competition in AGL or is it all about collaboration?
  9. Do you consider contributing more to AGL in the future? What about other projects?
  10. Do you think that governments should encourage contributions to open-source communities from universities? What about commercial companies?
  11. What are the main barriers that SMEs or large companies face when contributing upstream to open-source projects?
  12. In what ways do non-commercial organizations (research institutes, universities, public sector, foundations) get involved in AGL?
FINAL: I created the following additional social network visualizations and reports about collaboration in the AGL ecosystem. Would you mind taking a quick look at them?

Real-time transcription using Whisper


Tools

Tools for mining git repositories with SNA
Tools for the visualization of social networks
  • visone is a software for the visual creation, transformation, exploration, analysis and representation of network data, jointly developed at the University of Konstanz and the Karlsruhe Institute of Technology.
  • Tulip is an information visualization framework dedicated to the analysis and visualization of relational data. Tulip aims to provide the developer with a complete library, supporting the design of interactive information visualization applications for relational data that can be tailored to the problems he or she is addressing. Developed by LaBRI, University of Bordeaux, France.
Tools for the statistical analysis of social networks
  • statnet is a suite of open source R-based software packages for network analysis, along with a comprehensive set of training materials. Developed by Pavel Krivitsky, Skye Bender-deMoll, Michał Bojanowski, Carter T. Butts, Steven M. Goodreau, Mark S. Handcock, David R. Hunter, Chad Klumb, and Martina Morris among others.
  • Goldfish is a software tool (i.e. R package) for the analysis of time-stamped network data using a variety of models. In particular, it implements different types of Dynamic Network Actor Models (DyNAMs), a class of models that is tailored to the study of actor-oriented network processes through time. Goldfish also implements different versions of tie-oriented relational event models. Developed by members of the Chair of Social Networks at ETH Zürich and James Hollway at the Graduate Institute in Geneva.
  • RSiena is an R package designed for the analysis of longitudinal network data using Stochastic Actor-Oriented Models (SAOMs). Developed by Tom A.B. Snijders and his colleagues, RSiena allows researchers to model and understand the dynamics of social networks over time. With RSiena, you can analyze how network ties evolve based on various factors, such as network structure, actor attributes, and external influences. The software is particularly useful for studying the formation and dissolution of ties in social networks, making it a valuable tool for sociologists, organizational researchers, and other social scientists.
  • Relevent. Relational Event Models (REMs) are a powerful tool for analyzing event data in social networks. Unlike traditional network models that focus on static snapshots, REMs are designed to handle continuous streams of interaction events, such as emails, phone calls, or social media posts. The R package for REMs, typically referred to as `relevent`, allows researchers to model the occurrence of relational events based on various factors, including past interactions, network structure, and actor attributes. This makes REMs particularly useful for studying the dynamics of communication and interaction in social networks.

Results

→ Stage 1 - Finding the repositories with SNA

AGL is a Linux distro for the automotive industry. It is highly dependent on the Yocto Project, whose goal is to produce tools and processes that enable the creation of Linux distributions for embedded and IoT software that are independent of the underlying architecture of the embedded hardware.

AGL is orchestrated with Git and Gerrit and is organized across dozens of Git repositories, grouped into the "AGL", "staging", "src", and "apps" categories.

Most Automotive Grade Linux sites require a Linux Foundation (LF) identity to log in. Only with an LF login can we update their Wiki, access Jira for issue tracking, or access source code in Gerrit and Git.

After creating the LF identity, the researcher could log in at https://gerrit.automotivelinux.org/gerrit/admin/repos and access more than 250 repositories. Gerrit provided the link to the Git repository for each repository. By cloning one of the AGL main repositories using the following command:

git clone https://gerrit.automotivelinux.org/gerrit/AGL/AGL-repo
we could list all the remote branches, which correspond to the different AGL releases, with the command:
git branch -r

→ Stage 2 - Finding the different releases

AGL Releases

We then wrote a bash script that checks out each release branch in turn and lists the first and last commit dates for each release. AGL releases are named after fish. The results are in the following table:
Release Version   Codename (Fish)      Release Date
AGL 1.0           Agile Albacore       January 2016
AGL 2.0           Bouncing Blowfish    June 2016
AGL 3.0           Cool Catfish         January 2017
AGL 4.0           Daring Dab           July 2017
AGL 5.0           Excited Eel          January 2018
AGL 6.0           Fearless Flounder    July 2018
AGL 7.0           Great Guppy          January 2019
AGL 8.0           Happy Halibut        July 2019
AGL 9.0           Icy Icefish          January 2020
AGL 10.0          Jazzy Jellyfish      July 2020
AGL 11.0          Kicking Koi          January 2021
AGL 12.0          Lively Lamprey       July 2021
AGL 13.0          Mighty Marlin        January 2022
AGL 14.0          Nimble Needlefish    July 2022
AGL 15.0          Outgoing Octopus     January 2023
AGL 16.0          Proud Pike           July 2023
AGL 17.0          Quick Quillback      January 2024
AGL 18.0          Radiant Ricefish     July 2024
AGL 19.0          Super Salmon         February 2025
More information on how AGL manages its releases can be found at Schedule and Milestones AGL Roadmap, WIKI - AGL Latest release notes, and WIKI - AGL Release notes archive.
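The per-release date listing can also be sketched in Python. This is a hypothetical alternative to the bash script mentioned in this stage, not a copy of it: the function names are invented, the remote is assumed to be called origin, and the sketch reads the remote-tracking refs directly instead of checking out each branch.

```python
import subprocess
from datetime import datetime, timezone


def run_git(repo, *args):
    """Run a git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def remote_branches(repo, remote="origin"):
    """List remote-tracking branch names, skipping the symbolic HEAD entry."""
    prefix = f"refs/remotes/{remote}"
    refs = run_git(repo, "for-each-ref", "--format=%(refname)", prefix).split()
    return [r[len(prefix) + 1:] for r in refs if not r.endswith("/HEAD")]


def date_range(timestamps):
    """Return (first, last) commit dates in UTC as ISO strings."""
    first = datetime.fromtimestamp(min(timestamps), tz=timezone.utc)
    last = datetime.fromtimestamp(max(timestamps), tz=timezone.utc)
    return first.date().isoformat(), last.date().isoformat()


def release_date_ranges(repo, remote="origin"):
    """Map each release branch to the dates of its oldest and newest commits."""
    ranges = {}
    for branch in remote_branches(repo, remote):
        # %ct prints the committer date as a Unix timestamp, one per commit.
        out = run_git(repo, "log", "--format=%ct", f"{remote}/{branch}")
        ranges[branch] = date_range([int(t) for t in out.split()])
    return ranges
```

Calling release_date_ranges with the path of a local clone would return one date pair per release branch. Note that git log on a branch also lists history inherited from earlier releases, so the first date is that of the oldest reachable commit unless the range is restricted further.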

→ Stage 3 - Getting the Git commit logs for each repository and each release

We followed the instructions at https://wiki.automotivelinux.org/agl-distro/source-code on how to get the source code. We quickly learned that AGL uses the Repo tool for managing repositories and structures the code around the concept of layers. This layered approach to managing and organizing the software components that make up the automotive platform is inspired by the Yocto Project, a framework for creating custom Linux distributions for embedded systems. Because the Repo tool was used, we obtained all the repositories, ready to be mined, in a set of six directories/layers (which aggregate many repositories behind the scenes).
Layer                      Description
meta-agl                   The core layer for AGL containing essential configurations and recipes needed to build the AGL distribution.
meta-agl-cluster-demo      Focuses on the instrument cluster demo, including configurations and recipes specific to demonstrating instrument cluster functionality.
meta-agl-demo              Contains demo applications and configurations for showcasing AGL's capabilities through various demonstration setups.
meta-agl-devel             Intended for development purposes, including experimental features and new recipes under active development.
meta-agl-extra             Contains additional recipes and configurations that can be optionally included in the AGL platform.
meta-agl-telematics-demo   Focuses on telematics demonstrations, including configurations and recipes specific to showcasing telematics capabilities.
So, to get the commit logs for the albacore release on the meta-agl layer, we ran:
$ cd meta-agl
$ git checkout albacore
$ git log --pretty=format:"==%an;%ae;%ad==" --name-only > meta-agl-albacore.IN
The log is then ready to be mined with SNA using ScrapLogGit2Net.
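To make the structure of that log concrete, here is a minimal Python sketch of a parser for the "==%an;%ae;%ad==" header format combined with --name-only. It is an illustration only and does not reproduce how ScrapLogGit2Net parses its input; the function name and the sample log entries are hypothetical.

```python
def parse_log(text):
    """Parse `git log --pretty=format:"==%an;%ae;%ad==" --name-only` output.

    Returns a list of dicts with the author's name, e-mail, date string and
    the list of files changed by each commit.
    """
    commits = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines separate commits
        if line.startswith("==") and line.endswith("==") and len(line) >= 4:
            # Split from the right so a ';' inside the author name is tolerated.
            parts = line[2:-2].rsplit(";", 2)
            if len(parts) == 3:
                name, email, date = parts
                current = {"name": name, "email": email, "date": date, "files": []}
                commits.append(current)
                continue
        if current is not None:
            current["files"].append(line)  # a file touched by the current commit
    return commits


# Hypothetical log excerpt in the same format as meta-agl-albacore.IN.
sample = """==Alice Example;alice@toyota.example;Mon Jan 4 10:00:00 2016 +0900==
meta-agl/conf/layer.conf
README.md

==Bob Example;bob@denso.example;Tue Jan 5 09:30:00 2016 +0900==
README.md
"""
commits = parse_log(sample)
for commit in commits:
    print(commit["email"], commit["files"])
```

Each parsed commit yields the (developer, file) pairs from which collaboration edges between developers who touched the same file can be derived.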

→ Stage 4 - Analysing Git commit logs with SNA for each repository and each release

~/PycharmProjects/ScrapLogGit2Net/scrapLog.py -r meta-agl-albacore.IN > meta-agl-albacore.scraplog.out
./formatFilterAndViz-nofi-GraphML.py -nl spring -p -l ~/meta-agl/meta-agl-albacore.NetworkFile.graphML
We can then transform it to a network of organizations by summing collaborative edges.
./formatFilterAndViz-nofo-GraphML.py -n spring -ff iot ~/rep-clones/jat-websites/autogradelinuxsna/reproducibility-guide/meta-agl-albacore.NetworkFile-transformed-to-nofo.graphML
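The idea of turning a network of individuals into a network of organizations by summing collaborative edges can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic of the scripts above: the function name and sample data are hypothetical, and dropping ties between developers of the same organization is a choice made here for brevity.

```python
from collections import Counter


def to_org_network(dev_edges, affiliation):
    """Collapse developer-to-developer edges into weighted organization edges.

    dev_edges:   iterable of (developer_a, developer_b) pairs
    affiliation: dict mapping each developer to an organization label
    """
    weights = Counter()
    for a, b in dev_edges:
        org_a, org_b = affiliation[a], affiliation[b]
        if org_a == org_b:
            continue  # assumption: intra-organizational ties are ignored
        # Sort the pair so (x, y) and (y, x) count as the same undirected edge.
        weights[tuple(sorted((org_a, org_b)))] += 1
    return dict(weights)


# Hypothetical sample: two Toyota developers each collaborate with one Denso developer.
affiliation = {"alice": "toyota", "carol": "toyota", "bob": "denso"}
dev_edges = [("alice", "bob"), ("carol", "bob"), ("alice", "carol")]
org_edges = to_org_network(dev_edges, affiliation)
print(org_edges)
```

The resulting edge weight counts how many developer pairs link two organizations, which gives a simple proxy for the intensity of inter-organizational collaboration.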

Key results:

Social Network of the Top 20 organizational contributors to AGL - At meta-agl-layer

Expected contributions

Expected theoretical contributions

Preliminary answers to the guiding research questions:
  • Why are the big players shaping the automotive industry open-sourcing the production of advanced and complex technology?
    • Extended R&D reach?;
    • Extending the size of the market?;
    • Creating demand for complementary products and services?;
    • Finding external complementarities?;
    • Providing strong arguments for future antitrust cases?;
    • Easier cooperation and integration with non-commercial organizations?;
    • Extended reputation in interactions with non-commercial organizations?;
    • Easier talent identification and evaluation?;
  • Why are different organizations cooperating with competitors in the co-production of advanced technological platforms?

Expected methodological contributions

  • A simple yet elegant way of combining qualitative, quantitative and relational social network data.

Publications

By chronological order as results got published:

  • Teixeira, J. A. (2023). Towards understanding open-coopetition -- Lessons from the automotive industry. In Proceedings of the 44th International Conference on Information Systems (ICIS 2023) AIS. Open-access right here.

Contact

Jose Teixeira < jose.teixeira AT abo.fi >

Renesa Tamannum < Renesa Tamannum AT abo.fi >