About
Quantitative analysis of the evolution of the TensorFlow collaborative social network over time using SAOMs (Stochastic Actor-Oriented Models)
Motivation
- Increase our understanding of collaboration and information sharing in open-source coopetitive software ecosystems, i.e. open-coopetition (see Teixeira 2023).
- Assess whether the evolution of the TensorFlow collaborative open-source software ecosystem is a matter of (1) randomness, (2) well-known mechanisms of network evolution, or (3) strategy as executed by the firms that contribute to the software ecosystem.
Key references
Key theoretical references
- Contractor, N. S., Wasserman, S., & Faust, K. (2006). Testing multitheoretical, multilevel hypotheses about organizational networks: An analytic framework and empirical example. Academy of Management Review, 31(3), 681-703.
Available at JSTOR https://www.jstor.org/stable/20159236.
- Teixeira, J. A., Ahmed, S. S., Laine-Kronberg, L., Mezei, J., & Smailhodzic, E. (2025).
Towards understanding open and coopetitive platform ecosystems: The case of TensorFlow. In
Proceedings of the 33rd European Conference on Information Systems (ECIS 2025), AIS (conditionally accepted 28 Feb 2025).
Open-access right here.
- Teixeira, J. A. (2023). Towards understanding open-coopetition -- Lessons from the automotive industry. In
Proceedings of the 44th International Conference on Information Systems (ICIS 2023), AIS.
Open-access right here.
Also available from AISel https://aisel.aisnet.org/icis2023/isdesign/isdesign/5/ .
- Li, X., Zhang, Y., Osborne, C., Zhou, M., Jin, Z., & Liu, H. (2025). Systematic literature review of commercial participation in open source software. ACM Transactions on Software Engineering and Methodology, 34(2), 1-31.
Available at ACM DL https://doi.org/10.1145/3690632.
Key methodological references
- Teixeira, J., Robles, G., & González-Barahona, J. M. (2015). Lessons Learned from Applying Social Network Analysis on an Industrial Free/Libre/Open Source Software Ecosystem. Journal of Internet Services and Applications, 6, 1-27.
Open-access at https://link.springer.com/article/10.1186/s13174-015-0028-2.
- Lindberg, A., Schecter, A., Berente, N., Hennel, P., & Lyytinen, K. (2024). The Entrainment of Task Allocation and Release Cycles in Open Source Software Development. Management Information Systems Quarterly, 48(1), 67-94.
Available at AISel https://aisel.aisnet.org/misq/vol48/iss1/5/.
- Holme, P., & Saramäki, J. (2012). Temporal Networks. Physics Reports, 519(3), 97-125.
Available via Elsevier https://www.sciencedirect.com/science/article/abs/pii/S0370157312000841 .
- Snijders, T. A., Van de Bunt, G. G., & Steglich, C. E. (2010). Introduction to stochastic actor-based models for network dynamics. Social Networks, 32(1), 44-60.
Available via Elsevier https://www.sciencedirect.com/science/article/abs/pii/S0378873309000069 .
- Butts, C. T. (2008). A Relational Event Framework for Social Action. Sociological Methodology, 38(1), 155-200.
Available at JSTOR https://www.jstor.org/stable/20451153.
Related working papers from others
- Osborne, C., Daneshyan, F., He, R., Ye, H., Zhang, Y., & Zhou, M. (2024). Characterising Open Source Co-opetition in Company-hosted Open Source Software Projects: The Cases of PyTorch, TensorFlow, and Transformers. arXiv preprint arXiv:2410.18241.
Open-access at https://doi.org/10.48550/arXiv.2410.18241 .
Aim and research questions
Overall aim:
- Advance our understanding of open source and open-coopetition by looking at the case of TensorFlow.
Research questions:
- How does the TensorFlow collaborative social network evolve over time?
- Is it a matter of (1) randomness, (2) well-known mechanisms of network evolution, or (3) behavioural strategy from the firms that contribute to the TensorFlow ecosystem?
Data
The relational network data is retrieved from the Git repository using ScrapLogGit2Net.
The tool, first described in Teixeira et al. (2015),
connects developers who co-edit the same source-code file. The underlying assumption is that co-editing the same source file traces some cooperative and/or information-sharing behaviour.
The TensorFlow collaborative network during a given time slice t can be formally defined as:
Gt = (V, Av, E)
Where:
V = the set of nodes, representing the developers contributing to the TensorFlow core open-source software project.
Av = the set of node attributes, capturing each developer's company affiliation. This information is extracted from the email address of each developer and/or the GitHub API.
E = the set of edges, each connecting two developers who have worked on the same source-code file.
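The definition above can be illustrated with a minimal Python sketch. It is not the ScrapLogGit2Net implementation: the function names, the sample records, and the rule of taking the first label of the email domain as the affiliation are all assumptions made for illustration only.

```python
from itertools import combinations


def affiliation_from_email(email: str) -> str:
    """Illustrative heuristic (assumption): first label of the e-mail domain."""
    domain = email.rsplit("@", 1)[-1].lower()
    return domain.split(".")[0]


def build_network(edits):
    """Build (V, A_v, E) from (developer_email, file_path) records."""
    # Collect the set of developers that touched each file.
    developers_by_file = {}
    for email, path in edits:
        developers_by_file.setdefault(path, set()).add(email)

    nodes = {email for email, _ in edits}
    attributes = {email: affiliation_from_email(email) for email in nodes}

    # Two developers are connected if they edited at least one common file.
    edges = set()
    for developers in developers_by_file.values():
        edges.update(combinations(sorted(developers), 2))
    return nodes, attributes, edges


# Hypothetical records, not taken from the TensorFlow repository.
edits = [
    ("alice@google.com", "tensorflow/core/ops.cc"),
    ("bob@intel.com", "tensorflow/core/ops.cc"),
    ("carol@nvidia.com", "tensorflow/python/util.py"),
    ("alice@google.com", "tensorflow/python/util.py"),
]
V, A_v, E = build_network(edits)
```

In this toy input, Alice shares a file with Bob and another with Carol, so two undirected edges result, while Bob and Carol remain unconnected. Edges are stored as sorted pairs so that each undirected tie appears only once.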
Description of the collected data
Metadata and Paradata briefs
| Field | Value |
|---|---|
| Mined repository | https://github.com/tensorflow/tensorflow.git |
| Mining tool | https://github.com/jaateixeira/ScrapLogGit2Net |
| Miner | Jose Teixeira |
| Last collection | 18 October 2024 |
| Covered lifespan of the project | 7 Nov 2013 - 12 April 2024 |
| Segmentation | Year by year - 2013 to 2024 |
| Number of networks | 1 capturing the overall project lifespan + 9 capturing a year each |
| Nodes | Individual software developers (bots were filtered out), identified by email |
| Edges | Cooperation and information sharing, associated by co-editing the same source-code file |
| Node attributes | Organizational affiliation, associated by email domain and GitHub API |
| File/Network format | graphml - http://graphml.graphdrawing.org/ |
| Archival of the File/Network/GraphML files | https://github.com/jaateixeira/ScrapLogGit2Net/tree/main/test-data/TensorFlow/icis-2024-wp-networks-graphML |
| Related publications | Teixeira, J. A., Ahmed, S. S., Laine-Kronberg, L., Mezei, J., & Smailhodzic, E. (2025). Towards understanding open and coopetitive platform ecosystems: The case of TensorFlow. In Proceedings of the 33rd European Conference on Information Systems (ECIS 2025), AIS (conditionally accepted 28 Feb 2025). Open-access right here. |
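The year-by-year segmentation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure ScrapLogGit2Net actually uses: the record layout `(iso_date, email, path)`, the function name `edges_by_year`, and the sample data are hypothetical, and the sketch assumes that two developers are linked in a yearly network only when both edited the same file within that calendar year.

```python
from collections import defaultdict
from datetime import datetime
from itertools import combinations


def edges_by_year(commits):
    """Group (iso_date, email, path) records by year, then link co-editors per year."""
    # year -> file path -> set of developers who edited that file in that year
    files_by_year = defaultdict(lambda: defaultdict(set))
    for iso_date, email, path in commits:
        year = datetime.fromisoformat(iso_date).year
        files_by_year[year][path].add(email)

    yearly_edges = {}
    for year, files in files_by_year.items():
        edges = set()
        for developers in files.values():
            edges.update(combinations(sorted(developers), 2))
        yearly_edges[year] = edges
    return yearly_edges


# Hypothetical commit records for illustration only.
commits = [
    ("2015-11-09", "alice@example.com", "a.cc"),
    ("2015-12-01", "bob@example.com", "a.cc"),
    ("2016-01-15", "carol@example.com", "a.cc"),
    ("2016-03-02", "alice@example.com", "a.cc"),
]
yearly = edges_by_year(commits)
```

Under this assumption Bob and Carol are never connected, because their edits to the same file fall in different years. A network covering the whole lifespan would instead pool all records before linking co-editors.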
Networks
Network 1 - Capturing the overall lifespan of the project
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 4219 |
| Number of edges | 378309 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | 7 Nov 2013 - 12 April 2024 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 2 - Capturing code collaboration during 2015
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 47 |
| Number of edges | 170 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2015 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 3 - Capturing code collaboration during 2016
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 610 |
| Number of edges | 14368 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2016 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 4 - Capturing code collaboration during 2017
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 916 |
| Number of edges | 23101 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2017 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 5 - Capturing code collaboration during 2018
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 923 |
| Number of edges | 24424 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2018 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 6 - Capturing code collaboration during 2019
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 943 |
| Number of edges | 26531 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2019 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 7 - Capturing code collaboration during 2020
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 896 |
| Number of edges | 26416 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2020 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 8 - Capturing code collaboration during 2021
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 728 |
| Number of edges | 19396 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2021 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 9 - Capturing code collaboration during 2022
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 617 |
| Number of edges | 14934 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2022 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Network 10 - Capturing code collaboration during 2023
| Property | Value |
|---|---|
| Node classifier | Human software developer |
| Edge classifier | Collaboration and information sharing |
| Number of nodes | 588 |
| Number of edges | 15129 |
| Node attributes | e-mail, color, affiliation |
| Edge attributes | Null |
| Captured time span | Year 2023 |
| Network data file format | graphml - http://graphml.graphdrawing.org/ |
| Network data available | https://github.com/jaateixeira/ScrapLogGit2Net |
Dataset with all 10 networks: download tarball here.
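To check a downloaded network against the node and edge counts reported above, a GraphML file can be summarised with the Python standard library alone. The sketch below is illustrative: it assumes the affiliation is stored under a GraphML key whose `attr.name` is `affiliation` (the actual key name in the archived files may differ), and the embedded sample graph is made up for demonstration.

```python
import io
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def summarize_graphml(source, attribute="affiliation"):
    """Return (node_count, edge_count, attribute_value_counts) for a GraphML source."""
    root = ET.parse(source).getroot()
    # Map attribute names to the key ids referenced by <data key="..."> elements.
    key_ids = {k.get("attr.name"): k.get("id") for k in root.findall("g:key", GRAPHML_NS)}
    wanted_key = key_ids.get(attribute)

    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)

    counts = {}
    for node in nodes:
        for data in node.findall("g:data", GRAPHML_NS):
            if wanted_key is not None and data.get("key") == wanted_key:
                counts[data.text] = counts.get(data.text, 0) + 1
    return len(nodes), len(edges), counts


# Made-up sample graph; real files would be passed as a path instead.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="affiliation" attr.type="string"/>
  <graph edgedefault="undirected">
    <node id="n0"><data key="d0">google</data></node>
    <node id="n1"><data key="d0">google</data></node>
    <node id="n2"><data key="d0">intel</data></node>
    <edge source="n0" target="n1"/>
    <edge source="n0" target="n2"/>
  </graph>
</graphml>"""

n_nodes, n_edges, by_affiliation = summarize_graphml(io.BytesIO(SAMPLE))
```

For a file extracted from the tarball, the same function can be called with the file path in place of the in-memory sample, and the returned counts compared with the corresponding network table.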
Links
Hands-on tutorials - beginner level
Books