About OSCAR

Project Mission

OSCAR is a project designed to quantify and analyze open source software contributions, specifically tracking GitHub activity across different technology companies.

Brief History

This project began in 2018 while I was working at Adobe, where one of my responsibilities was managing Adobe's Open Source Office. Matt Asay, my boss at the time, asked whether we could quantify the impact the Open Source Office had had on Adobe and roughly compare Adobe's open source activity with that of other technology companies. Inspired by Felipe Hoffa's "Top contributors to GitHub" work, the project evolved slowly, and over the years I have collected a lot of data.

How It Works

OSCAR works on an hourly "event loop."

1. Data Collection

The system downloads hourly GitHub activity archives from GitHub Archive, a public dataset that captures all public GitHub activity. We specifically track activity on repositories that were forked or watched in the previous 30 days, maintaining a rolling 30-day list of "popular" projects. Why do this? It serves as a low-pass filter: these "popular" projects end up accounting for about 15% of public GitHub git push events.
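The rolling 30-day list described above can be sketched roughly as follows. This is a minimal illustration, not OSCAR's actual code: the `PopularRepos` class and the `archive_url` helper are hypothetical names, and the sketch assumes GH Archive's hourly file naming (`YYYY-MM-DD-H.json.gz`, with an unpadded hour) and its event fields `type`, `repo.name`, and `created_at`.

```python
from datetime import datetime, timedelta

# Assumed GH Archive naming scheme: one gzipped JSON-lines file per hour.
ARCHIVE_URL = "https://data.gharchive.org/{day}-{hour}.json.gz"
POPULARITY_EVENTS = {"WatchEvent", "ForkEvent"}
WINDOW = timedelta(days=30)


def archive_url(hour: datetime) -> str:
    """Build the archive URL for one hour (the hour is not zero-padded)."""
    return ARCHIVE_URL.format(day=hour.strftime("%Y-%m-%d"), hour=hour.hour)


class PopularRepos:
    """Hypothetical rolling set of repos watched or forked in the last 30 days."""

    def __init__(self) -> None:
        self.last_seen = {}  # repo name -> time of most recent watch/fork

    def observe(self, event: dict) -> None:
        """Record a watch or fork event; ignore every other event type."""
        if event.get("type") not in POPULARITY_EVENTS:
            return
        when = datetime.strptime(event["created_at"], "%Y-%m-%dT%H:%M:%SZ")
        name = event["repo"]["name"]
        if name not in self.last_seen or when > self.last_seen[name]:
            self.last_seen[name] = when

    def prune(self, now: datetime) -> None:
        """Drop repos whose last watch/fork is older than the 30-day window."""
        cutoff = now - WINDOW
        self.last_seen = {r: t for r, t in self.last_seen.items() if t >= cutoff}

    def __contains__(self, repo: str) -> bool:
        return repo in self.last_seen

    def __len__(self) -> int:
        return len(self.last_seen)
```

In an hourly loop, each new archive would be fed through `observe`, followed by one `prune` call with the current time, so the set always reflects the most recent 30 days.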

[Chart: GitHub repositories watched/forked, previous 30 days. Daily values from 2026-02-21 to 2026-03-23, declining from about 803,000 to about 691,000.]

Next, we look at all public GitHub git push events and try to find information about the users pushing code to these projects.
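A rough sketch of that filtering step, assuming an hourly archive has already been downloaded to disk as a gzipped file with one JSON event per line. The function names `read_events` and `pushers_to_popular` are hypothetical, and `popular` stands in for whatever structure holds the rolling 30-day list.

```python
import gzip
import json
from typing import Iterable, Iterator, Set


def read_events(path: str) -> Iterator[dict]:
    """Yield events from one gzipped archive file (one JSON object per line)."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)


def pushers_to_popular(events: Iterable[dict], popular: Set[str]) -> Set[str]:
    """Return logins of users who pushed to a repository in the popular set."""
    return {
        event["actor"]["login"]
        for event in events
        if event.get("type") == "PushEvent" and event["repo"]["name"] in popular
    }
```

The resulting set of logins is what the next step would look up, one profile at a time.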

2. User-Corporation Association

For every user contributing to these "popular" projects, we query the GitHub API to retrieve the company field from their profile. Company associations are extracted from user profiles and tracked over time, allowing us to detect when developers change employers or update their affiliations.
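As an illustration of the lookup and the change tracking, here is a minimal sketch. It assumes the public GitHub REST endpoint `GET /users/{login}`, whose JSON response includes a `company` field; the helper names `fetch_company` and `record_company`, and the in-memory `history` dictionary, are hypothetical and say nothing about how OSCAR actually stores its data.

```python
import json
import urllib.request
from datetime import datetime
from typing import Optional

API_URL = "https://api.github.com/users/{login}"


def fetch_company(login: str, token: Optional[str] = None) -> Optional[str]:
    """Return the free-text company field from a user's public profile."""
    request = urllib.request.Request(
        API_URL.format(login=login),
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:  # authenticated requests are given a higher rate limit
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("company")


def record_company(history: dict, login: str, company: Optional[str],
                   seen_at: datetime) -> bool:
    """Append (seen_at, company) to the user's history only if the value changed.

    Returns True when a new entry was added, which signals a changed affiliation.
    """
    entries = history.setdefault(login, [])
    if entries and entries[-1][1] == company:
        return False
    entries.append((seen_at, company))
    return True
```

Storing only the changes keeps the history compact: a user who never edits their profile has a single entry, while a job change shows up as a second entry with its own timestamp.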

[Chart: Users committing to 'popular' projects per hour, previous 30 days. Daily values from 2026-02-21 to 2026-03-23, ranging from about 2,300 to about 3,500.]

We use regular expressions to roughly map these free-text company strings to specific corporations, and we do our best, especially for known companies. It is not a perfect way to create these associations, but it is better than looking at the e-mail addresses associated with the commits (the approach most similar analyses take). I personally associate my personal e-mail with my git commits, so I wanted to try a different approach.
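To make the idea concrete, here is a small sketch of regex-based normalization. The rules below are invented for illustration and are not OSCAR's actual patterns; `RULES` and `normalize_company` are hypothetical names.

```python
import re
from typing import Optional

# Hypothetical rules: (canonical name, pattern applied to the raw company string).
RULES = [
    ("Adobe", re.compile(r"\badobe\b", re.IGNORECASE)),
    ("Google", re.compile(r"\bgoogle\b", re.IGNORECASE)),
    ("Microsoft", re.compile(r"\b(microsoft|msft)\b", re.IGNORECASE)),
]


def normalize_company(raw: Optional[str]) -> Optional[str]:
    """Map a free-text company string to a canonical name, or None if unknown."""
    if not raw:
        return None
    # Profiles often use an "@org" mention, so strip the leading "@".
    cleaned = raw.strip().lstrip("@")
    for name, pattern in RULES:
        if pattern.search(cleaned):
            return name  # first matching rule wins
    return None
```

With these example rules, `"@adobe"` and `"Adobe Systems Inc."` both map to `Adobe`, while a string such as `"freelance"` stays unmatched. Word boundaries keep a pattern from firing inside a longer word, which is one reason the matching can only ever be approximate.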

3. Analysis and Storage

The user-company association data is exported to Google BigQuery for large-scale analysis. Every month, we generate comprehensive reports on corporate GitHub activity across monthly, quarterly, and yearly timeframes, providing insights into which organizations are most active in the open source ecosystem.
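The three timeframes can be illustrated with a small local sketch. This is not the BigQuery job itself: it assumes rows of `(day, login, company)` and counts distinct active users per company, which is only one plausible metric. The names `period_keys` and `active_users` are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def period_keys(day: date) -> Dict[str, str]:
    """Return the monthly, quarterly, and yearly bucket labels for a date."""
    quarter = (day.month - 1) // 3 + 1
    return {
        "monthly": f"{day.year}-{day.month:02d}",
        "quarterly": f"{day.year}-Q{quarter}",
        "yearly": str(day.year),
    }


def active_users(rows: Iterable[Tuple[date, str, str]],
                 timeframe: str) -> Dict[Tuple[str, str], int]:
    """Count distinct users per (period, company) for the chosen timeframe."""
    users = defaultdict(set)
    for day, login, company in rows:
        users[(period_keys(day)[timeframe], company)].add(login)
    return {key: len(logins) for key, logins in users.items()}
```

Because users are collected into sets, someone active in both January and February is counted once in the quarterly and yearly buckets rather than twice.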

Acknowledgments

This project is built on the shoulders of giants and would not be possible without the following open source technologies:

Special thanks to Felipe Hoffa for pioneering GitHub data analysis techniques and inspiring this work.