GitHub, in collaboration with Google, has released a large new dataset on BigQuery that expands on the GitHub Archive project from 2012. At more than 3 TB, it is the largest source of GitHub activity available, covering more than 2.8 million open source repositories, over 145 million unique commits, and the contents of 163 million files. Researchers, organizations, and developers can query it to analyze open source activity and trends, and can search it with regular expressions. The initiative aims to document the vast body of human knowledge encoded in software, and future efforts will focus on making open source data more accessible and more useful to a wider range of users. The dataset can be explored on Google Cloud for insight into open source communities and software development patterns.
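As a rough illustration of the regular-expression search mentioned above, the sketch below builds a BigQuery Standard SQL query and runs it through the `google-cloud-bigquery` Python client. The table name `bigquery-public-data.github_repos.sample_contents` and its columns (`sample_repo_name`, `sample_path`, `content`) are assumptions that should be checked in the BigQuery console, and the helper names `build_regex_query` and `search_contents` are hypothetical.

```python
# Sketch only: the table and column names are assumptions to verify in the BigQuery console.
SAMPLE_TABLE = "bigquery-public-data.github_repos.sample_contents"


def build_regex_query(table: str = SAMPLE_TABLE, limit: int = 10) -> str:
    """Build a Standard SQL query that keeps files whose text matches @pattern."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT sample_repo_name, sample_path\n"
        f"FROM `{table}`\n"
        "WHERE REGEXP_CONTAINS(content, @pattern)\n"
        f"LIMIT {int(limit)}"
    )


def search_contents(pattern: str, limit: int = 10):
    """Run the query; needs google-cloud-bigquery and Google Cloud credentials."""
    # Imported lazily so build_regex_query works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    # Pass the pattern as a query parameter instead of splicing it into the SQL text.
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("pattern", "STRING", pattern)
        ]
    )
    rows = client.query(build_regex_query(limit=limit), job_config=job_config).result()
    return [(row["sample_repo_name"], row["sample_path"]) for row in rows]
```

A call such as `search_contents(r"TODO\(\w+\)")` would list a few files containing annotated TODO comments. BigQuery bills by the amount of data scanned, so a smaller sample table is likely a cheaper starting point than the full file contents.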