A comprehensive audit of systematic differences across commercially available datasets reveals significant challenges in data accessibility and safety, shaped by licensing and geographic representation. The audit highlights a growing disparity between commercially open and closed data: an increasing number of publicly released datasets are restricted from commercial use, which disadvantages small companies and creates a quality gap in the data available for commercial applications. It also uncovers a Western-centric bias, with limited representation of Asian, African, and South American regions, which may lead to biased model performance for non-Western users. Legal ambiguity further complicates data usage: existing open-source licenses, designed primarily for software, are applied to datasets without modification, which makes legal compliance and responsible data stewardship harder. The launch of the Data Provenance Explorer, together with a global initiative, aims to improve data transparency and responsible use by addressing the ethical, legal, and transparency issues identified.