Common Crawl’s massive internet archive may be giving AI companies access to paywalled journalism, according to a new report.