Thursday, 12 July 2018

Is it possible to write a BigQuery to retrieve binned counts of PyPI downloads over time?

The following code is an SQL query for google's BigQuery that counts the number of times my PyPI package has been downloaded in the last 30 days.

#standardSQL
SELECT COUNT(*) AS num_downloads
FROM `the-psf.pypi.downloads*`
WHERE file.project = 'pycotools'
  -- Only query the last 30 days of history
  AND _TABLE_SUFFIX
    BETWEEN FORMAT_DATE(
      '%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
    AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())

Is it possible to modify this query so that I get the number of downloads every 30 days since the package was uploaded? The output would be a .csv that looks something like this:

date          count
01-01-2016    10
01-02-2016    20
    ..        ..
01-05-2018    100



from Is it possible to write a BigQuery to retrieve binned counts of PyPI downloads over time?

No comments:

Post a Comment