Create projects as quickly as news breaks

Using Cookiecutter templates, projects can be spun up quickly whenever needed.

terminal

  • datakit project create
  • Creating project from template: /Users/lfenn/.cookiecutters/cookiecutter-r-project
    full_name [Larry Fenn]:
    email [lfenn@ap.org]:
    project_name [New Project]: Hudson Helicopter Crash
    project_slug [hudson-helicopter-crash]:
    project_short_description [TK: short project description]: Pull FAA data on helicopter crashes
  • cd hudson-helicopter-crash
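
In the session above, the default offered for project_slug looks like a lowercased, hyphenated form of project_name. The sketch below shows one way such a default could be derived. It is an illustration only: `slugify` is a hypothetical helper, and the actual template may compute the slug differently.

```python
import re


def slugify(project_name: str) -> str:
    """Hypothetical helper: lowercase the name and join its words with hyphens."""
    # Replace every run of non-alphanumeric characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", project_name.lower())
    # Drop any hyphens left over at either end.
    return slug.strip("-")


print(slugify("Hudson Helicopter Crash"))  # hudson-helicopter-crash
```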

Simple, adaptable project structure

Keep all parts of a project cleanly separated: data, code, configuration, documentation.

terminal

    ├── README.md
    ├── analysis
    │   ├── analysis.R
    │   ├── graphs.Rmd
    │   └── story_lines.R
    ├── config
    │   └── datakit-data.json
    ├── data
    │   ├── manual
    │   ├── processed
    │   └── source
    ├── docs
    ├── output
    │   └── totals_over_time.csv
    └── trump-counties-employment.Rproj
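
To make the layout concrete, here is a minimal Python sketch that creates the same top-level folders by hand. It is not how datakit or Cookiecutter builds a project; `create_skeleton` and `PROJECT_DIRS` are hypothetical names, and only the directory names are taken from the tree above.

```python
import tempfile
from pathlib import Path
from typing import List

# Directory names copied from the tree above; everything else is illustrative.
PROJECT_DIRS = [
    "analysis",
    "config",
    "data/manual",
    "data/processed",
    "data/source",
    "docs",
    "output",
]


def create_skeleton(root: Path) -> List[Path]:
    """Create the empty folder layout under root and return the created paths."""
    created = []
    for relative in PROJECT_DIRS:
        path = root / relative
        # parents=True also creates root itself and the intermediate data/ folder.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    (root / "README.md").touch()
    return created


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "hudson-helicopter-crash"
        create_skeleton(project)
        for entry in sorted(project.rglob("*")):
            print(entry.relative_to(project))
```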

Integrate with cloud storage and hosting

Automate data and code syncing to take the guesswork out of storage and backup.

terminal

  • datakit data push
  • EXECUTING: aws s3 sync --profile default data/ s3://data.ap.org/projects/2019/trump-counties-employment/data/
    upload: data/manual/trumpjobs.sqlite to s3://data.ap.org/projects/2019/trump-counties-employment/data/manual/trumpjobs.sqlite
    upload: data/reports/graphs.html to s3://data.ap.org/projects/2019/trump-counties-employment/data/reports/graphs.html
    upload: data/source/LAUS/laucnty15.xlsx to s3://data.ap.org/projects/2019/trump-counties-employment/data/source/LAUS/laucnty15.xlsx
    upload: data/source/LAUS/laucnty16.xlsx to s3://data.ap.org/projects/2019/trump-counties-employment/data/source/LAUS/laucnty16.xlsx
    upload: data/source/LAUS/laucnty17.xlsx to s3://data.ap.org/projects/2019/trump-counties-employment/data/source/LAUS/laucnty17.xlsx
    upload: data/source/LAUS/laucnty18.xlsx to s3://data.ap.org/projects/2019/trump-counties-employment/data/source/LAUS/laucnty18.xlsx
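
The EXECUTING line above shows that the push is carried out with an `aws s3 sync` command. As a rough sketch of how such a command line can be assembled from a profile, bucket and project path, consider the following. This is not the plugin's actual implementation; `build_sync_command` and its parameters are hypothetical, and the values are copied from the log above.

```python
import shlex
from typing import List


def build_sync_command(
    local_dir: str, bucket: str, prefix: str, profile: str = "default"
) -> List[str]:
    """Hypothetical helper: build an `aws s3 sync` argument list for a data folder."""
    # Normalize to exactly one trailing slash, matching the logged command.
    local = local_dir.rstrip("/") + "/"
    remote = f"s3://{bucket}/{prefix.strip('/')}/{local}"
    return ["aws", "s3", "sync", "--profile", profile, local, remote]


cmd = build_sync_command(
    "data", "data.ap.org", "projects/2019/trump-counties-employment"
)
print(shlex.join(cmd))
```

The sketch only builds and prints the command. Actually running it, for example with `subprocess.run(cmd, check=True)`, would require the AWS CLI and valid credentials for the named profile.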

Quickstart installation guide

Requires Python 3. If Python 3 is not installed on your machine, get the latest version from python.org.

More detailed installation documentation is also available.



1. Install datakit-project

This is our most popular plugin, and installing it sets you up to add other plugins later.

terminal

  • pip install datakit-project

2. Grab a template

The AP has created project structure templates for both Python and R.

To customize your own, any valid Cookiecutter template will work.

terminal

  • datakit project create -t https://github.com/associatedpress/cookiecutter-r-project.git

3. Get to work

On the command line, datakit project create will create a project with a standardized file structure.

terminal

  • datakit project create

4. Adapt to your workflow

Additional plugins can help you manage the storage of flat data files, sync your code to GitLab or GitHub, and push your output to data.world for sharing. Grab other plugins or develop your own!

Better collaborations, less mess

Questions? Comments? Drop us a line at datateam@ap.org
More information on the AP's data journalism program