Access Raw 990 Data via Command Line (Terminal)

The following steps outline how to access the Data Lake with command-line (CLI) tools. This option is recommended for advanced users. Otherwise, we recommend that you access the Data Lake directly via AWS.

Important Note!

This option requires the AWS CLI tools to be installed and available in your terminal. If you do not have the AWS CLI tools installed, follow these instructions. This option lets you access the files programmatically.
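Before starting, it can help to confirm that the AWS CLI is actually on your PATH. The following is a minimal sketch, not an official check; the helper name have_cmd is hypothetical and introduced only for illustration.

```shell
# Return success if the named command can be found on PATH.
# (have_cmd is a hypothetical helper name, not part of the AWS CLI.)
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd aws; then
  # Print the installed AWS CLI version.
  aws --version
else
  echo "AWS CLI not found; install it before continuing." >&2
fi
```

If the version line prints, the commands in the steps below should be available in the same terminal session.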

Step-by-step guide to accessing data via the command-line terminal:

1. Open your terminal

2. To list the contents of the main bucket, type the following:

aws s3 ls s3://gt990datalake-rawdata --no-sign-request

Note: You can learn more about the --no-sign-request parameter here.

3. For any bucket subdirectory, use a similar command with the URL from this table.

4. To download contents from a bucket to your local computer, use the following command:

aws s3 cp s3://gt990datalake-rawdata/{FromTableAbove} yourlocalpath --no-sign-request

An example of downloading the index would be:

aws s3 cp s3://gt990datalake-rawdata/Indices/990xmls/index_all_years_efiledata_xmls_created_on_2023-10-29.csv index.csv --no-sign-request
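The listing and download steps above can be wrapped in small shell functions so the bucket name and the --no-sign-request flag are written only once. This is a sketch under stated assumptions: the function names s3_uri, list_prefix, and fetch_object are hypothetical, the bucket name is the hyphenated one used in steps 2 and 4, and the block only defines the functions without contacting AWS.

```shell
# Bucket name as given in steps 2 and 4 of this guide.
BUCKET="gt990datalake-rawdata"

# Build a full S3 URI from a key or prefix inside the bucket.
# A single leading slash on the argument is dropped.
# (s3_uri, list_prefix and fetch_object are hypothetical helper names.)
s3_uri() {
  printf 's3://%s/%s\n' "$BUCKET" "${1#/}"
}

# List the contents of a prefix (sub-directory) without AWS credentials.
list_prefix() {
  aws s3 ls "$(s3_uri "$1")" --no-sign-request
}

# Download one object to a local path without AWS credentials.
fetch_object() {
  aws s3 cp "$(s3_uri "$1")" "$2" --no-sign-request
}
```

With these functions defined, `list_prefix Indices/990xmls/` would list the prefix that appears in the example path above, and `fetch_object Indices/990xmls/index_all_years_efiledata_xmls_created_on_2023-10-29.csv index.csv` would download the index file into the current directory.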

For additional commands, visit the AWS CLI documentation.