Skip to content

get_timestamp

Return timestamp showing most recent entry from the given table
that has been processed by the ingestion lambda.

Args:
    table_name (str): table name to get timestamp for
    client (boto3 SSM Client)

Raises:
    KeyError: table_name does not exist
    ConnectionError : connection issue to parameter store

Returns:
    timestamp (datetime timestamp) : stored timestamp of most recent
    ingested data for given table

write_timestamp

Writes timestamp to parameter store for given table

Args:
    timestamp (timestamp) : timestamp of latest extracted data
    table_name (str) : table name to store timestamp for
    client (boto3 SSM Client) : client passed in to avoid recreating for each invocation

Raises:
    ConnectionError : connection issue to parameter store

Returns:
    None

collect_table_data

Returns all data from a table newer than most recent timestamp

Args:
    table_name (string)
    timestamp (timestamp)
    db_conn (pg8000 database connection)

Raises:
    KeyError: table_name does not exist
    ConnectionError : connection issue to parameter store


Returns:
    table_data (list) : list of dictionaries
        all data in table, one dictionary
        per row keys will be column headings

find_latest_timestamp

Iterates over data from one database table and returns the most recent timestamp

Args:
    table_data (list) : list of dictionaries representing rows of the table
    columns (list[str], optional keyword) : columns to search for timestamps.
        Defauts to ["last_updated"]

Raises:
    KeyError: columns do not exist

Returns:
    most_recent_timestamp (timestamp) : from list returns most recent
        timestamp from created_at/updated_at values

write_table_data_to_s3

Write file to S3 bucket as Json lines format

Args:
    table_name (string)
    table_data (list) : list of dictionaries all data in table,
        one dictionary per row keys will be column headings
    s3_client (boto3 s3 client)


Raises:
    FileExistsError: S3 object already exists with the same name
    ConnectionError : connection issue to S3 bucket

Returns:
    key (str): The S3 object key the data is written to

get_seq_id

From parameter store retrieves table_name : sequential_id key value pair

Args:
    table_name (string)


Raises:
    KeyError: table_name does not exist
    ConnectionError : connection issue to parameter store

Returns:
    sequential_id(int)

write_seq_id

To parameter store write table_name : sequential_id key value pair
-- checks sequential_id is one greater than previous sequential_id

Args:
    table_name (string)
    sequential_id(int)


Raises:
    KeyError: table_name does not exist
    ConnectionError : connection issue to parameter store

Returns:
    None